Assessing political news quality:
An automated comparison of political news quality indicators across German
newspapers with different modalities and reach
Nicolas Mattis
Student number: 12283177
Research Master’s Thesis
Graduate School of Communication University of Amsterdam
Research Master in Communication Science
Supervised by Dr. Anne Kroon
Word count: 7,497
Abstract
In order to best perform their societal functions, news media must adhere to certain normative standards for news quality – especially when reporting about events with political significance. While various past studies have examined (political) news quality, they often differ in the indicators and operationalisations that they use, making it difficult to compare findings across studies. Hence, this thesis proposes a comprehensive framework for
automatically measuring political news quality that is easily scalable and can be applied in various contexts as well as over longer timespans. It combines existing measures with newly developed classifiers that assess impartiality, thereby highlighting the potential that
supervised machine learning has for journalism studies and providing a means for future studies to assess impartiality in an automated manner. Furthermore, this thesis generates new insights into differences in political news quality across German newspapers that differ in their reach (national vs. regional) and modality (online vs. offline). The results indicate that both modality and reach appear to affect newspapers’ performance in terms of political news quality indicators, even though these differences tend to not be particularly pronounced. While especially online newspapers performed comparably worse in terms of indicators such as actor diversity, impartiality, and emotionality, the results suggest that modality and reach alone are not sufficient to explain differences across news outlets. On the whole, this thesis highlights the potential that automated research methods have for future research into (political) news quality and urges scholars to employ and advance existing measures to provide a fuller picture of (political) news quality across countries, outlets and, maybe most importantly, over time.
Keywords: Automated content analysis, News quality, Impartiality, Diversity, Supervised machine learning
Introduction
Often referred to as the fourth estate, news media are widely regarded as crucial for well-functioning democracies (Jacobi, Kleinen-von Königslöw, & Ruigrok, 2016). Building on Locke (1967), Strömbäck (2005) argues that one can describe the relationship between news media and democracy as a social contract: Democracy creates the necessary conditions for news media to operate in, while news media contribute to democracy by providing relevant, high-quality information to both the public and the government, as well as by
serving as a watchdog of a country’s institutions. To live up to those standards and inform the public both accurately and fairly, news outlets need to adhere to certain normative news quality standards such as diversity and impartiality (Urban & Schweiger, 2014).
Naturally, this raises the question of how well newspapers in a given media market adhere to such standards. While there is ample (comparative) research on different news quality indicators such as diversity, negativity, and objectivity (e.g. Burggraaff & Trilling, 2017; Humprecht & Esser, 2018; Jacobi et al., 2016; Masini et al., 2018), studies often differ in their choice and operationalisation of these indicators. Hence, this thesis proposes a comprehensive framework for assessing news quality through an automated content analysis (ACA) by combining existing measures with newly developed classifiers that assess three key indicators of impartiality on the article level.
ACA constitutes an efficient and affordable research methodology for the analysis of large bodies of data (Grimmer & Stewart, 2013) that can be applied to journalistic content in both an inductive and a deductive manner (Boumans & Trilling, 2016). Given that the field of journalism studies tends to largely neglect automated research methods (Boumans & Trilling, 2016), this thesis hopes to a) drive the field methodologically forward – by illustrating the potential of supervised machine learning (SML) and moving beyond mere case studies - and b) facilitate future comparative research by providing a means to assess and monitor
On a theoretical level, this thesis addresses concerns over an overall decrease in
journalistic news quality that a number of scholars have voiced since new technological affordances and increased economic pressures have begun transforming traditional newspaper markets (e.g. Burggraaff & Trilling, 2017; Humprecht & Esser, 2018; Jacobi et al., 2016; Jungnickel, 2011; McManus, 2009; Plasser, 2005). The underlying argument of those concerns is that the current transformation of the newspaper market results in a fierce
competition for advertising revenue. In order to cope, newspapers attempt to boost their reach to attract advertisers, often at the expense of journalistic quality (McManus, 2009) – a process that scholars refer to as commercialisation (Jacobi et al., 2016) or tabloidization (Esser, 1999).
Commercialisation is often assumed to be especially pronounced in online news content (e.g. Burggraaff & Trilling, 2017). However, existing research on the effects of modality is inconclusive, as some researchers have found evidence for lower news quality online (e.g. Burggraaff & Trilling, 2017; Welbers, Van Atteveldt, Kleinnijenhuis, & Ruigrok, 2018), whereas others have found no notable differences (Ghersetti, 2014) or even
contradictory ones (e.g. Humprecht & Esser, 2018). Other important factors that might affect news quality are the structure of a given media market (Esser & Umbricht, 2013) and the size of a newspaper (Masini et al., 2018). For example, Masini et al. (2018) claim that local newspapers can allocate fewer resources to quality reporting, especially about events on the national level. In light of these considerations, this thesis compares political news quality across German newspapers with different modalities (online vs. offline) and reach (national vs. regional). By unravelling the effects that those factors have, this thesis hopes to add to existing research by providing a clearer picture of political news quality in Germany.
In the following, this thesis will a) lay out the theoretical underpinnings of an automated news quality measurement framework, b) apply it to a sample of German
newspapers, c) present the differences across newspapers with different reach and modalities, and d) close with implications of ACA and suggestions for future research.
Theoretical Framework
Political news quality and its indicators
What constitutes good political news? The answer to this question will likely depend on who answers it. As Urban and Schweiger (2014) argue, a journalist might judge an article’s quality by the effort that it took to produce, whereas a reader might simply judge it by how enjoyable it is to read. This study builds on McQuail’s (1992) notion of the
‘marketplace of ideas’ and takes a normative perspective on the quality of news accordingly. Following Urban and Schweiger (2014), it posits that high-quality political news should provide accurate and impartial information that gives room to a wide variety of relevant actors and their positions in order to enhance the public’s understanding of important political matters as well as broader societal debates. This perspective builds on Strömbäck’s (2005) idea of a “participatory democracy”, the notion that citizens should (be able to) participate in all aspects of political life. Naturally, to do so effectively, citizens need to have access to high quality political information – not only during and before elections, but all year round.
Over time, various media and journalism scholars have spelled out the elements that constitute (political) quality news. For example, Jungnickel (2011) identified seven quality criteria, namely lawfulness, accuracy, relevance, comprehensibility, transparency,
impartiality, and diversity, with various sub-dimensions. Urban and Schweiger (2014) propose a somewhat similar, yet slightly more parsimonious model with six quality criteria: diversity, impartiality, relevance, comprehensibility, accuracy, and ethics. Although many analyses of news quality indicators have relied on manual content analyses (e.g. Esser & Umbricht, 2013; Masini et al., 2018; Ramírez de la Piscina, Gonzalez Gorosarri, Aiestaran, Zabalondo, & Agirre, 2015), several of those indicators can be assessed through ACA. In fact, a few studies have already done so (Burggraaff & Trilling, 2017; Jacobi et al., 2016). ACA constitutes a valuable research methodology in journalism studies as it a) significantly reduces the cost of traditional content analysis, b) provides a means to test hypotheses on a larger
scale, and c) potentially might even reveal insights that more traditional methods have missed (Boumans & Trilling, 2016). It also allows researchers to explore over-time developments with comparable ease. In the following, the four core dimensions of the automated measurement
approach taken in this study, namely diversity, impartiality, emotionality, and comprehensibility, are discussed.
Diversity
“[D]iversity in public affairs coverage is crucial because the news media are expected to create a mediated public sphere that reflects the diversity of interests, voices, and views in society” (McQuail 1992, as cited in Humprecht & Esser, 2018, p. 1825). However, despite a sharp increase in literature on the topic of news diversity, the concept’s exact definition remains contested (Humprecht & Esser, 2018). Furthermore, diversity can be assessed at different levels of analysis, such as on the article- or newspaper-level (Masini et al., 2018).
Despite these issues, most studies agree on two core dimensions: viewpoint diversity and actor (or source) diversity (e.g. Masini et al., 2018; Urban & Schweiger, 2014; Voakes, Kapfer, Kurpius & Chern, 1996). While these dimensions are undoubtedly intertwined (Masini et al., 2018), they differ in the granularity of their operationalisations. Viewpoint diversity is a multidimensional and context-dependent concept that often refers specifically to frames (e.g. Benson, 2009). Automatically measuring viewpoint diversity is therefore a considerable challenge that exceeds the scope of this project (for an attempt see Czymara & van Klingeren, 2019). Actor diversity in contrast is a more straightforward concept in that it merely measures the quantity and range of different sources. Often, a differentiation is made between elite and non-elite sources (e.g. Humprecht & Esser, 2018). Other studies examine the proportional representation of governing and opposition parties (e.g. van Hoof et al., 2014). The logic underlying both approaches is that elite actors such as governing parties or their representatives are inherently more newsworthy and therefore covered more frequently than opposition parties or laypeople (Castells, 2009).
While the assumption that a greater variety of actors equals a greater variety of news does not necessarily hold true (Carpenter, 2010), actor diversity can have important
implications for viewpoint diversity (Bennet, 1996), as it reveals to what extent different actors are given the space to shape public debates (Benson & Wood, 2015). In fact, Masini and van Aelst (2018) showed that actor and viewpoint diversity are strongly intertwined. Hence, actor diversity can be considered a necessary precondition for viewpoint diversity.
Existing research into actor diversity points towards several medium-specific
differences. For example, Masini et al. (2018) found that, overall, national newspapers exhibit greater actor diversity than local newspapers, supposedly due to differences in capital, staff, and resources (for contradicting findings see Voakes et al., 1996). Regarding differences between modalities, Burggraaff and Trilling (2017) argue that commercialisation affects online news outlets more strongly, as they a) face a higher degree of competition within a dynamic and distraction-provoking environment, b) exhibit a slightly different understanding of their journalistic roles, and c) profit from detailed insights into what types of articles generate the most attention, which allow them to fine-tune news accordingly. In line with this, they found online newspapers to amplify differences between elite and popular news outlets. Lastly, Jacobi et al. (2016) demonstrated that online news articles are more likely to focus on leaders and reference elites. Together, these insights motivate the following hypotheses:
H1: National newspapers exhibit greater degrees of actor diversity than local newspapers.
H2: The positive effect of national (vs. regional) newspaper types on actor diversity will be more pronounced for print than online news.
Impartiality
The notion of impartiality emerged as a journalistic norm in the early 20th
ever since (Maras, 2013). It is often equated with objectivity (Boudana, 2016; Maras, 2013), and remains one of the core principles that news editors and journalists around the world operate by (Maras, 2013). However, despite its popularity, impartiality still lacks a clear and agreed-upon definition and operationalisation (Cushion & Thomas, 2019). Prior examinations of impartiality have either examined journalists’ and editors’ selection processes (e.g.
Cushion, Kilby, Thomas, Morani, & Sambrook, 2018), or zoomed in on specific indicators such as the proportion of different sources or the use of and elaboration on statistics (Cushion, Lewis, & Callaghan, 2017; Wahl-Jorgensen, Berry, Garcia-Blanco, Bennett, & Cable, 2017).
This thesis focuses on impartiality on the content level. It builds on Urban and
Schweiger’s (2014) definition of impartiality as “a neutral and balanced coverage of all facts, demands and positions” (p. 823). Accordingly, it employs three key indicators to assess impartiality: neutrality, balance of viewpoints, and balance of sources. These dimensions are taken from Urban and Schweiger (2014) and lend themselves rather well to content analysis, as articles can be coded according to the presence or absence of each dimension. In its purest form, balance is defined as “the allocation of equal space to opposing views” (Cox, 2007, as cited in Wahl-Jorgensen et al., 2017, p. 783). However, systematically balancing sources and viewpoints might still distort reality and introduce artificial balance (Boudana, 2016). A good illustration of this is climate change journalism: Balancing believers and deniers, as has frequently been done in news media (Hiles & Hinnant, 2014), creates a false image of an open debate that is arguably worse than, for example, a “‘weight of evidence’ approach” (Cushion & Thomas, 2019, p. 395). For this reason, this thesis
operationalises balance in terms of whether or not an article gives room to challengers of the central actor. An article that does so arguably depicts at least a limited range of sources and viewpoints that a) constitute an attempt by the journalist to create a certain degree of balance, and b) expose citizens to a certain range of views. Neutrality refers to lack of evaluation by the author, which relates directly to the notion of an objective reporting style (Maras, 2013).
Given the various operationalisations of impartiality, specific insights into differences in impartiality among German newspapers are still missing. Hence, a research question is formulated.
RQ1: Does impartiality differ depending on a) the modality (online vs. print) and b) the type (national vs. regional) of newspaper outlets?
Emotionality & negativity
Various scholars have argued that an increased use of emotions might be one of the ways in which news media react to the economic pressures they are facing (e.g. Burggraaff & Trilling, 2017; Jacobi et al., 2016), as emotional news is more likely to grab people’s
attention, thereby maximising readership and advertising revenue (Burggraaff & Trilling, 2017; McManus, 2009). This thesis conceptualises emotionality as a bipolar concept with positivity at one end and negativity at the other end of the spectrum. Arguably, negativity has received considerably more scholarly attention than positivity. Negative information has repeatedly been shown to attract more attention and be better remembered
(Knobloch-Westerwick, Mothes, & Polavin, 2020; Soroka & McAdams, 2015). This so-called negativity bias provides a strong incentive for journalists to use negativity strategically in order to attract attention. While research also suggests a certain demand for it (Shoemaker & Cohen, 2006, as cited in Burggraaff & Trilling, 2017), similar effects cannot be claimed for positive news. However, it can be argued that strongly positive news still deviates from the ideal of neutrality that is traditionally valued in political news (Jacobi et al., 2016).
Existing research into emotionality has shown that a) regional newspapers employ comparatively high levels of negativity (Boukes & Vliegenthart, 2020), b) print news tends to be more positive than online news (Burggraaff & Trilling, 2017) and c) emotionality is less
to higher reliance on agency material among online newspapers (Jacobi et al., 2016; Welbers et al., 2018). Taken together, these findings motivate the following hypotheses as well as an explorative research question that addresses emotionality across news outlets with different reach:
H3a: Online news will feature more negativity than print news.
H3b: Regional news will feature more negativity than national news.
H4: Online news will feature less emotionality than print news.
RQ2: (To what extent) does emotionality differ between national and regional newspapers?
Comprehensibility
Especially in light of the vast amount of literature that suggests that the average citizen lacks a detailed understanding of politics (Lau & Redlawsk, 2001), it is easy to argue that in order to live up to their ideal societal role, news media need to convey information in an understandable fashion. Although comprehensibility is determined by several factors such as coherence, conciseness or the use of additional stimuli (see Urban & Schweiger, 2014), this thesis focuses exclusively on readability. Readability refers to how easy or difficult it is to read a given text, thereby capturing quite closely what Urban and Schweiger (2014) term simplicity. Readability has been linked to newspaper circulation in Germany in the past (Schoenbach & Lauf, 2002), as it constitutes not only a normative ideal to evaluate news by, but also appears to be a factor that affects audience evaluation and readership (Humprecht & Esser, 2018). Thus, readability constitutes an important aspect of comprehensibility that can be measured reliably. The readability of German dailies appears to be comparatively high (Björnsson, 1983), but the literature does not yet reveal generalisable differences between various types of outlets. Hence, potential differences are explored by means of the following research question.
RQ3: (To what extent) do German news media differ in their readability depending on a) reach (national vs. regional) and b) modality (online vs. print)?
The Framework
Taken together, these quality dimensions result in a comprehensive framework for automatically assessing news quality (see Figure 1). The framework combines various quality criteria that are largely laid out by Urban and Schweiger (2014) (see Appendix A) and, despite being incomplete, allows establishing a benchmark for assessing news quality in a resource-effective manner.
Figure 1. Framework for automatically assessing news quality.
Methodology
This thesis combined several computational methods in order to tap into four
indicators of news quality. Due to a lack of automated measurements for impartiality, a new measurement approach was developed through manual content analysis (MCA) and SML. For
the other three quality indicators, this thesis built on and partly adapted previous work (e.g. Burggraaff & Trilling, 2017; Jacobi et al., 2016; Masini et al., 2018).
Sample
The final sample consisted of 11,491 political news articles that were gathered from six German newspapers’ online and print editions over a seven-week period between the 20th
of April 2020 and the 8th of June 2020. A total of 8,077 duplicated or incorrectly scraped articles were deleted from the initial dataset (N= 19,568). The newspapers had either a national (“Die Welt”, “Die Süddeutsche”, “Der Tagesspiegel”) or a regional scope (“Aachener Zeitung”, “Rheinische Post”, “Stuttgarter Zeitung”). The national newspapers are usually referred to as elite newspapers (e.g. Masini et al., 2018). For the regional newspapers, a distinction between elite and popular is more difficult (Boukes & Vliegenthart, 2017), if applicable at all.
Importantly, the sampling was conducted during the height of the Covid-19 crisis. As a result, the article content might differ uniquely from comparable samples.
The German media market
Furthermore, it is important to consider two particularities of the German media market that might affect the results and their comparability to other studies. First, German newspapers perform comparatively well in terms of news quality, as they profit from strong levels of professionalisation and institutionalised self-regulation (Hallin & Mancini, 2004), a media culture that values the notion of a marketplace of ideas (Esser & Brüggemann, 2010) and a strong public broadcast sector that appears to have spill-over effects on other media (Humprecht & Esser, 2018). Moreover, the challenges brought about by declining readership, increased competition, and the internet are less pronounced in Germany than they are in many other countries (Brüggemann, Engesser, Büchel, & Castro, 2016).
Second, German regional newspapers are not per se localised, but often cover a wide range of topics and reach comparably high levels of readership (Humprecht & Esser, 2018). In fact, regional newspapers constitute about 75% of the total market and even quality papers
such as the Süddeutsche “draw a large chunk of their readership from their […] area” (Esser & Brüggemann, 2010, p. 40f). Hence, differences that have been found in other European countries might be less pronounced in the German market.
Data collection
All online content was gathered by means of RSS-scrapers within the inca
infrastructure for automated content analysis (Trilling et al., 2018). All scrapers were written by the author prior to the data collection. The scrapers accessed each newspaper’s RSS feed on an hourly basis and checked if new articles were available. If so, the key elements of each article (date, title, teaser, text, category, author) were downloaded, parsed, and stored in a database. However, due to server issues during the sampling period, only very few articles were scraped in the first month (see Figure 2). For the final sample of political online news articles (N= 1,072), only articles that were published in the politics section of a given newspaper were retained. The scrapers are available in a public GitHub repository together with the rest of the code that was run for this thesis (https://github.com/nickma101/Thesis).
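The polling logic described above can be illustrated with a minimal sketch. It is not the inca scraper code: the function name `new_items`, the sample feed, and the mapping of RSS fields to article elements are assumptions made for illustration, and only the Python standard library is used.

```python
import xml.etree.ElementTree as ET


def new_items(rss_xml, seen_links):
    """Return feed items whose link has not been stored in an earlier poll."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        link = item.findtext("link", default="").strip()
        if not link or link in seen_links:
            continue  # skip malformed entries and articles seen before
        seen_links.add(link)
        items.append({
            "title": item.findtext("title", default="").strip(),
            "teaser": item.findtext("description", default="").strip(),
            "date": item.findtext("pubDate", default="").strip(),
            "category": item.findtext("category", default="").strip(),
            "link": link,
        })
    return items


# Invented two-item feed standing in for a newspaper's RSS feed.
SAMPLE_FEED = """<rss version="2.0"><channel>
<item><title>A</title><link>https://example.org/a</link><category>Politik</category></item>
<item><title>B</title><link>https://example.org/b</link><category>Sport</category></item>
</channel></rss>"""

seen = set()
first_poll = new_items(SAMPLE_FEED, seen)   # both items are new
second_poll = new_items(SAMPLE_FEED, seen)  # nothing new on the next poll
```

In an hourly schedule, the set of seen links would be persisted between runs (for instance in the database that stores the articles), so that each poll only downloads and parses articles that have not been stored yet.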
Figure 2. Number of sampled online and print political news by publication date
Print articles were accessed through NexisUni, downloaded manually in sets of 100 articles at a time, and parsed with the LexisNexisTools package (Gruber, 2020) in R. Articles
were selected if at least one of the following terms was present in Nexis Uni’s classification section: politik, politische, politisch, partei, parteien, landtag, bundestag, regierung, wahl, wahlen. Arguably, this sampling procedure resulted in a broader scope of articles than the category-based sampling for the online articles. Together with the server issues, this might explain the stark difference in the amount of print (N= 10,419) and online news articles (N= 1,072) in the sample. For an overview of the final sample distribution see Appendix B.
Data pre-processing
In order to remove unnecessary noise within the data, several data cleaning steps were performed prior to the hypothesis testing. All article texts were processed with the Python packages spaCy (Honnibal & Montani, 2017) and NLTK (Bird, Klein, & Loper, 2009) in order to remove duplicates, formatting errors, and articles that had not been scraped correctly (e.g. because they were behind a paywall). In addition to that, a second version of the article text was created by removing stop words (words that are very frequent but not important for the meaning of a sentence) and reducing all words to their stems. This step was necessary to improve the accuracy of the emotionality analyses as well as the overall performance of some of the impartiality classifiers.
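Two of these cleaning steps, duplicate removal and stop-word filtering, can be sketched as follows. This is an illustration under stated assumptions rather than the thesis code: the stop-word set is a tiny invented stand-in for a full German list, the helper names are hypothetical, and stemming is left out (it would typically be delegated to a library stemmer such as the one provided by NLTK).

```python
import hashlib
import re

# Tiny illustrative stop-word list (assumption); a real run would use a full German list.
STOP_WORDS = {"der", "die", "das", "und", "in", "zu", "den", "ist", "nicht", "ein", "eine"}


def drop_duplicates(texts):
    """Keep the first occurrence of each text, ignoring case and whitespace differences."""
    seen, unique = set(), []
    for text in texts:
        normalised = " ".join(text.lower().split())
        key = hashlib.sha1(normalised.encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(text)
    return unique


def remove_stop_words(text):
    """Lower-case and tokenise a text, then drop tokens found in STOP_WORDS."""
    tokens = re.findall(r"\w+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]


articles = drop_duplicates(["Die Wahl", "die  wahl", "Der Landtag"])
clean_tokens = remove_stop_words("Die Regierung ist nicht in der Stadt")
```

Keeping both the original and the cleaned token list mirrors the two text versions described above, which allows later steps to be run on either version.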
Independent variables
The two independent variables under study were the reach and the modality of a given news article. An article’s reach (M= .56, SD= .50) was determined by the newspaper that published it and assessed by means of a dummy variable that was coded as one for national and zero for regional newspapers. Similarly, an article’s modality (M= .91, SD= .29) was assessed by means of a dummy variable that was coded as one for print and zero for online articles.
Dependent variables
Following Masini et al. (2018), actor diversity was assessed as a count variable on the article level. This thesis differentiated four actor types, namely political elite actors, political
opposition actors, persons, and organisations. It thereby accounted for the frequency of not only different types of political actors, but also laypeople and non-political organisations. All actors were detected through spaCy’s NER feature. If an entity that spaCy had classified as a person was present in one of the manually created political actor lists (see
https://github.com/nickma101/Thesis), it was coded accordingly. If not, it was coded as a generic person with no particular political significance. For each actor group, the overall number of references to their respective actors was calculated. Next, a dummy variable was created for each entity group with the value one, if at least one actor from this group was named and zero if not. Lastly, the four dummy variables were added together into an index that ranged from zero (no actor groups mentioned) to four (all actor groups mentioned) (M= 2.55, SD= .80).
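The construction of the actor diversity index can be sketched as below. The sketch assumes that a named entity recognition step has already produced (text, label) pairs with the labels `PER` and `ORG`; the function name, the group keys, and the placeholder actor names are hypothetical and do not come from the thesis repository.

```python
ACTOR_GROUPS = ("elite", "opposition", "person", "organisation")


def actor_diversity(entities, elite_names, opposition_names):
    """Count references per actor group and sum the group dummies into a 0-4 index."""
    counts = dict.fromkeys(ACTOR_GROUPS, 0)
    for text, label in entities:
        if label == "PER":
            if text in elite_names:
                counts["elite"] += 1
            elif text in opposition_names:
                counts["opposition"] += 1
            else:
                counts["person"] += 1  # generic person without political significance
        elif label == "ORG":
            counts["organisation"] += 1
    # One dummy per group: 1 if at least one actor of that group is named.
    index = sum(1 for group in ACTOR_GROUPS if counts[group] > 0)
    return counts, index


# Placeholder entities (invented) as an NER step might return them for one article.
entities = [("Politician A", "PER"), ("Politician A", "PER"),
            ("Jane Doe", "PER"), ("Example GmbH", "ORG")]
counts, index = actor_diversity(entities, {"Politician A"}, {"Politician B"})
```

Because the index sums dummies rather than raw counts, repeated references to the same actor group raise the group's count but not the diversity score.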
Impartiality was defined as a balanced coverage of relevant sources and viewpoints in
combination with an author that refrains from personal evaluation. As laid out in the
theoretical framework, balance was operationalised in terms of whether or not an article gave room to challengers of the central actor. By using a definition that expects journalists to provide more than just a single view and source for a particular topic or standpoint rather than to achieve a (near-) perfect balance, this thesis hoped to avoid earlier mentioned fallacies of assessing balance.
Impartiality was assessed through three indicators: 1) The presence/absence of balanced viewpoints (“Is the standpoint of the central political actor challenged by another actor in the text?”), 2) the presence/absence of balanced sources (“Does the article quote two or more different types of political actors - e.g. a national elite and a national opposition actor?”), and 3) the presence / absence of evaluation by the author (“Does the author
personally evaluate anything within the article?”). Added together, these indicators amount to a four-point impartiality index that ranges from a minimum of zero for not impartial, to a maximum of three for very impartial (M= 1.39, SD= .80).
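As a minimal sketch, the index can be written as the sum of three binary indicators. The function name is hypothetical, and the sketch assumes that the third indicator contributes a point when the author does not evaluate, so that higher values indicate greater impartiality.

```python
def impartiality_index(balanced_viewpoints, balanced_sources, author_evaluates):
    """Sum three binary indicators into an index from 0 (not impartial) to 3 (very impartial)."""
    neutrality = not author_evaluates  # assumption: absence of evaluation counts as neutral
    return int(balanced_viewpoints) + int(balanced_sources) + int(neutrality)


# An article with challenged viewpoints, one source type, and no author evaluation.
example_score = impartiality_index(True, False, False)
```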
Manual content analysis
Given the nuance that was necessary for assessing these indicators, dictionary-based measures did not suffice to accurately determine to what extent an article was impartial. Hence, impartiality was assessed through an SML approach, where binary classifiers were trained on manually coded training material. The manual content analysis was performed by a set of four student coders, the researcher being one of them. Before the final coding, all coders received training and the codebook (see Appendix C) was amended in accordance with the problems that had emerged during this training. Overall, a total of 487 articles were coded into three binary categories. See Table 1 in Appendix E for their distribution.
Intercoder reliability
Several intercoder reliability tests were performed to ensure a sufficient level of reliability. Overall, three different sets of articles (Datasets A, B, and C) were checked for intercoder reliability: a) a subsample (N= 25) of the initial print data (N= 250) for all coders, b) a subsample (N= 15) of the online data (N= 150) for two coders, and c) a subsample (N= 10) of a second set of print articles (N= 98) for another two coders. The indicator “neutrality” proved to be reliable across all datasets and coders with a Krippendorff’s alpha of .79 or higher and a Cohen’s Kappa of .60 or higher. However, the other two indicators were less reliable. For balance of actors, dataset A (α = .63) and dataset B (α = .61) failed to meet the recommended intercoder reliability threshold of .667 (Neuendorf, 2002). For balance of viewpoints, the same was true for dataset A (α = .53). For a detailed overview of all results see Appendix D.
Given that SML relies on highly reliable data, coders who did not achieve acceptable results in a dataset were excluded from the training data. Specifically, Coder 4 was excluded from the training data for balance of both actors and viewpoints, due to the comparatively low Cohen’s Kappa results (see Appendix D, Table 1). For the same reason, Coder 3 was excluded from the online training data for balance of viewpoints (see Appendix D, Table 3).
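For readers unfamiliar with the pairwise statistic, Cohen’s Kappa for two coders can be computed as in the following sketch. The function name and the example codes are invented for illustration; the thesis does not state which software produced its reliability figures, and Krippendorff’s alpha is not covered here.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's Kappa for two coders who rated the same units on a nominal scale."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must rate the same non-empty set of units.")
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    categories = set(freq_a) | set(freq_b)
    # Agreement expected by chance, based on each coder's marginal distribution.
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Invented binary codes for ten articles.
codes_a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
codes_b = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
kappa = cohens_kappa(codes_a, codes_b)
```

In this example the coders agree on eight of ten articles, while chance agreement is .50, which yields a Kappa of .60.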
Classifier training & prediction
The classifiers were trained in Python using the scikit-learn (sklearn) package. To do so, the training data for each variable was split into a training (80%) and a validation set (20%). The article text was represented in the form of vectors. Specifically, four types of vectors were created and compared: 1) Count vectors, 2) Term Frequency-Inverse Document Frequency (TF-IDF) vectors with unigrams, 3) TF-IDF vectors with bigrams, and 4) TF-IDF vectors with both uni- and bigrams. For each indicator, four different types of classifiers were tested: 1) a stochastic gradient descent classifier, 2) a naïve Bayes classifier, 3) a support vector machine classifier and 4) a k-nearest neighbour classifier. All classifiers were cross-validated and their hyperparameters were tuned using either grid-search or randomised search. Furthermore, all classifiers were trained on both the original and the clean text to compare their performance.
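The following sketch shows how such a comparison might be set up with scikit-learn for one classifier type. It is an illustration, not the thesis code: the toy corpus and labels are invented, the grid is deliberately small, and only the stochastic gradient descent settings (hinge loss, max_iter=200, random_state=8, alpha = .0001 as one grid value) follow the parameters reported for the balance-of-viewpoints classifier.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

# Toy stand-in corpus (invented); label 1 = the central actor is challenged.
topics = ["Steuern", "Rente", "Klima", "Bildung", "Verkehr",
          "Gesundheit", "Wohnen", "Energie", "Sicherheit", "Digitales"]
texts = [f"Die Regierung stellt die Reform zu {t} vor und nennt den Zeitplan" for t in topics]
texts += [f"Die Regierung lobt die Reform zu {t}, die Opposition widerspricht dem Plan"
          for t in topics]
labels = [0] * len(topics) + [1] * len(topics)

# 80/20 split into training and validation data.
X_train, X_val, y_train, y_val = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=8)

pipeline = Pipeline([
    ("vec", CountVectorizer()),
    ("clf", SGDClassifier(loss="hinge", max_iter=200, random_state=8)),
])

# The vectoriser step itself is part of the grid, so all four text
# representations are compared under the same cross-validation scheme.
param_grid = {
    "vec": [CountVectorizer(),
            TfidfVectorizer(ngram_range=(1, 1)),
            TfidfVectorizer(ngram_range=(2, 2)),
            TfidfVectorizer(ngram_range=(1, 2))],
    "clf__alpha": [1e-4, 1e-3],
}
search = GridSearchCV(pipeline, param_grid, cv=3, scoring="f1_macro")
search.fit(X_train, y_train)

val_predictions = search.predict(X_val)
report = classification_report(y_val, val_predictions, zero_division=0)
```

Swapping the `clf` step for another estimator (for example a naïve Bayes or k-nearest neighbour classifier) would extend the same grid to the other classifier types; `search.best_params_` then indicates which vector representation performed best in cross-validation.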
Table 1. Best text classification results for impartiality indicators
Indicator              Classifier                   Text      Vector type                 Category    Precision  Recall  F1   Overall
Balance of viewpoints  Stochastic gradient descent  Original  Count                       0 (N= 204)  .82        .68     .75  .69
                                                                                          1 (N= 99)   .52        .70     .60
Balance of actors      K-nearest neighbour          Original  TF-IDF with uni- & bigrams  0 (N= 297)  .77        .73     .75  .66
                                                                                          1 (N= 152)  .43        .48     .46
Neutrality             Support vector machine       Clean     Count                       0 (N= 204)  .68        .60     .63  .70
                                                                                          1 (N= 283)  .72        .79     .75
Note. Classifier parameters: 1) Balance of viewpoints: loss="hinge", alpha = .0001, max_iter=200, random_state=8; 2) Balance of actors: default settings; 3) Neutrality: default settings.
The final evaluation of the classifiers was based on their precision, recall, and F1-score. Preference was given to balanced results, as both categories were equally important for all indicators. Overall, the distribution of the predicted categories mirrored the distribution of the manually coded categories, except for balance of actors, where the trained classifier reversed the two categories’ distribution (see table 5 in Appendix E). Table 1 provides an overview of the best classification results per indicator as well as the text versions and vector representations that they worked best on. For additional information see tables 2 through 4 in Appendix E.
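For readers less familiar with these evaluation metrics, the following sketch shows how precision, recall, and F1 are computed per category from manually coded and predicted labels. The label lists are made up for illustration and are not taken from the thesis data.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Return (precision, recall, f1) for one category treated as 'positive'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical manual codes and classifier predictions.
manual = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0]

scores_cat1 = precision_recall_f1(manual, predicted, positive=1)
scores_cat0 = precision_recall_f1(manual, predicted, positive=0)
```

Reporting both categories separately, as in Table 1, makes it visible when a classifier performs well on the majority category but poorly on the minority one.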
Emotionality was defined as “the presence of positivity and/or negativity as opposed
to the absence of both” (Burggraaff & Trilling, 2017, p. 6) in a given news article.
Emotionality was assessed on the article level through dictionary-based counting of positive or negative words. All analyses were performed based on the Rauh sentiment dictionary (Rauh, 2018), which has been specifically developed for the application to political texts. It augments two more general sentiment dictionaries, namely SentiWS (Remus, Quasthoff, & Heyer, 2010) and GPC (Waltinger, 2010) and allows for a better and more valid measurement of sentiment in political texts (Rauh, 2018). To account for article length, the number of emotional words was divided by the number of words in a text, resulting in a final emotionality ratio that was used for the hypothesis testing (M= .11, SD= .03).
In addition to emotionality, this study also measured negativity. Negativity was assessed through the same dictionary-based procedure as emotionality, where all negative words in a text were counted based on Rauh’s (2018) sentiment dictionary. Dividing the sum of negative words by the number of words in an article resulted in a final negativity ratio (M= .05, SD= .02) that was used for the hypothesis testing.
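The two length-normalised ratios can be illustrated with a short sketch. The word lists below are a hypothetical mini-lexicon chosen for the example and are not entries from the Rauh (2018) dictionary; the simple regular-expression tokeniser is likewise an assumption.

```python
import re

# Hypothetical mini-lexicon for illustration only (NOT the Rauh dictionary).
POSITIVE = {"gut", "erfolg"}
NEGATIVE = {"krise", "schlecht"}


def sentiment_ratios(text):
    """Return (emotionality, negativity) as shares of all words in the text."""
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return 0.0, 0.0
    n_pos = sum(tok in POSITIVE for tok in tokens)
    n_neg = sum(tok in NEGATIVE for tok in tokens)
    emotionality = (n_pos + n_neg) / len(tokens)  # positive or negative words
    negativity = n_neg / len(tokens)              # negative words only
    return emotionality, negativity


emotionality, negativity = sentiment_ratios(
    "Die Krise ist schlecht, aber der Plan ist gut."
)
```

In this toy sentence, three of nine words are emotional and two of nine are negative. Dividing by the total word count keeps long and short articles comparable.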
Readability (M= 40.68, SD= 10.67) was used as a single indicator for news articles’ comprehensibility. It was measured with the Flesch-reading-ease score (FRE), which assigns different weights to a text’s average sentence length (ASL) and the average number of syllables per word (ASW). It was computed with the textstat python package (https://github.com/shivam5992/textstat). For German texts, the package relies on Amstad’s (1978) adapted formula:
FRE = 180 − ASL − (58.5 × ASW)
The FRE has been shown to be almost identical to similar other readability measures (Štajner, Evans, Orasan, & Mitkov, 2012) and has been applied to news articles before in various countries (e.g. Amstad, 1978; Dalecki, Lasorsa, & Lewis, 2009; Plavén-Sigray, Matheson, Schiffler, & Thompson, 2017). It ranges from a minimum of zero (very difficult) to a maximum of 100 (very easy).
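As an illustration of how the German reading-ease score is composed, the sketch below applies Amstad's weights (180, 1, and 58.5) to the average sentence length and the average number of syllables per word. The vowel-group syllable counter and the sentence splitter are rough assumptions made for this example; the textstat package uses its own tokenisation and syllable counting, so its scores will generally differ.

```python
import re


def fre_from_averages(asl, asw):
    """Amstad's German adaptation of the Flesch reading-ease score."""
    return 180 - asl - 58.5 * asw


def count_syllables(word):
    # Rough heuristic (assumption): one syllable per group of adjacent vowels.
    return max(1, len(re.findall(r"[aeiouyäöü]+", word.lower())))


def amstad_fre(text):
    """Compute the score for a text with naive sentence and word splitting."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[^\W\d_]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    asl = len(words) / len(sentences)                          # words per sentence
    asw = sum(count_syllables(w) for w in words) / len(words)  # syllables per word
    return fre_from_averages(asl, asw)


score = amstad_fre("Der Hund bellt. Die Katze schläft.")
```

Longer sentences and longer words both lower the score, with word length weighted far more heavily than sentence length.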
Data analysis & storage
All hypothesis tests were performed in either Python or SPSS. The code for both the data preparation and the analyses is available in a public GitHub repository
(https://github.com/nickma101/Thesis). The raw data on which the code was run as well as all relevant SPSS output can be accessed on an OSF server (https://osf.io/rdw9z/).
Results
This thesis explored four automatically measured news quality indicators as well as negativity. Table 2 provides an overview with means and standard deviations for the overall sample and the four subsamples under study (see Appendix F for newspaper comparisons). Since the dependent variables were not normally distributed (see Appendix G), all following analyses relied on statistical approaches that do not require normally distributed data.
Actor diversity
The first political news quality indicator under study was the diversity of actors. H1 assumed that national newspapers exhibit greater degrees of actor diversity than regional newspapers and H2 assumed that modality moderates this effect in such a way that online news exhibit greater differences than print news. The two hypotheses were tested through an ordinal regression in SPSS, with the reach and modality dummies as predictors, article length as a covariate and the diversity index as the dependent variable. The results from table 3 supported H2 but not H1: contrary to what H1 had expected, the odds of a regional article exhibiting a higher degree of actor diversity were 1.167 times (95% CI [1.082, 1.258]) those of a national article. H2 was supported, as an interaction effect showed that the odds ratio for an online article by a regional newspaper exhibiting higher levels of actor diversity was .71 (95% CI [.559, .905]).¹
Overall, H2 was thus supported, whereas H1 was rejected, as regional newspapers displayed higher actor diversity when controlling for article length and comparable actor diversity when article length was not taken into account.
Table 2. Overall means and standard deviations of dependent variables.
Group     N       Diversity   Impartiality  Emotionality  Negativity  Readability    Length
total     11,491  2.52 (.80)  1.39 (.80)    .11 (.03)     .05 (.02)   40.68 (10.67)  495.96 (709.94)
online    1,072   2.27 (.84)  1.38 (.81)    .12 (.03)     .06 (.03)   40.66 (10.10)  460.12 (310.21)
print     10,419  2.55 (.79)  1.39 (.80)    .11 (.03)     .05 (.02)   40.68 (10.73)  499.65 (738.81)
regional  5,038   2.52 (.84)  1.49 (.77)    .11 (.04)     .04 (.02)   40.66 (10.90)  399.33 (976.15)
national  6,453   2.52 (.76)  1.30 (.81)    .12 (.03)     .05 (.02)   40.70 (10.90)  571.40 (375.08)
Means with standard deviations in brackets.
Length is calculated as the average number of words in a text.
¹ The results must be interpreted with caution, due to poor model fit and violation of the assumption of proportional odds (see Appendix H). Independent Kruskal-Wallis H tests showed a significant mean difference between online (M= 2.27, SD= .84) and print newspapers (M= 2.55, SD= .79); H(1)= 112.70, p< .001, but an insignificant one between national (M= 2.52, SD= .76) and regional newspapers (M= 2.52, SD= .84); H(1)= .69, p= .407.
Table 3. Ordinal regression results for the effects of reach, modality, and article length on actor diversity and impartiality.
Parameter estimates
                      Actor Diversity                             Impartiality
Parameters            B (SE)            OR (95% CI)               B (SE)            OR (95% CI)
Diversity index = 0   -4.686 (.12)***   .009 (.007 – .012)        -                 -
Diversity index = 1   -2.236 (.05)***   .107 (.097 – .118)        -                 -
Diversity index = 2   .252 (.04)***     1.286 (1.189 – 1.391)     -                 -
Diversity index = 3   2.490 (.04)***    12.061 (1.082 – 1.258)    -                 -
Impartiality = 0      -                 -                         -3.457 (.06)***   .032 (1.189 – 1.391)
Impartiality = 1      -                 -                         -.890 (.04)***    .378 (.378 – .446)
Impartiality = 2      -                 -                         1.494 (.05)***    4.046 (4.046 – 4.905)
Reach = regional      .154 (.04)***     1.167 (1.082 – 1.258)     -.012 (.04)       .988 (.916 – 1.066)
Modality = online     -.514 (.08)***    .598 (.511 – .700)        -.224 (.08)**     .799 (.682 – .937)
Length                .001 (<.01)***    1.001 (1.000 – 1.001)     -.002 (<.01)***   .998 (1.048 – 1.699)
Reach x Modality      -.341 (.12)**     .711 (.559 – .905)        .288 (.12)*       1.334 (1.048 – 1.699)
R2                    .022                                        .177
N = 11,491
OR (95% CI) = Odds ratios with 95% confidence intervals. Function = Logit.
Diversity and impartiality indexes are the intercepts. *p< .05, **p<.01, ***p<.001
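The relationship between the logit coefficients (B) and the odds ratios reported in Table 3 can be illustrated as follows. This is a minimal sketch that assumes Wald-type 95% confidence intervals (B ± 1.96 × SE); because it starts from the rounded B and SE values printed in the table, the results may differ slightly from the reported figures in the third decimal.

```python
import math


def odds_ratio_with_ci(b, se, z=1.96):
    """Convert a logit coefficient and its standard error to an OR with a Wald CI."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)


# Reach = regional, actor diversity model (rounded B and SE from Table 3).
or_reach, lower_reach, upper_reach = odds_ratio_with_ci(0.154, 0.04)

# Reach x Modality interaction, actor diversity model.
or_inter, lower_inter, upper_inter = odds_ratio_with_ci(-0.341, 0.12)
```

An odds ratio above 1 indicates higher odds of falling into a higher category of the dependent variable, whereas a value below 1 indicates lower odds.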
Impartiality
RQ 1 asked whether impartiality differs depending on a) the modality (online vs. print) and b) the type (national vs. regional) of different newspaper outlets. This RQ was answered through an ordinal regression with the modality and reach dummies as well as article length as predictors of the dependent variable impartiality. The results from table 3 showed that the odds of an online article exhibiting a higher degree of impartiality were .80 times (95% CI [.682, .937]) those of a print article. This effect was statistically significant; χ²(1)= 7.65, p= .006. In contrast, the reach of an article had a statistically insignificant effect; χ²(1)= .09, p= .760. Article length had a very minor but significant negative effect on impartiality, with an odds ratio of .998 (B= -.002, p< .001; see table 3). Overall, the model explained 17.7% of the variance in the dependent variable and fit the data significantly better than an intercept-only model; χ²(4)= 2007.99, p< .001. While the model failed to pass the test of parallel lines (χ²(8)= 192.04, p< .001), its outcomes still strongly suggest that online news articles are on average less impartial than print articles, whereas the reach of a newspaper does not appear to make a difference when length is controlled for (see Appendix H for model fit).
Emotionality & negativity
This thesis assumed that the use of emotionality and negativity is driven by the amount of competition that newspapers face and the resources that they have at their disposal. Specifically, it argued that online news articles (H3a) and regional news articles (H3b) tend to exhibit significantly more negativity than print and national news respectively, as they try to catch readers’ attention in a way that does not necessarily rely on resource-intensive quality reporting. Furthermore, online news articles were expected to feature less emotionality than print news articles (H4), as they tend to rely more on agency copy.
All hypotheses were tested by means of two linear ordinary least squares regressions in SPSS – one with emotionality and a second one with negativity as the dependent variable. The three predictors, namely modality, reach, and article length, were added stepwise, thereby allowing for a comparison of model fit with different predictor variables. For both regression models, model fit increased significantly with each added predictor. Overall, the regression model for negativity explained 4.1% and the regression model for emotionality explained 2.1% of the overall variance. See Appendix I for an overview of model fit measures.
Table 4. Ordinary least squares (OLS) regressions for emotionality, negativity, and readability
                   Emotionality                 Negativity                   Readability
                   b (SE)           β           b (SE)           β           b (SE)    β
Constant           .114 (.001)                  .054 (.001)                  40.64
Modality = print   -.006 (.001)     -.053***    -.011            -.136***    .019      .001
Reach = national   .008 (.001)      .116***     .007             .140***     .033      .002
Length             .000003 (.000)   .056***     .000001 (.000)   .041***
R2                 .021                         .041                         <.001
SE: standard error. N = 11,491.
*p < .05 **p < .01 ***p < .001.
While the difference in negativity between online (M= .06, SD= .03) and print newspapers (M= .05, SD= .02) was rather small overall, the results of the regression analysis in table 4 show that this difference was indeed significant, even when controlling for an article’s length. Thus, H3a was supported. In contrast, the regression results in table 4 did not support H3b, as national newspapers exhibited slightly more negativity (M= .05, SD= .02) than regional newspapers (M= .04, SD= .02) when controlling for article length. Regarding H4, the results of the regression analysis in table 4 showed that emotionality was higher in online news articles (M= .12, SD= .03) than it was in print news articles (M= .11, SD= .03). While this result contradicts the initial hypothesis, it aligns with the negativity results and indicates that online news articles might generally employ more emotional words in order to attract readers’ attention.
RQ2 examined differences in emotionality between national and regional newspapers. Mirroring the results for negativity, the regression results in table 4 showed that news articles published by national newspapers (M= .12, SD= .03) featured a higher proportion of emotional words than articles published by regional newspapers (M= .11, SD= .04). This could indicate that regional newspapers rely more on agency copy in order to make up for a lack of resources.
Figure 3. Emotionality ratio by modality
Figure 4. Emotionality ratio by reach
Figure 5. Negativity ratio by modality
Figure 6. Negativity ratio by reach
Figure 7. Readability score by modality
Figure 8. Readability score by reach
However, in light of a) the low percentage of variance that the two models explained and b) the fact that in large samples even small differences can become significant, it is important to note that the differences between newspapers with different reach and modalities were rather small on the aggregate and do not explain the variation in emotionality or negativity very well, especially given the large variation in individual news articles’ scores (see Figures 3 to 8) and the fact that paid-for online content was missing from the sample.
Readability
RQ3 explored differences in readability across modalities (online vs. print) and reach (national vs. regional). It was answered through a linear ordinary least squares regression with the readability score as the dependent and the modality and reach dummies as the independent variables. The independent variables were added step by step. Adding reach to the base model with modality as the only predictor led to a significantly better model. The regression results from table 4 showed that on the aggregate there was no significant difference in readability between online (M= 40.66, SD= 10.10) and print (M= 40.68, SD= 10.73) newspapers, nor was there a significant difference between national (M= 40.70, SD= 10.90) and regional (M= 40.66, SD= 10.90) newspapers. Despite interesting and statistically significant differences on the newspaper level², the overall readability scores for each newspaper were somewhat close to the value of 40, indicating that the articles were somewhat difficult to read but still understandable for a larger part of the population.
² See Appendix J for newspaper comparisons.
Conclusion & Discussion
In an ideal democracy, news media should provide citizens with high-quality political information, so that they can best perform their civic duties (Strömbäck, 2005). This thesis automatically measured four important news quality indicators, to explore if and to what extent different modalities and types of German newspapers adhere to this ideal. The results revealed that both the modality and the reach of a newspaper play a role in determining its performance in terms of news quality indicators.
Modality affected news quality insofar as print news exhibited a more diverse set of actors and a higher degree of impartiality as well as a less emotional and less negative
reporting style. This largely aligns with Burggraaff and Trilling’s (2017) assumption that the comparably high online competition leads to content with lower news quality. However, the results for emotionality and negativity deviate from the outcomes one would expect if online news relied more on agency copy, as suggested by Welbers et al. (2018). That said, the different levels of news quality across modalities might also be (partly) due to only freely available articles being scraped. Possibly, German newspapers offer a certain extent of free but lower-quality content online, whereas high-quality content must be paid for. Future research should try to find ways to overcome the difficulties of sampling paid-for articles in order to investigate if this assumption is true, especially as it might have important societal implications for news consumers that are not willing to pay for online subscriptions.
For newspapers with different levels of reach, fewer differences emerged, although regional newspapers did appear to report in a less emotional and negative manner. This might either be due to reliance on agency copy as a result of comparably limited resources (Welbers et al., 2018) or attest to the comparably high quality of regional German newspapers
(Humprecht & Esser, 2018) – a notion that was supported by comparably high levels of actor diversity and impartiality in the regional sample. A third explanation might be that this thesis’ differentiation of reach a) constitutes an oversimplification, as two of the three sampled national newspapers also cater to local audiences, and b) underestimates the high levels of readership that regional newspapers have (Esser & Brüggemann, 2010), which might translate into considerable financial resources that can be invested into quality reporting.
Thus, future research should explore different newspaper classifications, for example differentiating them by their number of subscribers in order to account for differences in their available resources.
Apart from the theoretical contributions of this thesis, its arguably biggest value lies in proposing a scalable framework for the assessment of news quality and especially in the creation of classifiers that assess impartiality. While the classifiers did not reach optimal performance in terms of their precision and recall and while the balance of actors classifier seemed to overstate the prevalence of a balanced set of actors, they still show the potential of SML approaches for journalism studies, especially since such methodological approaches are still under-utilised (Boumans & Trilling, 2016). Future research should build on this work either by advancing the existing classifiers or by developing new and more comprehensive measurements for impartiality and other high-level constructs. Furthermore, scholars should use automated approaches such as the one taken in this study in order to assess the
development of news quality indicators over time. In contrast to the comparison of newspaper categories, research into the over-time development of news quality could better address the arguments that have been put forward about the implications of commercialisation (e.g. Burggraaff & Trilling, 2017; Humprecht & Esser, 2018; Jacobi et al., 2016).
Naturally, the results of this thesis must be interpreted in light of several important limitations. First, the server issues led to a comparably small number of online political news articles, which might somewhat impede their comparability with the considerably larger print sample. Second, the classifiers, especially the one for balance of actors, did not reach optimal performance.
Third, the large sample size (N= 11,491) might have turned even small differences statistically significant. Hence, it is important to stress that most mean differences in the sample were rather small – at least across the different modalities and levels of reach. This directly relates to a fourth limitation, namely the classification of newspapers. The regression models with modality and reach as predictors mostly only accounted for a small amount of variance in the dependent variables. Given the at times considerable differences between individual newspapers, this suggests that other factors such as available resources (Masini et al., 2018), journalistic style and reporting culture might be more important.
Lastly, it is important to stress that dictionary-based approaches cannot substitute for the analytical depth and contextualisation that manual content analyses (MCA) provide (Boyd & Crawford, 2012), as they rely on language models that are at best an approximation of the real phenomenon (Grimmer & Stewart, 2013).
Nonetheless, by combining MCA with ACA through SML, this thesis has shown that automated research can not only provide basic insights into news quality but can in fact also be used for capturing high-level constructs in a resource-efficient and scalable manner. Since doing so holds great potential for comparative studies, this thesis hopes to inspire future applications of SML that draw on and extend the framework employed by this study.
References
Amstad, T. (1978). Wie verständlich sind unsere Zeitungen? [How understandable are our newspapers?]. Unpublished doctoral dissertation, University of Zürich, Switzerland.
Bennett, W. L. (1996). An introduction to journalism norms and representations of politics. Political Communication, 13(4), 373–384. https://doi.org/10.1080/10584609.1996.9963126
Benson, R. (2009). What makes news more multiperspectival? A field analysis. Poetics, 37(5-6), 402-418. https://doi.org/10.1016/j.poetic.2009.09.002
Benson, R., & Wood, T. (2015). Who says what or nothing at all? Speakers, frames, and frameless quotes in unauthorized immigration news in the United States, Norway, and France. American Behavioral Scientist, 59(7), 802-821.
https://doi.org/10.1177/0002764215573257
Bird, S., Klein, E., & Loper, E. (2009). Natural language processing with Python: analyzing
text with the natural language toolkit. O'Reilly Media, Inc..
Björnsson, C. H. (1983). Readability of newspapers in 11 languages. Reading Research
Quarterly, 480-497. https://doi.org/10.2307/747382
Boudana, S. (2016). Impartiality is not fair: Toward an alternative approach to the evaluation of content bias in news stories. Journalism, 17(5), 600-618.
https://doi.org/10.1177/1464884915571295
Boukes, M., & Vliegenthart, R. (2020). A general pattern in the construction of economic newsworthiness? Analyzing news factors in popular, quality, regional, and financial newspapers. Journalism, 21(2), 279-300. https://doi.org/10.1177/1464884917725989
Boumans, J. W., & Trilling, D. (2016). Taking stock of the toolkit: An overview of relevant automated content analysis approaches and techniques for digital journalism scholars.
Boyd, D., & Crawford, K. (2012). Critical questions for big data: Provocations for a cultural,
technological, and scholarly phenomenon. Information, communication & society,
15(5), 662-679. https://doi.org/10.1080/1369118X.2012.678878
Brüggemann, M., Engesser, S., Büchel, F., Humprecht, E., and Castro, L. (2016). “Framing the Newspaper Crisis.” Journalism Studies 17(5), 533–551.
http://dx.doi.org/10.1080/1461670X.2015.1006871
Burggraaff, C., & Trilling, D. (2017). Through a different gate: An automated content analysis of how online news and print news differ. Journalism.
https://doi.org/10.1177/1464884917716699
Carpenter, S. (2010). A study of content diversity in online citizen journalism and online newspaper articles. New Media & Society, 12(7), 1064-1084.
https://doi.org/10.1177/1461444809348772
Carpenter, S., Boehmer, J., & Fico, F. (2016). The measurement of journalistic role enactments: A study of organizational constraints and support in for-profit and
nonprofit journalism. Journalism & Mass Communication Quarterly, 93(3), 587-608. https://doi.org/10.1177/1077699015607335
Castells, M. (2013). Communication power. Oxford University Press. Oxford.
Cushion, S., Lewis, J., & Callaghan, R. (2017). Data journalism, impartiality and statistical claims: Towards more independent scrutiny in news reporting. Journalism Practice,
11(10), 1198-1215. https://doi.org/10.1080/17512786.2016.1256789
Cushion, S., Kilby, A., Thomas, R., Morani, M., & Sambrook, R. (2018). Newspapers, impartiality and television news: Intermedia agenda-setting during the 2015 UK general election campaign. Journalism Studies, 19(2), 162-181.
https://doi.org/10.1080/1461670X.2016.1171163
Cushion, S., & Thomas, R. (2019). From quantitative precision to qualitative judgements: Professional perspectives about the impartiality of television news during the 2015 UK General Election. Journalism, 20(3), 392-409. https://doi.org/10.1177/1464884916685909
Czymara, C. S., & van Klingeren, M. (2019). New perspective? Comparing Frame
Occurrence in Online and Traditional News Media Reporting on Europe’s “Migration Crisis”. https://doi.org/10.31235/osf.io/h3tpy
Dalecki, L., Lasorsa, D. L., & Lewis, S. C. (2009). The news readability problem. Journalism
Practice, 3(1), 1-12. https://doi.org/10.1080/17512780802560708
Esser, F. (1999). 'Tabloidization' of news: A comparative analysis of Anglo-American and German press journalism. European Journal of Communication, 14(3), 291-324. https://doi.org/10.1177/0267323199014003001
Esser, F., & Brüggemann, M. (2010). The strategic crisis of German newspapers. The
changing business of journalism and its implications for democracy, 39-54.
Esser, F., & Umbricht, A. (2013). Competing models of journalism? Political affairs coverage in US, British, German, Swiss, French and Italian newspapers. Journalism, 14(8), 989-1007. https://doi.org/10.1177/1464884913482551
Grimmer, J., & Stewart, B. M. (2013). Text as data: The promise and pitfalls of automatic content analysis methods for political texts. Political analysis, 21(3), 267-297. https://doi.org/10.1093/pan/mps028
Gruber, J. (2020). LexisNexisTools. An R package for working with newspaper data from 'LexisNexis'. Retrieved from: https://github.com/JBGruber/LexisNexisTools
Hallin, D. C., & Mancini, P. (2004). Comparing media systems: Three models of media and
politics. Cambridge university press. https://doi.org/10.1017/CBO9780511790867
Hiles, S. S., & Hinnant, A. (2014). Climate change in the newsroom: Journalists’ evolving standards of objectivity when covering global warming. Science Communication,
Honnibal, M., & Montani, I. (2017). spaCy 2: Natural language understanding with Bloom embeddings, convolutional neural networks and incremental parsing.
Humprecht, E., & Esser, F. (2018). Diversity in online news: On the importance of ownership types and media system types. Journalism Studies, 19(12), 1825-1847.
https://doi.org/10.1080/1461670X.2017.1308229
Jacobi, C., Kleinen-von Königslöw, K., & Ruigrok, N. (2016). Political News in Online and Print Newspapers: Are online editions better by electoral democratic standards?.
Digital Journalism, 4(6), 723-742. https://doi.org/10.1080/21670811.2015.1087810
Jungnickel, K. (2011). Nachrichtenqualität aus Nutzersicht. Ein Vergleich zwischen
Leserurteilen und wissenschaftlich-normativen Qualitätsansprüchen. M&K Medien &
Kommunikationswissenschaft, 59(3), 360-378.
https://doi.org/10.5771/1615-634x-2011-3-360
Kitchin, R. (2014). Big Data, new epistemologies and paradigm shifts. Big data & society,
1(1), 2053951714528481. https://doi.org/10.1177/2053951714528481
Knobloch-Westerwick, S., Mothes, C., & Polavin, N. (2020). Confirmation bias, ingroup bias, and negativity bias in selective exposure to political information. Communication
Research, 47(1), 104-124. https://doi.org/10.1177/0093650217719596
Lau, R. R., & Redlawsk, D. P. (2001). Advantages and disadvantages of cognitive heuristics in political decision making. American Journal of Political Science, 951-971.
https://doi.org/10.2307/2669334
Leung, D. K., & Lee, F. L. (2015). How journalists value positive news: The influence of professional beliefs, market considerations, and political attitudes. Journalism Studies,
16(2), 289-304. https://doi.org/10.1080/1461670X.2013.869062
Locke, J. (1967). Locke: Two treatises of government. Cambridge University Press.
Maras, S. (2013). Objectivity in journalism. John Wiley & Sons.
Masini, A., Van Aelst, P., Zerback, T., Reinemann, C., Mancini, P., Mazzoni, M., ... & Coen, S. (2018). Measuring and explaining the diversity of voices and viewpoints in the news: A comparative study on the determinants of content diversity of immigration news. Journalism Studies, 19(15), 2324-2343. https://doi.org/10.1080/1461670X.2017.1343650
Masini, A., & Van Aelst, P. (2017). Actor diversity and viewpoint diversity: Two of a kind?.
Communications, 42(2), 107-126. https://doi.org/10.1515/commun-2017-0017
McManus, J. H. (2009). The commercialization of news. In The handbook of journalism
studies (pp. 238-254). Routledge.
McQuail, D. (1992). Media performance: Mass communication and the public interest (Vol. 144). London: Sage.
Plasser, F. (2005). From hard to soft news standards? How political journalists in different media systems evaluate the shifting quality of news. Harvard International Journal of
Press/Politics, 10(2), 47-68. https://doi.org/10.1177/1081180X05277746
Plavén-Sigray, P., Matheson, G. J., Schiffler, B. C., & Thompson, W. H. (2017). The readability of scientific texts is decreasing over time. Elife, 6, e27725.
https://doi.org/10.7554/eLife.27725.029
Ramírez de la Piscina, T., Gonzalez Gorosarri, M., Aiestaran, A., Zabalondo, B., & Agirre, A. (2015). Differences between the quality of the printed version and online editions of the European reference press. Journalism, 16(6), 768-790.
https://doi.org/10.1177/1464884914540432
Rauh, C. (2018). Validating a sentiment dictionary for German political language—a workbench note. Journal of Information Technology & Politics, 15(4), 319-343. https://doi.org/10.1080/19331681.2018.1485608
Remus, R., Quasthoff, U., & Heyer, G. (2010). SentiWS - a publicly available German-language resource for sentiment analysis. In Proceedings of the 7th International Language Resources and Evaluation (LREC'10) (pp. 1168-1171).
Schoenbach, K., & Lauf, E. (2002). Content or design? Factors influencing the circulation of American and German newspapers. Communications, 27(1), 1-14.
https://doi.org/10.1515/comm.27.1.1
Soroka, S., & McAdams, S. (2015). News, politics, and negativity. Political Communication,
32(1), 1-22. https://doi.org/10.1080/10584609.2014.881942
Štajner, S., Evans, R., Orasan, C., & Mitkov, R. (2012). What can readability measures really tell us about text complexity. In Proceedings of the Workshop on Natural Language Processing for Improving Textual Accessibility (NLP4ITA) (pp. 14-21).
Strömbäck, J. (2005). In search of a standard: Four models of democracy and their normative implications for journalism. Journalism studies, 6(3), 331-345.
https://doi.org/10.1080/14616700500131950
Trilling, D., Van De Velde, B., Kroon, A. C., Löcherbach, F., Araujo, T., Strycharz, J., ... & Jonkman, J. G. (2018, October). INCA: Infrastructure for content analysis. In 2018
IEEE 14th International Conference on e-Science (e-Science) (pp. 329-330). IEEE.
https://doi.org/10.1109/eScience.2018.00078
Urban, J., & Schweiger, W. (2014). News Quality from the Recipients' Perspective: Investigating recipients' ability to judge the normative quality of news. Journalism
Studies, 15(6), 821-840. https://doi.org/10.1080/1461670X.2013.856670
Van Hoof, A. M., Jacobi, C., Ruigrok, N., & Van Atteveldt, W. (2014). Diverse politics, diverse news coverage? A longitudinal study of diversity in Dutch political news during two decades of election campaigns. European Journal of Communication,
Voakes, P. S., Kapfer, J., Kurpius, D., & Chern, D. S. Y. (1996). Diversity in the news: A
conceptual and methodological framework. Journalism & Mass Communication
Quarterly, 73(3), 582-593. https://doi.org/10.1177/107769909607300306
Wahl-Jorgensen, K., Berry, M., Garcia-Blanco, I., Bennett, L., & Cable, J. (2017). Rethinking balance and impartiality in journalism? How the BBC attempted and failed to change the paradigm. Journalism, 18(7), 781-800. https://doi.org/10.1177/1464884916648094
Waltinger, U. (2010, May). GermanPolarityClues: A Lexical Resource for German Sentiment Analysis. In LREC (pp. 1638-1642).
Welbers, K., Van Atteveldt, W., Kleinnijenhuis, J., & Ruigrok, N. (2018). A gatekeeper among gatekeepers: News agency influence in print and online newspapers in the Netherlands. Journalism Studies, 19(3), 315-333.
Appendix A
Appendix B
Sample distribution
Table 1. Sample distribution across news outlets and modalities.
Newspaper Print articles Online articles Total articles
Der Tagesspiegel (national) 1,286 264 1,550
Die Süddeutsche (national) 3,720 175 3,895
Die Welt (national) 831 177 1,008
Aachener Zeitung (regional) 970 168 1,138
Rheinische Post (regional) 2,375 173 2,548
Stuttgarter Zeitung (regional) 1,237 115 1,352
Appendix C
Codebook for manual content analysis of impartiality training material
V1. Coder ID
01 Coder 1
02 Coder 2
03 Coder 3
04 Coder 4
V2. Article identification number
V3. News outlet
1 Aachener Zeitung
2 Stuttgarter Zeitung
3 Rheinische Post
4 Der Tagesspiegel
5 Die Welt
6 Die Süddeutsche
V4. Who is the central political actor in the story? (if in doubt, see list of actors below)
1 A governing party or a member of it on the national level
2 An opposition party or a member of it on the national level
3 A governing party or a member of it on the regional level
4 An opposition party or a member of it on the regional level
5 A foreign/international politician, party, or organisation
6 No political actor mentioned
Indicators of importance are…
… duration, space of information about the actor
… frequency of being mentioned
… mentioned in the headline or teaser
Notes:
➢ If two actors are equally prominent in the article with regard to the above criteria, count the number of references to each actor and choose the one who is referred to most often. This rule applies only if the two actors are exactly equally prominent with regard to the above criteria.
➢ Everything that happens at the federal state level or below is considered regional.
➢ If there are two equally central actors of opposing categories, code for the one that is mentioned first (headline included).
➢ Foreign/international actors are all political actors who do not work in German politics. This includes foreign countries, heads of state or other foreign politicians, foreign parties, international political organisations (e.g. NATO, EU), and German politicians who work at the EU level.
➢ It does not matter if political actors are not very prominent in an article. As long as the article mentions at least one political actor at least once, that is enough to code for a central political actor.
➢ See Appendix A for a list of relevant politicians and parties per category.
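The tie-breaking notes for V4 can be illustrated with a short sketch. It is not part of the original coding procedure, which was carried out manually. It assumes, hypothetically, that each article has already been reduced to an ordered list of (actor, V4 category) mentions with the headline first, and it models only the reference-count and first-mention rules, not the coder's judgement about space or headline prominence. The function name `central_actor_code` and the placeholder actors are illustrative.

```python
from collections import Counter

NO_POLITICAL_ACTOR = 6  # V4 code 6: no political actor mentioned


def central_actor_code(mentions):
    """Return the V4 code of the central actor.

    `mentions` is a hypothetical input format: a list of
    (actor_name, v4_code) tuples in order of appearance,
    starting with the headline.
    """
    if not mentions:
        return NO_POLITICAL_ACTOR

    # Count how often each actor is referred to.
    counts = Counter(name for name, _ in mentions)
    top = max(counts.values())

    # Among the most-referenced actors, the one mentioned first wins.
    for name, code in mentions:
        if counts[name] == top:
            return code


# Actor A (national government) is referred to more often than actor B.
print(central_actor_code([("Actor B", 2), ("Actor A", 1), ("Actor A", 1)]))
# Both actors are referred to once, so the first-mentioned actor is coded.
print(central_actor_code([("Actor B", 2), ("Actor A", 1)]))
```

In practice the prominence criteria require human judgement, so a sketch like this could at most serve as a consistency check on manually assigned codes.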
V5. Balance of political viewpoints
“Is the standpoint of the central political actor challenged by another actor in the text?”
1 Yes
2 No
Notes:
➢ A challenge has to be expressed in the form of a quote (either direct or indirect).
➢ The challenging actor can be …
… another political actor (of the same or a different party),
… another actor such as an expert, a journalist, or anyone else who is relevant in the context of the article’s topic, or
… the author
➢ Challenging a viewpoint means critically engaging with it. Therefore, it encompasses
Example (for code 1):
➢ Berlin - Tübingen’s mayor Boris Palmer (Greens) has rejected calls for him to leave the party. “Of course I am not leaving my party,” Palmer told the newspaper ‘Bild’ on Friday. “I remain a member of the Greens out of ecological conviction. Since the accusations against me were invented or constructed by my opponents, there is no reason whatsoever to think about it.”
The state executive board of the Greens in Baden-Württemberg had previously called on the controversial local politician to leave the party. With his statements, Palmer is opposing the party’s political values and principles and acting “systematically” against them, the state executive board declared after a meeting on Friday evening. With his conduct, the politician is serving “not the political or intra-party debate, but his own personal profile”.
V6. Balance of political sources
“Does the article quote two or more different types of actors (V4)?”
1 Yes
2 No
Notes:
➢ A quote can be direct, indirect, or a mix of the two; e.g.:
o “Of course I am not leaving my party,” Palmer told the newspaper ‘Bild’ on Friday.
o Flynn’s admission that in December 2016, i.e. before Trump’s inauguration, he asked the Russian ambassador, in a secret phone call that was initially denied, for a restrained reaction to the sanctions imposed by the incumbent President Barack Obama was an important piece of evidence of the Trump campaign’s cooperation with Moscow
o With his statements, Palmer is opposing the party’s political values and principles and acting “systematically” against them, the state executive board declared after a meeting on Friday evening.