
The role of linguistic text measures in the prediction of text difficulty

Klarien Haan

MA in Applied Linguistics

Faculty of Liberal Arts

University of Groningen

Supervisor: dr. H.I. Hacquebord

Second reader: dr. W.M. Lowie


Table of contents

0. Abstract
1. Introduction
2. Reading and readability
2.1 The process of reading comprehension
2.2 Frameworks of reference
2.3 Text research for appropriate measures for text difficulty
2.4 Readability studies
2.5 Are readability studies useful?
2.6 Readability measures
2.6.1 Lexical complexity
2.6.2 Syntactic complexity
2.6.3 Global text complexity
2.7 Statement of purpose
3. Method
3.1 Text characteristics
3.1.1 Lexical complexity
3.1.2 Syntactic complexity
3.1.3 Global text complexity
3.2 Text corpus
3.3 The underlying research for Diataal
3.4 Procedure
3.5 Design and analyses
4. Results
4.1 Descriptives
4.1.1 Reading comprehension scores
4.1.2 Levels from the framework of reference
4.2 Testing causal relations
4.2.1 Reading comprehension scores
4.2.2 Levels from the framework of reference
5. Discussion
5.1 The prediction of comprehension scores and levels of reference
5.1.1 The predictive power of AWL, COV and ASL
5.3 Readability studies
6. Conclusion
Reference List
Appendix A: Overview of the text corpus


0. Abstract


1. Introduction

In society, it is important that people can read and understand texts. Therefore reading comprehension plays an important role in education. However, in the Netherlands the reading comprehension of students has decreased during the last few decades (de Knecht-van Eekelen et al., 2007; Mullis et al., 2007). Individual differences can play an important role in reading outcome, but textual differences may be important as well.

Different methods are available for determining the difficulty of a text. For example, one can use the reading comprehension scores of students, or read the text and determine its level, for instance with the help of a framework of reference. In many studies, texts are analyzed linguistically to determine how linguistic characteristics influence text difficulty. From the literature it is known that average sentence length, average word length and percentage of coverage are good predictors of text difficulty. In this thesis it is investigated whether this is the case for the texts that are used in the study. Furthermore, it is investigated how other linguistic text measures are related to these, and which other measures are predictive of the reading comprehension scores of students and of expert judgements in terms of levels from the framework of reference Doorlopende Leerlijnen Taal.

To answer these questions, 40 texts were analyzed linguistically. Some of the texts were read by students, who took a comprehension test about the text. The other texts were read by experts, who rated the texts in terms of the framework of reference Doorlopende Leerlijnen Taal. As a result, there are two sub corpora, and all analyses are therefore carried out for both sub corpora.

2. Reading and readability

2.1 The process of reading comprehension

Reading comprehension plays an important role in schools. Students must be able to read and comprehend their books and other texts in order to learn the subject materials. However, many books contain texts that are difficult for students to comprehend, and many students in secondary education therefore have difficulties with reading comprehension (Andringa & Hacquebord, 2000). In Dutch (pre)vocational education (in Dutch: vmbo), reading comprehension is a problem for almost 25% of the students (Hacquebord, 2007; Inspectie van het Onderwijs, 2006). According to Andringa & Hacquebord (2000) there is a relation between reading comprehension and text difficulty. This relation is based on the fact that texts become more difficult during the curriculum: texts for younger students are in general both textually and linguistically less complex than texts for older students. This connection between reading comprehension and text difficulty implies that problems with reading comprehension might be caused by texts that are too difficult. In order to make clear what is meant by reading comprehension, the literature about the reading process and reading comprehension is discussed first.

Different stages can be distinguished in the process of reading comprehension. The first step in the process is decoding and word recognition, a bottom-up process. In this stage the written words have to be read and recognized. There is variation in writing systems, and this variation means that reading processes vary as well. For some languages, such as Italian and Dutch, there is a one-to-one grapheme-phoneme correspondence. That means that every phoneme is represented by one grapheme and vice versa. Other languages, such as English, do not have this one-to-one correspondence (Harley, 2001). There is therefore not one way in which all languages can be read. The English language contains words that are spelled regularly, such as the word 'beef'. In such words the phonemes correspond to the graphemes in a regular way and all the graphemes in the word have the standard pronunciation. Besides these words, there are words that are spelled irregularly, such as the word 'steak', in which the grapheme 'ea' does not have the standard pronunciation found in 'sneak' and 'speak'. Finally, there are pseudowords, words that do not have a meaning. These words can be pronounced correctly even if we have never seen them before (Harley, 2001).

A well-known model of how readers comprehend texts is the simple view of reading (Hoover & Gough, 1990). This model describes reading comprehension as the sum or product of word decoding and linguistic comprehension skills (Kendeou et al., 2009). Hoover & Gough (1990) explain decoding as 'the ability to rapidly derive a representation from printed input that allows access to the appropriate entry in the mental lexicon, and thus, the retrieval of semantic information at the word level' (p. 130). To derive a phonological representation from the printed input rapidly, it may help that the reader knows the word, or that the word is spelled regularly. However, this is not the central point in the simple view of reading, but rather in the dual-route model of reading (Coltheart, 1978; Coltheart, 1980). The simple view of reading requires that the reader can efficiently access the mental lexicon for the proper orthographic representations (Hoover & Gough, 1990).

The next step in processing is parsing. To comprehend a sentence the reader has to build syntactic structures. After the structures have been built, thematic roles are identified and the individual meanings of words are accessed. Then the stage of comprehension follows, which is the other component in the simple view of reading that is necessary for reading comprehension. Hoover & Gough (1990) describe it as 'the ability to take lexical information and derive sentence and discourse interpretations' (p. 131). In this stage the information from the previous stages is integrated into a representation of the sentence, and connections with prior information and knowledge are made. This is a top-down process. An important text characteristic for comprehension is coherence: that there is a topic and that the text forms a semantically integrated whole in who and what is talked about, in time, and in why and where events occur (Gernsbacher, 1990). Another important aspect is cohesiveness. A text is cohesive when the words in successive sentences refer to the same entities (Bishop, 1997; Harley, 2001). Research has shown that vmbo students comprehend an integrated text with explicit connections better than a less coherent text without structure signals (Land, 2009; Land et al., 2002). That means that coherence is important in the process of text comprehension. Most researchers state that making a coherent mental representation of a text is the core of reading comprehension (Kintsch & Van Dijk, 1978; van den Broek, 2009).

To comprehend a text, readers also have to make inferences. Logical inferences are based only on the meanings of words. For example, from the sentence 'Maria is a mother' it can be inferred that Maria is a woman. Bridging inferences help the reader to relate previous information to new information. This type of inference is needed to maintain coherence in a text, because such inferences link ideas in different sentences together. Finally, with elaborative inferences the reader can extend the information that is in the text with prior knowledge. An important part of making inferences for comprehension is finding out what words refer to: reference. Research has shown that making inferential relations in a text is the core of the mental representation a reader builds of a text. It has been estimated that a good reader makes about 250-300 inferences per page in a normal text (van den Broek, 2009).

The interactive approach to reading development attempts to combine bottom-up and top-down models of reading. According to interactive reading models, the reader uses a combination of lexical and syntactic information (bottom-up) and prior knowledge (top-down) to come to text comprehension. Chall (1983) distinguishes six stages in the reading process. Stage 0 is the prereading stage. This stage is characterized by the growth in knowledge and use of the spoken language; the control of words and syntax increases in this stage. In stage 1 children acquire the letter-sound system. In stage 2 children learn to apply what they learned in stage 1 to read stories and words. In stage 3 children begin to gain new knowledge and information by reading. The most important goals in stage 3 are growth of vocabulary and background knowledge. In this stage children start to criticize what they read. In stage 4 the texts are more difficult and students have to deal with different points of view. Finally, at stage 5 readers are able to read materials in detail and they can distinguish the important facts in a text. They compare their own interpretation of the words with their analysis of the content and their own ideas about the subject of the text (Carnine et al., 2004).


less-skilled readers, which reduces the number of inferences they have to make (Gernsbacher, 1997).

By now it should be clear how the reading comprehension process works and develops. There are many individual factors that may cause difficulty for the reader in the comprehension process, such as differences in working memory capacity (Daneman & Carpenter, 1980), motivation, background knowledge, attitude, and environment. Individual differences in reading can already arise before formal reading instruction begins, for example as a result of parental influence on the reading process or innate competencies (Bast & Reitsma, 1998), and are still found among college students (Perfetti, 1985). It even seems probable that such individual differences increase with further schooling (Bast & Reitsma, 1998). Stanovich (1986) was the first to apply the 'Matthew effect' to reading. The Matthew effect refers to the phenomenon that 'the rich get richer and the poor get poorer'. For reading this implies that good readers get better and that poor readers cannot catch up. However, besides these individual differences, there are also textual factors that can make the comprehension process more difficult for the reader. In the next paragraphs the focus is on textual characteristics relevant to reading comprehension. To give more insight into how, among other things, textual characteristics may influence reading comprehension, frameworks have been designed in which textual characteristics are linked to certain reading comprehension levels.

2.2 Frameworks of reference

So-called frameworks of reference describe what texts look like and what can be expected from readers at a certain text level. Most frameworks have been designed to make clear to teachers what they can expect from their students, and to students what teachers can ask from them. An example of a framework of reference for second language learning is the Common European Framework of Reference for Languages (CEFR) (Council of Europe, 2001). This framework was designed in the European Union to stimulate people from different countries to learn a foreign language. The instrument is now used for many languages to define targets of language proficiency for different school types in European countries. The framework consists of six levels for four skills: reading, writing, listening and speaking. The levels are A1, A2, B1, B2, C1, and C2. Each level is described for each skill, so that the learner or the teacher knows what can be expected.

For Dutch, the Raamwerk Nederlands (Bohnen et al., 2007) has been developed, a framework that is based on the CEFR. The reason for the development of this framework for (pre)vocational education (vmbo) is the increased attention to language proficiency in the past few years. Teachers and students are expected to pay more attention to language proficiency, and the framework can help them to make progress in Dutch language proficiency visible. In addition, the framework can help students to reach a language level that gives them a chance to pass their exams and to function well in their next course or a future job (Bohnen et al., 2007). The Raamwerk Nederlands consists, like the CEFR, of descriptions of levels for four skills. In contrast to the CEFR, which consists of six levels, the Raamwerk Nederlands consists of four levels. The reason for this difference is that the framework is specifically designed for mother tongue speakers of Dutch; therefore level A1 could be left out of the framework. The framework was also designed for prevocational education, and level C2, the academic level, does not occur in this type of education; therefore this level was not included in the framework either (Bohnen et al., 2007).

The reading ability of children in the Netherlands has in general decreased during the last years (de Knecht-van Eekelen et al., 2007; Mullis et al., 2007). Besides that, teachers did not always know what they could expect from students who came from another school type (Expertgroep Taal en Rekenen, 2009; Van den Bergh et al., 2009). For example, teachers in secondary education do not always know exactly what students know and can do when they come from primary school. Similarly, there is not enough knowledge about the knowledge and skills students have when they enter higher (professional) education from secondary school (Van den Bergh et al., 2009). Therefore a research group (Expertgroep Doorlopende Leerlijnen Taal en Rekenen) has drawn up a framework of reference, the Referentiekader Doorlopende Leerlijnen Taal en Rekenen, in which it is determined what can be expected from students at different points in their school career.

Table 1
Summary of the descriptions of the four levels for reading informative texts as given in the Referentiekader Doorlopende Leerlijnen Taal en Rekenen (Expertgroep Taal en Rekenen, 2009) (own translation)

General description (reading informative texts)
Level 1: Is able to read texts about daily subjects and subjects that connect to the student's environment.
Level 2: Is able to read texts about daily subjects, subjects that connect to the student's environment, and subjects that are not related to the student.
Level 3: Is able to read a great variety of texts from (professional) education and society independently. Reads with understanding for the whole and the details.
Level 4: Is able to read a great variety of texts about different subjects from (professional) education and society and is able to understand them in detail.

Text characteristics
Level 1: The structure of the texts is simple. Connections are given clearly. The information density is low and the texts are not too long.
Level 2: The structure of the texts is clear. Connections in the texts are given explicitly. The information density is generally low and the texts are not too long.
Level 3: The texts are fairly complex, but have a clear composition which can be explicated by the use of headers. The information density can be high.
Level 4: The texts are complex, and the structure is not always clear.

Task 1: Reading informative texts
Level 1: Is able to read simple informative texts, like texts for lessons, reference books, (simple) texts from the Internet, and simple schematic overviews.
Level 2: Can read informative texts, like schoolbooks, standard forms, popular magazines, texts from the Internet, notes and schematic information (in which different dimensions are combined), and the daily news in newspapers.
Level 3: Can read informative texts, like information materials, brochures, texts from methods, but also newspaper articles, formal correspondence, complicated schemas and reports about the own working area.
Level 4: Is able to read informative texts with a high information density, like long and complicated reports.

Task 2: Reading instructions
Level 1: Is able to read simple instructive texts, like simple travelling plans and hints in assignments.
Level 2: Is able to read instructive texts, like recipes, frequently occurring hints and manuals, and package leaflets of medicines.
Level 3: Is able to read instructive texts, like complicated instructions in manuals for unknown devices and procedures.

Task 3: Reading discursive texts
Level 1: Is able to read simple discursive texts, like those occurring in schoolbooks, but also advertisements and commercials.
Level 2: Is able to read discursive, often redundant texts, like commercials, advertisements, brochures of official bodies and flyers.
Level 3: Is able to read discursive texts, like texts from schoolbooks and opinion-forming articles.
Level 4: Is able to read discursive texts, like texts with complicated argumentations, or articles in which the writer takes ...


These three frameworks, the CEFR, the Raamwerk Nederlands and the Referentiekader Doorlopende Leerlijnen Taal en Rekenen, all use descriptions of language skills at different levels to indicate what can be expected from students at a certain level. The Raamwerk Nederlands and the Referentiekader Doorlopende Leerlijnen Taal en Rekenen were both designed for the Dutch language. A difference between these two frameworks is that the Referentiekader Doorlopende Leerlijnen Taal en Rekenen mostly uses rather abstract descriptions and no empirical measures, whereas the Raamwerk Nederlands does give some empirically 'measurable' text characteristics for the lower levels, such as sentence length in words. Table 2 gives a translated example from the reading section of the Raamwerk Nederlands, which shows some of the measures that are used in this study.

Table 2
An example of text characteristics for reading from the Raamwerk Nederlands (Bohnen et al., 2007) (own translation)

Text length
A2: Comprehension and text processing: 1-2 pages. The amount of text per page is limited.
B1: Comprehension and text processing: 2-3 pages.
B2: Not relevant anymore.
C1: Not relevant.

Sentence length
A2: About 10 words per sentence.
B1: About 15 words per sentence.
B2: Not relevant anymore.
C1: Not relevant.

Information density
A2: Text is highly redundant. In general one principal idea per sentence.
B1: Text is redundant. Mostly one principal idea per sentence. One paragraph contains one unit of information. Not too much information at one time.
B2: A part of the sentences has more than one principal idea. The text is not necessarily redundant.
C1: Information density is high.

Sentence structure
A2: Many simple sentences with simple conjunctions. Few passive sentences. Principal idea at the start of the sentence.
B1: Sentences contain more adjuncts or are compound/complex sentences. More active than passive sentences. Principal idea sometimes at the end of the sentence. Limited number of relative clauses.
B2: Complex, compound sentences with embedded clauses occur. A large number of passives can occur.
C1: A large number of complex, compound sentences with low-frequency conjunctions can occur. A large number of passives can occur.

In the Referentiekader Doorlopende Leerlijnen Taal en Rekenen, the differences between two levels are not always very clear. Table 3 gives a translated example from the framework.

Table 3
An example from the Referentiekader Doorlopende Leerlijnen Taal en Rekenen (Expertgroep Taal en Rekenen, 2009) (own translation)

Text characteristics
Level 2F: The structure of the texts is clear. Connections in the texts are given explicitly. The information density is generally low and the texts are not too long.
Level 3F: The texts are fairly complex, but have a clear composition which can be explicated by the use of headers. The information density may be high.

Table 3 shows a part of the framework for reading informative texts. The description for level 2F says that the structure of the text is clear, but the framework does not describe what is meant by 'structure'. The description for level 3F says that 'the information density may be high'. Such descriptions are rather vague, and it is difficult to decide what the level of a text is on the basis of descriptions like these. This is one of the remarks made in a report that tried to make the levels of reference more concrete (Meestringa et al., 2010). To make frameworks more applicable, it would be helpful if they described some more 'measurable' characteristics, as the Raamwerk Nederlands does (see table 2), in addition to the more abstract characteristics as in table 3. Measurable characteristics are available through text research. An example of such an investigation is the study of Andringa & Hacquebord (2000).

2.3 Text research for appropriate measures for text difficulty

In that study, texts from schoolbooks were analyzed on three linguistic measures: average word length (AWL), average sentence length (ASL) and the percentage of words covered by the basic word list of de Kleijn & Nieuwborg (1991) (COV). This analysis yielded a definition of text levels on the three measures. The results are given in table 4, in which Bavo 1 corresponds to texts from year 1 of VMBO, Bavo 2 to texts from year 1 of senior general secondary education (HAVO) and university preparatory education (VWO), and Bavo 3 to the most difficult texts from year 1 and texts from year 2 of HAVO/VWO and years 2 and 3 of VMBO.

Table 4
Definition of text levels on three linguistic measures: average word length (AWL), average sentence length (ASL) and percentage of coverage of basic words (COV) (taken from Andringa & Hacquebord, 2000)

Text level   AWL         ASL           COV           Source level
1            < 4.7       < 10.0        > 88.5        Grade 5
2            4.7 - 4.8   10.0 - 11.5   86.0 - 88.5   Grade 6
3            4.9 - 5.0   11.6 - 13.0   83.5 - 85.9   Bavo 1
4            5.1 - 5.2   13.1 - 14.5   81.0 - 83.4   Bavo 2
5            > 5.2       > 14.6        < 81.0        Bavo 3
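To illustrate how cut-off points like those in table 4 can be applied to a new text, a small sketch in Python is given below. The function names and the combination rule (taking the most difficult of the three per-measure indications) are illustrative assumptions and not part of Andringa & Hacquebord's procedure.

```python
def level_awl(awl):
    """Text level indicated by average word length (cut-offs from table 4)."""
    if awl < 4.7:
        return 1
    if awl <= 4.8:
        return 2
    if awl <= 5.0:
        return 3
    if awl <= 5.2:
        return 4
    return 5

def level_asl(asl):
    """Text level indicated by average sentence length (cut-offs from table 4)."""
    if asl < 10.0:
        return 1
    if asl <= 11.5:
        return 2
    if asl <= 13.0:
        return 3
    if asl <= 14.5:
        return 4
    return 5

def level_cov(cov):
    """Text level indicated by percentage of coverage of basic words (table 4)."""
    if cov > 88.5:
        return 1
    if cov >= 86.0:
        return 2
    if cov >= 83.5:
        return 3
    if cov >= 81.0:
        return 4
    return 5

def text_level(awl, asl, cov):
    # Illustrative combination rule (assumption): take the most difficult
    # of the three per-measure level indications.
    return max(level_awl(awl), level_asl(asl), level_cov(cov))

# A text with AWL 5.1, ASL 13.8 and 82.0% coverage falls in level 4 on all three measures.
print(text_level(5.1, 13.8, 82.0))  # -> 4
```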


A text order was created based on the findings of Andringa & Hacquebord (2000). What is needed now is an investigation that links this text order to the levels from the framework of reference that was recently published in the Netherlands (Expertgroep Taal en Rekenen, 2009). This will give more insight into how more empirical measures can be used in the framework of reference. In addition, it will give more insight into the relationship between the scores on the text comprehension test of Hacquebord & Andringa (2000), the reading comprehension scores on a text comprehension test called Diataal (Hacquebord et al., 2006), and the framework of reference Referentiekader Doorlopende Leerlijnen Taal. In chapter 3 the text research underlying the comprehension test Diataal and the relation of this test to different frameworks of reference are discussed in more detail. In the study of Andringa & Hacquebord (2000) text characteristics were used for the prediction of comprehension difficulty. This study can therefore be located in the field of readability studies. The goal of many readability studies is to find text characteristics that can predict how difficult a text is.

2.4 Readability studies

The focus of many studies on reading comprehension is on the reader. Several researchers have investigated how factors like motivation, background, attitude and reading frequency influence reading comprehension (e.g. Tellegen & Lampe, 2000; Land, 2009). Another part of the research on reading comprehension focuses on text characteristics: a text that is too difficult for a reader may cause reading comprehension problems. Many studies have investigated what it is that makes a text difficult and how the difficulty of a text can be measured. Such studies are called 'readability studies'. The outline of readability studies is generally as follows: for a sample of texts the reading ability necessary (the readability) is determined. Then the different characteristics of the texts are compared to each other, and with a technique of analysis one determines which characteristics may explain the differences in readability. The result of this analysis is a so-called readability formula. With the outcome of such a formula, a readability index, the readability of all texts in a population can be determined (Staphorsius & Krom, 2008).

One of the oldest and best-known readability formulas is the Reading Ease formula of Flesch (1948), in which the number of syllables per hundred words (LGHW) and the average sentence length in words (GZW) had to be filled in:

(1) R.E. = 206.835 - 0.846 x LGHW - 1.015 x GZW

The result of the Reading Ease (R.E.) formula in (1) was the predicted reading ease of the text that had been analysed: the lower the R.E. score, the more difficult the text. In the 1960s two applications of this formula were made for the Dutch language (Douma, 1960; Brouwer, 1963). Staphorsius (1994) constructed a series of reading ability tests with a domain-referenced interpretation. To that end, he investigated a large number of variables that have been used in readability research; his study is discussed further later in this thesis. Recently, readability research has been given a new impulse by computational linguistics (Collins-Thompson & Callan, 2005; vor der Brück et al., 2008). In this type of research, new predictors such as statistical language models are often used to search for the relation between text characteristics and reading comprehension (Kraf & Pander Maat, 2009).
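For illustration, a direct implementation of (1) is sketched below in Python. It assumes that word, sentence and syllable counts are already available; counting syllables reliably is a separate problem that is not addressed here.

```python
def reading_ease(n_words, n_sentences, n_syllables):
    """Flesch Reading Ease score, following formula (1).

    LGHW = number of syllables per hundred words,
    GZW  = average sentence length in words.
    """
    lghw = 100.0 * n_syllables / n_words
    gzw = n_words / n_sentences
    return 206.835 - 0.846 * lghw - 1.015 * gzw

# Example: a passage of 150 words, 12 sentences and 210 syllables
# gives LGHW = 140, GZW = 12.5 and an R.E. score of about 75.7.
print(round(reading_ease(150, 12, 210), 1))
```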

2.5 Are readability studies useful?

Readability studies have also been criticized, for example for the fact that they underestimate the variance of reading scores by working with average scores (Anderson & Davison, 1988; Kraf & Pander Maat, 2009).

Kraf & Pander Maat (2009) argue that there are nevertheless reasons to go on with readability research. They think it is possible to do better studies by using predictors that have a causal relation with the comprehensibility of texts. Moreover, they argue that with such predictors it will be easier to diagnose comprehension problems in a text. Besides that, Kraf & Pander Maat (2009) argue for the use of text-structural predictors. In this thesis some text-structural predictors are used as well. An overview of linguistic measures that have been used in the literature is given in the next paragraph.

2.6 Readability measures

To measure the readability of a text, many different text characteristics can be used. In readability studies, the text characteristics that may influence the readability of the text are the predictors or independent variables. There are lexical characteristics, syntactic characteristics and global text characteristics that can all make a text difficult in their own way. In several readability studies (e.g. Staphorsius, 1994; Vogel & Washburne, 1928) a large number of characteristics was investigated first. Subsequently, a multiple regression analysis was used to determine which characteristics predicted the readability of texts best. In that analysis the weights of the predictors were determined in such a way that there was a maximal correlation between the readability predicted by the predictors and the empirically determined readability. With a multiple regression analysis the readability of a text can thus be predicted from the values of the text characteristics only (Staphorsius, 1994): no other factors play a role in that analysis. Staphorsius (1994) has given an overview of text characteristics that have been used as predictors in traditional readability studies. He divided the predictors into measures and indices for lexical complexity and measures and indices for syntactic complexity. However, in other studies characteristics were used that cannot be assigned to either lexical or syntactic complexity. Therefore a distinction is made in this study between lexical complexity, syntactic complexity and global text complexity.

2.6.1 Lexical complexity

A well-known predictor of lexical complexity that is used in many studies is word familiarity. In the United States 'The Teachers Word Book' (Thorndike, 1921) is often used as a measure of this characteristic. This word book was the first that made it possible to distinguish between known and unknown words. For the Netherlands a basic word list has been compiled (de Kleijn & Nieuwborg, 1991), in which the 2,000 most frequent or most elementary words of the Dutch language are recorded. When word lists are used in readability research, the assumption is that texts are lexically more complex when they contain more infrequent words (see Staphorsius, 1994).

Another well-known predictor that is often used to measure lexical complexity is the type/token ratio, which measures the diversity of the lexicon. The type/token ratio (TTR) indicates the relation between the total number of words (tokens) and the number of different words (types). In one of the earliest readability studies the number of types already turned out to be a good readability predictor (Vogel & Washburne, 1928). Experimental support for the TTR can be found in Kintsch et al. (1975), who showed that a text with the same number of propositions but more different concepts (types) was read more slowly and remembered less well. However, as Van Hout & Vermeer (2007) point out, if there were a linear relationship between types and tokens, the TTR would not increase in productive language when the language learner acquired a larger lexicon. In general, the number of types increases much more slowly than the number of tokens: language learners tend to use more words and longer sentences, but when they do so, they usually use more of the same words. For reading this means that texts with a higher level or with more words do not necessarily contain more types than texts with a lower level or with fewer words. However, the study of Van Hout & Vermeer (2007) was focused on productive language. For identifying a text level, the difference between types and tokens may be less problematic, because a text with more different words (types) is probably more difficult to read than a text with a low TTR. For reading, the language user needs a larger receptive lexicon to read lexically rich texts than to read texts that contain fewer different words. A simple TTR is therefore possibly more appropriate as a readability measure than as a measure of productive language development.

Research has shown that there is an effect of word length for beginning readers, but that this effect disappears when readers are more experienced (Kraf & Pander Maat, 2009). Word length is nevertheless taken as a predictor in many readability studies. Other lexical characteristics have been used in the literature as well.

2.6.2 Syntactic complexity

In readability research syntactic complexity is operationalized with measures for the length and density of syntactic structures, for example the number of words per sentence. A sentence is considered to be the syntactic structure between one full stop followed by a space and the next full stop followed by a space and a capital letter. Average sentence length in words is used by many researchers as a measure of syntactic complexity (Andringa & Hacquebord, 2000; Bormuth, 1969; Flesch, 1948; see Staphorsius, 1994).
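As a rough illustration of this operationalisation, the Python sketch below splits a text at a full stop followed by whitespace and a capital letter and then computes the average sentence length in words. Real implementations have to deal with abbreviations, numbers and other punctuation, which are ignored here.

```python
import re

def average_sentence_length(text):
    """Average number of words per sentence (ASL).

    A sentence boundary is approximated as a full stop followed by
    whitespace and a capital letter, as in the definition above.
    """
    sentences = [s for s in re.split(r'\.\s+(?=[A-Z])', text) if s.strip()]
    word_counts = [len(s.split()) for s in sentences]
    return sum(word_counts) / len(word_counts)

text = ("De hond rent door het park. Hij ziet een bal liggen. "
        "Daarna rent hij er snel naartoe.")
print(round(average_sentence_length(text), 2))  # three sentences of 6, 5 and 6 words -> 5.67
```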

Other syntactic complexity measures are the number of sentences in a text; the number of simple sentences in a text, where a simple sentence is a sentence with only one finite verb; the number of compound sentences, which contain at least two finite verbs; the number of clauses, which are parts of sentences that belong to the sentence but have their own subject and predicate; and the number of 'tang' constructions (in Dutch: tangconstructies). A tang construction is a construction of words or word groups that could follow each other directly, but between which other words are placed (Staphorsius, 1994). An example of a tang construction, which occurs often in Dutch, is given in (I).

(I) Hij heeft gisteren voor het laatst zijn hond gevoerd
    He has yesterday for the last time his dog fed
    'Yesterday he fed his dog for the last time'

The words heeft and gevoerd are the two parts of the tang: they belong together and could follow each other immediately, but some words are placed in between.

2.6.3 Global text complexity

Besides the lexical and syntactic complexity measures discussed above, there are global text characteristics that can make a text difficult. These are characteristics that make a text cohesive and coherent. Researchers have used different predictors to operationalize these global characteristics, such as the density of propositions per word (Kintsch & Vipond, 1979), the number of words per proposition (Miller & Kintsch, 1980), the number of inferences necessary (Kintsch & Vipond, 1979; Miller & Kintsch, 1980), the number of connectives, and the presence of the principal idea in the title of the text (Land, 2009).

Land (2009) contrasted two hypotheses about the role of text structure. The hypothesis of textual coherence predicts that it is easier for a reader to make a text representation when it is clear how different text parts are related to each other; making coherence relations between sentences and text parts is crucial in this hypothesis. The hypothesis of minimal cognitive load predicts that short sentences without complex structure signals are necessary for a good text representation. According to this hypothesis, fragmented texts are easier to read and comprehend than integrated texts with structure signals like connectives. The results of Land's study showed that many books for prevocational education contain fragmented texts, whereas students comprehend integrated texts with structure signals better (Land, 2009). This confirms that structure signals are important for text comprehension.

2.7 Statement of purpose

By now it will be clear that there are many variables that may influence text difficulty and possibly also text comprehension. It is important to know what it is that makes a text difficult for readers, and a lot of research into different text characteristics has been done to make it possible to predict text difficulty. What is also interesting about a readability study is that it can be used as a validation study for tests or programs based on text characteristics. In such a validation study one searches other texts for the text characteristics that were used for setting up the test or program. Moreover, one tries to find new, additional text characteristics that can contribute to the prediction of text difficulty.

In this thesis it is investigated to what extent reading comprehension scores can be predicted by a program based on linguistic measures. To that end, it is examined how linguistic text characteristics correlate with each other and how they relate to the reading scores of students. The main questions of this thesis are:

1. How strong is the relationship between the different linguistic text characteristics?
2. To what extent do the linguistic measures predict a) the reading comprehension scores of students and b) the expert judgments of text difficulty?
3. To what extent do the linguistic measures contribute to the prediction of a) the reading comprehension scores of students and b) the expert judgments of text difficulty, in addition to known predictors, and what is the best model?

3. Method

In this chapter it is described how this study has been conducted. First it is described which text characteristics were used in the study. Then the text corpus and the test that is used in the study are discussed. Finally the way in which the materials were analyzed is explained.

3.1 Text characteristics

As described in chapter 2, Staphorsius (1994) investigated the readability or comprehensibility of texts in order to predict readability from the semantic and syntactic characteristics of text passages. To that end, he described many different text characteristics that can be used in a readability study. Some of the characteristics he described, as well as some other measures, were used in this study. An important criterion for the selection of the characteristics was the group to which the measures belong, as described in chapter 2. The groups represent levels of text analysis at the micro, meso and macro level (Andringa & Hacquebord, 2000; Prenger, 2001): micro is the word level, meso the sentence level and macro the text level, so that lexical complexity represents the micro level, syntactic complexity the meso level and global text complexity the macro level. At least two measures were selected for each group. Besides that, the measurability of the characteristics was an important criterion. The characteristics that were used in this study are given in table 5 below.

Table 5
Text characteristics used in this study

Lexical complexity
- Average word length (AWL): the total number of characters divided by the number of words.
- Percentage of coverage of basic words (COV): the percentage of words in the text that is covered by the words in the word list of de Kleijn & Nieuwborg (1991).
- Type/token ratio (TTR): the number of different word lemmas (types) divided by the total number of words (tokens).

Syntactic complexity
- Average sentence length (ASL): the number of words divided by the number of sentences. A sentence is considered to be the syntactic structure between a dot-space and a dot-space-capital.
- Average number of clauses per text (ANC): the number of clauses divided by the number of words.
- Dependency length subject-verb (DLSV): the distance between the finite verb and the subject.
- Dependency length object-verb (DLOV): the distance between the finite verb and the object.

Global text complexity
- Percentage of connectives (CON): the number of connectives divided by the total number of words.
- Text length (TL): the total number of words in a text.

3.1.1 Lexical complexity

As the table shows, there are three measures for lexical complexity. The first lexical measure is average word length (AWL). This is the total number of characters divided by the number of words in the text. The second lexical complexity measure is the percentage of coverage of basic words (COV). As described in chapter 2, this is the percentage of words in a text that occurs in the basic word list of de Kleijn & Nieuwborg (1991). Both AWL and COV are measures taken from the study of Andringa & Hacquebord (2000). The last measure for lexical complexity, the type/token ratio (TTR), is an addition to these two lexical measures and has been used in many other studies. For this ratio the number of different words (types) is divided by the total number of words (tokens); in this study types are taken as different lemmas. For this measure it is important that the text length in words is about the same for all texts. Therefore samples of the texts were taken: each text with more than 150 words was shortened and titles and subtitles were removed. The texts were cut at the end of the sentence in which the 150th word occurred. If a text consisted of fewer than 500 words, the first 150 words of the text were taken. If a text consisted of more than 500 words, the first 150 words from the second paragraph onwards were taken, because the first paragraph is usually an introductory one in which many of the same types are used. According to some researchers TTR is more a measure of information density than of lexical complexity (Kraf & Pander Maat, 2009), but in this study it is used as a measure of lexical complexity.
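The lexical measures can be computed along the lines of the Python sketch below. The tokenisation, the tiny placeholder basic word list and the sampling rule are simplified assumptions; the study itself uses lemmas and the full de Kleijn & Nieuwborg (1991) list, and cuts samples at a sentence boundary.

```python
def tokenize(text):
    """Very simple tokeniser: lowercased words with surrounding punctuation stripped."""
    words = (w.strip('.,:;?!()"\'').lower() for w in text.split())
    return [w for w in words if w]

def sample_150(tokens):
    """Illustrative sample: the first 150 tokens of the text."""
    return tokens[:150]

def awl(tokens):
    """Average word length: total number of characters divided by the number of words."""
    return sum(len(w) for w in tokens) / len(tokens)

def cov(tokens, basic_words):
    """Percentage of tokens covered by the basic word list (COV)."""
    return 100.0 * sum(w in basic_words for w in tokens) / len(tokens)

def ttr(tokens):
    """Type/token ratio: number of different words divided by the total number of words."""
    return len(set(tokens)) / len(tokens)

# Placeholder basic word list; the real list contains about 2,000 items.
BASIC_WORDS = {"de", "het", "een", "hond", "rent", "door", "park"}

tokens = sample_150(tokenize("De hond rent door het grote stadspark."))
print(round(awl(tokens), 2), round(cov(tokens, BASIC_WORDS), 1), round(ttr(tokens), 2))
```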

3.1.2 Syntactic complexity

The dependency length predictors are measures of tang width. Dependency lengths are the factor that makes tang constructions difficult: more words between the subject or object and the verb means that the sentence is syntactically more complex, which is possibly more difficult to read and comprehend than when the subject or object and the verb stand next to each other in the sentence. Because ANC and the dependency length measures are computed with the same program as TTR, text samples are used instead of complete texts; the dependency lengths in the samples are taken as a measure of the dependency lengths in the text. The program counted the distances, that is the number of words, between the subject and the finite verb in the samples, and the distances between the object and the finite verb. These are two measures that are suitable for measuring tang width.
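Once a parser has identified the positions of the subject, the object and the finite verb in a sentence, these distance measures reduce to simple index arithmetic, as in the Python sketch below. The parse output used here is a made-up stand-in and not the actual output format of T-Scan or Alpino.

```python
def dependency_length(pos_a, pos_b):
    """Number of words between two word positions (0-based indices)."""
    return abs(pos_a - pos_b) - 1

# 'Hij heeft gisteren voor het laatst zijn hond gevoerd'
#   0    1       2      3    4    5     6    7      8
# Hypothetical parser output: word positions of subject, finite verb and object.
subject_idx, finite_verb_idx, object_idx = 0, 1, 7

dlsv = dependency_length(subject_idx, finite_verb_idx)  # 0 words between 'Hij' and 'heeft'
dlov = dependency_length(object_idx, finite_verb_idx)   # 5 words between 'heeft' and 'hond'
print(dlsv, dlov)
```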

3.1.3 Global text complexity

To measure global text complexity, two measures are used in this study. The first is the percentage of connectives (CON). This is the number of connectives in a text sample divided by the total number of words in the sample. The connectives were counted manually: each text sample was read twice and all connectives were counted. Several things were taken into account. Connectives consisting of two parts, e.g. niet alleen .. maar ook .. (not only .. but also ..), were counted as two connectives. Further, the connectives had to be words: colons, semicolons, dashes and other typographical means were not counted as connectives. Finally, only connectives that are given in the overview in Pander Maat (2002) were counted. The number of connectives in a text is an important measure of the presence of explicit structure signals, and as described in chapter 2, explicit structure signals help the reader to grasp the coherence of a text. The second characteristic for global text complexity is total text length (TL). This is the total number of words in a text, which was counted in Microsoft Word. This measure was included because long texts are probably more difficult to read than shorter texts.
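Although the counting was done by hand in this study, the same measure could be approximated automatically with a fixed list of connectives, as in the sketch below. The short list used here is a placeholder and not the Pander Maat (2002) inventory.

```python
# Placeholder list of Dutch connectives (the study uses the overview in Pander Maat, 2002).
CONNECTIVES = {"en", "maar", "omdat", "want", "dus", "hoewel", "daarom", "terwijl"}

def percentage_connectives(text):
    """Number of connectives divided by the total number of words, as a percentage (CON)."""
    tokens = [w.strip('.,;:!?').lower() for w in text.split()]
    hits = sum(t in CONNECTIVES for t in tokens)
    return 100.0 * hits / len(tokens)

sample = "Hij bleef thuis omdat het regende, maar hij verveelde zich niet."
print(round(percentage_connectives(sample), 1))  # 2 connectives in 11 words -> 18.2
```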


3.2 Text corpus

An overview of the text corpus is given in Appendix A. The corpus consists of 40 texts: 21 texts that are used for Diataal (sub corpus A) and 19 texts that were selected from the internet (sub corpus B). The texts from the internet were originally selected for a report that was meant to give teachers an impression of how difficult or easy different texts can be. Criteria for selecting these texts were that they were informative, that their topic could be linked to the courses chosen for the report (economics, history, social studies, science, biology, physics, chemistry, mathematics and Dutch language), and that they were good examples of one of the levels of the Referentiekader Doorlopende Leerlijnen Taal. For each of the levels of the Referentiekader Doorlopende Leerlijnen Taal there are a few texts in the corpus. The texts of Diataal were selected by the research team of that test. The research underlying these texts and the Diataal test is described further in the next paragraph.

3.3 The underlying research for Diataal

The research underlying the selection of the texts for Diataal has been described in chapter 2. The study of Andringa & Hacquebord (2000) yielded a definition of text levels based on the linguistic measures average word length, average sentence length and percentage of coverage of basic words. In their study the ranking according to the linguistic measures was compared to the ranking according to text source. After that, the text order according to the reading comprehension scores of students was compared to the text orders according to the three linguistic measures. That analysis showed that only percentage of coverage was a good predictor of reading score. Per text, the average reading score of students, the so-called theta score, was compared to the linguistic measures described in table 5 above. This gives more insight into what makes a text difficult, and probably into what is difficult in texts for students. This study can therefore be called a validation study.


The outcome of the Diataal test is a BLN score. This score indicates the reading comprehension level of the student on a developmental scale. BLN scores are embedded in a scale of text levels, so the score refers both to a reading comprehension level of the student and to a text level. The BLN scores are related to different frameworks of reference. For example, a BLN score of 51 corresponds to grade 6 of primary education or the first year of vmbo (Hacquebord & Sanders, 2010), which is comparable to level 1F of the Referentiekader Doorlopende Leerlijnen Taal. When it is clear how the linguistic measures are related to the different levels in the frameworks of reference, it may be possible to indicate the level of the student according to these frameworks with tests like Diataal.

3.4 Procedure


Table 6
Words that were substituted for other words

Number of letters   Substituted word
2                   En
3                   Het
4                   Vier
5                   Reken
6                   Vallen
7                   Vaasjes
8                   Vakantie
9                   Vakanties
10                  Uitvoering
11                  Uitvoerigst
12                  Universiteit
13                  Uitgenodigden
14                  Universiteiten
15                  Tegenstellingen
16                  Vanzelfsprekende
17                  Waarschijnlijkere
18                  Verantwoordelijker
19                  Gemeenschappelijker
20                  Gemeenschappelijkste

T-Scan, a computer program developed in Utrecht (Kraf & Pander Maat, 2009), is an application that extracts a large number of characteristics from a text. The program works as follows: a user can load texts into the program and select measures from a list. The texts are then annotated with the language technology software Tadpole (van den Bosch et al., 2007) and the sentences are parsed with the Alpino parser from the Rijksuniversiteit Groningen (Malouf & van Noord, 2004). Furthermore, data from a lexical database are used (Kraf & Pander Maat, 2009). In that way a large number of predictors is available. For this study TTR, ANC, DLSV and DLOV were calculated with T-Scan. One of the texts, text 24, could not be read by T-Scan; this text is therefore not included in the analyses with T-Scan.


3.5 Design and Analyses

4. Results

The linguistic characteristics of the texts were collected and analyzed as described in chapter 3. The characteristics were first compared with each other to make clear how they correlate, and to investigate whether there is a relationship between the different measures and a) the measured reading scores of students and b) the levels from the framework of reference assigned by experts. Second, it was investigated whether the reading scores and the levels of reference can be predicted from the linguistic measures and, if so, which predictor is the strongest. For each comparison or analysis, first the results for the texts in sub corpus A are given, the texts with reading comprehension scores (COMS), and second the results for the texts in sub corpus B, the texts with levels from the framework of reference Doorlopende Leerlijnen Taal (Expertgroep Taal en Rekenen, 2009) (LEX).

4.1 Descriptives

Means and standard deviations of the text characteristics, the comprehension scores and the levels from the framework of reference are given in table 7.

Table 7

Means and standard deviations for all text characteristics, the reading comprehension scores and the levels from the framework of reference for the whole corpus. Explanations of the text characteristics are given in table 5.

Text Characteristic N Mean SD


For the comparison of the different characteristics of the texts Spearman’s rank correlations were calculated. The correlations are shown in table 8. There are significant correlations between TL and ASL (.562), TL and AWL (.403), TL and DLSV (.337), ASL and AWL (.606), ASL and COV (-.451), ASL and CON (-.381), ASL and DLSV (.546), AWL and COV (-.714), AWL and DLSV (.369), COV and CON (.347), CON and DLSV (-.324), CON and DLOV (-.318), and between ANC and DLSV (.342).

Table 8
Spearman's correlations between all linguistic measures for all texts

        ASL      AWL      COV       CON      ANC      DLSV     DLOV     TTR
TL      .562**   .403**   -.264     -.075    .242     .337*    .114     .164
ASL              .606**   -.451**   -.381*   .282     .546**   .231     .231
AWL                       -.714**   .179     .140     .369*    .036     .160
COV                                 .347*    -.068    -.131    -.127    -.269
CON                                          .050     -.324*   -.318*   -.019
ANC                                                   .342*    .096     -.043
DLSV                                                           .137     .182
DLOV                                                                    .276

* p < .05. ** p < .01.

The correlations between the characteristics indicate that there is a relation between some of the measures. A positive correlation means that if one measure increases, the other measure increases as well; a negative correlation means that if one measure increases, the other measure decreases. For example, the positive correlation between ASL and AWL indicates that a text with on average longer sentences also has on average longer words. The negative correlation between AWL and COV indicates that in a text with on average longer words, the percentage of coverage is lower (and thus there are more unknown words). The three 'known' predictors ASL, AWL and COV correlate with each other. Of the new predictors, CON and DLSV correlate with some other measures, but these correlations are not very high. A regression analysis was used to investigate whether these relations are causal; this is described in paragraph 4.2.
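For reference, a rank correlation of this kind can be computed with standard statistical software; the SciPy sketch below uses made-up values for two measures and is not based on the thesis data.

```python
from scipy.stats import spearmanr

# Hypothetical ASL and AWL values for five texts (illustrative, not the thesis data).
asl = [10.2, 11.8, 12.5, 14.1, 15.3]
awl = [4.7, 4.9, 5.0, 5.1, 5.3]

rho, p = spearmanr(asl, awl)
print(f"Spearman's rho = {rho:.3f}, p = {p:.3f}")
```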

4.1.1 Reading comprehension scores

Means and standard deviations for sub corpus A are given in table 9. For this sub corpus there are fewer significant internal correlations than for the whole corpus; for example, there is no longer a significant correlation between ASL and AWL. Also, DLSV no longer correlates with as many other characteristics. Subsequently, the text characteristics were compared to the reading comprehension scores of students (COMS). Spearman's correlations between the text characteristics and the reading comprehension scores were calculated and are shown in table 10. There are significant correlations with COMS for AWL (.753), COV (-.693) and TTR (.463). Graphical representations of these relations are shown in scatter plots (figures 1, 2 and 3) in appendix B.

Table 9
Means and standard deviations for all text characteristics for sub corpus A

        N    Mean     SD
TL      21   220.71   31.21
ASL     21   11.69    2.00
AWL     21   4.99     0.31
COV     21   86.18    5.69
CON     21   3.82     1.64
ANC     20   1.35     0.22
DLSV    20   2.16     0.58
DLOV    20   2.42     0.92
TTR     20   0.61     0.05

Table 10
Spearman's correlations between all linguistic measures and reading scores for sub corpus A

        ASL     AWL     COV       CON     ANC     DLSV     DLOV     TTR       COMS
TL      .194    -.030   .029      .178    .180    .265     -.043    -.075     .152
ASL             .427    -.462**   -.341   .487*   .662**   .419     .129      .422
AWL                     -.767**   -.269   .138    .304     .031     .421      .753**
COV                               .462*   .014    -.340    -.142    -.641**   -.693**
CON                                       .248    -.332    -.415    -.283     -.404
ANC                                               .352     .186     .148      .020
DLSV                                                       .466*    .009      .266
DLOV                                                                .202      -.053
TTR                                                                           .463*

* p < .05. ** p < .01.


4.1.2 Levels from the framework of reference

For the sub corpus consisting of the texts from the Internet, sub corpus B, means and standard deviations are given in table 11. The internal correlations between the text characteristics and the correlations between the linguistic measures and the levels from the framework of reference Doorlopende Leerlijnen Taal assigned by experts (LEX) were calculated. These correlations are given in table 12. The table shows that the significant correlations for this sub corpus differ from those for the whole corpus. For example, for sub corpus B there is no correlation between ASL and AWL or between COV and ASL, but there is a correlation between TTR and TL. A comparison with the correlations for sub corpus A shows that both sub corpora contain only a few internal correlations and that there are some differences between the corpora. Furthermore, table 12 shows that LEX correlates significantly with ASL (.713), DLSV (.493) and TTR (.490). These relations are shown graphically in scatter plots (figures 4, 5 and 6) in appendix B.

Table 11
Means and standard deviations for all text characteristics for sub corpus B

        N    Mean     SD
TL      19   621.06   339.14
ASL     19   14.71    4.28
AWL     19   5.32     0.32
COV     19   81.53    3.54
CON     19   2.79     1.17
ANC     19   1.43     0.30
DLSV    19   2.66     1.21
DLOV    19   2.25     1.15
TTR     19   0.60     0.06

Table 12
Spearman's correlations between all linguistic measures and levels of reference for sub corpus B

        ASL   AWL   COV   CON   ANC   DLSV   DLOV   TTR   LEX


The correlations of some characteristics with LEX indicate possible predictors: ASL, DLSV and TTR are possible predictors of the levels of reference assigned to the texts. ASL is one of the predictors assumed to be known from previous research, but DLSV and TTR are 'new' measures that could possibly add something to the prediction of the levels of reference. This was investigated further with a regression analysis.

4.2 Testing causal relations

The correlations between the linguistic measures and COMS and between the linguistic measures and LEX showed relations of varying strength. With a regression analysis it was investigated whether there is a causal relationship between one or more text characteristics and either the comprehension scores or the levels from the framework of reference.

4.2.1 Reading comprehension scores

First, a multiple regression analysis was carried out with the linguistic measures as independent variables and the reading comprehension scores (COMS) for the sub corpus with texts from Diataal as the dependent variable. In the first analysis all text characteristics were entered with the procedure Enter in SPSS, through which the multiple correlation coefficient reaches its maximum value. The 9 characteristics together produced a multiple correlation coefficient of R = .889, R2 = .79 (sig. p < .05). That means that 79.1% of the variance in the reading comprehension scores can be explained by the text characteristics. For one characteristic, AWL, there is a significant regression coefficient: beta = 0.731, t = 3.304, p < 0.01.
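The same kind of model can be fitted outside SPSS. The Python sketch below uses statsmodels to regress COMS on the three known predictors for a hypothetical data frame; the column names and values are assumptions for illustration only.

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical data: one row per text, with predictor values and a comprehension score.
df = pd.DataFrame({
    "AWL":  [4.7, 4.9, 5.0, 5.2, 5.4, 4.8],
    "ASL":  [10.1, 11.3, 12.0, 13.4, 14.8, 10.9],
    "COV":  [89.0, 87.2, 85.5, 83.1, 80.4, 88.0],
    "COMS": [0.42, 0.30, 0.18, 0.05, -0.10, 0.35],
})

X = sm.add_constant(df[["AWL", "ASL", "COV"]])  # all predictors entered at once ('Enter')
model = sm.OLS(df["COMS"], X).fit()

print(model.rsquared)   # proportion of variance in COMS explained (R squared)
print(model.params)     # unstandardised regression coefficients (B)
```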

Next, a hierarchical regression analysis was carried out in which ASL, AWL and COV were entered first and the other predictors were added in a second step according to the backward method, because with the backward method the chance that predictors are excluded by suppressor effects is smaller than with the forward or stepwise method (Field, 2005). The results of this analysis are given in table 13.

Table 13
Results of the multiple regression analysis between text characteristics and COMS

            B        SE B     Beta
Constant    -1.372   0.878
ASL         -0.007   0.014    -0.086
AWL         0.394    0.108    0.732*
COV         -0.006   0.005    -0.218

R2 = .721 (p < .01). * p < .01.
Note: In step 2 no predictors were added to the model; therefore only step 1 is shown.

The table shows that no text characteristics other than the three known from previous research were included in the analysis. Possibly this is caused by the correlations that some characteristics have with one of these measures. Therefore a weighted compound measure of AWL, ASL and COV was used in the next analysis. This measure, called DIFDIA, is calculated in the program Textscreen by averaging the levels of difficulty that the program calculated for the different education types. Those levels were based on the measures AWL, ASL and COV.

A multiple regression analysis was carried out with the combined characteristic and COMS. Instead of AWL, ASL and COV, the measure DIFDIA was entered in step 1 of the model. In step 2 the other 5 measures were entered into the model according to the backward method. The results are shown in table 14. Again, none of the 5 measures was included in the regression model in step 2. That means that none of the 'new' characteristics in this study adds anything to the prediction of students' reading comprehension scores.

Table 14
Results of the regression analysis between DIFDIA and COMS

            B        SE B     Beta
Constant    -0.290   0.066
DIFDIA      0.089    0.021    0.714*

R2 = .510 (p < .01). * p < .01.
Note: No measures were added to the model in step 2; therefore only step 1 is shown.

The standardized coefficient of DIFDIA (0.714) is comparable to that of AWL (0.732). R2 is larger for the three characteristics together (.721) than for the weighted value DIFDIA (.510). It seems that AWL is the best predictor of the comprehension scores of students used in this study. The other predictors do not contribute enough to the prediction of the comprehension scores to be included in the model. This does not correspond to the results that Staphorsius (1994) and Andringa & Hacquebord (2000) found in their studies. Staphorsius found average word length, average sentence length, the percentage of types and word frequency, which can be compared to coverage, to be important predictors of readability. Andringa & Hacquebord found that, of average sentence length, average word length and percentage of coverage, only coverage had a significant correlation with students' scores, especially for the lower education types. In the present study AWL, COV and TTR correlate significantly with the reading comprehension scores of students, but only AWL is a significant predictor in the regression analysis. That does not mean that COV, ASL and TTR do not predict COMS, but it is clear that they do not contribute enough to the prediction in addition to AWL.

4.2.2 Levels from the framework of reference

For the sub corpus with texts from the Internet, for which levels from the framework of reference Doorlopende Leerlijnen Taal were assigned by experts, an analysis was first done with the Enter method in SPSS in which all characteristics were added to a multiple regression model with LEX as the dependent variable. In that analysis it was investigated whether the different text characteristics could predict the level of reference of the text. The 9 characteristics together produced a multiple correlation coefficient of R = .898, R² = .806 (sig., p < .05), which means that 80.6% of the variance in LEX can be explained by the text characteristics. The analysis showed no significant regression coefficients.

Next, a multiple regression analysis was done with only ASL, AWL and COV as predictors of LEX. The results are shown in table 15.

Table 15

Multiple regression analysis between text characteristics and LEX

             B        SE B     Beta
Constant    -1.043    5.552
ASL          0.090    0.032    0.526*
AWL          0.744    0.518    0.325
COV         -0.017    0.044   -0.085

R² = .526 (p < .01). * p < .05.

The multiple R is .725, R² = .526 (sig., p < .01). That means 52.6% of the variance in LEX can be explained by the combination of ASL, AWL and COV. As expected from the correlations, ASL is a significant predictor in this analysis. Although R² is not very high for the regression of LEX on the three predictors, the three predictors were fixed in the next analysis. In that way a better comparison could be made between the two sub corpora.

A hierarchical multiple regression analysis was done in which ASL, AWL and COV were first entered as fixed factors in the model (method Enter) and the other predictors were all added and/or removed in the second step (method backward). The results showed that in the second step no predictors were added to the model, which means that no other predictors could contribute to the prediction of LEX. When the three predictors were substituted by the weighted predictor DIFDIA, the other predictors did not contribute to the prediction either.
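Expressed with the hierarchical_backward sketch from section 4.2.1, this analysis for sub corpus B only changes the data and the criterion (data_b would be a hypothetical table with the same columns for the Internet texts, including the expert level LEX):

# ASL, AWL and COV stay fixed; the other measures are eliminated backward.
fit_lex = hierarchical_backward(data_b, ["ASL", "AWL", "COV"],
                                [c for c in predictors if c not in ("ASL", "AWL", "COV")],
                                criterion="LEX")
print(fit_lex.rsquared)    # proportion of variance in LEX explained by the final model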


5. Discussion

The results of this study show that some text characteristics are interrelated, that a few text characteristics are related to students’ comprehension scores, and that some text characteristics are related to the levels from the framework of reference Doorlopende Leerlijnen Taal assigned by experts. In addition, the results show that the linguistic measures together predict the variance in students’ reading comprehension scores to a certain extent (79.1%). The variance in the levels from the framework of reference can also be predicted from the linguistic measures to a certain extent (80.6%). The regression analyses showed that no linguistic measures besides those already used for Diataal, namely AWL, ASL and COV, contribute to the prediction of either comprehension scores or levels of reference. For the prediction of comprehension scores AWL turned out to be the strongest predictor; of all measures ASL predicted the levels of reference best.

5.1 The prediction of comprehension scores and levels of reference

The results given in chapter 4 show that, for both COMS and LEX, no measures contribute to the known predictors in the prediction of text difficulty. There are two possible explanations for this outcome: 1) ASL, AWL and COV are strong enough on their own to predict the comprehension scores and the levels of reference, or 2) the additional measures used in this study are not good enough to contribute anything to the prediction.

5.1.1 The predictive power of AWL, COV and ASL


Average sentence length has been used in many classic readability formulas (e.g. Flesch, 1948), but a high correlation with the readability criterion was not found in all studies. Andringa & Hacquebord (2000) do, however, state that ASL is a good measure of sentence complexity. The relation between text source and sentence length in their study indicates that sentence length does play a role in text judgments. After all, texts for schoolbooks are often written for, or adapted to, a certain level that is assigned to a text by an expert. Nevertheless, in Andringa & Hacquebord’s study there is no significant relation between expert judgments and average sentence length, whereas in this study there is a relation between ASL and LEX. Moreover, ASL predicts the levels of reference that were given to the texts. Apparently ASL is a good predictor of text judgments, but not of comprehension scores. Reasons for this and other differences between the two readability criteria are discussed in paragraph 5.2.

5.1.2 The contribution of the additional predictors

The percentage of coverage, type-token ratio, the average number of clauses, the dependency lengths, the number of connection words and text length could not contribute to the prediction of text difficulty. As a result, it remains unclear what exactly makes a text difficult. For lexical complexity, only AWL plays a role in the prediction of comprehension scores. The other lexical measures, COV and TTR, do not contribute to the prediction, which means that the difficulty of words and the lexical variation are not strong enough to predict the reading outcome, although TTR correlates significantly with both COMS and LEX.

For syntactic complexity, ANC and the dependency lengths do not play a significant role in the prediction of either criterion. As a result, this study does not answer the question of what makes long sentences difficult. For the prediction of levels of reference ASL is a good predictor, but it is not clear why longer sentences are more difficult. There is, however, a relation between ASL and DLSV, which indicates that longer sentences contain longer dependency lengths between subject and verb. Kraf & Pander Maat (2009) discuss the role of dependency lengths; they found that dependency lengths do not predict text difficulty better than sentence length does.


The number of connection words, a measure of global text complexity, also did not contribute to the prediction; a possible explanation is that the connection words were only counted for the text samples. Perhaps it would have been better to count the connection words in the complete texts, to get more insight into the actual number of explicit connections in a text.

One of the goals of this study was to make a model for the prediction of both COMS and LEX. Since for both criteria only one linguistic measure appeared to be a good predictor, it was not possible to make an appropriate model or formula for the readability of the texts that were used in this study. A possible cause for this inconclusive result is the difference between the two corpora. This is discussed in the next paragraph.

5.2 Differences between the sub corpora

The interconnection between some of the text characteristics shows that different measures play a role when a text is (more) difficult. When the interconnections of measures in sub corpus A and in sub corpus B are compared, it turns out that only a few linguistic measures are interconnected in both corpora: AWL-COV and ASL-DLSV. Also, the measures that are highly related to COMS are not all the same as the measures that are highly related to LEX; in fact, only TTR is significantly related to both. This difference, and the differences in the prediction of text difficulty, imply that in each sub corpus other text characteristics play an important role. These differences may be caused by several factors.

The first factor is text length. The texts in sub corpus B are in general longer than the texts in sub corpus A. Moreover, the differences in text length between the texts are larger in sub corpus B than in sub corpus A. Therefore the influence of text length is larger in sub corpus B than in sub corpus A. That may explain why there were some correlations with text length in sub corpus B but not in sub corpus A. However, even in sub corpus B, where the length of the texts varied, text length could not contribute to the prediction of LEX.


The second factor is the difficulty level of the texts. For the easier texts in sub corpus A the word level appears to be most important, whereas for the more difficult texts in sub corpus B the sentence level and the global text level, the meso and macro levels, are more important for text comprehension. This is in line with the interactive approach to reading development: for beginning readers the reading process is mostly a bottom-up process, while for skilled readers it is more a top-down process (Chall et al., 1990).

The third factor that influences the differences between the corpora is the difference between COMS and LEX. Comprehension scores reflect how well students comprehend a text, whereas the expert judgments reflect the perceived difficulty. A complicating factor is that students may encounter different problems when reading a text than experts do. This is also shown in Andringa & Hacquebord (2000): in their study, different measures were related to teachers’ judgments than to students’ comprehension scores. The test could have played a role in this, since linguistic measures are not necessarily good predictors of test difficulty, especially not at a higher level. In this study the reading comprehension scores of students were based on a comprehension test, just as in the study of Andringa & Hacquebord (2000). This issue is discussed in more detail in chapter 6.

In short, several differences between the corpora are possible causes of the inconclusive results. However, there may be other reasons why this study did not find additional predictors. As described in chapter 2, readability studies are a subject of discussion. In the next paragraph it is discussed how a readability study like this one should be interpreted.

5.3 Readability studies


According to the ‘simple view of reading’ (Gough & Tunmer, 1986; Hoover & Gough, 1990; Kendeou et al., 2009) a reader uses word decoding and linguistic comprehension to read and comprehend a text. Word decoding requires that the reader derives a representation from the printed input, in which lexical complexity might play a role. For linguistic comprehension, sentence and discourse interpretations have to be made; in that part of the process structure signals and global text complexity signals can be helpful. The results of this study suggest that the lexical level is more important in ‘easy’ texts, while in more difficult texts the sentence level is important. For the simple view of reading this implies that the process of word decoding is most important for beginning readers, and that skilled readers rely more and more on linguistic comprehension to read and comprehend a text.
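In its usual formulation the simple view expresses this relation as a product, reading comprehension = decoding × linguistic comprehension (Gough & Tunmer, 1986), so a weakness in either component limits overall comprehension.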

