The influence of lexical frequency and working memory on relative clause processing in Dutch

Thomas Tienkamp - 12772062

Amsterdam Centre for Language and Communication University of Amsterdam

rMA Thesis Linguistics

Supervisors: Dr Evy Visch-Brink, Dr Laura Bos, Dr Ileana Grama Second reader: Dr Monique Flecken

Date: 02-07-2021

Words: 14.138 (excluding abstract, references, and appendices)

Abstract

Numerous studies indicate that object relative clauses are more difficult to process and comprehend than their subject relative counterparts. It has also been suggested that individuals with higher working memory capacity process these complex structures more efficiently. In this self-paced reading experiment (n=33), it was investigated whether different configurations of low versus high frequency noun phrases could reduce this processing asymmetry. The findings are discussed in light of a new syntactic therapy programme for Dutch people with mild aphasia and whether the frequency manipulation can be used in constructing therapy material. Results indicate that object relative clauses were more difficult to process in general, but that frequency did not modulate this asymmetry. Similar findings were observed for the comprehension questions. Individuals with higher working memory capacity processed object relative clauses with an infrequent first noun phrase more efficiently than those with lower working memory. Moreover, an effect was found that people with lower working memory adopt more risky processing strategies when reading object relatives. Directions for the therapy programme as well as for future research are provided.

Key words: relative clauses, Dutch, lexical frequency, sentence processing, self-paced reading, aphasia

Acknowledgments

Throughout the course of writing this thesis, I have benefited enormously from a number of people, and some thank yous are definitely in order. First, I want to thank my supervisors, Dr Evy Visch-Brink, Dr Laura Bos, and Dr Ileana Grama. Evy, thank you for your critical but very friendly approach to supervising. You were always enthusiastic about my ideas but also reminded me about the clinical and practical aspects I had to keep in mind. After we were done discussing the project in our bi-weekly meetings, we often chatted about linguistics and academia. I really enjoyed our meetings and I have learned a lot from you. Laura, thank you for your insightful comments, your kind words when working from home during a lockdown proved to be difficult at times, and all the positive feedback I have received from you. I really appreciated it. Ileana, thank you for your comments on my writing, but I especially thank you for your trust in me. When I panic e-mailed you about the deadline, you managed to make me feel like I was right on track and that everything would be fine. Your words of encouragement meant a lot to me and I really admire the way you fully support your students. Monique, despite your very hectic schedule you took the time to read my thesis and be my second reader. Thank you very much and I look forward to discussing the end product with you. Lastly, I thank Dirk-Jan Vet for programming the experiment and answering my many questions within an hour.

Thank you!

1. Introduction

Despite its rapid time course and seemingly effortless nature, sentence processing is a complex mental process. People continually process the incoming input and build and/or update the interpretation of the sentence by making use of phonological, syntactic, semantic, and pragmatic information. One of the goals of psycholinguistic research is to understand how these different factors interact and contribute to processing difficulty. As a corollary, it aims to understand which factors facilitate processing. The present study was designed to investigate whether lexical frequency has a facilitating effect on relative clause (RC) processing in Dutch.

This facilitation effect is of significant importance in the design of therapy materials for people with aphasia (PWA). Aphasia is an acquired language disorder that is usually caused by a focal brain lesion in the language areas and causes difficulties in producing and comprehending language (Bastiaanse & Prins, 2013). There is accumulating evidence that PWA have lower processing resources and process linguistic material slower as compared to non-brain-damaged individuals (e.g. Burckhardt et al., 2008; Purdy, 2002). Moreover, almost all PWA find it difficult to comprehend syntactically complex structures (Bastiaanse & Prins, 2013). Thus, a central aim for therapy programmes is to minimise processing costs as much as possible in order to promote the production and comprehension of complex structures.

To investigate the processing costs associated with complex structures, studies often draw on subject and object relative clauses. In a subject RC (1a), the first noun phrase (NP1), ‘the man’, is both the subject of the matrix clause and the subject of the RC. In an object RC (1b), on the other hand, NP1 is the subject of the matrix clause but the object of the RC. The second NP (NP2), ‘the senator’, is the subject of the object RC instead.

(1) a. The man, [who attacked the senator], left the building.

b. The man, [who the senator attacked], left the building.

Despite the fact that both sentences are of equal length and contain the same words, object RCs are often deemed more complex than subject RCs (Frazier, 1987; Johnson et al., 2011; Mak et al., 2002, 2006; Traxler et al., 2002, 2005, among others). This difference in processing costs, also called the subject-object asymmetry, seems to reflect a cross-linguistic tendency, as a subject preference has been established for the vast majority of languages (Lau & Tanaka, 2021). This study investigates the role of lexical frequency and working memory in the processing of RCs, and in particular whether lexical frequency can reduce the subject-object asymmetry. It also assesses the potential use of lexical frequency in constructing therapy material for PWA. Dutch provides a good testing ground for extra-syntactic variables like lexical frequency, as the verb placement in Dutch RCs is identical for subject and object RCs, whereas it differs in English RCs (see (1)). First, some background is provided on relativisation in Dutch and on the syntactic and extra-syntactic variables that are relevant for the processing of RCs.

1.1 Relativisation in Dutch

In matrix clauses, the word order of Dutch is SVO or VSO due to the verb-second (V-2) rule, as illustrated in (2a) and (2b). The V-2 rule demands that the finite verb in a matrix clause always occupies the second position of the clause. In (2b), which features a clause-initial prepositional phrase, the subject (Paul) appears in third position in order to satisfy the V-2 rule. In subordinate constructions such as the relative clause, on the other hand, the verb is always in final position, and Dutch can therefore be considered an SOV language (den Besten, 1990; Koster, 1975). This is illustrated in (2c). However, this analysis leaves out the fact that in Dutch object RCs the object comes first in the subordinate construction, making it an OSV construction. This is illustrated in (2d). Dutch relativisation can thus result in both SOV and OSV constructions.

(2) a. [Paul gooit de bal op zondag] (SVO)

Paul throws the ball on Sunday

b. Op zondag gooit Paul de bal (VSO)

On Sunday throws Paul the ball

‘On Sunday, Paul throws the ball’

c. [Paul, [die de hond aait], gooit de bal] (SOV)

Paul, who the dog pets, throws the ball

‘Paul, who pets the dog, throws the ball’

d. [Paul gooit de bal [die hij gekocht had]] (OSV)

Paul throws the ball that he bought had

‘Paul throws the ball that he had bought’

In RCs such as (2d), where a noun phrase (NP) and a pronoun make up the RC, the clause is disambiguated at the pronoun ‘hij’ (he), which has nominative case. This makes the RC in (2d) an object relative, since the antecedent, the ball, is the object of the clause. If it were a subject RC, the accusative pronoun ‘hem’ (him) would have been used, making the ball the subject of the clause. In cases where there are two NPs instead of an NP and a pronoun, the sentence is only disambiguated at the auxiliary in the RC, as in (3) (taken from Mak et al., 2002, p. 50). However, the relative may also not be disambiguated at all, as in (4) (adapted from Mak et al., 2002, p. 50). In (4), the auxiliary ‘hebben’ (have) agrees in number with both NPs, so either the professors or the students could be its subject.

(3) Morgen zal de professor, [die de studenten ontmoet heeft], de diploma’s uitreiken.

Tomorrow will the professor, who the students met has, the diplomas present

‘Tomorrow the professor, who has met the students, will present the diplomas’

(4) Morgen zullen de professoren, [die de studenten ontmoet hebben], de diploma’s uitreiken

Tomorrow will the professors, who the students met have, the diplomas present

‘Tomorrow the professors, who have met the students, will present the diplomas’

1.2 Processing Relative Clauses

The field has identified numerous syntactic, lexical, semantic, and pragmatic factors that modulate the subject-object asymmetry, but it should be emphasised that most of these accounts have been formulated primarily on the basis of English data. As Gordon and Lowder (2012) note, accounts can be roughly divided into three groups: (1) accounts that focus on memory limitations due to syntactic complexity (e.g. Gibson, 1998); (2) accounts that focus on sentence-internal semantic and pragmatic information (e.g. Gordon et al., 2004; MacWhinney, 1977; Mak et al., 2006, 2008); and (3) accounts that focus on the experience the individual has with specific constructions (e.g. Christiansen & MacDonald, 2009; Engelmann & Vasishth, 2009).

1.2.1 Syntactic factors

The Dependency Locality Theory (henceforth: DLT) is a theory of syntactic complexity based on two main components: integration costs and memory costs (Gibson, 1998, 2000). Following Just and Carpenter (1992), the DLT assumes that linguistic integration processes and storage access the same pool of working memory (WM) resources. As a result, when a sentence imposes either higher integration costs or memory costs, the sentence is regarded as more complex and will take more time to process. Both integration costs and memory costs will be elaborated on before they are linked to RC comprehension.

Sentence comprehension involves updating existing syntactic structures with new input words (Gibson, 1998). To perform an integration, one has to make a prediction about the syntactic category of the incoming word in light of the syntactic structures being considered for its interpretation. Subsequently, the incoming word reactivates the syntactic head it belongs to and integrates the new information. Gibson (1998) postulates that this integration cost depends on the type of integration and on a distance-based cost determined by the distance between the head and the newly encountered word. He provides the following example to illustrate this (p. 12):

(5) The bartender told the detective that the suspect left the country yesterday.

The sentence-final adverb ‘yesterday’ could be linked to either ‘told’ in the matrix clause or ‘left’ in the embedded clause. In that sense, integration costs are equal. However, reactivating the verb in the matrix clause is more costly than reactivating the verb in the embedded clause, as activation strength decays over time. Therefore, the sentence comprehension mechanism prefers to integrate ‘yesterday’ with ‘left’, as ‘left’ has a lower distance-based integration cost than ‘told’ in the matrix clause.
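The distance-based preference in (5) can be made concrete with a short Python sketch. It is an illustration under simplifying assumptions only: distance is approximated here by the number of intervening words, whereas Gibson (1998) counts intervening new discourse referents, and the function name `linear_distance` is hypothetical.

```python
# Illustrative sketch only: distance is approximated by the number of words
# between the attachment site and the incoming word. Gibson (1998) counts
# intervening new discourse referents instead; word count is an assumption here.

SENTENCE = (
    "The bartender told the detective that the suspect left the country yesterday"
).split()


def linear_distance(words, head, incoming):
    """Return the number of words strictly between `head` and `incoming`."""
    return words.index(incoming) - words.index(head) - 1


# Candidate attachment sites for the adverb 'yesterday'.
distances = {
    verb: linear_distance(SENTENCE, verb, "yesterday") for verb in ("told", "left")
}

# The site with the smallest distance is the cheapest, hence preferred, attachment.
preferred = min(distances, key=distances.get)
print(distances, preferred)
```

Under this approximation, ‘left’ is separated from ‘yesterday’ by fewer words than ‘told’, so the sketch selects ‘left’ as the attachment site, in line with the preference described above.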

The second component of the DLT is a memory cost. Gibson (1998) defines the memory cost as follows: ‘a memory cost associated with remembering each category that is required to complete the current input string as a grammatical sentence’ (p. 13). Additionally, the prediction of the matrix predicate is not associated with any additional memory costs, as Gibson (1998) assumes that this prediction is built into the parser (p. 14). If possible, no new arguments or categories are assumed by the comprehension mechanism in order to keep the memory costs down, which is known as the principle of parsimony. Consider sentences (6) and (7). When the comprehension mechanism encounters the second determiner ‘het’ (the), it will assume that the determiner introduces the direct object, as in (6), and that this is the only argument that will follow. However, when the mechanism parses the next instance of ‘het’ (the) in the sentence in (7), it has to assume a third argument, the indirect object. Gibson (1998) postulates that this increases the memory cost component, as more arguments need to be temporarily stored in order for the string to be grammatical.

(6) De plek, waar de jongen het verhaal vertelde, was verlaten

The place, where the boy the story told, was deserted

‘The place, where the boy told the story, was deserted’

(7) De plek, waar de jongen het meisje het verhaal vertelde, was verlaten

The place, where the boy the girl the story told, was deserted

‘The place, where the boy told the girl the story, was deserted’

Turning to RCs now, let us recall that Dutch is verb-final in embedded clauses (den Besten, 1990; Koster, 1975). Example (3) is repeated below as an object RC in (8).

(8) Morgen zal de professor, [die de studenten ontmoet hebben], de diploma’s uitreiken.

Tomorrow will the professor, who the students met have, the diplomas present

‘Tomorrow the professor, who the students have met, will present the diplomas’

Gibson (1998, p. 56) explains the subject preference documented in the literature for Dutch (cf. Frazier, 1987; Kaan, 1997) as follows: at the point of processing the pronoun ‘die’ (that), there are two options regarding its processing: ‘die’ as the subject, or ‘die’ as the object. If ‘die’ is analysed as the subject, then only a verb and a subject NP-trace head are needed to complete the string as a grammatical sentence. Note that the predicate induces an increased memory cost here, as only the predicate of the main clause is assumed to be cost-free. If ‘die’ is analysed as the object instead, then three components are needed in order to complete the string as a grammatical sentence: a verb, an object NP-trace, and a subject noun for the verb.

Let us recall the principle of parsimony, that the comprehension mechanism does not predict new categories or arguments if they are not needed. The preference for subject over object readings is clear: two components are needed for a subject reading versus three components for an object reading. Thus, the parser initially ascribes a subject reading to ‘die’. In the case that ‘die’ turns out to be an object as in (8), then the parser needs to reanalyse the relative pronoun, which was initially analysed as a subject, as an object. This results in the longer reading times found at the auxiliary in object RCs as compared to subject relatives in Dutch (Frazier, 1987; Mak et al., 2002, 2006, 2008).
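The comparison of predicted categories can be written out as a short Python sketch. The category labels follow the description above; treating each predicted category as one unit of memory cost is a simplification made for illustration, and the names `PREDICTIONS` and `memory_cost` are hypothetical.

```python
# Illustrative sketch: categories still needed to complete the relative clause
# after reading 'die', under each analysis (following the description of
# Gibson, 1998). One unit of cost per predicted category is an assumption.

PREDICTIONS = {
    "subject": ["verb", "subject NP-trace"],
    "object": ["verb", "object NP-trace", "subject noun"],
}


def memory_cost(reading):
    """Count the categories predicted under the given reading of 'die'."""
    return len(PREDICTIONS[reading])


# Parsimony: the parser initially adopts the reading with the fewest predictions.
initial_reading = min(PREDICTIONS, key=memory_cost)
print({reading: memory_cost(reading) for reading in PREDICTIONS}, initial_reading)
```

The sketch only captures the initial choice between the two readings; the cost of reanalysis at the auxiliary is not modelled.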

1.2.2 Working memory in processing

A considerable body of literature has shown that WM is involved in language comprehension, as individuals with higher WM capacity make fewer mistakes on comprehension questions (e.g. King & Just, 1991). However, whether domain-general WM is also involved in the online processing of sentences is less clear, and the DLT is agnostic on this point.¹ Some claim that people with higher WM capacity tend to process language more efficiently (e.g. Just & Carpenter, 1992; King & Just, 1991; Roberts & Gibson, 2002; Traxler et al., 2005; but see Caplan & Waters, 1995, 1999 for opposing views). For example, the results of Traxler et al. (2005), who used the eye-tracking while reading paradigm, indicate that readers with higher WM capacity, as measured by the sentence-span test, experience less difficulty in processing object RCs when relevant semantic cues are present than readers with lower WM scores. Traxler et al. (2005) argued that a higher WM capacity facilitates the integration of semantic cues in ‘the processes used to assign constituents to argument positions in a syntactic tree’ (p. 217). Thus, having a higher WM capacity allows one to combine multiple sources of information more efficiently during sentence processing. Using the self-paced reading paradigm, King and Just (1991) show that people with a higher WM capacity, as measured by the reading span test, process object RC structures more efficiently in general and perform better on the comprehension questions, too.

¹ The DLT does not make predictions about whether it is domain-general or language-specific WM capacity that influences sentence processing and comprehension. It merely states that WM resources are recruited in the processing of linguistic input.

If one follows the notion that WM is involved in sentence processing, one would expect older adults to process language less efficiently, as WM capacity declines with age. A common finding in the literature is that older adults read more slowly and make more regressive eye movements during reading, meaning that they look back at previous parts of the sentence more often (Gordon et al., 2016). Indeed, Caplan et al. (2011), who tested 200 adults aged 19-90, found a correlation between age and speed of online processing such that older participants processed the critical part of object RCs more slowly than younger adults. However, no correlation was found with WM, as measured by the alphabet-span, subtract-2-span, and sentence-span tests, which suggests that the slowing was not due to WM decline. Instead, the results of Caplan et al. (2011) are better explained by a model in which some language comprehension processes are insensitive to working memory limitations, either due to automaticity or by virtue of access to exclusive, dedicated working memory resources (e.g. Caplan & Waters, 1995, 1999; Traxler et al., 2005).

However, other studies suggest that older adults do not process the sentence more slowly (e.g. Stine-Morrow et al., 2000). The older adults in their study did not read the critical parts of the sentence more slowly but, as in Caplan et al. (2011), they did make more mistakes on the comprehension questions. Stine-Morrow et al. (2000) argue that this is because older adults stopped attempting to build the correct structure and meaning of more complex structures when memory load increased. Moreover, some suggest a ‘good-enough’ risky processing strategy (Christianson et al., 2006; Christianson, 2016). As processing resources decrease with age, older adults tend to adopt a more cost-efficient, risky processing strategy that works in the vast majority of cases. When reanalysis needs to take place, older adults make more errors because they went with the ‘cheaper’ alternative in terms of processing costs. Thus, the strategy allows them to process linguistic input faster, but sometimes the risk does not pay off and leads to more mistakes.

1.2.3 Pragmatic and semantic factors

Another set of theories bases its explanation on semantic and pragmatic factors. Various studies suggest that the semantic cue of animacy guides the choice of the initial analysis of the RC. Animate entities, i.e. entities that are alive and sentient (e.g. the man), are more likely to be the agent of the sentence than inanimate entities (e.g. the box) (Gordon et al., 2004). In other words, animate NPs are more likely to be the ‘doer’ of the action than inanimate NPs. Thus, having an animate NP in the matrix clause (NP1) facilitates a subject RC reading, whereas an inanimate NP1 facilitates an object RC reading. If this dichotomy guides the parser in assigning subject/object relations, then this would lead to the following predictions: slower object RC readings for animate NP1s on the one hand, and slower subject RC readings for inanimate NP1s on the other hand. This prediction is not borne out: object RCs were still processed more slowly than subject RCs when both the NP in the matrix clause and the NP in the RC (NP2) were inanimate, as in (9) (adapted from Mak et al., 2006, p. 469). The same result was found by Mak et al. (2002), who used the self-paced reading paradigm, for pairings of two animate NPs, as the object RCs were read more slowly than the subject RCs.

(9) De lekkages, die de gel verhelpt, moeten in één keer verdwenen zijn.

The leakages, that the gel remedies, must in one time gone are.

‘The leakages, that the gel remedies, must disappear at once’

Thus, keeping animacy constant, having either two animate or two inanimate NPs, does not influence the processing difficulty of RCs and the research on animacy so far is in line with the DLT (Gibson, 1998).

Different results are obtained when the animacy of the NPs is not constant, thus having an animate/inanimate pairing or the reverse. In various studies, it has been shown that the difficulty associated with object RCs was reduced when NP1 was inanimate and NP2 was animate as in (10) (e.g. Gennari & MacDonald, 2008, 2009; Lowder & Gordon, 2014; Mak et al., 2006; Traxler et al., 2002, 2005).

(10) The box that the man carried was filled with toys.

On the other hand, object RCs with an animate NP1 and an inanimate NP2 resulted in the longest reading times, as in (11) (taken from Mak et al., 2006, p. 473).

(11) The hikers, that the rock has crushed, are the talk of the day.

These animacy effects are not accounted for by the DLT as Gibson (1998) does not assume that nouns differing in animacy induce different memory or integration costs. In principle, one could propose a lower integration cost for inanimate NPs, but that would then fail to account for the difference in object RCs when both NPs are (in)animate. This would lead us to presume that there is more to RC processing than just syntax.

Related to animacy is similarity-based interference, which concerns the structural and semantic similarity between NP1 and NP2. In a series of experiments that made use of both the self-paced reading and eye-tracking while reading paradigms, Gordon and colleagues (Gordon et al., 2001, 2004, 2006) found that the subject-object difference in processing was significantly reduced when the embedded NP, NP2, was a proper name, indexical pronoun, or quantified expression. Consider (12) and (13). In (13), both NPs follow the structure [determiner + noun] and denote two referents that are similar in terms of number, animacy, and definiteness. In (12), on the other hand, there is less structural overlap, as NP2 is a proper name rather than a [determiner + noun] construction. According to Gordon et al. (2004), the structural overlap in (13) makes it harder to integrate both NPs in the syntactic framework. Conversely, there is less similarity in (12), which facilitates the retrieval and integration of both NPs in RC processing.

(12) The man, who Jeremy attacked, had brown hair.

(13) The man, who the waiter attacked, had brown hair.

A second explanation for the reduced processing asymmetry for proper names and pronouns is that they are more given in the current discourse, as formulated in the givenness hierarchy by Gundel et al. (1993). This hierarchy proposes that referents that are more central to the current discourse impose a lesser cognitive burden on the speaker. Thus, proper names and pronouns are more central (i.e. more given in the discourse), as they are more thoroughly grounded in the current discourse than new [determiner + noun] descriptors.

Although similarity-based interference is not a purely syntactic phenomenon, it can be accounted for by the DLT (Gibson, 1998). The DLT postulates that new referents induce a memory cost and an integration cost when the syntactic head is encountered. One could make the prediction that referents that are grounded within the discourse, i.e. referents higher up the givenness hierarchy, induce a lower integration cost, as they are more readily available to the speaker than new descriptors. Indeed, Warren and Gibson (2002), who used the self-paced reading paradigm, showed that the subject-object asymmetry was inversely related to the givenness of NP2: the more given the NP (e.g. a proper name), the smaller the subject-object difference.

1.2.4 Frequency factors

The last set of explanations is based on experience in the form of frequency effects. Frequency effects are ubiquitous within language learning and processing. For lexical frequency, it has long been known that lexical items with a higher frequency are named, recognised, and read faster than those with a lower frequency, as evidenced by the seminal work of Oldfield and Wingfield (1965). This frequency effect is robust, as it has been replicated many times in both picture naming and lexical decision tasks (e.g. Griffin & Bock, 1998; Jescheniak & Levelt, 1994; Levelt et al., 1998). Furthermore, it has been shown that lexical frequency effects on reading time are more pronounced in older adults than in younger adults (Kliegl et al., 2004; Rayner et al., 2006).

This local effect of lexical frequency seems to have a more global effect in sentence processing, too. In an eye-tracking while reading study, Johnson et al. (2011) investigated the role of lexical frequency in the processing of RCs. When comparing sentences in which NP1 and NP2 were both high or both low frequency, the results indicated that the higher frequency words were read faster than the lower frequency words while controlling for word length. This finding can be attributed to a higher base activation of lexical items with a high frequency: it takes fewer resources to activate such an item, which leads to faster retrieval and encoding. However, having two frequent or two infrequent NPs did not modulate the subject-object asymmetry. An interesting pattern arose when NP1 and NP2 differed in frequency. Results indicated that it was not a high frequency NP1 that facilitated object RC processing. Rather, the subject-object asymmetry was reduced when NP1 was low frequency and NP2 was high frequency. Johnson et al. (2011) attribute this finding to the list-composition effect in memory research (e.g. Merritt et al., 2006). This research has found that lower frequency words are retrieved better and faster if they appear in a list in which items vary in frequency, because the lower frequency items are encoded more completely as they are less common than their high frequency counterparts. Thus, Johnson et al. (2011) postulate that the subject-object asymmetry is reduced because the lower frequency NP1 is encoded more completely in memory and is therefore easier to retrieve when it needs to be integrated in the syntactic frame.

The frequency effect found at the lexical level has also been found at the sentence level. Initial evidence for this construction frequency effect can be seen in the work of Mak et al. (2002), who found a discrepancy in the distribution of subject and object RCs based on animacy. Their corpus analysis showed that subject RCs were most likely to combine an animate NP1 with an inanimate NP2, whereas the opposite pattern was found for object RCs (inanimate NP1 and animate NP2). This distribution, according to Mak et al. (2002), could in part explain the behavioural results, which mirrored the corpus distribution. The most frequently occurring animacy pairings were the easiest to process, suggesting that experience modulates processing efficiency. Further support for the role of experience comes from various usage-based computational models of sentence processing (e.g. Christiansen & MacDonald, 2009; Engelmann & Vasishth, 2009; Hale, 2001; Levy, 2008). These models all postulate that the distributional properties of the NP modulate resource allocation in sentence processing. If a particular NP is infrequent in a certain position, for example an inanimate NP2 in an object RC (see (11)), then the sentence processing mechanism has to allocate more resources to the processing and integration of this NP than when it occurs in a more frequent position (e.g. the same inanimate NP, but in the NP1 position of the object RC).

The effect of experience is not easily reconciled with the DLT framework (Gibson, 1998), as the DLT does not postulate a prediction mechanism based on content, but one based on structure. This rules out the possibility of having differential memory or integration costs based on the position of the NP.

1.3 Possible implications for aphasia treatment

As mentioned, aphasia is an acquired language disorder, most frequently caused by a cerebrovascular accident, commonly known as a stroke (Bastiaanse & Prins, 2013). Strokes account for almost 85% of cases. Other causes include traumatic brain injury, tumours, and infections in the brain. One of the most common ways to classify different aphasia types is by the fluency of the output, resulting in fluent and non-fluent aphasia types (Clough & Gordon, 2020).

Non-fluent aphasia types are usually associated with effortful speech. People with non-fluent aphasia, like Broca’s aphasia or agrammatism, produce short sentences, sometimes telegraphically, and grammatical morphemes are omitted or substituted (Goodglass & Kaplan, 1983; Kertesz, 2006). Bastiaanse and Van Zonneveld (2004) argue that non-fluent PWA suffer from poor grammatical encoding abilities, which are associated with damage to primarily frontal structures, especially Broca’s area (den Ouden et al., 2019; Wilson et al., 2010). On the other hand, people with fluent aphasia, like Wernicke’s aphasia or paragrammatism, produce sentences of normal length, but constituents may be grouped erroneously, making it hard to establish grammatical relations between the constituents (Goodglass & Kaplan, 1983; Kertesz, 2006). Fluent aphasia types are associated with posterior temporal-parietal lesions (Buchsbaum et al., 2011; Yourganov et al., 2015). Irrespective of the fluent/non-fluent distinction, language comprehension difficulties are present in almost all aphasia types (Bastiaanse & Prins, 2013).

Comprehension difficulties are most pronounced when the sentence is semantically reversible, meaning that both NPs can be the agent of the sentence, and when the sentence appears in a non-canonical order, such as in passives or object RCs (14a) and (14b) (e.g. Caplan & Futter, 1986; Grodzinsky, 1989). In (14), both Mary and Karen are appropriate agents for the verb ‘hugged’. The syntactic deficit, combined with a possible semantic deficit, makes these reversible sentences considerably difficult for PWA, both in fluent and in non-fluent aphasia types.

(14) a. Mary is hugged by Karen.

b. Mary, who Karen hugged, is leaving.

PWA consistently perform worse on passives and object RCs than on active sentences and subject RCs (Caramazza & Zurif, 1976; Cho-Reyes & Thompson, 2012; Friedmann & Shapiro, 2003; Grodzinsky et al., 1999). This is because the sentence cannot be processed incrementally or on the basis of world knowledge. Rather, thematic roles have to be derived on the basis of grammatical relations. Even though these studies mostly tested agrammatic individuals, some studies suggest that people with fluent aphasia have difficulty with non-canonical structures, too (e.g. Bastiaanse & Edwards, 2004; Cho-Reyes & Thompson, 2012; Faroqi-Shah & Thompson, 2003).

A key question in the aphasiological literature is whether grammatical knowledge is lost (e.g. Caplan & Futter, 1986; Grodzinsky, 1989, 2000; Friedmann, 2000) or harder to access due to processing limitations (e.g. Lukatela et al., 1995). Regardless of whether syntactic knowledge is degraded or harder to access, a central aim for therapy programmes should be to minimise processing costs as much as possible in order to promote the production and comprehension of complex structures and make exercises as accessible as possible. In addition, it is important to offer exercises at various difficulty levels for two reasons. First, the available resources may differ per individual. Second, according to the Complexity Account of Treatment Efficacy (CATE) introduced by Thompson et al. (2003), the training of more complex structures in a syntactically related domain will result in generalisation to less complex structures. Thus, when the PWA is able to comprehend and produce the more difficult exercises, this is hypothesised to result in enhanced performance on less difficult exercises. In the case of RCs, the CATE predicts that learning to comprehend and produce object relatives will result in generalisation to subject relatives.

At present, there are therapy options available to train lexical and phonological aspects of language production at various difficulty levels in Dutch (e.g. BOX for lexical training, Visch-Brink, 2019; and FIKS for phonological training, Visch-Brink & De Waard-Van Rijn, 2018). However, options for syntactic training for Dutch PWA are more limited because (1) they are mostly suited for severe aphasia types (e.g. Bastiaanse et al., 1997); (2) the various levels of complexity lack thorough grounding in linguistic theory; or (3) there is a general lack of different difficulty levels. Practice materials that exist for mild aphasia types are spread over multiple programmes or books rather than bundled in one place. To fill this gap in treatment options, a new syntactic therapy programme for Dutch people with mild aphasia is under development in collaboration with the department of neurosurgery at the Erasmus Medical Centre in Rotterdam (PI: Visch-Brink; authors: Bos, Van der Keur-van Driel, Visch-Brink, 2021).

1.4 Present study

The present study was designed to investigate whether lexical frequency has a facilitating effect on RC processing in Dutch and whether it can reduce the subject-object asymmetry. If it does, it provides a fruitful method to reduce processing costs in order to facilitate the training of complex syntactic constructions in Dutch PWA. However, it is unclear whether the results found by Johnson et al. (2011) may also be found in Dutch. This is because both NPs need to be stored longer in Dutch as compared to English. Therefore, the effect may go either way: either the results are replicated because the NPs are encoded more completely, which still gives a processing advantage, or the initial encoding advantage decays over time to a point where no processing benefits are found. The present study will shed light on this.

The outcomes of the present study are also relevant for therapy programmes. RCs are an important part of daily communication as they are relatively frequent, especially in newspapers and other forms of written communication. This makes practising them, especially their comprehension, relevant for people with mild aphasia. The usefulness of the frequency manipulation depends on whether the facilitating effect frequency might have in RC processing overrides the lexical retrieval problems many PWA have. Bastiaanse et al. (2016) investigated the effect of lexical frequency on the retrieval of nouns and verbs in PWA. Their results indicate that frequency does not affect verb retrieval, but that it plays a minor role in noun retrieval: higher frequency nouns are easier to retrieve than lower frequency nouns. It should be noted that these words were tested in isolation and that the nouns in question were objects, not animate NPs like those in the present study. On the other hand, some studies did not find a frequency effect or even found a reversed frequency effect. That is, lower frequency items were retrieved or comprehended more accurately (e.g. Crutch & Warrington, 2003; Hoffman et al., 2011). These reversed effects, combined with the facilitatory role of infrequent NP1s as found by Johnson et al. (2011), make frequency a potentially useful method to structure exercises for PWA.

An additional aim of the present study was to assess the role of WM in RC processing in Dutch, as some claim that individuals with higher WM capacity process object RC structures more efficiently (e.g. King & Just, 1991; Traxler et al., 2005). However, this remains disputed (e.g. Caplan & Waters, 1995). Moreover, it is unclear how WM capacity interacts with lexical frequency; this interaction is therefore also assessed in the present study. The research questions can be stated as follows:

(1) To what extent do different frequency pairings of NP1 and NP2 modulate the processing costs of subject and object RCs in terms of reading times in neurotypical adults?

(2) To what extent do different frequency pairings of NP1 and NP2 affect the comprehension questions in subject and object RCs in neurotypical adults?

(3) To what extent does WM capacity predict ease of processing in Dutch RCs in neurotypical adults?

For (1), based on the DLT (Gibson, 1998) and the existing literature on RC processing in Dutch (Frazier, 1987; Mak et al., 2002, 2006, 2008), it is predicted that object RCs will be more difficult to process, as shown by longer reading times on the auxiliary, the point of disambiguation in Dutch RCs. Furthermore, based on the results of Johnson et al. (2011), it is predicted that the frequency pairing infrequent NP1 and frequent NP2 reduces the subject-object asymmetry. However, this prediction is not entirely precise as Johnson et al. (2011) found the effect on gaze regressions, where participants go back to earlier parts of the sentence, using the eye-tracking-while-reading paradigm. Here, self-paced reading is used. Thus, an effect may instead be found on the auxiliary, as participants cannot go back to earlier parts of the sentence and may therefore spend more time there.

For (2), it is predicted that participants will make more mistakes on object RCs as compared to subject RCs, but that an infrequent NP1 modulates the number of mistakes. That is, it is predicted that participants make fewer mistakes on the comprehension questions in object RCs with an infrequent NP1 as compared to object RCs with a frequent NP1. However, it must again be noted that this prediction is based on English data. In Dutch, as opposed to English, the NPs must be stored longer before the point of disambiguation is reached.

For (3), it is predicted that higher WM capacity is associated with faster reading times of the auxiliary in object RCs and higher comprehension question accuracy (King & Just, 1991; Traxler et al., 2005). Whether WM capacity differentially affects object RCs differing in the frequency of the NPs is an open question.

2. Methodology

2.1 Participants

Thirty-seven native Dutch speakers were recruited for the study via personal contacts, and advertisements on social media platforms like Facebook. The age of the participants was based on the age of the PWA that will most likely make use of the therapy programme. These are people in the age range of 50-70 as stroke incidence increases with age (Vaartjes et al., 2008).

Participants were not preferentially selected based on educational level or age apart from the initial criterion of 50-70 years. This was opted for to represent the background characteristics of the possible treatment group. Participants were included if they had normal or corrected-to-normal vision and did not report having any reading or language problems (e.g. dyslexia or developmental language disorder) or neurological disorders such as Parkinson’s. Participant characteristics were collected via a short background questionnaire presented in Qualtrics (Qualtrics, 2019), which contained questions targeting the participant’s gender, age, education level, and possible language problems. Four participants were excluded: two because they reported having dyslexia, one because they fell outside the age range at 73 years old, and one because their data were not uploaded correctly to the server. In total, the data of 33 participants (11 men) with a mean age of 55.9 years (sd = 3.9, range = 50-69) were analysed for the present study. Twenty participants (61%) received more theoretical education and 13 (39%) received more practical education.


2.2 Materials

2.2.1 Working memory task

To assess WM capacity, the digit-span task as included in the CAT-NL (Visch-Brink et al., 2014) was used to ensure comparability of results if the sentences are later tested on Dutch PWA. The CAT-NL uses the forward digit-span task.

2.2.2 Self-paced reading task

A self-paced reading (SPR) task was constructed in ED (Vet, 2021). In an SPR task, the participant reads a sentence word-by-word and presses the spacebar to continue to the next word. The SPR task contained 32 experimental sentences. Examples are provided in Table 2. The first condition is a subject RC (SRC) with a high frequency (HF) NP1 and a low frequency (LF) NP2; the second condition is an SRC with an LF NP1 and an HF NP2. Conditions three and four mirror conditions one and two, the only difference being that conditions three and four are object RCs (ORC) as opposed to subject RCs.

Table 2. Experimental material for the experiment.

(1) SRC: HF - LF

De beroemde schrijver, die de kundige vertalers gebeld heeft, brengt een nieuw boek uit.

The famous writer, who the skilled translators called has, brings a new book out.

‘The famous writer, who has called the skilled translators, is releasing a new book.’

(2) SRC: LF - HF

De leuke reisleiders, die de grappige chauffeur bedankt hadden, organiseerden een mooie trip.

The fun tour guides, who the funny driver thanked had, organised a nice trip.

‘The fun tour guides, who had thanked the funny driver, organised a nice trip.’

(3) ORC: HF - LF

De corrupte minister, die de fanatieke activisten beschuldigd hadden, nam uiteindelijk ontslag.

The corrupt minister, who the fanatical activists accused had, resigned eventually.

‘The corrupt minister, who the fanatical activists had accused, resigned eventually.’

(4) ORC: LF - HF

De sterke rekruten, die de meedogenloze luitenant getraind heeft, mogen eindelijk op missie.

The strong recruits, who the ruthless lieutenant trained has, may finally go on mission.

‘The strong recruits, who the ruthless lieutenant has trained, may finally go on mission.’

Following Mak et al. (2006), the RC-internal structure follows a [relative pronoun - determiner - noun - past participle - auxiliary] structure. An adjective was added to both NPs to make the sentences more appealing to PWA. An initial set of 32 sentences was constructed and checked by two clinical linguists and a speech language pathologist. This set was revised on the basis of their feedback to make the sentences comprehensible and appealing to PWA. The final set of sentences can be found in Appendix B.

To avoid ordering effects, NP1 was singular and NP2 plural in half of the sentences. The other half had the opposite structure: plural NP1, singular NP2. Moreover, half of the sentences were in the present tense and half were in the past tense. To avoid disparate reading times due to word length, HF and LF items were matched on length: the mean length was 8.1 letters for HF items and 8.2 letters for LF items (t(61.1) = -0.41, p = 0.69). To reduce the effects of animacy (e.g. Mak et al., 2002) and similarity-based interference (e.g. Gordon et al., 2004), all NPs were animate and no names or pronouns were used.

Frequency was calculated using the CELEX corpus (Baayen et al., 1993). Following Johnson et al. (2011), HF items had a mean log-frequency of 1.75 per million words whereas LF items had a mean log-frequency of 0.43 per million words. A t-test indicated that this difference was significant (t(54.5) = 35.9, p < 0.001). The target words and their log frequency can be found in Appendix A.

Only one list was distributed among the participants, thus deviating from a 2x2 design. This was opted for as the test sentences were designed to be included in the therapy programme and background variables were controlled for and equally divided among conditions. The only variable that was not directly controlled for was thematic fit. However, Mak et al. (2002) argue that the subject RC preference is established before encountering the past participle, as there is considerable evidence for Dutch and German, which has a similar RC structure, that thematic fit does not influence the reading times (e.g. Brown et al., 2000; Mak et al., 2002; Mecklinger et al., 1995; Schriefers et al., 1995; Vonk et al., 2000).

The 32 experimental items were mixed with 38 filler sentences, resulting in 70 sentences in total. Filler sentences were syntactically complex but did not contain subject or object RCs. Rather, they contained coordinated sentences and subject/object clefts. After three out of four experimental trials, the participant was presented with a true or false comprehension question. Filler sentences were followed by a comprehension question one-third of the time. Following Gordon et al. (2004), one-third of the questions were about the matrix clause and two-thirds were about the action in the RC. Half of the correct answers were true, and the other half were false. A practice session of four filler sentences preceded the experimental blocks to familiarise the participant with the experimental procedure. The remaining 66 items were pseudo-randomised over three experimental blocks containing 22 items each. Each block had ten or eleven experimental items and eleven or twelve filler items.
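The item counts above can be checked with a short calculation. The sketch below is illustrative only: the counts come from the text, but the helper `split_evenly` and the decision about which block receives the smaller share are assumptions, not a description of how the lists were actually built.

```python
# Counts stated in the text.
N_EXPERIMENTAL = 32
N_FILLER = 38
N_PRACTICE = 4      # practice trials are taken from the fillers
N_BLOCKS = 3


def split_evenly(total, n_blocks):
    """Split `total` items over `n_blocks` blocks as evenly as possible."""
    base, extra = divmod(total, n_blocks)
    return [base + 1 if i < extra else base for i in range(n_blocks)]


# Hypothetical allocation: which block gets the smaller share is an assumption.
experimental_per_block = split_evenly(N_EXPERIMENTAL, N_BLOCKS)
filler_per_block = split_evenly(N_FILLER - N_PRACTICE, N_BLOCKS)[::-1]
block_totals = [e + f for e, f in zip(experimental_per_block, filler_per_block)]

print(experimental_per_block, filler_per_block, block_totals)
```

Under this allocation every block holds 22 items, with ten or eleven experimental items paired with twelve or eleven fillers, which matches the description above.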


2.3 Procedures

Participants filled out an informed consent form paired with the background questionnaire at the start of the session. All tasks were administered in the same order for all participants. The digit-span was administered first. Participants were asked to repeat the digit-sequence in the same order. If the participant repeated the sequence correctly, a new sequence with one additional digit was given to the participant. If the participant failed to recall the sequence correctly, a second sequence with the same number of digits was given to the participant. If the participant failed to recall the sequence again, the task was stopped and the score was the highest correctly recalled sequence.
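The stopping rule of the digit-span task can be sketched as a small scoring function. This is a minimal illustration, not the CAT-NL scoring software: the function name, the data layout, and the starting length of two digits are assumptions, as the text does not state the length of the first sequence.

```python
def digit_span_score(attempts, start_length=2):
    """Return the length of the longest correctly recalled sequence.

    `attempts` maps a sequence length to a list of at most two booleans:
    the outcome of the first attempt and, if the first failed, the second.
    `start_length` is an assumed value; it is not given in the text.
    """
    score = 0
    length = start_length
    while length in attempts:
        outcomes = attempts[length][:2]
        if any(outcomes):
            # Correct on the first or second attempt: move on to a longer sequence.
            score = length
            length += 1
        else:
            # Two failures at the same length: the task stops.
            break
    return score


# Hypothetical participant: needs a second attempt at 4 and 6 digits, fails twice at 7.
example = {2: [True], 3: [True], 4: [False, True], 5: [True],
           6: [False, True], 7: [False, False]}
print(digit_span_score(example))  # 6
```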

The second task was the SPR. Following Gordon et al. (2004), participants were instructed to read the sentences at a natural pace, but not to linger longer than necessary at any particular word before pressing the spacebar to proceed to the next word. A trial proceeded as follows. First, the participants saw a fixation cross, after which the first word was presented. Following Mak et al. (2006), “the letters of the other words and the commas were replaced by dashes and the full stop at the end of the sentence was visible.” (p. 471). After the last word was read, a fixation cross appeared again. Pressing the spacebar a last time brought the participant to the comprehension question. After the question was answered by means of a button press (Z for true, M for false), a fixation cross appeared to signal the start of the next trial. Participants were able to take a brief break in between the experimental blocks.

2.4 Analyses

Prior to analysis, responses under 50 ms were removed as the participant did not have sufficient time to process the stimulus (Mak et al., 2008; van Witteloostuijn et al., 2019). Following Mak et al. (2008), responses over 4,000 ms were removed as this indicates that the individual lingered longer than necessary at the target word. Following Johnson et al. (2011), all trials were analysed, regardless of whether the comprehension question was answered correctly or not.

Visual inspection of the QQ-plots revealed that, per word of interest, the reading time (RT) data were not normally distributed. A normal distribution was confirmed after log-transforming the RT data, and the log-transformed data were subsequently used for the statistical analyses. Untransformed data are used for the descriptives in the tables to aid interpretability of the results, but the log-RT data are reported for the statistical results. RT data were analysed in R version 4.0.3 with linear mixed effects models using the lmer function in the lme4 package 1.1-26 (Bates et al., 2015; R Core Team, 2020). P-values were approximated with the lmerTest package 3.1-3 (Kuznetsova et al., 2017).
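The trimming and transformation steps can be illustrated as follows. This is a sketch under stated assumptions rather than the analysis script itself: the analyses were run in R, the function name is hypothetical, and the use of the natural logarithm is an assumption because the base of the log transformation is not specified.

```python
import math

LOWER_MS = 50     # responses under 50 ms are removed
UPPER_MS = 4000   # responses over 4,000 ms are removed


def trim_and_log(rts_ms):
    """Drop out-of-range reading times and log-transform the remainder.

    The natural logarithm is an assumption; the text does not state the base.
    """
    kept = [rt for rt in rts_ms if LOWER_MS <= rt <= UPPER_MS]
    return [math.log(rt) for rt in kept]


# Hypothetical reading times in ms for one word of interest.
raw = [32, 480, 1134, 4000, 5200]
log_rts = trim_and_log(raw)
print(len(log_rts))  # 3: the 32 ms and 5200 ms responses are removed
```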


To answer research question (1), whether lexical frequency differentially affects the processing of subject and object RCs, a linear mixed effects model was built with RT (in log ms) of the auxiliary verb as a function of the binary predictors Type (Subject - Object) and Frequency (HF-LF - LF-HF). Age and trial number were added as fixed effects to control for additional confounds. The categorical predictors were coded with sum-to-zero orthogonal contrasts: Type (Subject as -0.5, Object as +0.5) and Frequency (HF-LF as -0.5, LF-HF as +0.5) (Baguley, 2012). A maximal by-item and by-participants random effects structure that did not result in non-convergence was added following the recommendation of Barr et al. (2013). To avoid problems of non-convergence as much as possible, the number of iterations was increased to 100,000 (Powell, 2009). Lastly, 95% Confidence Intervals (CIs) were computed using a bootstrapping method at 100,000 iterations. The statistic that will answer RQ1 is the Type x Frequency interaction. To investigate a possible spillover effect, a similar model was run with the auxiliary RT + the RT of the succeeding word as the dependent variable. A spillover effect occurs when the processing of word n has not been fully completed when the participant reads word n+1. Thus, the processing of word n continues on n+1.
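The contrast coding described above can be made concrete with a small example. The sketch is illustrative only: the models were fitted in R, and the dictionary and function names below are hypothetical. It shows that, over the four conditions, each coded column sums to zero and the columns are mutually orthogonal, so that the main effects are estimated at the average of the other factor.

```python
# Contrast codes as stated in the text.
TYPE_CODES = {"Subject": -0.5, "Object": 0.5}
FREQ_CODES = {"HF-LF": -0.5, "LF-HF": 0.5}


def code_condition(rc_type, frequency):
    """Return the coded predictors for one condition, including the interaction."""
    t = TYPE_CODES[rc_type]
    f = FREQ_CODES[frequency]
    return {"type": t, "frequency": f, "type_x_frequency": t * f}


# One row per cell of the Type x Frequency design.
design = [code_condition(t, f) for t in TYPE_CODES for f in FREQ_CODES]

# Sum-to-zero: every column sums to 0 over the balanced design.
column_sums = {key: sum(row[key] for row in design) for key in design[0]}

# Orthogonality: the dot product of the two main-effect columns is 0.
dot_type_freq = sum(row["type"] * row["frequency"] for row in design)

print(column_sums, dot_type_freq)
```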

To answer research question (2), whether lexical frequency differentially affects the comprehension questions, generalised linear mixed effects models were built using the glmer function in the lme4 package (Bates et al., 2015). Similar model structures were adopted as those presented for the reading time analysis. However, in this case, the dependent variable is not the reading time of the auxiliary, but the answer to the comprehension question (Correct - Incorrect). Thus, a model was built with Correct as a function of the binary predictors Type (Subject - Object) and Frequency (HF-LF - LF-HF). Age and trial number were added as fixed effects to control for additional confounds. Again, sum-to-zero contrasts were applied and iterations were increased to 100,000 to avoid non-convergence.

To answer research question (3), whether WM predicts ease of processing, an additional linear mixed effects model was built that included a triple interaction between the digit span score, Type (Subject - Object) and Frequency (HF-LF - LF-HF). The statistic that will answer RQ3 is the interaction between digit span score and Type. Moreover, the triple interaction between all three predictors indicates whether WM differentially affects RC processing when the frequency of the NPs differs. Trial number and age were added as fixed effects. To assess whether WM explains additional variance in the dataset, a model comparison was conducted using the anova function (R Core Team, 2020). The same procedure was also applied to the comprehension questions using the glmer function instead (Bates et al., 2015).


3. Results

Participants had a mean forward digit-span of 6.3 (sd = 1.0, range = 4-8). First, it was checked whether education level improved the model fit. However, for all three models, including education did not significantly improve the model fit: reading time of the auxiliary (p = 0.84), the auxiliary plus the succeeding word (p = 0.79), and for the comprehension questions (p = 0.36). Therefore, only the models without education will be reported on to keep the models as parsimonious as possible and education level will not be further considered.

3.1 Reading times

Table 3 presents the mean and standard deviation of the reading time (RT) per word of interest: NP1, NP2, the auxiliary in the RC, the matrix clause verb, and the RC auxiliary + the matrix clause verb combined (the spillover). All RTs, including those that were subsequently log-transformed, were recorded in ms. The output of the linear mixed effects models per word of interest is reported below.

Table 3. Reading time measures per word of interest per condition in ms (sd in parentheses).

Condition    NP1          NP2          Auxiliary     Matrix verb   Spillover
HF-LF SRC    775 (462)    757 (440)    1134 (707)    698 (359)     1945 (1080)
LF-HF SRC    855 (571)    756 (464)    1116 (727)    682 (283)     1912 (1316)
HF-LF ORC    821 (501)    783 (480)    1204 (811)    788 (442)     2234 (1444)
LF-HF ORC    814 (571)    797 (475)    1210 (766)    741 (347)     2164 (1279)

3.1.1 Head noun (NP1)

Word frequency did not significantly affect early processing: NPs with a low frequency were not read significantly slower than NPs with a high frequency across conditions (β = 0.03 log RT slower for LF NPs, SEM = 0.03, t(26.8) = 1.4, p = 0.18, 95% CI [-0.01 … 0.08 log RT]). The RT of NP1 did not depend on the clause type (β = 0.008 log RT faster in ORC sentences, SEM = 0.02, t(26.7) = -0.31, p = 0.75, 95% CI [0.04 log RT slower … 0.06 log RT faster]). There was a trend effect of age such that older participants read NP1 slower than younger participants (β = 0.03 log RT, SEM = 0.02, t(31) = 1.719, p = 0.095, 95% CI [-0.004 … 0.07 log RT]). Lastly, there was a significant effect of trial number such that NP1 was read 0.004 log RT faster in later trials (SEM = 0.007, t(27.1) = -5.9, p < 0.001, 95% CI [0.002 … 0.005 log RT faster]).


3.1.2 Embedded noun (NP2)

As with NP1, there was no significant effect of lexical frequency. LF NP2s were read 0.008 log RT slower as compared to HF NP2s (SEM = 0.03, t(21.87) = -0.28, p = 0.78, 95% CI [-0.07 … 0.05 log RT]). The RT of NP2 did not depend on clause type (β = 0.04 log RT slower in ORC sentences, SEM = 0.03, t(21.7) = 1.3, p = 0.2, 95% CI [0.01 log RT faster … 0.1 log RT slower]). There was no effect of age (β = 0.02 log RT slower for older participants, SEM = 0.02, t(31) = 1.1, p = 0.28, 95% CI [0.01 log RT faster … 0.05 log RT slower]). As at NP1, there was a significant effect of trial number such that NP2 was read 0.003 log RT faster in later trials (SEM = 0.008, t(31.6) = -4.5, p < 0.001, 95% CI [0.002 … 0.005 log RT faster]).

In sum, only an effect of trial number was found for both NP1 and NP2 where the NP was read faster in later trials as compared to earlier trials. For both NPs, no difference in RT was found regarding their frequency, clause type, or the participant’s age.

3.1.3 RC auxiliary

As can be observed from Table 3, the reading time of the auxiliary and the auxiliary plus the matrix verb was longer in ORCs than in SRCs. Statistically, there was a trend towards an effect of RC type on the RT of the auxiliary: the auxiliary in ORCs was read 0.08 log RT slower as compared to the auxiliary in SRCs (SEM = 0.04, t(23.97) = 2.03, p = 0.053, 95% CI [0.003 … 0.14 log RT]). There was no main effect of Frequency (β = 0.009 log RT faster in LF-HF pairings, SEM = 0.036, t(27.4) = -0.25, p = 0.8, 95% CI [-0.08 … 0.06 log RT]). There was also no significant interaction between RC type and Frequency (β = 0.06 log RT bigger difference between SRC and ORC in the LF-HF condition as compared to the HF-LF condition, SEM = 0.07, t(27.3) = 0.81, p = 0.42, 95% CI [-0.07 … 0.18 log RT]). There were no effects of age or trial number. Older participants read the auxiliary 0.03 log RT slower than younger participants (SEM = 0.02, t(30.9) = 1.3, p = 0.19, 95% CI [-0.01 … 0.06 log RT]). Later trials were read 0.0003 log RT slower as compared to earlier trials (SEM = 0.009, t(39.6) = 0.3, p = 0.76, 95% CI [-0.001 … 0.002 log RT]).

3.1.4 Spillover: auxiliary + succeeding word

To assess whether the trend found at the auxiliary was more pronounced at the word succeeding the auxiliary (the verb of the matrix clause), a spillover model was constructed. There was a significant effect of RC type. The auxiliary plus succeeding word in ORCs was read 0.11 log RT slower than in SRCs (SEM = 0.04, t(32.1) = 3.11, p = 0.004, 95% CI [0.04 … 0.18 log RT]). There was no main effect of Frequency (β = 0.01 log RT faster in LF-HF pairings, SEM = 0.03, t(28.06) = -0.45, p = 0.66, 95% CI [-0.07 … 0.07 log RT]). There was also no significant interaction between RC type and Frequency (β = 0.01 log RT bigger difference between SRC and ORC in the LF-HF condition as compared to the HF-LF condition, SEM = 0.06, t(27.9) = 0.24, p = 0.81, 95% CI [-0.1 … 0.1 log RT]). There were no effects of age or trial number. Older participants read 0.02 log RT slower as compared to younger participants (SEM = 0.02, t(31) = 1.04, p = 0.31, 95% CI [-0.02 … 0.05 log RT]). Later trials were read 0.0009 log RT faster as compared to earlier trials (SEM = 0.0007, t(57.2) = -1.34, p = 0.18, 95% CI [0.002 faster … 0.0006 log RT slower]). To ensure that the spillover effect was not due to differing lengths of the verb of the matrix clause, a two-sample t-test was conducted. There was no significant difference in length between the verb in ORCs and SRCs (mean difference = 0.1 letter, t(29.8) = -0.14, p = 0.89).

In sum, there was no effect of lexical frequency on the reading time of the auxiliary or the succeeding word. There were also no effects of age and trial number on the reading time of the auxiliary or the spillover. There was, however, a trend effect of clause type on the reading time of the auxiliary and a significant spillover effect on the word following the auxiliary. The spillover effect was not caused by a difference in word length between conditions. The effect of clause type was not modulated by the frequency combination of the NPs as the interactions were non-significant.
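To aid interpretation of the log-scale estimates, a difference on the log scale can be back-transformed into a proportional difference in reading time. The sketch below assumes that the natural logarithm was used for the transformation, which is not stated explicitly; under a different base the percentages would differ. The function name is hypothetical.

```python
import math


def log_diff_to_percent(beta):
    """Convert a difference in natural-log RT into a percentage change in RT.

    Assumes a natural-log transformation; this is an assumption, not a
    detail reported in the text.
    """
    return (math.exp(beta) - 1.0) * 100.0


# Clause type estimates reported above (ORC relative to SRC).
auxiliary_pct = log_diff_to_percent(0.08)   # trend at the auxiliary
spillover_pct = log_diff_to_percent(0.11)   # significant spillover effect

print(round(auxiliary_pct, 1), round(spillover_pct, 1))  # 8.3 11.6
```

Under this assumption, the clause type estimates correspond to reading times that are roughly 8% (auxiliary) and 12% (auxiliary plus succeeding word) longer in ORCs than in SRCs.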

3.2 Error analysis

The mean accuracy scores on the comprehension questions per RC type and frequency pairing are provided in Table 4. The outputs of the generalised linear mixed effects model are reported below.

Table 4. Mean accuracy on the comprehension questions per condition.

Condition    Frequency    Mean accuracy in % (sd)
SRC          HF-LF        86.4 (34.4)
SRC          LF-HF        86.4 (34.4)
ORC          HF-LF        72 (45.1)
ORC          LF-HF        79.5 (40.5)


3.2.1 Logistic mixed effects regression analysis

A first generalised regression model was run to investigate whether participants scored better on the comprehension questions depending on the type of relative clause and frequency pairing. There was a trend main effect of clause type, such that participants scored better on SRC trials than on ORC trials (β = 0.89, 95% CI [-0.21 … 2.08], Z = 1.68, p = 0.09). There was no main effect of frequency pairing: a low frequency NP1 did not result in significantly better accuracy scores (β = 0.18, 95% CI [-0.97 … 1.31], Z = 0.36, p = 0.72). There was also no effect of age (β = 0.03, 95% CI [-0.07 … 0.15], Z = 0.68, p = 0.49) or trial number (β = 0.004, 95% CI [-0.02 … 0.02], Z = 0.2, p = 0.67). Lastly, the interaction between clause type and frequency indicated that participants scored better on LF-HF ORC trials than HF-LF ORC trials, but this interaction was non-significant (β = -0.59, 95% CI [-2.93 … 1.65], Z = -0.55, p = 0.58).

3.2.2 ANOVA analysis

Logistic mixed effects models generate more conservative p-values than the often-used ANOVA due to the addition of other fixed effects and the addition of a random effect structure (Schad et al., 2020). Even though mixed models are becoming more popular, ANOVAs are still widely used and Johnson et al. (2011) made use of ANOVAs as well. Therefore, an ANOVA that is more comparable to their analysis was conducted as a supplementary analysis.

A two-way ANOVA with the percentage correct of each participant as the dependent variable, and clause type and frequency pairing as independent variables, indicated that there was a significant difference between SRC and ORC stimuli. Participants scored on average 10% higher on SRC trials as compared to ORC trials (F(1) = 8.6, p = 0.004). There was no main effect of frequency pairing (F(1) = 1.092, p = 0.29), nor was there a significant interaction between clause type and frequency pairing (F(1) = 1.092, p = 0.29).

In sum, the logistic mixed effects model analysis yielded a trend effect of clause type such that participants scored better on SRC trials than on ORC trials. This main effect was significant in the ANOVA analysis. Neither the logistic mixed effects model nor the ANOVA analysis yielded an effect of frequency pairing or an interaction between clause type and frequency pairing.

3.3 Effect of working memory

To assess the influence of working memory on the reading time, an additional model was constructed which included the interaction between working memory, clause type, and frequency pairing. A model comparison with the model without working memory indicated that including working memory and its interactions marginally improved the fit for the reading time at the auxiliary (p = 0.06), as well as for the spillover effect (p = 0.07). Adding working memory and its interactions did not improve the model fit for the comprehension questions (p = 0.51). As the addition of working memory did not improve the model for the comprehension questions, the accuracy models will not be discussed further.

3.3.1 Auxiliary

At the auxiliary, the significant interaction between frequency pairing and WM indicated that people with a higher WM score read the auxiliary in the LF-HF condition 0.06 log RT faster than in the HF-LF condition as compared to people with lower WM scores (SEM = 0.03, t(921.9) = -2.2, p = 0.02, 95% CI [0.02 … 0.11 log RT]). The interaction between clause type and WM score indicated that people with a higher WM score read the auxiliary 0.02 log RT slower in ORCs than in SRCs as compared to people with a lower WM score, but this was non significant (SEM = 0.03, t(28) = 0.74, p = 0.46, 95% CI [0.04 log RT faster … 0.06 log RT slower]). Lastly, the significant triple interaction between clause type, frequency pairing, and WM score indicated that people with higher WM scores read the auxiliary in LF-HF ORCs 0.1 log RT faster than HF-LF ORCs as compared to those with lower WM scores (SEM = 0.05, t(923.1) = -1.97, p = 0.049, 95% CI [0.2 log RT faster … 0.01 log RT slower]).

3.3.2 Spillover: auxiliary + succeeding word

Combining the RT of the auxiliary and the succeeding word, there was a significant interaction between frequency pairing and WM score. Those with higher WM scores read the auxiliary and succeeding word 0.04 log RT faster in LF-HF sentences than in HF-LF sentences as compared to those with lower WM scores (SEM = 0.02, t(949.1) = -2.1, p = 0.04, 95% CI [0.003 … 0.07 log RT faster]). The significant triple interaction that was found at the auxiliary, was non-significant at the auxiliary and succeeding word combined (β = -0.01, SEM = 0.04, t(957.5) = -0.4, p = 0.71, 95% CI [- 0.09 … 0.04 log RT]). The significant interaction between WM and clause type indicated that people with higher WM scores read the auxiliary and succeeding word 0.05 log RT slower in ORCs than in SRCs as compared to those with lower WM scores (SEM = 0.02, t(31) = 2.11, p = 0.04, 95% CI [0.008 … 0.11 log RT slower]).

However, when analysing only the sentences for which participants responded correctly to the comprehension question, this WM effect reversed direction and became non-significant: people with higher WM read the auxiliary in ORCs faster (β = -0.06 log RT, SEM = 0.04, t(379.1) = -1.6, p = 0.1, 95% CI [-0.14 … -0.0009 log RT]).²

In sum, the addition of WM only marginally improved the fit for the reading time analysis while it did not improve the fit of the accuracy data. The interactions between WM and frequency pairing at both the auxiliary and the spillover, indicated that people with higher WM scores read the auxiliary faster in sentences with a low frequency NP1. However, in the spillover model, the interaction between WM and clause type suggested that people with higher WM scores read ORCs slower than those with lower WM scores. This effect shifted directions and became non significant when only correct responses were analysed. Model comparison indicated that WM did not have a significant effect on the accuracy scores.

4. Discussion

4.1 Discussion of the present study

Previous work on RC processing found that object RCs are harder to process and comprehend than their subject RC counterparts (e.g. Mak et al., 2002, 2006, 2008; see Lau & Tanaka, 2021 for a review). Syntactic, pragmatic, and frequency-based explanations have been put forward to account for this processing asymmetry (see Gordon & Lowder, 2012). Johnson et al. (2011) advocated a frequency-based explanation as they reported that the subject-object asymmetry was reduced in English when the first NP had a low frequency. They argued that the reader would encode this NP more completely, which made the NP easier to reactivate at the point of disambiguation to build the correct syntactic frame. The primary aim of the present study was to replicate the findings of Johnson et al. (2011) in Dutch to establish a proof of principle. Based on their findings, it was predicted that the auxiliary would be read slower and that more errors would be made in object RCs as compared to subject RCs, but that this difference would be reduced if the first NP had a low frequency (the LF-HF condition). An additional aim of the present study was to assess the influence of WM, as it has been suggested in the literature that those with higher WM capacity process object RCs more efficiently (e.g. King & Just, 1991; Traxler et al., 2005). Based on their results, it was predicted that participants with higher WM capacity, as measured by a digit span test, would read the auxiliary in object RCs faster than those with lower WM capacity. No specific predictions were made regarding the relationship between WM capacity and frequency pairing.

2 This model did not converge with the same random effects structure. The random effects structure was simplified by only having random intercepts for subjects and items. This would lead to more liberal p-values as less variation can be taken into account, yet the effect still disappeared.


For research question (1), which asked how different frequency configurations affect the processing of RCs in Dutch, differences were found in the overall reading time of the auxiliary such that it was read slower in object RCs than in subject RCs. The results confirm the prediction that object RCs would be more difficult to process and are in line with previous research on RC processing (Frazier, 1987; Johnson et al., 2011; Lau & Tanaka, 2021; Mak et al., 2002, 2006; Traxler et al., 2002, 2005). At the auxiliary itself, there was only a trend effect. However, the confidence interval did not include 0, which normally occurs only for significant effects. This suggests that the lack of a significant effect was due to a power issue instead. The effect was strengthened at the word succeeding the auxiliary, creating a spillover effect.

Spillover effects are not uncommon in reading tasks. As Mitchell (1984) points out, “In most immediate processing tasks the end of one response measure is immediately followed by the beginning of another, together with a new portion of text. In this situation any uncompleted processing will spill over from one response measure to the next” (p. 76). In other words, the processing of word N has not been fully completed when the participant encounters word N+1. This is amplified in a paradigm such as self-paced reading, where the participant cannot return to the part of the sentence they have already read (Findelsberger et al., 2019).
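To make the notion of a spillover region concrete, the sketch below combines the RT of a critical word with that of the word following it before taking the logarithm. It assumes that the two RTs are summed and that the natural logarithm is used; the exact procedure for combining the RTs is not specified here, so this is an illustration rather than a description of the analysis. The RT values, the word position, and the function name are hypothetical.

```python
import math


def region_log_rt(rts_ms, start, length=2):
    """Return the log of the summed RT over `length` consecutive words
    starting at index `start` (assumption: RTs are summed, natural log)."""
    region = rts_ms[start:start + length]
    if len(region) < length:
        raise ValueError("region runs past the end of the sentence")
    return math.log(sum(region))


# Hypothetical word-by-word RTs (in ms) for a single trial.
trial = [412.0, 388.0, 455.0, 530.0, 610.0, 402.0]
aux_index = 3  # hypothetical position of the auxiliary in this trial

aux_only = math.log(trial[aux_index])                 # word N only
aux_plus_spillover = region_log_rt(trial, aux_index)  # word N and word N+1

print(f"auxiliary: {aux_only:.3f} log RT")
print(f"auxiliary + succeeding word: {aux_plus_spillover:.3f} log RT")
```

Analysing the combined region captures processing that was initiated at word N but only completed at word N+1, which a measure based on the auxiliary alone would miss.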

As there were no main effects of frequency in the reading times of the auxiliary and the spillover region, and as animacy was controlled for, the results are best accounted for by a syntactic explanation (Gibson, 1998). According to the principle of parsimony, the parser does not assume more arguments than needed to complete the input as a grammatical string, which would make a subject RC less costly. In subject RCs, only a subject NP trace and the verb need to be stored in memory before the syntactic frame can be completed. In object RCs, on the other hand, three components need to be stored: an object NP trace, the verb, and a subject NP for the verb. However, Gibson’s (1998) analysis in terms of memory costs does not hold for transitive RCs in Dutch. In Dutch, embedded structures are verb-final, meaning that two NPs and the verb need to be stored in memory in both RC types. Thus, both RC types impose the same memory cost. In this case, it is the integration cost that is higher for object RCs, as the parser initially ascribes a subject reading to the relative pronoun. Upon encountering the auxiliary, the parser has to reanalyse the relative pronoun as an object, which incurs a higher integration cost than in a subject RC, where the parser need not reanalyse the pronoun.

No facilitatory effect of lexical frequency was observed in the present study, as the interactions between clause type and frequency pairing were non-significant at both the auxiliary and the auxiliary+1. The results are therefore not in line with the prediction that an infrequent NP1 would reduce the subject-object asymmetry. Three possible explanations can be offered.