
Using eye-tracking technology to investigate the influence of discourse coherence on the on-line interpretation of disjoint pronouns in adults and children.


Academic year: 2021



ABSTRACT


TABLE OF CONTENTS

1. INTRODUCTION
2. BACKGROUND
2.1 Earlier Research
2.2 Research Questions, Hypotheses, and Predictions
3. METHOD EXPERIMENT 1
3.1 Subjects
3.2 Materials
3.3 Procedure
4. RESULTS EXPERIMENT 1
4.1 Behavioral data
4.1.1 Accuracy
4.1.2 Reaction Times
4.2 Eye-movement data
4.2.1 Observation Length
4.2.2 Time to First Fixation
4.3 Discussion Experiment 1
5. METHOD EXPERIMENT 2
5.1 Subjects
5.2 Materials
5.2.1 Truth-value Judgment Task
5.2.2 Working Memory Task
5.3 Procedure
6. RESULTS EXPERIMENT 2
6.1 Accuracy
6.2 Eye-movement data
6.2.1 Observation Length
6.2.2 Mean Time to First Fixation
6.2.3 Working Memory
6.3 Discussion Experiment 2
7. GENERAL DISCUSSION
REFERENCES
APPENDICES
A1 SUBJECTS EXPERIMENT 1
A2 SUBJECTS EXPERIMENT 2
APPENDIX B
1. Sentences for the Test Items
2. Sentences for the Filler Items
3. Sentences for the Practice Items
APPENDIX C
1. Pictures for the Test Items
2. Pictures for the Filler Items
3. Pictures for the Practice Items
APPENDIX D
Working Memory Task
APPENDIX E1
Instructions Experiment 1
APPENDIX E2
Instructions Experiment 2
F FORMS
1. Form to be filled out by participants of experiment 1
2. Form with screening questions to be filled out before making an appointment with the participants of experiment 2 (and their parents/guardians).
3. Informed consent form to be signed by the parents/guardians

1. INTRODUCTION

For most parents, the common belief about children’s language acquisition and development is that children comprehend aspects of language before they are able to produce them correctly. This is often true, but not for pronouns. Over the past two decades, linguistic researchers have been trying to work out why children often accept sentences such as ‘He poked him’ to mean ‘He poked himself’. ‘Him’ in this case is what is called a disjoint pronoun: it requires a referent that is not present in the same local syntactic domain (its governing category). In other words: according to syntactic rules, ‘he’ and ‘him’ cannot refer to the same person. Then why do children accept this interpretation when production tests clearly show that they are able to produce disjoint pronouns correctly?

Factors that are believed to have an influence on the interpretation of disjoint pronouns are: (lack of) knowledge of the relevant grammatical and structural rules (cf. Chomsky, 1981), semantics, salience of possible referents (cf. Grosz, Joshi, & Weinstein, 1995), context coherence (Spenader, Smits, & Hendriks, 2008; Conroy, Takahashi & Lidz, 2009), and even working memory capacity (Hendriks, 2008).

The present study compares the on-line interpretation of disjoint pronouns in children and adults. By recording the eye-movements of our participants while they are doing a picture


2. BACKGROUND

2.1 Earlier research

The Pronoun Interpretation Problem was established by Chien & Wexler in 1990, when they tested how children perform when interpreting pronouns in sentences like (2) during a picture verification task.

(1) This is Goldilocks; this is Mama Bear. Is Mama Bear washing herself?
(2) This is Goldilocks; this is Mama Bear. Is Mama Bear washing her?

During the task, children were asked to respond YES or NO to the question after seeing a picture with a self-oriented action (herself) or an other-oriented action (her). The results of this experiment reveal that children seem to have no difficulty interpreting the reflexive form (1), while performing poorly on the sentences with the pronoun ‘her’. In other words, when presented with a sentence such as (2) with a mismatching picture (Mama Bear is washing herself), children responded YES at chance level, indicating that they find both Goldilocks and Mama Bear a plausible referent for ‘her’. The children in this experiment were 2;6 to 7;0 years old and although the oldest participants (6;6 to 7;0 years old) performed better (up to 80%) than the younger children (57% and less), they still failed to reach adult-like scores (up to 100%). Given that children have already gone through most stages of first language acquisition at age 5;0, these results give rise to the question of why children have such a hard time interpreting disjoint pronouns, especially since the same study found that the production of these pronouns is adult-like. Other production studies, too, found that children correctly produce pronouns from the age of 4;0 (Bloom et al. 1994; de Villiers et al. 2006; Spenader et al. 2008).

The phenomenon described above is known as the Delay of Principle B Effect (DPBE), in which Principle B refers to one of the constraints in Chomsky’s Binding Theory (1981). The Principles of this theory state that:

(3) - A reflexive must be bound locally (Principle A), so:
(a) John₁ sees himself₁
(b) *John₁ sees himself₂
- A pronoun must be free in its governing category (Principle B), so:
(c) John₁ sees him₂
(d) *John₁ sees him₁
- An R-expression must always be free (Principle C), so:
(e) John₁ sees Mary₂
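The three Principles in (3) can be read as a simple decision rule. The Python sketch below is only an illustration under strong simplifying assumptions: it covers a single transitive clause in which the subject is the only possible local binder, and the function name `violates_binding` and the tuple encoding of the examples are hypothetical choices made for this illustration, not part of Binding Theory itself.

```python
# Illustrative sketch: checks the object of a simple clause "Subject sees Object"
# against Principles A-C. The encoding (indices, kind labels) is an assumption
# made for this example only.

def violates_binding(subject_index: int, object_kind: str, object_index: int) -> bool:
    """Return True if the object NP violates its Binding Principle."""
    bound_locally = subject_index == object_index  # co-indexed with the subject
    if object_kind == "reflexive":
        return not bound_locally   # Principle A: must be bound locally
    if object_kind in ("pronoun", "r-expression"):
        return bound_locally       # Principles B and C: must be free
    raise ValueError(f"unknown object kind: {object_kind!r}")


# Examples (3a)-(3e), encoded as (subject index, object kind, object index).
EXAMPLES = {
    "(a) John1 sees himself1": (1, "reflexive", 1),
    "(b) John1 sees himself2": (1, "reflexive", 2),
    "(c) John1 sees him2": (1, "pronoun", 2),
    "(d) John1 sees him1": (1, "pronoun", 1),
    "(e) John1 sees Mary2": (1, "r-expression", 2),
}

for sentence, clause in EXAMPLES.items():
    star = "*" if violates_binding(*clause) else " "
    print(star, sentence)
```

Under these assumptions the sketch stars exactly (b) and (d), the two examples marked ungrammatical in (3).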


These Principles are assumed to be universal, which means that they hold in every language and that all language users obey them. Do Chien & Wexler’s (1990) results imply that the Principles do not hold for children up to the age of 7 and therefore may not be universal after all? The researchers themselves argue that Principle B remains intact, but that children’s ‘underdeveloped pragmatic skills’ are responsible for the delay in obeying the Principle. They base this conclusion on the part of the experiment in which the antecedent was sometimes a quantified NP, as in (4), and sometimes was not. Children performed better when a quantifier was present, indicating that they are able to find the correct antecedent for a pronoun.

(4) This is Goldilocks; these are bears. Is every bear touching her?

We call this phenomenon Quantificational Asymmetry (QA) and it has been tested in English (cf. Thornton 1990; McDaniel & Maxfield 1992), Dutch (Philip & Coopmans 1996), Norwegian (Hestvik & Philip, 1999), Russian (Avrutin & Wexler, 1992), and French (Harmann, Kowalski & Philip 1999). Elbourne (2005) conducted a meta-study in which he reviewed the above-mentioned studies, among others. He claims that in (4) children do not obey Principle B but ‘[…] interpret pronouns in the way made most plausible by the context, and the scenarios used in the relevant experiments make it likely that the pronouns in question will be interpreted as referring to certain prominent characters’ (Elbourne 2005:339). The presence of ‘every’ in (4) causes the NP (bear) to be salient, and salient items are very accessible and easier to interpret; on this view the QA is not a reflection of children’s knowledge of Principle B and is thus reduced to a methodological artefact. Furthermore, Elbourne claims that the materials that are used to test QA and the DPBE are incoherent and unnatural, causing test results to be unreliable.

Looking at the materials used by Chien & Wexler (1990), we notice a few things. As for the pictures, the drawings are not very clear¹. In the picture where Mama Bear is touching Goldilocks, it is hard to tell from the drawing who is touching whom. In addition, the bears are considerably smaller than Goldilocks, which makes Goldilocks more salient. It has been argued that the size of the possible antecedents may have an effect on the child’s interpretation process (Elbourne, 2005). Looking at the distance between Goldilocks and Mama Bear, we notice that there is much more space between the two characters in the pictures with the self-oriented action (herself) as compared to the pictures with the other-oriented action (her). These extra-linguistic features may seem trivial, but if salience indeed affects pronoun interpretation, then Chien & Wexler’s (1990) test results are very likely to have been contaminated. The problem with the sentences in the experiment is that the conjunction structure of (1) and (2) causes the available linguistic antecedents (Goldilocks and Mama Bear) to be equally salient. This is odd, because in natural discourse the preceding context helps us link a pronoun to the correct referent. In fact, we do not use a pronoun unless there is a clear referent available; discourse would become messy if we used pronouns all over the place.

In a recent article, Conroy et al. (2009) underline the crucial role of discourse coherence for choosing the correct referent. They conducted their own experiments and reviewed 30 articles that have investigated the DPBE and QA. In the meta-study, they found that in the Truth-value Judgment Task studies where the correct referent is accessible in the context, children perform significantly better than when there is no clear antecedent available. This is interesting because it suggests that the DPBE is possibly an experimental artifact caused by the unnatural structure of sentences such as (1) and (2), which lack the natural context that is required to help find the correct antecedent. For their own study, Conroy et al. (2009) chose to use an elaborate context with a central character and a prominent other character. They conducted three experiments to test the hypothesis that DPBE and QA are an artifact of experimentation and that children do know and obey Principle B. In experiment 1, children did a Truth-value Judgment Task (TVJT), accompanied by a Kermit the Frog puppet. After the experimenter acted out a story like (5) with props, Kermit made a statement about the story, such as ‘I think Grumpy painted him’ (referential condition) or ‘I think every Smurf painted him’ (quantificational condition) and the child had to indicate whether this statement was correct or not.

(5) ‘Papa Smurf announces that Snow White is going to have a party, and that she is going to have a painting contest. Papa Smurf declares that he is going to be the judge. Each of the dwarves shows and discusses the color of paint that he is going to use to get painted, as does Tennis Smurf. However, Hiking Smurf does not have any paint, and he wonders whether one of the characters will be willing to share. He first approaches Happy, who says that he would be glad to help out if any of the paint remains after he is painted. Fortunately, when Happy is finished some paint remains and so he paints Hiking Smurf. Hiking Smurf, however, is not yet satisfied, so he approaches Dopey with a similar request, which is successful. Then, Grumpy, who is in such a bad mood that he doesn’t even want to go to the party, declares that he doesn’t need to get painted. The other dwarves really want him to go, and Grumpy agrees to get painted, using all of his paint in the process. After Grumpy is painted, Hiking Smurf approaches him and asks for some paint. Grumpy politely apologizes that he would like to help but cannot, because he has used up all of his paint. Hiking Smurf realizes that his best remaining chance is to ask Tennis Smurf for some extra paint, and Tennis Smurf obliges when he is asked. Finally, everybody is ready for Snow White’s party.’²

Children performed almost adult-like in this first experiment: they incorrectly accepted the bound interpretation of the pronoun in only 11% of the cases (compare: adults accepted in 5% of the cases), so the DPBE is almost eliminated. A coherent context also reduces QA: children accepted the referential interpretation in ‘every dwarf painted him’ in 14% of the cases. These results show that children benefit from a coherent context, but still do not answer the question of whether children generally prefer an unbound interpretation of pronouns over an anaphoric interpretation of pronouns. Experiment 2 was identical to Experiment 1, except that in the test sentence the pronoun was replaced by a noun phrase (NP) containing a possessive pronoun, changing the statement into (6) for the referential condition and (7) for the quantificational condition respectively.

(6) Grumpy painted his costume
(7) Every Dwarf painted his costume

The anaphoric reading is in these cases acceptable because the pronoun is embedded as the possessor in the object NP. The sentences are therefore fully ambiguous, but because of the possessive they are somewhat biased towards a referential interpretation. The results reveal that children preferred the bound interpretation of (6) and (7) at an above-chance level (80% and 73% acceptance in the referential and the quantificational condition respectively). However, the results would have been more useful if the researchers had included another version of the story in which Grumpy painted Hiking Smurf’s costume. That way, one can control for the influence of the picture during the task. In the third experiment, the story is changed again. Hiking Smurf is still the central character, but there is no longer a second prominent character available to which the pronoun can refer. The researchers also changed the statements into (8) for the referential condition and (9) for the quantificational condition:

(8) Hiking Smurf painted him
(9) Every dwarf painted him

They argue that “if children simply interpret “him” as referring to Hiking Smurf, then they should judge the test sentence TRUE in the referential condition and FALSE in the quantificational condition” (Conroy et al., 2009:30). This means that the intended reading is actually the


experiments, there is still a ‘residual’ of approximately 6% that is real.³ This means that the processing of pronouns probably is different between children and adults, but this difference is not as dramatic as Chien & Wexler (1990) once proposed.

Conroy et al. (2009) explain that children do seem to know and obey Principle B, but that once they have settled on an incorrect referent they find it difficult to recognize the error and recover from it. Previous research on the on-line interpretation of pronouns has provided evidence for the claim that adults, too, temporarily allow ungrammatical antecedents as possible referents for pronouns, but unlike children, adults are able to recover from incorrect inferences (Badecker & Straub 2002; Runner, Sussman & Tanenhaus 2003). Kennison (2003) proposes a three-phase model for establishing the correct referent for a pronoun or anaphor. In the first phase, a number of possible referents are selected. These may also include referents that violate Binding Theory, referents that are not available in the context, and referents that violate gender and number. In the second phase, the Bonding phase, Binding Principles are used to link the anaphors to the antecedents before other factors (such as gender and number) are taken into account. These links are then evaluated during the last phase: the Resolution stage. Once a clear antecedent is available in the context and obeys Binding Principles, the search for a correct referent is terminated. However, when the antecedent is not linguistically salient or is even absent, so that no structurally available antecedent passes the Resolution stage, the pronoun or anaphor may be linked to an unmentioned entity, thus violating Binding Principles.

Assuming that the Bonding stage is not yet fully developed in children, how does this model work for them? If they create a set of possible referents in phase one, they will probably depend more on salience and contextual information than on structural availability for choosing the correct referent.

Spenader, Smits & Hendriks (2008) also believe that a coherent context is essential, but they prefer reducing the context to one sentence instead of telling an elaborate story as Conroy et al. (2009) did. In their comprehension experiment they had 83 children do a Truth-value Judgment Task. The experimenter told the children that their help was needed with the computer because it was broken; sometimes the computer said things that were incorrect. The young participants were presented with a picture of an animal doing something either to itself or to the other animal in the picture. At the same time, they heard a pre-recorded story about the picture, and they were asked to tell the experimenter whether the computer was right or wrong by pointing at a smiley face or a sad face. The experiment consisted of 18 items for each participant. The context sentences were manipulated, creating three context conditions:

(10) Classic: Here you see an elephant and an alligator. The elephant hits him/himself.
(11) Single Topic: Here you see an alligator. The elephant hits him/himself.
(12) Embedded: The alligator says that the elephant hits him/himself.

The sentence type that Chien & Wexler (1990) used in their experiment corresponds to the Classic Condition (10). As noted earlier, the conjunction structure in (10) causes the two referents in the context sentence to be equally available, creating an unnatural and incoherent context. In natural discourse we only use a pronoun when a clear referent is available in the context. Compare (10) with (11), where only the disjoint topic is introduced, making the correct antecedent salient and the context more natural. Spenader et al. (2009) predicted that children especially benefit from this and that the DPBE would be reduced in the Single Topic Condition. Their results show that participants generally performed well on the items with a reflexive; a coherent discourse preceding a sentence with a reflexive did not particularly facilitate the interpretation of reflexives and the mean error rate across context conditions was only 16%. For pronouns, children performed significantly better in the Single Topic Condition, with an error rate of 17% compared to 31% and 28% in the Classic Condition and the Embedded Condition respectively. In other words, children are perfectly capable of interpreting pronouns when the correct disjoint antecedent is salient. Although the correct referent is available too in the Classic Condition (10), neutralizing the influence of discourse context causes the processing of pronouns to be more demanding; children are not yet able to apply Principle B in an adult-like manner and, unlike adults, they rely mostly on contextual cues rather than on knowledge of Principle B (Hendriks et al. 2009).

To summarize, there seems to be a discrepancy between Binding Theory and the experimental results, as well as an asymmetry between the interpretation and the production of disjoint pronouns. Bidirectional Optimality Theory (OT) is a theoretical framework that can explain both asymmetries in a satisfying way (Hendriks & Spenader 2005/6). In this framework, the speaker and listener take each other’s perspective into account and recognize that there were other (unexpressed) options to choose from. Consider (13) - (16):

(13) ?When Tom₁ walked into the bar, Tom₁ saw James₂, and Tom₁ bought James₂ a drink.
(14) When he₁ walked into the bar, Tom₁ saw James₂ and he₁ bought him₂ a drink.
(15) ?When he₁ walked into the bar, he₁ saw him₂ and bought him₂ a drink.
(16) *When he₁ walked into the bar, Tom₁ saw James₂ and bought *him₁ a drink.

Of the above sentences, (14) is the most optimal because it is economical without losing essential information. Adults interpret (14) the way it is presented here; they know ‘him’ cannot refer to Tom because according to Binding Theory, a pronoun must be free in its governing category. A speaker who expresses (14) assumes that the listener knows and obeys Binding Theory, and that the indexing shown in (14) is therefore the only possible interpretation. Optimizing bidirectionally means evaluating both from form to meaning and from meaning to form. In other words, when trying to get a message (meaning) across, one uses language to construct a way of saying it (form). Conversely, when listening to someone speak, one has to decode the words (form) in order to understand the message (meaning). Hendriks & Spenader argue that the reason why children up to the age of 7;0⁴ still show some DPBE is that they are only able to optimize unidirectionally, meaning that they only consider their own perspective and not the other person’s; they use language in a way that makes sense to them and not necessarily to their conversational partner. So when children produce a disjoint pronoun, they do not think about whether the other person understands which antecedent it refers to. At the same time, when they hear a disjoint pronoun, they do not consider the speaker’s reasons for using this pronoun and that the speaker had alternative ways (for instance, an NP) of getting his message across.

⁴ That is, when the correct referent is neutralized in the context as in: ‘This is Goldilocks; this is Mama Bear. Mama Bear is

Adults, on the other hand, are aware of the alternative ways of referring to something and we saw earlier that they prefer to use a pronoun or even a reflexive instead of an NP because the former is more economical. However, adults also know that reflexives are locally bound and that we can only use a pronoun correctly if there is a clear antecedent available; there are implicational hierarchies (or: constraints) that influence the behavior of anaphoric relations in our language (Burzio 1998). The constraints that apply to the use of anaphoric expressions are described in (17) and (18), where (17) is stronger than (18). The stronger the constraint, the more important it is to satisfy.

(17) Principle A: Reflexives must be bound locally
(18) Referential Economy: Avoid R-expressions⁵ >> Avoid pronouns >> Avoid reflexives

The above constraints have two important features. First of all, (17) is a constraint on the interpretation and production of anaphors, while (18) is a constraint on only the production of pronouns and anaphors. (17) and (18) do not apply to the interpretation of pronouns. Secondly, (18) Referential Economy states that reflexives are preferred over pronouns as bound NPs, and pronouns are preferred over R-expressions as bound NPs regardless of their interpretation. (17) Principle A states that a reflexive is used only if a coreferential meaning is intended. This constraint is stronger than (18) and therefore more important to satisfy. Had (18) been the strongest constraint, then the only NPs in language would be reflexives.

In order to focus on the asymmetry between production and comprehension, we need to distinguish between the speaker’s perspective and the listener’s perspective, allowing production and comprehension to be modeled as separate directions of discourse: from meaning to form for the speaker, and from form to meaning for the listener. In production, the interaction of (17) and (18) leads to (19) and causes no problems:

(19) Bound NP: Coreferential meaning?
→ YES: use reflexive
→ NO: use pronoun >> R-expression

In comprehension, however, (18) Referential Economy does not apply because it is a constraint on form and not on meaning, and (17) Principle A only has an effect when a reflexive is in the input.

When the input form is a pronoun and neither constraint applies, pronouns become ambiguous. Consequently, a coreferential meaning and a disjoint meaning become equally plausible.

We can put the constraints (17) and (18) in a tableau that is typical of Optimality Theory to evaluate which form is most optimal. In such a tableau, the constraints are presented in order of dominance, from left to right. A violated constraint is marked with an asterisk (*); multiple violations of the same constraint are marked with multiple asterisks. The exclamation mark marks the crucial violation. The most optimal output is the one with the fewest or no violations of the highest-ranked constraint (in our case, Principle A), and is marked with a pointing finger. A candidate may violate several lower constraints multiple times while obeying the most dominant one and therefore still be the most optimal candidate. So, when we have a reflexive in the input form, we can evaluate which meaning is most optimal according to Optimality Theory by using Tableau 1.1 below.

INPUT: reflexive          | Principle A | Referential Economy
☞ Coreferential meaning   |             |
   Disjoint meaning       | *!          |

Tableau 1.1: The constraint Principle A for the disjoint meaning is violated when a reflexive is in the input. Thus, a coreferential meaning is the only correct interpretation.

The tableau shows that a disjoint meaning violates Principle A, the most highly ranked constraint. A coreferential meaning does not violate this constraint, making it the most optimal candidate. When we look at Tableau 1.2, we see that neither constraint is violated when a pronoun is in the input, because Principle A only applies to reflexives and Referential Economy only applies to production and not to interpretation; both the coreferential meaning and the disjoint meaning become plausible.

INPUT: pronoun            | Principle A | Referential Economy
☞ Coreferential meaning   |             |
☞ Disjoint meaning        |             |

Tableau 1.2: The constraints have no effect when a pronoun is in the input. Both meanings become plausible.
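The evaluation behind Tableaux 1.1 and 1.2 can be sketched in a few lines of Python. This is a minimal illustration rather than an implementation taken from the literature: the function names and the numeric violation counts are assumptions made for this example. Comparing violation profiles as tuples mirrors the left-to-right dominance of the constraints.

```python
# Minimal sketch of unidirectional (form -> meaning) evaluation, as in
# Tableaux 1.1 and 1.2. All names are hypothetical.

MEANINGS = ("coreferential", "disjoint")


def principle_a(form: str, meaning: str) -> int:
    """One violation if a reflexive is not bound locally (disjoint meaning)."""
    return 1 if form == "reflexive" and meaning == "disjoint" else 0


def referential_economy(form: str, meaning: str) -> int:
    """Constraint on form only: for a fixed input form it cannot
    distinguish between meanings, so no violations are counted here."""
    return 0


# Constraints in order of dominance, strongest first.
RANKING = (principle_a, referential_economy)


def violation_profile(form: str, meaning: str) -> tuple:
    return tuple(constraint(form, meaning) for constraint in RANKING)


def optimal_meanings(form: str) -> list:
    """Return every meaning whose violation profile is minimal.
    Tuples compare left to right, which models strict domination."""
    best = min(violation_profile(form, m) for m in MEANINGS)
    return [m for m in MEANINGS if violation_profile(form, m) == best]


print(optimal_meanings("reflexive"))  # ['coreferential']
print(optimal_meanings("pronoun"))    # ['coreferential', 'disjoint']
```

With a reflexive in the input only the coreferential meaning survives, whereas a pronoun leaves both meanings tied, which is the ambiguity shown in Tableau 1.2.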


R-expressions for bound NPs where possible. The listener also knows that because of Principle A, a reflexive can only be used when a coreferential meaning is intended. The pronoun used must therefore have a disjoint meaning. By being aware of the unexpressed other possibilities that the speaker could have used to express reference, the listener optimizes bidirectionally: from form to meaning and from meaning to form. When, then, do children make the transition from unidirectional to bidirectional optimization? Hendriks & Spenader (2005/6) argue that optimizing in the opposite direction requires a second-order Theory of Mind, which is not developed until the age of 5. But, as we have seen, studies that test the DPBE show that children do not perform adult-like until the age of 7.
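The listener's reasoning described above can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not the authors' formalization: only reflexives and pronouns are considered as forms (R-expressions are left out), Referential Economy is reduced to a cost of 0 for a reflexive and 1 for a pronoun, and all function names are hypothetical. A meaning survives bidirectional optimization only if the speaker would also have chosen the heard form to express it.

```python
# Sketch of bidirectional optimization over (form, meaning) pairs.
# Assumptions: two forms only; Referential Economy as a simple cost per form.

FORMS = ("reflexive", "pronoun")
MEANINGS = ("coreferential", "disjoint")
ECONOMY_COST = {"reflexive": 0, "pronoun": 1}  # reflexives are most economical


def profile(form: str, meaning: str) -> tuple:
    """Violations of (Principle A, Referential Economy), strongest first."""
    principle_a = 1 if form == "reflexive" and meaning == "disjoint" else 0
    return (principle_a, ECONOMY_COST[form])


def best_forms(meaning: str) -> list:
    """Speaker's direction: from meaning to form."""
    best = min(profile(f, meaning) for f in FORMS)
    return [f for f in FORMS if profile(f, meaning) == best]


def best_meanings(form: str) -> list:
    """Listener's direction without perspective taking: from form to meaning."""
    best = min(profile(form, m) for m in MEANINGS)
    return [m for m in MEANINGS if profile(form, m) == best]


def bidirectional_meanings(form: str) -> list:
    """Keep a meaning only if the speaker would have used this form for it."""
    return [m for m in best_meanings(form) if form in best_forms(m)]


print(best_meanings("pronoun"))           # ['coreferential', 'disjoint']
print(bidirectional_meanings("pronoun"))  # ['disjoint']
```

In this sketch the unidirectional listener is left with two meanings for a pronoun, while the bidirectional listener rules out the coreferential one because a speaker would have used a reflexive to express it.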

Hendriks (2008) believes that there is a correlation between the ability to optimize bidirectionally and working memory size; evaluating the production and interpretation of expressed forms is complex and requires a large working memory. People with a smaller working memory, such as children and elderly people, are therefore expected to perform significantly worse than young adults during a task in which they have to optimize bidirectionally. Two participant groups (young adults and elderly adults) did a comprehension task and an elicitation task in which they had to tell a story based on a sequence of pictures that they saw. As for the production task, the group of elderly adults produced more pronouns to refer to the old topic while there was a new topic present as well. Example (20) below is a transcript of an elderly person describing a picture in which a lady gives an ice cream cone to a little girl:

(20)
a. A lady₁ is holding an ice cream cone in her hand.
b. […] and that girl₂ certainly wants that ice cream cone.
c. She₂ gets it.
d. She₂ enjoys it.
e. […] she₁ is going to buy another ice cream cone […] at the ice cream van.

(Hendriks 2008, p. 455/6)

In (20e), the participant still uses the pronoun ‘she’ although it no longer refers to ‘that girl’ in (20b). Hendriks (2008) found that elderly adults tend to use pronouns to refer to an old topic in the presence of a new one, thus making the pronoun ambiguous. An older study by Karmiloff-Smith (1985) showed the same pattern for children up to the age of 6;0. The young adults in the Hendriks (2008) study correctly used a full NP (“The lady”) in (20e) to indicate a topic shift. They scored better in the working memory test than the elderly adults: 9.2 (SD: 2.29) versus 13.0 (SD: 2.29) respectively. The findings support Hendriks’s hypothesis that there is a correlation between working memory capacity and the ability to optimize bidirectionally: because elderly adults are limited in their processing capacities, they are not always able to consider the hearer’s perspective. Consequently, they often produce ambiguous pronouns in subject position.


found that children, who have a smaller working memory, produce more (ambiguous) pronouns in subject position. Going back to example (20) above, this means that children use a pronoun where they should have used a full NP; like the speaker in (20), the child uses ‘she’ to refer to the lady as well as to the girl without indicating a topic shift. In addition, the results of the interpretation test reveal that children respond at chance level when there is a topic shift. This means that the children found the two possible referents in the task equally suitable as referents for the pronoun, while the adults in the control group had a stronger preference for the ‘correct’ referent.

So far, we have discussed the Delay of Principle B Effect, Quantificational Asymmetry, and what may have caused children to perform so poorly on tasks in which they have to choose the correct interpretation of disjoint pronouns. Spenader et al. (2009) claim that a coherent discourse helps children to find the correct meaning of pronouns; when the correct referent is salient, the DPBE is almost resolved. Conroy et al. (2009) agree that a coherent context is crucial, but believe that the ‘residual’ DPBE is caused by children’s inability to recognize and inhibit incorrect interpretations. Hendriks & Spenader (2005/6), on the other hand, argue that the DPBE is caused by children’s inability to put themselves in their conversational partner’s perspective. Moreover, Hendriks (2008) and Wubs (2008) found a correlation between working memory capacity and the ability to optimize bidirectionally. In other words, children, who have a smaller working memory capacity than adults, perform poorly during DPBE tasks⁶ because interpreting disjoint pronouns involves the complex task of taking the speaker’s perspective into account.

The above studies are all off-line tests of DPBE in children. Although researchers agree that the DPBE is real, there are still some blanks to fill. In the present study we want to further investigate the on-line processing of pronouns in children by using eye-tracking technology. Eye-movements can tell us much about the processing of linguistic input, for instance whether children even consider certain antecedents as a possible answer and whether a coherent context influences their decision. So far, this technology has only been used to test the on-line processing of pronouns in adults (Hendriks et al., 2010), which makes the present study a valuable addition to DPBE research.

The present study is partly a replication of Hendriks et al.’s (2010) study, in which they tested whether adults benefit from a coherent discourse during the on-line interpretation of pronouns. Participants were presented with a picture-story combination. In the sentences, the Type of Introductory Sentence (Classic Condition or Single Topic Condition, see also Spenader et al., 2009) and Type of Anaphor (reflexive or pronoun) were manipulated, thus creating 2 x 2 = 4 conditions, each with matching and mismatching pictures. The researchers collected accuracy data, reaction times data, and eye-movement data. As expected, adults scored 96% correct on average, and the researchers found no significant difference between conditions for the few incorrect responses. The reaction times data revealed that pronouns in the Classic Condition (where two possible referents are equally salient) lead to significantly slower responses than the other three conditions. This finding is in line with the results of Spenader et al.’s (2008) off-line study with children, who clearly benefited from a coherent discourse. Interestingly, the eye-movement data show a different pattern. It was mostly the type of anaphor that influenced the processing of linguistic input; the looking patterns indicate that a disjoint interpretation was more demanding to process than a coreferential interpretation. This seems to contradict the reaction times data of the same study, but when we look at the eye-movement plots of the items with a pronoun in the linguistic input but with a picture that shows a self-oriented action, we observe a high proportion of looks at the distractor. Although these findings are tentative and not significant, what they may suggest is that a picture with a self-oriented action is more salient than one with an other-oriented action. Therefore, it seems that adults also benefit from a coherent discourse, though perhaps not as much as children do.

2.2. Research Questions, Hypotheses, and Predictions

For the present study, we investigate the on-line interpretation of disjoint pronouns by recording accuracy rates, reaction times (only for the adult participants) and eye-movement data during a Truth-value Judgment Task (TVJT) where 20 adults and 20 child participants respond TRUE or FALSE to a picture-sentence combination. The pre-schoolers also do a working memory task that allows us to test whether the children with a larger working memory also perform better during the TVJT. Analysis of these data should enable us to answer the following research question (21):

(21) What can eye-movement data reveal about the on-line processing of disjoint pronouns in pre-schoolers and does it differ from adults’ on-line processing of disjoint pronouns?

a. Do both children and adults benefit from a coherent context?

b. Is there a correlation between children’s working memory capacity and the ability to interpret disjoint pronouns correctly?

On the basis of previous findings we can formulate a number of predictions.

First of all, it is expected that adults score up to 100% accuracy and that there is no significant difference between conditions for the few items that were answered incorrectly. Following Spenader et al., we predict that, in children, a coherent discourse facilitates disjoint pronoun interpretation and this should become visible in the accuracy data: a higher percentage of correct items when the correct referent is salient compared to the items in the Classic Condition.

Secondly, Hendriks et al.’s results in their study of on-line processing of pronouns in adults lead us to predict that adults’ reaction times should reveal that pronouns in the Classic Condition are more difficult to process than pronouns for which a coherent context is available. These items should therefore have longer reaction times than the items in the other conditions. In addition, reflexives are generally easier to interpret, so their reaction times are expected to be shorter.


correct referent and faster recognition of the correct referent. In addition, if children find it difficult to recognize and inhibit incorrect referents, as Conroy (2009) claims, then there should be a significant difference between matching and mismatching picture-sentence combinations. If this is indeed the case, then children need more time to find the correct referent when the picture does not match the sentence, if they can find the correct referent at all. This effect should especially become visible in items with a pronoun, without a salient correct referent, and a picture with a self-oriented action.


3. METHOD EXPERIMENT 1

Experiment 1 is a reproduction of Arina Banga’s MA thesis project (Banga, 2008). It is a truth-value judgment task in which adult participants are asked to respond TRUE or FALSE to a picture-sentence combination while their reaction times and eye movements are being recorded.

3.1 Subjects

Twenty native Dutch speakers (age 20 to 43; mean 26.2) participated in the adult experiment on a voluntary basis. All subjects had a college degree or were still in college. They all had normal or corrected vision, no hearing disabilities, and no history of neurological problems (ADHD, autism spectrum disorders). Please refer to Appendix A1 for a summary of the subjects of experiment 1.

3.2 Materials

The experiment consisted of 16 experimental items and 16 fillers, where one item is a combination of a picture on a screen and a pre-recorded statement about the picture. The task items and fillers were randomly distributed over four lists. The pictures were created by art student Petra van Berkum, and the sentences were recorded by a male speaker with a calm voice and a neutral accent in a sound studio.

Each picture shows two different animals. One of them does something either to the other animal (other-oriented action) or to itself (self-oriented action). The pictures are controlled for the size of the animals, extra attributes, and type of action. Picture 3.1 shows two examples of pictures that were used in the experimental items.

Picture 3.1: Two examples of pictures of experimental items. The left picture shows a self-oriented action; the right picture shows an other-oriented action.


as opposed to zich (himself)7, and (3) since the sentences are presented with a picture, it is important that it is possible to make simple and unambiguous drawings of the verb. So, we used the following verbs for the experimental items: aankleden (to dress), bijten (to bite), slaan (to hit), vastbinden (to tie), tekenen (to draw), wijzen (to point), schminken (to make up), kietelen (to tickle). Because the number of verbs that qualify is so small, each verb comes in two different versions. For example, slaan (to hit) has one version with an elephant and a crocodile and another version with a penguin and a sheep. This construction allows us to present the same verb twice, but in a different condition. Another advantage is that the experiment is kept interesting.

As for the pre-recorded audio files, we manipulated two factors in the sentences: Type of Introductory Sentence and Type of Anaphor. In 50% of the cases, the introductory sentence introduced both animals. We call this type of introductory sentence the Classic Condition (or C-condition), as in (22). The other condition, the Single-Topic Condition (or S-condition), introduces only the animal that does not perform the action (23).

(22) Een kameel en een beer zijn in de klas

(A camel and a bear are in the classroom)

(23) Een kameel is in de klas

(A camel is in the classroom)

The second factor we manipulated is Type of Anaphor. The anaphor, which is either a Reflexive (R) as in (24) or a Pronoun (P) as in (25), is revealed in the second part of the audio file. This is also the point where the item is disambiguated; reaction times and eye-movements are time-locked to this moment.

(24) De beer kleedt zichzelf aan

(The bear is dressing himself)

(25) De beer kleedt hem aan

(The bear is dressing him)

Now, when we combine the two factors and bear in mind that the sentences may be paired with a matching or a mismatching picture, we end up with 2 x 2 x 2 = 8 conditions. Table 3.1 gives an overview of the possible conditions. Please refer to Appendix B1 for an overview of all test items.


| Condition | Type of Introductory Sentence | Type of Anaphor | Type of action | Sentence | Picture | Intended response |
|---|---|---|---|---|---|---|
| CR - Match | Classic | Reflexive | Self-oriented | Een beer en een kameel zijn in de klas. De beer kleedt zichzelf aan. | [picture] | TRUE |
| CR - Mis-match | Classic | Reflexive | Self-oriented | Een beer en een kameel zijn in de klas. De beer kleedt zichzelf aan. | [picture] | FALSE |
| CP - Match | Classic | Pronoun | Other-oriented | Een beer en een kameel zijn in de klas. De beer kleedt hem aan. | [picture] | TRUE |
| CP - Mis-match | Classic | Pronoun | Other-oriented | Een beer en een kameel zijn in de klas. De beer kleedt hem aan. | [picture] | FALSE |
| SR - Match | Single-Topic | Reflexive | Self-oriented | Een kameel is in de klas. De beer kleedt zichzelf aan. | [picture] | TRUE |
| SR - Mis-match | Single-Topic | Reflexive | Self-oriented | Een kameel is in de klas. De beer kleedt zichzelf aan. | [picture] | FALSE |
| SP - Match | Single-Topic | Pronoun | Other-oriented | Een kameel is in de klas. De beer kleedt hem aan. | [picture] | TRUE |
| SP - Mis-match | Single-Topic | Pronoun | Other-oriented | Een kameel is in de klas. De beer kleedt hem aan. | [picture] | FALSE |

Table 3.1: Overview of all conditions, including examples of sentences and pictures.
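The 2 x 2 x 2 design can also be written out programmatically. The following is only an illustrative sketch: the function name `build_conditions` and the dictionary keys are hypothetical and not part of the original materials, and the label spelling is normalised to "Mismatch".

```python
from itertools import product

# Factor levels as described in the text; the short codes follow Table 3.1.
INTRO = {"C": "Classic", "S": "Single-topic"}
ANAPHOR = {"R": ("Reflexive", "self-oriented"), "P": ("Pronoun", "other-oriented")}


def build_conditions():
    """Enumerate the 2 x 2 x 2 = 8 conditions (hypothetical helper)."""
    rows = []
    for (i_code, intro), (a_code, (anaphor, action)), match in product(
        INTRO.items(), ANAPHOR.items(), (True, False)
    ):
        rows.append(
            {
                "label": f"{i_code}{a_code}-{'Match' if match else 'Mismatch'}",
                "intro": intro,
                "anaphor": anaphor,
                "action": action,  # action described by the sentence
                "intended": "TRUE" if match else "FALSE",
            }
        )
    return rows


if __name__ == "__main__":
    for row in build_conditions():
        print(row["label"], row["intended"])
```

The intended response depends only on whether the picture matches the sentence, which is why it is derived from the `match` flag alone.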

As mentioned earlier, we added 16 fillers in order to cover up the purpose of our experiment. We created four types of fillers. In two types, we used the verb of an experimental item and replaced the anaphor by a noun phrase (NP). These types of fillers are referred to as co-reference (26) and disjoint fillers (27).

(26) Een egel is op de stoep. De olifant slaat op zijn poot

(A hedgehog is on the sidewalk. The elephant is hitting his foot)


Note that the pictures that are used for these fillers are the same as those for the experimental items. Consequently, participants may come across the same verb four times in one session, but never with the same sentence. In some cases we were able to replace the second animal (the one that is not performing the action), to create as much variety among items as possible.

In the other two types of fillers the animate objects are replaced by a prepositional phrase (28) or an inanimate object (29).

(28) Een beer en een pinguïn zijn in de tuin. De pinguïn zit in bad

(A bear and a penguin are in the garden. The penguin is sitting in the bath tub)

(29) Een kip is in de klas. Het paard drinkt cola

(A chicken is in the classroom. The horse is drinking a coke)

Since these two types of filler items are very straightforward and unambiguous and are completely unrelated to the experimental items, we used these fillers as control items. This means that if a participant answers 3 or more (out of 8) of these items incorrectly, they are excluded from the experiment because it is likely that they do not understand the task or do not take it seriously. For a complete overview of all filler items, please refer to Appendix B2.
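The exclusion rule for the control fillers can be expressed as a small check. This is a minimal sketch; the function name `is_excluded` and the list-of-booleans input format are assumptions made for illustration.

```python
def is_excluded(control_correct, max_errors=2):
    """Return True if a participant should be excluded.

    control_correct: one boolean per control filler (8 in this experiment),
    True when the participant gave the intended response.
    A participant with 3 or more errors (more than max_errors) is excluded.
    """
    errors = sum(1 for ok in control_correct if not ok)
    return errors > max_errors


if __name__ == "__main__":
    print(is_excluded([True] * 6 + [False] * 2))  # 2 errors: kept
    print(is_excluded([True] * 5 + [False] * 3))  # 3 errors: excluded
```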

Picture 3.2: Filler with a PP. The penguin is in the bath tub

Picture 3.3: Filler with an inanimate object. The horse is drinking a coke

The four experimental lists are randomized, but the order in which the items appear on the list is not completely arbitrary. First of all, the items with the same verb are separated as much as possible, which includes the fillers that have the same verbs as some of the experimental items. Secondly, the same Type of Introductory Sentence never appears more than twice in a row. Bearing in mind that half of our participants were going to be pre-schoolers, we made sure that there are never more than three matching or three mismatching items in a row, regardless of whether they are fillers or experimental items.
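The two hard ordering constraints (at most two identical introductory sentence types in a row, at most three matching or mismatching items in a row) can be checked mechanically. The sketch below assumes a hypothetical item format, a dictionary with the keys `intro` and `match`; the soft constraint on verb spacing is not modelled.

```python
from itertools import groupby


def longest_run(values):
    """Length of the longest run of identical consecutive values."""
    return max((sum(1 for _ in group) for _, group in groupby(values)), default=0)


def order_is_valid(items, max_intro_run=2, max_match_run=3):
    """Check an item order against the two run-length constraints."""
    intro_run = longest_run([item["intro"] for item in items])
    match_run = longest_run([item["match"] for item in items])
    return intro_run <= max_intro_run and match_run <= max_match_run


if __name__ == "__main__":
    order = [
        {"intro": "C", "match": True},
        {"intro": "C", "match": False},
        {"intro": "S", "match": False},
        {"intro": "S", "match": True},
    ]
    print(order_is_valid(order))
```

A check like this could be run on each candidate randomization, which would then be reshuffled until it passes.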


is incorrect. The animals, the verbs, and the location are always correct. This also holds for the fillers. Figure 3.1 shows an overview of all items.

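Figure 3.1: Overview of all items: 16 experimental items (8 Classic and 8 Single-topic, each with 4 reflexives and 4 pronouns) and 16 fillers (4 co-reference, 4 disjoint, 4 PP, 4 inanimate object); all types have two TRUE and two FALSE items.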

3.3. Procedure

The testing of participants took place in the eye-lab, where the Tobii T120 eye-tracker is set up. The eye-tracker is connected to two computers: one presents the E-Prime stimuli and collects accuracy and reaction time data, and the other collects the eye-movement data. The participant can only see what is on the eye-tracking monitor (see Picture 5 for a schematic plan of the eye-lab). The actual Tobii T120 eye-tracker is integrated in the 17” TFT monitor and requires no visible tracking device (such as a special cap) that may affect the participant while the eye-movements are recorded. We attached an extra keyboard that participants could use to enter their answer.

Participants were tested individually. They were asked to fill out the first part of a form with basic questions (name, age, etc.). Once they were seated in front of the eye-tracker, it was calibrated to their eyes so that it could measure the average of both eyes. Before starting the experiment participants received the instructions, which appeared on the monitor. These instructions stated that participants were about to see pictures while listening to a short story. Their task was to judge whether the picture matched the story by pressing a green button (TRUE) or a red button (FALSE). In this experiment, the TRUE button was on the left and the FALSE button was on the right, because we decided to follow the exact same procedure as Banga (2008). People differ in what they consider a logical assignment of the buttons, and labeling them a certain way is always completely arbitrary. In order to get reliable reaction times, participants were told not to enter their answer until the second sentence was finished.


4. RESULTS EXPERIMENT 1

4.1 Behavioral Data

First we will discuss the behavioral data. We measured accuracy and reaction times. Please note that an “accurate response” means that the participant provided the intended response. For instance, they responded TRUE when the intended response was TRUE also.

4.1.1 Accuracy

In total, we received 20 x 16 = 320 responses from our participants, out of which 311 (97%) were accurate, and 9 (3%) were inaccurate. As mentioned earlier, all participants received four items per condition, and we calculated the mean proportion of accuracy for each of the four conditions per subject. Table 4.1 gives an overview of the mean proportions and their standard deviations for each condition, based on subject analysis. The table shows that participants generally did very well on the task, as we expected from adults.

| Anaphor | Context | Prop. | St. Dev. |
|---|---|---|---|
| Pronoun “hem” | Classic | 0.98 | 0.08 |
| Pronoun “hem” | Single-topic | 0.94 | 0.11 |
| Reflexive “zichzelf” | Classic | 0.99 | 0.06 |
| Reflexive “zichzelf” | Single-topic | 0.99 | 0.06 |

Table 4.1: Mean proportions and standard deviations of accurate responses of 20 adult participants.

Before running the significance tests, the proportions from Table 4.1 were arcsine-transformed8 so that they are more normally distributed. GLM Repeated Measures ANOVAs were then run on these transformed proportions with Type of Introductory Sentence (classic versus single-topic) and Type of Anaphor (pronoun versus reflexive) as within-participants factors.
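As an illustration of the transformation step, the sketch below applies one common form of the arcsine transformation, arcsin(sqrt(p)). This is an assumption: some textbooks use 2·arcsin(sqrt(p)) instead, and the exact variant used for the analysis is not stated here. The function name is hypothetical.

```python
import math


def arcsine_transform(p):
    """Arcsine square-root transform of a proportion p in [0, 1].

    Assumed form: asin(sqrt(p)); result is in radians, between 0 and pi/2.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("a proportion must lie between 0 and 1")
    return math.asin(math.sqrt(p))


if __name__ == "__main__":
    # Per-subject accuracy proportions with four items per condition
    # can only take the values 0, 0.25, 0.5, 0.75 and 1.
    for p in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(p, round(arcsine_transform(p), 3))
```

The transformation stretches proportions near 0 and 1, which is where accuracy scores of adults tend to cluster.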

With F = systematic variance/residual variance, the F-ratio compares the variance that is explained by the experimental model with the variance that it leaves unexplained. For instance, an F-ratio > 1 indicates that the residual variance is smaller than the experimental variance, which suggests (but does not by itself establish) that the differences between the mean values are caused by the experimental manipulations. The p-value is the probability of finding an effect at least as large as the one observed if the null hypothesis (H0) that our manipulations have no effect were true. We set our alpha level at 0.05, so we call an effect significant when its p-value is ≤ 0.05; this limits the chance of incorrectly rejecting H0 to 5%.

No significant main effects were found for Type of Introductory Sentence with F[1,19] = 1.879, p > 0.10 or Type of Anaphor with F[1,19] = 3.065, p > 0.05. The interaction of Type of Introductory Sentence x Type of Anaphor was also not significant with F[1,19] = 1.000, p > 0.05.
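In a 2 x 2 design in which both factors are within-participants, each effect has one degree of freedom, and its F[1, n-1] value equals the square of a one-sample t statistic computed on a per-subject contrast. The sketch below uses this equivalence rather than a statistics package; the function names, the dictionary keys (CP, SP, CR, SR) and the example scores are all hypothetical.

```python
import math
from statistics import mean, stdev


def contrast_f(scores):
    """F ratio and degrees of freedom (1, n - 1) for one per-subject contrast."""
    n = len(scores)
    # Undefined (division by zero) if all contrast scores are identical.
    standard_error = stdev(scores) / math.sqrt(n)
    t = mean(scores) / standard_error
    return t * t, (1, n - 1)


def rm_anova_2x2(data):
    """data: one dict per subject with a (transformed) score per condition."""
    intro = [(s["CP"] + s["CR"]) / 2 - (s["SP"] + s["SR"]) / 2 for s in data]
    anaphor = [(s["CP"] + s["SP"]) / 2 - (s["CR"] + s["SR"]) / 2 for s in data]
    interaction = [(s["CP"] - s["SP"]) - (s["CR"] - s["SR"]) for s in data]
    return {
        "intro": contrast_f(intro),
        "anaphor": contrast_f(anaphor),
        "interaction": contrast_f(interaction),
    }


if __name__ == "__main__":
    # Made-up scores for four subjects, for illustration only.
    example = [
        {"CP": 5, "SP": 3, "CR": 2, "SR": 4},
        {"CP": 6, "SP": 5, "CR": 3, "SR": 3},
        {"CP": 4, "SP": 4, "CR": 1, "SR": 2},
        {"CP": 7, "SP": 3, "CR": 2, "SR": 5},
    ]
    for effect, (f_value, df) in rm_anova_2x2(example).items():
        print(effect, round(f_value, 2), df)
```

With 20 subjects the degrees of freedom come out as (1, 19), as in the values reported above. The p-value would still have to be looked up in the F distribution, which this sketch leaves out.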


These tests indicate that the errors made were equally distributed across conditions; adults are in this case not influenced by the experimental manipulations.

Figure 4.1: Mean proportions and standard deviations of accurate responses of 20 adults.

4.1.2. Reaction Times

During the truth-value judgment task participants were asked to enter their response as soon as they felt confident that the sentence did or did not match the picture. Unlike accuracy, which is a very rough measurement, reaction times can reveal small differences between conditions; longer reaction times are generally interpreted as more demanding linguistic processing. So even though the adults were able to give the correct response 97% of the time and the errors adults did make were equally distributed across conditions, differences between conditions may still be present in reaction times.

Recall that items were manipulated in two ways: Type of Introductory Sentence (Classic versus Single-topic Condition) and Type of Anaphor (reflexive versus pronoun). First, we expect reflexives to be processed faster than pronouns due to the intrinsic features of reflexives. Second, if context indeed facilitates the interpretation of pronouns (Spenader et al., 2009 and Hendriks et al., 2010) then pronouns in the Single-topic condition should be easier to interpret, resulting in faster reaction times. The account of Conroy et al. (2009), however, predicts that context does not have an influence on the interpretation of pronouns. Differences between context conditions for the reflexive items are not expected because reflexives are bound locally.

Table 4.2 shows the mean reaction times in milliseconds for each condition and Figure 4.2 shows these data graphically. Trials with an incorrect response were excluded from the analysis because they may influence the mean reaction times. As participants were asked to respond as quickly as possible, we also excluded trials with a reaction time that was lower than 150 ms or higher than 4000 ms; it is unlikely that a participant is able to process and respond to an item within 150 ms while a reaction time of more than 4 seconds means that the participant was


the CP condition as compared to the other conditions; reflexive items in the classic condition were processed fastest. To find out if the differences between conditions are significant, we ran a GLM Repeated Measures ANOVA on the data with Type of Introductory Sentence (classic versus single-topic) and Type of Anaphor (pronoun versus reflexive) as within-participants factors. The analysis revealed no significant main effects for Type of Introductory sentence or Type of Anaphor (all F-ratios < 1, p > 0.40), but there was a significant interaction of Type of Introductory Sentence x Type of Anaphor with F[1,19] = 5.15, p = 0.04. Paired t-tests showed that items in the CP condition took longer than those in the SP condition, but this was only tentative with t(19) = 1.97 and p = 0.06. Items in the CR condition were processed faster than SR items, but this too was a tendency with t(19) = -1.91 and p = 0.07. The other pairs revealed no significant differences.
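The trial exclusion rule for the reaction time analysis (incorrect responses, and reaction times below 150 ms or above 4000 ms) can be sketched as follows. The trial format, a dictionary with the keys `correct` and `rt_ms`, and the function name are assumptions made for illustration.

```python
def trim_trials(trials, lo_ms=150, hi_ms=4000):
    """Keep correctly answered trials with lo_ms <= RT <= hi_ms."""
    return [
        trial
        for trial in trials
        if trial["correct"] and lo_ms <= trial["rt_ms"] <= hi_ms
    ]


if __name__ == "__main__":
    trials = [
        {"condition": "CP", "correct": True, "rt_ms": 1720},
        {"condition": "CP", "correct": False, "rt_ms": 1500},  # incorrect
        {"condition": "SP", "correct": True, "rt_ms": 120},    # too fast
        {"condition": "SR", "correct": True, "rt_ms": 4300},   # too slow
    ]
    print(trim_trials(trials))
```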

| Anaphor | Context | RT (ms) | St. Dev. |
|---|---|---|---|
| Pronoun “hem” | Classic | 1720 | 845 |
| Pronoun “hem” | Single-topic | 1463 | 494 |
| Reflexive “zichzelf” | Classic | 1433 | 429 |
| Reflexive “zichzelf” | Single-topic | 1602 | 416 |

Table 4.2: Mean reaction times and standard deviations in milliseconds for the adult participants in each condition.

Figure 4.2: Mean reaction times for all conditions measured from the start of the anaphor.

4.2. Eye-movements

During the Truth-value Judgment Task, the eye-movements of the participants were recorded; the pattern and timing of looks to possible antecedents can provide information about which


Tanenhaus, 2003; Sekerina, Stromswold & Hestvik, 2004). Before we go to the analysis of the eye-movement data, some more information on what the data entail is in order. Among other things, we will discuss which items were included in the analysis, Areas of Interest (AOIs), and the types of measurement we recorded.

In order to be able to measure how long a participant looks at a certain referent, we had to define Areas of Interest in the pictures. It is likely that participants not only look at the Target and Distractor during the task, but also at parts of the picture that are not relevant for our research. In other words, we have to make sure that these fixations are not included in the analysis so that the Tobii software can measure how often and how long a participant looks at a particular AOI at a given moment in the experiment. The AOIs were drawn by hand off-line in Tobii, and the border of each AOI is approximately 1 centimetre around the animal where possible. In the pictures with other-oriented actions, the animals are sometimes too close to each other to have this 1 cm border around them. In these cases, we just made sure that there was no overlap.

Picture 4.1: AOIs were defined manually and off-line9.

After deleting the fixations outside the AOIs, what is left are the measurements of all fixations on the AOIs of the correctly answered items during the entire experiment. Since we are interested in the eye-movements that happen between the onset of the anaphor and the moment of response (see also Pictures 4.3.1-4 below), we stripped the data of all the fixations that started before the start of the anaphor and those after the participant had entered the response. This includes the fixations on the Target that started before the start of the anaphor and ended within the set time frame. An unfortunate side-effect of trimming the data this way is that some items are lost because they contain no fixations on the Target in this time frame, even though participants did look at the Target at one point during the trial.
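The trimming of fixations to the analysis window can be sketched as a simple filter. The fixation format (keys `aoi`, `start_ms`, `dur_ms`) and the function name are hypothetical; the actual data came from the Tobii software, whose export format is not described here.

```python
def fixations_in_window(fixations, anaphor_onset_ms, response_ms):
    """Keep AOI fixations that start between anaphor onset and response.

    Fixations outside the AOIs are dropped, as are fixations that started
    before the anaphor onset, even if they end inside the window.
    """
    return [
        fix
        for fix in fixations
        if fix["aoi"] in ("target", "distractor")
        and anaphor_onset_ms <= fix["start_ms"] <= response_ms
    ]


if __name__ == "__main__":
    trial = [
        {"aoi": "target", "start_ms": 900, "dur_ms": 250},      # starts too early
        {"aoi": "target", "start_ms": 1200, "dur_ms": 300},
        {"aoi": "outside", "start_ms": 1300, "dur_ms": 100},    # not in an AOI
        {"aoi": "distractor", "start_ms": 1500, "dur_ms": 200},
        {"aoi": "target", "start_ms": 2600, "dur_ms": 150},     # after response
    ]
    print(fixations_in_window(trial, anaphor_onset_ms=1000, response_ms=2500))
```

The first fixation in the example shows the side-effect mentioned above: a look at the Target that began just before the anaphor is discarded.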

Pictures 4.3.1-4 below present the overall looking patterns for the four conditions, distinguishing between matching and mismatching picture-sentence pairs, measured from the presentation of the picture until after the response has been given. The graphs represent the proportion of fixations on the target antecedent, the distractor, and fixations outside the areas of interest. The measurements that are used for our analyses, Observation Length and Time to First Fixation, are measured from the disambiguating point in the test sentence (represented by “A” in the figures below), until the moment of response (RT). Please note that the target is always the correct referent, and the distractor always the incorrect referent.

Picture 4.3.1: Pronouns in the Classic Condition (CP)

Picture 4.3.2: Reflexives in the Classic Condition (CR)


Picture 4.3.4: Reflexives in the Single-topic Condition (SR)

Similar to Hendriks et al.’s (2010) findings, we observe a considerable effect of the type of picture in the fixation plots; the looking pattern for items with a self-oriented picture differs considerably from that for items with an other-oriented picture10. When participants are presented with an other-oriented picture, the fixations are distributed more or less equally between target (40%) and distractor (30%). However, items with a self-oriented picture show more looks to the animal that is performing the action (who is also the patient), up to 60%. The other referent is only looked at 20% of the time. In other words, the pictures influence the looking times and patterns of the participants: the self-oriented pictures attract more looks to the correct referent (target) for the reflexive and more looks to the incorrect referent (distractor) for the pronoun.

Our eye-movement data analysis consists of two types of measurements: Observation Length and Time to First Fixation. The first gives the mean proportion of looks at an AOI during the set time frame (from the onset of the anaphor until the moment of response). The second, Time to First Fixation, is a very early measurement that tells us how long it took for the participant to find the correct referent. This too is measured from the onset of the anaphor.

4.2.1 Observation Length

Observation Length is defined as the overall measurement that aggregates all looking times to an Area Of Interest from the onset of the anaphor until the moment of response by the participant. Because of the individual differences in reaction times between participants, Observation Length is measured in proportions.11
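A proportional Observation Length of this kind could be computed as in the sketch below. Note the assumption: the denominator is taken to be the summed looking time at target and distractor. The fixation format and function name are hypothetical as well.

```python
def observation_length_proportion(fixations):
    """Proportion of AOI looking time spent on the target in one trial.

    fixations: fixations already restricted to the analysis window.
    ASSUMPTION: denominator = target time + distractor time.
    Returns None when there is no looking time on either AOI.
    """
    target = sum(f["dur_ms"] for f in fixations if f["aoi"] == "target")
    distractor = sum(f["dur_ms"] for f in fixations if f["aoi"] == "distractor")
    total = target + distractor
    if total == 0:
        return None
    return target / total


if __name__ == "__main__":
    window = [
        {"aoi": "target", "start_ms": 1200, "dur_ms": 300},
        {"aoi": "distractor", "start_ms": 1500, "dur_ms": 100},
    ]
    print(observation_length_proportion(window))
```

Expressing the measure as a proportion makes a slow and a fast responder comparable, since both values lie between 0 and 1 regardless of how long the trial lasted.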

We assume that participants look more to the most probable antecedent in the picture compared to the other antecedent. Following this assumption, if a participant looks less at a possible antecedent in one condition compared to another, it means that the participant finds the antecedent less probable in that condition compared to the other. Moreover, following Spenader et al. (2009), if context has an influence on the on-line processing of pronouns, then items with a pronoun in the Single-topic condition should have more observations than items with a pronoun in the Classic condition because the former should be easier to interpret. On the other hand, if context has no influence on the on-line processing of pronouns, as Conroy et al. (2009) propose, then we will find no difference between the two context conditions.

10 Items with a self-oriented picture are reflexive items with a matching picture and pronoun items with a mismatching picture (CP-mismatch, CR-match, SP-mismatch, SR-match). Items with an other-oriented picture are items with a pronoun and a matching picture and items with a reflexive and a mismatching picture (CP-match, CR-mismatch, SP-match, SR-mismatch). See also Table 3.1 above.

11 Calculated by dividing the Observation Lengths of the correct referent by the sum of the Observation Lengths of the

As described in section 4.2 above, some data were removed because only correct items with fixations on the target that started after the start of the anaphor were included in the analysis. We also removed the trials with a long reaction time, because they may influence the proportions of observations. Recall that in the case of a reflexive, the animal that performs the self-oriented action is the Target antecedent and the by-standing animal is the Distractor, and vice versa in the case of a pronoun. The analysis of Observation Length is based on a total of 136 items from 20 participants.

Table 4.3 shows the mean proportions and standard deviations per condition. Figure 4.4 shows these proportions graphically.

| Anaphor | Context | Prop. | St. Dev. |
|---|---|---|---|
| Pronoun “hem” | Classic | 0.41 | 0.18 |
| Pronoun “hem” | Single-topic | 0.51 | 0.25 |
| Reflexive “zichzelf” | Classic | 0.77 | 0.14 |
| Reflexive “zichzelf” | Single-topic | 0.79 | 0.14 |

Table 4.3: Mean proportions for Observation Length. Only correct items with fixations on the target that started after the start of the anaphor were included in the analysis.


The proportions from Table 4.3 were arcsine-transformed (Kirk, 1982) so that they are more normally distributed. A GLM Repeated Measures ANOVA was then run on these transformed proportions with Type of Introductory Sentence (classic versus single-topic) and Type of Anaphor (pronoun versus reflexive) as within-participants factors.

Table 4.3 shows slightly longer observation lengths for items in the Single-topic condition, but we found no main effect for Type of Introductory Sentence, with F[1,19] = 2.26 and p > 0.10. The Observation Length for the items in the pronoun condition is much smaller than that for the items in the reflexive condition, and this strong main effect for Type of Anaphor is significant with F[1,19] = 31.34 and p < 0.001. We found no interaction between Type of Introductory Sentence and Type of Anaphor: F[1,19] = 1.34, p > 0.10.

4.2.2 Time to First Fixation

Our second eye-movement measurement is Time to First Fixation: the time in milliseconds that a participant needs to find the correct antecedent, measured from the onset of the anaphor (see Figure 4.3). This measurement involves the very early processing of the anaphor. Assuming that a more difficult antecedent takes longer to find, we predict from Hendriks et al. (2010) that pronouns have a longer mean Time to First Fixation compared to reflexives. Although Accuracy rates, Reaction Times, and Observation Length did not reveal differences between context conditions for adults, items with a pronoun in the Single-topic condition are predicted to have a shorter Time to First Fixation than items in the Classic condition (Spenader et al., 2009). If context does not influence the on-line processing of pronouns, as Conroy et al. (2009) propose, then we will find no difference between context conditions.

As explained in section 4.2 above, the items that were included in the analysis were answered correctly by the participants and contain fixations on the target that started after the disambiguation point. Consequently, the analysis is based on 17 participants: 3 participants had no items left in one of the conditions, resulting in exclusion from the subject analysis. So much data was lost because participants were asked to press TRUE or FALSE as quickly as possible. When someone is already fixating on the target before the start of the anaphor, then hears the anaphor, quickly checks whether the alternative antecedent (the Distractor) is really not the correct one, and then responds, there are no fixations on the target during the set time frame. Thus, the data of this item are not included in our analysis. If this happens for all four items in one condition, then the subject is excluded from the analysis.
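The measurement and the resulting subject exclusion can be sketched as follows. The fixation format and both function names are hypothetical, and the sketch only mirrors the rule as described in the text.

```python
def time_to_first_fixation(fixations, anaphor_onset_ms, response_ms):
    """Time from anaphor onset to the first target fixation that starts
    inside the analysis window, or None if there is no such fixation."""
    starts = [
        f["start_ms"]
        for f in fixations
        if f["aoi"] == "target"
        and anaphor_onset_ms <= f["start_ms"] <= response_ms
    ]
    return min(starts) - anaphor_onset_ms if starts else None


def subject_is_analysable(ttff_by_condition):
    """A subject stays in the analysis only if every condition has at
    least one item with a valid Time to First Fixation."""
    return all(
        any(value is not None for value in values)
        for values in ttff_by_condition.values()
    )


if __name__ == "__main__":
    trial = [
        {"aoi": "target", "start_ms": 900},      # before the anaphor
        {"aoi": "distractor", "start_ms": 1100},
        {"aoi": "target", "start_ms": 1200},
    ]
    print(time_to_first_fixation(trial, anaphor_onset_ms=1000, response_ms=2500))
    print(subject_is_analysable({"CP": [None, 310], "SP": [None, None],
                                 "CR": [200], "SR": [450]}))
```

In the second call the hypothetical subject has no usable item in the SP condition and would therefore drop out of the subject analysis.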


| Anaphor | Context | Time (ms) | St. Dev. |
|---|---|---|---|
| Pronoun “hem” | Classic | 500 | 324 |
| Pronoun “hem” | Single-topic | 511 | 318 |
| Reflexive “zichzelf” | Classic | 377 | 147 |
| Reflexive “zichzelf” | Single-topic | 460 | 245 |

Table 4.4: Mean times to first fixation in milliseconds and their standard deviations per condition. Analysis is based on 17 subjects.

Figure 4.5: Mean Time to First Fixation in milliseconds and their standard deviations per condition. Analysis is based on the data of 17 subjects.

4.3 Discussion Results Experiment 1


Starting with the overall performance, the average score of 97% correct responses indicates that the adults in our study generally did very well, as predicted by both Conroy’s and Hendriks’s account. In the earlier study by Hendriks et al., a main effect was found for Type of Anaphor, with a shorter reaction time for reflexive items. We did not find such a main effect in the reaction times, but similar to the earlier study, items with a pronoun in the Classic Condition did have the longest reaction times, and the difference with items with a pronoun in the Single-topic condition approached significance. Although not significant, this result does hint at an influence of context; like children, adults may also benefit from a salient correct referent, as Hendriks et al. have argued. The short reaction time of reflexive items in the Classic Condition compared to reflexives in the Single-topic Condition is quite unexpected since a reflexive is bound locally; a preceding sentence should not influence the processing of a reflexive. At the same time, it provides us with further evidence that adults too are sensitive to context condition; a reflexive in the Classic condition is more natural than in the Single-topic condition because in the latter case, the animal that is the agent has not been properly introduced yet and using a definite noun phrase feels unnatural. In other words, in the case of reflexives, the Classic Condition and not the Single-topic Condition provides a coherent context, which may lead to faster reaction times.

When looking at the eye-movement data, the only significant result we found is a main effect for Type of Anaphor for Observation Length: reflexive items had a significantly larger proportion of fixations on the target than items with a pronoun. Assuming that a larger proportion of looks means more confidence that the antecedent is the correct one, this result means that reflexives were easier to interpret than pronouns. The general looking pattern also shows that during a matching reflexive item a considerable portion (up to 80%) of fixations is on the target. Compared to 40% for matching items with a pronoun, where participants spent about 30% of fixations on the distractor, this indicates that when the item had a reflexive in the sentence, participants were more confident that the antecedent they were looking at was the correct one, resulting in more and longer observations on the target. On the other hand, pronoun items with a mismatching picture also revealed a large proportion of fixations on the agent of the action, who is the distractor in these items; the pictures with a self-oriented action were clearly more salient than those with an other-oriented action. Since we analyzed fixations on the target and not the fixations on the distractor, we may conclude that the salience of self-oriented pictures has influenced the looking patterns of our participants.


5. METHOD EXPERIMENT 2

Experiment 2 used the same truth-value judgment task as Experiment 1, but with children instead of adults. We also added a working memory task to find out whether a greater working memory capacity correlates with a more adult-like interpretation of pronouns.

5.1 Subjects

Subjects were 26 native Dutch children, meaning that they were raised with Dutch as the only language spoken at home. All children had normal vision and no hearing disabilities. Four children were excluded from the experiment as they failed to answer 5 out of 8 of the control questions correctly (see Materials for more information). One child was not calibrated properly, which led to invalid eye-movement data. However, his behavioral data were included in the data analysis. This leaves us with 21 preschoolers (13 boys, 8 girls) aged 4;4 to 6;4 (mean 5;6). Please refer to Appendix A2 for a summary of the subjects of Experiment 2.

5.2 Materials

5.2.1 Truth-value Judgment Task

We used the same materials as in Experiment 1, plus a working memory task.

5.2.2 Working Memory Task

We added the memory task for the pre-schoolers because there is some evidence for a correlation between working memory capacity and the ability to optimize bidirectionally (Hendriks 2008; Wubs 2008). We used the auditory memory test from the Schlichting Test for Language Production (Schlichting et al., 1995). During the working memory task, a research assistant carefully read out a sequence of words, and the participant had to repeat it. The number of words in a sequence increased as the task progressed. Each new block of three word sequences was preceded by a practice session. The words included in the task are mainly monosyllabic words with a CVC structure. See Appendix D for an example of the task.

5.3 Procedure

Although the materials were the same, the procedure was different from Experiment 1 for a number of reasons, the most important one being that our participants were very young and therefore were always accompanied by a parent.

Again, we used the Tobii T120 eye-tracker, which is integrated in the 17” TFT monitor, and the volume of the speakers was set to 75%. We attached a mouse, which was used to enter the answers of the preschoolers. A research assistant guided the children through the experiment, while the experimenter sat in the control area (Picture 5).

Before participants were tested, the parent had to fill out a form with basic information about the child and a consent form (see Appendix F). Like the adults, the preschoolers were tested individually. Since children tend to be more restless than adults, we made sure they could move as little as possible by moving the chair up to the table. This way the calibration would be more reliable. A research assistant explained the task by telling the children that they were about to see pictures of different animals doing all sorts of things. In earlier research with children some of the pictures turned out to be ambiguous; for instance, some children thought that the crocodile was a dinosaur and that vastbinden (to tie) was losmaken (to untie) (Spenader et al., 2009). To make sure that the preschoolers in our experiment did not base their answers on a (mis)interpretation of the picture, we did a short pre-test in which we pointed out the potentially ambiguous animals (hedgehog, crocodile, bear) on a separate piece of paper and explicitly named them. We also named two or three verbs explicitly, one of which was always vastbinden (to tie). We made sure not to conjugate the verb, because that would reveal (part of) an experimental item. See also Appendix E2.

Next, we told the child that they were about to hear a little story and see a picture on the screen. We explained that the computer is not very smart and sometimes shows the wrong picture with the story, and that it was up to the child to tell the researcher and the computer whether the picture matched the story or not. The research assistant entered the child’s response by clicking the left (TRUE) or right (FALSE) mouse button.

After these instructions, participants did three practice items. As children may have a yes/TRUE bias, two of the practice items were FALSE and one was TRUE. The child was always asked to explain their answer to the practice items. Like the adults, children sometimes based their answer on the description of the background. Some children were more concerned about the realism of the pictures; they answered FALSE because “elephants do not eat french fries” or “turtles do not wear hats.” In both cases we explained that it was a made-up story and that they should focus on what happens in the picture and whether that matches the story. As the first item of the actual task was always a filler item, we asked the child to explain their answer to the first item as well if we thought this was necessary.

After 16 items, the preschooler was allowed a short lemonade break, which we used for the working memory task. The research assistant explained that they were going to play a new game on paper, in which she would say a word and the child had to repeat it. Sometimes she would add another word, so that the child had to repeat a sequence of two, three or even more words. We ticked off the sequences the child was able to repeat until they failed to remember a sequence correctly. To finish the task, we made up an easy word sequence to end with, so that the child would not feel like they had failed.
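The ticking-off procedure described above can be sketched as a simple stop-at-first-failure count. This is only an illustration: the function name and the example data are hypothetical, and the official scoring rules of the Schlichting test may differ from this assumed rule.

```python
def span_score(results):
    """Count the sequences repeated correctly before the first failure.

    results: list of booleans in presentation order, True if the child
    repeated the word sequence correctly. Assumed scoring rule, for
    illustration only.
    """
    score = 0
    for repeated_correctly in results:
        if not repeated_correctly:
            break  # stop counting at the first sequence that is not recalled
        score += 1
    return score


# Hypothetical child (not real data): four sequences correct, then a failure.
example_results = [True, True, True, True, False]
example_score = span_score(example_results)
print(example_score)
```

Counting only up to the first failure mirrors the way the task was stopped; the easy final sequence given for encouragement would not be part of such a score.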

6. RESULTS EXPERIMENT 2

6.1 Accuracy

As described in Section 5.1 above, we tested 26 participants. Of those 26, five were excluded from analysis because they failed to give the correct answer to 3 or more out of 8 control items, indicating that they did not understand the task. The analysis of the accuracy ratio is therefore based on 21 participants.
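A minimal sketch of this exclusion rule is given below, assuming that a participant is excluded when three or more of the eight control items are answered incorrectly. The function name, the participant labels and the responses are hypothetical and serve only to make the criterion explicit.

```python
def is_excluded(control_answers, max_errors=2):
    """Return True if a participant made more control-item errors than allowed.

    control_answers: list of booleans, True for a correct control response.
    max_errors=2 encodes the assumed rule that three or more errors
    out of eight lead to exclusion.
    """
    errors = sum(1 for correct in control_answers if not correct)
    return errors > max_errors


# Hypothetical participants (not real data), eight control items each.
participants = {
    "child_A": [True] * 8,
    "child_B": [True] * 6 + [False] * 2,
    "child_C": [True] * 5 + [False] * 3,
}

included = [name for name, answers in participants.items()
            if not is_excluded(answers)]
print(included)
```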

If the pronoun interpretation problem is still present in pre-schoolers, then they will perform more poorly on items with a pronoun as compared to items with a reflexive. If they are facilitated by a coherent context, then items in the Single-topic condition should be easier to interpret than those in the Classic condition, resulting in a higher accuracy ratio (Spenader et al., 2009). Table 6.1 shows the mean accuracy rates for the child participants; Figure 6.1 shows these data graphically.

| Anaphor | Context | Prop. | St. Dev. |
|---|---|---|---|
| Pronoun “hem” | Classic | 0.70 | 0.32 |
| Pronoun “hem” | Single-topic | 0.72 | 0.27 |
| Reflexive “zichzelf” | Classic | 0.87 | 0.22 |
| Reflexive “zichzelf” | Single-topic | 0.90 | 0.19 |

Table 6.1: Overview of the proportions and standard deviations of accurate responses by the pre-schoolers.

The proportions were arcsine-transformed before running a GLM Repeated Measures ANOVA with Type of Introductory Sentence and Type of Anaphor as within-subject factors. The analysis revealed a significant main effect of Type of Anaphor, F[1,20] = 15.13, p = 0.001, reflecting the higher accuracy ratio for items with a reflexive. There was no main effect of Type of Introductory Sentence, nor was there an interaction between the two factors (both F-ratios < 1 and p > 0.10).
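To illustrate the transformation step, the sketch below applies an arcsine square-root transformation to the cell means of Table 6.1. It assumes the common form asin(sqrt(p)); the exact variant used in the analysis is not stated here. In the actual analysis the transformation was applied to the proportions of the individual participants, not to the cell means, so the values below are example input only. The function name and the dictionary are illustrative.

```python
import math


def arcsine_transform(p):
    """Arcsine square-root transform of a proportion p in [0, 1].

    Assumed variant: asin(sqrt(p)), which maps 0 to 0 and 1 to pi/2.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return math.asin(math.sqrt(p))


# Cell means from Table 6.1, used here only as example input.
cell_means = {
    ("pronoun", "classic"): 0.70,
    ("pronoun", "single-topic"): 0.72,
    ("reflexive", "classic"): 0.87,
    ("reflexive", "single-topic"): 0.90,
}

transformed = {cell: arcsine_transform(p) for cell, p in cell_means.items()}
for cell, value in transformed.items():
    print(cell, round(value, 3))
```

The transformation stretches proportions near 0 and 1, which is why it is commonly applied to accuracy data that cluster near ceiling before an ANOVA is run.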
