Recursive Processing Across Domains

Author: Ekaterina Abramova
Student number: 10032851
Supervisors: Dr. Jelle Zuidema, Dr. Mariëtte Huizinga
Amsterdam, August 2011

This study focuses on developing a novel approach to the empirical investigation of the recursion-only-hypothesis, according to which the mechanism of recursion was the crucial step in language evolution (Hauser, Chomsky and Fitch, 2002). Recursion is thought to be a cognitive mechanism unique to humans and unique to language, but the empirical data supporting the latter claim have been poor to date. The aim of this study is to fill this gap with regard to recursion as a domain-specific versus domain-general cognitive mechanism. We compared individual performance on linguistic and non-linguistic tasks thought to involve recursion, predicting that performance on the two would be correlated. Had the prediction been confirmed, the study could, on the one hand, have undermined the recursion-only-hypothesis and, on the other hand, have contributed to a greater understanding not only of recursion itself but also of language evolution in general and the interplay between language and non-linguistic cognition. Unfortunately, the prediction was not borne out. The possible reasons for this are discussed and future work is suggested.

Introduction

Recursion-only-hypothesis

It has been argued that language exerts a transformative influence on human cognition (e.g. Clark, 1997). It is clearly one of the most prominent features that distinguishes humans from other animals and so it is reasonable to suppose that it is (at least partly) through language that our other unique cognitive properties have emerged. Therefore, understanding the emergence of language in phylo- and ontogenesis is crucial to understanding cognition in general. Recent decades have witnessed a rebirth of both theorizing and empirical work on that issue, supported by the use of comparative methods and computational modeling.

One of the recent proposals that sparked a vigorous debate in the field has been the recursion-only-hypothesis by Hauser, Chomsky and Fitch (2002). According to this proposal, it is the mechanism of recursion that provided the key to human-like language and cognition. In a later article (Fitch et al., 2005)¹, they clarify that their hypothesis consists of two parts. First, they suggest making a distinction between the Faculty of Language in the Broad sense (FLB) and the Faculty of Language in the Narrow sense (FLN). FLB would be simply everything that is shared with other non-linguistic cognitive systems or with other animals. FLN, on the other hand, would be "the abstract linguistic computational system alone, independent of the other systems with which it interacts and interfaces" (HCF1, p. 1571). Being terminological, this distinction is impossible to falsify and therefore can only be judged by how useful it is in guiding research. Not so, however, for the second part of their proposal, which is the empirical hypothesis that FLN is equal to the computational mechanism of recursion. The authors split this part of their hypothesis again into two parts:

1. Recursion is an evolutionary key to the transition from animal communication to human language and therefore does not appear in non-human species.

2. Recursion is a domain-specific mechanism and therefore does not appear in the non-linguistic cognitive domains.

With respect to the first claim they encourage the use of the comparative method, i.e. showing that no other animal exhibits recursion (otherwise it could not explain the transition to human cognition). With respect to the second claim, they state explicitly:

The discovery of a recursive mechanism in phonology [or, by extension, in any non-linguistic domain] would first raise the empirical questions "is it the same as or different from that in phrasal syntax?" and "is it a reflex of phrasal syntax perhaps modified by conditions imposed at the interface?" ... If the answer to all of these questions were "same", we would reject our hypothesis 3, possibly concluding that FLN is an empty subset of FLB, with only the integration of mechanisms being uniquely human (HCF2, p. 201).

They would reject it presumably because otherwise language evolution could be explained by general cognitive mechanisms and not by changes within the language faculty itself. However, there is still a possibility that recursion first appeared in syntax and then spilled over to different domains. Despite the claims of falsifiability of their hypotheses, Hauser, Chomsky and Fitch do not make it entirely clear what they understand by recursion nor what precise evolutionary scenario they propose. In both articles they conjecture that there existed precursors of recursion in other domains and that the current utility of recursive mental operations is not limited to communication (HCF2, p. 186). At the same time, however, the whole point of equating recursion with FLN was to claim that it is a human-specific and language-specific computation. To escape this apparent contradiction it seems that a conceptual clarification of the notion of recursion is in order. Nevertheless, we believe that it is still interesting to try to operationalize that notion and investigate empirically whether it is a mechanism specific to language.

Language: Recursive sentence processing

We start by considering recursion in language. Although there is a deep conceptual ambiguity about what the mechanism of recursion in linguistic processing precisely is, there have been psycholinguistic investigations of sentences at least superficially deemed to be recursive. Recursion is usually defined in terms of self-reference and so recursive sentences are said to involve self-embedding, being composed of two or more constituents arranged in an onion-like manner. For example, a sentence like "The mouse that the cat chased ate the malt" can be represented with a simplified tree as in Figure 1.

Figure 1: Parse tree of "The mouse that the cat chased ate the malt"
[S [NP [NP [DET The] [N mouse]] [SBAR [C that] [S [NP [DET the] [N cat]] [VP [V chased]]]]] [VP [V ate] [NP [DET the] [N malt]]]]

It can be seen that this is a case of embedding a sentence within a sentence and that the main clause ("The mouse ate the malt") is interrupted by the relative clause ("The cat chased the mouse")², creating embedding in the center. Potentially, the number of such layers is unlimited but in practice, performance rapidly degrades as that number increases. Already a sentence with three layers (i.e. two embeddings) like "The mouse that the cat that the dog chased bit ran" is difficult to comprehend. Usually, such sentences are contrasted with sentences that have the same number of layers but a different kind of embedding - tail-embedding. An example of such a sentence is "The cat chased the mouse that ate the malt". Another way to characterize such sentences is as right-branching because, as can be seen in Figure 2, the embedded clauses are simply stacked on the right as the sentence progresses.
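
The contrast between the two kinds of embedding can be made concrete with a small generator. The following Python sketch is only an illustration and not part of the study materials; the function names and the noun/verb lists are assumptions. It builds a center-embedded subject by nesting each relative clause inside the previous one, and a tail-embedded sentence by appending each relative clause on the right.

```python
def center_np(nouns, rel_verbs):
    """Noun phrase with nested relative clauses (center-embedding).

    rel_verbs[i] is the verb whose subject is nouns[i + 1] and whose
    object is nouns[i]; the recursion nests each clause inside the last.
    """
    if len(nouns) == 1:
        return f"the {nouns[0]}"
    inner = center_np(nouns[1:], rel_verbs[1:])
    return f"the {nouns[0]} that {inner} {rel_verbs[0]}"


def tail_np(nouns, verbs):
    """Noun phrase with relative clauses stacked on the right."""
    if not verbs:
        return f"the {nouns[0]}"
    return f"the {nouns[0]} that {verbs[0]} {tail_np(nouns[1:], verbs[1:])}"


def capitalize_first(text):
    return text[0].upper() + text[1:]


def center_sentence(nouns, rel_verbs, main_verb):
    return capitalize_first(f"{center_np(nouns, rel_verbs)} {main_verb}")


def tail_sentence(nouns, verbs):
    rest = tail_np(nouns[1:], verbs[1:])
    return capitalize_first(f"the {nouns[0]} {verbs[0]} {rest}")


print(center_sentence(["mouse", "cat", "dog"], ["bit", "chased"], "ran"))
print(tail_sentence(["cat", "mouse", "malt"], ["chased", "ate"]))
```

In the center-embedded case every subject has to wait for its verb until all inner clauses are closed, whereas in the tail-embedded case each clause is complete before the next one starts.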

Figure 2: Parse tree of "The cat chased the mouse that ate the malt"
[S [NP [DET The] [N cat]] [VP [V chased] [NP [NP [DET the] [N mouse]] [SBAR [C that] [S [VP [V ate] [NP [DET the] [N malt]]]]]]]]

One of the first studies to compare these types of sentences was a study by Miller & Isard (1964), who showed that as the number of embeddings increases, free recall drops significantly for center-embedded sentences compared to unembedded control sentences, for which they used tail-embedded constructions. The explanation provided for this effect was based on the competence/performance distinction. Namely, although grammar (competence, i.e. the mechanism of recursive sub-routines) allows for an infinite number of embeddings, the nesting present in center-embedded sentences places high demands on memory (performance), since in order to deal with nested constructions the language user must hold in memory the still unresolved portion of one constituent while processing another.

A number of similar studies have been carried out since then showing, for example, that self-embedded sentences are often judged to be ungrammatical (Marks, 1968) and incomprehensible (Hamilton & Deese, 1971) and take longer to process (King & Just, 1991). At the same time, the question of what underlies the difficulty was kept alive. For example, Blumenthal (1966) asked his participants to rewrite the self-embedded sentences (breaking them into composite simple clauses) and observed that even with no memory demands people are not able to analyze the sentence structure and therefore their poor performance cannot be attributed to the difficulty in processing these sentences, i.e. according to Blumenthal, they do not seem to exhibit any recursive competence at all. In a similar vein, Stolz (1967) pointed out that only some participants are able to analyze center-embedded sentences immediately (thereby displaying what he called strong productivity) and others can learn to do so if they are trained on them or if semantic support is provided - making predicates selectively related to their subjects but not to other subjects in the sentence. On the other hand, Blaubergs & Braine (1974) wanted to examine specifically the contribution of memory interference and therefore trained people on processing such sentences and then examined their performance, assuming that if they ensure that people do possess the competence and cannot use semantic clues (they used neutral sentences), the difficulty will reflect only the memory-related cost.

There are, of course, newer studies on the subject as well. One of the more popular sentence processing theories explaining the difficulty with center-embedded sentences is Gibson's (1998) Syntactic Prediction Locality Theory, which claims that two factors contribute to sentence complexity - storage costs and integration costs. If there is a dependency between two syntactic elements of the sentence and if they are separated by other elements, the cost stems from the necessity to store the first element and then integrate it when the matching element is encountered. This explanation holds for a wide variety of sentence structures, including ambiguous sentences, in which case the processor prefers the structures that reduce the processing costs and thus needs to reanalyze the sentence if the preferred structure turns out to be incorrect. MacDonald & Christiansen (2002), on the other hand, assume that language processing is shaped by the individual's experience, reflecting the distributional properties of the previously encountered input. Their analysis concerns a different class of frequently compared sentences - subject- and object-extracted relative clauses - but the same explanation could be extended to center- and tail-embedded sentences. Just as subject-extracted relative clauses are easier presumably because they display a more regular SVO word order as opposed to the unusual OSV order, tail-embedded sentences could be said to be easier because they display regular word order and center-embedded sentences are difficult because they display OSV order. Any individual difference in how well people process such sentences would be merely an indication of how much experience they had with similar structures and not of any underlying cognitive mechanisms like working memory or recursive skill.

The key issue in all these studies is whether we should invoke recursion, or some other syntactic competence factor, to explain the difficulty with certain kinds of sentences or rather spell out the effects in terms of more general performance factors like working memory, storage and integration costs or learning mechanisms³. If we refuse to accept the latter, and therefore claim that recursion is some special cognitive mechanism responsible for the appearance of human language (or at least the syntactic part of it, giving it structure and infinite expression) over and above any general cognitive mechanisms, it follows that (1) recursion is required to varying extents in different kinds of sentences and (2) as with any cognitive mechanism, it is not equal across time and across individuals. In support of this view one could argue that the individual differences are unlikely to be only a result of differences in general cognitive features because even the oft-invoked memory storage is merely a necessary but not a sufficient condition for sentence processing, i.e. even with perfect memory one needs to have a mechanism that would enable one to connect the relevant elements of the sentence. Which elements are considered relevant and how they should be connected could depend primarily on semantic relationships or syntax but most likely on a combination of both. The relative contribution of syntactic and semantic information, and how the two are combined in comprehension, is investigated under the headings of interactive vs modular models, but both of them usually assume that the process is the same in everyone. If we consider the potential recursive capacity, however, an option becomes available that the processing varies individually, with some people relying more on semantic relationships and others on syntactic structure. Such differences could then explain why some people exhibit strong productivity immediately upon encountering center-embedded sentences, others only when provided with training or semantic support, and still others are not able to deal with them at all, even when their memory burden is alleviated. This is the possibility we will pursue in our study.

Recursive problem-solving

Unfortunately, recursion does not get clearer when we move to the non-linguistic domains. For example, Kinsella gives an example of visual processing as recursively applying the decomposition function:

The first input to the function is the entire scene. The first application will decompose the scene into the larger objects (say a person, or a building), and these objects will form the input to the subsequent applications of the function, where they are further decomposed into their sub-parts (head, torso, legs, or roof, walls, door, etc.) ... there is no doubt that the procedure is recursive, and more specifically, nestedly so (Kinsella, 2009, p. 138, emphasis added).

One should realize here that this view on vision is just one possibility and there do exist alternative explanations of the process, claiming, for example, that we see in a more functional way through sensorimotor coupling and not through building a complete representation of the visual scene (O'Regan & Noë, 2001). Even if the view cited by Kinsella is the correct one, however, it is not clear whether it describes a process that is truly recursive or merely hierarchical - in what sense could we say that the house is the same type of constituent as the door and the windows that it is decomposed into? At the same time, Pinker and Jackendoff (2005) also refer to vision but use the example of visual grouping in which a picture of groups of Xs is perceived as being built recursively out of discrete elements which combine to form larger discrete constituents (p. 217). But of course, that would mean that pictures like the famous Droste package illustration or fractals like broccoli somehow require recursion for perceiving them, which would be a claim that requires a stronger motivation than simply calling such a process recursive. And it does not seem necessary to think recursively in order to understand the peculiar character of such pictures either. Of course, we can conceptualize these phenomena in terms of recursion and perhaps such a move would add something to the appreciation of Droste-like pictures or to the understanding of vision, but that would amount to using a formal, explicit concept of recursion and would mean nothing for the cognitive mechanism of recursion.

Perhaps it is better if we turn our attention away from the relatively low-level non-linguistic domains and consider higher-level examples that are said to involve recursion, like social cognition (e.g. false-belief tasks) or problem solving (e.g. the Tower of Hanoi task). Here the problem will be, however, that these processes are conceptual⁴ and according to the recursion-only-hypothesis the conceptual-intentional system is also not part of the FLN. If it were, one could argue that language is only derivatively recursive, i.e. possesses recursive structure in order to express recursive thoughts, but this recursive structure can be realized in semantics and/or pragmatics equally well as in syntax. For example, instead of saying "If the dog barks, the postman may run away" we could say "The dog might bark. The postman might run away" (Evans & Levinson, 2009, p. 443). The problem for the recursion-only claim is that if it is the conceptual content that matters, then purely syntactic recursion could neither be the pivotal step nor be restricted to the linguistic domain. We could, of course, hypothesize a reversed dependence - that it is syntactic recursion that promoted the appearance of recursive conceptual structures. The direction could be investigated by developmental studies, but in either case, if recursion were attested in the conceptual-intentional system and in language, it could logically not be considered a domain-specific skill.

⁴ Conceptual here is not used in the same sense as in the concept of recursion in the paragraph above. The latter refers to an explicit, conscious understanding of recursion; the former is the terminology of Hauser, Chomsky and Fitch and seems to refer to conceptual processes that are distinguished from lower-level sensori-motor processes but can still be pitched at the level of sub-personal unconscious mechanisms.

The conceptual task we choose as an operationalization of non-linguistic recursion is a problem-solving task, the Tower of Hanoi (ToH), given a rather wide agreement that it involves recursion in some form. This task involves three vertical pegs and a number of disks that can be put on them (see Figure 3). The disks are presented in a variety of initial configurations and the goal is to transform them into another configuration under the constraints that (1) only one disk can be moved at a time and (2) a disk cannot be placed on top of a smaller disk. Simon (1975) provided a comprehensive analysis of the task, stating that the same task can be solved using different strategies - perceptual, rote and recursive - which all pose different demands on the learner. The perceptual strategy relies on observing the situation and trying to move the disks based on their size, either looking at the source peg only, or at both the source and the target. A rote strategy involves storing the solution steps in memory, usually defined in terms of rules, and executing them successively. Finally, the recursive strategy involves realizing that the task can be decomposed into the problems of moving smaller pyramids until only one disk needs to be moved. This strategy is highly efficient because it produces the path to the solution of minimal length and does not require storing pre-defined rules in long-term memory. It requires, however, representing intermediate goals and holding them in short-term memory, and it also requires that the concepts of a pyramid and a sub-pyramid are invented by the subject, i.e. realizing that the problem involves self-embedding of certain objects and procedures. Interestingly, Simon thought that recursion is merely a concept, a formally taught skill, not a natural capacity.
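
The recursive strategy can be written down as a short program. The following Python sketch is a textbook formulation, not the procedure used in the experiment; it assumes the classic variant in which a full pyramid is moved from one peg to another (the task described above also uses other start and goal configurations), and the peg names and the `replay` helper are illustrative assumptions.

```python
def hanoi(n, source, target, spare, moves=None):
    """Return the list of moves (disk, from_peg, to_peg) for n disks.

    Moving a pyramid of n disks is reduced to moving a sub-pyramid of
    n - 1 disks out of the way, moving the largest disk, and moving
    the sub-pyramid back on top of it.
    """
    if moves is None:
        moves = []
    if n == 0:
        return moves
    hanoi(n - 1, source, spare, target, moves)
    moves.append((n, source, target))
    hanoi(n - 1, spare, target, source, moves)
    return moves


def replay(n, moves, source="A"):
    """Replay the moves and check both task constraints."""
    pegs = {"A": [], "B": [], "C": []}
    pegs[source] = list(range(n, 0, -1))  # largest disk at the bottom
    for disk, src, dst in moves:
        # only the top disk of a peg can be moved
        assert pegs[src] and pegs[src][-1] == disk
        # a disk cannot be placed on top of a smaller disk
        assert not pegs[dst] or pegs[dst][-1] > disk
        pegs[dst].append(pegs[src].pop())
    return pegs


solution = hanoi(3, "A", "C", "B")
print(len(solution))           # 7, i.e. 2**3 - 1 moves
print(replay(3, solution))
```

The only things the procedure has to keep track of are the pending sub-goals (the recursive calls that have not yet returned), which corresponds to the short-term memory load mentioned above.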

Figure 3: Typical Tower of Hanoi task

(a) Start state (b) Goal state

In the psychological literature, the ToH task is claimed to measure executive function, that is, the cognitive capacity to regulate behavior and more basic cognitive processes to achieve a future goal (Welsh & Huizinga, 2001), which comprises the mechanisms of goal selection, planning, set maintenance, self-monitoring, inhibition and flexibility of strategies. It has been shown in an exploratory study that individuals differ in what strategy they use in solving the task. Welsh and Huizinga (2005) defined the goal recursion strategy as relying on

(1) recognizing that the first subgoal is to move the largest disk to its goal position, (2) moving the smaller disks out of the way, (3) building a sub-pyramid stack of these smaller disks on the open peg, (4) moving the largest disk to its goal position, (5) repeating these steps with the next-largest disks and progressively smaller sub-pyramid stacks until the goal state is achieved (p. 284).

Within each subgoal there are smaller cycles of these steps. The rote strategy can be safely assumed to be unknown to participants who were not familiar with the task, but they could use the perceptual strategy. In the experiment, participants were administered the task in blocked order (in order of increasing difficulty) or in random order and asked at the end how they had solved the problems. The researchers assessed the presence of the strategy components listed above. There was no effect of how the task was administered on strategy knowledge. More importantly, strategy score did have an effect on ToH performance, and not merely at the end of the session: the differences were visible right from the start, i.e. people who reported at the end having used the recursive strategy were better at solving the task already from the first trials, suggesting that strategy knowledge could be an individual difference variable. What could be the possible cause of this effect? Inductive reasoning? Working memory? Inhibition? Perhaps, analogously to linguistic competence, here too the recursive skill itself could affect the choice of strategy in those participants and, as a result, their performance.

Recursion across domains

The recursion debate has so far been a purely theoretical one (Fitch et al., 2005; Jackendoff & Pinker, 2005; Kinsella, 2009; Luuk & Luuk, 2011; Pinker & Jackendoff, 2005; Tomalin, 2007) and our primary goal here is to look for a more quantifiable approach to the issue and gain preliminary empirical data. We have suggested above that what is required is a clearer view on what recursion as a cognitive mechanism could be and, accordingly, how it could be operationalized. The upshot of all this is that waving at some non-linguistic phenomenon like visual grouping, calling it recursion and using this to show that recursion is a domain-general mechanism proves nothing without, first, specifying what the commonality could be on the level of the underlying mechanism and, second, showing empirically that the two domains are actually related. Specifying the former is a question for a much larger project but we can already provide some data for the latter. We suggest that a way to do this, which, as it happens, has not yet been used in the recursion discussions, is looking at individual differences in recursive processing in different domains and correlating the resulting performance.

Previous research (reported above) has shown that there are individual differences in linguistic recursive processing, measured as individual performance on center-embedded sentences. It has also shown that there are individual differences in non-linguistic recursive processing, measured as both the overall performance score on the Tower of Hanoi task and the kinds of strategies people use to solve it. Our main research question is, therefore: Is there a relationship between linguistic recursive processing and non-linguistic recursive processing within individuals? If variability is shown for both domains, we could expect the differences to be correlated within an individual, namely people who are better at processing recursive sentences could also be better at solving a non-linguistic task, i.e. the ToH in this case.

In order to investigate this relationship we need to be able to express the individual performance on both tasks in terms of scores meaningful for the current goal. In the case of the linguistic task, what is typically taken as a performance indicator is how much slower people deal with center-embedded sentences versus tail-embedded ones or how much slower they deal with semantically neutral sentences versus semantically supported ones. If our hypothesis that the individual differences can be explained by variability in recursive processing skill is correct, we could propose that the higher the recursion skill, the less detrimental it is to people's performance (measured as reading time, comprehension score etc.) when the sentence is center-embedded, semantically neutral and has a higher level of embedding, and that the effect is additive. Therefore, the processing cost of all these factors could be combined into a single score expressing the recursiveness of sentence processing. A similar score would be desirable for ToH task performance. Traditionally it is judged in terms of success vs failure in solving each of the configurations, or in terms of solving it within a required amount of time or within a minimum required number of moves. Obviously none of these scores directly reflects which strategy has been used by the participant, and work on deriving a better score is currently under way (Zuidema, personal communication). For now, however, we will be forced to use the old scoring strategy.

Of course, we cannot eliminate the possibility that even if the relationship is found, it is due to the general cognitive skills which were previously invoked to explain individual differences in both sentence processing and problem-solving, such as working memory, executive function or statistical learning. However, we can try to estimate the contribution of these factors statistically. Given the preliminary nature of this study, we restrict ourselves to one factor - working memory.

The contribution of working memory has already been explored in the context of recursive sentences. King and Just (1991) claimed that individual differences in syntactic processing (expressed in reading times and comprehension) depend on working memory capacity. They measured working memory with the Reading Span Test, which is found to be more strongly correlated with language tasks than regular working memory tests like word/digit span tests. Performance of low-capacity subjects decreases more under cognitive load and improves more when semantic support is provided, which is evident in both comprehension and reading times, although the relationship with reading time is not straightforward. Non-comprehending low-capacity subjects spend less time in the critical area (second verb) of object-relative center-embedded sentences. It has been argued (MacDonald & Christiansen, 2002) that reading span tests actually measure language comprehension skill, not working memory. There is a wide discussion (e.g. Fedorenko et al., 2006; Caplan et al., 2011) on whether working memory is a single pool or whether there exist pools specific to every domain (i.e. a pool specific to language and one to problem-solving). If performance on a linguistic task and a ToH task is at least in part due to working memory, the question of the nature of working memory resources and of what particular working memory tasks in fact measure becomes a relevant one. At this stage, however, we do not see a reason to commit ourselves to any particular view and simply use the WAIS task for measuring working memory capacity in order to gain preliminary data.

Our predictions can be summarized as follows:

Hypothesis 1: Performance on the linguistic recursive task and on the recursive problem-solving task is correlated.

Hypothesis 2: The correlation is not explained primarily by individual differences in working memory capacity.
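
One common way to test a claim like Hypothesis 2 is a first-order partial correlation, i.e. the correlation between the two task scores after the linear contribution of working memory has been removed from both. The Python sketch below is an illustration under that assumption and not necessarily the analysis carried out in this study; the function names and the toy scores are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y with the linear effect of z removed."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical scores for four participants (not data from the study).
sentence_score = [1, 2, 3, 4]      # linguistic recursive task
toh_score = [1, 3, 2, 4]           # Tower of Hanoi task
working_memory = [1, -1, -1, 1]    # working memory measure

print(pearson(sentence_score, toh_score))
print(partial_corr(sentence_score, toh_score, working_memory))
```

If the partial correlation stays close to the raw correlation, working memory accounts for little of the shared variance; if it drops towards zero, the relationship is largely explained by working memory.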

Study 1: Sentence processing pilot

We conducted a pilot study which served two main purposes. First, we needed to establish whether the stimuli we chose for the linguistic task reproduce the patterns observed previously in the literature. Second, we wanted to apply several exploratory methods in an attempt to derive a single recursivity score for each participant which would correlate with observed trends in the data and which could later be used to investigate the relationship with non-linguistic performance. With regard to establishing a good individual score for recursive skill based on the reading times, several different metrics were examined. An a priori validity test for a potential score was thought to be a pattern akin to the one represented in Figure 4. If a person is high on recursive skill, they are likely to have flatter reading time profiles because they are less sensitive to changes in the level of syntactic embedding, the type of embedding and the absence of semantic support. On the other hand, a person low on such a skill would most likely be slightly slower in sentence processing in general and also highly sensitive to recursive manipulations. Therefore, while it is true that such a person would in general produce slower reading times than highly recursive people in the most difficult type of sentences (center-embedded, with 3 levels and no semantic support - cn3⁵), an equally important feature should be that the difference between reading times for cn3 and control sentences is larger for them, because cn3 sentences involve a combination of 3 factors requiring a well-functioning recursion mechanism. Unfortunately, it has not been investigated to date how much influence each of these factors has on reading time in quantitative terms nor how this influence might vary individually. In other words, it is not known whether the large increase in reading times for cn3 has the same source for everyone or whether for some people it is mostly due to the lack of semantic support and for others to the increase in embedding. In addition, recursive skill could, for example, have a higher impact on sensitivity to the lack of semantic support and the level of embedding, while sensitivity to the type of embedding could depend more on working memory capacity. Given the limits of this project, we are not able to address all these issues here and therefore the character of our theorizing is largely exploratory. We address more such concerns in the Discussion section and suggest what steps could be taken in the future.
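
To make the predicted pattern concrete, the following Python sketch computes two simple candidate metrics from a participant's mean reading times per sentence type: the extra cost of cn3 sentences relative to controls, and the overall spread of the profile. Both metrics, the function names and the reading times are illustrative assumptions and are not the scores eventually adopted in the study.

```python
def embedding_cost(mean_rt, hard="cn3", baseline="control"):
    """Extra reading time (ms per word) for the hardest sentence type."""
    return mean_rt[hard] - mean_rt[baseline]


def profile_spread(mean_rt):
    """Range of mean reading times across all sentence types."""
    return max(mean_rt.values()) - min(mean_rt.values())


# Hypothetical mean reading times per word in milliseconds.
high_skill = {"control": 380, "cs2": 390, "cn2": 400, "cn3": 420}
low_skill = {"control": 420, "cs2": 450, "cn2": 520, "cn3": 700}

for label, profile in [("high skill", high_skill), ("low skill", low_skill)]:
    print(label, embedding_cost(profile), profile_spread(profile))
```

Under the prediction sketched in Figure 4, a participant with high recursive skill would show a small cost and a flat profile, while a participant with low recursive skill would show a large cost and a steep profile.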

Methods

Participants

Twenty-two graduate student volunteers participated in the experiment. Eleven were native English speakers and ten were proficient users of English. One participant did not declare their level of English proficiency and therefore was not included in the between-subject analysis.

⁵ Hereafter the type of embedding is denoted as c for center-embedding and t for tail-embedding, the presence

Figure 4: Predicted reading time patterns depending on the level of recursive skill

Procedures

The task we used was a moving window self-paced reading task, which allows for recording the amount of time spent on reading each word; such reading times have been found to closely resemble gaze durations (Just et al., 1982). The stimuli chosen for the task were a combination of stimuli previously used in research on center-embedding (Stolz, 1967; King & Just, 1991) and newly composed sentences. The stimulus set consisted of 48 sentences, 24 of which were control sentences with no embedding, chosen semi-randomly from linguistic corpora (Wall Street Journal and Switchboard) and matched in superficial structure and complexity to the experimental sentences. The experimental sentences were of 8 kinds exhibiting a combination of the following factors: center vs tail-embedding, 1 vs 2 levels of embedding (i.e. 2 vs 3 sub-clauses within a sentence) and presence vs absence of semantic support. For example, a center-embedded sentence with one level of embedding and the presence of semantic support was "The iceberg that the penguin inhabited sank the ship", while a tail-embedded sentence with two levels of embedding and without semantic support was "The receptionist liked the typist that thanked the clerk that teased the boss". As can be seen from the last example, a lack of semantic support does not mean semantic inconsistency; it simply means that any subject would be appropriate for any predicate and therefore semantic information does not aid in decomposing the sentence into sub-clauses and comprehending its meaning. Control sentences were chosen so as not to exhibit clausal embedding, to start with a subject and to include a number of subjects or predicates, for example "All the brothers and sisters got together", "The new rules allowed investors to buy foreign stocks directly", "The men came out and said that was only a snake" and so on.
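
The eight kinds of experimental sentences follow from crossing the three two-level factors. The Python sketch below enumerates them; the three-character codes follow the labels used in the Results (e.g. cn3, cs3, tn3), but the exact coding scheme, in particular "s" and "n" for presence and absence of semantic support, is inferred from those labels and should be read as an assumption.

```python
from itertools import product

EMBEDDING = {"c": "center-embedded", "t": "tail-embedded"}
SUPPORT = {"s": "semantic support", "n": "no semantic support"}
CLAUSES = {2: "one level of embedding", 3: "two levels of embedding"}


def condition_codes():
    """All combinations of embedding type, support and clause count."""
    return [f"{e}{s}{k}" for e, s, k in product(EMBEDDING, SUPPORT, CLAUSES)]


# The two example sentences from the text, labelled with their codes.
EXAMPLES = {
    "cs2": "The iceberg that the penguin inhabited sank the ship",
    "tn3": "The receptionist liked the typist that thanked the clerk "
           "that teased the boss",
}

codes = condition_codes()
print(codes)
print(len(codes))  # 8 kinds of experimental sentences
```

With 24 experimental sentences in the stimulus set, an even split would give three sentences per kind, although the text does not state the split explicitly.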

The stimuli were encoded in an applet which was published online (the applet and the source code are publicly available at http://sta.science.uva.nl/~abramova/applet/). The participants were asked to run the applet at a time convenient to them. The sentences were presented one at a time, with each word revealed at a button press (see Figure 5). The times between button presses were recorded. Each sentence was followed by a true/false comprehension question, both to ensure that the subjects paid attention to the material and to investigate the possible relationship between comprehension and reading times. The actual stimuli were preceded by a practice round and followed by a series of questions asking the participants to indicate their level of English (native or not), whether they thought they knew what was being tested and whether they had used any strategy in solving the task.

Figure 5: Moving window self-paced reading task

Results

Reading times per word that deviated more than three standard deviations from the mean (across subjects) were removed from the analysis.⁶ Mean reading times per word were subsequently calculated for each sentence and each subject and then averaged to form 9 groups of sentences. Figure 6 shows the box-plots for the resulting data.
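The pre-processing described above can be sketched as follows. This is only an illustration under stated assumptions: the record layout `(subject, sentence_type, sentence_id, rt_ms)` and both function names are hypothetical, and the three-standard-deviation criterion is applied here to the mean and standard deviation of all word reading times pooled across subjects, which is one possible reading of "across subjects".

```python
from collections import defaultdict
from statistics import mean, stdev


def trim_pooled(records, n_sd=3.0):
    """Drop word reading times more than n_sd SDs from the pooled mean.

    records: list of (subject, sentence_type, sentence_id, rt_ms) tuples,
    one tuple per word (hypothetical layout).
    """
    rts = [rt for _, _, _, rt in records]
    centre, spread = mean(rts), stdev(rts)
    return [r for r in records if abs(r[3] - centre) <= n_sd * spread]


def mean_rt_by_type(records):
    """Average per word within each sentence, then across the sentences
    of the same type, separately for every subject."""
    per_sentence = defaultdict(list)
    for subject, stype, sentence_id, rt in records:
        per_sentence[(subject, stype, sentence_id)].append(rt)

    per_type = defaultdict(list)
    for (subject, stype, _), rts in per_sentence.items():
        per_type[(subject, stype)].append(mean(rts))

    return {key: mean(values) for key, values in per_type.items()}
```

Note that the second function averages sentence means rather than pooling all words of a sentence type, so long and short sentences of the same type contribute equally.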

As can be clearly seen, 3-level center-embedded sentences in particular differed from control sentences. A one-way repeated measures analysis of variance showed a significant main effect of sentence type, F(8, 12) = 5.36, p = .005, partial η² = .781. After a Bonferroni adjustment, specific comparisons revealed that cn2, cn3, cs3 and, marginally, tn3 differed from the control sentences, with p < 0.01, p < 0.001, p < 0.003 and p < 0.06 respectively. The difference between tail-embedded semantically non-supported sentences with three levels of embedding and controls is unprecedented in the literature and perhaps warrants closer investigation in future research. Further analysis revealed that whether subjects were native or proficient non-native English speakers had no influence on the sentence type effect, most likely because the non-native speakers were very close to native in their proficiency.

Figure 6: Average reading times for different types of sentences (in ms)

⁶ There are two other possibilities for removing outliers. One would be to remove outliers relative to the mean reading time within each subject, in a way standardizing to the individual reading speed. Since word lengths differ, however, this would penalize longer words. One could also first calculate the average reading times per sentence and then standardize within each individual. Neither method, though, is required for further statistical processing.

The effect of sentence type is also visible when plotting results for individual sentences (Figure 7). The reading latencies are visibly much higher in cs3 than in control sentences, with the largest delays centered around the second half of the sentence. They also show higher variability among the participants. A similar pattern was observed for sentences with the same structure but without semantic support, only with even higher reading times.

Figure 7: Word-level differences (in ms) between different types of sentences (different lines represent different subjects). (a) Control (b) Cs3

The structural differences between the experimental sentences were also analyzed in more detail. The means were calculated across subjects for each word in each of the sentences. The word averages were then averaged again into word regions of interest as follows: subjects, together with their articles and relative clause conjunctions, into N1, N2 etc. and predicates into V1, V2 etc. For example, [The iceberg that] [the penguin] [inhabited] [sank] [the ship] was encoded as N1 N2 V2 V1 N3, as indicated here with square brackets. As the last step, the 6 sentences falling into each of the four structural categories (center vs tail, 2 vs 3 levels) were combined into two vectors, separately for semantically supported and non-supported sentences.
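The region coding can be made concrete with a small sketch. The function name, the `(label, number_of_words)` span format and the reading times below are all hypothetical and serve only to illustrate how word-level means collapse into the regions N1 N2 V2 V1 N3 of the example sentence.

```python
from statistics import mean


def region_means(word_rts, spans):
    """Collapse per-word mean reading times into word regions.

    word_rts: mean RT for each word of one sentence, in order.
    spans:    list of (region_label, number_of_words) in sentence order.
    """
    if sum(n for _, n in spans) != len(word_rts):
        raise ValueError("spans must cover every word exactly once")
    result, start = {}, 0
    for label, n_words in spans:
        result[label] = mean(word_rts[start:start + n_words])
        start += n_words
    return result


# [The iceberg that] [the penguin] [inhabited] [sank] [the ship]
spans = [("N1", 3), ("N2", 2), ("V2", 1), ("V1", 1), ("N3", 2)]
# Made-up word-level means in ms, for illustration only.
word_rts = [300, 320, 340, 400, 420, 600, 700, 350, 370]
regions = region_means(word_rts, spans)
```

With these invented values the verb regions stand out (`regions["V2"]` is 600 and `regions["V1"]` is 700), which is the kind of contrast the region-level analysis is meant to expose.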

As can be seen from the graphs (Figure 8), the semantic support manipulation did not seem to matter for the 2-level sentences but did for both center- and tail-embedded 3-level ones. Surprisingly, the region with the longest reading times in center-embedded 2-level sentences was V2 and not, as predicted by the literature, V1. The prediction of V2 being most problematic for 3-level sentences was confirmed, however. Interestingly, the second half of the 3-level tail-embedded sentences is also problematic, with reading times increasing on each subsequent noun, as visible in Figure 8d. The differences between reading times for semantically supported versus neutral sentences on the word region level were only statistically significant for V3 in center-embedded 3-level sentences (p < 0.008, t(21) = −2.92).

Figure 8: Reading times (in ms) for different word regions

During the experiment, in addition to recording the sentence reading times, comprehension questions were asked and the times taken for answering them (and the answers themselves) were recorded. The answer time was not corrected for the length of the question (and so for the time needed to read it), but most questions were of the same length (6 words), so the results should not be strongly biased. The mean answer times were positively correlated with mean reading times overall (r(20) = .63, p < 0.002) and for each type of sentence (significantly so, however, only for control, tn2, cn3 and ts3). Thus, in general, longer reading times are associated with longer times taken to answer the comprehension question. The comprehension scores were relatively high, ranging from 34 to 44. Performing a median split on these scores and including the resulting groups in the analysis of variance of reading times did not yield a significant interaction. High comprehenders differed from low comprehenders in that they spent more time on reading the sentences and answering questions, especially for the difficult cn3 sentences. Although this is in line with previous research (King & Just, 1991), neither of these differences reached statistical significance. Since answer times and comprehension scores largely paralleled sentence reading times and did not provide much new information, they were not analyzed further. It is advisable, however, to keep the current set-up of the experiment because of the motivational function of comprehension questions.

With regard to establishing a good individual score for recursive skill based on the reading times, several different metrics were examined, namely processing strategy, cost scores, clustering and principal components analysis. Participants were asked at the end of the test whether they had used any strategy in solving the task. They were classified according to whether they mentioned more semantic factors ("Some sentences were solvable through simple logic", "I tried to imagine the situation visually in my mind") or more syntactic ones ("try to remember the order of nouns vs order of verbs", "splitting one sentence into few", "First I had to think of short sentence constructed in this way and then transfer the way of concluding on the long sentences"). Unfortunately, answering these questions was not obligatory and there was no a priori guideline for classifying the subjects. As a result, only 12 participants were classified and there were no significant differences between the groups. Giving participants a pre-defined set of possible strategies in the future could perhaps shed more light on the issue.

Another way we tried to capture individual differences was to convert the raw reading times into cost scores after standardizing them to the mean reading time within participant. Semantically neutral sentences are predicted to be more difficult (to take more time to process) than semantically supported ones, center-embedded sentences more difficult than tail-embedded ones and 3-level sentences more difficult than 2-level ones. We therefore separately computed semantic cost scores, type cost scores and level cost scores for each participant by subtracting reading times while keeping the other two factors constant. For instance, the semantic cost score was computed by subtracting the reading time of cs2 from cn2, ts2 from tn2, cs3 from cn3 and ts3 from tn3 and then averaging the four differences into a single value. Next we tried using these scores to form two groups of participants, either by performing a median split on the composite cost score or by cluster analysis on the matrix of the three scores. Neither produced groups with a significant interaction with the sentence type effect, and neither revealed meaningful patterns. Figure 9a illustrates that taking a composite cost score as a basis for separating participants into two groups is essentially circular, as the only difference between the groups is in the 3-level sentences, which combine the largest number of cost factors.
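The three cost scores can be sketched as below. Only the semantic cost is spelled out in the text; the type and level costs are computed here by analogy (one factor varied, the other two held constant), which is an assumption, as are the function name and the dictionary layout keyed by the sentence codes cs2, cn2, ts2, tn2, cs3, cn3, ts3 and tn3.

```python
from itertools import product
from statistics import mean


def cost_scores(rt):
    """rt maps a sentence code such as 'cn3' to one participant's
    standardized mean reading time for that sentence type."""
    # neutral minus supported, within each embedding type and level
    semantic = mean([rt[f"{e}n{l}"] - rt[f"{e}s{l}"]
                     for e, l in product("ct", "23")])
    # center minus tail, within each semantic condition and level
    embedding = mean([rt[f"c{s}{l}"] - rt[f"t{s}{l}"]
                      for s, l in product("sn", "23")])
    # 3-level minus 2-level, within each embedding type and semantic condition
    level = mean([rt[f"{e}{s}3"] - rt[f"{e}{s}2"]
                  for e, s in product("ct", "sn")])
    return {"semantic": semantic, "type": embedding, "level": level}
```

Each score is an average of four pairwise differences, so a participant with missing data for any one sentence type would need separate handling that this sketch does not attempt.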


Figure 9: Mean sentence reading times (in ms) grouped

Next, we applied principal component analysis to both raw sentence means and cost scores. The results looked more promising for the former, producing both a significant interaction with the sentence type effect and a more interesting plot (see Figure 9b). However, if one examines these groups with regard to a different measure of sentence processing, i.e. comprehension scores, one notices that the lower group, i.e. the participants whose scores on the first principal component fall in the lower half and who have lower mean reading times, also has significantly (!) lower answer scores. A good measure of recursive skill, on the other hand, should capture the fact that people high on such a skill not only process the sentences faster but also comprehend them better. What is pictured in Figure 9b may therefore be merely a difference between people who were less motivated to perform the task, and so devoted less time to reading the sentences and as a result understood them less well, and the rest.

Discussion

It seems then that we still do not have a score summarizing performance on the sentence processing task in a way that could reflect an underlying recursion mechanism. Perhaps a better approach would be to first quantify the contribution of each of the factors varied across the different types of sentences and to devise a model that captures both the influence of these factors on the reading times and the influence of a latent individual variable. It would also be advisable to incorporate some sort of comprehension measure into such a model and perhaps to design a task that allows better detection of how well participants understand the sentences. These concerns are further addressed in the General Discussion.


Study 2: Sentence processing and Towers of Hanoi

We conducted a second study with the aim of gaining insight into the issues involved in trying to compare linguistic recursion with non-linguistic recursion and test the tasks and the stimuli developed to that end.

Methods

Participants

Twenty-one native (18) and proficient (3) English speakers participated in the experiment. The sample was recruited from international graduate students as well as through expatriate fora and job offering websites on the internet. As a result, it was diverse in age and background. Participants were invited to a location in Amsterdam Science Park and provided with information about the experiment; after the experiment they were paid an experimental fee of 8 euros and debriefed. As was evident from preliminary data analysis, 5 persons did not understand the Towers of Hanoi instructions and their data were therefore excluded from further analysis.

Procedures

The second study was composed of three tasks and a post-experimental questionnaire. The latter involved asking the participant whether they think they know what is being tested and whether they used any strategy in solving any of the tasks. The study took about one hour to complete.

Self-paced Reading Task

Linguistic recursion was tested with the same self-paced reading task as the one used in the first study.

Short-term Memory Task

Short-term memory was tested with a WAIS digit-span task implemented in a web-based applet. Participants were presented with a series of digits, one digit at a time, which they were required to type in after each presentation. The first block required repeating the sequence in the same order and the second block in reversed order. The number of digits was increased gradually from three to eight and the task was stopped once the participant made two mistakes in a row.⁷ The average number of sequences reproduced without mistake across both blocks was taken as the participant's memory score.

⁷ Unfortunately, after the experiments were concluded, an error in the task implementation was discovered. Phase 1 of the WAIS task was unaffected, but Phase 2 stopped the participant after 2 non-consecutive real errors while not counting the cases where the participant misunderstood the task and typed the digits in sequential instead of reverse order. This has undoubtedly affected the memory scores. The implementation of the memory and ToH tasks was not the author's responsibility, but not having double-checked after running the experiment with the first participants was (and is a sad but invaluable lesson for the future).

Problem-solving Task

Recursive and non-recursive problem-solving was measured by the Tower of Hanoi task. This task involves a set-up of three vertical pegs and a number of disks that can be put on them. The disks are presented in a variety of initial configurations and the goal is to transform them into another configuration under the constraints that (1) only one disk can be moved at a time and (2) a disk cannot be placed on top of a smaller disk. This task is said to be optimally solved using a recursive strategy (Simon, 1975; Welsh & Huizinga, 2005): decomposing the problem of moving a pyramid of n disks into the smaller problem of moving a pyramid of n-1 disks onto a spare peg, and so on until a move of one disk is required. Participants were required to solve a set of 22 puzzles that differed in both starting and goal configurations. They were free to move to the next puzzle at any time and the number of solved puzzles was recorded. Traditionally the ToH score is taken to be the number of puzzles solved in the minimum number of moves. We allowed participants to continue beyond the minimum and recorded whether they succeeded in solving each of the puzzles and the number of moves they required.
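The recursive strategy referred to above can be illustrated for the classic version of the puzzle, in which a full pyramid of n disks is moved from one peg to another. This is a textbook sketch rather than the task implementation used in the study, whose puzzles had varied start and goal configurations; the peg labels and the function name are arbitrary.

```python
def hanoi(n, source="A", target="C", spare="B"):
    """Return the list of (from_peg, to_peg) moves that transfers a
    pyramid of n disks from source to target."""
    if n == 0:
        return []
    return (
        hanoi(n - 1, source, spare, target)    # clear the n-1 smaller disks
        + [(source, target)]                   # move the largest disk
        + hanoi(n - 1, spare, target, source)  # put the n-1 disks back on top
    )
```

The procedure calls itself twice per level, so a pyramid of n disks takes 2^n - 1 moves, for example 7 moves for three disks. This is the minimum against which extra moves would be counted for the classic configuration.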

Results

Linguistic data was pre-processed as in Study 1. Mean reading times per word for each sentence and for each subject were obtained and averaged to form 9 groups of sentences. The results of the first study with respect to the main effect of sentence type were replicated⁸, despite the smaller sample size (Figure 10).

Figure 10: Average reading times for different types of sentences (in ms)


As mentioned above, memory scores were taken to be the average number of items reproduced correctly by the participant in the sequential and reversed blocks. Two varieties of ToH scores were considered. The first is a composite score of the number of correctly solved puzzles adjusted for the number of extra moves, calculated per participant (i.e. if a participant solved 21 puzzles correctly but performed in total 60 moves above the minimum number of moves for all the puzzles that were solved correctly, their score would be 21 x 10 - 60 = 150 points), referred to below and in figures as the ToH Score. The second is the number of puzzles solved correctly in the minimum number of moves, referred to as the ToH Score Min. These two scores were highly correlated (r(14) = .81, p < 0.01) and we therefore used only one of them, the composite score, for further analysis. The summary of the data for memory and Tower of Hanoi performance is presented in Figures 11 and 12.
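The two scoring schemes can be written out as a short sketch. The weight of 10 points per solved puzzle is inferred from the worked example (21 x 10 - 60 = 150); the function names and the `(solved, moves_used, minimum_moves)` record format are hypothetical.

```python
def toh_score(puzzles, points_per_puzzle=10):
    """Composite score: points for every solved puzzle minus the moves
    made above the minimum on those solved puzzles.

    puzzles: list of (solved, moves_used, minimum_moves) per puzzle.
    """
    solved = [(used, minimum) for ok, used, minimum in puzzles if ok]
    extra_moves = sum(used - minimum for used, minimum in solved)
    return len(solved) * points_per_puzzle - extra_moves


def toh_score_min(puzzles):
    """Number of puzzles solved in exactly the minimum number of moves."""
    return sum(1 for ok, used, minimum in puzzles if ok and used == minimum)


# 21 solved puzzles with 60 extra moves in total, plus one unsolved puzzle.
example = [(True, 10, 7)] * 20 + [(True, 7, 7)] + [(False, 12, 7)]
```

For the `example` data the composite score reproduces the 150 points of the worked example, while the minimum-moves score is 1, since only one puzzle was solved without extra moves.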


Figure 12: ToH scores

Next, we compared performance on the ToH task and the memory task with a number of linguistic recursion scores: (1) simply the reading times of semantically neutral, center-embedded sentences with two levels of embedding (cn3), (2) the difference in reading times between such sentences and control sentences (Cn3 - Control), (3) the average cost scores and (4) the scores derived from a principal component analysis performed on the reading times across the 9 kinds of sentences. None of these comparisons was in line with our Hypothesis 1. The correlation of linguistic data with the ToH score varied between r(14) = .18 for the Cost score and r(14) = .27 for Cn3, and the correlation of linguistic data with memory between r(14) = .003 for Cn3 and r(14) = .26 for the Cost score. In addition to being very low, none of these correlations reached statistical significance. The correlation between memory and ToH performance was r(14) = .44, only significant in a one-tailed test.⁹ An overview of these results is presented in Table 1. Because of these results, the planned multiple regression aimed at testing our second hypothesis was not performed.
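The difference between the one-tailed and two-tailed verdicts for the memory and ToH correlation can be checked by converting r into a t statistic with df = N - 2 = 14. The sketch below uses the standard conversion t = r·sqrt(df)/sqrt(1 - r²) and the tabulated critical values of the t distribution for 14 degrees of freedom at α = .05; the function name is hypothetical and r = .437 is the unrounded value from Table 1.

```python
from math import sqrt

# Tabulated critical t values for df = 14 at alpha = .05
T_CRIT_ONE_TAILED = 1.761
T_CRIT_TWO_TAILED = 2.145


def t_from_r(r, df):
    """Convert a Pearson correlation into a t statistic with df = N - 2."""
    return r * sqrt(df) / sqrt(1 - r * r)


t_memory_toh = t_from_r(0.437, 14)  # roughly 1.82
significant_one_tailed = t_memory_toh > T_CRIT_ONE_TAILED
significant_two_tailed = t_memory_toh > T_CRIT_TWO_TAILED
```

The statistic falls between the two critical values, which is why the correlation passes the one-tailed test but not the two-tailed one. The same conversion applied to the largest language and ToH correlation (r = .274) gives a t of about 1.07, below both thresholds.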

Visual inspection of the data reveals that three participants performed much better on the ToH task than the rest (see Figure 12b), in that they solved nearly all puzzles in the required minimum number of moves. One could therefore wonder whether a sufficiently large sample (containing more such highly performing individuals) would reveal more meaningful patterns. For example, these participants could be expected to also differ in interesting ways in their reading times and memory. However, examination of the relevant scatterplots does not seem to warrant such a conclusion (Figure 13).

⁹ We expected that people with better memory would perform better at ToH and, accordingly, a one-tailed test is more appropriate. Because it was not our explicit hypothesis and because a two-tailed test is more conservative, we report both.

Table 1: Correlations between Language, Memory and ToH

                     Cn3     Cn3 - Control  Cost Score  PCA Score  Memory  ToH Score
Cn3            r     1       .974**         .492        .978**     .003    .274
               Sig.          .000           .053        .000       .990    .305
Cn3 - Control  r     .974**  1              .461        .914**     .065    .238
               Sig.  .000                   .073        .000       .811    .375
Cost Score     r     .492    .461           1           .567*      .263    .176
               Sig.  .053    .073                       .022       .324    .516
PCA Score      r     .978**  .914**         .567*       1          -.051   .235
               Sig.  .000    .000           .022                   .852    .381
Memory         r     .003    .065           .263        -.051      1       .437
               Sig.  .990    .811           .324        .852               .091
ToH Score      r     .274    .238           .176        .235       .437    1
               Sig.  .305    .375           .516        .381       .091

** Correlation is significant at the 0.01 level (2-tailed). * Correlation is significant at the 0.05 level (2-tailed). N = 16 for all comparisons.

Figure 13: RT scatterplots

Despite the discouraging correlation results, we explored the influence of memory- and ToH-based median splits on the linguistic data. The ToH-based split showed a slight difference in the reading time distribution on the level of sentence averages, but in a direction opposite to the predicted one (Figure 14a). That is, we expected that people performing better on the Tower of Hanoi task would have lower reading times in general and a smaller difference between the critical three-level semantically neutral center-embedded sentences and other types of sentences. The results show, however, that they have slightly higher reading times across all types of sentences and a greater increase in the reading times for cn3-type sentences. Next, we zoomed in on the reading times per word averaged across the same type of sentences, i.e. reading times for all three instances of the same type of sentence (cs2, ts2 etc.) were averaged for each person and subsequently averaged across participants based on median splits into higher- and lower-performing groups. There was no difference for cs2, ts2 or tn2. For cn2, the higher ToH group has a slightly larger increase in reading time upon encountering V1, the verb attached to the first noun, and for the remaining kinds of sentences the lower ToH group has slightly higher reading times overall.

The memory-based split showed no difference in reading times across the different types of sentences and almost none on the level of reading times per word. The only noticeable differences were higher reading times for V3 in cn3 sentences for the lower memory group (Figure 14b) and a pattern in ts3 sentences: there the lower memory group has higher reading times for the articles and lower reading times for the associated nouns, while the reverse is true for the higher memory group. This is an interesting pattern, perhaps suggestive of different processing management strategies depending on memory capacity.

Figure 14: Median splits (RTs in ms)

Discussion

Although the preliminary results seem discouraging, it could well be that either the tasks or the treatment of the data was at fault rather than that the hypothesized relationship is absent per se. First, the sample of 16 participants was very small. Second, as in Study 1, we did not have a straightforward measure of recursiveness in sentence processing. On the other hand, all measures we did try were highly correlated with each other and they all seem to be at least a good proxy for sentence processing performance. Third, perhaps ToH either should not be thought of in terms of recursion or the way we scored it does not reflect the underlying recursive strategy. Fourth, even if recursion cannot be thought of as an individually variable domain-general skill, there is still a possibility that there are short-term interaction effects in recursive processing in different domains. Therefore, a priming experiment could still be conducted.


General Discussion

Addressing the issues in both our studies, we believe that failing to find a significant correlation between recursive sentence processing and recursive problem-solving might have more to do with the problematic theoretical status of recursion than with a lack of any relationship.

First, self-reference can be realized in at least two different ways: structurally and procedurally. A structurally recursive object is an object that looks like it is built up of smaller copies of itself, for example a fractal (think of a broccoli or a snowflake). Procedural recursion, on the other hand, is the ability of a subroutine or a procedure to call itself (Harel, 1993, p. 31). For example, a cake can be divided into sixteen equal pieces by first cutting it in half, then cutting the resulting two pieces in half and then repeating the procedure twice more to give the desired number of portions. As these two examples already show, the relationship between the two kinds of recursion is complex: a fractal does not need to be the result of a recursive procedure (for example, iteration might do the trick) and a recursive procedure does not necessarily lead to a recursive structure (a divided cake does not look recursive).
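The cake example can be written down both ways to make the point concrete. In the sketch below (function names are arbitrary, and piece sizes are represented as fractions of the whole cake) the first function is procedurally recursive, the second is a plain loop, and both return the same flat list of sixteen pieces.

```python
def cut_recursively(piece, depth):
    """Halve a piece, then apply the same procedure to each half."""
    if depth == 0:
        return [piece]
    half = piece / 2
    return cut_recursively(half, depth - 1) + cut_recursively(half, depth - 1)


def cut_iteratively(piece, rounds):
    """Same result with a loop: halve every piece once per round."""
    pieces = [piece]
    for _round in range(rounds):
        pieces = [p / 2 for p in pieces for _half in range(2)]
    return pieces
```

Nothing in the output, a list of sixteen pieces of size 1/16, reveals which of the two procedures produced it, which is the sense in which a recursive procedure need not leave a recursive structure behind.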

Second, recursion itself can be seen either as a formal tool for describing certain phenomena or as something realized in the brain/mind. It is the former understanding that has traditionally been applied to linguistic recursion since the notion was introduced by Bar-Hillel (1953) and popularized by Chomsky (1956), who saw the need to view languages as involving recursive devices and discrete infinity for the purpose of simplifying the description (p. 115), relative to the descriptions offered by formerly used finite state models. Obviously, however, if we try to describe language, we are interested in explaining human linguistic capacity, not just language as a formal entity. That is why over time recursion began to acquire more and more cognitive connotations, culminating in its becoming a genetically-specified computational procedure realized in the brain (HCF110_).

Why these two distinctions matter for operationalizing recursion should become evident from examining what they mean in the context of linguistic structure. The recursive devices Chomsky talks about are in his early work realized as a phrase structure grammar that allows for hierarchicality in place of mere concatenation of strings into a flat structure. Recursion comes in when the hierarchy includes self-embedding that results from applying recursive rewrite rules such as X → YX, in which the category to the left of the arrow reappears on the right, engendering the capacity for looping ad infinitum. The result of applying such rules is a structurally recursive parse tree, but it does not seem justified to think of the application of such rules within a derivation as procedurally recursive, especially given the existence of the explicitly procedurally recursive Merge operation in Chomsky's later Minimalist Program (Chomsky, 1995). Merge appears already on the lowest level of linguistic structure in recombining the basic syntactic objects (which are simply lexical items). It takes two elements A and B, e.g. love and Mary, and combines them to form a higher level unit AB, e.g. love Mary. The created unit can undergo further Merge with e.g. John and produce John loves Mary. As should be evident, Merge is undoubtedly procedurally recursive, more so than the rewrite rules and derivations, since it can apply to its own output indefinitely, combining not only single words with each other but also whole phrases with other phrases, including embedded sentences.
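The contrast drawn here can be illustrated with a toy sketch. It is not a model of either formalism: trees are represented as nested tuples, the rule X → YX is applied in a plain loop, and `merge` is a hypothetical stand-in that simply pairs two syntactic objects.

```python
def derive_tree(steps):
    """Apply the rewrite rule X -> Y X `steps` times using a loop.
    The resulting tree is structurally recursive (an X inside an X)
    although the procedure building it never calls itself."""
    tree = "X"
    for _ in range(steps):
        tree = ("X", "Y", tree)  # an X node dominating Y and an embedded X
    return tree


def merge(a, b):
    """Combine two syntactic objects into one higher-level unit."""
    return (a, b)


verb_phrase = merge("loves", "Mary")
sentence = merge("John", verb_phrase)                  # Merge applied to its own output
embedded = merge("Sue", merge("thinks", sentence))     # a sentence inside a sentence
```

Here `derive_tree(2)` yields `("X", "Y", ("X", "Y", "X"))`, a self-embedded structure obtained without procedural recursion, whereas every use of `merge` beyond the first takes a previous result of `merge` as input.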

These different kinds of recursion have consequences for which phenomena in language we consider to exhibit recursion and therefore for what the empirical part of the recursion-only-hypothesis entails. If we opt for structural recursion, there is a choice of granularity of self-embedded constituents. Psycholinguistic studies usually concern themselves with the smallest recursive set possible, that is, sentences with long-distance dependencies (center-embedded sentences in our studies), but if we take recursive rewrite rules to be the hallmark of recursion, there is no reason why tail-embedded sentences or sentences with self-embedded prepositional phrases cannot be considered recursive. What is more, who is to tell whether a prepositional phrase embedded in another prepositional phrase should be classified as PP in PP and considered an instance of recursion, or rather as PP-LOC (Verhagen, 2010)? Opting for procedural Merge, on the other hand, makes every sentence recursive because every sentence is produced by a recursive operation. In sum, taking the structural vs procedural distinction into account, at the two extremes we could say that either everything in language is recursive (because it is produced by Merge or because the syntactic categories are coarse enough) or nothing is (because sentences are produced by a non-recursive mechanism or because the categories are very fine-grained). We decided to choose a position in between, with recursion as something special that is realized in certain self-embedded structures, but perhaps a different decision could have produced different results.

In the face of such ambiguities, it seems that developing a more explicit computational level model of recursion should be the next step, a model that spells out the role of recursive mechanism in language or what operation from the already established theories of sentence processing could approximate that mechanism. Gibson's integration operation in his Syntactic Prediction Locality Theory is one good candidate, one that seems to go in the direction of Merge if we consider the distinctions above.

As mentioned in the section on recursive sentence processing, SPLT considers two major components that affect reading times and comprehension: (1) a memory cost component, which dictates what quantity of computational resources is required to store a partial input sentence, and (2) an integration cost component, which dictates what quantity of computational resources needs to be spent on integrating new words into the structures built thus far. Integration lies at the heart of sentence comprehension and occurs on multiple levels: syntactically, combining the structures together; semantically, assigning thematic roles; and pragmatically, adding new elements to the discourse structure. An act of integration requires one to match the category of the current word with a syntactic prediction that is part of one of the candidate structures being pursued and to reactivate the lexical head/dependent associated with the syntactic prediction so that the plausibility of the head-dependent relationship can be evaluated within the discourse context (Gibson, 1998, p. 11). The computational resources are needed to perform the integration itself and to manage the elements, in proportion to the distance between them, because the activation of lexical items decays as new input is received.

The linguistic integration cost consists of a cost dependent on the complexity of the integration (e.g. constructing a new discourse referent is more complex) and a distance-based cost. A memory cost is the resources required to remember each category that has been encountered and needs to be integrated with new elements to form a grammatical sentence. Gibson's theory predicts that "the time required to perform a linguistic integration ... is a function of the ratio of the integration cost required at that state to the space currently available for the computation" (p. 16): RT = C * I / (M_capacity - M_currently-used), where C is a constant, I is an integration cost and M refers to memory capacity and memory currently used to store syntactic predictions. Gibson considers that subjects can differ in their working memory capacity. If we were to equate recursion with linguistic integration, however, we would have to claim that the integration cost, and specifically the cost related to the complexity of the integration, can also differ between individuals or can be influenced by priming.
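The ratio can be turned into a small worked example. The function and parameter names are hypothetical and the numbers are arbitrary units chosen for illustration, not values from Gibson (1998).

```python
def integration_time(integration_cost, capacity, in_use, c=1.0):
    """RT = C * I / (M_capacity - M_currently-used)."""
    available = capacity - in_use
    if available <= 0:
        raise ValueError("no memory left for the integration")
    return c * integration_cost / available


# Same integration cost under a light and a heavy memory load.
light_load = integration_time(integration_cost=2.0, capacity=5.0, in_use=1.0)
heavy_load = integration_time(integration_cost=2.0, capacity=5.0, in_use=3.0)
```

With these values the predicted time doubles from 0.5 to 1.0 as the memory in use rises from 1 to 3 units. Under the formula, individual differences can therefore enter through capacity, and, on the proposal discussed here, would additionally have to enter through the integration cost itself.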

Future studies need to make use of Gibson's formalization and investigate (1) whether the integration complexity cost can be manipulated in such a way that it can be teased apart from the distance-based cost and the memory cost, (2) whether it varies individually and (3) whether a metric based solely on such a cost can be correlated with non-linguistic recursion. Enriching the resulting model with measures of semantic relatedness derived from a corpus could also be considered, in order to account for semantic support effects stemming from the semantic level of integration.

Finally, more thought needs to be given to the role of recursion in tasks like the Tower of Hanoi. If recursion in language is responsible for integrating syntactic constituents, what could its function be in non-linguistic domains such that even asking a question about domain-specificity is meaningful? Or should we rather abandon postulating recursion as a cognitive mechanism sui generis and turn our attention to more basic commonalities between tasks that are said to involve recursion? Gibson (1998) lists a number of alternative explanations for the difficulties in self-embedded sentences, like perspective shifting (MacWhinney, 1977), according to which processing resources are required to shift the perspective of a clause, where the perspective of a clause is taken from the subject of the clause (Gibson, 1998, p. 5). This brings another non-linguistic domain to mind, namely Theory of Mind, which also requires shifting the (mental) perspective. Perhaps, then, comparing performance on subject- and object-extracted relative clauses with performance on a ToM false-belief task offers a more promising avenue for future research, especially since developmental links in this area have already been found (De Villiers & De Villiers, 2003).

References

Bar-Hillel, Y. (1953). On recursive definitions in empirical science. 11th International Congress of Philosophy, 5, 160–165.

Blaubergs, M. S., & Braine, M. D. (1974). Short-term memory limitations on decoding self-embedded sentences. Journal of Experimental Psychology, 102, 745–748.

Blumenthal, A. (1966). Observations with self-embedded sentences. Psychonomic Science, 6, 453–454.

Caplan, D., Dede, G., Waters, G., Michaud, J., & Tripodis, Y. (2011). Eects of age, speed of processing, and working memory on comprehension of sentences with relative clauses. Psychology and Aging, 26 (2), 49450.

Chomsky, N. (1956). Three models for the description of language. IRE Transactions of Information Theory IT-2 , 3 , 113124.

Chomsky, N. (1995). The minimalist program. Cambridge: The MIT Press.

Clark, A. (1997). Being there: Putting brain, body and world back together again. Cambridge: MIT Press.

De Villiers, J., & De Villiers, P. (2003). Language for thought: Coming to understand false beliefs. In D. Gentner & S. Goldin-Meadow (Eds.), Language in mind. Cambridge: MIT Press.

Evans, N., & Levinson, S. C. (2009). The myth of language universals: Language diversity and its importance for cognitive science. Behavioral and Brain Sciences, 32, 429–492.

Fedorenko, E., Gibson, E., & Rohde, D. (2006). The nature of working memory capacity in sentence comprehension: Evidence against domain-specific working memory resources. Journal of Memory and Language, 54(4), 541–553.

Fitch, W., Hauser, M., & Chomsky, N. (2005). The evolution of the language faculty: Clarifications and implications. Cognition, 97, 179–210.

Gibson, E. (1998). Linguistic complexity: locality of syntactic dependencies. Cognition, 68, 1–76.

Hamilton, H. W., & Deese, J. (1971). Comprehensibility and subject-verb relations in complex sentences. Journal of Verbal Learning and Verbal Behavior, 10, 163–170.

Harel, D. (1993). Algorithmics: The spirit of computing. Reading: Addison-Wesley.

Hauser, M., Chomsky, N., & Fitch, T. (2002). The faculty of language: What is it, who has it, and how did it evolve? Science, 298, 1569–1579.

Jackendoff, R., & Pinker, S. (2005). The nature of the language faculty and its implications for evolution of language (Reply to Fitch, Hauser, and Chomsky). Cognition, 97, 211–225.

Just, M. A., Carpenter, P. A., & Woolley, J. D. (1982). Paradigms and processes in reading comprehension. Journal of Experimental Psychology: General, 111, 228–238.


King, J., & Just, M. A. (1991). Individual differences in syntactic processing: The role of working memory. Journal of Memory and Language, 30, 580–602.

Kinsella, A. R. (2009). Language evolution and syntactic theory, chap. 4: Language as a recursive system (pp. 112–159). Cambridge: Cambridge University Press.

Luuk, E., & Luuk, H. (2011). The redundancy of recursion and infinity for natural language. Cognitive Processing, 12(1), 1–11.

MacDonald, M., & Christiansen, M. H. (2002). Reassessing working memory: A comment on Just and Carpenter (1992) and Waters and Caplan (1996). Psychological Review, 109, 35–54.

MacWhinney, B. (1977). Starting points. Language, 53, 152–168.

Marks, L. E. (1968). Scaling of grammaticalness of self-embedded English sentences. Journal of Verbal Learning and Verbal Behavior, 7, 965–967.

Miller, G. A., & Isard, S. (1964). Free recall of self-embedded English sentences. Information and Control, 7, 292–303.

O'Regan, J., & Noë, A. (2001). A sensorimotor account of vision and visual consciousness. Behavioral and Brain Sciences, 24(5), 939–973.

Pinker, S., & Jackendoff, R. (2005). The faculty of language: what's special about it? Cognition, 95, 201–236.

Simon, H. A. (1975). Functional equivalence of problem solving skills. Cognitive Psychology, 7, 268–288.

Stolz, W. S. (1967). A study of the ability to decode grammatically novel sentences. Journal of Verbal Learning and Verbal Behavior, 6, 867–873.

Tomalin, M. (2007). Reconsidering recursion in syntactic theory. Lingua, 117, 1784–1800.

Verhagen, A. (2010). What do you think is the proper place of recursion? In H. Van der Hulst (Ed.), Recursion and Human Language (pp. 93–110). Berlin: Mouton de Gruyter.

Welsh, M., & Huizinga, M. (2005). Tower of Hanoi disk-transfer task: Influences of strategy knowledge and learning on performance. Learning and Individual Differences, 15, 283–298.

Welsh, M., & Huizinga, M. (2001). The development and preliminary validation of the tower of


Appendix 1: Internship time-course

Time schedule

Starting date: 01-01-2011

Interruptions:

01-02-2011 – 31-03-2011: following other courses
20-06-2011 – 01-07-2011: CSCA summer school
part-time since 01-09-2011

Final date: 27-01-2011

Plan of work

January: literature review
April: choosing the experimental tasks, implementing them
May-June: running Study 1, analyzing preliminary data
July: writing the first report draft
September-December: running Study 2, analyzing data, completing the report

Experience acquired during the internship

refining research questions
operationalizing hypotheses
implementing experimental tasks (choosing appropriate stimuli, programming)
data analysis (in MATLAB and SPSS)
critical interpretation of the results

Appendix 2: Linguistic stimuli

1. Control sentences

All the brothers and sisters got together
The children next door just opened the fence
The hammer came back and hit him in the thumb
The diesel has fallen into unpopular status
She made an apple pie when the pastor and his wife came over
The men came out and said that was only a snake
She got a dog for her birthday from her family when they lived at the farm
The boy killed himself a few years ago in a grade school
They had everything from cats to hamsters and lizards
The government sold information to private companies about individuals or families
In his classes they were running a bunch of scenes from a comedy
The closely held supermarket chain named her vice president
The administration lacked a comprehensive health-care policy
Last week the parliament displayed unusual political immaturity
Baskets of roses and potted palms adorned his bench
The new rules allowed investors to buy foreign stocks directly
The association first started working on developing this contract last decade
Short-term interest rates rose at the government's regular weekly auction
The company declined to estimate the value of the foreign holding
One worker suffered permanent brain damage from mercury exposure
The push for cleaner fuels increased the attractiveness of natural gas
The remainder of the house leaned precariously against a sturdy oak tree
They opened small gift shops mostly aimed at tourists

2. 2-clause sentences

(a) center-embedded

i. semantically supported

The iceberg that the penguin inhabited sank the ship
The waiter that the comedian amused served the wine
The juice that the child spilled stained the table

ii. semantically neutral

The raccoon that the bear observed crossed the path
The librarian that the scientist liked dominated the conversation
The girl that the dog saw read the newspaper

(b) tail-embedded

i. semantically supported

The hunter shot the lion that devoured the zebra
The match lit the fireplace that warmed the room
The gardener planted the flowers that pleased the model


ii. semantically neutral

The university stored the collection that tempted the physicist
The student visited the nurse that ate the ice-cream
The philosopher greeted the officer that carried the baby

3. 3-clause sentences

(a) center-embedded

i. semantically supported

The bees that the hives that the farmer built housed stung the children
The vase that the maid that the agency hired dropped broke on the floor
The stone that the boy that the club members initiated threw hit the window

ii. semantically neutral

The chef that the waiter that the bus-boy appreciated teased admired good musicians
The frog that the turtle that the spider chased followed stayed on the log
The baker that the butcher that the candlestick-maker paid congratulated borrowed 200 dollars

(b) tail-embedded

i. semantically supported

The critic reviewed the movie that portrayed the architect that designed the tower
The ghost haunted the barber that shaved the doctor that prescribed the medicine
The pharmacist composed the injection that cured the virus that infected the town

ii. semantically neutral

The rats feared the man that avoided the chipmunks that lived on the hill
The receptionist liked the typist that thanked the clerk that teased the boss
The hairdresser saw the boy that discovered the children that noticed something

Appendix 3: Self-paced reading applet source-code

// The program stores sentences to be displayed on the screen.

// Each sentence is shown word-by-word, with the initial state of all words hidden.

// Each word is displayed when a key or a mouse-button is pressed.

// Time between the spacebar presses is recorded and stored in an array list and then exported to a data file.
