The Influence of Learning a Second
Language at a Later Age on the
Complexity and Fluency of the First
Language
LEONIE OOSTRA
Sxxxxxxx
MA thesis, Department of Applied Linguistics,
Faculty of Arts, Rijksuniversiteit Groningen
Supervisors:
Dr. R.G.A. Steinkrauss (supervisor)
Dr. M.C.J. Keijzer (second reader)
15-06-2018
Declaration of authenticity
MA Applied Linguistics - 2017/2018 MA-thesis
Student name: Leonie Oostra
Student number: sxxxxxxx
PLAGIARISM is the presentation by a student of an assignment or piece of work which has in fact
been copied in whole, in part, or in paraphrase from another student's work, or from any other source (e.g. published books or periodicals or material from Internet sites), without due acknowledgement in the text.
TEAMWORK: Students are encouraged to work with each other to develop their generic skills and
increase their knowledge and understanding of the curriculum. Such teamwork includes general discussion and sharing of ideas on the curriculum. All written work must however (without specific authorization to the contrary) be done by individual students. Students are neither permitted to copy any part of another student’s work nor permitted to allow their own work to be copied by other students.
DECLARATION
• I declare that all work submitted for assessment of this MA-thesis is my own work and does not involve plagiarism or teamwork other than that authorised in the general terms above or that authorised and documented for any particular piece of work.
Signed: L.E. Oostra
Table of contents
0. Abstract
1. Introduction
2. Background
2.1 L1 attrition
2.2 Affected language areas
2.3 Ways of investigating attrition
2.4 Statement of purpose
3. Method
3.1 Subject
3.2 Materials and procedure
3.3 Design and analyses
4. Results
4.1 Filled and unfilled pauses
4.2 Lexical diversity
4.3 Speech rate
4.4 Lexical sophistication
4.4.1 Frequency bands
4.4.2 Individual words
5. Discussion
5.1 Filled and unfilled pauses
5.2 Lexical diversity
5.3 Speech rate
6. Conclusion
References
Appendices
Appendix A
Appendix B
0. Abstract
In 1982, Lambert observed that we know a great deal about how languages are learned, but very little about how language skills are forgotten. Since then, many studies have explored first language (L1) attrition. L1 attrition can manifest itself in different areas, such as the lexical or grammatical level, and it can be investigated using different methods, such as formal tasks or the analysis of free speech. However, most studies have only looked at participants who emigrated before the age of 30.
The current case study provides insight into the hitherto understudied area of attrition starting after the age of 30. The participant is PM, a Dutch missionary who went to Peru at the age of 41 and then started to learn Spanish. After 7.5 years, PM returned to the Netherlands and started to preach in Dutch again. Recordings of his Dutch sermons from a period of five years, starting from the moment PM was back in the Netherlands, were analysed for complexity (lexical diversity and sophistication) and fluency (pauses and speech rate). Correlation analyses revealed no significant changes over time in lexical diversity, speech rate or lexical sophistication. Significant changes were, however, observed in the number of pauses: the use of unfilled pauses increased and the use of filled pauses decreased. The pattern in PM’s use of unfilled pauses may be explained by the previous finding that Dutch speakers use more empty pauses than German speakers. Another explanation lies in the placement of the unfilled pauses: as a preacher, PM wants to convey a message to his listeners and uses unfilled pauses to emphasize important parts of his sentences. In conclusion, the results of the present case study tentatively suggest that learning a second language after the age of 30 might not have a great impact on the fluency and complexity of the L1.
1. Introduction
In 1982, Lambert observed that we know a great deal about how languages are learned, but very little about how language skills are forgotten. The loss of language skills in one’s first language is called L1 attrition, and it is linked to the learning of a second language (Schmid, 2011b). The view that L1 attrition begins very early in the process of learning a second language (Schmid, 2011b) differs from the assumption made in earlier literature on attrition, where researchers believed that attrition only becomes apparent in highly advanced L2 speakers who have not used their L1 for a long period of time (Seliger & Vago, 1991). This process of losing L1 skills as a result of learning a new language has been investigated widely in emigrants. In most research, the maximum age at emigration was around 30 years, and the focus was mostly on children and adolescents (e.g. Bergmann, Sprenger, & Schmid, 2015; De Leeuw, Schmid, & Mennen, 2010; Hulsen, 2000; Olshtain & Barzilay, 1991; Silva-Corvalán, 1991; Yilmaz & Schmid, 2012). Many researchers used formal tasks to investigate L1 attrition, such as sentence judgment tasks, verbal fluency tasks and the C-test (e.g. Altenberg, 1991; Ammerlaan, 1996; Schmid & Beers Fägersten, 2010; Schmid & Jarvis, 2014). Lately, the use of formal tasks to investigate first language attrition has been questioned (Schmid & Jarvis, 2014; Yilmaz & Schmid, 2012); as Schmid (2004) stated, formal tasks should at least be used in combination with unguided free speech.
The current study is a case study of a Dutch missionary who went to Peru for 7.5 years. In this context, he had to learn Spanish when he was already 41 years old. In 2012 he returned from Peru and started to preach again in the Netherlands, and thus in his L1. His sermons were audiotaped over a period of five years, and these recordings provided the data for the current study. This paper examines what consequences learning a second language at a later age has for the L1, and how these consequences develop over the period in which the subject is back in his home country. This leads to one central question: “What is the influence of learning a second language at a later age on the fluency and complexity of language use in the L1?” In this way, the current study both answers the call to use more free production data in attrition studies and provides insight into the hitherto understudied area of attrition starting after the age of 30. In order to answer the research question, four-minute fragments of the subject’s sermons were transcribed and analysed for lexical diversity (D), filled and unfilled pauses, lexical sophistication, and speech rate.
In the following sections, I will discuss L1 attrition, affected language areas and the ways of investigating attrition (e.g. formal tasks and free speech). The analyses of the data will include correlation analyses. Finally, the results of these data analyses will be discussed.
2. Background
2.1 L1 attrition
The phenomenon of forgetting language skills, referred to as attrition in the literature, has been widely investigated. Language attrition is the partial or total loss of a first or second language by an individual, where the forgetting is not caused by a medical condition such as a brain injury (Schmid, 2011b). Forgetting language skills in the first language, the L1, is the result of cross-linguistic influence (CLI), which is the influence that the different languages of a bilingual individual have on one another (Schmid, 2011b). The view that L1 attrition begins very early in the process of learning a second language (Schmid, 2011b) differs from the assumption made in earlier literature on attrition, where researchers believed that attrition only becomes apparent in highly advanced L2 speakers who have not used their L1 for a long period of time (Seliger & Vago, 1991). In contrast, first language attrition will be present as soon as a speaker becomes bilingual, because there will be some traffic from the L2 to the L1 (Schmid, 2011b) already in the very beginning of the learning
process of a second language. This means that L1 attrition is dependent on two aspects: the development of and exposure to the L2, and the decreasing presence and use of the L1 (Schmid & Beers Fägersten, 2010).
2.2 Affected language areas
L1 attrition can be present in various aspects of language, for example at the lexical or grammatical level. Altenberg (1991) investigated attrition in the lexical, morphological and syntactic areas. The participants were a married couple, native speakers of German who had at that point lived in the US for over forty years. They were twenty-five and twenty-nine years old when they migrated to the US, and they still spoke German to one another on a daily basis. Using three tasks, Altenberg (1991) investigated which aspects of grammar were most sensitive to attrition. An untimed syntactic judgment task was used to examine the grammatical rules and forms of the participants’ L1. The participants had to judge the grammaticality of four types of sentences. The sentences to be judged were in German and in English, and could be grammatical in one language, in both languages or in neither language. Two monolingual English speakers rated the English sentences to establish the English norms. German speakers and linguists were asked to judge the German sentences in order to establish German norms for all tasks. Grammatical sentences were rated as better than the ungrammatical ones. Since the married couple still used German on a daily basis, syntax was expected to remain largely intact (Altenberg, 1991). Results showed that the participants indeed knew quite well what was grammatical in the L1. Nevertheless, ungrammatical sentences were harder for them to judge. Ungrammatical German sentences that were grammatical in English were rated somewhat better than German sentences that were ungrammatical in both languages. This indicates that the syntax of the L2 had an influence on the
syntax of the L1 (Altenberg, 1991). In contrast to the attriters, the monolingual English speakers in the control group did not show this pattern. The second task, an untimed judgment task investigating the lexical area, consisted of sentences with the verbs brechen (to break) and nehmen (to take), where half of the sentences were acceptable. This task was used to investigate whether idiomatic verbs in English had an influence on the participants’ L1. Results showed that the participants accepted a few unacceptable German sentences, which suggests that the use of verbs is subject to attrition (Altenberg, 1991). However, the results for the two verbs differed: acceptance was greater for brechen than for nehmen. Altenberg (1991) suggests a possible explanation: there is more phonetic similarity between break and brechen than between take and nehmen. The attriting subjects were also less accurate in Altenberg’s (1991) third and last task, an untimed fill-in task. The participants had to fill in the gender and the plural form of low and high frequency words, which tapped into the lexicon as well as morphology. The lists consisted of nouns with predictable and unpredictable plural forms. Results showed that the participants made clearly fewer errors in filling in the gender than the plural forms, despite the fact that gender was never predictable (Altenberg, 1991). For the plural forms, more errors were made in the low frequency word list than in the high frequency word list. In addition, unpredictable nouns were more error-prone than predictable nouns, whose plural form could be derived from rules. Thus, low frequency words seem to be more vulnerable to attrition of lexical information than high frequency words (Altenberg, 1991). Unpredictable nouns also seem to be more vulnerable to loss of associated information than predictable nouns. When a noun has an unpredictable plural form, this knowledge is part of the lexical representation, while predictable plural forms are derived from morphological rules, according to Altenberg (1991). Therefore, Altenberg (1991) suggests that lexical information might be more sensitive to attrition than morphological rules
are. Like Altenberg (1991), Ammerlaan (1996) found that attrition processes (in lexical access and retrieval) are influenced by similarities between the L1 and the L2. In his Ph.D. study, he investigated lexical accessibility in native Dutch speakers living in Australia who had emigrated after the age of six (Ammerlaan, 1996). The participants had to perform a cloze test and a verbal fluency task. Results showed that the attriters were slower and less accurate than the controls. Thus, both Altenberg (1991) and Ammerlaan (1996) found that attriters had problems with lexical retrieval. Indeed, lexical access seems to be the most vulnerable area, and the one most quickly affected, when adult bilinguals live in an L2 environment (Yilmaz & Schmid, 2012). However, problems in lexical access and retrieval can be overcome, since they are not structural (Ammerlaan, 1996). This point was also made by Hulsen (2000), who argued that difficulties in the lexicon are problems of retrieval processing, and that there is no actual loss.
Altenberg (1991) was not the only one to investigate which areas of language are subject to first language attrition. Attrition in the lexicon has been studied throughout the years (Ammerlaan, 1996; Hulsen, 2000; Olshtain & Barzilay, 1991; Schmid & Beers Fägersten, 2010; Schmid & Jarvis, 2014; Yilmaz & Schmid, 2012). Although the lexicon is seen as the area that is most prone to early attrition processes (Yilmaz & Schmid, 2012), it is typically not the whole lexicon that is affected (Hulsen, 2000). Hulsen (2000) continued the investigation that Ammerlaan started in 1996, and she found a difference between productive and receptive competences in lexical access in three generations of Dutch migrants living in New Zealand. Her data showed that production was affected across all generations, while comprehension hardly changed (Hulsen, 2000). That lexical production is affected was also reported by Olshtain and Barzilay (1991). In contrast to the studies that made use of formal tasks, Olshtain and Barzilay (1991) used a free speech task, a so-called story telling task, which was performed by American attriters living in Israel. The subjects did not seem to struggle during the retelling, as they behaved like competent
speakers of English (Olshtain & Barzilay, 1991). However, close inspection of specific lexical choices made clear that the attriters had difficulties with so-called keywords. The keywords could not be left out of the retelling, and as a consequence, the attriters tried to convey the meaning of those keywords in other ways, for example by paraphrasing, using circumlocution, or systematic word retrieval processes. The controls in this study, Americans who had not emigrated, had no difficulties in naming these keywords and therefore did not come up with a range of substitutions (Olshtain & Barzilay, 1991). This study shows that there is often some reduction of lexical accessibility in attriting subjects (Olshtain & Barzilay, 1991). A more recent study tapping into lexical access also revealed that emigrants (the attriters) were less productive on verbal fluency tasks than their controls (Schmid & Jarvis, 2014). Schmid and Jarvis (2014) used both formal tasks and free speech to investigate the lexicon. A total of 159 native speakers of German participated: 53 had emigrated to Anglophone Canada, 53 had emigrated to the Netherlands and 53 had lived in Germany their whole lives. The mean age at emigration was 26 years for the group who moved to Canada and 29 years for the group who moved to the Netherlands. Data were analysed for lexical diversity and lexical sophistication, among other measures. Results of the verbal fluency tasks showed that the attriters were outperformed by the monolinguals. In other words, they produced fewer exemplars within the time span of one minute than their non-migrant peers (Schmid & Jarvis, 2014). The authors gave two possible explanations for this finding. The first is that the attriters have to suppress the words from the L2 lexicon in order to come up with the correct L1 word. The second is that attritional processes are at play, which result in problems in the lexical access of words in the L1. Next, the free speech samples were analysed for the frequency of the words the participants used. Results showed that the group of attriters overused the high-frequency words and underused the low-frequency items. Schmid and Jarvis (2014) could not pinpoint what the actual cause
was, since that is hard to establish on the basis of their data. In their study it was also found that the attriters did not differ on D as a measure of lexical diversity from the non-attriters (Schmid & Jarvis, 2014).
Whereas lexical access is most vulnerable to attrition, grammar seems to be more stable in people who emigrated after puberty (Schmid, 2010). However, as was clear from Altenberg’s (1991) study, grammar and syntax are not completely free from attritional processes, and more research points in this direction (e.g. Bergmann, Meulman, Stowe, Sprenger, & Schmid, 2015; Håkansson, 1995; Silva-Corvalán, 1991). For example, Silva-Corvalán (1991) investigated the morphology of verbs in fourteen Spanish-English bilinguals. Participants were divided into three groups: 1) a group (n=6) of speakers born in Mexico who moved to the United States (after the age of eleven), 2) a group (n=4) of Mexico-born speakers who moved to the US before age six, or who were born in Los Angeles; and 3) a group of speakers (n=4) who were also born in LA, but had at least one parent fitting group 2’s criteria. The author had individual conversations in Spanish with the fourteen participants, which provided the data. The data revealed that later generations (i.e. the participants who immigrated before the age of six and the participants who were born in LA) used less verbal morphology, while the first generation, i.e. the participants who immigrated after the age of six, still showed a fully developed variety of spoken Spanish (Silva-Corvalán, 1991). Håkansson (1995) also studied morphology in attritional processes, in this case in Swedish; the subjects were five bilingual L1 Swedish expatriate students, who were around 20 years old when they returned to Sweden. Data for this study included spoken as well as written language. Håkansson (1995) investigated various aspects of Swedish grammar and found that the students had difficulties with noun phrase morphology (agreement), but not with the verb-second rule (word order). This result indicates that attritional processes affect some aspects of grammar more than others (Håkansson, 1995).
Another study of morphology, but one using a completely different technique, is that of Bergmann, Meulman, et al. (2015). This was the first study to use electroencephalography (EEG) in first language attrition research. In this study, the processing of morphosyntactic violations was investigated in a group of attriters whose L1 was German and whose L2 was English, living in the US or in Anglophone Canada (Bergmann, Meulman, et al., 2015). The control group consisted of L1 speakers of German living in Germany. All subjects were presented with stimuli in an event-related potential (ERP) experiment, where the stimuli were verb form combinations and determiner-noun combinations marked for grammatical gender. The EEG results revealed that the attriter group reacted to violations of gender agreement in the same way as the control group (Bergmann, Meulman, et al., 2015). Violations of verb forms evoked the same reaction in both groups. However, the attriters showed an extra effect for the violations of verb form combinations, a pattern that had also been seen in L1 speakers of English before. This points to an influence of the English language on the L1 (German) (Bergmann, Meulman, et al., 2015).
Phonology, too, seems to be a more stable area in the attrition process in emigrants (Schmid, 2010). However, differences between attriters and non-attriters can be seen in this aspect of language (e.g. Celata & Cancila, 2010; Chang, 2012; De Leeuw et al., 2010), as well as at the phonetic level (e.g. Mayr, Price, & Mennen, 2012; Mennen, 2004). Attriters have been found to be more likely to be perceived as non-native speakers than non-attriters (De Leeuw et al., 2010). Chang (2012) found that even after just six weeks of learning a second language there is an effect on L1 speech. However, this was found through speech measurements, and these effects are not necessarily audible. Not only the productive side of language can be subject to attritional processes; the perceptive side can be affected as well (Celata & Cancila, 2010). Celata and Cancila (2010) investigated two groups of Lucchese-speaking immigrants who lived in Los Angeles. The first group (the first generation) consisted of participants born in
Lucchesia who had migrated to LA between the ages of 17 and 28. The second group (the second generation) consisted of participants who were born in the USA and had Lucchese parents. There was also a group of native Lucchese speakers who had not migrated. Outcomes showed that the monolingual, non-attriting group performed better on the phoneme identification tasks than the attriters, and the first generation did better on the tasks than the second generation (Celata & Cancila, 2010).
Phonetics has also been included in first language attrition research. Mayr and colleagues (2012) performed a study involving Dutch monozygotic twins. One of the twins had emigrated to the United Kingdom, while the other stayed in the Netherlands. Both participants’ speech was investigated through word list reading. Changes were detected in the voice onset time (VOT) and vowels in the emigrated speaker’s L1 (Mayr et al., 2012). As in Chang’s (2012) study, assimilation towards the L2 system was detectable in the L1 speech. This phonetic assimilation towards the L2 was also seen in a study investigating suprasegmentals in Dutch-Greek bilinguals, and Dutch and Dutch-Greek control groups (Mennen, 2004).
Another direction of research is investigating the (dis)fluency of L1 attriters through analysing free speech (Bergmann, Sprenger, et al., 2015; Schmid & Beers Fägersten, 2010). Disfluencies consist of several hesitation markers, for example pauses in the speech stream. Filled pauses are known to serve semantic functions, which are linked to discourse organization, emphasis or information structure and the distribution of filled pauses therefore seems to be language-specific (Schmid & Beers Fägersten, 2010). Unfilled pauses, retractions and repetitions are other examples of hesitation markers, but they serve another function than filled pauses. These hesitation markers are not language-specific, and rather seem to serve the function of resolving or signalling a problem on the cognitive level, such as the retrieval of words, according to Schmid and Beers Fägersten (2010). In their study, 245 monolingual and bilingual speakers of German and Dutch were investigated. They were divided over five groups. The first group consisted of native German speakers living in Canada, and the second group
consisted of native German speakers living in the Netherlands. The third group served as a control group, consisting of native German speakers living in Germany. The fourth group consisted of native Dutch speakers living in Canada. The last group, which contained native Dutch speakers, also served as a control group. All participants completed a C-test and a verbal fluency task in order to establish whether there were any signs of attrition in the participants who had migrated to Canada or the Netherlands. Group results of these formal tasks showed that the migrated participants were significantly outperformed by the controls. The participants then did a film retelling. All narratives were recorded and transcribed. In the transcripts, filled pauses, unfilled pauses, repetitions and retractions were coded as disfluencies. Filled pauses were coded regardless of how they were pronounced. Unfilled pauses were coded when a break was heard in the intonational contour, or when there was any other sign of interruption of the speech stream. After all disfluency markers had been counted, results showed that the groups of migrants used significantly fewer filled pauses and more unfilled (or empty) pauses than the control groups (Schmid & Beers Fägersten, 2010). This outcome suggests that (un)filled pauses are a good measure of attrition in migrants’ first language. Results also showed that the L1 Dutch speakers (attriters and controls) had more empty pauses in their speech than the L1 German speakers. In addition, the hesitation markers with a cognitive function (unfilled pauses, retractions, repetitions) were more present in the speech of attriters than in the speech of controls. The differences in the filled pause patterns, which were assumed to serve a semantic function, may have been the result of interlanguage effects (Schmid & Beers Fägersten, 2010).
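The counting of coded disfluency markers described above might be sketched as follows. This is only an illustration under stated assumptions: the annotation codes (`<fp>`, `<up>`, `<rep>`, `<ret>`), the function name and the normalisation per 100 words are hypothetical choices, not the transcription conventions of Schmid and Beers Fägersten (2010), and the example utterance is invented.

```python
from collections import Counter

# Hypothetical annotation codes; the studies discussed here do not specify
# a transcription format in the text above.
MARKERS = {
    "<fp>": "filled_pause",
    "<up>": "unfilled_pause",
    "<rep>": "repetition",
    "<ret>": "retraction",
}


def disfluency_rates(transcript, per=100):
    """Count each coded marker and express it per `per` words.

    Marker codes are excluded from the word count.
    """
    tokens = transcript.split()
    counts = Counter(MARKERS[t] for t in tokens if t in MARKERS)
    n_words = sum(1 for t in tokens if t not in MARKERS)
    if n_words == 0:
        raise ValueError("transcript contains no words")
    return {name: counts[name] * per / n_words for name in MARKERS.values()}


# Invented example utterance with 10 words and 5 coded markers.
example = "ik <fp> ging naar <up> de kerk <rep> en daarna <up> weer naar <fp> huis"
rates = disfluency_rates(example)
```

Normalising by the number of words makes samples of different lengths comparable; whether a given study normalises per word, per syllable or per minute is a design choice that should be checked against its method section.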
To gain more insight into disfluencies in attriters’ speech, Bergmann and colleagues (2015) also investigated the spontaneous speech of different speakers. The speakers were divided into three groups (each n=20). Group 1, the control group, consisted of monolingual German speakers. Group 2, the learners, were native English speakers who moved to Germany,
learning German as their L2 and group 3, the attriters, were native German speakers who had moved to North America (mean age of emigration was almost 28 years). Each participant watched a scene from a movie and right after watching it, they retold the story which was recorded and transcribed by native German speakers. The participants’ speech was analysed on speech rate (in syllables per minute) and lexical diversity, using type-token ratio (TTR) and D. Besides looking at speech rate and lexical diversity, Bergmann and colleagues (2015) also tapped into (dis)fluency markers, operationalized as filled and unfilled pauses, retractions and repetitions. Concerning TTR, results showed that the lexical diversity in the attriters’ speech is lower than in the learners’ and controls’ speech (Bergmann, Sprenger, et al., 2015). However, measuring lexical diversity with D did not show this difference (Bergmann, Sprenger, et al., 2015). This was mostly due to the difference in sample size which impacts TTR but not D, and therefore the authors disregarded TTR and assumed that there were no differences in lexical diversity between the three groups. Speech rate analysis revealed that the attriters did not differ from the controls. The learners did differ from the controls and the attriters, where the learners were a little slower. Turning to the results for disfluency, data showed that the attriters had the most empty pauses in their speech, and together with the learners, the attriters also produced the most filled pauses (Bergmann, Sprenger, et al., 2015). In addition, the attriting group also produced the most repetitions and self-corrections. Looking at all these outcomes, Bergmann and colleagues (2015) concluded that the attriters as well as the learners have more disfluencies in their speech than the control group. This means that the attriters are distinguishable from non-attriters on these measures.
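The sensitivity of TTR to sample size mentioned above can be illustrated with a small sketch. The function names and the toy sample are assumptions made for illustration only. The measure D is not implemented here, since it relies on a curve-fitting procedure over repeated random subsamples that goes beyond a short example.

```python
def type_token_ratio(tokens):
    """Return the number of distinct word forms divided by the number of tokens."""
    if not tokens:
        raise ValueError("cannot compute TTR of an empty sample")
    return len(set(tokens)) / len(tokens)


def ttr_by_prefix(tokens, sizes):
    """Compute TTR on the first n tokens for each n in sizes that fits the sample."""
    return {n: type_token_ratio(tokens[:n]) for n in sizes if n <= len(tokens)}


# Invented toy sample: a 12-token passage repeated four times, so that
# tokens after the first 12 add no new types.
passage = "the man walks down the street and the man sees a dog".split()
sample = passage * 4

ratios = ttr_by_prefix(sample, [12, 24, 48])
```

In this deliberately extreme toy case the vocabulary stays the same while the text gets longer, so TTR halves each time the sample doubles. Real speech shows the same tendency in a milder form, which is why TTR values from samples of unequal length are hard to compare.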
2.3 Ways of investigating attrition
As the review of the literature has shown, there are various options when it comes to investigating first language attrition. This can be done through formal (controlled) tasks, such
as a verbal fluency task and the C-test, or through free speech, such as a story retelling or an interview. The choice for what kind of task to use to investigate the participants’ language is based on the information a researcher wants to have and what aspects of language the researcher wants to investigate.
Formal tasks are widely used in research into L1 attrition. For instance, morphology and syntax in attriters have been investigated through judgment tasks (Altenberg, 1991; Håkansson, 1995). Well-known tasks for researching attrition in the lexical area are the C-test and the verbal fluency task. The C-test consists of several texts, for example five. Each text has a (variable) number of gaps, in which only the first few letters of the word are visible. The participants have to fill in the correct words and are given a certain amount of time for each text. The verbal fluency task is very popular in first language attrition research because it is very easy to design, takes very little time and does not need stimuli (Schmid & Jarvis, 2014). During this task, the participants have to name as many items as possible from a given category, such as ‘animals’ or ‘fruits and vegetables’, in one minute. An example of a lexical task that takes more effort is the picture naming task, in which the response time for naming pictures is measured.
Recently, the analysis of free speech has increasingly been used in attrition research. Free speech can be elicited in various ways, for example through interviews (about the participant’s personal life) or through the (re)telling of a story. A frequently used stimulus for the retelling task is a 10-minute fragment of a Charlie Chaplin movie (Bergmann, Sprenger, et al., 2015; Schmid, 2011a; Schmid & Beers Fägersten, 2010; Schmid & Jarvis, 2014). After watching the fragment, the participants retell what they saw while being audiotaped. These kinds of data are then used to measure lexical diversity and sophistication, disfluencies, foreign accent, morphosyntactic errors and speech rate.
Lately, the use of formal tasks to investigate first language attrition has been questioned (Schmid & Jarvis, 2014; Yilmaz & Schmid, 2012); as Schmid (2004) stated, formal tasks should at least be used in combination with unguided free speech. Yilmaz and Schmid (2012) investigated the accessibility of L1 lexical knowledge in Turkish-Dutch bilinguals living in the Netherlands. The mean age at emigration for these bilinguals was 21 years. Lexical accessibility was tested through a free speech task and a lexical naming task. Data from the lexical naming task revealed that the attriters (the late bilinguals) performed like the monolingual control group (Yilmaz & Schmid, 2012). This finding contrasts with the data from the free speech task, in which the attriters were significantly less fluent than the controls and in addition showed less lexical diversity in their speech (Yilmaz & Schmid, 2012). The result that the attriters were outperformed by the controls was thus only visible in free speech, not in the experimental task. A possible explanation for this difference is that in formal (controlled) tasks the participants can fully focus on the process of lexical retrieval, while in free speech many different linguistic processes are going on at the same time, resulting in greater difficulties (Schmid & Jarvis, 2014). Analysing free speech also gives researchers the chance to measure a wide variety of language skills, instead of only looking at one measure in formal tasks (Schmid & Jarvis, 2014). As mentioned before, Schmid and Jarvis (2014) investigated German attriters through formal tasks and free speech. The subjects had to complete two verbal fluency tasks, one story retelling (Charlie Chaplin) and a free speech interview about their biography and history.
The group results of the fluency tasks showed that the attriters were outperformed by the monolinguals, but it was not possible to identify individual speakers as attriters or not, so this task did not have predictive power in this study (Schmid & Jarvis, 2014). Therefore, it has been questioned whether this is a good task to use in attrition research. The free speech samples were analysed on the frequency of the words the participants used. The
data from all participants from the retellings served as a corpus. The researchers divided all lexical lemmas into five frequency bands, each representing 20% of all tokens. Results showed that the group of attriters overused the high-frequency words and underused the low-frequency items. Schmid and Jarvis (2014) also analysed which task predicts best if a speaker is an attriter or not. That analysis showed that the most predictive task in classifying a speaker as an attriter or a non-attriter is the language use in the interview (free speech) (Schmid & Jarvis, 2014). This “underscores the fact that attrition affects the skill that is most characteristic of what native speakers know how to do: use language in free speech” (Schmid & Jarvis, 2014, p. 746).
2.4 Statement of purpose
As the above review of the literature has shown, virtually all studies investigated groups of attriters with a mean age at emigration (AoE) below 30 years. This entails that very few individuals with an AoE of over 30 or even 40 years have been studied. The current study taps into the understudied area of L1 attrition when the L2 is learned after the age of 40. Furthermore, the studies reviewed above examined L1 attrition in attriters still living in the new environment, not after the subjects had returned to their home country. This study, by contrast, investigates a subject who is no longer in the L2 environment but is back in his L1 environment. In addition, most studies were cross-sectional group studies rather than longitudinal case studies. This is where the current investigation differs from earlier studies, as this study is a longitudinal case study. Thus, what has not been investigated much is the effect of learning a second language at a later age on the L1, and what happens to the first language when a speaker is back in the L1 environment and uses the L1 more than the L2. This is especially interesting in light of Ammerlaan (1996), who stated that lexical access and retrieval difficulties can be overcome. Many studies included attriting adults up to the age of around 30 years old (for a quick overview see Bylund, 2009), but not many studies included
adult learners of an L2 later in life. It might be interesting to investigate what influence learning an L2 later in life has, since the language system of the L1 has been settled for a longer time and may be quite stable by then (Bylund, 2009).
The current study is a longitudinal study that investigates the first language of a Dutch pastor (hereafter PM), who went to Peru for 7.5 years as a missionary. He was therefore ‘forced’ to learn Spanish at a later age: he was 41 years old when he started to learn Spanish. In 2012 PM came back to the Netherlands, started to preach again in Dutch, and stopped using Spanish on a regular basis. This study aims to answer the call to use free speech in attrition research: once PM started to preach again in Dutch, his sermons were audiotaped, providing a large set of authentic free speech data for the current study. In addition, earlier studies showed that free speech has more predictive power for classifying speakers as attriters (Schmid & Beers Fägersten, 2010; Schmid & Jarvis, 2014; Yilmaz & Schmid, 2012). In this study, four-minute fragments of 126 sermons from a period of five years are transcribed using CLAN (MacWhinney, 2000). The data are analysed on five measures: filled pauses, unfilled pauses, lexical diversity (D), lexical sophistication (based on a corpus) and speech rate (syllables per second).
Earlier studies tapping into disfluencies in attriters’ and non-attriters’ speech showed that pauses, repetitions and retractions are sensitive to attritional processes, which means that these measures are good in distinguishing attriters from non-attriters (Bergmann, Sprenger, et al., 2015; Schmid & Beers Fägersten, 2010). In this study, fluency is operationalized as number of filled pauses and number of unfilled pauses. Due to time restrictions it is not possible to also look at repetitions and retractions as measures of fluency. Research concerning lexical diversity has been contradictory. While some studies showed less lexical diversity in attriters’ speech than in non-attriters (Schmid, 2002; Yilmaz & Schmid, 2012), other studies did not find differences between the groups (Bergmann, Sprenger, et al., 2015). These contradictory
findings warrant further study. Therefore, lexical diversity will be investigated in the current study. Turning to lexical sophistication, this can be investigated by comparing one’s own data to an external corpus, or by establishing a corpus that combines all data from one’s own research (Schmid & Jarvis, 2014; Yilmaz & Schmid, 2012). This study makes use of the latter possibility, which makes it possible to investigate whether PM used, for example, more low-frequency words in his later sermons. Lastly, Bergmann and colleagues (2015) compared the speech rate of attriters to that of non-attriters, but did not find a significant difference. However, when the recordings were listened to for the first time, a slight increase in speech rate towards the end of the recordings was noticeable. For this reason, speech rate is included in the analysis.
The central research question is: “What is the influence of learning a second language at a later age on the fluency and complexity of language use in the L1?” The sub-question to be answered is how the L1 develops over time. The hypothesis is that the L1 of PM is indeed subject to attritional processes, which may become visible in the development over the period of five years. Difficulties in lexical access and retrieval may be overcome (Ammerlaan, 1996), and based on this and on earlier findings concerning (dis)fluency (Bergmann, Sprenger, et al., 2015; Schmid & Beers Fägersten, 2010), the hypothesis is that both filled and unfilled pauses will become less frequent over time. Regarding lexical diversity, findings in earlier research have been contradictory (Bergmann, Sprenger, et al., 2015; Schmid, 2002; Schmid & Jarvis, 2014; Yilmaz & Schmid, 2012). Since this study uses the same measure for lexical diversity as Bergmann and colleagues (2015) and Schmid and Jarvis (2014), namely D, it is predicted that PM’s speech will not undergo changes in lexical diversity during the five years. However, some earlier studies have shown differences in lexical diversity between attriters and non-attriters (Schmid, 2002; Yilmaz & Schmid, 2012), so it is possible that changes in PM’s speech will be found in the present investigation. Turning to
lexical sophistication, it is assumed that less frequent words are more advanced and more difficult, and that more proficient speakers tend to use these words more often than less proficient speakers (Yilmaz & Schmid, 2012). The opposite is expected for more frequent words. The hypothesis concerning lexical sophistication is therefore that PM will use more low-frequency words towards the end of the period of investigation, i.e. words that appear only a few times in the corpus will be found mostly in the later transcripts. In addition, the use of very frequent words is expected to decrease over time. Although speech rate has not been found to distinguish attriters from non-attriters in particular, it is included in this case study. Speaking the L1 might come more easily in the L1 environment, so speech rate is expected to increase as well. However, it is possible that no changes in speech rate will be found, given earlier findings (Bergmann, Sprenger, et al., 2015).
3. Method
This case study investigates the development over a five-year period of the first language of PM, a Dutch missionary who went to Peru for 7.5 years. During the five years after PM’s return to the Netherlands, his sermons were recorded, and these sermons serve as data for this study. The sermons are analysed on the following measures: filled pauses, unfilled (empty) pauses, lexical diversity, lexical sophistication and speech rate. Since this is a longitudinal case study, the aforementioned variables are dependent on the length of time PM has been back in the Netherlands.
3.1 Subject
This study involves one male participant, making it a case study. The subject of this study is PM, a 53-year-old Dutch missionary. In 2005, he was sent, together with his family, as a missionary to preach in Peru. He was therefore forced to learn Spanish at a later age (41 years
old). At first, he preached in English, but after a while his knowledge of Spanish was good enough to preach in Spanish. During his time in Peru, PM spoke Dutch only at home with his family and when contacting people in the Netherlands. Gradually, however, the family started to notice changes in the language they spoke to one another. Spanish and English increasingly became part of the languages spoken at home. Sometimes, all three languages were even combined into one sentence. After a period of 7.5 years (in 2012), PM came back to the Netherlands together with his family, and he immediately started to preach again in Dutch. From that time on, the main language spoken by him has been Dutch again. He now only speaks Spanish when talking to native Spanish speakers.
3.2 Materials and procedure
The present investigation is based on 126 sermons PM preached, with an interval of around two weeks between sermons, which provides a large set of authentic data. All sermons were audio-recorded between July 2012 and July 2017. These data can be classified as free speech, since there are no instructions or limitations as in formal tasks. PM did prepare his sermons and made notes to some extent, which he brought to the pulpit, but he did not stick to the literal wording of it and he did ‘improvise’. This dataset includes sermons with returning topics, such as humility, marriage, prayer and character.
Sermons lasted for 45 to 60 minutes. Fragments beginning at ±15 minutes into the sermon were selected in order to avoid disfluencies occurring while PM was easing into the sermon. Then, a fragment of about four minutes from each sermon was transcribed orthographically in CHAT format, using CLAN (MacWhinney, 2000). Each fragment started at the beginning of a sentence and ended roughly four minutes later, in such a way that the last utterance was not cut off. The transcripts included the coding of the following measures:
Filled pauses. Filled pauses were coded as eh, no matter what the pronunciation of the filled pause was.
(1) terwijl Saul sliep ging hij eh naar Saul toe
while Saul was sleeping, he went uh to Saul
Unfilled pauses. Unfilled pauses were coded as (.) in the transcripts. Due to time restrictions it was not possible to measure every pause acoustically. Therefore, the author marked a pause when there was a break in the speech stream and/or intonation. The marked pauses were mostly pauses longer than 300 milliseconds when checked in the recording.
(2) wij kunnen niet (.) misbruik maken van (.) Gods genade
we can’t (.) take advantage of (.) God’s grace
After all fragments were transcribed, the number of filled and unfilled pauses were measured using the freq-command. In addition, lexical diversity, lexical sophistication and speech rate were measured.
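The counting step can be illustrated with a minimal Python sketch. This is not the CLAN freq command itself; it only assumes, as described above, that filled pauses are transcribed as the token eh and unfilled pauses as the token (.). The function name count_pauses is hypothetical.

```python
def count_pauses(utterance):
    """Return (filled, unfilled) pause counts for one transcribed utterance.

    Assumes the coding described in the text: 'eh' marks a filled pause
    and '(.)' marks an unfilled pause, each as a separate token.
    """
    tokens = utterance.split()
    filled = sum(1 for token in tokens if token == "eh")
    unfilled = sum(1 for token in tokens if token == "(.)")
    return filled, unfilled


# Examples (1) and (2) from the transcripts.
examples = [
    "terwijl Saul sliep ging hij eh naar Saul toe",
    "wij kunnen niet (.) misbruik maken van (.) Gods genade",
]
for utterance in examples:
    print(count_pauses(utterance))
```

Summing these counts over all utterances of a fragment would give the per-transcript totals used in the analyses.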
Lexical diversity. For each transcript lexical diversity was measured. In CLAN, a morphology-tier was added to identify the part-of-speech of each word. CLAN partly does this automatically. In some cases, there is more than one class a word can belong to, for example ‘zijn’ (are) as a verb, or ‘zijn’ (his) as a possessive. These unclear cases were disambiguated manually. After this procedure was completed, lexical diversity could be measured. This was done by counting content words (i.e., verbs, nouns, adjectives). Function words were excluded, since these words are used for grammatical reasons and therefore reoccur often and do not add anything to lexical diversity, understood as the range of words a speaker uses. Type Token Ratio (TTR) is a commonly used measure for lexical diversity. However, TTR is strongly influenced by text length (Skehan, 2016), and is therefore unreliable when used for texts consisting of more than 300 words. Since each transcript contains more than 300 words, D was chosen to measure lexical diversity. This is a more advanced way of measuring the ratio of types and tokens. D is
measured in the following way (after McCarthy & Jarvis, 2010): 100 random samples of 35 tokens are taken, and for each of these samples TTR is calculated. Then the mean TTR of all the samples together is saved. What follows is the same procedure for 36 to 50 tokens, and this is done three times. After running this command, CLAN gives the optimum average of D. A higher value expresses more lexical diversity.
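The sampling procedure described above can be sketched in Python as follows. This is an illustration under stated assumptions, not CLAN's implementation: it performs a single pass instead of three, and it assumes that D is obtained by fitting the curve TTR = (D/N)(√(1 + 2N/D) − 1) to the mean TTR values, here with a simple grid search. The function names and the grid of candidate values are hypothetical.

```python
import math
import random


def mean_ttr(tokens, sample_size, n_samples=100, rng=None):
    """Mean type-token ratio over random samples of a fixed size."""
    if rng is None:
        rng = random.Random(0)
    total = 0.0
    for _ in range(n_samples):
        sample = rng.sample(tokens, sample_size)  # drawn without replacement
        total += len(set(sample)) / sample_size
    return total / n_samples


def model_ttr(d, n):
    """Assumed theoretical TTR for diversity parameter d and sample size n."""
    return (d / n) * (math.sqrt(1 + 2 * n / d) - 1)


def estimate_d(tokens, sizes=range(35, 51), rng=None):
    """Grid-search the d whose curve best fits the empirical mean TTRs."""
    if rng is None:
        rng = random.Random(0)
    curve = {n: mean_ttr(tokens, n, rng=rng) for n in sizes}
    candidates = [x / 10 for x in range(10, 2001)]  # 1.0 to 200.0 (assumed)

    def squared_error(d):
        return sum((model_ttr(d, n) - ttr) ** 2 for n, ttr in curve.items())

    return min(candidates, key=squared_error)
```

Applied to a list of content-word tokens from one transcript, estimate_d returns a higher value for a more varied vocabulary, which matches the interpretation given above.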
Lexical sophistication. In order to measure lexical sophistication, one needs a corpus to determine which words are frequent overall and which are not. By way of proxy, infrequent words are considered to be more sophisticated. Following Yilmaz and Schmid (2012) and Schmid and Jarvis (2014), all transcripts were put together and served as the comparison corpus for this study. The corpus was used to inspect whether there were any detectable changes in PM’s word use throughout the five years. To this end, the corpus was created in the following steps. First, a list of all words with their frequency of occurrence (tokens) was made, ordered from highest to lowest. Secondly, the total number of tokens was divided by five, because five frequency bands had to be created, each representing 20% of all tokens. This means that, for example, the words in the first (most frequent) band should have a combined token frequency of a fifth of all tokens. For the first frequency band, only a few words already added up to a fifth of all tokens because those words have very high frequencies. For the last frequency band, a large number of words with very low frequencies had to be selected in order to represent 20% of all tokens. Then, per transcript and frequency band, the number of occurrences of the words in that frequency band was calculated, resulting in one coverage value per band for each transcript, each around 20%. In this way possible changes in any of the frequency bands throughout the transcripts could be mapped. It is assumed that basic, high-frequency words are used more by less proficient speakers, and that lower-frequency words, which are more difficult and advanced, are used more by proficient speakers of a language (Yilmaz & Schmid, 2012).
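The band construction described above can be sketched in Python. This is a minimal illustration, not the procedure actually used: the function names are hypothetical, and the rule for a word that straddles a band boundary (it is assigned by the cumulative share of tokens that precede it) is an assumption, since the source does not state how such cases were handled.

```python
from collections import Counter


def assign_bands(transcripts, n_bands=5):
    """Map each word to a band 1..n_bands, each covering roughly 1/n_bands of tokens.

    `transcripts` is a list of token lists; together they form the corpus.
    """
    counts = Counter(word for transcript in transcripts for word in transcript)
    total = sum(counts.values())
    band_of = {}
    cumulative = 0
    for word, freq in counts.most_common():  # highest frequency first
        # Assumed rule: the band is set by the share of tokens before this word.
        band_of[word] = min(n_bands, int(cumulative / total * n_bands) + 1)
        cumulative += freq
    return band_of


def coverage(transcript, band_of, n_bands=5):
    """Percentage of a transcript's tokens that fall in each band."""
    hits = Counter(band_of[word] for word in transcript)
    return [100 * hits[band] / len(transcript) for band in range(1, n_bands + 1)]
```

Calling coverage for every transcript yields one percentage per band and transcript, which is the quantity that is subsequently correlated with time.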
It is unlikely that a clear trend in the frequency bands described above will be found over time, because the values will always fluctuate closely around 20%. Therefore, the data were also inspected in another way, namely by smoothing the raw data. The technique used is called a moving average (Van Geert & Van Dijk, 2002). With this technique, the average of two datapoints is calculated, and this is repeated for all datapoints. In this way, the data become smoother and it is easier to capture general tendencies.
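A moving average with a window of two datapoints can be sketched in Python as follows. The sketch assumes that each smoothed value is the mean of a datapoint and its successor; how the first or last datapoint was treated is not stated in the source, so the shorter output series is an artefact of this illustration. The function name is hypothetical.

```python
def moving_average(values, window=2):
    """Average each run of `window` consecutive datapoints.

    The result has len(values) - window + 1 entries; edge handling
    is an assumption of this sketch.
    """
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Illustrative coverage values fluctuating around 20%.
print(moving_average([18, 22, 20, 24]))
```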
Speech rate. For every four-minute fragment, speech rate was measured. A special script in PRAAT (Boersma & Weenink, 2014) was used in order to do this. This script is called the Praat Script Syllable Nuclei by De Jong and Wempe (2009). This script automatically detects syllables and measures the number of syllables in the entire recording. Then, the number of syllables was divided by the duration of the recording in seconds, leading to speech rate in syllables per second.
3.3 Design and Analyses
After the total number of filled pauses and unfilled pauses had been calculated for each transcript, and after D, the frequency bands for analysing lexical sophistication (raw and smoothed data) and speech rate had been obtained for each transcript, these outcomes were loaded into R. The data were then analysed using correlation analyses in R version 3.4.2. The non-parametric alternative Kendall’s tau was used, because the data were not normally distributed. The exceptions were the data for speech rate and for frequency bands 1, 2 and 3, which were normally distributed; for these, Pearson’s r was used. The aforementioned variables are dependent on time, operationalized as transcript number, i.e. transcript 001 was the first in the period of five years and transcript 126 the last. Filled and unfilled pauses and speech rate give an indication of fluency, which means that the lower the number of pauses, the more fluent a speaker is. D gives an indication of diversity in language use, which means that the higher the
value of D is, the more diverse the speech is. The chosen α-level for all analyses is 0.05. For lexical diversity and speech rate, testing was two-tailed, because it was unclear whether the correlations would be positive or negative. For filled and unfilled pauses and the five frequency bands, testing was one-tailed, because there were clear expectations about the direction of the correlations. Positive correlations were expected for frequency bands 4 and 5. Negative correlations were expected for frequency bands 1, 2 and 3, and for both filled and unfilled pauses.
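The two correlation coefficients used in the analyses can be illustrated with a small Python sketch. The analyses themselves were run in R; the code below is only an illustration of what the coefficients measure. It computes Kendall's tau in its tie-corrected form (tau-b) and Pearson's r, and it leaves out the significance tests. The function names and the example values are hypothetical.

```python
import math


def kendall_tau_b(x, y):
    """Kendall's tau-b: rank correlation with a correction for ties."""
    concordant = discordant = ties_x = ties_y = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: excluded from all counts
            elif dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    return (concordant - discordant) / math.sqrt(
        (untied + ties_x) * (untied + ties_y)
    )


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up example: transcript number against a pause count that declines.
transcript_number = [1, 2, 3, 4, 5]
pause_count = [9, 7, 7, 4, 2]
print(kendall_tau_b(transcript_number, pause_count))
```

A negative coefficient, as in this made-up example, corresponds to a measure that decreases as transcript number increases.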
4. Results
4.1 Filled and unfilled pauses
Figure 1. Number of filled pauses in the period spanning five years with an added linear trend line. [Plot: number of filled pauses (y-axis) per transcript (x-axis).]
Figure 2. Number of unfilled pauses in the period spanning five years with an added linear trend line.
All transcripts together contained 671 filled pauses (sd = 4.05) and 1315 empty pauses (sd = 5.5). PM used 208 filled pauses in the first year, which is almost a third (31%) of all filled pauses in the whole corpus. In addition, PM used 148 unfilled pauses in the first year, which is 11.3% of all unfilled pauses. In the last year, this was 390 unfilled pauses (29.7%). The use of filled and unfilled pauses in the period of five years is shown in Figure 1 and Figure 2. Kendall’s tau correlation analyses showed that there is a significant correlation between time and number of filled pauses, but not between time and number of unfilled pauses. The correlation between time and filled pauses was of a small effect size (rtau = -0.175, p < 0.01), while the correlation between time and empty pauses was of a medium effect size (rtau = 0.387, p = 1). These results are also visible in the added trend lines in Figure 1 and Figure 2. Another correlation analysis was performed between filled pauses and unfilled pauses. Kendall’s tau showed a significant negative correlation of medium effect size between the two measures (rtau = -0.329, p < 0.001).
[Figure 2 plot: number of unfilled pauses (y-axis) per transcript (x-axis).]
4.2 Lexical diversity
Figure 3. Lexical diversity (D) in the period spanning five years with an added linear trend line.
For every transcript, D was calculated as a measure of lexical diversity. This was done in order to find out whether PM used a wider range of words towards the end of the five years than in the beginning. The results are shown in Figure 3. A correlation analysis revealed that there was no significant change in lexical diversity during the period of five years (rtau = 0.017, p = .784). Mean D was 68.3 (sd = 12.67). The lowest D, 42.3, was measured in the last year (transcript 113), while the highest D, 109.6, was measured in the first year (transcript 10).
[Figure 3 plot: lexical diversity (D) per transcript (x-axis), with linear trend line.]
4.3 Speech rate
Figure 4. Speech rate (measured in syllables per second) during the period spanning five years with an added linear trend line.
Figure 4 shows speech rate in number of syllables per second for every recording. As the trend line in Figure 4 shows, speech rate did not change during the period spanning five years. The mean speech rate was 3.76 (sd = 0.14) syllables per second. The lowest speech rate (3.28 syllables per second) was found in the fourth year. The highest speech rate (4.06 syllables per second) was found in the second year. A Pearson correlation analysis showed no significant changes over time in speech rate (r = -0.04, p = .66).
[Figure 4 plot: speech rate in syllables per second (y-axis) per recording (x-axis).]
4.4 Lexical sophistication
4.4.1 Frequency bands
Figure 5. Usage of words belonging to frequency band 2 during the period spanning five years with an added linear trend line (raw data). [Plot: coverage (y-axis) per transcript (x-axis).]
Figure 6. Usage of words belonging to frequency band 5 during the period spanning five years with an added linear trend line (raw data). [Plot: coverage (y-axis) per transcript (x-axis).]
All words used in the transcripts were put together and served as the corpus for the current study. Then the words were divided into five frequency bands. The first frequency band contains the most frequent words, which together account for 20% of all tokens, and the last band contains the least frequent words, which together also account for 20% of all tokens. The data in frequency bands 4 and 5 were not normally distributed. Therefore, Kendall’s tau analyses were conducted. Analyses for the other frequency bands were done using Pearson’s r correlation. No significant changes were detectable in any of the frequency bands (band 1 r = 0.013, p = .559; band 3 r = -0.021, p = .409; band 4 rtau = 0.003, p = .482; band 5 rtau = -0.088, p = .928). Except for band 2, these were considered negligible effects. In contrast, the analysis of the second frequency band showed a small effect, which was, however, not significant (r = 0.169, p = .971). Figure 5 shows an upward trend in the usage of words belonging to frequency band 2. A trend was visible in one other frequency band, see Figure 6. Words belonging to this last frequency band, which contains the least frequent words in the corpus, were used less over time. Graphs showing the results for frequency bands 1, 3 and 4 can be found in Appendix A. Both trends that were observed in frequency bands 2 and 5 were small, which was expected. In addition, a lot of variability between the transcripts was observed. Therefore, additional analyses using a moving average to smooth the data were performed. These results can be found below.
Figure 7. Usage of words belonging to frequency band 2 during the period spanning five years with an added linear trend line (smoothed data).
Figure 8. Usage of words belonging to frequency band 5 during the period spanning five years with an added linear trend line (smoothed data).
Like the results for the raw data, the results for the smoothed data did not show significant changes over time (band 1 r = 0.027, p = .618; band 2 r = 0.228, p = .985; band 3 r = -0.035, p = .351; band 4 rtau = -0.013, p = .558; band 5 rtau = -0.188, p = .982). The correlations for
frequency bands 1, 3 and 4 were still considered as negligible effects, while the correlation for frequency band 5 increased to a small negative effect. Frequency band 2 also showed a small effect. The two trends are shown in Figure 7 and 8. Graphs showing the results for frequency bands 1, 3 and 4 can be found in Appendix B.
4.4.2 Individual words
During the analysis of the words in the five frequency bands there were a few individual words that caught attention. For example, the word amen ‘amen’ was used 121 times in the whole corpus, of which 75 (62%) were used during the first year. More specifically, in the first two sermons PM used the word amen respectively eighteen and seventeen times. In the rest of the sermons it was used with a maximum of seven times per sermon.
Although no increase was found in the lowest frequency band towards the end of the investigated time span, there were some words that caught attention during the process of transcribing. Generally, these words are not used very often in daily language use. Besides amen, other words that caught attention belonged to the following word classes: adverbs, adjectives and nouns. An example of an adverb is allicht ‘probably’. Adjectives used are for instance dubieuze ‘dubious’, frappant ‘striking’, ontvankelijk ‘an open mind’ and faliekant ‘utterly’. Some examples of nouns used are cliché ‘cliche’, denominatie ‘denomination’, facet ‘facet’, momentum ‘a short period of time with great opportunities’ and illusie ‘illusion’. Most of these words were used only once in the entire corpus. Some of these words were used twice, but then they occurred in the same sermon. Words like denominatie and momentum are used especially in the religious world. Another example that belongs to that jargon is ettelijk ‘innumerable’, which was used once by PM. Looking at the time of occurrence in the corpus, most of these words occurred in the last (fifth) year.
PM also used some words that only occur in a specific phrase and are not used on their own. The same applies to these words as to the words described above: mostly they occurred once, if they occurred more often it was in the same sermon, and they were used in the last year of the investigated time span. PM used the word kijf, which only occurs in the following phrase: dat staat buiten kijf ‘that is beyond dispute’. He also used clinch, which is always used in the phrase in de clinch liggen met iemand, meaning ‘be at loggerheads with someone’. The last word that is always used in combination with particular words to form a phrase is kriegel. This is always seen in the phrase daar word je kriegel van, which means ‘it gets under your skin’.
5. Discussion
The aim of the present study was to provide insight into the hitherto understudied area of attrition starting after the age of 30 and to answer the call to use free speech in attrition studies. In order to achieve this, sermons of a Dutch missionary (PM) who went to Peru for 7.5 years were analysed over a period of five years. PM started to learn Spanish when he was already 41 years old. The investigated sermons are the sermons PM preached during the first five years after his return to the Netherlands. In total, fragments of 126 sermons were analysed on the number of filled and empty pauses, lexical diversity using the measure D, speech rate in syllables per second and lexical sophistication. The research questions addressed in this study are whether learning a second language at a later age has an influence on the first language and how the L1 develops when a speaker is back in his home country.
5.1 Filled and unfilled pauses
The statistical analyses revealed that PM used fewer filled pauses, but more unfilled pauses over time. Filled pauses are known to serve a semantic function, which is for example
organizing discourse and structuring information in a sentence (Schmid & Beers Fägersten, 2010). Since PM used fewer filled pauses during the five years, this might mean that organizing the structure and information in his speech, and thinking about what to say next, took less effort towards the end of the investigated time frame. The finding concerning the development of the use of filled pauses is in line with earlier findings. Bergmann and colleagues (2015) found that the attriters in their study used more filled pauses than the non-attriters. The decrease in PM’s use of filled pauses indicates that he sounds more like an attriter at the start and more like a non-attriter at the end of the investigated sermons.
One thing that was noticed while listening to and transcribing the fragments of the sermons was PM’s use of the word amen ‘amen’. This is of course a common word for a preacher to use in this context. However, in the first few sermons after PM had started preaching in Dutch again, he used this word a lot. The placement of the word in the sentence often seemed a little odd. It looked as if he used it to gain more time to think about what he wanted to say next. So it might be the case that he used this word as a filled pause. After some time, the use of the word amen became normal: it was not used often and it was placed in the right spots in the sentence. The usage of the word amen followed the same pattern as the filled pauses and could therefore be seen as a filled pause.
PM used more empty pauses as time progressed. Whereas filled pauses are thought to serve a semantic function, unfilled (or empty) pauses serve a cognitive function. Empty pauses are thus a sign of problems at the cognitive level, such as problems with the retrieval of words or information (Schmid & Beers Fägersten, 2010). The assumption is that for bilingual speakers the cognitive load is heavier since they have to deal with two linguistic systems simultaneously, which can lead to an increase in the use of unfilled pauses (Schmid & Beers Fägersten, 2010). From the moment PM came back to the Netherlands he used his L1 extensively and barely used his L2. This was different from when he still was in Peru
where he used his L2 more, although in combination with his L1. When PM had just returned to the Netherlands, it would have made sense for him to use many unfilled pauses, as he might have struggled with switching to his L1. However, he did not actually use many unfilled pauses in the beginning, and the use of unfilled pauses even increased significantly over time. An explanation for this finding might be found in the Dutch data in Schmid and Beers Fägersten’s (2010) study. Compared to the L1 German participants, the L1 Dutch participants used the most empty pauses, whether they were migrants in Canada (the attriters) or had always lived in the Netherlands (the controls). The Dutch attriters used more unfilled pauses than the Dutch controls. This might suggest that L1 Dutch speakers in general use more empty pauses. This would mean that, with the increase in unfilled pauses, PM actually came to sound more like a (monolingual) Dutch speaker. Comparing the number of empty pauses in the current study with the number of empty pauses in Schmid and Beers Fägersten’s (2010) study is difficult, because only group numbers are available and, in addition, the recordings of the participants vary in length. Unfortunately, the current study does not have a control, i.e. an L1 speaker of Dutch who never migrated elsewhere, which could have provided information on the usage pattern of unfilled pauses over time. Another possible explanation for the finding that PM uses more unfilled pauses at the end of the investigated five years might lie in the fact that he wants to convey a message to the listeners. In his sermons PM wants to reach a certain goal, and sometimes silence is used to emphasize an important part. Looking at the positioning of the unfilled pauses in the sentence, many of them are indeed placed right before an important part of the message. Examples can be found below in (3), (4) and (5).
(3) en als je dan terugkijkt dan zeg je nou ik ben er toch overheen gekomen (.) samen met de hulp van God
and if you then look back you say well, I did get over it (.) together with God’s help
(4) kun je nog steeds dankbaar zijn (.) voor datgene wat God gedaan heeft in je leven?
can you still be grateful (.) for the things God did in your life?
(5) waar dat voorheen voor jou was uitgesloten (.) vanwege jouw liefde voor God, word je nu gemakzuchtig
before it was not an option for you (.) because of your love for God, you now became lazy
Turning to the distribution of filled and unfilled pauses, PM overall used more unfilled pauses than filled pauses, which is not in line with previous research (Bergmann, Sprenger, et al., 2015; Schmid & Beers Fägersten, 2010). However, when only the data from the first year, or even just the first sermon, are considered, PM used more filled pauses than unfilled pauses. This conforms to earlier findings that attriters, but also non-attriters, use more filled than empty pauses (Bergmann, Sprenger, et al., 2015; Schmid & Beers Fägersten, 2010). Since both attriters and non-attriters used more filled than empty pauses in previous investigations, it is hard to establish whether PM’s pattern corresponds to that of attriters or non-attriters. The difference between attriters and non-attriters is that the gap between the number of filled and unfilled pauses is bigger in non-attriters. In the last year, the distribution of pauses is the other way around: PM used more unfilled pauses than filled pauses. This might be due to the explanations concerning unfilled pauses addressed earlier.
The hypothesis concerning the use of pauses was that both types of pauses would be used less by PM over time. This is partially confirmed, as the use of filled pauses indeed decreased during the period of five years. However, the use of unfilled pauses did not decrease but instead increased over time.