An Exploratory Study: Working Memory Capacity and L2 Speech Production

Academic year: 2021

An Exploratory Study:

Working Memory Capacity

and L2 Speech Production

CICILIA D. M. PUTRI

S3724697

MA in Applied Linguistics

Faculty of Liberal Arts

University of Groningen

Supervisor: Prof. dr. M. C. J. Keijzer

Second Reader: dr. R. G. A. Steinkrauss

Date: 18 June 2020

Declaration of Authenticity MA Applied Linguistics – 2019/2020

MA-thesis

Student name: Cicilia Deandra Maya Putri

Student number: S3724697

PLAGIARISM is the presentation by a student of an assignment or piece of work which has in fact been copied in whole, in part, or in paraphrase from another student's work, or from any other source (e.g. published books or periodicals or material from Internet sites), without due acknowledgement in the text.

TEAMWORK: Students are encouraged to work with each other to develop their generic skills and increase their knowledge and understanding of the curriculum. Such teamwork includes general discussion and sharing of ideas on the curriculum. All written work must however (without specific authorization to the contrary) be done by individual students. Students are neither permitted to copy any part of another student’s work nor permitted to allow their own work to be copied by other students.

DECLARATION

• I declare that all work submitted for assessment of this MA-thesis is my own work and does not involve plagiarism or teamwork other than that authorised in the general terms above or that authorised and documented for any particular piece of work.

Signed:

Abstract

This study investigates the relationship between working memory (WM) capacity and L2 speech production in adults. As a secondary aim, it addresses the question of which WM capacity measure correlates best with L2 speech production. To address these issues, an experiment was conducted with 18 L1 Indonesian Master’s degree students from various programs at the University of Groningen in the Netherlands, who were advanced L2 learners of English. The participants completed three tasks: a backward digit span test (BDS) and a reading span test (RST) to measure their WM capacity, and a video-retelling task to elicit their L2 speech production. The results showed a correlation between the BDS scores and the complexity measure of L2 speech production. There was also a tendency for the RST scores to be correlated with the complexity measure of L2 speech production, but this did not reach significance. The data showed that there was a link between WM capacity and L2 speech production and that, in this study, the BDS was the best measure to reflect this relationship. With respect to the WM system’s limited capacity, the analysis confirmed that there were trade-off effects among L2 speech production measures, for instance between complexity and fluency as well as between fluency and weighted lexical density. These findings show the intricacies of the WM system itself and call for more studies investigating the constructs of WM and how they pertain to L2 learning trajectories.

Keywords: working memory, speech production, limited capacity, backward digit span, reading span test

Acknowledgement

I would like to extend my deepest gratitude to Jesus Christ, our Lord and Savior, for His abundant graces and blessings throughout my Master’s program. Through Him, I was awarded a full scholarship from the Indonesian Endowment Fund for Education (LPDP). My whole Master’s journey would not have been possible without the support of LPDP.

I am extremely grateful to my father in heaven, my mother and my sister in Indonesia for the prayers and support. Additionally, to my host family: Mom, Dad, and Kelsey, in the United States of America. Thank you for always having my back!

I cannot begin to express my thanks to my thesis advisor, Professor Merel Keijzer, for her continuous guidance, patience, and motivation. Her valuable suggestions helped me throughout the writing of the thesis. I could not have asked for a better advisor for this challenging process, and I thank her for believing in me. My gratitude also goes to all of my lecturers in the Master’s of Applied Linguistics program for the extensive knowledge they shared.

Many thanks should also go to my support system here: my LDR girls Vania and Elma; and my two boys, Peter and Thomas. Cheers also to my TSLD girls: Annelies and Claudia as well as my classmates in the Applied Linguistics programs. I am also grateful to my friends from all around the globe: Matilde, Anna, Emma, Paolo (Groningen unites us!). Cheers to our ups and downs throughout the year and to our future endeavors!

Last, but not least, I must also thank Ardy Joan Pradito for being by my side. Always. Thank you and God bless every one of you.

18 June 2020, Writer

Table of Contents

Declaration of Authenticity
Abstract
Acknowledgments
List of Tables
List of Figures
Introduction
Literature Review
Individual Differences (IDs) and Language Aptitude
Working Memory (WM)
Phonological Working Memory (PWM)
Working Memory Capacity and L2 Development
Working Memory Capacity and L2 Speech Production
The Present Study
Methodology
Methodological Approach
Research Design
Participants
Materials
Backward Digit Span (BDS)
Reading Span Test (RST)
Video-retelling Task
L2 Speech Production Measures
Procedures
Statistical Analysis
Results
Descriptive Statistics for WM Capacity Measures
Descriptive Statistics for L2 Speech Production Measures
Correlational Statistics: WM Capacity and L2 Speech Production Measures
Discussion
Conclusion
References

List of Tables

Table 1 The Descriptive Statistics for the BDS and the RST
Table 2 The Descriptive Statistics for L2 Speech Production Measures
Table 3 Correlational Statistics for the BDS, RST, COM, ACC, FLU, and WLD Measures
Table 4 Correlational Statistics for COM, ACC, FLU, and WLD Measures

List of Figures

Figure 1 Baddeley-Hitch's (1974) Working Memory Model
Figure 2 Participants' scores on the BDS
Figure 3 Participants' scores on the RST
Figure 4 Participants' complexity scores
Figure 5 Participants' accuracy scores
Figure 6 Participants' fluency scores
Figure 7 Participants' weighted lexical density scores
Figure 8 The relationship between the BDS scores and complexity measure
Figure 9 The relationship between the RST scores and complexity measure
Figure 10 The relationship between the RST and BDS scores
Figure 11 The relationship between complexity and fluency

An Exploratory Study: Working Memory Capacity and L2 Speech Production

When it comes to learning a second language (L2), some people are able to acquire the target language easily and achieve a good level of attainment without much difficulty, whereas others face substantial challenges despite being exposed to the same amount of language teaching and input. How do these contrasting outcomes arise even with the same (amount of) L2 exposure? According to de Bot, Lowie, and Verspoor (2005), these variations are due to individual differences (IDs) underlying the L2 learning endeavour, which generally explain why L2 learners have different rates and ultimate attainment levels of success when learning the target language. IDs can thus best be captured as multifaceted aspects of differential abilities and characteristics of how individuals process information (de Bot et al., 2005).

Ranging from age and anxiety to learner beliefs, a considerable number of IDs have been discussed since the 1970s, and IDs now form an important subfield of SLA (Ellis, 2004). According to Skehan (1991), there are four prominent areas of IDs that have been robustly shown to be relevant for second language acquisition research: language aptitude, motivation, learner strategies, and learning styles. Among these four widely investigated IDs, language aptitude has been shown to be an important predictor of an individual’s L2 learning ability (Skehan, 1991; Sparks, Ganschow, & Patton, 1995; Ellis, 2004).

Language aptitude is not one trait, but rather constitutes a set of cognitive abilities that could predict to what degree, relative to other individuals, an individual can learn an L2 in a given amount of time and under given conditions (Carroll & Sapon, 2002). Previous research has found that, among the set of language aptitude abilities, one component is an especially strong predictor of language achievement (Carroll, 1958; Skehan, 1991; Harley & Hart, 1997; Winke, 2013). This component is called working memory (WM) and is defined as an individual’s ability to temporarily store and manipulate information required to complete complex cognitive activities (Baddeley, 1992). This WM system has been assumed to have a limited capacity shared between the storing and processing demands of the activities which rely on WM (Daneman & Carpenter, 1980). Several SLA researchers believe that SLA processes require more controlled processing as opposed to the more automatic processing in L1 learning (Harrington & Sawyer, 1992; Miyake & Friedman, 1998). Furthermore, they argue that this kind of controlled processing places greater demands on cognitive resources that rely on WM. The association between WM and SLA has been supported by existing studies on vocabulary acquisition, language comprehension, written production, speech production, and other language domains (Service, 1992; Ellis & Sinclair, 1996; Fortkamp, 1999).

WM itself has been subdivided into different sub-constructs, and tests have been developed to measure these sub-constructs, for instance the digit span test (DS), reading span test (RST), speaking span test (SST), and operation-word span test (OWST), among others. These measures have been related to L2 learners’ performance in both receptive and productive skills; however, the results found so far are mixed in terms of the correlations between the different WM capacity measures and L2 performance (Daneman, 1991; Service, 1992; Cheung, 1996; Vulchanova, Foyn, Nilsen, & Sigmundsson, 2014; Kormos & Safar, 2008). Furthermore, as most of these studies examined the relations between WM capacity and L2 performance in children (5-10 years old) and adolescents (15-17 years old), much remains to be discovered about how WM develops over the longer lifespan in relation to L2 learning. Thus, it is necessary to conduct similar research on adults to see whether or not the results obtained deviate from the results of previous studies on children as well as to investigate which WM capacity measure [...] of all clarify the role of WM as a modulating factor at different L2 learning stages, but also at the same time shed light on WM itself and how it is best captured (by means of which WM measure).

Although L2 learners’ performance includes different kinds of productive and receptive skills, speaking the target language fluently is often the ultimate goal of learners. However, speaking has received less research attention than reading, which has been ascribed to the fact that production skills are generally harder to assess than receptive skills (Fortkamp, 2000). Speaking can be considered a complex cognitive skill, as it involves mental processing which in turn competes for the limited attentional resources of WM (Finardi & Prebianca, 2006). Thus, examining the role of WM more closely, most notably in how it modulates L2 speaking attainment, is necessary in order to see whether there is a trade-off effect between the aspects of L2 speech production as a result of the competition for the limited attentional capacity of WM.

Drawing on what previous studies have so far not included, the aim of this study is to investigate the relationship between WM capacity and L2 speech production and specifically to extend it to adult populations. To shed more light on this question, WM in the adult L2 learner sample included in this study is related to different aspects of L2 speech production: complexity, accuracy, fluency, and weighted lexical density.

Including this introduction (Chapter 1), the thesis consists of six chapters. Chapter 2 comprises an extensive literature review, starting with an elaboration on IDs in general and narrowing down to language aptitude as the ID that forms the focus of the current investigation and to how the construct may predict an individual’s L2 learning ability. Chapter 3 describes the method used to meet the study’s aim of investigating the relationship between WM capacity and L2 speech production. In particular, it explains how WM capacity and L2 speech production were measured in this study. Chapters 4 and 5 present the study’s findings, which are subsequently discussed in the light of the research questions and hypotheses. Chapter 6 summarizes the main findings of the study and reflects on the relationship between WM capacity and L2 speech production more broadly, also pointing to limitations of the current study and providing suggestions for future avenues in the realm of L2 learning and WM capacity, with a focus on learning how to achieve spoken fluency in the L2.

Literature Review

Individual Differences (IDs) and Language Aptitude

Individual differences (IDs) play an essential role in explaining why some L2 learners are able to achieve native-like competence with ease while others struggle to progress even beyond the beginner’s level, despite the same exposure to L2 input. According to Dörnyei (2005: 4), IDs are defined as existing personal characteristics which make each individual differ by definition but also in terms of how they cope with L2 learning.

The field of IDs has been extensively investigated to address the reasons why L2 learners vary enormously in their rate of success and ultimate attainment when learning a target language (Skehan, 1991; Dörnyei & Skehan, 2003; Ehrman, Leaver, & Oxford, 2003; Ellis, 2004). Researchers have identified a number of IDs, ranging from age, sex, and motivation to learning strategies, that all pertain to differences in L2 learning rate and success. Among these various IDs, four prominent areas have been shown to be especially relevant in the context of SLA (Skehan, 1991): language aptitude, motivation, learner strategies, and learning styles. Furthermore, Skehan (1991) argued that the study of language aptitude in particular may be important because it enables the prediction of successful language learning. In line with this view, Carroll and Sapon (2002) defined language aptitude as a set of cognitive abilities that could predict to what degree, relative to other individuals, an individual can learn a foreign language in a given amount of time and under given conditions. Thus, central in this view is the predictive power of aptitude in relation to the second language learning process.

Investigations focusing on language aptitude and its subcomponents in understanding how an individual acquires a given L2 have been widely conducted (Carroll, 1958; Skehan, 1991; Harley & Hart, 1997; Winke, 2013), with Carroll’s work (e.g., 1958, 1973) featuring very prominently in this respect. He proposed a model comprising four subskills of language aptitude: 1) the ability to distinguish and code sounds in a foreign language; 2) the ability to connect stimuli (native language words) and responses (foreign language equivalents), where an individual’s ability to create and strengthen this connection affects the speed of vocabulary growth and consequently foreign language achievement; 3) the ability to recognize the grammatical function of words in a foreign language; and 4) the ability to examine, notice, and identify patterns involving meaning or syntactic form in a foreign language.

With reference to these subskills of language aptitude, numerous tests have been developed to evaluate language aptitude. Amongst those are the Modern Language Aptitude Test (MLAT) by Carroll and Sapon (1959) and the Pimsleur Language Aptitude Battery (PLAB) by Pimsleur (1966). Broadly speaking, both tests measure foreign language aptitude using simulated format and grammar that incorporate all subcomponents of language aptitude, with the aim of indicating or predicting an individual’s degree of success in learning a foreign language. With language aptitude’s role as one of the central aspects of IDs in language learning and the construct being frequently portrayed as a robust predictor of language learning success (Skehan, 1989), numerous empirical studies have related language aptitude to SLA rate and success (e.g., Harley & Hart, 1997; Kiss & Nikolov, 2005; Winke, 2013). Harley and Hart (1997), for one, found that there were positive relations between language aptitude and SLA. However, these relations differed between the early and late immersion programs that formed the basis of their investigation. The participants of their study were 65 11th-grade students (with English as their L1) in early and late immersion French programs. They were given three language aptitude tests, namely the MLAT-IV Word Pairs subtest to test associative memory, the Wechsler’s (1972) Memory Scale subtest to measure memory for text, and the PLAB-IV Language Analysis subtest to assess the ability to analyse language structure, as well as a background questionnaire assessing students’ prior language experience. On the basis of a series of French proficiency tests, which included vocabulary recognition, listening comprehension, a cloze test, a written production task, and an individual oral test, it was found that there was a positive relationship between different dimensions of language aptitude and L2 proficiency in the early and late immersion programs. In the late immersion program, there was a positive relationship between the analytical dimension of language aptitude (as measured by PLAB-IV Language Analysis) and L2 outcomes, whereas in the early immersion program a positive relationship was shown between memory ability (tapped using Wechsler’s (1972) Memory Scale) and L2 outcomes. Though varied, both findings present evidence of language aptitude being related to L2 proficiency. Similar results have been obtained in other studies with different research designs, lending further support to the link between aptitude components and L2 learning success.

Kiss and Nikolov (2005) conducted a study to explore how language aptitude scores were related to learners’ L2 proficiency, recruiting 419 12-year-old Hungarian school students who were L2 learners of English. The students had been learning English for around 3 years, but there were differences in the intensity and the quality of their learning experiences. The students were given a Hungarian General Aptitude Test (Otto, 1996), developed specifically for Hungarian learners of English, an English proficiency test to assess listening, reading, and writing skills, and a motivation scale to measure their language learning motivation. The authors found a strong relationship between the overall language aptitude scores (as measured by the Hungarian General Aptitude Test) and L2 proficiency (r = .634, p < .01). When comparing the relationships among aptitude test scores, motivation and L2 proficiency, significant correlations were found between aptitude and both [...] motivation. This implies that aptitude was a better predictor of L2 learning success than motivation.

Overall, then, language aptitude has been robustly shown to be relevant and to play an important role in the success rate of SLA. Higher aptitude for second or foreign language learning has been linked to successful adaptation to the L2, which can be measured by faster learning progress as well as higher degrees of L2 attainment in proficiency (Robinson, 2019).

Working Memory (WM)

In further investigation of language aptitude, researchers have tried to answer the question of which subcomponent of language aptitude in particular is most strongly linked to SLA success (Carroll, 1990). A number of SLA studies have proposed that the best candidate may be working memory (WM), which is central in second language aptitude and has been repeatedly shown to play an essential role in second language proficiency (Miyake & Friedman, 1998; Robinson, 2001; 2002; Skehan, 2002).

As proposed by Baddeley (1992), WM is a dimension of language aptitude that defines an individual’s ability to temporarily store and manipulate information that is required to complete complex cognitive activities. Following this temporary buffer of storing and/or manipulating information, information can either be discarded or transferred to long-term storage (Baddeley, 2003a). As a result, WM arguably plays an important role in various cognitive activities, for instance, reasoning (Salthouse, Mitchell, Skovronek, & Babcock, 1989; Capon, Handley, & Dennis, 2003) and language learning (Baddeley, 1992; Juffs, 2006; Miyake & Friedman, 1998; Vulchanova et al., 2014). WM is a limited-resource memory system in which the mental processes involved in complex task performance compete for the limited capacity of WM (Baddeley, 1981; 1990; Baddeley & Hitch, 1974; Daneman & Carpenter, 1980). Before going into the limited capacity of WM further, the discussion will turn to a more detailed account of WM and models that have been proposed to capture it.

One of the earliest models of WM was proposed by Baddeley and Hitch (1974) and has subsequently been applied in numerous studies. The model consists of three subcomponents: the central executive, which plays a supervisory role in relation to the following slave systems, namely 1) the phonological loop, and 2) the visuospatial sketchpad. In later years, partly on the basis of empirical evidence, another slave system was added to the model: the episodic buffer (Baddeley, 2000). In what follows, the Baddeley-Hitch WM model is presented along with an explanation of each subcomponent (Figure 1).

The central executive is thought to play a crucial and central role in WM, most notably in that it is deemed responsible for attentional control of WM (Baddeley, 2003b). In other words, it regulates the flow of information within WM and the retrieval of information from other parts of the human memory system. However, the capacity of the central executive is limited, and the more demands are imposed on it, the less effectively it functions. When two or more activities happen at the same time, the central executive directs attention, prioritizes certain activities and ignores others. For instance, when biking and talking on a phone at the same time, important attentional resources are taken away from concentrating on biking. To better carry out different, potentially conflicting tasks, the central executive system can allocate information to the subsystems: the phonological loop, the visuospatial sketchpad, and the episodic buffer.

Unlike the central executive, whose function is mainly considered to be the control of attentional processes rather than the storage of information, the phonological loop and the visuospatial sketchpad are specifically designed for storing information (Baddeley, 1986). The phonological loop, first of all, refers to the capacity to store short-term phonological information and enables individuals to retain smaller portions of (typically verbal) information for a short period (Henry, 2012). It consists of two parts: the phonological store, which acts as an inner ear and retains information through sounds; and the articulatory control process, which operates as an inner voice rehearsing information from the phonological store (Baddeley, 1986). The former is linked to speech perception, while the latter is connected to speech production, but both are designed to store verbal materials.

For non-verbal materials, the visuospatial sketchpad has been suggested to process and store visual as well as spatial information, in the same way that the phonological loop handles verbal materials (Gathercole & Baddeley, 1993). That is to say that the visuospatial sketchpad stores the visual characteristics of an object and the space in which this object is located (Henry, 2012). Hence, the information stored in the sketchpad may take the form of mental images that could be lost when not rehearsed. Due to the different nature of the information that the human memory system receives, evidence suggests that WM can deal with visual and verbal information separately. Performing two tasks relying on different types of processing (a phonological task together with a visuospatial task) may lead to a modest performance reduction. By contrast, performing two tasks relying on the same type of processing may lead to a bigger performance loss. These observations led to the perspective that WM is a multi-component system capable of processing visual and verbal information independently. Given the existence of different information from these respective sources, a fourth component of the WM system was proposed, namely the episodic buffer (Baddeley, 2003).

As the most recent addition to the WM construct, the episodic buffer serves as a temporary store that integrates information from different sources in the sense that “it holds episodes whereby information is integrated across space and potentially extended across time” (Baddeley, 2000, p. 421). What this means is that the episodic buffer is capable of binding together information from different sources (the phonological loop and the visuospatial sketchpad) and types (verbal and visual) into a single multifaceted code. Furthermore, it is the episodic buffer that ties this information together into a coherent memory episode or chunk, hence the term “episodic” (Baddeley, 2003a).

WM is limited in both time and capacity. In other words, its capacity has to be shared between the various processes and the storage of immediate outcomes. Consequently, its efficiency and capacity decrease with the number of demands placed on it at the same time (Baddeley, 1981; 1990; Daneman & Carpenter, 1980). These limitations can easily be observed in an individual’s experience of memory limitations and forgetting, for instance when attempting to keep a person’s address in mind while listening to instructions on how to get there, or when meeting someone new and almost immediately forgetting the person’s name. There might be many reasons underlying these restrictions, but the point is that “there are constraints on how much information can be managed, processed, and integrated effectively all at once” (Bunting & Engle, 2015, p. xx). Additionally, the limited-capacity nature of the WM system by definition means that its subcomponents are in competition with each other: one process of manipulation or storage can be increased only at the expense of other processes (Bunting & Engle, 2015). In other words, there are trade-off effects between different aspects of WM as a direct result of the capacity constraint.

Against the backdrop of the multifaceted and finite-capacity WM system, the majority of language research has tended to focus on the phonological loop, as it has been found to be responsible for language learning processes, for instance, vocabulary acquisition (Gathercole & Baddeley, 1989; Service, 1992), reading development (Gathercole & Baddeley, 1993; Hansen & Bowey, 1994), and language comprehension (Adams, Bourke, & Willis, 1999). It needs to be pointed out that the bulk of previous work was conducted on young children between the ages of 3 and 8, because the phonological loop has been found to especially support the development of the language processing system in childhood (Shankweiler & Crain, 1986; Gathercole & Baddeley, 1993), and research has thus been largely confined to an L1 context. To understand how it may also modulate L2 learning and use, it is pertinent, first of all, to understand phonological WM in more detail.

Phonological Working Memory (PWM)

According to Baddeley-Hitch’s (1974) WM model, there is one memory component that is designed to temporarily store and process verbal material: the phonological loop. As explained previously, this component consists of two parts: the phonological store and the articulatory control process. The phonological store functions as the main entrance of information, where information is retained for a period of time before it finally decays, unless rehearsed by the articulatory control process. In such a way, the phonological store and the articulatory process are semi-independent processes that function collaboratively in (verbal) memory tasks (Gathercole & Baddeley, 1993). While an elaborate categorical distinction between these two sub-processes or components of the phonological loop goes beyond the confines of the present study, several previous studies include elaborate descriptions (see Service, 1992; Cheung, 1996; Grivol & Hage, 2011). Because distinguishing between these subcomponents is not an integral part of the current study, the more neutral and overarching term phonological working memory (PWM) will be used to refer to the parts of WM that are involved in the temporary storage and manipulation of verbal material (Gathercole & Baddeley, 1993).

Juffs (2006) argued that PWM is a way of operationalising WM, which determines an individual’s capacity to remember a series of unrelated items using covert ‘inner speech’ rehearsal (Ellis, 2001:34). PWM capacity has been operationalised in two ways in particular: the ability to repeat nonsense words of various syllable lengths and the ability to reliably remember lists of unrelated words or digits (Juffs, 2006). The majority of the work so far has been conducted in the context of child (L1) language development, where it was found that PWM supports a wide range of linguistic behaviours, including word learning and vocabulary development in children (Adams & Gathercole, 1995; Baddeley, Gathercole, & Papagno, 1998; Kormos & Safar, 2008).

With regard to L2 learning, findings from previous studies have conformed to the view that PWM is significantly correlated with various aspects of L2 acquisition not only in children (Masoura & Gathercole, 1999; Dufva & Voeten, 1999; Service & Kohonen, 1995), but also in adult learners (Atkins & Baddeley, 1998; Williams & Lovatt, 2003; Hummel, 2009). For one, Atkins and Baddeley (1998) tested the hypothesis that individual differences in immediate phonological memory span would predict success in L2 vocabulary acquisition. The study’s subjects were 32 English adult participants (19-40 years old) who were required to learn 56 English-Finnish translation pairs without any previous experience of learning Finnish. To assess their PWM capacity, the subjects were asked to complete a verbal digit span task and a letter span task. In these tasks, they were asked to recite aloud entire sequences of digits and letters, respectively, in the order of presentation, either spoken by the experimenter or presented on a computer screen. The results showed that PWM span was reliably correlated with vocabulary learning, indicating that PWM was an effective predictor of vocabulary learning success. Other studies showed similar results supporting the role of PWM in L2 vocabulary learning (e.g. Gupta, 2003; Speciale, Ellis, & Bywater, 2004).

Another line of evidence suggests that PWM may be related to L2 grammatical abilities in adults. For instance, Williams and Lovatt (2003) examined the relationship between PWM and the ability to learn determiner-noun agreement rules in semi-artificial micro languages. The subjects were 41 adults whose L1 was English and who were required to complete two experiments which included an immediate serial-recall task to assess PWM, a vocabulary learning task, an input memory task, and a generalization test. In the vocabulary learning task, the participants had to learn to produce the Italian translations of English words. The input memory task required the participants to memorise some Italian phrases in English-Italian sentences for a later recall. In the generalization test, the participants had to translate the final English noun phrase into Italian by giving an oral response. Upon calculating the scores, the experiments found correlations between PWM ability and rule learning. This result is consistent with the assumption that PWM is related to language learning (Baddeley et al., 1998; Ellis, 1996).

Previous research on adults has thus looked extensively at the relation between PWM and L2 vocabulary as well as grammatical acquisition. However, far less research has been conducted on the possible contribution of PWM to speech production, whether in an L1 context or, more crucially, in an L2 setting. Indeed, investigations targeting adults with regard to PWM capacity in relation to speech production are very limited. For one, O’Brien, Segalowitz, Freed, and Collentine (2007) investigated the relationship between PWM and L2 fluency gains in 43 adult learners of Spanish. The participants were native English speakers who had had at least two semesters of formal study of Spanish. Using a serial non-word recognition test to assess PWM and a Spanish oral test, the study found that the two variables were correlated, indicating that PWM contributed significantly to L2 learning in terms of oral fluency development. This finding lends support to the earlier study by O’Brien, Segalowitz, Collentine, and Freed (2006), which suggests that PWM plays an important role in the narrative development of L2 learning at earlier stages and in the acquisition of grammatical competence in adults. Although these studies focused on L2 speech production, they aimed specifically at L2 oral fluency and narrative development. Other aspects of L2 speech production, such as accuracy and complexity, could be further examined in relation to PWM. The scant previous studies that have attempted to relate PWM capacity to adult L2 language learners failed to demonstrate any direct relations between PWM and language production (Gathercole & Baddeley, 1993), in the sense that language production in skilled adult speakers appears to be mainly controlled by automatic procedures (as opposed to controlled effortful processing) that are independent of the working memory system (Shiffrin & Schneider, 1977). In contrast, PWM has been found to be associated with various measures of spontaneous speech in children, including the length of utterance, syntactic complexity, and vocabulary richness (Adams & Gathercole, 1995; 1996; Blake et al., 1994). Therefore, an investigation into whether PWM also plays a role in L2 speech production in adults is necessary. In order to do this, it is necessary to delve deeper into the exact link between WM capacity and L2 development.

Working Memory Capacity and L2 Development

The association between WM and SLA has been supported by existing studies on L2 vocabulary acquisition, language comprehension, written production, speech production, and other domains (e.g., Service, 1992; Ellis, 1996; Fortkamp, 1999). Drawing on the existing evidence, L2 researchers have suggested that WM capacity to a large extent governs L2 acquisition (Miyake & Friedman, 1998; Robinson, 2001; 2002; Skehan, 2002). Most of this research has proposed that WM is a primary construct of language aptitude, with the strongest formulations even purporting that WM equals aptitude in an L2 learning setting (Robinson, 2001, 2002; Skehan, 2002). It is no wonder, then, that there has been a steep increase of interest in exploring the relation between WM and SLA (Wen, 2012).

WM performance among L2 learners has provided initial support for the view that WM capacity is related to L2 proficiency, with WM modulating both productive and receptive skills (Miyake & Friedman, 1998) and different domains within them. The notion that L2 reading skill is correlated with WM capacity, for instance, is supported by the findings of a study conducted by Harrington and Sawyer (1992) among relatively advanced adult L2 learners. In their study, 43 native Japanese learners of English completed both Japanese and English versions of three tasks: a digit span test, in which they were asked to recall a list of unrelated digits that gradually increased in size; a word span test, in which they were required to recall a list of unrelated words that gradually increased in size; and reading span tasks (RST), in which they were required to read gradually longer sets of sentences out loud while trying to remember the final word of each of the sentences for later recall. Then, the participants completed the grammar and the reading sections of the Test of English as a Foreign Language (TOEFL). The correlational analyses showed that the L2 English digit span and word span measures were only weakly correlated with the TOEFL grammar and reading sections and did not reach significance. In contrast, the L2 English RST was highly correlated with both of the TOEFL subsections. On the one hand, these results agree with the view that the RST can be considered an index of WM capacity, suggesting that there is a trade-off effect between active processing and storage. On the other hand, the results could be considered inconsistent in the sense that different WM span tests resulted in different effects. According to the authors, these differences might have been due to a strong effect of language. To put it simply, the discrepancy between the simple span and the reading span measures may reflect differences between listening and reading skills across the languages (Harrington & Sawyer, 1992).

Moving from reading to the other receptive skill, L2 listening, Fay and Buchweitz (2014) investigated whether IDs in WM capacity of L2 learners predict listening comprehension skill as tested in a proficiency exam. The subjects were 24 adult students of English as an L2. Using the BAMT (Bateria de Avaliação da Memória de Trabalho) to assess WM capacity and an English listening comprehension task, the study showed that both variables were significantly positively correlated. Put simply, the higher a person’s WM capacity score, the higher his or her score on the listening comprehension task was. This further indicates that WM capacity could predict listening comprehension performance.

With regard to productive skills, Bergsleithner’s (2010) study on the relation between WM capacity and L2 writing performance also found that both variables were statistically significantly positively correlated. In addition, a trade-off effect was attested between L2 writing accuracy and complexity in the study, which the author interpreted in the sense that gains in one aspect resulted in losses in the other. This latter finding supports the notion that WM capacity is limited and competes for resources (Bunting & Engle, 2015; Foster & Skehan, 1996). In L2 speaking skills, WM capacity has been assumed to partially predict narrative vocabulary production at early stages and grammatical accuracy at later stages of L2 mastery (Fortkamp, 1999; 2003; Payne & Whitney, 2002).

The results of these studies converge on the assumption that there is a robust positive relation between WM capacity and SLA. More specifically, the evidence suggests that individuals with a larger WM capacity are more likely to attain good L2 performance levels than those with smaller WM capacities. However, some studies also mentioned the existence of trade-off effects between the different aspects of L2 skills (e.g. Foster & Skehan, 1996; Fortkamp, 2000; Mota, 2005). L2 productive skills are commonly assessed in terms of complexity, accuracy, and fluency, which interact with each other. Previous studies found that gains in some of these measures resulted in losses in others. For instance, accuracy might increase at the expense of other production subskills, such as complexity in L2 speech production (Fortkamp, 2000; Mota, 2005; Finardi & Prebianca, 2006). However, the results of these studies vary in the sense that no straightforward relationship among these aspects has been found. This effect could mean that IDs in WM capacity reflect differences in allocating attention to certain demands when completing a complex activity (Engle et al., 1999). Therefore, individuals with higher WM capacity could allocate more attentional resources to performance on a given task (Kane & Engle, 2000; Engle et al., 1999).

L2 speech production has been conceptualized as a cognitive activity carried out within the limited-resources constraints of WM (Mota, 2005). With the focus of the present study to gain insights regarding the complexities of L2 speech production in relation to WM capacity, it is necessary to first look into a detailed account of the relationship between WM and L2 speech production in particular.

Working Memory Capacity and L2 Speech Production

One of the productive skills that second language learners usually aim for is speaking, as it plays a major role in the survival and development of human society (Levelt, 1995). The ability to convey thoughts and meaning into overt speech in the L2 is therefore often the goal for the majority of L2 learners around the world (Guillot, 1999; Hieke, 1985). However, compared to other fields of study, the field of L2 speech production is still under-researched, as speech is generally harder to measure and involves more variables than other language domains such as reading and writing (Mota, 2005). As such, the field still lacks consensus on the constructs of L2 speech performance as well as the most effective approach, either from a theoretical or a pedagogical perspective, towards successful oral performance (Fortkamp, 2000), even though L2 speech performance has received much more attention over the past two decades.

Many language researchers believe that L2 performance and L2 speaking proficiency in particular are multi-component constructs, which can thus only be adequately and comprehensively captured by resorting to measures of complexity, accuracy, and fluency (CAF) (Skehan, 1998; Ellis, 2003, 2008). Complexity is defined as the elaborateness, richness, and diversity of L2 learners’ linguistic systems (Housen & Kuiken, 2009); accuracy is the degree of deviation from a particular norm or characterized as errors (Wolfe-Quintero, Inagaki, & Kim, 1998); fluency refers to the smoothness of speech or writing (Chambers, 1997; Lennon, 1990). Furthermore, this triad has been widely used both to describe language learners’ performances in the written and oral domains and to indicate language learning progress (Housen & Kuiken, 2009).

Conceptualizing L2 speech production as a complex cognitive activity in this way (cf. Clark, 1996; Levelt, 1989) makes it particularly relevant to link it to WM, as a limited-capacity processing system underlies or modulates what learners can do (Mota, 2005).

When producing L2 speech, an individual would have to store and manipulate information simultaneously. Such concurrent information storing and manipulation compete for WM resources. With different information processes being served simultaneously, a trade-off between the measures of L2 speech performance is likely to emerge, operationalized as a trade-off between complexity, accuracy, and fluency, as has been proposed by researchers (e.g., Yuan & Ellis, 2003; Guara-Tavares, 2011; Fortkamp, 2000).

As one of the only studies in this field, Fortkamp (2000) investigated the relationship between WM capacity as measured by the speaking span test (SST) and L2 speech production, as assessed through Complexity, Accuracy and Fluency (CAF) measures as well as weighted lexical density (WLD) in the context of advanced L2 learners of English. More specifically, the learners were 13 adult participants (18-41 years old) originating from Brazil, Germany, Korea, Japan, and other countries, who were taking L2 English classes at the Minnesota English Center. The SST consisted of 60 unrelated one-syllable words, organized in three sets each of two, three, four, five, and six words. Each word was presented on a computer screen for the duration of one second and accompanied by a beep. Participants were asked to read the word silently. Upon hearing the beep, they were prompted to orally produce a sentence for each word in the set, in order of appearance. With regard to L2 speech production aspects, fluency was measured by unpruned and pruned speech rates, number of silent pauses per minute, number of hesitations per minute, and mean length of run. In more detail, unpruned speech rate was obtained by dividing the total number of semantic units, including repetitions, by the total time to complete the task. Pruned speech rate was calculated the same way as the unpruned speech rate, but excluding repeated semantic units. Accuracy, finally, was computed by counting the total number of errors in each speech sample divided by the number of semantic units produced; complexity was measured in terms of the number of dependent clauses per minute, and weighted lexical density (WLD) indices were obtained by calculating the numbers of lexical and grammatical items. The study found that there was a significant positive correlation between the participants’ WM capacity and CAF measures, but a significant negative correlation with WLD measures. According to the author, the results obtained in the study indicate that individuals with larger WM capacity, as measured by the SST, spoke faster and longer, with fewer errors, and used more complex language. However, the negative correlation between WM capacity and the WLD measure suggests the existence of a trade-off effect among the L2 production measures as a result of the limited capacity of the WM system.

Other studies following Fortkamp (2000) used similar designs and added to the evidence for trade-off effects between the aspects of L2 speech production. For instance, Finardi and Prebianca (2006) investigated the relationship between WM capacity, measured through a speaking span test (SST) and an operation-word span test (OWST), and measures of L2 speech performance (CAF). The OWST consisted of 60 operation strings and 60 English words (Turner & Engle, 1989), in which participants were required to solve simple mathematical operations and state the results orally while trying to memorize a word following each operation. By means of an experimental study conducted among 12 university students who were intermediate L2 learners of English, they found that there was a significant correlation between WM capacity and L2 speech production measures, particularly between the SST and the fluency measure. The authors interpreted this as evidence that in order for individuals to produce speech fluently, they need to direct their attentional resources towards faster oral production, thus neglecting or affecting other aspects of L2 speech production. This finding also lends support to the existence of trade-off effects among speech production variables as a function of WM capacity.

However, the small sample sizes used in previous studies make it problematic to generalize the results to a larger population. The small samples resulted in a narrow range of variation in the scores of the WM tests, which might have contributed to the lack of significant correlations with L2 speech production measures. In addition, the varied L1 backgrounds of the participants in previous studies might have influenced their L2 performances, including oral performances (e.g. Riggenbach, 1991). Involving more participants sharing the same L1 as well as using different existing tests to measure WM capacity would help illustrate more clearly the relationships between WM capacity and L2 performance.

The Present Study

Previous work found there to be trade-off effects between the storage and processing functions of WM (specifically operationalised by tasks tapping PWM) and L2 speech production. Furthermore, the findings suggest that individuals with higher WM capacity outperformed their peers with lower WM capacity in tasks that demand complex cognitive processing. However, these previous studies used the same tests to measure WM capacity, namely the speaking span test and the operation-word span test, despite the existence of other WM capacity measures, such as the digit span test and the reading span test. It is informative to also relate such lesser-used measures to L2 speech development, to shed further light on the role of WM in L2 speech production. Indeed, even though PWM, which is usually assessed using the digit span test, has usually been related to vocabulary acquisition (Gathercole & Baddeley, 1989; Service, 1992), it has also been found to be related to speech production in a study by Adams and Gathercole (1995), albeit in younger learners and in an L1 context. They found that children showing good PWM scores produced language that was more grammatically complex and contained richer vocabulary and longer utterances compared to children with poor PWM. However, that study was conducted on children of 3 years old mastering their native language and, to the best of the author’s knowledge, no similar design has been implemented in an L2 adult learning context. The reading span test, in turn, has been claimed to reflect the dual function of storing and processing information (Daneman & Carpenter, 1980) and to account for the variation in L1 speech production. However, there is still little evidence on whether or not it could be related to L2 speech production.

In the area of L2 development, investigations into the relationship between WM capacity and L2 performance might contribute to a better understanding of the vast individual differences (IDs) in adult learners (Miyake & Friedman, 1998) as well as contribute to the existing evidence on the different measures of WM capacity and how they relate to L2 speech performance and development in adults. Drawing on existing research in the field, the present study aims to answer the following research questions:

1. Are working memory capacity measures (operationalised as backward digit span and reading span test) related to aspects of L2 speech production, as assessed through complexity, accuracy, fluency, and weighted lexical density measures?

2. Which working memory capacity measure (backward digit span or reading span test) correlates best with L2 speech production?

In relation to the two research questions and on the basis of Fortkamp (2000), it is hypothesized that:

1. There is a relationship between measures of working memory capacity (the backward digit span and the reading span test) and L2 speech production as assessed through complexity, accuracy, fluency and weighted lexical density in a video retelling task.

2. There is a relationship between measures of working memory capacity (the backward digit span and the reading span test) and/or complexity, accuracy, fluency, and weighted lexical density in a video retelling task.

3. There is a relationship only between the backward digit span test and/or complexity, accuracy, fluency and lexical density in a video retelling task.

4. There is a relationship only between the reading span test and/or complexity, accuracy, fluency and lexical density in a video retelling task.

Methodology

Methodological Approach

In order to address the research questions and hypotheses presented in the previous chapter, the present study used a quantitative approach. According to Creswell (2009), quantitative research is:

“an approach for testing objective theories by examining the relationship among variables. These variables, in turn, can be measured, typically in instruments, so that numbered data can be analysed using statistical procedures… Those who engage in this form of inquiry have assumptions about testing theories deductively, building in protection against bias, controlling for alternative explanations, and being able to generalize and replicate the findings.” (p. 4)

The choice for such a quantitative design was informed by the aim of the current investigation to examine specific variables and their potential patterns of correlation (see below for further details). Quantitative research allows the result of the analysis of the sample population to be generalized to a larger population (Burrel & Gross, 2018).

Research Design

The study involved 18 Master’s degree students, all enrolled in various programs of the University of Groningen in the Netherlands. They were all Indonesians who were active L2 learners of English and were currently enrolled in a Dutch environment. The independent variables used in the present study were two measures of WM capacity: the backward digit span (BDS) test and the reading span test (RST) scores. On the other hand, the dependent variables were measures of L2 speech production: complexity, accuracy, fluency, and weighted lexical density, following Fortkamp (2000). Upon data collection completion, the variables were analysed through the use of inferential statistics (see below for details).

Participants

Eighteen advanced L1 Indonesian learners of L2 English joined the experiment, each with an English proficiency level of 6.5 – 7.0 on IELTS. They were Indonesian students who were enrolled in various Master’s degrees at the University of Groningen in the Netherlands. Of the eighteen participants, 7 were women and 11 were men, with ages ranging from 24 to 28, and a mean of 25.7. All participants reported having studied English during their school years since they were 6 or 7 years old. In addition to receiving previous formal English instruction, they had similar proficiency levels because of the University of Groningen’s entry requirement of a minimum of 6.5 on IELTS, and their study programs were entirely in English. Their fields of study included psychology (3 participants), law (5 participants), international relations (5 participants), communications (1 participant), business administration (2 participants), architecture (1 participant), and environmental planning (1 participant).

At the time of data collection, all participants were in the second or last semester of their programs, which means that they had finished a number of courses that required them to perform in English at a high standard in both receptive and productive areas. They had also been in an English-speaking environment for at least six months and had been using English actively, also outside of their coursework. They were furthermore exposed to Dutch, although they did not master this language actively. With this background information, the sample could be considered as a relatively homogenous group in terms of educational history and English proficiency levels.

Materials

The experiment consisted of three tasks: two tasks to measure WM capacity and one task to elicit L2 speech production. A number of L2 speech measures were then derived from the latter L2 speech production task. The WM capacity tasks comprised a backward digit span task (BDS) and a reading span task (RST), whereas a video-retelling task was chosen to obtain the L2 speech production sample.

Backward Digit Span (BDS). The backward digit span (BDS) task requires participants to recall a list of unrelated digits that gradually increases in size, but to do so in the reverse order of presentation. The BDS was chosen as a measure of WM capacity because it was considered more demanding than its forward digit span counterpart, in the sense that the BDS invokes not only immediate verbal memory, but also an additional processing component for mental operations, reordering in this case (Pisoni, Kronenberger, Roman, & Geers, 2011). Because it constitutes a more complex task, i.e., retaining the information for a period of time in the memory loop while manipulating it into a different sequence, the BDS is generally considered to be a good measure of WM capacity (Pisoni et al., 2011).

The BDS test was administered via a PowerPoint Presentation on a laptop. Following two trial items, the test started with two sets of two random digits, for example, ‘1 5’ and ‘9 2’, each presented on separate slides. The digits were presented in black Calibri font, size 100, on a white background. The digit series gradually increased in length, with each length occurring twice; after the aforementioned two-digit example, the test continued with two sets each of three digits, four digits, and so on, until a ten-digit series had been reached (see Appendix A). Each slide was presented for 3 seconds. After each digit slide, a slide with the words ‘recall backward’ was shown to instruct participants to write down their answers on a separate answer sheet provided to them (see Appendix B). This recall phase was untimed. The participants were not allowed to write down anything before the ‘recall backward’ slide was shown. The test continued until participants made a mistake on both sets of a given digit size series. If one of the two was correct, the participants were still allowed to continue. The total span that was correctly recalled by participants constituted the final score on the test. For example, a participant who could correctly recall a 7-digit size series would be assigned the score of 7. The test took four to five minutes on average to complete.
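
The discontinuation rule and scoring described above can be illustrated with a minimal sketch. It assumes that one correct set out of two is enough for a series length to count towards the span; the function name `bds_score` and the dictionary layout are hypothetical and are not taken from the study's materials.

```python
def bds_score(results):
    """Return the backward digit span under the assumed scoring rule.

    `results` maps each series length (2..10) to a pair of booleans,
    one per set: True if that set was recalled correctly in reverse order.
    """
    score = 0
    for length in sorted(results):
        first_ok, second_ok = results[length]
        if not (first_ok or second_ok):
            break  # both sets failed: the test is discontinued here
        score = length  # assumption: one correct set suffices for the span
    return score


# Hypothetical participant: fails both 8-digit sets, so the span is 7.
example = {n: (True, True) for n in range(2, 7)}
example[7] = (False, True)
example[8] = (False, False)
print(bds_score(example))
```

Keeping the per-set outcomes, not only the final span, would also allow stricter scoring variants (for instance, requiring both sets to be correct) to be compared later.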

Reading Span Test (RST). The reading span test (RST) requires participants to read gradually longer sets of sentences out loud while trying to remember the final word of each of the sentences to be recalled upon demand later (Daneman & Carpenter, 1980). Like the BDS, it is thus a two-component WM measure, but the WM components are invoked in a different order: in the RST, the first step taps into the processing component, whereas the second taps into the storage component of working memory. The RST was chosen in the present study as another measure of WM capacity because it has been found to better predict individuals’ ability to perform complex cognitive tasks, such as speaking and reading skills, than the traditional digit span does and with that presents a clear verbal counterpart to the digit span test (Daneman & Hanon, 2007).

The RST that was employed in the current study was largely based on van den Noort et al.’s (2008) English version (see below for details). The RST was administered via a PowerPoint Presentation on a laptop. This experiment was conducted in L2 (English) instead of L1 (Indonesian), because previous studies had administered their span tasks in the L2 and found this to be effective (Fortkamp, 2000; Finardi & Prebianca, 2006). Furthermore, Gass and Lee (2011) found that L1 and L2 WM scores would largely match for high proficiency learners. In this case, the participants were considered as high proficiency learners. The L2 test was then deemed preferable because it meant that the entire testing session could be administered in English, without any switching effects and costs.

Following van den Noort et al. (2008), the RST was divided into three series, with each series comprising 20 sentences in English presented on different slides (see Appendix C). Each sentence was 12 to 17 words long and typed in black Calibri font, size 40, projected on a white background. The 20 sentences within each series were presented in sets of 2, 3, 4, 5, or 6 sentences, in random order. In other words, some participants started directly with a 6-sentence set and others with a shorter set. This approach was taken to avoid anticipation effects on the part of the participants.

During the test, the participants were prompted to read the sentences out loud and press the spacebar when they were done. Even though they could thus read the sentences at their own pace, each slide advanced automatically after 6.5 seconds. In other words, after this time the next sentence would be presented even if the participants had not yet finished reading the sentence out loud. This timing was based on van den Noort et al. (2008), who in turn had pretested this design and had validated the timing on the basis of large numbers of participants, including college-aged populations as employed in the current study. Beforehand, participants were also instructed to remember the final word of each sentence in the different sets and told that the sentence sets would vary in length. They were asked to freely recall as many final words as they could remember upon being presented with a slide containing the word “recall”, which also marked the end of each set in a series. The recall order did not matter. Other than remembering the final words, the participants were also asked to pay attention to the content of each sentence, as there would be two comprehension questions at the end of each series. To motivate the participants, a mini-break of about 1 minute was provided after each series. A score of 1 was given to each correct word. Subsequently, the RST final score constituted all correct items added together (van den Noort et al., 2008). In other words, the maximum RST score was 60, corresponding to the total number of sentences in the test. Following van den Noort et al. (2008), the comprehension questions were not scored but merely used to emphasise the importance of reading the sentences as fast as possible while also reading for content, adding complexity to an already demanding task. The entire test took approximately 10 minutes to complete for each participant.
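
As an illustration of the RST scoring rule (one point per correctly recalled final word, recall order ignored), a minimal sketch follows. The function name `rst_score`, the case-insensitive matching, and the assumption that final words within a set are distinct are illustrative choices, not details reported for the study.

```python
def rst_score(targets_by_set, recalls_by_set):
    """Sum the correctly recalled sentence-final words across all sets.

    Both arguments are lists of lists of words, one inner list per set.
    Order within a set is ignored because recall was free.
    """
    total = 0
    for targets, recalled in zip(targets_by_set, recalls_by_set):
        wanted = {word.strip().lower() for word in targets}
        given = {word.strip().lower() for word in recalled}
        # Assumption: a repeated or intruded word earns no extra point.
        total += len(wanted & given)
    return total


# Hypothetical two-set excerpt: 2 of 2 and 1 of 3 final words recalled.
targets = [["house", "river"], ["table", "music", "garden"]]
recalls = [["river", "House"], ["garden", "chair"]]
print(rst_score(targets, recalls))
```

With three series of 20 sentences each, a perfect protocol scored this way yields the maximum of 60 stated above.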

Video-retelling Task. A video-retelling task was used to elicit L2 speech production for all the participants. More specifically, a story-retelling task was used on the basis of a video that all participants had watched, because even though it differs from routine conversational speech, such a design likely taps enough of the same skills to be used as a brief standardized language sampling procedure (Culatta, Page, & Ellis, 1983). In other words, the topic, content, and length of verbal discourse in story-retelling tasks can be controlled easily across different speakers. Furthermore, speakers must access their lexicon and linguistic skills to retell the stories, instead of producing the exact words used in the story that could be too long to memorize (Gazella & Stockman, 2003), thus resulting in proper and authentic language output. Bardovi-Harlig (1995) noted several advantages that follow from story retelling tasks, most notably with respect to the use of silent video: the sequence of events is known to the participants, such narratives can be compared across learners, and a retelling task might encourage learners to produce longer utterances than other narrative tasks, such as a picture description task.

The video used in the present study was an animated video found on YouTube, entitled Pip. It was a 4-minute long video without any overt speech, which told a story of how a dog enrolled in a school for guide dogs overcame his shortcomings and fears to become a successful guide dog. After watching this video, the participants were asked to retell the story in English and use at least 2 minutes to do so. The speech productions were recorded using a phone recorder feature. The researcher actively attempted to produce as few interruptions and back-channelling comments as possible during the time that the participants retold the story. This set-up resulted in speech samples which varied between two and four minutes (with approximately three minutes of free speech being produced on average).


L2 Speech Production Measures. To assess the L2 speech production, complexity, accuracy, fluency (CAF), and weighted lexical density (WLD) measures were obtained on the basis of the free speech samples. The CAF and WLD measures were adapted from the framework proposed by Skehan (1996; 1998), which has been widely used in research on the effects of planning time on speech production in the L2 (Foster & Skehan, 1996; Ortega, 1999). As a partial replication study, the present study largely used the same speech production measures as employed by Fortkamp (2000) and Finardi and Prebianca (2006). This design is elucidated below.

Complexity was operationalized by dividing the number of dependent clauses by the time in seconds taken to complete the video-retelling task. Following Mehnert (1998), dependent clauses were defined as including finite and non-finite subordinate clauses, coordinate clauses, and infinitive constructions. The results were then multiplied by 60 to express the number of dependent clauses per minute.

Accuracy was calculated by counting the number of errors per 100 words (Fortkamp, 2000; Finardi & Prebianca, 2006). The errors that were thus counted included syntactical, morphological, and lexical errors. Pronunciation and intonation errors were not included. The total number of errors was then divided by the number of words produced, and subsequently multiplied by 100 to express the number of errors per 100 words.

With regard to the fluency measure, Fortkamp (2000) used four temporal variables, all considered to reflect the notion of fluency: unpruned and pruned speech rate, number of silent pauses per minute, number of hesitations per minute, and mean length of run. However, due to the exploratory nature of the present study, only the unpruned speech rate was used as a measure of fluency, similar to the procedure followed by Finardi and Prebianca (2006). Unpruned speech rate was calculated by dividing the total number of words (including self-repetitions and corrections) by the total time in seconds (including pausing time) needed to complete the task (Lennon, 1990; Fortkamp, 2000; Finardi & Prebianca, 2006). As a final step, the result was multiplied by 60 to obtain the number of words produced per minute.
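The three CAF calculations described above can be summarized in a short sketch. This is only an illustration of the arithmetic, not the CLAN commands used in the study; the class name, field names, and sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SpeechSample:
    """Hand-coded counts for one retelling (hypothetical structure)."""
    words: int                # all words, incl. self-repetitions and corrections
    dependent_clauses: int    # counted following Mehnert (1998)
    errors: int               # syntactic, morphological, and lexical errors only
    duration_seconds: float   # total task time, including pausing time


def complexity(sample: SpeechSample) -> float:
    """Dependent clauses per minute."""
    return sample.dependent_clauses * 60 / sample.duration_seconds


def accuracy(sample: SpeechSample) -> float:
    """Errors per 100 words."""
    return sample.errors * 100 / sample.words


def fluency(sample: SpeechSample) -> float:
    """Unpruned speech rate in words per minute."""
    return sample.words * 60 / sample.duration_seconds


# Invented values for illustration; they are not data from the study.
sample = SpeechSample(words=360, dependent_clauses=24, errors=9,
                      duration_seconds=180)
print(complexity(sample), accuracy(sample), fluency(sample))
```

Note that the accuracy measure is an error rate, so a higher value indicates less accurate speech.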

Finally, to measure weighted lexical density, it was necessary to categorize all linguistic items as grammatical or lexical items based on the following criteria formulated by O’Loughlin (1995):

(A) Grammatical items:

1. All modals and auxiliaries.

2. All determiners, including articles, demonstrative and possessive adjectives, quantifiers, and numerals.

3. All pronouns and this and that when used to replace clauses.

4. Interrogative adverbs (what, when, how) and negative adverbs.

5. All contractions of pronouns and auxiliary verbs.

6. All prepositions and conjunctions.

7. All discourse markers including conjunctions, particles (oh, well), lexicalized clauses, and quantifier phrases (anyway, somehow, whatever).

8. All lexical filled pauses (so, well).

9. All interjections (gosh, really, oh).

10. All reactive tokens (OK, no!).

(B) Lexical items:

1. Nouns, adjectives, verbs, adverbs, participle form. Main verbs were counted as one lexical item.

Following Fortkamp (2000), the lexical and grammatical items were then divided into high-frequency and low-frequency items. A grammatical or lexical item that appeared more than once in the same speech sample was considered a high-frequency item, whereas items appearing only once in the same speech sample were considered low-frequency items. Inflectional and derivational forms of the same lexical and grammatical items were considered repetitions and thus counted as high-frequency items (e.g. go/went, this/these, etc.). High-frequency lexical and grammatical items were then given half the score of low-frequency lexical and grammatical items; the resulting scores constituted the total number of weighted lexical items. This number was then divided by the total number of linguistic items and multiplied by 100 to obtain the percentage of weighted lexical items over the total number of linguistic items in the speech sample.
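The weighting procedure can be sketched as follows. This is a minimal illustration under several assumptions that the text does not settle: each item has already been reduced by hand to a base form (so that go and went count as the same item) and labelled as lexical or grammatical, each occurrence of a low-frequency item scores 1 and each occurrence of a high-frequency item scores 0.5, and the denominator is the weighted total of all linguistic items. The function name and the sample data are hypothetical.

```python
from collections import Counter


def weighted_lexical_density(items):
    """Return WLD as a percentage.

    items: list of (base_form, kind) pairs, with kind either "lex" or "gram".
    Assumption: base forms were assigned by hand, so inflectional and
    derivational variants share one base form.
    """
    counts = Counter(items)

    def weight(item):
        # High-frequency items (more than one occurrence) score half.
        return 0.5 if counts[item] > 1 else 1.0

    weighted_lexical = sum(weight(i) for i in items if i[1] == "lex")
    # Assumption: the total is weighted too; a raw item count is another
    # possible reading of the description.
    weighted_total = sum(weight(i) for i in items)
    return weighted_lexical / weighted_total * 100


# Invented mini-sample: "the dog ... the dog run"
items = [("the", "gram"), ("dog", "lex"), ("the", "gram"),
         ("dog", "lex"), ("run", "lex")]
print(round(weighted_lexical_density(items), 1))
```

Counting on (base form, category) pairs rather than on base forms alone keeps an auxiliary and a main-verb use of the same word apart, which matches the categorization criteria listed above.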

Procedures

The experiment was conducted face-to-face, with only one participant and the researcher present at any one time. The data collection took place at the author’s house or the participant’s house. All tests were administered using a laptop, and the researcher was present at all times to supervise the proceedings. Before the start of the experiment, each participant listened to an explanation of the research and its aim and was then asked to complete and sign a consent form (see Appendix D). After filling out the consent form, each participant completed the BDS test, followed by the RST, and finally the video-retelling task. The order of the tasks was thus the same for all participants. In the BDS test, after giving the instructions, the researcher provided two practice trials to check the participants’ understanding of the test. Then, participants were given the BDS answer sheets on which they could record their answers. The participants were explicitly asked whether the instructions were clear before completing the actual test. The BDS test took approximately 5-6 minutes to complete per participant. As a next step, the score of each participant was recorded by the researcher. Afterwards, the participants could take a mini-break of about 1-2 minutes.


Following the BDS, the RST was administered, likewise on a laptop. After instructing the participants, the author provided two practice trials to check the participants’ understanding of the test. The participants were allowed to ask questions if anything was unclear and then proceeded to complete the actual test. No answer sheet was given in this test because each participant was required to name the sentence-final words orally; the author noted down their answers, thereby recording how many items the participant was able to recall. After each series, the participants were allowed to take a mini-break of 1-2 minutes. On average, the RST lasted about 10 minutes. After that, the final score of each participant was recorded. Before moving on to the video-retelling task, the participants could take a break of 1-2 minutes.

As the final test in the test battery, the video-retelling task was administered. In this task, the participants were instructed to watch the silent animated video (detailed above) and attend to as much detail as possible. After watching the video, each participant was asked to retell the story for at least 2 minutes. The retelling was recorded with the voice recorder of an iPhone XR. The speech samples were then transcribed and coded in the Computerized Language Analysis (CLAN) program (MacWhinney, 2000) on a computer to facilitate the CAF and WLD assessment.

The CLAN program is designed specifically to perform a large number of automatic analyses on data transcribed in the Codes for the Human Analysis of Transcripts (CHAT) format, which is a standardized transcription system. The program offers a wide array of commands for frequency counts, word searches, and co-occurrence analyses, among others. Several frequency commands were created to compute complexity, accuracy, fluency, and lexical density in accordance with the L2 production measures framework (see Appendix E). The raw data of the tests can be seen in Appendix F.
