To what extent does vocabulary and grammar learning aptitude predict accuracy measures in adult L2 speakers’ oral and written attainment?

Academic year: 2021


To what extent does vocabulary and grammar learning aptitude predict

accuracy measures in adult L2 speakers’ oral and written attainment?

Claudia dos Santos Cunha

Study Program: Linguistics

Specialization: English Language and Linguistics

Course: Cognitive Approaches to Second Language Acquisition

MA Thesis, Leiden University

Supervisor: Dr. N.H. de Jong

Second reader: Dr. Johanneke Caspers


Abstract

This thesis investigates the relationship between vocabulary and grammar learning aptitude (measured by two LLAMA subtests) and accuracy constituents in adult L2 speakers’ oral and written discourse productions. It also examines the relationship between speaking and writing attainment, and the extent to which the link between aptitude and speaking performance differs from that for writing. Oral and written picture narratives elicited from 30 ESL speakers were analyzed and coded for lexical and morphosyntactic errors per 100 words. The Spearman correlation analyses revealed that vocabulary learning aptitude is significantly associated with the command of lexis and morphosyntax in writing, that explicit grammar learning aptitude shows no significant association with accuracy in either production mode, and that the availability of time in writing does not influence the interplay between explicit and implicit knowledge in accessing lexical forms. No significant differences were found in the strength of the aptitude-speaking and aptitude-writing relations.


Table of Contents

1. Introduction……….. 5

2. Theoretical Background ……….. 8

2.1 Language Aptitude and SLA ……… 8

2.2 Language Aptitude Components ……… 10

2.3 Explicit and Implicit Knowledge ………12

2.4 CAF Measures ……… 15

2.5 CAF Measurements ……… 16

2.6 Accuracy ………. 18

2.7 Lexical and Morphosyntactic Accuracy in SLA ………. 20

2.8 Effects of Language Aptitude on Accuracy in SLA ……… 22

2.9 Motivation for the Current Study ……… 23

2.10 Research Questions and Hypotheses ……… 24

3. Method ……….. 27

3.1 Participants ……… 27

3.2 Materials ……… 28

3.2.1 Speaking Task ……… 28

3.2.2 Writing Task ……….. 28

3.2.3 Aptitude Test (LLAMA) ……… 29

3.2.3.1 LLAMA B ………. 29

3.2.3.2 LLAMA F ……….. 30

3.2.4 DIALANG ………. 30

3.2.5 LEAP-Q ………. 31


3.3 Procedure ……… 31

3.4 Material Preparation ……… 32

3.5 Calculating Accuracy Measures ……….. 32

3.6 Analyzing Errors ………. 33

3.7 Data Analysis ……….. 34

4. Results ……… 35

5. Discussion ……….. 39

6. Limitations ………. 43

7. Conclusion ………. 46

8. Practical Implications ……… 47

References ………. 49

Appendices

Appendix A Participants’ Profile ………. 56

Appendix B Speaking Task ………. 57

Appendix C Writing Task ……… 61

Appendix D LEAP-Q ……….. 64

Appendix E Informed Consent ……… 68

Appendix F Guidelines for Coding Errors ……….. 69

Appendix G Clean Transcriptions with LE and MSE ………. 71

Appendix H Writing Tasks with LE and MSE ……… 91

Appendix I Spreadsheet Results ………. 122


1. Introduction

Anna, a Brazilian friend of mine whom I met when I lived in Hong Kong (HK), is a polyglot. At the time we met, she already spoke four languages fluently and accurately: Portuguese (her native language), English, Spanish, and Italian, and she was on her way to learning her fifth: Pǔtōnghuà (Mandarin). Anna’s “talent” for learning languages was evident to everybody who knew her. She could switch from one language to another (in speaking or in writing) effortlessly. Can we, therefore, conclude that “talent” plays an important role in second language (L2) learning, evidenced by fewer or no mistakes in grammar and vocabulary in speaking and in writing? Sandra, another “talented” Brazilian friend who also lived in HK, spoke three languages: Portuguese (her native language), Spanish, and English. However, she used to say that her speaking skills in English differed from her writing skills. To her mind, when speaking she had to give answers “on the spot” and consequently made more mistakes in grammar and vocabulary, whereas when writing she had “more time to think” and could revise her sentences if she wanted to. Can we conclude that “talent” in L2 speaking differs from “talent” in L2 writing? Can we also conclude that a person’s L2 oral performance differs from his/her written performance due to time constraints?

Second language acquisition (SLA) researchers refer to this “talent” as language aptitude, and they consider it relevant to L2 learning (Skehan, 2015). This intrinsic potential relates to an array of cognitive and interrelated componential abilities and “is considered to be a relatively stable trait (rather than skill), regardless of previous L2 learning experience” (Saito, 2017, p. 666).

Used as measuring instruments, language aptitude test batteries predict (1) how quickly learners can improve in an L2 (L2 learning rate) and (2) to what degree they can in due course reach near-native competence (ultimate L2 learning success) in formal and naturalistic learning contexts. Since its creation, the Modern Language Aptitude Test (MLAT) (Carroll & Sapon, 1959) has been one of the most influential measures in the history of aptitude research (e.g., de Graaff, 1997; Erlam, 2005; Robinson, 1995; Van Patten & Borst, 2012a, b). Another test that has been used in a large number of studies in the SLA field is the LLAMA Language Aptitude Test (Meara, 2005), which has a shorter design and is loosely based on the MLAT.

Until the 1990s, language aptitude researchers investigated the relationship between beginning-of-term aptitude scores and end-of-term attainment among taught groups of students (Carroll, 1965; Skehan, 1982). However, this type of research design revealed little about the nature of language learning itself. Consequently, aptitude researchers started to associate language aptitude with learning processes, namely, implicit and explicit ones. The former does not involve any awareness and “reflects development as a function of exposure to language input” (Skehan, 2015, p. 368), whereas the latter involves some awareness and “focuses on particular aspects of language” (Skehan, 2015, p. 368). Therefore, the distinction between implicit and explicit processes came to be considered an important factor in understanding the construct of language aptitude.

Speaking and writing in our native or first language (L1) is quite different from L2 speaking and writing. To perform a task in these two production modes, an L2 speaker bears different cognitive loads and taps into both implicit and explicit knowledge under different time availability. Therefore, a comparison between spoken and written attainment can help us understand how different cognitive demands influence a speaker’s L2 production.

Over the last decades, SLA researchers have examined how the vocabulary (lexis) and grammar (morphosyntax) of L2 learners’ language production vary depending on cognitive load or task features (e.g., Crookes, 1989; Foster & Skehan, 1996; Robinson, 2003; Rutherford, 2001; Skehan & Foster, 2005). They have also extensively investigated the link between foreign language (FL) aptitude and the learning of L2 morphosyntax (e.g., Li, 2015, 2016; Robinson, 2005; Saito, 2017; Skehan, 2015). It is noteworthy, however, that FL aptitude research has mainly been concerned with L2 learners’ oral speech in FL classrooms. Thus, an investigation of the relationship between language learning aptitude and lexical and morphosyntactic accuracy in speaking and writing among L2 speakers can shed light on their ultimate oral and written productions in distinct communicative tasks and on how these productions differ from each other. Teaching practice also stands to benefit: such findings can inform the design of speaking and writing tasks that promote linguistic accuracy in L2 classrooms.

Due to the status of English as a Lingua Franca (ELF) (Seidlhofer, 2011) and the growing number of English as a second language (ESL) users in the world (Pennycook, 2017), the protagonist of the current study is the adult ESL speaker. In contrast to Saito’s (2017) investigation, whose 50 Japanese learners’ use of English was highly limited outside of the classroom, this study focuses on a multicultural group of 30 adult ESL speakers who learned English in a formal setting but have been out of the instructional environment for a long time and are currently communicatively active in naturalistic settings.

The aim of this study is threefold: to investigate the potential relationship between vocabulary and grammar learning aptitude, measured via two subtests of the LLAMA test (Meara, 2005), and the lexical and morphosyntactic accuracy of adult L2 speakers’ oral and written discourse productions; to explore the relationship between speaking and writing attainment; and to examine the extent to which the correlation between language aptitude and speaking performance differs in strength from the correlation for writing.


2. Theoretical background

2.1 Language Aptitude and SLA

As a sub-set of cognitive abilities significantly relevant to L2 learning, language aptitude has been considered one of the most substantial individual difference variables (e.g., Cochran et al., 2010) and found to be itself componential and relatively stable (e.g., Skehan, 2015).

A large amount of research in the field of language aptitude has examined the relationship between language aptitude and L2 achievement (e.g., Carroll, 1981). Such investigations theoretically approach aptitude in two broad lines: (1) as a trait that affects the L2 outcome (a predictive approach) or (2) as a trait that affects the L2 process (an interactionist approach).

Within the predictive approach, many studies investigated the correlation between L2 learners’ aptitude scores and L2 attainment measured via end-of-term grades, proficiency test scores, or free constructed tasks (e.g., Saito, 2017). Other predictive studies examined aptitude alongside three distinct individual difference variables, namely, anxiety (e.g., Bell & McCallum, 2012; Sparks et al., 2009), intelligence (e.g., Bond, 2011; Sparks et al., 2006), and motivation (e.g., Bialystok & Fröhlich, 1978; Gordon, 1980); overall, aptitude was regarded as distinct from such variables (e.g., Gardner & Lambert, 1965) and considered the best predictor (e.g., Cochran et al., 2010). Carroll and Sapon (2002) claim that language aptitude encompasses a set of cognitive abilities that are “predictive of how well, relative to other individuals, an individual can learn a foreign language in a given amount of time and under given conditions” (p. 23). This establishes a product-oriented view of language aptitude, showing its predictive potential and association with ultimate L2 attainment, regardless of instruction type and learning context.

Within the interactionist approach, Robinson (2002, 2005) treats language aptitude as a dynamic construct whose role varies as a function of the processing demands of different treatment conditions (e.g., explicit vs. implicit in de Graaff (1997) and Sheen (2007), or inductive vs. deductive in Erlam (2005) and Hwu & Sun (2012)). In Robinson’s view, language aptitude is regarded as “strengths individual learners have - relative to their population - in the cognitive abilities information processing draws on during L2 learning and performance in various contexts and at different stages” (Robinson, 2005, p. 46). This constitutes a process-oriented view of language aptitude. For instance, in a study comparing Esperanto (an artificial language) and Spanish (a natural language) among two groups of 27 college learners who received different types of instruction (explicit vs. implicit), de Graaff (1997) concludes that the effect of language aptitude on learners’ test performance under the implicit condition grows over time, whereas it shows little change under explicit instruction. In a study focusing on direct object pronouns in L2 French, Erlam (2005) investigates whether there is a relationship between learners’ language aptitude and three different instructional methods (i.e., deductive instruction, inductive instruction, and structured input instruction) among three groups of high school students. Her findings indicate that deductive instruction minimizes the effects of individual differences in language aptitude and that both inductive and structured input instruction benefit learners with higher language analytical ability.

Unlike the predictive approach, where learners’ knowledge or use of a particular structure is not tested, the interactionist approach consistently manipulates variables and focuses on tasks/tests that target one or more specific linguistic structures. Studies on how language aptitude interfaces with explicit versus implicit treatments show that aptitude seems to be more involved in explicit than in implicit treatments (e.g., Li, 2015), whereas investigations on how language aptitude relates to inductive versus deductive instruction reveal that aptitude is more likely to be involved in inductive approaches, which favor learners with high language aptitude, and less so in deductive approaches (e.g., Erlam, 2005; Hwu & Sun, 2012).

The distinction between explicit and implicit processes has been considered a relevant factor in understanding the construct of language aptitude. The former comprises “deliberative mental operations” (Dulany, 2012, p. 207), involves some reasoning, and consciously focuses on significant aspects of language. The latter consists of “evocative mental processes” (Dulany, 2012, p. 207), involves no awareness, and focuses on the development of language from input through incidental exposure. DeKeyser (2000) claims that aptitude is only relevant for explicit learning; however, other research on language aptitude reveals that aptitude also plays an important role in implicit learning (e.g., Granena, 2012, 2013a; Linck et al., 2013). On this view, aptitude is fundamentally important for all types of language learning.

Informal/naturalistic learning settings differ from formal/instructional ones: in the former, L2 learners draw more on daily experience and receive greater exposure to the target language, whereas in the latter they receive more hierarchically systematized and structured instruction. Nevertheless, Skehan (2015) claims that previous studies on the relationship between language aptitude and formal instruction reveal that “higher aptitude seems associated with a greater capacity to benefit from instruction, whether this is explicit or implicit” (Skehan, 2015, p. 373).

2.2 Language Aptitude Components

With the advent of World War II (1939-1945), US military services recognized the need for considerable numbers of personnel who could speak foreign languages in wartime.

Assigned to create an L2 aptitude test for the US government, John B. Carroll, in conjunction with Stanley Sapon, developed the Modern Language Aptitude Test (MLAT) in a five-year research project at Harvard University between 1953 and 1958. The initial aim of the MLAT was to help the US government identify military personnel, missionaries, and civilian employees at the Foreign Service Institute (FSI) who could learn a foreign language successfully.

Along with Stanley Sapon, Carroll administered 34 cognitive tests (Winke, 2013) to a thousand military personnel taking a one-week intensive Mandarin Chinese course that focused on oral abilities. Using exploratory factor analysis (a statistical method used to identify the underlying relationships between measured variables), they identified five tests that tapped into independent abilities: (1) number learning (a test that measures one’s memory and auditory alertness); (2) phonetic script (a test that measures one’s ability to associate speech sounds and written symbols); (3) spelling clues (a test that measures one’s vocabulary knowledge of English and sound-symbol association ability); (4) words in sentences (a test that measures one’s sensitivity to grammatical structure without any reference to grammatical terminology); and (5) paired associates (a test that measures one’s rote memorization ability). According to Carroll (1962, 1981, 1990), Robinson (2005a), and Skehan (2002), these five tests were independent of each other, practical to carry out, and, used together, predicted L2 learning success and the rate of learning (Carroll, 1962).
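To make the factor-analytic logic concrete, the following sketch applies a simple eigendecomposition of the inter-test correlation matrix (a principal-component approximation to exploratory factor analysis) to simulated subtest scores. The data, the number of examinees, the six subtests, and the two underlying abilities are all invented for illustration; this is not Carroll and Sapon’s actual procedure or data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated scores of 200 examinees on six hypothetical subtests: the first
# three draw on one shared ability, the last three on another, plus
# test-specific noise.
n = 200
ability_a = rng.normal(size=n)
ability_b = rng.normal(size=n)
scores = np.column_stack(
    [ability_a + 0.6 * rng.normal(size=n) for _ in range(3)]
    + [ability_b + 0.6 * rng.normal(size=n) for _ in range(3)]
)

# Factor the 6 x 6 correlation matrix and keep the two largest factors.
corr = np.corrcoef(scores, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)             # ascending eigenvalues
top = eigvals.argsort()[::-1][:2]                   # indices of the two largest
loadings = eigvecs[:, top] * np.sqrt(eigvals[top])  # one row per subtest

print(loadings.round(2))
```

In a run like this, the first three subtests load strongly on one factor and the last three on the other, mirroring how Carroll identified clusters of tests tapping independent abilities.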

These five tests still comprise the MLAT (Carroll & Sapon, 1959) and assess four components of language learning aptitude: (1) phonemic coding ability (the capacity to analyze and retain unfamiliar sounds); (2) grammatical sensitivity (the ability to identify the functions of words in sentences); (3) inductive language learning ability (the capacity to identify patterns in language samples); and (4) associative memory (the ability to make form-meaning links to remember new words).

Undertaking several MLAT validation studies after his five-year research project, Carroll (1962, 1963, 1966) concluded that the MLAT scores of L2 learners from different backgrounds in intensive learning settings correlated quite well with L2 achievement, with correlation coefficients ranging from .40 to .65 (e.g., Carpenter, 2008; Skehan, 1998). Moreover, Carroll (1965, as cited in Saito, 2017, p. 667) found that MLAT scores could “predict students’ final course grades, teachers’ evaluations, and SAT scores”.

Apart from the MLAT, other aptitude test batteries, such as the LLAMA test (Meara, 2005) with its predictive approach, and the CANAL-F test (Grigorenko et al., 2002) and HiLAB (Linck et al., 2013), both grounded in cognitive psychology, have also been widely used to measure aptitude in SLA. For instance, LLAMA scores were associated with the oral performance of taught groups of L2 learners and were found to be predictive of learners’ lexical richness and their ability to retain lexical information (e.g., Saito, 2017), whereas HiLAB scores in phonological short-term memory (PSTM), associative memory, and implicit learning distinguished successful from very successful L2 learners in reading and listening proficiency (e.g., Saito, 2017; Skehan, 2015).

Since its creation, the MLAT has been criticized because it is rooted in the audio-lingual teaching method, whose framework is based on the mechanical drills and rote learning of behaviourist learning theory (e.g., Li, 2015). Although theoretical developments have related aptitude components to the stages of information processing (e.g., Skehan, 2002, 2012), and proposals have been made to explore interactions between aptitude complexes and different learning contexts (e.g., informal vs. formal) (e.g., Robinson, 2007), “the ‘standard’ set of components have, by default, been those measured by the MLAT” (Li, 2015, p. 387).

2.3 Explicit and Implicit Knowledge

The study of L2 learners’ language can provide information about their underlying explicit and implicit knowledge.

Anderson (1983) distinguishes explicit and implicit knowledge as declarative and procedural, respectively. He suggests that declarative knowledge involves knowledge of abstract rules and patterns (e.g., an L2 learner referring to the use of articles). In contrast, procedural knowledge involves automatization (e.g., an L2 learner gains control over the abstract rule and is able to restructure his/her declarative knowledge “into if-then productions of increasing delicacy” (Ellis, 2005, p. 149)). Adding to the explicit versus implicit distinction, Ellis and Barkhuizen (2005) claim that explicit knowledge is analyzed (conscious) and metalingual (knowledge of linguistic terminology), whereas implicit knowledge is formulaic (e.g., chunks such as ‘I don’t know’ and ‘How do you do?’) and rule-based (unconscious).

Recent years have witnessed a considerable number of empirical studies investigating implicit and explicit knowledge in the field of SLA (e.g., Andringa & Rebuschat, 2015; DeKeyser, 2003; Hulstijn, 2005). Although an L2 learner’s performance can provide information about both types of knowledge, it is quite a challenge to determine which type of knowledge a given production reflects.

To discover what L2 learners know, some SLA researchers rely on learners’ intuition (e.g., learners judge the grammaticality of sentences shown to them). For example, in a study on the dissociation between conscious and unconscious knowledge, Dienes and Scott (2005) asked volunteer subjects from the University of Sussex participating in an artificial grammar learning experiment to state whether each item they judged in a test was based on guess, intuition, memory, or rule. Judgments attributed to memory and rule were regarded as reflecting explicit knowledge, whereas judgments attributed to guess and intuition indicated a contribution of implicit knowledge. In another study, Bowles (2011) set out to validate Ellis’ (2005) battery of tests, which provided relatively separate measures of implicit and explicit language knowledge. Bowles’ (2011) investigation tested Spanish native speakers (NSs), L2 learners, and heritage language (HL) learners of Spanish on five tests (oral imitation, oral narration, timed and untimed grammaticality judgment tests (GJT), and a metalinguistic knowledge test). The results provided evidence that (1) the pattern of test scores loading on implicit versus explicit knowledge supports the construct validity of Ellis’ (2005) tests, and (2) L2 learners scored higher on tests of explicit knowledge and lower on tests of implicit knowledge, HL learners showed the reverse pattern, and NSs, L2 learners, and HL learners all scored significantly lower on the timed GJT than on the untimed GJT. Similar results were reported by Philp (2009) in an investigation that examined the relationship between background-contextual variables and English language knowledge.

However, other SLA researchers prefer to collect samples of L2 learners’ language to uncover what they know, since analyses of L2 learners’ grammaticality judgments can yield divergent outcomes. Thus, to elicit both spontaneous and authentic language, SLA researchers have proposed the use of pedagogic tasks in which L2 learners can focus primarily on the message rather than on the production of linguistic form (e.g., Spada & Tomita, 2010). Such assessments may include picture narrative tasks (e.g., Derwing et al., 2004; Saito, 2017; Saito et al., 2016; Trofimovich & Isaacs, 2012), story retelling tasks (e.g., Yilmaz & Granena, 2016), oral interview tasks (e.g., Abrahamsson & Hyltenstam, 2008), and monologue tasks (e.g., Derwing et al., 2004), and they may be used to measure both productive language skills, that is, speaking and writing. Additionally, Ellis (2005) claims that different types of language tasks seem to tap into different types of knowledge. In his view, tasks that require awareness of a rule, are unpressured by time, and focus on form (e.g., a writing task) tap more into explicit knowledge, whereas tasks that are pressured by time, answered according to feel, and focused on meaning (e.g., an oral narrative) tap more into implicit knowledge. In Ellis’s view, explicit knowledge “is held consciously, is learnable and verbalisable, and is typically accessed through controlled processing when learners experience some kind of linguistic difficulty in using the L2”. In contrast, implicit knowledge “is procedural, is held unconsciously, and can only be verbalized if it is made explicit” (Ellis, 2006a, p. 95).

Kuiken & Vedder (2012) state that speaking and writing productions are related to the cognitive processes that language learners go through during task performance. In their view, “speaking and writing pose different demands on cognitive involvement and may be characterized by the use of different linguistic features (Halliday, 1989). Language learners may, therefore, perform a task differently in the written mode compared to the oral mode” (Kuiken & Vedder, 2012, p. 150). Thus, aspects such as explicit versus implicit knowledge, planning time, verbalization time, attention, and working memory may contribute to the different characteristics of speaking and writing.

During speaking and writing processes, the cognitive load upon individuals is related to the demands of processing information (using working memory and attention). Human beings possess a limited capacity of attentional resources, and their oral and written performances differ when they are required to engage in different attention-demanding tasks (Skehan, 1996). Depending on task difficulty, the cognitive load increases or decreases according to the amount of attention given to a task (Lively et al., 1993). In speaking, subjects have to (re)access stored information to produce coherent speech, whereas, in writing, they can retrieve information from what they have written down during the planning time. Towell, Hawkins, and Bazergui (1996) claim that individuals are under time pressure in speaking, whereas, in general, there are no time constraints in writing. Therefore, under time restriction during speaking tasks, subjects cannot rely that much on explicit declarative knowledge, which is slower than implicit proceduralized knowledge.

In sum, the issue of implicit and explicit knowledge plays a central role in SLA research (e.g., Ellis, 2011; Williams, 2009), and comparing both spoken and written L2 performances can contribute to the understanding of how different cognitive demands affect L2 speakers’ production.

2.4 CAF Measures

The triad of complexity, accuracy, and fluency (CAF) comprises the major variables in SLA research. These three factors address the issues of (1) measuring L2 learners’ performance, that is, their oral and written language, and (2) indicating the proficiency level underlying that performance (Housen & Kuiken, 2009).

In the 1990s, CAF’s traditional working definitions were formulated, and they are still in use today. Complexity has regularly been described as “the extent to which the language produced in performing a task is elaborate and varied” (Ellis, 2003, p. 340), accuracy as “the ability to produce target-like and error-free speech”, and fluency as “the ability to produce and process the L2 with ‘native-like rapidity’” (Lennon, 1990, p. 390) or “the extent to which the language produced in performing a task manifests pausing, hesitation, or reformulation” (Ellis, 2003, p. 342).

To date, CAF research reveals that initial investigations to gauge L2 learners’ performance surfaced in the 1970s and were mainly divided into two major strands (e.g., Housen et al., 2012; Wolfe-Quintero et al., 1998). On the one hand, drawing on L1 acquisition research, where the mean length of utterance (MLU) was an established index (e.g., Brown, 1973; Hunt, 1965), L2 researchers searched for an L2 developmental index with which they could “expediently and reliably gauge proficiency in an L2” (Larsen-Freeman, 1978, p. 469) in an empirical and measurable way (e.g., Hakuta, 1975; Larsen-Freeman, 1978, 2009; Nihalani, 1981). On the other hand, around the same period, research on L2 pedagogy drew a distinction between accuracy and fluency in L2 performance to probe the development of oral L2 proficiency in instructional settings (e.g., Brumfit, 1979, 1984; Hammerly, 1991). Brumfit (1979, 1984) was one of the first to use the accuracy-fluency dichotomy to distinguish accuracy-oriented tasks, which center on linguistic forms and structures that are grammatically correct in the L2, from fluency-oriented tasks, which center on spontaneous speech production in the L2 (e.g., Hammerly, 1991).
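The MLU index mentioned above is straightforward to compute: total units divided by the number of utterances in a sample. The sketch below uses words as the counting unit for simplicity, whereas Brown (1973) counted morphemes; the sample utterances are invented for illustration.

```python
def mean_length_of_utterance(utterances: list[list[str]]) -> float:
    """Average number of units per utterance in a sample.

    Units are whatever the analyst counts: morphemes in Brown (1973),
    words in this simplified sketch.
    """
    if not utterances:
        raise ValueError("need at least one utterance")
    return sum(len(u) for u in utterances) / len(utterances)

sample = [
    "the dog ran".split(),       # 3 units
    "he saw a big cat".split(),  # 5 units
    "yes".split(),               # 1 unit
]
print(mean_length_of_utterance(sample))  # (3 + 5 + 1) / 3 = 3.0
```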

In the 1990s, Skehan (1996, 1998) proposed an L2 proficiency model adding complexity as the third concept of the triad and including CAF as the three main proficiency dimensions in L2 usage (e.g., Housen & Kuiken, 2009).

2.5 CAF Measurements

L2 researchers have mainly examined L2 oral and written production data to identify quantifiable linguistic phenomena that contribute to the understanding of complexity, accuracy, and fluency.


In the SLA literature, complexity is considered the most ambiguous and controversial dimension of the CAF triad (e.g., Robinson, 2001; Skehan, 2001), accuracy the most transparent and consistent construct (e.g., Hammerly, 1991; Housen & Kuiken, 2009; Pallotti, 2009; Wolfe-Quintero et al., 1998), whereas fluency pertains chiefly to spoken language measurement. Nevertheless, L2 writing research makes use of fluency measures as well; recent studies, for instance, have used keystroke logging software to register online writing features (e.g., Leijten & van Waes, 2013; Révész et al., 2017). However, ensuring the validity, reliability, and efficiency of CAF measures is not an easy task. Housen and Kuiken (2009) note that both subjective measures, such as ratings by experts, and objective measures, such as counts of error types for accuracy or of filled pauses for fluency, have been used to gauge L2 performance.
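Objective measures of the kind Housen and Kuiken mention can be operationalized very simply. The sketch below counts filled pauses per minute of speech as a fluency measure; the pause inventory, the whitespace tokenization, and the example transcript are illustrative assumptions, not a standard from the literature.

```python
# Hypothetical inventory of filled-pause tokens; a real study would derive
# this from its own transcription conventions.
FILLED_PAUSES = {"uh", "um", "er", "erm", "mm"}

def filled_pauses_per_minute(transcript: str, duration_sec: float) -> float:
    """Objective fluency measure: filled pauses normalized per minute."""
    tokens = (tok.strip(".,!?") for tok in transcript.lower().split())
    count = sum(tok in FILLED_PAUSES for tok in tokens)
    return 60.0 * count / duration_sec

speech = "um I went to the er market and uh bought some fruit"
print(filled_pauses_per_minute(speech, duration_sec=45.0))  # 3 pauses -> 4.0
```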

Apart from the different types of measures used, another problem in measuring CAF is the interdependency of its dimensions (e.g., Larsen-Freeman, 2009; Skehan, 2009). Towell’s (2012) proposal of a cyclical developmental sequence states that L2 learners first learn a new structure/form (thereby expanding their complexity). Then, they may learn to use this structure/form more and more correctly (hence becoming more accurate), and finally, they may learn to use it with more automaticity, efficiency, and fluency. In Towell’s (2012) view, the three dimensions must be fully integrated with each other within the speaker’s linguistic system so that an L2 speaker can produce complex language in an accurate and fluent way. Larsen-Freeman (2009) proposes longitudinal research to verify CAF’s cyclical developmental sequence. In a recent study, Vercellotti (2017) longitudinally investigates 294 monologues of ESL learners, coding each transcription for complexity, accuracy, and fluency. Her results show that (1) learners with high initial proficiency scores also have high initial accuracy scores; (2) accuracy and lexical complexity scores are correlated; (3) there is a positive correlation among all within-individual scores; and (4) all CAF constructs are connected and grow together over time, with no trade-off effects among them.¹

¹ A trade-off effect occurs when a learner prioritizes one component of the CAF triad over the others (Ellis & Barkhuizen, 2005).


Unlike Vercellotti (2017), Yuan and Ellis (2003) report trade-off effects in the CAF scores of 42 undergraduate ESL students. Regarding the effect of planning on oral language performance, they conclude that there is a trade-off effect between accuracy and fluency across different planning groups but no trade-off effect within each planning group. In writing, a study by Ishikawa (2007) shows that within task-complexity groups, there are no trade-off effects between complexity and accuracy in the written texts of 54 Japanese high school students.

Besides predicting a competitive relationship among the CAF constructs, Skehan’s (1996) cognitive framework hypothesis holds that, because individuals’ attentional resources are limited, equal attention to all CAF aspects during language performance is not possible. In his view, in complex tasks attention to complexity may increase while accuracy decreases, whereas in simple tasks attention to complexity may decrease while accuracy increases. In Robinson’s (2001) cognition hypothesis, by contrast, no such competitive relationship between complexity and accuracy arises: in complex tasks, complexity and accuracy can improve simultaneously without any positive relationship with fluency, whereas in simple tasks complexity and accuracy may decrease while fluency increases.

In sum, the divergent research results, along with the contrasting theoretical views on how attention is allocated among the CAF constructs during language performance, indicate that it remains unclear how complexity, accuracy, and fluency influence each other.

2.6 Accuracy

Among the three CAF constructs, accuracy is the focus of the current study. Pallotti (2009) states that accuracy is simple to define due to its internal coherence. In agreement with Pallotti (2009), Polio and Shea (2014) also claim that “accuracy is relatively easy to define” (p.10). According to Housen et al. (2012), “accuracy is the ability to produce target-like and error-free language” (p. 2).


Here a question comes to mind: What is an error? Lennon (1991) defines error as “a linguistic form or combination of forms which, in the same context and under similar conditions of production, would, in all likelihood, not be produced by the speakers’ native speaker counterpart” (p. 182). In other words, accuracy refers to error-free speech or writing and measures the degree of deviation from a particular norm. However, the choice of which linguistic norm should be considered poses a challenge. The criteria can be tuned to prescriptive standard norms or to “non-standard and even non-native usages acceptable in some social contexts or in some communities” (Housen & Kuiken, 2009, p. 463). Either way, measuring accuracy in L2 production involves decision-making about the amount of deviation from the norm.

In SLA research, holistic scales (e.g., Polio, 1997), global measures (e.g., number of error-free clauses, number of errors per 100 words), and specific measures (e.g., number of noun-adjective gender agreement errors) have been used to gauge L2 accuracy. Which measure should be adopted depends on the type of language expected. Kuiken and Vedder (2008) have proposed a distinction between first-degree (e.g., spelling or omitted article), second-degree (e.g., word order), and third-degree (e.g., a wrong word choice making a sentence/utterance incomprehensible) errors with respect to communicative adequacy, while Foster and Wigglesworth (2016) have recently suggested a weighted measure of accuracy in which a score is assigned to clauses according to their accuracy.

In sum, the benchmark for evaluating accuracy and analyzing errors poses a dilemma (e.g., prescriptive grammatical norms vs. non-standard/native usages). Besides, there is a range of accuracy measures that can be used to analyze both L2 learners’ oral and written production. Therefore, decisions about the amount of deviation from the norm, along with which measure to adopt, may produce different results, which in turn makes it difficult to compare results across different studies on L2 learners’ accuracy in SLA (Ellis & Barkhuizen, 2005).


2.7 Lexical and Morphosyntactic Accuracy in SLA

The construct of L2 linguistic accuracy comprises lexical, morphosyntactic, phonological, and pragmatical error analyses in SLA. The focus of the current study is on lexical and morphosyntactic error analyses.

Much attention has been given to L2 learners’ acquisition of morphosyntax rather than to their acquisition of lexis because, whereas morphosyntax was originally conceived as a finite set of principles and parameters (e.g., Chomsky, 1981), lexis has posed a challenge to SLA researchers due to its “unstable and unsystematic nature” (Agustín Llach, 2011, p. 57).

A review of the literature on SLA reveals that there is a large number of investigations that explore the influence/effect of a wide variety of contextual/external factors (e.g., communicative adequacy, the learning context, task complexity, task planning time, task repetition, longitudinal development, instructional effects, computer-mediated communication, and corpus-based techniques) on L2 accuracy in either oral or written modalities (see Mora & Valls-Ferrer (2012), and Révész et al. (2016) for oral performances; see Bardovi-Harlig & Bofman (1989), Larsen-Freeman (2006), and Thewissen (2013) for written performances). However, few studies on L2 accuracy in SLA focus on the differences between both oral and written performances to examine the interplay between explicit and implicit knowledge.

Investigations into oral and written performances include Ellis and Yuan (2005), who investigated the effects of within-task planning on oral and written performance among a homogeneous group of 42 full-time undergraduate Chinese students who had never been to an English-speaking country and had rarely used English outside the classroom. Using measures of accuracy (error-free clauses and correct verb forms), they found that more correct clauses and verbs were produced by the careful planning group than by the pressured planning group and that participants’ language was more accurate when writing than when speaking.

Using measures of accuracy, Sauro (2012) compared oral and written SCMC (synchronous computer-mediated communication) narrative tasks among 21 university learners of English from diverse backgrounds. Results revealed no significant difference in either lexis or syntax in the two modes. However, considerable variability emerged individually concerning the use of more accurate language in text-chat over spoken discourse.

In another study, Kuiken and Vedder (2012) investigated L2 accuracy as a function of task complexity and proficiency level in oral and written performances of Dutch students of Italian L2 and French L2. Using general performance measures for accuracy (total number of errors per T-unit and number of first and second-degree errors per T-unit for task complexity; total number of errors per T-unit and number of second and third-degree errors per T-unit for proficiency level) and a more detailed analysis of accuracy according to the type of errors (i.e., grammar, lexicon, spelling, appropriateness, and other errors), they found that in the written mode (1) task complexity and proficiency level influenced accuracy significantly for both groups of students, (2) the effects of task complexity on accuracy were due to a decrease of lexical errors, implying that participants’ attentional resources during task completion were primarily focused on control of lexical form, (3) task complexity influenced accuracy as both groups of learners made more errors on grammar and lexicon than on the other types of errors, and fewer lexical errors were produced in the more complex task in both groups, and (4) the students of French, who had a higher proficiency level than the learners of Italian, made more errors on appropriateness and other errors and fewer on spelling in the more complex task, whereas for the learners of Italian no differences between these two types of errors were found.
In the oral mode, analyses showed (1) a significant influence of proficiency level on accuracy regarding the total number of errors and the number of first, second and third-degree errors per AS-unit, (2) a significant effect of task complexity on accuracy regarding the total number of errors, second and third-degree errors per AS-unit, (3) in general, fewer errors in the oral mode (except for appropriateness), (4) more errors regarding grammar and lexicon as in the written mode, (5) a significant effect of proficiency level regarding other errors, and (6) fewer errors made by high-proficient learners than the low-proficient ones.


As shown, the findings on L2 accuracy in SLA research reveal the degree to which lexis and syntax vary across distinct L2 production modes. They also indicate that proficiency level and task complexity influence L2 accuracy performance. However, SLA studies on the degree to which language learning aptitude is related to lexical and morphosyntactic accuracy attainment in both oral and written performances remain scarce.

2.8 Effects of Language Aptitude on Accuracy in SLA

For more than five decades, SLA researchers have investigated the role of FL aptitude in the learning rate and ultimate attainment of L2 morphosyntactic accuracy (and, to a lesser degree, of L2 lexical accuracy) in FL instructional/formal contexts (e.g., Li, 2016; Skehan, 2015). As reported by Li (2015), “language analytic ability was the most frequently studied, followed […] by rote memory […]; this is not surprising given that language analytic ability was postulated to be critical for grammar learning” (p. 396).

Research on the relationship between FL aptitude and L2 morphosyntactic and/or lexical accuracy in FL formal settings encompasses a study by Erlam (2005), who investigated the relationship between language aptitude and three different types of instructional methods (i.e., deductive, inductive, and structured input) among 92 New Zealand high school students who were taught direct object pronouns in French. Results showed that (1) learners who received structured input gained more from instruction due to their higher language analytical ability (LAA) in writing, (2) learners from the inductive group also benefitted more in written production due to their greater LAA, and (3) in the deductive instruction group, no relationship emerged between language aptitude and learning outcomes, which was attributed to learners’ individual differences.

In another study, Hwu and Sun (2012) examined the interaction between language aptitude and two different instructional conditions (i.e., deductive and explicit inductive) among approximately 400 students who were taught the Spanish verb gustar (to like). Results revealed that students from both groups improved in the learning of the verb gustar and that those with high language aptitude benefitted more from both instructional contexts than those with low language aptitude.

Yalçin and Spada (2016) examined the relationship between FL aptitude and the learning of two English structures (i.e., the passive voice, defined as a difficult structure, and the past progressive tense, defined as an easy structure) among 66 secondary-level learners of English as a foreign language (EFL). Using a written grammaticality judgment and an oral production task design, results showed that (1) grammatical inferencing contributed to learners’ gains on the passive voice rather than on the past progressive in the written task and that (2) associative memory contributed to learners’ gains on the past progressive in the oral task.

In a recent study, Saito (2017) examined the relationship between FL aptitude and L2 learners’ attainment through an oral task (picture narrative) performed by 50 Japanese EFL learners who had no opportunity to use their L2 abroad and had studied English for 7 years in FL instructional settings. Results indicated that learners’ aptitude scores in rote and associative memory ability were moderately associated with lexical and morphological accuracy, whereas scores in language analytic ability were significantly associated with lexical richness rather than grammatical accuracy; no significant link was found between learners’ aptitude scores and proficiency level regarding lexical appropriateness.

2.9 Motivation for the Current Study

Although SLA researchers have given a considerable amount of attention to the multifaceted relationship between the type of language aptitude (explicit vs. implicit), L2 proficiency level (beginner, intermediate, or advanced), learning settings (instructional vs. naturalistic), and modality (spoken or written), the vast majority of investigations have focused predominantly on the role of language aptitude in the learning of L2 morphosyntax, and linguistic proficiency has mostly been assessed through comprehension tasks, such as grammaticality judgments (e.g., Abrahamsson & Hyltenstam, 2008; Granena, 2013b), or controlled production tasks, such as sentence readings (e.g., Roehr, 2008). To date, no research has inquired into the extent to which language aptitude can be an important factor in adult L2 speakers’ lexical, morphological, and syntactic abilities in both oral and written production. Therefore, a study on the relationship between language aptitude and accuracy in the speaking and writing performances of ESL speakers can further our understanding of their ultimate oral and written attainment in different communicative tasks and of how these tasks, which impose distinct cognitive demands, diverge from each other.

The LLAMA test battery (Meara, 2005), loosely based on the MLAT, frames aptitude for L2 learning in terms of rote and associative memory (LLAMA B), sound recognition (LLAMA D), phonemic coding (LLAMA E), and language analytic ability (LLAMA F). In the current study, only two components of the LLAMA test battery will be used: LLAMA B (designed to measure L2 speakers’ ability to learn vocabulary) and LLAMA F (designed to measure L2 speakers’ ability to learn grammar with intention and awareness).

In this study, adult L2 speakers’ speech and writing will be elicited via two picture-based narrative tasks: one for speaking and one for writing; however, the participants will be under different planning time conditions to perform each task. Both tasks will be analyzed via accuracy measures to tap into the domains of L2 speech and writing.

2.10 Research Questions and Hypotheses

Empirical and theoretical SLA studies indicate that (1) aptitude test batteries can predict L2 learners’ rate of learning and ultimate attainment in L2 morphosyntactic accuracy; (2) both oral and written language performances tap into L2 learners’ underlying explicit and implicit knowledge; (3) speaking and writing impose distinct cognitive processing loads; and (4) due to time constraints, speaking taps more into implicit knowledge, whereas, due to time availability, writing taps more into explicit knowledge. However, among the many investigations on language aptitude, the main concern has been its link with the L2 morphosyntax of learners’ oral productions in formal settings. Thus, by associating the dimensions of language aptitude with accuracy measures of ESL speakers’ oral and written performances, the current study is designed to shed light on their ultimate attainment in distinct communicative productions and on the interplay between explicit and implicit knowledge involved in both.

In light of the premises mentioned above, the current study aims at answering the following research questions:

1. To what extent does rote and associative memory (LLAMA B) predict L2 measures of lexical and morphosyntactic accuracy in speaking and writing?

2. To what extent does language analytic ability (LLAMA F) predict L2 measures of lexical and morphosyntactic accuracy in speaking and writing?

3. To what extent is the performance in lexical and morphosyntactic accuracy in speaking correlated with the one in writing?

4. To what extent does the correlation between performance in language aptitude and speaking differ from the strength of the correlation for writing?

This study will take an investigative approach to provide data addressing the probable relationship between language aptitude and accuracy in the ESL spoken and written productions of a group of speakers who speak English daily in naturalistic settings. In the existing SLA literature, Saito (2017) is the only study that approaches the relationship between language learning aptitude and lexical and morphosyntactic accuracy. Nevertheless, his study focuses only on L2 spontaneous speech production. It reveals that L2 learners’ aptitude scores in rote and associative memory ability tend to have a moderate association with lexical and morphological accuracy, whereas language analytic ability shows no significant association with grammatical accuracy. Therefore, the following hypotheses are formulated for research questions 1 and 2:

1. The relationship between rote and associative memory (LLAMA B) ability and lexical and morphosyntactic accuracy will be significant in speaking but not in writing;


2. The relationship between language analytic ability (LLAMA F) and lexical and morphosyntactic accuracy will not be significant in either speaking or writing;

In their investigation of the effects of careful and pressured planning on both oral and written performance, Ellis and Yuan (2005) show that when time is available, ESL speakers’ language is more accurate in writing than in speaking, and that pressured speaking offers less opportunity to monitor L2 output. Therefore, the following hypotheses are formulated for research questions 3 and 4:

3. The correlation between performance in lexical and morphosyntactic accuracy in speaking and that in writing will be small, as the participants are under different time constraints when performing the speaking and the writing task;

4. The strength of the correlation between performance in language aptitude and speaking will differ from that of the corresponding correlation for writing, as the participants have different amounts of time available to tap into L2 accuracy constituents.


3. Method

3.1 Participants

With typologically distinct L1s, 30 adult participants (12 males, 18 females) with ages ranging from 28 to 59 years old (M= 42.90, SD= 9.26) took part in this study. They represent 11 countries: Brazil, Canada, France, Germany, Greece, India, Macedonia, Russia, Spain, Switzerland, and The Netherlands.

All the participants live in The Netherlands, and 3 out of 30 (10%) have never lived abroad. The participants constitute a heterogeneous group and are distributed as follows: 5 Brazilians (1 with French and Portuguese as L1s), 1 Canadian (with French as L1), 1 French (with Russian and French as L1s), 2 Germans, 2 Greeks, 2 Indians, 2 Macedonians, 1 Russian, 2 Spaniards, 2 Swiss (1 with German, French and Swiss as L1s), and 10 Dutch.

The participants, with end-state English L2 performance, began acquiring English at ages ranging from 3 to 26 and received formal instruction in English for 7.83 years on average. Their educational level is quite high (3.3% have a Ph.D., 66.7% have a Master’s degree, 26.7% have finished graduate school, and 3.3% have a polytechnic degree), and they speak four languages on average.

Unlike Saito’s (2017) investigation, whose 50 Japanese learners’ use of English was highly limited outside of the classroom, 50% of the participants always speak ESL, 43.3% quite often, 3.3% sometimes, and 3.3% occasionally. They often speak it in public services (60%) and on trips (100%), as well as at home (43.3%), work (76.7%), and school (26.7%), and on average they travel abroad around 5.33 times per year, speaking English as a means of communication during their trips (see Appendix A for a complete overview of the participants’ profiles).

According to DIALANG, a diagnostic language assessment system for teenage and adult learners (Huhta et al., 2002), the participants showed high variability in scores, and their general English vocabulary level roughly corresponded to Common European Framework of Reference for Languages bands (Council of Europe, 2001) from B2 (upper-intermediate) to C2 (proficient user).

3.2 Materials

3.2.1 Speaking Task

One English speaking task (a picture narrative) was used to elicit participants’ speech, as described in De Jong et al. (2015). Designed as a role-play, the speaking task (see Appendix B) required the participants to produce descriptive and formal language: having witnessed a bike accident on the street, they had to describe to a judge in a courtroom what they saw. The task was presented on a computer screen, and the participants’ speech was recorded in Praat, free software for the scientific analysis of speech (Boersma & Weenink, 2017). The task consisted of several slides with specific information and pictures giving additional information. For this task, the participants had 60 s of preparation time and 120 s of speaking time, which was shown by a status bar at the bottom of the screen.

3.2.2 Writing Task

To elicit similar grammatical structures and vocabulary, the writing task (a picture narrative) (see Appendix C) mirrored the speaking task. The participants, who had witnessed a theft followed by an accident on the street, had to write a message for a contact form they encountered on the Dutch police website. To perform this task, the participants had to write the message on paper as if they were going to send it by email. Five photos in sequence depicting both the crime and the accident were provided; for better visualization, the photos were also presented on a computer screen to each participant. There was no time limit for this task; the participants had to write about 150 words, and it took them between 10 and 15 minutes to complete it.


3.2.3 Aptitude Test (LLAMA)

The LLAMA aptitude test (Meara, 2005) was initially developed as part of a research training program for MA students at the University of Wales Swansea. It was designed as a shorter, free, language-neutral test loosely based on the components that appeared in Carroll and Sapon’s (1959) MLAT.

The 2005 version of the LLAMA test consists of four sub-components, conventionally referred to as LLAMA B, LLAMA D, LLAMA E, and LLAMA F. It aims to measure four domains of L2 aptitude: vocabulary acquisition, sound recognition, sound-symbol correspondence, and grammatical inferencing, respectively.

For this study, the participants took LLAMA B first, followed by LLAMA F. The entire session took approximately 15 minutes.

3.2.3.1 LLAMA B

As a vocabulary learning module, LLAMA B assesses the participants’ ability to associate unfamiliar names with unfamiliar objects in a short period (similar to the MLAT’s paired-associates test). The participants were explicitly told about the aim of the test (i.e., vocabulary learning, followed by recall). In the initial phase, the participants viewed twenty pictures simultaneously on the computer screen. Clicking on a picture caused its name to be displayed. The participants had 2 minutes to examine all 20 pictures and learn their names; however, they could not take notes as they worked. As the program did not place any constraints on how the participants could do this, they could adopt any strategy to complete the task. In the testing phase, the program displayed the name of each of the 20 pictures, and the participants had to identify the picture matching each name by clicking on it. Five points were scored for each picture correctly identified. The system did not correct for guessing; random guessing would yield an expected score of 5 points. The entire test took about 5 minutes.


3.2.3.2 LLAMA F

LLAMA F is a grammatical inferencing test (similar to the MLAT’s grammatical sensitivity task). In the initial phase, the participants were given 20 pictures, each accompanied by a sentence written in an artificial language describing it. The participants were expected to work out how the sentences related to the pictures and thereby intuit some of the grammatical and morphological features of this artificial language. The participants had 5 minutes to explore this data set, and they could make notes. In the testing phase, they were presented with a new set of pictures with new elements incorporated. Two sentences accompanied each picture, and the participants had to indicate which sentence was correct; they would be able to do this if they had internalized the grammatical rules evidenced in the initial phase. Five points were awarded for a correct answer and five points deducted for an incorrect choice. The entire test took between 10 and 15 minutes.

3.2.4 DIALANG

Proposed by Huhta et al. (2002) with the financial support of the European Commission, DIALANG is a foreign language assessment system that diagnoses users’ language proficiency. Freely accessible via the internet, it offers tests in five language skills (reading, writing, listening, grammatical structures, and vocabulary) in fourteen European languages (Danish, Dutch, English, Finnish, French, German, Greek, Icelandic, Irish, Italian, Norwegian, Portuguese, Spanish, and Swedish), and its design follows the Common European Framework of Reference (CEFR) (Council of Europe, 2001) scale for languages (A1, A2, B1, B2, C1, or C2: A denotes the basic user level, B the independent user, and C the proficient one).

To avoid fatigue, the participants took only the English vocabulary test online. This test comprises 30 items in the form of multiple-choice, gap-filling, and short-answer questions, listed by sub-skills such as “word combination”, “word formation”, “semantic relations”, and “meaning”. After completion of the test, the DIALANG system provided the participants with feedback on their answers, informing them about their strengths and weaknesses. On average, each participant took between 15 and 20 minutes to complete it.

3.2.5 LEAP-Q

To elicit information about the participants’ linguistic profile, they filled in a shortened version of The Language Experience and Proficiency Questionnaire (LEAP-Q) (see Appendix D), which was developed by Marian and Kaushanskaya (2007).

3.2.6 Informed Consent

The participants dated and signed an informed consent (see Appendix E) stating that all information collected throughout this research would be used only for scientific purposes.

3.3 Procedure

Thirty participants were invited to participate in this investigation. All of them engaged in all tasks cooperatively.

To acquaint them with the purpose of this investigation, the participants were informed that the study concerned the acquisition and use of a second language from a cognitive perspective and sought answers to the question of why individuals differ in learning a second language. Furthermore, before each task they received an explanation of what they were supposed to do.

The participants completed all the tasks in the same order. They first took the English speaking task, proceeded to the aptitude subtests (LLAMA B and LLAMA F, respectively), and then engaged in the English writing task. After concluding this phase, the participants took the 30 vocabulary questions of the DIALANG test, filled in the language background questionnaire (LEAP-Q), and dated and signed the consent form.


All data collection sessions took place individually in a quiet room at each participant’s place, and the entire session took approximately 60 minutes per participant.

3.4 Material Preparation

After the data were collected, the 30 full-length recordings were transcribed, resulting in full transcripts that retained repetitions, repairs, and contractions (e.g., don’t). Next, all 30 full transcripts were cleaned by removing the repetitions and repairs and writing all contractions in full form (e.g., do not). This procedure resulted in 30 clean transcripts, which were used to obtain the total number of words (tokens). Likewise, any contractions (e.g., don’t) in the compositions collected were counted as two words (e.g., do not), and a word count was done manually to obtain the total number of words (tokens) of each participant’s written production.
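The cleaning and counting procedure described above was carried out by hand. Purely as an illustration of the token-counting convention (contractions counted as two words), the step could be sketched as follows; the contraction map is a hypothetical, abbreviated example, not the actual list used:

```python
import re

# Hypothetical, abbreviated contraction map; the actual expansion was done manually.
CONTRACTIONS = {"don't": "do not", "it's": "it is", "didn't": "did not"}

def count_tokens(text: str) -> int:
    """Expand contractions to their full forms, then count whitespace-delimited tokens."""
    for short, full in CONTRACTIONS.items():
        text = re.sub(re.escape(short), full, text, flags=re.IGNORECASE)
    return len(text.split())

# e.g., "He didn't stop" is counted as 4 tokens ("He did not stop").
```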

3.5 Calculating Accuracy Measures

Ellis and Barkhuizen (2005) suggest that “percentage of error-free clauses and errors per 100 words serve as general measures of accuracy” (p. 151). Skehan and Foster (1999) also claim that “a generalized measure of accuracy is more sensitive to detecting differences between experimental conditions” (p. 229). Additionally, Michel (2017) asserts that by adopting global accuracy measures (e.g., number of errors per 100 words), accuracy can be compared “over different languages, populations, and tasks” (p. 55). Thus, for the accuracy analysis in this study, the total number of lexical errors and the total number of morphosyntactic errors were counted in both the speaking and the writing task. Lexical accuracy was then operationalized as the percentage of lexical errors (number of lexical errors / number of words × 100), and morphosyntactic accuracy as the percentage of morphosyntactic errors (number of morphosyntactic errors / number of words × 100).
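Both measures reduce to the same simple formula, sketched here for clarity:

```python
def errors_per_100_words(n_errors: int, n_words: int) -> float:
    """Global accuracy measure: number of errors per 100 words (tokens)."""
    if n_words <= 0:
        raise ValueError("transcript must contain at least one word")
    return n_errors / n_words * 100
```

For example, 7 lexical errors in a 175-word narrative yield a rate of 4.0 lexical errors per 100 words.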


3.6 Analyzing Errors

To detect accuracy errors, the coding guideline proposed by Polio and Shea (2014) was followed. The researcher coded each error type in the guideline as either lexical (red) or morphosyntactic (green) (see Appendix F).

Error analysis to determine the number of errors included every error in lexis and morphosyntax, but excluded intonation or pronunciation errors in the speaking task and punctuation errors (e.g., comma errors) and non-severe spelling errors (e.g., capitalization errors) in the writing task. A severe punctuation error (e.g., “its” instead of “it’s”) was counted as a morphosyntactic error.

Lexical errors were defined as deviant choices involving prepositions, modals, pronouns, quantifier-noun agreement, articles, conjunctions, or any missing or extra word. Morphosyntactic errors were defined as choices that affected verb phrases, relative clauses, passive voice, subject-verb agreement, comparative/superlative formation, word order, gerund/infinitive, and genitive.

Apart from Polio and Shea’s (2014) error taxonomy, some other error categorizations were considered. Repeated errors (i.e., recurring and consistent forms based on incorrect assumptions) (e.g., the use of the definite article “the” before street names) were counted as only one error. Additionally, reference errors (i.e., errors that could be inferred from the context) (e.g., “the light was green” instead of “the traffic light was green”) were not counted.

The researcher annotated all 30 English speech transcriptions together with the 30 writing tasks. All participants’ lexical errors were marked in red and the morphosyntactic ones in green in all clean transcriptions (see Appendix G) and writing tasks (see Appendix H).

When doubts arose regarding a probable error, the researcher relied on the British National Corpus (BNC) (Davies, 2004-), the Corpus of Contemporary American English (COCA) (Davies, 2008-), or the online version of Dictionary.com (Kariger & Fierro, 1995) to decide whether it constituted an error or not. In cases where neither the BNC, COCA, nor Dictionary.com elucidated what decision to make, the researcher consulted the supervisor of this study for advice. This occurred in roughly 1% of cases.

3.7 Data Analysis

All participants’ LLAMA B and F results, together with their corresponding numbers of lexical and morphosyntactic errors per 100 words in speaking and writing, were entered into a Microsoft Excel (Microsoft Corporation, 2013) spreadsheet (see Appendix I).

To answer research question 1, Spearman correlation analyses were performed between the participants’ scores on the rote and associative memory test (LLAMA B) and their corresponding numbers of lexical and morphosyntactic errors per 100 words in speaking and writing. This procedure yielded two correlations for speaking and two for writing. The same procedure was followed for research question 2: the participants’ scores on the language analytic ability test (LLAMA F) were correlated with their numbers of lexical and morphosyntactic errors per 100 words in speaking and writing, again yielding two correlations for speaking and two for writing.
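Spearman’s rho is simply the Pearson correlation computed on rank-transformed data, which is what makes it robust to non-normal distributions such as error rates. A stdlib-only sketch of the computation (for illustration; the study’s analyses were run on the full data set, not these hypothetical vectors):

```python
from statistics import mean

def rank(values):
    """Assign 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the tied positions, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho = Pearson correlation of the rank-transformed data."""
    rx, ry = rank(x), rank(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical aptitude scores and error rates for 6 learners
llama_b = [35, 50, 20, 65, 40, 30]
lex_errors_writing = [2.1, 0.8, 3.0, 0.4, 1.5, 2.5]
rho = spearman(llama_b, lex_errors_writing)  # negative: higher aptitude, fewer errors
```

Because only ranks enter the computation, outliers in either variable cannot dominate the coefficient, which suits the small sample (N = 30) used here.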

To answer research question 3, Spearman correlation analyses were also conducted: the participants’ lexical errors per 100 words in speaking were correlated with those in writing, and their morphosyntactic errors per 100 words in speaking were correlated with those in writing.

Research question 4 compared the strength of the correlation between participants’ language aptitude and speaking with the strength of the correlation between language aptitude and writing. The significance of the difference between these correlations was tested using Fisher’s r-to-z transformation (Eid et al., 2011), taking into account the relevant correlation between speaking and writing. The correlation coefficients were entered into Psychometrica (W. Lenhard & A. Lenhard, 2014) to test the significance of the difference. Because all correlations were based on the same sample (N = 30), the data were dependent, which increases the power of the significance test.
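The study delegated this test to Psychometrica, but the underlying logic can be illustrated. One standard way to compare two dependent correlations that share a variable (here, aptitude correlated with speaking vs. with writing) is Meng, Rosenthal, and Rubin’s (1992) z test, which Fisher-transforms both coefficients and adjusts for the speaking-writing correlation. The sketch below is an illustration of that approach, not necessarily the exact formula Psychometrica applies, and the example values are hypothetical:

```python
from math import atanh, sqrt
from statistics import NormalDist

def compare_dependent_correlations(r_xy, r_xz, r_yz, n):
    """Two-tailed z test for the difference between two dependent
    correlations sharing variable x (Meng, Rosenthal & Rubin, 1992).
    Here x could be an aptitude score, y a speaking accuracy measure,
    and z a writing accuracy measure."""
    rm2 = (r_xy ** 2 + r_xz ** 2) / 2           # mean of the squared correlations
    f = min((1 - r_yz) / (2 * (1 - rm2)), 1.0)  # f is capped at 1
    h = (1 - f * rm2) / (1 - rm2)
    z = (atanh(r_xy) - atanh(r_xz)) * sqrt((n - 3) / (2 * (1 - r_yz) * h))
    p = 2 * (1 - NormalDist().cdf(abs(z)))      # two-tailed p-value
    return z, p

# Hypothetical: aptitude-writing r = -.39, aptitude-speaking r = -.17,
# speaking-writing r = .50, N = 30
z, p = compare_dependent_correlations(-0.39, -0.17, 0.50, 30)
```

The (n - 3) term comes from the sampling variance of the Fisher-transformed coefficient, and the (1 - r_yz) term reflects that strongly correlated outcome measures make the two aptitude correlations less independent.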


4. Results

To interpret the effect sizes in a more precise, field-specific way, the current study adopted Plonsky and Oswald’s (2014) empirically based benchmarks for r values, under which correlation coefficients of .25 are considered small, .40 medium, and .60 large.
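These benchmarks amount to a simple threshold check on the absolute value of r. A sketch (the cut-offs follow Plonsky and Oswald, 2014; the label "below small" for coefficients under .25 is this sketch’s own convention):

```python
def effect_size_label(r):
    """Classify |r| using Plonsky and Oswald's (2014) L2-research benchmarks."""
    r = abs(r)
    if r >= 0.60:
        return "large"
    if r >= 0.40:
        return "medium"
    if r >= 0.25:
        return "small"
    return "below small"

effect_size_label(0.44)   # -> "medium"
effect_size_label(-0.39)  # -> "small"
```

Taking the absolute value matters because accuracy was operationalized as error counts, so theoretically expected aptitude-accuracy relations surface as negative coefficients.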

As for statistical significance, the current study adopted the alpha level conventionally used in the social sciences: correlations with p-values below .05 (p < .05) were considered significant.

Descriptive Statistics for LLAMA Subtests Scores and Accuracies

Table 1 below summarizes the 30 ESL speakers’ language aptitude scores for the two LLAMA subtests (LLAMA B and F) along with the accuracy measures for both speaking and writing performances.

Table 1
Descriptive Statistics for LLAMA Subtest Scores and Performances

                                     M       SD      Min     Max
Subtest
  LLAMA B (100 points)              36.7    15.9      5       65
  LLAMA F (100 points)              41.0    27.1      0      100

Performance (accuracy per 100 words)
  Speaking   LE/Words                1.5     1.04     0      3.90
             MSE/Words               1.1     0.82     0      2.60
  Writing    LE/Words                1.8     1.44     0      4.69
             MSE/Words               0.8     0.79     0      2.84

Note. LE = lexical errors; MSE = morphosyntactic errors.


The participants attained relatively higher scores on the LLAMA F (language analytic ability; M = 41.0) than on the LLAMA B (rote and associative memory; M = 36.7). According to the LLAMA manual (Meara, 2005), scores between 25 and 45 on LLAMA B and F are average; the participants’ overall performance on both subtests was therefore moderate. In addition, the wide gap between the minimum and maximum scores on both tests indicates high variability among the participants.

As for accuracy, the participants produced relatively more errors per 100 words in lexis (M = 1.5) than in morphosyntax (M = 1.1) in speaking. The same pattern held in writing (lexis: M = 1.8; morphosyntax: M = 0.8). The gap between the minimum and maximum values for both accuracy measures in both L2 performances likewise points to relatively high variability among the participants.

Vocabulary and Grammar Learning Aptitude-Accuracy Links

Spearman correlation analyses were performed to examine how strongly participants’ rote and associative ability was related to the accuracy constituents in both speaking and writing. As shown in Table 2, the correlation coefficients fell below the small benchmark for lexical and morphosyntactic accuracy in speaking (rs = -.17 and rs = .08, respectively), whereas they fell within the small-to-medium range for lexical and morphosyntactic accuracy in writing (rs = -.39 and rs = .44). The L2 speakers’ LLAMA B scores were significantly associated with lexical accuracy (p = .03) and morphosyntactic accuracy (p = .01) in writing: the higher the LLAMA B score, the lower the number of lexical errors, whereas the reverse trend held for the association between LLAMA B scores and morphosyntactic errors. No significant association was found between LLAMA B scores and accuracy in speaking.
