An investigation into fluency among L2 users in an oral examination setting

Academic year: 2021

AN INVESTIGATION INTO FLUENCY AMONG L2 USERS IN AN ORAL EXAMINATION SETTING

Word count: 16,914

Hélène t’Kindt

Student number: 01504992

Supervisor(s): Prof. David Chan

A dissertation submitted to Ghent University in partial fulfillment of the requirements for

the degree of Master of Arts in Multilingual Communication

Copyright Statement

The author and the supervisor give permission for this study as a whole to be made available for personal use. Any other use is subject to copyright restrictions, in particular with regard to the obligation to explicitly mention the source when citing data from this study.

Acknowledgements

First and foremost, I would like to express my gratitude to my thesis advisor Prof. David Chan of the Faculty of Arts and Philosophy at Ghent University. He was always ready to help whenever I encountered a problem or had a question about my research or writing. His enthusiasm, positivity and devotion throughout the entire process motivated me to believe in myself and to continue expanding my own boundaries.

Secondly, I would like to thank Marie Jacobs, the second reader of this thesis, for her valuable comments, which will allow me to further develop my linguistic knowledge and competences.

Finally, I would like to thank my dearest parents and friend Liesel Maertens for standing by my side throughout my five years of study and for providing me with continuous moral support and motivation. Their kindness, help and positivity earn my eternal gratitude and appreciation.

Preamble – the impact of the Corona crisis

For this study, the notion of L2 fluency was investigated. For that purpose, the study was subdivided into a theoretical section, in which a definition of fluency as a construct was proposed together with the most effective means of measuring L2 fluency, and a more practical section, in which the degree of L2 fluency of 26 L2 students in their first year of Applied Linguistics was analysed. For the analysis, the variables of speed fluency, repair fluency and breakdown fluency were measured. In the normal course of events, the variables would have been measured three times for each recording: once for the longest monologic task performance, once for the longest dialogic task performance and once for the overall recording. The variables of repair fluency (i.e. for the monologic, dialogic and overall task performance) and breakdown fluency (i.e. for the monologic and dialogic task performance) would have been measured manually, whereas the variables of breakdown fluency (i.e. for the overall recording) and speed fluency would have been measured by means of the software program Praat.

The Corona crisis did not have an impact on the data collection, but rather on the measurement process. When measuring the variables of speed fluency and breakdown fluency, some technical problems occurred with the software program Praat. As it was impossible to solve the problems via an online meeting (neither with my supervisor Prof. David Chan nor with Prof. Ellen Simoens, who is familiar with Praat), the software program could not be used. Therefore, we decided to measure the variables of speed fluency and breakdown fluency (i.e. for the overall recording) manually. As a consequence, some changes regarding the corpus size and the measurement process were made. First of all, as measuring breakdown fluency and speed fluency for the overall recording was no longer possible due to time constraints, we decided to measure and analyse the variables only twice per recording instead of three times, i.e. once for the longest monologic and once for the longest dialogic task performance. Secondly, as it was impossible to analyse all 26 recordings in terms of speed fluency due to time constraints, only 16 of the 26 recordings were selected (i.e. 5 recordings each for the highest and intermediate proficiency levels and 6 for the lowest proficiency level). In order to measure speed fluency manually, the 16 recordings had to be automatically transcribed. As I did not have the appropriate software to transcribe the recordings myself, Prof. David Chan transcribed them, after which I post-edited them.

This preamble was written in agreement between the student and the supervisor and was approved by both of them.

Abstract

Within the field of applied linguistics and second language acquisition (SLA), a considerable amount of research has already been dedicated to the concept of L2 fluency. However, despite the importance of this concept, previous research has left several important gaps concerning the notion of L2 fluency. This study aimed to reduce these gaps and to provide new insights into the matter. For that purpose, the study first discussed the definition of fluency as a construct, followed by the introduction and comparison of the most effective means of measuring L2 fluency. Secondly, the degree of fluency of 26 L2 students in their first year of Applied Linguistics was analysed. The analysis was based on a task performance that was both monologic and dialogic, i.e. in an oral examination context. It aimed to discuss the correlation between fluency and overall oral proficiency on the one hand and the fluency contrasts in terms of task performances (i.e. monologic vs dialogic) on the other. The hypothesis that there would be a positive correlation between fluency and overall oral proficiency was partly contradicted, as the analysis revealed opposing results. The hypothesis that students would, in general, be more fluent when engaging in a dialogue than when performing a monologue, however, was confirmed, although different patterns occurred when comparing the different proficiency levels.

List of tables and figures

1. General Procedure for Measuring SLA (Norris and Ortega, 2003)

2. Methodology: Distinction between Monologic and Dialogic Task Performances

3. Results: Speed Fluency: Number of Syllables/Second (Monologic vs Dialogic)

4. Results: Speed Fluency: Number of Syllables/Second (Proficiency Levels)

5. Results: Repair Fluency: Number of Self-Repetitions, Self-Corrections, Filled Pauses & False Starts/Minute (Monologic vs Dialogic)

6. Results: Repair Fluency: Number of Self-Repetitions, Self-Corrections, Filled Pauses & False Starts/Minute (Proficiency Levels)

7. Results: Breakdown Fluency: Number of Pauses/Minute (Monologic vs Dialogic)

8. Results: Breakdown Fluency: Length of Pauses/Minute (in Seconds) (Monologic vs Dialogic)

9. Results: Breakdown Fluency: Number of Pauses/Minute (Proficiency Levels)

10. Results: Breakdown Fluency: Length of Pauses/Minute (in Seconds) (Proficiency Levels)

Table of Contents

1 Introduction

2 Theoretical background

2.1 Defining fluency

2.2 Measuring fluency

2.3 Monologic and dialogic task performances

2.4 The examination context with treatment of FLA – Foreign Language Anxiety

3 Methodology

3.1 Data collection

3.2 Conceptual stage

3.2.1 Construct definition

3.2.2 Behaviour identification

3.2.3 Task specification

3.3 Procedural stage

3.4 Reliability and measurement error

4 Results

4.1 Speed fluency

4.2 Repair fluency

4.3 Breakdown fluency

5 Discussion of the results

6 Conclusion

6.1 Results

6.2 Limitations of the study

6.3 Suggestions for further research

7 Bibliography

8 Appendix

8.1 Results Excel File

8.1.1 List of abbreviations

8.1.2 Low Proficiency

8.1.3 Intermediate Proficiency

8.1.4 High Proficiency

1 INTRODUCTION

Within the field of applied linguistics and second language acquisition (SLA), a considerable amount of research has already been dedicated to the concept of second language (L2) speaking proficiency and ways in which to measure it (Housen and Kuiken, 2009). In general, L2 speaking proficiency can be regarded as an ambiguous concept, as its componential structure has given rise to various interpretations. However, many researchers state that its nature is multi-componential, comprising the three principal notions of complexity, accuracy and fluency (Skehan 1989; Ellis 2003, 2008; Ellis and Barkhuizen, 2005). The origins of this triad can be found in the 1980s, when a research project regarding L2 proficiency in classroom contexts mentioned the distinction between accuracy and fluency for the first time (Housen et al., 2009). The former was later defined as a notion that ‘focuses on linguistic form and on the controlled production of grammatically correct linguistic structures in the L2’ (Housen et al., 2009, p. 1), whereas the latter was defined as one that ‘fosters spontaneous oral L2 production’ (Housen et al., 2009, p. 1). Skehan (1989) eventually added the notion of complexity to the triad, which was later defined as ‘[t]he extent to which the language produced in performing a task is elaborate and varied’ (Ellis 2003, p. 340).

The notion of fluency is considered an essential component of language proficiency and an important tool to indicate and describe L2 development (de Jong, Steinel, Florijn, Schoonen, & Hulstijn, 2012; Housen et al., 2009; Kahng, 2014; Skehan, 2014). However, despite the importance of this concept, previous research has left three important gaps concerning the notion of fluency (Tavakoli, 2016). First of all, although extensive research has already been conducted in this area, no univocal definition of fluency as a construct has yet been proposed (Kahng, 2014; Prefontaine, 2013; Housen et al., 2009). Secondly, previous research shows that measuring fluency often leads to mixed results: if the interactive aspects of fluency are included in the measurement, analyses deliver a different outcome compared to when they are excluded (Kormos, 2006; Skehan, 2014). Thirdly, as a consequence of these mixed results, one cannot be certain that fluency was measured in a reliable and valid way in previous research (Housen et al., 2009; Housen et al., 2012). Moreover, research into the concept of fluency is mainly conducted using monologic rather than dialogic task performances (Tavakoli, 2016). This could be explained by the fact that current studies concerning the measurement of fluency rely on Levelt’s (1989) three-stage model of speech production, which comprises Conceptualization, Formulation and Articulation. Levelt’s model uses monologic speech as a starting point for language production and processing and does not cover dialogic speech. Although further research has been conducted to minimise these gaps and to guarantee more precise and objective results, critical analyses still illustrate the inadequacy of this research area (Housen et al., 2009). Therefore, further research into L2 fluency is recommended.

This study aims to reduce the gaps previously mentioned by providing further research into the concept of L2 fluency and the ways in which it can be measured, as well as to analyse L2 fluency and to propose new insights into the matter. For that reason, the study consists, on the one hand, of a theoretical section, which will first discuss the definition of fluency as a construct in further detail, followed by the introduction and comparison of the most effective means of measuring L2 fluency. On the other hand, it consists of a practical section that will link theory to practice by analysing the degree of fluency of 26 L2 students in their first year of Applied Linguistics. The analysis is based on a task performance that is both monologic and dialogic, including the interactive aspects of dialogic conversation, in order to limit the impact on the results to a minimum and to provide a reliable and valid conclusion. The following research questions have been formulated:

1) What is fluency?

2) How can fluency be measured / What are the most effective means of measuring fluency?

3) To what extent is there a correlation between fluency and overall oral proficiency?

4) What are the fluency contrasts in terms of task performance (monologic vs. dialogic)?

2 THEORETICAL BACKGROUND

The theoretical part of this study aims to provide an answer to the first two research questions, namely:

1. What is fluency?

2. How can fluency be measured / What are the most effective means of measuring fluency?

2.1 Defining fluency

Whereas the definition of accuracy as a synonym for ‘correctness’ (Housen et al., 2009, p. 3) or as a reference to ‘the degree of deviancy from a particular norm’ (Housen et al., 2009, p. 3) is generally accepted, the definition of fluency as a construct, together with that of complexity, is less clear (Housen et al., 2009; Hieke, 1985). A variety of interpretations and theories coexist which often conflict with each other, and as such, this topic remains controversial and often challenging to address (Housen et al., 2009; Fulcher, s.d.).

With regard to the notion of complexity, various interpretations can be made. First, a distinction can be made between task complexity on the one hand and L2 complexity on the other (Housen et al., 2009). The former refers to the characteristics and difficulties of a language task, whereas the latter refers to those of spoken language and L2 proficiency (Housen et al., 2009). Furthermore, researchers such as Dekeyser (2008), Housen, Pierard & Van Daele (2005) and Williams & Evans (1998) state that L2 complexity can be further subdivided into cognitive complexity and linguistic complexity. Cognitive complexity refers to ‘the relative difficulty with which language features are processed in L2 performance and acquisition’ (Housen et al., 2009, p. 5), in which the degree of difficulty is determined both by subjective factors (such as the learner’s motivation) and by more objective factors (such as the inherent linguistic complexity of a language). Linguistic complexity can, in turn, be interpreted as a dynamic characteristic as well as a more stable characteristic. When interpreting linguistic complexity as a dynamic characteristic, one refers to the ‘size, elaborateness, richness and diversity of the learner’s linguistic L2 system’ (Housen et al., 2009, p. 5). When interpreting linguistic complexity as a stable characteristic, one refers to what is called ‘structural complexity’, which can be further broken down into formal and functional complexity (Williams & Evans, 1998 & Housen et al., 2005 in Housen et al., 2009). Because of these various co-existing interpretations of complexity as a construct, complexity is generally considered the most ambiguous and complex notion of the three dimensions of L2 speaking proficiency (Housen et al., 2009).

As with the notion of complexity, the notion of fluency as a construct also remains difficult to define. However, according to many studies (e.g. Housen et al., 2009; Tavakoli, 2016; Tavakoli & Skehan, 2005), it is generally agreed that the notion of fluency as a construct is multi-faceted and multi-componential. Nevertheless, this multi-faceted character is often interpreted in different ways, resulting in several different theories.

Fillmore’s (1979) definition generally remains one of the oldest and most widely accepted definitions still adopted by many researchers today (such as Lennon, 1990; Tavakoli, 2016 & Fulcher, s.d.). He defined fluency as: ‘the ability to talk at length with few pauses; the ability to fill time with talk; the ability to talk in coherent and semantically dense sentences; the ability to have appropriate things to say in a wide range of contexts; and the ability to be creative and imaginative in the language use’ (Fillmore, 1979, p. 51). In this definition, fluency’s multi-faceted and multi-componential structure is once again emphasized (Tavakoli, 2016).

In 1990, however, Lennon focused more on the performative dimension of fluency as a construct, introducing different scales of categorisation. He made a distinction between fluency used in a broad sense and fluency used in a narrow sense. The former is used as an indicator that represents overall oral proficiency. Lennon specifies: ‘in this sense, fluent represents the highest point on a scale that measures spoken command of a foreign language’ (Lennon, 1990, p. 389). The broad sense of fluency as a construct is often used when evaluating a person’s foreign language ability, in which a person’s overall proficiency is marked by one of the following parameters: ‘fair’, ‘good’ or ‘fluent’ (Lennon, 1990, p. 389). Fluency in a narrow sense, however, is used as an indicator that represents only one single component of overall oral proficiency. The narrow sense is most often used by teachers, who make a distinction between fluency on the one hand and other indicators of oral proficiency such as pronunciation and idiomaticity on the other.

Housen et al. (2009), Skehan (2003) and Tavakoli et al. (2005) subdivided fluency as a construct along its three main characteristics, i.e. into speed fluency, breakdown fluency and repair fluency. Speed fluency includes for example density of delivery, whereas breakdown fluency and repair fluency respectively include for example the number of pauses in speech and the number of false starts (Housen et al., 2009). However, Freed (2000) interpreted the definition of fluency as a construct in a different way, proposing a broader subdivision. Following her theory, fluency can be divided into a range of different characteristics of speech. As a consequence, a distinction can be made between psychological manifestations (i.e. the underlying cognitive processes with regard to fluency) at the one end and real speech production at the other (Freed, 2000).

Segalowitz (2000) introduced a similar distinction, referring to the ‘cognitive aspects’ on the one hand and the ‘performance aspects’ on the other. The former is defined as ‘the efficiency of the operation of the cognitive mechanisms underlying performance’, whereas the latter is defined as ‘the observable speech, fluidity and accuracy of the original performance’ (Segalowitz, 2000 in Tavakoli, 2016). In 2010, Segalowitz proposed another definition, adopting different terminology and adding a third element to his original subdivision. He refers to cognitive, utterance and perceived fluency. Cognitive fluency refers to the cognitive processes during the production of speech, i.e. the way the speaker plans and executes his or her speech. Utterance fluency refers to the aspects of fluency that can be measured while speaking (e.g. speech rate), whereas perceived fluency refers to “the inferences listeners make about speakers’ cognitive fluency based on their perceptions of their utterance fluency” (Segalowitz, 2010, p. 165). In the next paragraphs, the relationship between utterance fluency and perceived fluency, as well as that between utterance fluency and cognitive fluency, will briefly be discussed, as those correlations are important to understand the notion of fluency from the point of view of the listener (i.e. the perceiver) as well as of the speaker (de Jong, Schoonen & Steinel, 2012).

Various studies, such as Derwing, Rossiter, Munro & Thomson (2004) and Rossiter (2009), have indicated that there is a strong correlation between utterance fluency on the one hand and perceived fluency on the other. In both studies, for example, variables such as speech rate and pausing were recurring variables of utterance fluency that determined the rater’s perceived fluency, irrespective of the type of rater. In 1995, Freed expanded this theory by stating that other aspects, such as the use of vocabulary, were also related to ratings of utterance fluency and therefore function as a good predictor of perceived fluency.

There is also a correlation between utterance fluency on the one hand and cognitive fluency on the other. However, according to de Jong et al. (2012) this correlation is more difficult to measure, as the variables of L2 cognitive fluency, i.e. the way in which the speaker prepares his or her speech, are more difficult to identify. For this purpose, de Jong et al. (2012) introduced a theory, in which the correlation is measured ‘within speakers’ over time (de Jong et al., 2012, p. 4). Following this theory, a correlation between utterance fluency and cognitive fluency can be established if a specific aspect of utterance fluency develops over time, as, in that case, the development can be linked to a development of cognitive fluency. If an aspect does not develop over time, even though the overall proficiency has developed, no correlation can be established, as, in that case, that aspect can be linked to other factors such as personal speaking style. From their study, de Jong et al. (2012) were able to deduce that articulation rate, as an aspect of utterance fluency, was the strongest indicator of cognitive fluency and of the correlation between cognitive and utterance fluency.

To summarise, fluency is an ambiguous concept to define. Even though fluency is generally accepted as a multi-faceted and multi-componential construct, a variety of interpretations and theories coexist. In 1979, Fillmore introduced one of the oldest and most widely accepted definitions, which is still adopted by many researchers today (such as Lennon, 1990; Tavakoli, 2016 & Fulcher, s.d.). He defined fluency as ‘the ability to talk at length with few pauses; the ability to fill time with talk; the ability to talk in coherent and semantically dense sentences; the ability to have appropriate things to say in a wide range of contexts; and the ability to be creative and imaginative in the language use’ (Fillmore, 1979, p. 51). In 1990, however, Lennon focused more on the performative dimension of fluency as a construct, introducing different scales of categorisation. He made a distinction between fluency used in a broad sense and fluency used in a narrow sense. In 2000, Segalowitz and Freed distinguished fluency according to its cognitive and performative aspects, whereas Housen et al. (2009), Skehan (2003) and Tavakoli et al. (2005) defined and distinguished fluency along its three main characteristics, i.e. into speed fluency, breakdown fluency and repair fluency.

2.2 Measuring fluency

Measuring procedures for L2 fluency are as controversial as the definition of fluency as a construct. Norris & Ortega (2003), for example, state that various theories on measuring L2 fluency have emerged from the development of SLA as a research field. These theories have, in turn, resulted in ‘conflicting views about the ‘best’ way to gather data and/or the ‘correct’ questions to be asked’ (Gass, 1988, p. 199). Tavakoli and Wright (2016) share that view, stating that fluency remains a complex and difficult research topic, not only regarding its definition as a construct but also in terms of its measuring procedures.

As stated above, several theories have been proposed that each suggest the best way to measure L2 fluency. This section will first look at a general procedure for measuring SLA, after which five different theories will be discussed regarding the specific traits of L2 fluency that should be measured in order to provide the most reliable outcome.

In order to measure SLA in a reliable way, systematic means of measurement should be adopted, as, in research where this is not the case, research findings will be reported which ‘lack interpretability and generalizability and which do not contribute to the accumulation of knowledge’ (Norris et al., 2003). Wright (1999) further adds that research findings of that kind will be easily contested in replication studies. Therefore, Norris et al. (2003) introduced a general procedure for measuring SLA (see figure 1.1), based on the discussions of Bennett (1999), Messick (1989; 1994), Mislevy (1994; 1995) and Mislevy, Steinberg, Almond, Haertel & Penuel (s.d.). According to them, research measurement consists of several ‘interrelated but distinguishable stages’ (Norris et al., 2003, p. 719), which can be grouped under two main stages: the conceptual stage (the first stage) and the procedural stage (the second stage). During the conceptual stage, researchers conceptualise and define the intended construct interpretations, in order to decide on the appropriate measures. This stage can be subdivided into the three sub-stages of construct definition (i.e. the explicit definition of a construct, together with its theoretical assumptions), behaviour identification (i.e. the identification of the construct’s behaviours that will be analysed) and task specification (i.e. the selection of tasks or situations that will be used to elicit those behaviours). During the procedural stage, researchers create formal procedural methods to process the outcomes of the construct’s behaviours. This stage can be subdivided into the three sub-stages of behaviour elicitation, observation scoring and data analysis. During behaviour elicitation, researchers elicit, observe and record the behavioural data on the one hand and monitor the possible influence of other variables on the behavioural data on the other. During observation scoring, scores are assigned to the meaningful qualities that were found in the participant’s behaviour during the task. Finally, during data analysis, the given scores are analysed so that they can function as a valid foundation for further interpretation. To that end, summaries and comparisons are made and the hypotheses put forward in the conceptual stage are evaluated.

In their theories, Wolfe-Quintero, Inagaki & Kim (1998) and Pallotti (2009) focus more on the type of variables of L2 fluency that should be measured. According to the former, the best way to measure L2 fluency is to analyse variables that ‘clearly show variance among subjects, both over time and across tasks, correlating with other equally varying proficiency measures’ (in Pallotti, 2009, p. 2). However, Pallotti (2009) opposes the focus on variance, arguing that constants and similarities are as important as variations and differences. As a consequence, he states that a wide range of variables should be measured, as variables that show constants and similarities can also be scientifically valid. If, for example, the results of a specific measure do not show any variation over time, it may simply mean that that specific trait of the construct does not vary. Pallotti (2009) reinforces this by stating that there is only one condition for variables to be valid: they have to represent their underlying construct.

Skehan (2003) and Tavakoli et al. (2005), in turn, adopt a different approach, stating that, in order to provide the most reliable results, the specific variables have to represent a mixture of traits belonging to the main sub-divisions of L2 fluency, namely speed fluency, repair fluency and breakdown fluency (in Tavakoli 2016). Baker-Smemoe et al. (2014), Witton-Davies (2014) and Mora & Valls-Ferrer (2012) introduced the specific variables belonging to these sub-divisions. According to Baker-Smemoe et al. (2014), the following variables are most commonly used to measure speed fluency: the number of syllables per second, the number of pruned syllables per second, the number of runs or turns and the mean length of run in syllables. To measure repair fluency, it is the number of hesitations, false starts, and filled pauses that are commonly used (Baker-Smemoe et al., 2014). Finally, Baker-Smemoe et al. (2014) stated that the number of pauses and the length of pauses are the most reliable variables in order to measure breakdown fluency. However, Witton-Davies (2014) and Mora and Valls-Ferrer (2012) expanded this theory, also adding pause location, articulation rates and phonation time ratio to the categorisation.

This study will focus on the measures proposed by Baker-Smemoe et al. (2014), which will be further explained in the methodology section.
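
As a rough illustration of how the rate-based measures named above reduce to simple arithmetic, the following Python sketch derives one speed measure, one repair measure and two breakdown measures from hand-counted values. The `Sample` record, its field names and the helper functions are hypothetical and introduced here for illustration only; they do not reproduce the procedure of Baker-Smemoe et al. (2014) or the one used in this study, and taking the total task duration as the time base is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Sample:
    """Hand-counted values for one task performance (hypothetical layout)."""
    syllables: int                 # total number of syllables produced
    duration_s: float              # assumed time base: total task duration in seconds
    pause_lengths_s: List[float]   # length of each silent pause in seconds
    repairs: int                   # self-repetitions + self-corrections
                                   # + filled pauses + false starts


def per_minute(amount: float, duration_s: float) -> float:
    """Normalise a count or a total length to a per-minute rate."""
    return amount / (duration_s / 60.0)


def fluency_profile(sample: Sample) -> Dict[str, float]:
    """Return speed, repair and breakdown measures for one sample."""
    return {
        # speed fluency
        "syllables_per_second": sample.syllables / sample.duration_s,
        # repair fluency
        "repairs_per_minute": per_minute(sample.repairs, sample.duration_s),
        # breakdown fluency: number of pauses and total pause length
        "pauses_per_minute": per_minute(len(sample.pause_lengths_s),
                                        sample.duration_s),
        "pause_seconds_per_minute": per_minute(sum(sample.pause_lengths_s),
                                               sample.duration_s),
    }


example = Sample(syllables=360, duration_s=120.0,
                 pause_lengths_s=[0.5, 1.0, 1.5, 1.0], repairs=6)
profile = fluency_profile(example)
```

For this invented two-minute performance with 360 syllables, six repair phenomena and four pauses totalling four seconds, the sketch yields 3.0 syllables per second, 3.0 repairs per minute, 2.0 pauses per minute and 2.0 seconds of pausing per minute.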

Another important issue with regard to fluency measurement is that of reliability. According to Norris et al. (2003), reliability can be defined as a feature that ‘reflects the extent to which a measure leads to consistent interpretations about a particular construct on each measurement occasion’ (Norris et al., 2003, p. 740). Previous research has shown that the reliability of fluency measurement can be questioned, as its results can be heavily influenced by various factors. For example, Tavakoli (2016) has shown that the type of measures researchers include in their measurement influences its outcome. When specific dialogue-only measures, such as between-turn pauses, are excluded from the measurement, the score for speech rate, for example, is higher than when they are included (Tavakoli, 2016). Another factor that directly influences the outcome of the measurement is the idiosyncrasy of the participant, i.e. the participant’s characteristics, such as interest or motivation (Norris et al., 2003). In his theory, Bachman (1990) states that the results of fluency measurement can also be influenced by the internal consistency of the test taker’s performances. He specifies internal consistency as a concept that is ‘concerned with how consistent test takers’ performances on the different parts of the test are with each other’ (Bachman, 1990, p. 172). The test taker’s consistency can be influenced by the inconsistencies of the type of test method that is used. In order to clarify this, he introduces the following example: ‘Performance on the parts of a reading comprehension test, for example, might be inconsistent if passages are of differing lengths and vary in terms of their syntactic, lexical, and organizational complexity, or involve different topics’ (Bachman, 1990, p. 172). As a consequence, he states that, if the inconsistencies of a particular test method influence the consistency of the speaker’s language performance, the type of test method is not adequate for producing reliable results regarding the speaker’s language performance. Therefore, in order to minimise the degree of measurement error, it is important that the internal consistency of the test taker’s performances or the test method in question is measured and taken into account. As a possible method to measure this, Bachman (1990) introduced the split-half method. Following this method, the test in question is divided into two parallel parts that are each attributed a score. Then, in order to determine the test taker’s consistency, the scores of the two parts are compared. When adopting this method, the equivalence of both parts is of great significance, i.e. both parts should have equal means and variances. In many cases, however, the equivalence of both parts cannot be guaranteed, as the questions of classic tests usually become more difficult towards the end (Bachman, 1990). A solution for this problem could be to assign items to halves at random. If this were not possible either, the test should then be divided into halves ‘in such a way as to maximize their equivalence and their independence’ (Bachman, 1990, p. 174).
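
The split-half procedure described above can be sketched in a few lines of Python. The sketch assigns items to halves at random and compares the two half scores across test takers by means of a Pearson correlation. Using a correlation for the comparison, as well as the function names and the data layout (one list of item scores per test taker), are assumptions made for illustration and are not taken from Bachman (1990).

```python
import random
from typing import List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equally long score lists.

    Raises ZeroDivisionError if either list has no variance.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def split_half_correlation(item_scores: List[List[float]], seed: int = 0) -> float:
    """Compare two randomly assigned test halves across test takers.

    item_scores holds one list of item scores per test taker
    (hypothetical layout); all lists must have the same length.
    """
    n_items = len(item_scores[0])
    order = list(range(n_items))
    random.Random(seed).shuffle(order)        # assign items to halves at random
    half_a, half_b = order[: n_items // 2], order[n_items // 2:]
    totals_a = [sum(row[i] for i in half_a) for row in item_scores]
    totals_b = [sum(row[i] for i in half_b) for row in item_scores]
    return pearson(totals_a, totals_b)


# Invented scores for three test takers on a four-item test.
scores = [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]]
consistency = split_half_correlation(scores)
```

In this invented data set every test taker scores identically on all items, so the two halves rank the test takers in exactly the same way and the correlation is 1. With real data, a lower value would point to inconsistency between the halves, and the random assignment could be repeated with different seeds to check how much the result depends on one particular split.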

Following Bachman’s (1990) theory, an examination context in which both monologic and dialogic task performances occur, could have an influence on the test taker’s performances, as the different types of task performances could be considered as inconsistencies of the test method. As a consequence, this study will take the difference in task performances into consideration in order to minimise the degree of measurement error and to provide reliable results regarding the test taker’s performances. This will be further explained in the following section (see ‘3.4 reliability and measurement error’).

Following the theory of Norris et al. (2003), the factors mentioned above can be classified under the name ‘measurement error’ (Norris and Ortega, 2003, p. 745). The more these measurement errors influence the measurement, the less the measurement is considered reliable. As a consequence, reliability is a concept that should be questioned and taken into account when measuring L2 fluency. However, according to Norris et al. (2003), the importance of measurement errors is often overlooked, as they are infrequently considered and almost never reported. This could have significant consequences on the further development and credibility of SLA as a research area. If measurement error is not being discussed and therefore not researched, its influence cannot be understood. Moreover, in that case, measurement error will also remain a prominent issue in future studies, while actually its influence should be reduced (Norris et al., 2003).

To summarise, fluency remains a complex and difficult research topic, not only regarding its definition as a construct but also in terms of its measuring procedures, as various theories
on L2 measuring conflict with each other. In order to measure SLA in a reliable way, Norris et al. (2003) introduced a general procedure, which consists of two stages: the conceptual stage, which is further subdivided into construct definition, behaviour identification and task specification, and the procedural stage, which is further subdivided into behaviour elicitation, observation scoring and data analysis. Regarding the specific variables of L2 fluency that should be measured, the theories of Wolfe-Quintero, Inagaki and Kim (1998) and Pallotti (2009) contradict each other. The former states that only variables which ‘clearly show variance among subjects, both over time and across tasks’ (Pallotti, 2009, p. 2) are the most reliable, whereas the latter states that a wide range of variables should be measured as contrasts and similarities are equally important. Furthermore, Pallotti (2009) adds that the variables only have to represent their underlying construct in order to be valid. Skehan (2003) and Tavakoli et al. (2005), in turn, adopt a different approach, stating that, in order to provide the most reliable results, the specific variables have to represent a mixture of traits belonging to the main sub-divisions of L2 fluency, namely speed fluency, repair fluency and breakdown fluency. Baker-Smemoe et al. (2014), Witton-Davies (2014) and Mora et al. (2012) introduced the specific variables belonging to these sub-divisions. However, whereas Baker-Smemoe et al. (2014) introduced the number of syllables per second, the number of pruned syllables per second, the number of runs or turns, the mean length of run in syllables, the number of hesitations, false starts and filled pauses and the number and length of pauses as the most reliable variables to measure L2 fluency, Witton-Davies (2014) and Mora et al. (2012) also added pause location, articulation rates and phonation time to the categorisation.

Another important issue with regard to fluency measurement is that of reliability. The reliability of fluency measurement can be questioned, as its results can be heavily influenced by various factors, such as the type of measures included in the measurement (Tavakoli, 2016), the idiosyncrasy of the participant (Norris et al., 2003) and the internal consistency of the test taker’s performances (Bachman, 1990). Despite the importance of reliability (i.e. measurement error), it is often overlooked, which could have significant consequences for the further development and credibility of SLA as a research area.

2.3 Monologic and dialogic task performances

Research into L2 fluency has largely been conducted examining monologic task performances (Tavakoli, 2016). According to Skehan et al. (2001), monologic task performances can be defined as task performances in which there is little to no interaction: the participants of the task are not expected to engage in dialogue, but to respect the one who holds the floor and to wait until it is their turn to talk. For this study,
however, monologic task performances can be interpreted in a more flexible way, as they occur in an examination context. This will be further explained in the next chapter (see ‘3.2.3 task specification’). A first example of monologic task performances is oral narratives, in which the participant has to tell a story about an arbitrary subject (Skehan and Foster, 1996; Tavakoli, 2011; Skehan et al., 2001). Another example is answering machine talk (Tavakoli, 2016), in which one analyses an answering-machine message from a speaker, who is speaking to an absent interlocutor (Gold, 2009). In general, Tavakoli (2016) distinguishes three main reasons why researchers prefer monologic task performances to dialogic task performances when assessing L2 fluency. Firstly, researchers have a higher degree of control over speech planning when assessing L2 fluency in monologues than in dialogues. Secondly, the outcome of monologues is more predictable than that of dialogues, which makes it easier to assess. Thirdly, while there are clearly defined ways to measure L2 fluency in monologues, measuring fluency in dialogues is often more complex because of the interaction inherent to dialogues. However, this can be dealt with qualitatively.

In comparison to the amount of fluency research based on monologic task performance, there is relatively little research into L2 fluency using dialogic task performance, largely due to the aspects mentioned above. However, several researchers have focused on monologic as well as on dialogic task performance to measure L2 fluency. Riggenbach (1989), for example, focused on both speech genres in her doctoral dissertation. Her approach was to measure L2 fluency differences between both tasks. Michel (2011) adopted the same approach, comparing L2 fluency in monologic and dialogic task performance by means of answering machine talk and telephone conversation respectively. Another example of research that uses both monologic and dialogic task performance is that of Witton-Davies (2014), who examined L2 fluency by means of picture retelling and an interactive discussion. All three studies revealed that, in dialogic conversation, the speaker is more fluent regarding speed, pausing and repair measures. Ejzenberg (1997) supports this, hypothesising that if the speaker participates in a dialogue, fluency ratings will be higher compared to when the speaker performs a monologue. This can be explained by the fact that speakers have more preparation time in a dialogue; they rely on the interlocutor’s turn to think about what they are going to say next. As a consequence, the speaker is able to speak faster, to pause less and to use minimal repair measures (Tavakoli, 2016). Following Tavakoli (2016), another explanation could be that the participant feels more encouraged to communicate when speaking with an interlocutor, as the participant wishes to respond to the interlocutor’s needs. As a consequence, the participant tries to speak as fluently as possible, ‘producing fewer hesitations and repetitions and faster speech’ (Tavakoli, 2016).

As discussed above, a minority of researchers have conducted research comparing monologic and dialogic task performance. However, these previous studies analysed monologic and dialogic task performance as two separate performances. This study, by contrast, aims to study L2 fluency in a task that contains both monologic and dialogic aspects, i.e. in an oral examination context. This will be further explained in the next chapter (see 3.2.3 task specification).

To summarise, research into L2 fluency has largely been conducted examining monologic task performances (Tavakoli, 2016), for which Tavakoli (2016) distinguished three main reasons. First of all, researchers have a higher degree of control over speech planning. Secondly, the outcome of monologues is more predictable than that of dialogues. Thirdly, measuring fluency in dialogues is often more complex because of the interaction inherent to dialogues. However, several researchers have focused on both monologic and dialogic task performances to measure L2 fluency, such as Riggenbach (1989) and Michel (2011). These studies revealed that fluency ratings are higher in dialogic than in monologic task performances. This could be explained by the fact that the participants have more preparation time in dialogues and that they feel more encouraged to communicate when speaking with an interlocutor (Tavakoli, 2016). Whereas previous studies analysed monologic and dialogic task performance as two separate performances, this study aims to study L2 fluency in a task that contains both monologic and dialogic aspects, i.e. in an oral examination context. This will be further explained in the next chapter (see 3.2.3 task specification).

2.4 The examination context with treatment of FLA – Foreign Language Anxiety

Over the past twenty years, there has been a large increase in research that focuses on L2-related anxiety. Various theories and measurement processes have been developed since the 1980s, which explains this increase of interest (Horwitz, 2001 in Sayin, 2015). A large number of researchers, such as Sarason (1978), Horwitz (2001) and Shomoossi & Kassaian (2009), focused on test anxiety in a foreign language, whereas others, such as Toth (2008) and Subasi (2010), investigated anxiety when learning a foreign language (Sayin, 2015). Other researchers, such as Phillips (1992) and Salehi and Marefat (2014), examined both types (cited in Sayin, 2015).

Even though a large body of research has already been conducted into the notion of anxiety and its effects on the oral competences of L2 learners, anxiety remains a very complex and multi-faceted construct that remains often difficult to define (Phillips, 1992; Young, 1986).

This can be explained by the fact that different types of anxiety can be distinguished and that different measurement processes are used to measure them (Phillips, 1992; Young, 1986). In general, anxiety can be defined as ‘a type of cognitive response marked by self-doubt, feelings of inadequacy, and self-blame’ (Sarason, 1978, p. 195) or as ‘the subjective feeling of tension, apprehension, nervousness, and worry associated with an arousal of the autonomic nervous system’ (Spielberger, 1983, p. 1). When considering the notion of anxiety in greater detail, two subtypes can be distinguished, namely trait anxiety and state anxiety (Phillips, 1992). Trait anxiety refers to a personality trait, which means that it is a more stable type of anxiety that remains over time and that occurs in a range of various situations. State anxiety, however, refers to a psycho-physiological state and is therefore a type of anxiety that only occurs in specific situations at a specific moment (Leal, Goes, Ferreira da Silva & Teixeira-Silva, 2017). Regarding the dimensionality of these constructs, various authors propose conflicting theories. Following the theory of Spielberger, Gorsuch & Lushene (1970), anxiety is a one-dimensional concept, which implies a directly proportional relation between trait and state anxiety: ‘the higher the trait anxiety, the higher the state anxiety in different situations of threat’ (Leal et al., 2017, para. 3). Endler and Parker (1991), however, oppose this idea, stating that both state and trait anxiety are separate multi-dimensional constructs representing their own individual differences. Moreover, they make a further sub-division of state anxiety into a cognitive-worry and an autonomic-emotional dimension. Trait anxiety, in turn, is subdivided along four different threats that can occur in specific situations, namely the social evaluation threat, the physical danger threat, the ambiguous threat and the threat in innocuous situations or daily routines (Leal et al., 2017).

As stated above, the notion of L2 anxiety remains a complex and multifaceted construct that is often difficult to define. However, it is generally agreed that the notion of L2 anxiety is strongly correlated with the notion of stress, as L2 anxiety is considered the main cause of stress during tests and exams (Sayin, 2015). According to Essel and Owusu (2017), the notion of stress can be defined as ‘a difficulty that causes worry or emotional tension and produces strain on the physical body’ (p. 5). Rothkrantz, Wiggers, van Wees and van Vark (n.d.) further elaborate this definition, stating that stress is a psychological state of mind that mainly manifests itself through the non-verbal content of the voice. As a consequence, different voice features, such as loudness and speech rate, are considered clear indicators of the stress levels of a particular test taker (Rothkrantz et al., n.d.). For this study, stress levels will be dealt with qualitatively, as they may have an impact on the speaker’s performances. They will, however, not be dealt with quantitatively, as measuring the specific variables is beyond the scope of this study. This will be further explained in the methodology section.

Various studies, such as Young (1986) and Phillips (1991), have already been conducted studying the relation between language anxiety and second language oral competence in a general class context. These studies have shown that anxiety can affect the student’s oral competences. However, its effect can be positive as well as negative and can differ in degree. Horwitz (1984), for example, stated that anxiety has a negative effect on the speaker’s oral competence: she observed an inverse relationship between the level of anxiety and the grade attributed to the student’s oral speaking performance. Her study revealed that students with a higher level of anxiety had received lower scores of speaking proficiency and vice versa (in Phillips, 1992). Other researchers, such as Chastain (1975) and Kleinmann (1977), however, have found a directly proportional relation between the two variables, resulting in better grades for students that were more anxious during their speaking performance. The inversely proportional and directly proportional relation between language anxiety and speaking proficiency can respectively be categorised under the name of ‘debilitative anxiety’ and ‘facilitative anxiety’ (Sayin, 2015, p. 113). Other researchers, such as Backman (1975), even suggest that there is no correlation at all between anxiety on the one hand and speaking proficiency on the other.

Sayin (2015) studied language anxiety in the oral examination context (i.e. state anxiety) and the effects it had on students’ performance. His study indicates that most students feel highly anxious about speaking exams, which is also reflected in their speaking performance. According to Paker and Höl (2012), this can be explained by the fact that the oral examination context could be considered one of the most challenging and stressful ways of testing a student’s oral proficiency. The students are tested one by one by means of face-to-face communication, during which they have to talk about a subject that is assigned to them. Because of this specific context, students’ speaking skills are easily influenced by factors such as ‘concentration, self-confidence, limited time, and the attitudes of the assessors during the test’ (Paker and Höl, 2012, cited in Sayin, 2015, p. 113). Face-to-face oral examination is a traditional way of examining students’ oral proficiency that has become an important assessment measure for two reasons (Sayin, 2015). First of all, it allows teachers to replicate and provide a real-life situation to their students. Secondly, it is used to determine to what extent a speaker is able to discuss familiar topics in a natural way. However, given that the face-to-face oral examination can be stressful for most students and can therefore affect their speaking proficiency, Sayin (2015) proposes a computer-based oral exam instead. The latter is supposed to provide a more stress-free environment for students and would help to reduce their anxiety towards oral examinations.

To summarise, even though there has been a large increase of interest in the research area of L2-related anxiety, anxiety remains a very complex and multi-faceted construct that remains often difficult to define (Phillips, 1992; Young, 1986). This can be explained by the fact that different types of anxiety, i.e. trait anxiety and state anxiety (Phillips, 1992), exist, different approaches regarding its dimension, i.e. one-dimensional (Spielberger et al., 1970) and multi-dimensional (Endler et al., 1991), are adopted and different measurement processes are used to measure them. Despite the complexity of anxiety as a construct, it is generally agreed that there is a strong correlation between the notion of L2 anxiety and the notion of stress (Sayin, 2015). Another correlation that has been studied is that between language anxiety and second language oral competence in a general class context. Some researchers stated that there was an inversely proportional correlation between those two aspects (such as Horwitz, 1984), whereas others indicated a directly proportional correlation (such as Chastain, 1975) or suggested no correlation at all (Backman, 1975). Sayin (2015) investigated the correlation between language anxiety in the oral examination context (i.e. state anxiety) and the effects it had on students’ performance. From his study, it can be stated that most of the students feel highly anxious towards speaking exams, which is also reflected in their speaking performance. This can be explained by the fact that oral face-to-face communication can be regarded as one of the most stressful ways of assessing a student’s oral proficiency. For this study, the degree of stress levels will be dealt with qualitatively as they may have an impact on the speaker’s performances. They will, however, not be dealt with quantitatively, as measuring the specific variables is beyond the scope of this study. This will be further explained in the methodology section.

3 METHODOLOGY

Whereas the theoretical part of this study discussed both the definition of L2 fluency as a construct and the most effective ways of measuring L2 fluency, the analysis of this study aims to provide an answer to the third and the fourth research question, i.e.:

3. To what extent is there a correlation between fluency and overall oral proficiency?
4. What are fluency contrasts in terms of task performance (monologic and dialogic)?

For the analysis, first and foremost, a quantitative approach has been adopted, in which the oral examinations of L2 students have been analysed. The study, however, also has a qualitative dimension. The different patterns resulting from the quantitative approach will be interpreted in a qualitative way, taking into account the notions of FLA, as well as the influence they have on the quantitative data.

The general procedure put forward by Norris et al. (2003) will be adopted for and adjusted to the aim of this research, as their theory is considered most appropriate to cover the different research questions of this paper. The general procedure can be divided into two stages: the conceptual stage and the procedural stage.

During the conceptual stage, interpretations of the notion of fluency as a construct will be conceptualised and defined in order to decide on the appropriate measures. In concrete terms, this stage can be subdivided into the stages of construct definition, behaviour identification and task specification. During the first stage, a definition of fluency as a construct will be provided, together with its theoretical hypotheses. During the second stage, the different characteristics of fluency as a construct that will be analysed will be identified. Finally, during the third stage, the tasks or situations that have been used to elicit these behaviours will be identified.

During the procedural stage, formal procedural methods will be created to process the outcomes of the construct’s behaviours. In other words, this stage will describe the data analysis.

The following four sections of this chapter will further elaborate on the methodology of this research. The first section will address the collection of the data, i.e. which data were collected and how. Following the theory of Norris and Ortega (2003), the second section will cover the conceptual stage of the research, i.e. the construct definition, behaviour identification and task specification, whereas the third section will cover the procedural stage of the analysis, i.e. the description of how the data were processed and analysed. Finally, the fourth section will address reliability and measurement error.

3.1 Data collection

The data collected for this study consist of 97 tracks, recorded during the oral examinations of L2 students of English. These students were in their first year of Applied Language Studies at Ghent University. Amongst the participants were 72 female and 25 male students. For this study, however, only 26 of the 97 recordings will be analysed (this will be further explained in the procedural stage). Three of these 26 recordings were of students with a different L1. The tracks were recorded during the examinations of the 2nd semester, more specifically on the 3rd, the 4th and the 5th of June 2019. I did not collect the data myself, as the professor responsible for the examination provided me with the recordings. Following the general procedure of Norris et al. (2003), the specifics of the task will be further described in the conceptual stage (see ‘3.2.3 task specification’).

3.2 Conceptual stage

3.2.1 Construct definition

With regard to the definition of fluency as a construct, various interpretations and theories co-exist. However, for this paper, the definition of fluency as a construct put forward by Fillmore (1979) has been adopted, as it corresponds best to the aim of this research paper. His definition remains one of the oldest and most widely accepted definitions, which is still adopted by many researchers today (such as Hieke, 1985; Tavakoli, 2016; Tavakoli & Wright, 2016). Moreover, his definition once again emphasises fluency’s multi-faceted and multi-componential structure, a characteristic that is generally agreed upon. He defined fluency as: ‘the ability to talk at length with few pauses; the ability to fill time with talk; the ability to talk in coherent and semantically dense sentences; the ability to have appropriate things to say in a wide range of contexts; and the ability to be creative and imaginative in the language use’ (Fillmore, 1979, p. 51). The definition put forward by Fillmore (1979) serves as a basis for the specific variables that will be analysed in order to measure the degree of fluency. The specific variables will be further elaborated in the next section, i.e. behaviour identification. Furthermore, Fillmore’s (1979) definition corresponds most closely to the criteria that are used in this study to assess the test taker’s overall proficiency.

A first hypothesis with regard to fluency as a construct is that there will be a positive correlation between the overall degree of oral proficiency on the one hand and the degree of fluency on the other. In other words, if a student obtains high marks for his or her overall proficiency, he or she will most likely also speak more fluently than a student who has obtained lower marks for overall proficiency. This hypothesis will be tested by comparing the observation scores of the behavioural data recorded in the Excel file on the one hand and the overall proficiency marks attributed by the professor on the other.

Secondly, following the theory of Ejzenberg (1997), both task structure and interactivity have an influence on L2 speakers’ speaking performances and, in particular, on the degree of fluency. Ejzenberg (1997) hypothesised that, if the speaker participates in a dialogue, fluency ratings will be higher compared to when the speaker performs a monologue. This hypothesis will be tested by comparing the observation scores of the behavioural data recorded in the Excel file for both the monologic and dialogic task performances.
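
By way of illustration, the two comparisons described above could be computed along the following lines. This is a minimal sketch under stated assumptions: the rows, the marking scale and the use of the mean of the two speech rates as an overall fluency score are hypothetical, and a Pearson correlation is only one possible way of expressing the relation between marks and fluency.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# Hypothetical rows: (recording code, overall proficiency mark,
# syllables per second in the monologic stretch,
# syllables per second in the dialogic stretch).
rows = [
    ("D1F", 16, 3.4, 3.8),
    ("D2F", 12, 2.9, 3.0),
    ("D3M", 14, 3.1, 3.5),
    ("D4F", 10, 2.6, 2.5),
    ("D5M", 18, 3.9, 4.1),
]

marks = [row[1] for row in rows]
# Assumption: the overall rate is approximated by the mean of both stretches.
overall_rate = [(row[2] + row[3]) / 2 for row in rows]

# First hypothesis: higher marks go together with a higher degree of fluency.
r_value = pearson(marks, overall_rate)

# Second hypothesis: fluency is higher in dialogue than in monologue.
differences = [row[3] - row[2] for row in rows]
mean_difference = mean(differences)
faster_in_dialogue = sum(d > 0 for d in differences)

print(f"r = {r_value:.2f}")
print(f"mean dialogic minus monologic rate = {mean_difference:.2f}")
print(f"{faster_in_dialogue} of {len(rows)} speakers are faster in dialogue")
```

With a sample of 26 recordings, such figures only describe tendencies and would need to be interpreted together with the qualitative observations.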

3.2.2 Behaviour identification

For this research, both a quantitative and a qualitative approach will be adopted in order to measure the degree of fluency. For the quantitative approach of this research, the theory put forward by Skehan (2003) and Tavakoli et al. (2005) will be followed, measuring the three main sub-divisions of L2 fluency, i.e. speed fluency, repair fluency and breakdown fluency, as a mixture of sub-divisions provides the most reliable results. Within those three sub-divisions, the quantitative variables put forward by Baker-Smemoe et al. (2014) will be measured, as those are most commonly used to measure L2 fluency. As a consequence, speed fluency will be measured by the number of syllables per second. Repair fluency will, in turn, be measured by the number of hesitations on the one hand and the number of false starts on the other. The number of hesitations can be subdivided into the number of self-repetitions (i.e. in terms of content, so not when a student says for example ‘I, I, I don’t like’), the number of self-corrections (i.e. when a student corrects him- or herself in terms of content, pronunciation and grammar) and the number of filled pauses (i.e. the number of times the student says ‘uhm’). False starts are counted as the number of times the student resumes the beginning of his or her sentence or decides to move in a different direction without completing the beginning of that sentence. Finally, breakdown fluency will be measured by the number of pauses and the length of pauses. All variables will be measured manually by means of the program Audacity and the online tool oTranscribe. This will be further explained in the next stage (i.e. the procedural stage).
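
The variables listed above can be summarised per stretch of speech as in the following minimal sketch. The class, the field names and the example figures are illustrative assumptions; in particular, the sketch assumes that syllables per second is calculated over the total duration of the stretch, pauses included.

```python
from dataclasses import dataclass, field


@dataclass
class Stretch:
    """Manual counts for one stretch of speech (field names are illustrative)."""
    duration_s: float          # total length of the stretch in seconds
    syllables: int             # syllables produced by the student
    self_repetitions: int = 0
    self_corrections: int = 0
    filled_pauses: int = 0     # e.g. 'uhm'
    false_starts: int = 0
    pause_lengths_s: list = field(default_factory=list)  # one entry per pause


def fluency_measures(stretch: Stretch) -> dict:
    """Group the counts by speed, repair and breakdown fluency."""
    hesitations = (
        stretch.self_repetitions
        + stretch.self_corrections
        + stretch.filled_pauses
    )
    n_pauses = len(stretch.pause_lengths_s)
    total_pause = sum(stretch.pause_lengths_s)
    return {
        "speed_syllables_per_second": stretch.syllables / stretch.duration_s,
        "repair_hesitations": hesitations,
        "repair_false_starts": stretch.false_starts,
        "breakdown_number_of_pauses": n_pauses,
        "breakdown_total_pause_length_s": total_pause,
        "breakdown_mean_pause_length_s": total_pause / n_pauses if n_pauses else 0.0,
    }


# Hypothetical one-minute monologic stretch.
example = Stretch(
    duration_s=60.0,
    syllables=180,
    self_repetitions=1,
    self_corrections=2,
    filled_pauses=5,
    false_starts=1,
    pause_lengths_s=[0.5, 1.0, 1.5],
)
measures = fluency_measures(example)
print(measures)
```

Keeping the three sub-divisions in one record per stretch makes it straightforward to compare the monologic and dialogic stretches of the same recording.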

As stated in the previous section (see ‘2.2 measuring fluency’), the mixture between dialogic and monologic task performances can be considered as an inconsistency of the test method and could therefore influence the test taker’s performances. Therefore, in order to minimise the degree of measurement error and to discuss the possible inconsistency of this test method in a qualitative way, the variables mentioned above will be measured two times per recording, i.e. once for the longest monologic stretch and once for the longest dialogic stretch of the recording. The variables of breakdown fluency will be measured three times per recording, i.e. once for the longest monologic and once for the longest dialogic stretch of the recording and once for the overall recording. This will be further explained in the next stage (i.e. the procedural stage).

For the qualitative approach of this research, variables such as loudness and speech rate as well as the difference in task performances will be taken into account.

3.2.3 Task specification

For this research, L2 fluency will be analysed in an oral examination context. More specifically, the oral examinations of 26 students in their first year of Applied Language Studies at Ghent University will be analysed for this purpose. The tracks were recorded during the examinations of the 2nd semester, more specifically on the 3rd, the 4th and the 5th of June 2019. During these oral examinations, both monologic and dialogic task performances occur.

For this study, monologic task performances will be defined, measured and analysed, relying on the theory put forward by Skehan et al. (2001), as it provides us with a general definition and therefore represents a solid starting point. According to them, monologic task performances can be defined as task performances in which there is little to no interaction: the participants of the task (i.e. the professor in this case) are not expected to engage in dialogue, but to respect the one who holds the floor and to wait until it is their turn to talk. Monologic task performances in the examination context, however, can often be interpreted in a more flexible way. As a consequence, for this study, speech utterances will be considered monologic if a student speaks at length without being interrupted, or if a student is briefly interrupted by the professor while speaking at length but does not engage with what the professor is saying. Dialogic task performances can, in turn, be defined as task performances in which there is clear interaction. As a consequence, speech utterances will be considered dialogic if a student directly responds to what the professor is saying or if there is a rapid exchange of information, in other words, when there is a conversation going on. It is, however, not always possible to make a clear distinction between monologic and dialogic task performances. If a student’s answer, for example, extends substantially, it should be regarded as a stretch of monologue instead of a stretch of dialogue. To avoid this type of ambiguity, a 20-second boundary will be set, after which a student’s answer shifts from a stretch of dialogue into a stretch of monologue. This figure is based on the average length after which most students were able to provide a sufficient answer to the question posed by the professor. If the 20-second boundary falls in the middle of a student’s sentence, the dialogic stretch will shift into a monologue at the beginning or at the end of that sentence. The 20-second boundary will only be set if the student is clearly engaging in a conversation with the professor. If the professor, for example, briefly interrupts the student’s monologic stretch and the student answers with a simple ‘yes’ or ‘no’, but then immediately continues with what he or she was saying, that stretch is still considered the continuation of his or her monologue rather than the beginning of a dialogic stretch, which could shift into a monologic stretch after 20 seconds.
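
The 20-second boundary can be made explicit in a short sketch. It assumes that the start and end time of every sentence in a student’s answer are known, and it resolves the choice between ‘the beginning or the end of that sentence’ by taking whichever edge lies closer to the boundary; that tie-breaking rule, like the function name and the example times, is an assumption made for illustration only. The function is meant to be applied only to answers in which the student is clearly engaging in a conversation with the professor.

```python
BOUNDARY_S = 20.0  # after 20 seconds a dialogic answer shifts into a monologue


def split_answer(sentences, boundary_s=BOUNDARY_S):
    """Split one dialogic answer into a dialogic and a monologic part.

    sentences: ordered list of (start_s, end_s) tuples, one per sentence.
    Returns (dialogic_sentences, monologic_sentences).
    """
    if not sentences:
        return [], []
    cutoff = sentences[0][0] + boundary_s
    for index, (start, end) in enumerate(sentences):
        if end <= cutoff:
            continue  # this sentence ends before the boundary is reached
        if start >= cutoff:
            # The boundary falls between two sentences.
            return sentences[:index], sentences[index:]
        # The boundary falls inside this sentence: shift at the nearer edge.
        if cutoff - start <= end - cutoff:
            return sentences[:index], sentences[index:]
        return sentences[: index + 1], sentences[index + 1:]
    return sentences, []  # the whole answer stays within the boundary


# Hypothetical answer starting at 02:39 (159 s) and lasting 36 seconds.
answer = [(159.0, 166.0), (166.5, 176.0), (176.5, 184.0), (184.5, 195.0)]
dialogic, monologic = split_answer(answer)
print(dialogic)   # sentences counted as dialogue
print(monologic)  # sentences counted as monologue
```

In this example the boundary (179 s) falls inside the third sentence, closer to its beginning than to its end, so the shift is placed at the beginning of that sentence.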

The examination can be subdivided into two parts, which represent the two different task performances. First of all, the students are asked to answer an open question, which belonged to a list of questions they had been given in advance to prepare at home (i.e. the monologic task performance). Afterwards, the professor asks several additional questions in order to maintain the conversation and stimulate a discussion (i.e. the dialogic task performance). These questions are spontaneous and could therefore not be prepared at home.

The list of questions they had been given to prepare at home includes the following topics:

1. A personal interest
2. A cultural item
3. An anecdote
4. A news item

As the students were able to prepare the first question at home, the first answer to this question (i.e. the beginning of each recording) will be considered a stretch of monologue. After this first question, the answers to the following questions will be considered a stretch of dialogue.

The aim of the examination is to gauge the student’s level of oral proficiency. For that purpose, multiple factors are assessed in order to decide on the students’ overall proficiency mark. In general, students are expected to demonstrate a minimum B2 level in order to pass the examination. First of all, the student’s enunciation, pronunciation, intonation and accent are assessed. Secondly, the student’s lexical range is assessed, i.e. the extent to which a student can display a broad and appropriate vocabulary. Thirdly, the student is expected to demonstrate grammatical accuracy, without this restricting what he or she intends to say. Fourthly, the student’s answer is assessed in terms of content and structure. Furthermore, the student is expected to be able to speak at length, without too many hesitations or pauses. Finally, the student’s overall engagement and expression are assessed during the examination.

3.3 Procedural stage

During the procedural stage, the behavioural data will be observed, i.e. every recording will be listened to in order to observe and measure the specific construct’s behaviours, which will then be recorded in an Excel file. The specific construct’s behaviours that will be measured for this study were identified in the previous section (see ‘3.2.2 behaviour identification’). In the following sections, the method by which the measurement was conducted will be explained in further detail.


To prepare the procedural stage, all recordings were first labelled: each recording was assigned a code, which consists of the letter ‘D’, followed by the number of the recording and the designation ‘M’ or ‘F’, which stands for male or female. Then, the first track (i.e. D2F) was transcribed manually by means of the analytical tool oTranscribe, which makes it possible to transcribe a recording while listening and to add timestamps to the audio fragment, in order to be able to distinguish the different monologic and dialogic stretches. This transcription was sent to the professor in charge of the examinations, who distinguished the different monologic and dialogic stretches of the track by way of example. After that, due to time constraints, the monologic and dialogic stretches of the ten subsequent recordings were distinguished by ear. For these distinctions, a system of double-checking was used: the stretches were distinguished twice, i.e. once by myself and once by the professor in charge of the examinations. Afterwards, both versions were compared and adjusted in order to provide a reliable basis for distinguishing the stretches of the 86 remaining recordings. The different monologic and dialogic stretches were distinguished according to certain conventions. Figure 1.2 illustrates the distinction of the monologic and dialogic stretches for the track ‘D10F’:

00:13 S: Sure, so uhm (...)

02:38 P: if you were (...)

02:59 S: that's quite difficult uhm (...)

(shift in monologue)

03:19 M: S: just the emotions (...)

03:57 P: Uhm, they are one of these bands (...)

04:36 S: I think (...)

(shift in monologue, middle of sentence)

04:54 S: I think (...)

05:43 D: P: and a quite taste for some (...)

05:48 End of recording


First of all, the monologic stretches were marked in green, whereas the dialogic stretches were marked in blue. Secondly, for each recording, the longest monologic stretch and the longest dialogic stretch were underlined and marked in bold. Thirdly, at the beginning of each new stretch (i.e. monologic or dialogic), a timestamp was added, together with the designation ‘S’ or ‘P’, which respectively stands for ‘student’ and ‘professor’, and the first few words of the first sentence of that stretch. The symbol ‘(…)’ was used to indicate that the sentence was not finished yet. Fourthly, if the 20-second boundary was applied, i.e. when a dialogic stretch shifted into a monologic stretch, the shift was indicated by the designation ‘(shift in monologue)’. To visualise the 20-second calculation, the first few words of the sentence at which the calculation started were indicated and accompanied by a timestamp (here ‘02:59’). The beginning of the (new) monologic stretch was then indicated as mentioned above, i.e. with a timestamp, the designation ‘S’ or ‘P’ and the first few words of the first sentence of that stretch. In some cases, the 20-second boundary was reduced or extended by a few seconds as it fell in the middle of a sentence. In those cases, the shift was indicated with the designation ‘(shift in monologue, middle of sentence)’ to illustrate the reduction or extension of the 20-second boundary. Fifthly, the end of each recording was designated with the words ‘End of recording’ and accompanied by a timestamp.
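By way of illustration, the annotation conventions can be read back programmatically to verify the 20-second calculation in Figure 1.2. The following sketch is not part of the original procedure. The names `parse_line` and `elapsed` are hypothetical, and the optional markers ‘M’ and ‘D’ that precede the speaker in two lines of the figure are assumed to flag a monologic and a dialogic stretch respectively.

```python
import re

# One annotation line: "MM:SS", an optional stretch marker (M or D, an
# assumed reading of the figure), the speaker (S or P) and the opening words.
LINE = re.compile(r"^(\d{2}):(\d{2})\s+(?:([MD]):\s+)?([SP]):\s+(.*)$")


def parse_line(line):
    """Split an annotation line into (seconds, marker, speaker, text)."""
    m = LINE.match(line.strip())
    if m is None:
        return None  # e.g. "(shift in monologue)" or "End of recording"
    minutes, seconds, marker, speaker, text = m.groups()
    return int(minutes) * 60 + int(seconds), marker, speaker, text


def elapsed(start_line, shift_line):
    """Seconds between the start of the 20-second calculation and the shift."""
    return parse_line(shift_line)[0] - parse_line(start_line)[0]


# First shift in Figure 1.2: the boundary is applied exactly.
print(elapsed("02:59 S: that's quite difficult uhm (...)",
              "03:19 M: S: just the emotions (...)"))  # 20
# Second shift: the boundary is reduced, as it fell mid-sentence.
print(elapsed("04:36 S: I think (...)", "04:54 S: I think (...)"))  # 18
```

The two results correspond to the designations ‘(shift in monologue)’ and ‘(shift in monologue, middle of sentence)’ in the figure.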

After the preparation stage (i.e. the labelling of the recordings and the identification of the monologic and dialogic task performances), the specific construct’s behaviours were measured.
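The labelling scheme used in the preparation stage (the letter ‘D’, the number of the recording and the designation ‘M’ or ‘F’) lends itself to a simple check. The sketch below is merely illustrative and assumes that the codes contain no other characters. The function name `parse_label` is hypothetical.

```python
import re

# 'D', the number of the recording, then 'M' (male) or 'F' (female).
LABEL = re.compile(r"^D(\d+)([MF])$")


def parse_label(label):
    """Return (recording number, sex) for a code such as 'D10F'."""
    m = LABEL.match(label)
    if m is None:
        raise ValueError(f"not a valid recording code: {label!r}")
    return int(m.group(1)), {"M": "male", "F": "female"}[m.group(2)]


print(parse_label("D2F"))   # (2, 'female')
print(parse_label("D10F"))  # (10, 'female')
```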

In the normal course of events, the entire corpus, i.e. the 97 recordings, would have been analysed. However, due to time constraints, only 26 of the 97 recordings were measured and analysed. In order to ensure a well-balanced corpus, a set number of recordings were chosen for each proficiency level of L2 English. The highest and the intermediate proficiency level each consisted of ten recordings, with a varying range of marks. The lowest proficiency level consisted of six recordings, as there were only six students who failed the examination. For the measurement of speed fluency, a corpus of 16 recordings was used. This will be explained further in this section.

First of all, repair fluency, i.e. the number of hesitations and false starts, was measured manually. This was measured three times, i.e. once for the longest monologic stretch, once for the longest dialogic stretch and once for the overall recording. Secondly, breakdown fluency, i.e. the number and length of pauses, was measured. In the normal course of events, this would have been measured three times: once for the longest monologic and once for the longest dialogic stretch, which would have been measured manually, and once for the overall recording, which would have been measured automatically by means of

[Figure 1.3: number of syllables per second (N. Syllables/sec.) for the monologic and dialogic stretches; axis ticks from 2.6 to 3.3]
