The Perception of Dutch Fluency: a qualitative examination of fluency ratings on native and nonnative speech

Academic year: 2021

MA thesis, Leiden University

Daya Haverman, s1636561

Supervisor: dr. N.H. de Jong

Second reader: dr. J. Caspers

Abstract

In fluency research that contains rating experiments, it is often the case that either a) overall proficiency is examined, and participants are free to rate fluency intuitively; or b) participants are told to base their judgments on several utterance fluency characteristics, since the researcher studies fluency in the narrow sense. In this study, I examine whether more speech factors fall under the concept of fluency in the narrow sense than only the fluency characteristics that participants are given prior to a rating task. Qualitative research into the perception of fluency in Dutch native and nonnative speech yielded 17 different categories that participants take into account when judging spontaneous speech. Then, two groups of participants were juxtaposed: one group that had to judge fluency based on four utterance fluency characteristics (explicit group); and one group that only received a definition of cognitive fluency and was free to base their judgments on their own interpretation of this definition (implicit group). Results indicate that the implicit group was less likely to let disfluencies influence them negatively, but was more inclined to unconsciously judge overall proficiency rather than fluency in the narrow sense. Additionally, both groups showed a sensitivity to pause distribution that helped them to determine the speaker’s ease of lexical retrieval. I conclude that intonation and planning efficiency are essential components of fluency in the narrow sense, and should therefore be included in the instructions for future rating experiments.

Acknowledgement

First and foremost, I would like to express my sincere gratitude to Nivja de Jong for being my supervisor during the writing process. Completing this thesis would not have been possible without her help and support.

I would also like to thank my 23 participants for voluntarily taking part in this research project, especially the ones who took the time to partake in the stimulated recall sessions via Skype.

Contents

Acknowledgement
Section 1: Introduction
Section 2: Theoretical background
Section 3: The present research
3.1 Thesis outline
Section 4: Methodology
4.1 Participants
4.1.1 Implicit group
4.1.2 Explicit group
4.2 Materials
4.2.1 Stimulus description
4.2.2 Instructions
4.3 Procedure
4.3.1 Survey
4.3.2 Stimulated recall procedure
4.4 Data analysis
4.4.1 Data analysis for section 5
4.4.2 Data analysis for section 6
Section 5: Perceptions of Dutch fluency
5.1 Results: influencing temporal and non-temporal factors
5.2 Discussion: explanation of the speech factors
Section 6: Explicit group vs. implicit group: perceptual differences
6.1 Results: quantitative analysis of implicit group experiment
6.2 Discussion: effect of the manipulations on the two groups
6.2.1 Pausing manipulations effects
6.2.2 Speech rate manipulation effects
6.3 Results: perceptions of other speech factors
6.4 Discussion: category differences
Section 7: General discussion
7.1 Limitations
Section 8: Conclusion
Section 9: References

Section 1: Introduction

Imagine two people who have been learning Dutch as a second language for a while, and who are currently both in a classroom, taking their oral exam. Let us call them Arnold and Thomas – the two of them are being evaluated for their fluency. The examiners find that Arnold talks quite slowly, often uses the filler ‘umm..’, and often restarts his sentences. Thomas, on the other hand, seems to talk at a pleasant tempo, rarely uses the filler ‘umm..’ and never restarts his sentences. The results of the oral examination may therefore be obvious: Arnold’s grade may be insufficient or below average, whereas Thomas’s grade is above average; maybe even perfect. In this case, Arnold and Thomas have been assessed for the ease and smoothness of their speech processes. It should be noted, however, that this is only one component of their overall proficiency. If someone says, ‘Thomas is a very fluent speaker of Dutch’, it not only means that he rarely uses the filler ‘umm..’ or never has to restart his sentences; it can also mean that he has a wide array of words at his disposal, or that he does not talk with a heavy accent, or that he never makes grammatical mistakes in his sentences. In this case, the word ‘fluency’ indicates the level of overall proficiency.

In the field of applied linguistics, both concepts of fluency are widely researched. The knowledge that emerges from fluency research can be used for language testing practices, since fluency is evaluated one way or another in many language tests (De Jong, 2016, p. 209). In this research field, the difference between these two concepts of fluency is indicated as follows: fluency in the broad sense is about overall proficiency, and fluency in the narrow sense is about the ease and smoothness of speech (Lennon, 1990). It should be noted that even native speakers of a language can therefore be considered less fluent speakers. In terms of overall proficiency, natives are fluent by default, but if a native speaker talks extremely slowly and constantly uses the filler ‘umm..’, they can still be considered less fluent in the narrow sense. Researchers in narrow fluency have learnt to look out for factors such as speech rate and pausing because of previous research that has focussed on the perception of speech. Participants had to come to the lab, put on headphones, listen to different speech fragments, and rate those speech fragments in terms of fluency on a Likert scale: often 1 for ‘not fluent at all’, and 9 for ‘very fluent’.

However, in this kind of research it is important that the researcher is very cautious with the execution of the experiment. The results depend on what is asked of the participant. If the researcher wants to examine fluency in the narrow sense, and asks the participant how fluent they think a specific speaker is, the participant is very likely to judge the speaker’s overall proficiency. This way, the participants may comment on speech features that are not relevant for the researcher. There are many facets of oral proficiency that can be judged, and it depends on the kind of research which speech factors are relevant for the researcher. This is why research into fluency – whether it be fluency in the narrow sense or fluency in the broad sense – has to be specific and narrowed down to a particular area. Additionally, participants in experimental settings may be assigned to judge speech on a particular area, yet unconsciously also focus on other aspects of speech. This may influence their eventual ratings, rendering the results of the experiment less accurate and reliable. Therefore, this study attempts to find out what participants look for in spontaneous speech fragments when they are rating fluency. This study focusses on fluency in the narrow sense, for the Dutch language specifically. The more we know about participants’ rating behaviour when judging oral speech, the more our knowledge of fluency perception grows, which eventually gives information that could be valuable for language evaluation practices. Additionally, this study examines the instructions that are provided prior to a rating task, and the way in which these instructions influence the eventual perceptions of Dutch fluency. In short, this paper tries to deepen our understanding of fluency perception by examining how participants rate native and nonnative Dutch speech fragments.

Section 2: Theoretical background

Defining fluency

Several researchers have attempted to define the concept of fluency in the past. Fillmore (1979) defined fluency in four different ways: fluency is the ability to talk at length and to be able to fill time with talk; the speaker should be able to convey their message in a ‘semantically dense’ manner; speakers are fluent if they know what to say in a wide range of contexts; and they should also be creative and imaginative in their language use (p. 93). Lennon (1990), on the other hand, defines fluency as “the impression on the listener’s part that the psycholinguistic processes of speech planning and speech production are functioning easily and efficiently” (p. 391). The difference between these two definitions is that Fillmore argues for the ease of the speaker in the production process, whereas Lennon focusses on the ease of comprehension on the listener’s part. Segalowitz (2010) helped to resolve this difference by defining fluency in three ways. The first is cognitive fluency, which is “the efficiency of operation of the underlying processes responsible for the production of utterances” (p. 165). Cognitive fluency therefore reflects the cognitive processes in the brain that are involved during speech and that enable the speaker to utter fluent stretches of speech. The second type is utterance fluency, which focusses on the speech that is actually uttered, and then looks at “the features (…) that reflect the speaker’s cognitive fluency” (p. 165), which can be measured acoustically. The last definition of fluency is perceived fluency, “the inferences listeners make about speakers’ cognitive fluency based on their perceptions of their utterance fluency” (p. 165). It can be concluded that Fillmore’s definition of fluency is based on utterance and cognitive fluency, whereas Lennon’s definition is based on perceived fluency. Lennon (1990) also established a distinction between two different kinds of fluency: fluency in the broad sense (henceforth broad fluency) and fluency in the narrow sense (henceforth narrow fluency). Broad fluency defines the general language abilities of a person, and is comparable to Fillmore’s (1979) definition of fluency: if someone is said to be fluent in French, for example, it means that they know the language well – i.e. they are creative and imaginative in their language use – and the definition therefore reflects their overall proficiency in the French language. Narrow fluency, on the other hand, captures one specific aspect of that overall proficiency: the ease and smoothness of oral speech production, which is closely related to linguistic planning efficiency and enables the speaker to convey the desired message smoothly and with minimal effort.

In the field of Second Language Acquisition, both broad and narrow fluency are widely researched. In narrow fluency research, it is obvious that non-temporal speech factors such as accent, grammar and vocabulary are disregarded, since they are not really relevant for ease and smoothness of oral speech production. Rather, temporal factors are examined that can be acoustically measured, such as speech rate, frequency of filled pauses, or silent pause length. In table 1 below, such acoustic measures can be found, which are often used in narrow fluency research.

Table 1: Frequently used measures of utterance fluency (De Jong, 2018, p. 240)

Measure                                 Formula
Speech rate                             Number of syllables / total time
Pruned speech rate                      (Number of syllables – number of disfluent syllables) / total time
Articulation rate                       Number of syllables / speaking time (a)
Pace                                    Number of stressed syllables / total time
Mean length of utterance                Total speaking time / number of utterances (b), or number of syllables / number of utterances (b)
Number of silent pauses (per minute)    Number of silent pauses / total time or speaking time (a)
Phonation time ratio                    Speaking time / total time
Number of filled pauses (per minute)    Number of filled pauses / total time or speaking time (a)
Number of repetitions (per minute)      Number of repetitions / total time or speaking time (a)
Number of repairs (per minute)          Number of repairs and restarts / total time or speaking time (a)

(a) Speaking time is equal to total time minus silent pausing time.
(b) Number of utterances is equal to the number of silent pauses plus 1.

Research in narrow fluency

The bulk of research within applied linguistics is focussed on narrow fluency, in which utterance fluency is compared to perceived fluency, in order to find out which aspects of fluency are perceptually salient and which aspects of the speech signal listeners deem important (De Jong, 2018). The temporal measurements mentioned above are often used by researchers, so that a spontaneous speech fragment can objectively be analysed for its fluency. The fragments are then played to participants, who are asked to rate the fragment in terms of fluency on a linear scale. Such research was done in the nineties, though methodologies differed: Lennon (1990) and Freed (1995) both investigated the development of fluency longitudinally, whereas others, such as Riggenbach (1991), compared the perception of fluent and non-fluent speakers, and correlated fluency scores with temporal variables. What these studies have in common is that they all found that speech rate and mean length of run are the best predictors of fluency (Riggenbach, 1991; Freed, 1995; Lennon, 1990), and it was also concluded that in terms of pausing, more pauses indicate a less fluent speaker (Lennon, 1990; Freed, 1995). However, more recent studies are critical of these studies, as their numbers of participants were very small. Additionally, Lennon (1990) and Freed (1995) found that non-temporal factors were mentioned by the participants as well, such as ‘accuracy of grammar’ or ‘tone of voice’ (Freed, 1995). This interfered with the temporal results, which is why other researchers wanted to find a way to get the participants to disregard the non-temporal factors completely, so that their results only reflected the temporal measures that they wanted to investigate. Their solution lay in the instructions that they gave to the rating participants: these were instructed to judge fluency in the narrow sense, and were given some descriptors to base their rating on, such as speech rate, silent and filled pauses, corrections and repetitions.
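To make the temporal measures of Table 1 concrete, the following minimal Python sketch computes them from hand-annotated counts for a single speech fragment. It is an illustration only, not the procedure used in any of the studies cited: the names `FragmentCounts` and `utterance_fluency` are hypothetical, times are assumed to be in seconds, the per-minute measures are assumed to be normalised by total time (Table 1 also allows speaking time), and mean length of utterance is taken in its syllable-based variant.

```python
from dataclasses import dataclass


@dataclass
class FragmentCounts:
    """Annotated counts for one speech fragment (hypothetical structure)."""
    total_time: float         # seconds, including silent pauses
    silent_pause_time: float  # seconds of silent pausing
    syllables: int
    disfluent_syllables: int
    stressed_syllables: int
    silent_pauses: int
    filled_pauses: int
    repetitions: int
    repairs: int              # repairs and restarts


def utterance_fluency(c: FragmentCounts) -> dict:
    """Compute the Table 1 measures; rates are per second unless named per_minute."""
    speaking_time = c.total_time - c.silent_pause_time  # footnote (a)
    if c.total_time <= 0 or speaking_time <= 0:
        raise ValueError("total time and speaking time must be positive")
    utterances = c.silent_pauses + 1                    # footnote (b)
    to_minute = 60.0 / c.total_time                     # assumption: normalise by total time
    return {
        "speech_rate": c.syllables / c.total_time,
        "pruned_speech_rate": (c.syllables - c.disfluent_syllables) / c.total_time,
        "articulation_rate": c.syllables / speaking_time,
        "pace": c.stressed_syllables / c.total_time,
        "mean_length_of_utterance": c.syllables / utterances,  # syllable-based variant
        "phonation_time_ratio": speaking_time / c.total_time,
        "silent_pauses_per_minute": c.silent_pauses * to_minute,
        "filled_pauses_per_minute": c.filled_pauses * to_minute,
        "repetitions_per_minute": c.repetitions * to_minute,
        "repairs_per_minute": c.repairs * to_minute,
    }


if __name__ == "__main__":
    # Invented counts for a one-minute fragment, for illustration only.
    example = FragmentCounts(
        total_time=60.0, silent_pause_time=15.0, syllables=180,
        disfluent_syllables=12, stressed_syllables=60, silent_pauses=9,
        filled_pauses=6, repetitions=3, repairs=2,
    )
    for name, value in utterance_fluency(example).items():
        print(f"{name}: {value:.2f}")
```

Note that speech rate and articulation rate differ only in their denominator: a speaker who pauses often but articulates quickly between pauses will have a low speech rate and a high articulation rate, which is why the two are usually reported separately.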

Explicit instructions

These kinds of instructions can be considered explicit instructions, as the participants are given explicit knowledge of what they should base their fluency rating on. An example of such research is Rossiter (2009), who investigated the perceptions of the speaking fluency of 24 adult ESL learners who narrated picture stories at Time 1 and again 10 weeks later at Time 2. Speech fragments of approximately 1 minute were played to fifteen novice and six expert native speakers of English, and they were asked to rate the fragments in terms of fluency on a scale of 1 (extremely disfluent) to 9 (very fluent). The raters were explicitly instructed to judge the fragments for temporal fluency, and were shown a list of factors associated with that: speech rate, hesitation phenomena (e.g., filled or non-lexical filled pauses, repetitions, self-corrections), and formulaic sequences or ‘chunks’, because “raters who receive no explicit description of fluency may base their evaluations on varying criteria (e.g., Freed, 1995; Lennon, 1990)” (p. 401). Rossiter found that the fluency ratings of the three groups of listeners were influenced by similar general impressions of temporal phenomena: unfilled or non-lexical filled pauses, self-repetitions, and speech rate accounted for over three-quarters of the negative temporal fluency impressions.

Bosker et al. (2013) also used explicit instructions in their study, and also found speech rate and pausing measures to be significant predictors of fluency. In order to find out which influencing factors raters of fluency take into account when rating L2 speech, they assembled 114 speech fragments of Dutch that had to be rated in terms of fluency on a scale of 1 (=not fluent at all) to 9 (=very fluent). The raters had to base their judgments on the use of silent and filled pauses, the speech rate and the use of hesitations and/or corrections, so that they would not rate fluency in the broad sense of language proficiency. For the fluency rating group, the speech fragments were analysed on 6 acoustic measures of narrow fluency: 2 speed fluency measures, 2 breakdown fluency measures and 2 repair fluency measures. They eventually found that fluency ratings were predicted by all six acoustic predictors; especially speed and breakdown fluency measures predicted a large part of the variance in fluency ratings, whereas repair fluency measures explained a small part of the variance.

Préfontaine, Kormos and Johnson (2016) also investigated which utterance fluency measures are the best predictors of fluency scores. Four utterance fluency measures were extracted from speech fragments, and these fragments were also rated by eleven untrained judges. The judges had to make use of an assessment grid that consisted of six quantitative and qualitative can-do statements from the Council of Europe’s (2001) Common European Framework of Reference (CEFR). The raters had to select a level from A1 to C2 that represented their assessment of oral performance, and the CEFR scale was therefore converted into a six-point numerical scale. As an extra instruction for this study, raters were also asked to provide their opinion with regard to speech rate and pausing, for which they were given a scale from 1 (a lot of hesitations / unreasonable speed) to 6 (very few hesitations / very reasonable speed). These instructions can therefore also be considered explicit, as the raters were made conscious of the factors speech rate and pausing. Préfontaine, Kormos and Johnson found that pause frequency was seen as a predictor of fluency judgments, though it was the weakest predictor of ratings of pause behaviour in two out of three tasks: mean length of run was consistently one of the most important predictors of L2 fluency judgments.

What these three studies have in common is that they all found that pausing behaviour and speech rate were predictors of fluency ratings: a higher speech rate and fewer pauses or hesitations indicate a higher fluency rating, whereas a slower speech rate and more pauses indicate a lower fluency rating.

Intuitive ratings

There are also studies – though very few – that do not give the participants temporal fluency factors to base their ratings on. Rather, they encourage the participant to rate intuitively, since the researcher is interested in seeing which variables emerge when participants are free to interpret the concept of fluency themselves. Kormos and Dénes (2004), for example, explored which variables predict native and non-native speaking teachers’ perception of fluency. They collected speech samples from 16 Hungarian L2 learners, and let native and non-native teachers rate the samples for fluency on a semantic differential scale that ranged from 1 (least fluent) to 5 (most fluent). In their study, Kormos and Dénes specifically focussed on ten temporal variables: speech rate; articulation rate; phonation-time ratio; mean length of runs; number of silent pauses per minute; mean length of pauses; number of filled pauses per minute; number of disfluencies per minute; pace; and space. The difference from the ‘explicit instruction’ method mentioned above is that Kormos and Dénes did not provide any of these temporal measures for the raters to base their judgments on. They eventually found that speech rate, the mean length of runs and phonation-time ratio are all very good predictors of fluency scores. In their conclusion, they also stressed the importance of the measure pace: “we can argue that how many stressed words one can say in a minute is a slightly better predictor of fluency than how many syllables one utters a minute. In other words if a speaker utters a lot of unstressed words at a high speed, he or she is not necessarily perceived to be very fluent” (p. 158). This highlights the importance of stress when considering temporal speech rate variables. They also found that the number of filled and unfilled pauses and other disfluency phenomena did not influence perceptions of fluency – which contradicts the studies that used explicit instructions in their research.

Freed (1995) tried to identify attributes of fluency that distinguished a group of American second language learners of French who studied abroad from a group of American students whose learning was limited to the formal on-campus French language classroom in the United States. She asked 6 native speakers of French to evaluate speech fragments of each student. The raters were asked to rate the fragments on a scale of 1 (not at all fluent) to 7 (extremely fluent). They did not receive any instruction on how to rate these fragments on fluency; they were encouraged to use their own subjective reactions to fluency. Although the participants were encouraged to focus on broad fluency, Freed sought to evaluate temporal measures only, such as frequency of filled pauses, speech rate, length of fluent speech runs and repairs. She eventually found that students from the study abroad group spoke more, and at a significantly faster rate, and they tended to use fewer silent pauses and fewer non-lexical filled pauses. However, she also found that the raters reported having been influenced by non-temporal factors as well, which is not surprising considering that the participants were allowed to freely interpret the concept of fluency:

“our judges indicated that they were influenced by a variety of factors which extend beyond the hesitation phenomena which we measured. Included among these were "richness of vocabulary," "accuracy of grammar," "accent," "clarity of voice," "enunciation" and "rhythm of the phrases," as well as qualities such as "tone of voice," "ease" or "confidence in speech" and "comfort in the ability to converse"” (p. 143).

In short, when examining fluency, the instructions provided prior to the experiment largely determine the results that the researcher is going to get. If the perceptions of overall proficiency are being examined, it is a logical decision of the researcher to leave participants free to interpret the concept of fluency themselves. However, if the researcher wants to focus on narrow fluency only, participants need more guidance in order to prevent them from focussing on the irrelevant aspects of speech. The solution is to provide the participants with descriptors of utterance fluency measures before the experiment, although consequently, the results of these experiments are limited to those descriptors only.

Psycholinguistic research

In the literature that has been discussed up until this point, it seems as though faster speech rate and less pausing indicate a higher fluency rating; that is, the more disfluencies, the lower the fluency rating. In psycholinguistic research, disfluencies used to be regarded similarly: they were seen as parts of speech that had to be ‘edited out’ by the listener in order to comprehend the message (Levelt, 1989). However, more recent psycholinguistic research seems to conclude the opposite: disfluencies may actually prove to be advantageous to the listener rather than interfering with the comprehension process. Corley et al. (2007) investigated the effect of disfluencies in a sentence by examining Event-Related Potentials (ERPs).

“During comprehension, each word must be integrated with its linguistic context, from which it can often be predicted. Where integration is difficult (for example because a word is not predictable), a negative change in voltages recorded at the scalp relative to more easily integrated words is observed. This difference, the N400 effect, peaks at around 400 ms after word onset, maximally over central and centro-parietal regions.” (p. 660)

In an ERP experiment which compared fluent to disfluent sentences, this N400 effect was found for unpredictable compared to predictable words, as it signals that the processing of speech is more difficult. However, it was found that this effect was reduced in cases where the unpredictable word was preceded by a hesitation marker. In short, disfluencies such as non-lexical filled pauses are not as useless as they were previously thought to be; rather, they help the listener, as hesitations such as these heighten listeners’ immediate attention to upcoming speech. A similar conclusion was drawn by Brennan and Schober (2001), who set up an experiment in which participants had to select an object on a graphical display after having received instructions. They noted that participants selected the target object more quickly and more accurately after hearing mid-word interruptions in the instructions. In short, the instruction “Move to the yel- uh, purple square” overall yielded quicker and more accurate results than the fluid instruction “Move to the purple square”. There are more studies that claim that delays, such as filled pauses, help to speed up and improve word recognition (Corley, Hartsuiker, 2011; MacGregor, Corley, Donaldson, 2009), all pointing towards the advantages that disfluencies can have in listener comprehension.

Disfluency research

The fact that disfluencies can have an advantage in listener comprehension seems to contradict the conclusions from the research field that examines the link between utterance and perceived fluency. It is not yet clear when certain disfluencies help the listener, and when they interfere with the comprehension process. Kahng (2018) examined the role of silent pauses in fluency perceptions, and was especially interested in the distribution of these pauses. Native English speakers had to rate 74 English L2 speech samples from 37 Korean speakers on a scale of 1 (=extremely disfluent) to 9 (=extremely fluent). They were provided with explicit instructions prior to the rating task; they were asked to base their judgments on speed, pause and repair phenomena. Kahng (2018) eventually found that speech that contained pauses between clauses was rated as more fluent than speech that contained pauses within clauses, indicating that listeners are very sensitive to pause location. Kahng addresses how silent pauses can be distinguished: a ‘hesitant’ or ‘performance-based’ pause is based on a speaker’s natural prosody and is related to delays in planning and production processes, whereas a ‘prosodic’ pause separates utterances into intonational phrases and is thus part of the rhythmic structure of speech (p. 571). The difference is in the location of these pauses: prosodic pauses occur at intonational phrase or clause boundaries, whereas hesitant pauses can occur anywhere. Kahng (2018) found that listeners seem to understand that pauses within a clause tend to reflect reduced cognitive fluency, and such pauses may therefore be more likely to influence the eventual fluency rating than prosodic pauses.

Bosker, Quené, Sanders and De Jong (2014) also sought to examine the role of disfluencies, and were interested in native speech as well as nonnative speech. In previous research in which fluency evaluations were made, native disfluencies had been disregarded – as natives were seen as fluent by default – but Bosker et al. (2014) questioned that assumption by examining the difference in perception of disfluencies in native and nonnative speech. They designed two experiments in which native and nonnative Dutch speech fragments were rated by listening participants for fluency on a scale of 1 (=not fluent at all) to 9 (=very fluent). In the first experiment, silent pauses were manipulated in length; and in the second experiment, speech rate was manipulated. For the first experiment, they found speech with longer silent pauses was rated lower in fluency than speech with shorter silent pauses; and for the second experiment, they found that faster speech was rated higher in fluency than slower speech. These findings were true for both native and nonnative speech; no differential effects for either native or nonnative speech were found. Bosker et al. therefore concluded that disfluencies in native and nonnative speech are rated similarly. However, it should be noted that Bosker et al. also made use of explicit instructions: prior to the rating experiment, they asked the listeners to base their ratings on the use of silent and filled pauses, speech rate, corrections and repetitions. It therefore could be the case that the listening participants were only focussed on the speech features that were mentioned, and it may be inferred that they have deliberately disregarded any extra factors that may have influenced their rating. The participants may have rated slow native speakers relatively high for other perfectly valid reasons, if they were not asked to base their judgments on speech rate. Research should therefore be conducted that examines the perception of narrow fluency without providing temporal utterance fluency descriptors beforehand, but still prevents the listener from judging overall proficiency instead of narrow fluency. This way, it can be found out what other characteristics of oral speech production influence the listener, which we may have been overlooking when researching narrow fluency perceptions in the past.

Qualitative research

The bulk of research into narrow fluency relates acoustic measurements to fluency ratings, and is therefore purely quantitative. However, there are studies that provide qualitative research as well. Préfontaine and Kormos (2016) investigated the perception of French L2 fluency, and they assembled qualitative comments made by three raters who rated the French L2 performances of 40 speaking participants. Préfontaine and Kormos claimed that temporal measures attributed to narrow fluency “do not take full account of the range of differences L2 speakers show in the fluency of their speech and do not offer detailed insights into the wide variety of impressions listeners hold about fluent performance” (p. 152), which is why they did not provide any definition of fluency before the experiment to serve as a guide. They allowed the raters to make their own judgments about French L2 fluency instead, and hoped to find temporal as well as non-temporal factors (p. 155). In the results, they found comments that contradicted earlier quantitative research: non-lexical filled pauses, for example, were seen to contribute to a more native-like language use:

“Of course, this student hesitates, makes a lot of ‘uh’ sounds, but he often adds a conjunction (et, euh, mais, alors, donc) ‘and, um, but, so, therefore’ to ensure the link with the preceding sentence. His hesitations seem normal because francophones use the same filler trick.” Rater 2, participant 9. (p. 159)

They also found that self-corrections do not always negatively impact on perceptions of fluency, and are sometimes even seen as sounding more native-like. Speech rate was seen as a more complex feature, in which speech should not be too slow, for it will not catch the listener’s attention; but speech that is too fast will cause lower comprehension in the listener. Conclusions such as these are of high value for narrow fluency research, as they elucidate why findings in previous quantitative research in fluency perception have been contradictory. Additionally, conclusions that point towards a higher perception of fluency when disfluencies are uttered agree with the psycholinguistic studies that found advantages for listeners’ comprehension in disfluencies. Qualitative research shows us that fluency perception does not only involve temporal features; rather, temporal features seem to be inextricably intertwined with non-temporal features when it involves fluency perceptions.

Other findings by Préfontaine and Kormos involved further non-temporal characteristics, but these were relevant to French specifically: rhythm (“the regular, patterned beat of stressed and unstressed syllables and pauses in an utterance” (p. 156)), for example, was considered a more salient quality in the perception of French fluency than speech rate. For other languages, qualitative studies such as these have not yet been done, though they are necessary: “investigating how ratings of fluency are related to listeners’ perceptions in various languages is important as there might be considerable variation across languages in the acoustic features of speech that contribute to judgments of fluency” (Préfontaine, Kormos, and Johnson, 2016, p. 57). There has been research into the perception of Dutch fluency (Bosker et al., 2014, discussed above), though that study considers temporal measures of utterance fluency only; it has not yet been researched whether the concept of narrow fluency consists of more than these measures. What should be kept in mind when examining this, however, is that raters should interpret the concept of narrow fluency, and that judgments of overall proficiency should be prevented.

In conclusion, a full understanding of all factors that influence the perception of fluency is necessary in order to gain more understanding of the concept of narrow fluency. For the French language, Préfontaine and Kormos (2016) have already provided research of this sort, though they researched broad fluency (i.e. overall proficiency) rather than narrow fluency. This study tries to fill this research gap: it provides a detailed focus on listener perception by examining all facets of narrow fluency, including the ones that are often disregarded in narrow fluency research. In order to accomplish this, an experiment will be conducted in which participants receive a definition of cognitive fluency before they rate speech fragments on a linear scale. Additionally, this experiment will be juxtaposed with a similar experiment that uses explicit instructions, such as the ones that Bosker et al. (2014) provided prior to their rating experiment. This way, it can be examined how much influence the instructions given prior to a rating task have on the raters’ fluency perceptions, which may then give valuable information for narrow fluency research in the future.


Section 3: The present research

The present research is mainly based on and inspired by the study by Bosker et al. (2014), The Perception of Fluency in Native and Nonnative Speech. This is an example of research that investigates the relationship between perceived fluency and utterance fluency, and that uses a rating experiment with explicit instructions provided prior to the rating task. They concluded that disfluencies are rated similarly for native and nonnative speakers: faster speech is rated as more fluent, and speech with longer silent pauses is rated as less fluent, regardless of whether the speech is uttered by a native or nonnative speaker.

This thesis aims to repeat this study, but for two different groups. The participants in the first group are given the same instructions as Bosker et al. used – i.e. instructions that ask the participants to base their ratings on speech rate, silent and filled pauses, and self-corrections and repetitions – and will be referred to as the explicit group. The participants in the other group are provided with different instructions: they are given a definition of cognitive fluency in the narrow sense, and are instructed to base their judgments on their own interpretation of this definition. This group will be referred to as the implicit group. In order to find out what the participants take into consideration when judging narrow fluency, qualitative analyses will be conducted. The results of the two groups will also be juxtaposed, so that it becomes clear in what way the instructions provided prior to a rating task influence the eventual perceptions of Dutch speech. In short, the goal of this research is to provide a full understanding of the perception of Dutch fluency in the narrow sense on the one hand, and to explain how dependent this perception is on the instructions provided before a rating task on the other hand. Therefore, the following two research questions will be answered:

RQ 1: Which temporal and non-temporal factors constitute the perception of Dutch fluency?

RQ 2: To what extent do the perceptions of Dutch fluency differ between having received implicit or explicit instructions prior to the rating task?

For this study, the same stimuli will be used as the stimuli in Bosker et al.’s experiment. The explicit group will also be given the same instructions as Bosker et al. used; the explicit group experiment is therefore a repetition of Bosker et al.’s experiment, but smaller in scope. Therefore, the conclusions that have already been provided by Bosker et al. will be used here, as the goal of this explicit group experiment is to provide qualitative data that may elucidate these conclusions. The implicit group experiment, however, has not been conducted before, which is why I will use more participants for this group so that quantitative analysis can be conducted as well. In section 4, this will be explained further.

3.1 Thesis outline

In section 4, the methodology will be explained; Bosker et al. (2014)’s stimuli as well as the set-up of this study’s experiment will be clarified. Section 5 revolves around the first research question, and treats the qualitative results of the explicit and the implicit group together. Section 5 names the categories that emerge when participants are asked to evaluate narrow fluency, and each category is then described and explained. The second research question will be treated in section 6, in which the data of the implicit and explicit group will be considered separately. First, the quantitative data of the implicit group will be examined, so that the conclusions of the implicit group can then be compared to the conclusions of the explicit group. This will be elucidated by the qualitative data of both group experiments, and section 6 will end with a discussion. Section 7 then contains a general discussion in which all data will be summarised, as well as a description of the limitations of this study. The conclusion of this thesis will follow in section 8. Section 9 contains the references, and the appendices can be found in section 10.


Section 4: Methodology

4.1 Participants

4.1.1 Implicit group

For the implicit group experiment, 19 participants took part on a voluntary basis. All of these participants were native speakers of Dutch; 12 of them were female and 7 were male. Their ages ranged between 21 and 25 years, and all of them reported having normal hearing. Of these 19 participants, 4 (2 male and 2 female) were asked to partake in stimulated recall sessions. These four were also asked about their education and teaching experience, and for their permission to audio-record the stimulated recall session. All four participants differed in level of education. Only one person was already familiar with the field of linguistics; they reported having obtained a Research Master’s degree at a university, as well as having experience in teaching Dutch as a second language to Korean immigrants.

4.1.2 Explicit group

The explicit group experiment was only focussed on obtaining qualitative data, which is why only 4 participants were needed here. They all partook in a stimulated recall session. All four participants were female, ranging from 22 to 26 years old. Just as in the implicit group, one of these participants had completed a Research Master’s degree in Linguistics, and was therefore already familiar with linguistics and the concept of fluency. All four participants had different educational backgrounds and they all reported to have normal hearing.

4.2 Materials

4.2.1 Stimulus description

For this experiment, I have used the same stimuli as Bosker et al. (2014) (henceforth Bosker), who voluntarily provided them to me. I will first explain Bosker’s stimuli and experiment in order to provide a full understanding of it before explaining how I incorporated the stimuli into my own experiment.

Bosker first obtained speech recordings from native and nonnative Dutch speakers from the What Is Speaking Proficiency corpus (WISP) in Amsterdam (De Jong, Steinel, Florijn, Schoonen, Hulstijn, 2012). The recordings in this corpus consisted of monologic speaking performances on different topics, which varied in complexity, formality and discourse type. For each task, the participant was shown an instruction screen with a picture of the communicative situation about the topic, and was then asked to role play as if they were actually speaking to an audience. The three topics that Bosker chose for his research are described in table 2.

Table 2: Descriptions of the selected topics (Bosker et al., 2014)

Topic | CEFR level | Characteristics | Description
Topic 1 | B1 | Simple, formal, descriptive | The participant, who has witnessed a road accident some time ago, is in a courtroom, describing to the judge what had happened.
Topic 2 | B1 | Simple, formal, argumentative | The participant is present at a neighbourhood meeting in which an official has just proposed to build a school playground, separated by a road from the school building. The participant gets up to speak, takes the floor, and argues against the planned location of the playground.
Topic 3 | B2 | Complex, formal, argumentative | The participant, who is the manager of a supermarket, addresses a neighbourhood meeting and argues which one of three alternative plans for building a car park is to be preferred.

In his experiments, Bosker found a topic effect – the ratings of the speech fragments were significantly higher in topics 2 and 3 than in topic 1. In my research, I will not try to eliminate this effect; I am more interested in maintaining a high diversity in topics and speakers, because this will generate different kinds of qualitative comments. Besides, 3 different topics are more likely to hold the participant’s attention during the rating process than 1 or 2, as listening to different versions of the same topic may become tedious.

Bosker chose 10 native and 10 nonnative speakers of Dutch for his stimuli, of which the nonnative speakers all had an intermediate proficiency in Dutch. These 20 people were asked to do the three assignments mentioned above, which resulted in 60 speech fragments total, which were all about 2 minutes long. The fragments were then shortened to approximately 20 seconds of roughly the middle of each recording, all starting with a phrase boundary, and all ending at a pause (>250 milliseconds). These 60 speech fragments then served as a baseline for Bosker’s research. Bosker then did two experiments, of which the first experiment concerned the length of the silent pauses. Each speech fragment was manipulated three times, which resulted in three conditions for each fragment: the NoPauses condition, in which pauses <250ms were removed; the ShortPauses condition, in which pauses >250ms were changed to a duration of 250-500ms; and the LongPauses condition, in which pauses >250ms were changed to a duration of 750-1000ms. Applying each of these conditions to the 60 fragments gave him 180 speech fragments total for his first experiment.

The second experiment concerned the speed of speech, and therefore contained speech rate manipulations. The speed of nonnative speech was sped up to the mean value of the native speakers, and the native speech was slowed down to the mean value of nonnative speakers. Bosker applied 3 types of manipulations for this experiment: the original speech fragment (no manipulations applied); Articulation Rate Manipulations (ARM) (speech with its speech intervals manipulated); and Speech Rate Manipulations (SRM) (speech with both its speech intervals and its silent intervals manipulated simultaneously). In the case of the native speech fragments, the ARM and SRM manipulations were slower than the original; and in the case of the nonnative speech fragments, the ARM and SRM manipulations were faster than the original. Each of the 60 fragments with its three conditions again resulted in 180 speech fragments total. Bosker made sure that each participant rated each item in only one condition using a Latin-square design, which means that each participant rated 60 items. The sessions lasted approximately 45 minutes.

Since I have mostly focussed on qualitative data, my experiment is smaller and I have therefore used fewer fragments than Bosker. Instead of conducting two separate experiments with 60 speech fragments each, I decided to construct one experiment, in which the participant listened to 12 speech fragments: 6 containing pausing manipulations, and the other 6 containing speech rate manipulations. 12 different speakers were therefore chosen – 6 native and 6 nonnative – from the 20 speakers in Bosker’s corpus. There were 6 different conditions (applied to native and nonnative speakers), and every participant listened to one voice per condition, arranged in a Latin-square design with three groups of listeners for counterbalancing; the distribution of the fragments can be found in table 3.
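The counterbalancing described above can be illustrated with a short sketch. It is a minimal reconstruction, assuming that conditions are rotated cyclically over the three lists; the function and variable names are hypothetical and are not taken from the original experiment files. The speaker labels and condition abbreviations are those used in table 3.

```python
# Illustrative sketch of a cyclic Latin-square rotation (names are hypothetical).
PAUSE = ["NP", "SP", "LP"]      # pause manipulations
RATE = ["OR", "ARM", "SRM"]     # speech rate manipulations

# Each block: (speaker group prefix, conditions, three speaker/topic labels)
BLOCKS = [
    ("nat", PAUSE, ["pp901_T1", "pp945_T2", "pp951_T3"]),
    ("non", PAUSE, ["pp1036_T1", "pp1033_T2", "pp1025_T3"]),
    ("nat", RATE, ["pp928_T1", "pp943_T2", "pp935_T3"]),
    ("non", RATE, ["pp1039_T1", "pp1006_T2", "pp1046_T3"]),
]


def build_lists(blocks, n_lists=3):
    """Map each speaker to the condition it receives in list 1, 2 and 3.

    Speaker i of a block gets condition (i + j) mod k in list j, so every
    list contains each condition exactly once per block.
    """
    table = {}
    for prefix, conditions, speakers in blocks:
        k = len(conditions)
        for i, speaker in enumerate(speakers):
            table[speaker] = [prefix + conditions[(i + j) % k]
                              for j in range(n_lists)]
    return table


distribution = build_lists(BLOCKS)
for speaker, row in distribution.items():
    print(speaker, *row)
```

Under this rotation, every list contains 12 fragments, and each of the 12 combinations of speaker group and condition occurs exactly once per list, so no listener hears the same speaker in more than one condition.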


Table 3: Distribution of the fragments in lists

Speaker and topic | LIST 1 | LIST 2 | LIST 3
pp901_T1 | natNP | natSP | natLP
pp945_T2 | natSP | natLP | natNP
pp951_T3 | natLP | natNP | natSP
pp1036_T1 | nonNP | nonSP | nonLP
pp1033_T2 | nonSP | nonLP | nonNP
pp1025_T3 | nonLP | nonNP | nonSP
pp928_T1 | natOR | natARM | natSRM
pp943_T2 | natARM | natSRM | natOR
pp935_T3 | natSRM | natOR | natARM
pp1039_T1 | nonOR | nonARM | nonSRM
pp1006_T2 | nonARM | nonSRM | nonOR
pp1046_T3 | nonSRM | nonOR | nonARM

Note: nat = native; non = nonnative; NP = No Pause condition; SP = Short Pause condition; LP = Long Pause condition; OR = original condition; ARM = Articulation Rate Manipulation; SRM = Speech Rate Manipulation.

The speakers that I chose from the 20 speakers that Bosker assembled had already been matched on the number of silent pauses by Bosker. In making the selection of speakers for my research, I first made sure there was an equal number of male and female speakers: from the 10 native speakers I selected 3 male and 3 female voices, and likewise from the 10 nonnative speakers. I attempted to select speakers who sounded as if they belonged to different age categories and who had different voice qualities, so that the selected speakers would sound as diverse as possible. Table 4 shows the means and standard deviations of acoustic measurements of the speech fragments that I used for my experiment (the acoustic measurements of all chosen fragments can be found in the appendix).

Table 4: Acoustic measurements of the 12 chosen speakers (M (SD)).

Measure | Natives | Nonnatives
Number of syllables | 82.5 (20.4) | 71.3 (9.7)
Seconds of spoken time | 16.4 (3.5) | 18.0 (1.3)
Seconds of silent pausing | 3.7 (1.2) | 3.1 (0.9)
Number of silent pauses | 6 (1) | 5 (1.2)
Number of repetitions | 0 (0.4) | 1 (1.3)
Number of AS-units | 4 (2.3) | 5 (1.9)
Number of filled pauses | 2 (1.2) | 6 (2.4)
Speech rate (syllables / total time) | 4.1 (0.7) | 3.4 (0.6)

4.2.2 Instructions

The only difference between the explicit group experiment and the implicit group experiment lies in the instructions provided prior to the rating task. Since the focus of this thesis is on the narrow definition of fluency, I had to prevent the listeners from rating the speech fragments on overall proficiency. I therefore mentioned in the instructions for both groups that I was not interested in what they thought of the speaker’s overall proficiency, but rather in the smoothness, flow, and ease of the speech. Another factor that may have helped the participants to focus on narrow fluency rather than broad fluency is the fact that they also had to rate native speakers. This way, they were discouraged from treating native speech as a norm to compare the nonnative speech to, and they were challenged to also critically rate native speech for its ease and smoothness.

In the instructions for the explicit group experiment, I mentioned that the participants should base their rating on speech rate, filled pauses, silent pauses, self-corrections and repetitions. In the instructions for the implicit group experiment, I did not provide these utterance fluency characteristics, but asked the participants to consider what cognitive fluency meant to them, and to base their judgments on their own interpretation of the provided definition. The exact instructions for both groups can be found in the appendix.

4.3 Procedure

4.3.1 Survey

The speech fragments were presented to all participants (implicit and explicit group, 23 participants in total) on the platform Qualtrics. Several surveys were created there, in each of which the sound files were accompanied by a Likert scale from 1 (= not fluent at all) to 9 (= very fluent). The participants were encouraged to sit in a quiet room and to wear headphones while doing the survey, so that no background noise could interfere with or distract them from the task. Following the above-mentioned instructions, the participants were given three practice items, with which they could familiarise themselves with the assignment as well as with the three topics; this was intended to reduce familiarity bias. After the practice items, the actual experiment started. The 12 fragments were shown in a randomised order, so that a decline in rating quality for the last few fragments as a result of fatigue would not always affect the same fragments.

After doing the survey, the participants who would not partake in the stimulated recall sessions (15 participants of the implicit group) were sent a Microsoft Word document with two debriefing questions, which they had to answer directly after completing the survey. These questions asked the participants what they had mainly based their rating choices on, what they thought about the survey, and whether they had struggled with it. The goal of these debriefing questions was to obtain some extra qualitative data from the people that would not partake in the stimulated recall sessions. Completing the survey and answering the questions directly afterwards took about 15 minutes per participant.

4.3.2 Stimulated recall procedure

The remaining 8 participants partook in the stimulated recall sessions. The 15 participants who did not partake in these sessions rated lists that were completely randomised. For the stimulated recall participants, however, I had to know the exact order of the fragments in order to replay them during the stimulated recall sessions, which is why I designed a pseudo-randomised order of the fragments for these 8 participants.

After completing the survey, the 8 stimulated recall participants had a short break of about 3 minutes in which I had time to access their ratings. I wrote them down and video-called the participants on Skype, in which I explained that I was interested in what they thought while they listened to the sound files, and what might have influenced their ratings. I then played the first sound file – encouraging the participants to interrupt whenever that would be necessary, and to provide each thought that they may have had during the rating process – and reminded them of the rating that they gave. They then started to explain their reasons for giving that rating. With each participant, I went through all 12 sound fragments, which took about 20 minutes per participant. After going through the sound files, I asked a debriefing question in which I asked what they thought about the task and whether they struggled with it, and why. The sound of each Skype session was recorded.

4.4 Data analysis

The goals of this research are to extend the knowledge of the perception of Dutch fluency on the one hand, and to juxtapose the results following two different kinds of instructions on the other hand. Section 5 will focus on answering research question 1 (which temporal and non-temporal factors constitute the perception of Dutch fluency?), and section 6 on research question 2 (to what extent do the perceptions of Dutch fluency differ between having received implicit or explicit instructions prior to the rating task?).

4.4.1 Data analysis for section 5

Section 5 focuses only on the qualitative data in order to provide a full understanding of the perception of Dutch fluency (in the narrow sense). All qualitative data is therefore discussed in this section, and the explicit and implicit group data are taken together. The qualitative data consist of the transcripts of the 8 stimulated recall sessions on the one hand, and the answers to the debriefing questions by the remaining 15 participants on the other hand.

Stimulated recall sessions

Préfontaine and Kormos (2016) also examined the perception of fluency, and they conducted qualitative research as well. They eventually found 9 different speech categories that their participants commented on – these can be seen in table 5 below. As I am conducting similar research, these categories served as a guide as to which themes could possibly emerge from my own data. However, Préfontaine and Kormos examined French L2 speech, whereas I focussed on Dutch L1 and L2 speech, which is why I expected to find other themes as well.

Table 5: Perception of fluency themes and descriptions (Préfontaine and Kormos, 2016, p. 156)

Theme | Description
Speed | Rapidity or rate of speech
Pause phenomena | Temporary interruption to the stream of speech
Lexical retrieval | Accessing words or expressions in the mental lexicon
Self-correction | Perceived deficiencies in one’s own language output (Dörnyei and Kormos 1998)
Efficiency / Effortlessness | Reference to speaking ease or difficulty and underlying speech planning and processing efficiency in L2 communication
Rhythm | The regular, patterned beat of stressed and unstressed syllables and pauses in an utterance
Expressivity / psychological state | Expressivity or inner psychological state of the interlocutor conveyed in the voice
Grammatical competence | Reference to the structural and syntax rules that govern a language
Native-like oral discourse features | Native-like speech manifestations in spoken discourse

The data analysis of my experiment went as follows: the recorded Skype sessions were transcribed and the transcriptions were printed out, after which the comments made by the raters were colour-coded according to the themes that emerged. These colour-coded transcriptions can be found in the appendix. Comments were only marked and colour-coded when it was clear that they had actually influenced the eventual rating.

Example A: “…het loopt lekker door, als ze niet op een woord komt dan herhaalt ze het nog een keer of ze zegt wel eens ‘uhh’ tussendoor, maar ik vind het bij deze echt totaal niet storend.” - P317¹

[…it runs smoothly, if she doesn’t know a word she’ll repeat it one more time, or she sometimes says ‘uhh’, but with this speaker I don’t mind it at all.]

¹ In order to preserve the participants’ anonymity, each of them was assigned a personal participant number: the first digit indicates the list that they were assigned to, and the remaining digits indicate their sequence number. For example, P109 is the 9th participant, who rated list 1; and P210 is the 10th participant, who rated list 2; etc.

In example A, comments were made about repetitions and filled pauses, but these comments were not counted, as they were followed by a statement indicating that they had not influenced the eventual rating. All comments that expressed some attitude towards speech characteristics and that influenced the eventual rating were counted, whether they influenced the rating positively (“ze praatte redelijk snel … dus ik vond het wel gewoon goed” [she talked quite fast… so I thought it was pretty good] - P109) or negatively (“ik vond haar traag” [I thought she was slow] - P109). As there may be different reasons for giving a fragment a particular rating, one fragment could be assigned several categories. However, participants sometimes tended to repeat themselves, which is why repeated mentions of one category were counted only once per speech fragment. An example can be found below: here, five different comments were made about one fragment, but two of those comments were about speech rate. The participant seemed to have changed their mind about the speech rate, so eventually, the speech rate category was counted only once.

Example B: “Ik vond wel dat ze een fijne heldere stem had. Ik hoorde wel aan haar dat de Nederlandse taal volgens mij niet haar eerste taal is, ik denk dat dat ook een rol heeft gespeeld. Ze praatte rustig, dus qua snelheid vond ik hem over het algemeen goed, maar ik vond wel dat ze best wel vaak van die onnatuurlijke pauzes nam. Ja, ik denk dat deze toch op snelheid niet zo hoog heeft gekregen als dat het misschien had kunnen krijgen. ” P108, rating natLP

[I thought she had a nice, clear voice. I did hear that Dutch wasn’t her first language, I think that also influenced me. She talked slowly, so in general I thought her speech rate was good, though she took a lot of unnatural pauses. Yeah, I think that this fragment could have been rated higher considering the speech rate.]
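The counting rule illustrated by Example B can be sketched in a few lines. This is only an illustration, assuming that each colour-coded comment is stored as a (participant, fragment, category) triple; the function name and the category labels assigned to Example B are assumptions made for the sake of the example, not the coding recorded in the appendix.

```python
from collections import Counter


def count_categories(coded_comments):
    """Count categories, at most once per participant and fragment.

    coded_comments: iterable of (participant, fragment, category) triples,
    one per colour-coded comment that influenced the rating.
    """
    unique = set(coded_comments)  # repeated mentions of a category collapse
    return Counter(category for _, _, category in unique)


# Hypothetical coding of the five comments in Example B
example_b = [
    ("P108", "natLP", "Voice quality judgments"),
    ("P108", "natLP", "Native/nonnative judgments"),
    ("P108", "natLP", "Speech rate"),
    ("P108", "natLP", "Silent pauses"),
    ("P108", "natLP", "Speech rate"),  # second mention, not counted again
]

counts = count_categories(example_b)
print(counts)
```

Using a set before counting means that the order of the comments does not matter, and that the same category can still be counted again for a different fragment or a different participant.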

Debriefing question answers

The other 15 participants were sent a Microsoft Word document with the following two questions:

1. Waar heb jij jouw beoordelingskeuzes voornamelijk op gebaseerd? (wat betekende ‘vloeiendheid’ voor jou?)

[What did you mainly base your rating choices on? (what did ‘fluency’ mean to you?)]

2. Heb je nog overige opmerkingen? (vond je het leuk, waarom wel/niet, vond je bepaalde dingen lastig, etc.)

[Is there anything else you would like to share? (did you like it, why did you/did you not, did you struggle with certain things, etc.)]

All 15 participants typed in their answers and sent the document back to me. Only the answers to question 1 were analysed; the same colour coding method was applied during the analysis of this data. The answers to the two debriefing questions can be found in the appendix.

4.4.2 Data analysis for section 6

Section 6 focuses on the differences between the implicit group and the explicit group, and examines how the phonetic manipulations affected the implicit group on the one hand and the explicit group on the other hand. Quantitative data analysis is therefore necessary for the implicit group (as quantitative measurements for explicit instructions have already been given by Bosker). For that, the ratings of the 19 participants of the implicit group were collected in an Excel spreadsheet. In order to see how the manipulated sound files were interpreted by the raters, one-tailed paired t-tests were performed. Overall ratings across native and nonnative speakers were also compared through a one-tailed paired t-test. After that, the conclusions for the two groups were compared, and it was discussed how the results of the two groups differ, and how the phonetic manipulations of the speech fragments affected the two groups. Other perceptual differences between the two groups were also discussed.
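For readers who wish to retrace the paired comparison outside a spreadsheet, the test statistic can be sketched as follows. This is a minimal illustration with made-up ratings, not the thesis data or the spreadsheet formulas that were actually used; the function name is hypothetical. It computes only the t statistic and the degrees of freedom. The one-tailed p-value would still have to be read from a t distribution with those degrees of freedom, for example in a statistics package.

```python
import math
from statistics import mean, stdev


def paired_t(ratings_a, ratings_b):
    """Return (t, df) for a paired t-test on two equal-length rating lists."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(ratings_a, ratings_b)]
    n = len(diffs)
    # Mean difference divided by its standard error (sample SD / sqrt(n));
    # raises ZeroDivisionError if all differences are identical.
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Made-up ratings on the 1-9 scale, for illustration only
original = [6, 7, 5, 8]
manipulated = [5, 5, 4, 6]

t_value, df = paired_t(original, manipulated)
print(round(t_value, 3), df)
```

A positive t value here indicates that the first set of ratings is higher on average; in a one-tailed test, only differences in the predicted direction count as evidence for the hypothesis.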


Section 5: Perceptions of Dutch fluency

This section focusses on the first research question: which temporal and non-temporal factors constitute the perception of Dutch fluency? The function of this section is to show which speech factors emerge when participants are asked to rate fluency in the narrow sense – whether they are provided with utterance fluency characteristics to base their judgments on, or not. Therefore, the data of the implicit group and explicit group are taken together, in order to show all the different characteristics of speech that seem to influence the participants when they are asked to judge narrow fluency. In short, the qualitative data of all participants are taken into consideration here: the transcripts of the 8 stimulated recall participants, as well as the answers to debriefing question 1 of the 15 remaining participants (“Waar heb jij jouw beoordelingskeuzes voornamelijk op gebaseerd? (wat betekende ‘vloeiendheid’ voor jou?)” [What did you mainly base your rating choices on? (what did ‘fluency’ mean to you?)]).

5.1 Results: influencing temporal and non-temporal factors

In table 6, the different categories that emerged from the qualitative data can be seen, and they will be discussed with some examples in the discussion part below.

Table 6: The 17 categories attributed to qualitative comments in fluency judgments

Influencing speech factors | Frequency (total: 321)
Speech rate | 48
Silent pauses | 43
Filled pauses | 39
Native/nonnative judgments | 28
Self-corrections and repetitions | 24
Accent/pronunciation | 23
Processing efficiency | 21
Intonation/rhythm | 19
Grammatical accuracy | 19
Expressivity/psychological state | 14
Lexical complexity | 13
Voice quality judgments | 10
Sentence structure | 9
Native-like interjections | 3
Comprehensibility | 3
Articulation | 3
Personality judgments | 2
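As a quick consistency check on table 6, the frequencies can be summed and converted into shares of the total. The sketch below only re-uses the counts listed in the table; the variable names are illustrative, and the grouping of four categories as "instructed" is an assumption based on the utterance fluency characteristics named in the explicit instructions.

```python
# Frequencies as listed in table 6
FREQUENCIES = {
    "Speech rate": 48,
    "Silent pauses": 43,
    "Filled pauses": 39,
    "Native/nonnative judgments": 28,
    "Self-corrections and repetitions": 24,
    "Accent/pronunciation": 23,
    "Processing efficiency": 21,
    "Intonation/rhythm": 19,
    "Grammatical accuracy": 19,
    "Expressivity/psychological state": 14,
    "Lexical complexity": 13,
    "Voice quality judgments": 10,
    "Sentence structure": 9,
    "Native-like interjections": 3,
    "Comprehensibility": 3,
    "Articulation": 3,
    "Personality judgments": 2,
}

total = sum(FREQUENCIES.values())

# Share of each category in percent, rounded to one decimal
shares = {name: round(100 * n / total, 1) for name, n in FREQUENCIES.items()}

# Assumption: these four categories correspond to the characteristics
# named in the explicit instructions.
INSTRUCTED = ["Speech rate", "Silent pauses", "Filled pauses",
              "Self-corrections and repetitions"]
instructed_count = sum(FREQUENCIES[name] for name in INSTRUCTED)

print(total, instructed_count, shares["Speech rate"])
```

Under that assumption, the remaining counts belong to factors that were not named in the explicit instructions, which is the part of the data that the discussion below turns to after the temporal categories.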

5.2 Discussion: explanation of the speech factors

Speech rate (frequency: 48/321)

In previous research, it has often been concluded that speech rate is an important indicator of fluency – faster speech rate indicates a higher perception of fluency (Bosker et al. 2013; Derwing et al. 2004; Kormos and Dénes, 2004; Lennon, 1990). It can be noted in this experiment that on the one hand, participants seem to agree with this idea: P102, for example, refers to a lower fluency rating as a consequence of a lower speech rate (“hij praat wat langzamer” [he talks slower]), and a fast speech rate positively influenced the rating (“hij praat vlot” [he talks quickly]). More participants used this argument in their stimulated recall sessions. On the other hand, speech rate should not be too fast, since this will reduce comprehensibility, as well as the fluency rating itself – this was also found by Préfontaine and Kormos (2016, p. 157).


1: “Vooral de laatste twee zinnen, ik weet niet meer precies wat ie zei, maar het ging heel snel, en of ie nou dit bedoelde of dat bedoelde, was voor mij niet helemaal duidelijk.”² P210 (impl), rating nonNP

[Especially the last two sentences, I don’t remember exactly what he said, but it went very quickly, and it wasn’t really clear to me what he meant.]

It was also often stated that a stable speech rate is optimal; if the speaker changes their speech rate often, it tends to confuse or distract the listener. The speech rate should be balanced; not too slow and not too fast, and if the speaker interrupts their flow of speech with some disfluencies, it is seen as less fluent.

2: “Maar het grootste punt waardoor deze voor mij laag scoorde is door de snelheid, omdat ie mégasnel ging, en dan weer niet, en het was echt een soort van rollercoaster van zijn ritme in hoe die het vertelde. Dus daar raakte ik nogal van afgeleid.” P108 (expl), rating nonLP

[But the biggest reason for his low score is because of the speed, he went super fast, and then he slowed down again, and his rhythm was really kind of like a rollercoaster. It distracted me a lot.]

3: “speech rate was fijn. Als in, ja, gewoon vlot, niet heel snel, maar ook niet heel langzaam.” P216 (expl), rating nonLP

[I liked the speech rate. It was fast, though not too fast, but also not too slow.]

Silent pauses (frequency: 43/321)

It has previously been concluded that less silent pausing may lead to higher perceptions of fluency (Bosker et al. 2013; Lennon, 1990; Riggenbach, 1991), although we are conscious of the fact that pauses naturally occur in spontaneous speech, fulfilling several functions in linguistic processing (Clark and Fox Tree, 2002). Silent pauses can differ in frequency, length, and distribution, and listeners seemed to be conscious of this, which is why the different kinds of comments about silent pauses in the stimulated recall data were also distinguished during the data analysis. There were 35 comments about silent pauses in the stimulated recall data. Most of these concerned the length of silent pauses (12 out of 35 comments): the longer the silent pause, the less fluent the speaker was rated. 9 out of 35 comments were made about the frequency of silent pauses, and 4 out of 35 about the distribution. The remaining 10 comments were about silent pausing in general.

4: “Ik heb vooral gelet op de ‘flow’ van de woorden, dus dat er niet te veel onlogische pauzes zaten tussen woorden die er niet horen te zitten of dat er pauzes zaten op plekken die voor mij niet logisch overkwamen.” P212 (impl)

[I mainly paid attention to the ‘flow’ of the words, meaning that there were not too many illogical pauses between words where they do not belong, or pauses in places that did not seem logical to me.]

5: “Maar er zaten soms heel erg lange pauzes tussen alle woorden, en hij praat heel traag en ook wel een beetje monotoon.” P317 (impl), rating natSRM

[But there were sometimes very long pauses in between all the words, and he talks very slowly and also a bit monotonously.]

6: “De flow was aan één stuk door hetzelfde, op een prettige manier. Er waren helemaal geen stiltes.” P108 (expl), rating natNP

[The flow was the same all the way through, in a pleasant way. There were no silences at all.]

2 As the comments were literally transcribed, some of them contain interjections and fillers that are likely to occur in spoken language, but do not typically appear in written language. Therefore, words and phrases that do not have a function but are merely a byproduct of spoken language have been deleted, so that the comments are easier to understand when reading this thesis. In the appendix, all original (i.e. unedited) comments can be found.


Filled pauses (frequency: 39/321)

Filled pauses are also considered disfluencies in the existing literature, although Préfontaine and Kormos (2016) found that the use of filled pauses can also raise fluency ratings: filled pauses were often accompanied by French fillers such as ‘euh, enfin, c’est à dire, bon, alors etc.’ (p. 159), and the speech was therefore seen as more fluent, since native speakers of French use the same filler trick. No similar finding emerged for Dutch in my experiment; filled pauses were always seen as disfluencies and mostly resulted in a lower fluency rating:

7: “Wat voor mij denk ik wel heel erg maakt of iemand wel of niet vloeiend praat… heel vaak zegt iemand ‘ehh’, dat soort woorden.” P102 (impl), rating nonSRM

[The thing that makes someone’s speech very fluent or disfluent to me is… often people say ‘uhh’, those kinds of words.]

8: “Had een 9 gekregen als ze minder gevulde pauzes had gebruikt.” P216 (expl), rating natSP

[She would have gotten a 9 if she had used fewer filled pauses.]

However, sometimes the filled pause was noticed, but it was explicitly mentioned that the participant did not mind this particular filled pause, and that it did not influence the eventual rating:

9: “als ze niet op een woord komt dan herhaalt ze het nog een keer of ze zegt wel eens ‘uhh’ tussendoor, maar ik vind het bij deze echt totaal niet storend.” P317 (impl), rating natOR

[if she can’t think of a word she repeats it one more time, or she sometimes says ‘uhh’ in between, but with this speaker I really don’t find it bothersome at all.]

Native/nonnative judgments (frequency: 28/321)

Listeners quite often commented on whether they thought the speaker was a native or a nonnative speaker of Dutch. Such comments were made so often that they were eventually counted as a separate category that influenced the perception of fluency, especially because this was often the first thing that participants said when they explained the reasons behind their fluency ratings. Nativeness judgments were noted more often in the comments by the stimulated recall participants than in the answers to the debriefing question by the remaining participants, which suggests that nativeness judgments are more likely to have influenced the listener unconsciously than consciously, although some stimulated recall participants admitted that their judgments about the speaker’s nativeness may have influenced them:

10: “Ik hoorde wel meteen dat Nederlands niet haar eigen taal was, en ik probeerde rekening te houden met, oké, dat is niet hetgene waar ik op moet letten, of Nederlands haar echte taal is, maar ik denk dat dat toch wel bij heel veel dingen heeft meegespeeld.” P108 (expl), rating nonSP

[I instantly heard that Dutch wasn’t her first language, and I tried to think of the fact that I should not pay attention to that, whether Dutch is her first language or not, but I do think that it often played a part.]

11: “Ja.. ik denk toch wel weer dat het je toch beïnvloed als je hoort dat het een native speaker is. Gewoon qua uitspraak en zo.” P323 (expl), rating natNP

[Yeah.. I do think again that it still influences you if you hear that it is a native speaker. Just in terms of pronunciation and so on.]
