Correlated production and perception of similar L2 vowels: Categorical dominance as an indication of robust acquisition

Research Master Thesis Language & Cognition

Amber Nota

21,821 words

21-09-2014

Table of contents

Abstract
1 Introduction
2 Background
2.1 L2 accent acquisition
2.2 L2 speech production
2.3 L2 speech perception
2.4 Theoretical models of L2 accent acquisition
3 Research questions
4 Methodology
4.1 Production task
4.2 Perception task
4.2.1 Word type
4.2.2 Vowels
4.2.3 Vowel category selection
4.2.4 Distractor task
4.2.5 Audio recordings
4.2.5.1 Relevant vowel characteristics
4.2.5.2 Nonce word recordings
4.2.5.3 Vowel splicing
4.2.5.4 Vowel manipulation
4.3 Feasibility pilot study
4.4 Perception task construction
4.4.1 Distractor task construction
4.4.2 Image construction
4.5 Background questionnaire
4.6 Experiment instructions
4.7 Participants
4.8 Data extraction
5 Results
5.1 Background questionnaire
5.2 Left or right side preference
5.3 Vowel categories
5.4 Correlations between production and perception
5.4.1 Young female participants
5.4.2 Young female non-English students
5.4.3 Young female English students
5.4.4 Gender normalisation
5.4.5 Young non-English students
5.4.6 Young English students
5.4.7 Older speakers
6 Discussion
6.1 Production
6.2 Perception
6.3 Production and perception correlated
7 Conclusion
8 References
9 Appendix
9.1 Images used in perception task
9.2 Images used in distractor task
9.2.1 Symbols
9.2.2 Colours

Abstract

1. Introduction

In second language (L2) acquisition, all main theoretical models (i.e. Best’s (1995) Perceptual Assimilation Model (PAM), Kuhl’s (1992) Native Language Magnet Model (NLMM) and Flege’s (1995) Speech Learning Model (SLM)) agree that the most difficult sounds to acquire are those sets of sounds which are phonologically distinct in the L2, but not in the native language. This is because the L2 learner, relying on his or her native sound system, is not capable of distinguishing between those sounds. The general consensus is that, since these speakers cannot perceive the difference, they cannot accurately store the sounds in a mental representation and, as such, are incapable of native-like production of the sounds. Findings from both L2 speech production and perception research confirm this: speakers have trouble accurately perceiving the distinction between sounds which do not exist in their native language, and they frequently display influences of their native language in their L2 speech production of such sounds. This suggests that, in the native-like acquisition of similar sounds, the influence of the first language (L1) sound categories is very difficult to overcome. This thesis investigates to what extent this influence can be overcome by highly proficient L2 learners and, even more importantly, to what extent correlations can be found between their L2 production and perception, as is suggested by the theoretical models. To this end, it considers the English speech production and perception of similar vowels in highly proficient L2 English speakers whose L1 is Dutch.

2. Background

2.1 L2 accent acquisition

and perception are not theoretically impossible, but unlikely to become entirely native-like, as all these factors influence both production and perception. This juxtaposition of the two main views on L2 language acquisition with particular focus on accent acquisition must not be considered to have a binary outcome, as numerous factors such as those mentioned above interact in either scenario. However, as a general observation it can be claimed that, from the point of view of the CP, native-like L2 speech production acquisition is impossible, whereas this is not necessarily the case for native-like L2 speech perception acquisition. Likewise, it can be claimed that, when L2 acquisition is considered to be constrained by mental and external factors, and not by any CP, native-like acquisition of either production or perception is not theoretically impossible, but unlikely.

acquisition are higher in an immersion context, as this would drastically reduce the L1 use and input.

2.2 L2 speech production

Findings from research into both production and perception corroborate these assumptions. In L2 acquisition research, a frequent finding is that, even in highly proficient L2 speakers, L2 speech production may not be native-like, but displays evidence of L1 influence. For L2 English production, such findings are particularly common in the production of Voice Onset Time (Flege & Eefting, 1987; Flege, 1995), syllabification (Altenberg, 2005) and vowel production. In the latter case, vowel production is investigated in terms of both quality and quantity, i.e. vowel length (see for instance Flege, MacKay & Meador, 1999), which can be phonemic, but can also differ between vowels which exhibit spectral differences as well, as is the case in English. Vowel quality refers to the tongue’s relative position during production, i.e. high or low and backed or fronted. Although some debate exists concerning the reliability of formant value measures as an indication of tongue position (see for instance Rathcke & Stuart-Smith, 2014; Kendall & Vaughn, 2014), the first and second formant (F1 and F2) are generally taken as measures of height and front- or backness, respectively. The general finding in L2 vowel production research is that both Age of Acquisition (AoA) (Munro, Flege & MacKay, 1996; Flege, Bohn & Jang, 1997) and Length of Residence (LoR) (Bohn & Flege, 1992) correlate positively with L2 vowel production accuracy.

2.3 L2 speech perception

bilinguals do not have the ability to (easily) perceive contrasts which are not native to them. Research on speech perception consistently reveals that differentiation of non-native contrasts is very complex for adults (Strange, 1995; Strange & Schafer, 2008): the selective perception patterns acquired in the native language are very difficult to modify in L2 acquisition, even under large amounts of exposure. Research on vowel differentiation in perception tasks for various language combinations has shown that, generally speaking, participants have trouble distinguishing between vowels that do not occur in their own language, or similar vowels which are not phonologically distinctive in the native language (Strange & Schafer, 2008). This finding is replicated for participants ranging from naïve to, in some instances, those who have had large amounts of L2 exposure (Levi & Strange, 2008). However, on the discrimination of non-phonologically distinctive similar vowels, it is also the case that some similar vowel pairs are easier to distinguish than others (see for instance Flege, 1995 on the discrimination of various vowel contrasts in L2 English by native Spanish speakers).

2.4 Theoretical models of L2 accent acquisition

occur less frequently, large amounts of prolonged exposure are required. According to Eckman’s (1997) Markedness Differential Hypothesis, acquisition of sounds which are marked due to their low frequency occurs later than that of highly frequent – and therefore less marked – sounds.

One important feature these three models share is that they all assume a correlation between L2 speech perception and production, in which the mental representation of sound categories serves both production and perception. A large amount of research has been conducted into the correlation between production and perception on various features in both first and second language acquisition. The dominant claim that can be made, based on that research, is that perception appears to precede production (Baker & Trofimovich, 2006; Bent, 2005; Bettoni-Techio, Rauber & Koerdich, 2007; Bion, Escudero, Baptista & Rauber, 2005; Flege, 1995; Kluge, Rauber, Reis & Bion, 2007; Lambacher, Martens, Kakehi, Marasinghe & Moholt, 2005; Podlipský, n.d.), although no full consensus has been reached on this view, with some claiming that production precedes perception (Baptista & Bion, 2005; Bohn & Flege, 1997; Gómez-Lacabex & García-Lecumberri, 2007), or that production and perception facilitate each other (Gómez-Lacabex, García-Lecumberri & Cooke, 2005; McAllister, 1997), or, in some earlier research, that the two are unrelated (Lane & Schneider, 1963; Lee, 1996).

and TRAP acquisition, DRESS is much closer to the Dutch vowel category than TRAP. This may mean that the process of acquisition for these two vowel sets is not entirely similar.

3. Research questions

Based on the difficulty of acquisition of similar vowels described in these L2 speech acquisition models, and on the influence of factors such as language exposure and motivation on L2 speech acquisition, this thesis will attempt to answer the following main research question:

1) What is the relation between production and perception of the DRESS, TRAP, GOOSE, and FOOT vowels in highly proficient native Dutch L2 speakers of English?

Specifically, this will be investigated in speakers who acquired English at a Dutch university. This means these speakers are highly proficient in their L2, and are strongly motivated to acquire native-like English to the best of their abilities. However, it also means they do not acquire their L2 English in an immersion environment. Instead, they receive large amounts of L1-impacted input in which the distinctions between these similar vowel sets do not necessarily occur as they do in native L1 input, thereby increasing the difficulty of native-like accent acquisition. This group is continuously influenced by Dutch sound categories, both in their continued L1 input, and in their L1-impacted L2 input. The Dutch equivalents of the DRESS and GOOSE vowel categories therefore continue to exert influence over these speakers’ vowel representations. These speakers typically began to acquire English at school around the age of 11, and started to acquire English at university at the average age of 18. According to all models of L2 accent acquisition, whether Critical Period-based or not, this age would be too late to facilitate native-like L2 speech acquisition. Additionally, these ages of acquisition ensure that the L1 has become successfully entrenched in the speakers’ minds before L2 acquisition (see Schmid, 2011).

influences, amongst which L1 interference, it is expected that correlations will be found, as these models predict a much more equal influence of the L1 on both production and perception.

Based on the main research question and expectations derived from previous research, several subquestions will also be answered:

2) Do the four vowel categories exist separately in each participant, and, if they do, are they native-like categories?

Research based on the correlation between production and perception is founded on the assumption that a participant indeed has separate categories for each investigated sound. As it has been established that acquisition of the four vowel categories in the L2 is difficult for Dutch native speakers, it must therefore first be determined whether the participants indeed have mentally distinct representations of all four vowel categories. With regard to native-like categories, it must of course be kept in mind that the representation of the categories can vary between different varieties of English. However, it can be investigated whether the participant’s production adheres to the distribution of the categories as they are found in the two most influential general varieties of English: General American and British English.

3) Can differences be found between the perception of DRESS and TRAP on the one hand and FOOT and GOOSE on the other?

Based on the finding in previous research that some vowel pairs are easier to distinguish than others, it is possible that differences occur in the perception of the two pairs. Based on the characteristics of the four vowel categories and their resemblances to the Dutch vowel categories, it can be expected that perception along the DRESS-TRAP continuum is easier than perception along the FOOT-GOOSE continuum.

4) Can stronger correlations be found for the DRESS and GOOSE vowel categories, which can be expected to be dominant due to L1 Dutch interference?

problems. This potential difference in acquisition may be reflected in differences in (strength of the) correlations between production and perception of the vowel categories.

5) Based on the results, can any indication be found of the direction of the relationship between production and perception, if such a relation is found?

Although the prevailing view on the relation between production and perception is that perception precedes production, no consensus has as yet been reached on this topic. Although this thesis does not take a longitudinal approach or investigate early stages of L2 acquisition, indications concerning the direction of this relationship, for instance in the case of stronger correlations for either production or perception, may still be found.

4. Methodology

To investigate the correlation between production and perception as accurately as possible, the choice was made to use a production task in which the participants’ speech was as free and natural as possible under the circumstances, to ensure that the speech they produced displayed their personal vowel categories as well as possible. For the investigation of perception, the choice was made to use a phonetic categorisation task. This is a perception task with a high degree of control over the input the participants receive. Although this means the input was further removed from natural free speech, this particular task was selected to ensure that those factors on which the participants’ perception was based could be controlled. In addition to these two tasks, the participants were presented with a general background questionnaire.

4.1 Production task

the same fragment, the cognitive load is similar for all participants (Schmid, 2011), making the speech they produce suitable for comparison. While these aspects make this task very suitable for the present purposes, the low frequency of the FOOT and GOOSE vowels makes it less suitable, as it was deemed unlikely, based on the contents of the fragment commonly used for this task, that participants would produce a sufficient number of tokens for these two vowel categories. Although exact frequencies are highly dependent on the context in which they are measured, the findings of Gimson (1980; see also Deterding, 1997, for low frequencies of both FOOT and GOOSE vowels) suggest that DRESS vowels are nearly four times as frequent as FOOT vowels and nearly three times as frequent as GOOSE vowels. Even when these findings are only used as an indication of a trend, they suggest that care must be taken to ensure the production of a sufficient number of tokens for the latter two vowel categories. A picture description task was therefore devised which did ensure the production of a sufficient number of tokens, while still maintaining the positive aspects of the film retelling task as much as possible. This task consisted of four separate cartoons containing six pictures each. The choice was made to use cartoons depicting a story line, instead of separate unrelated pictures, to approach the film retelling task as closely as possible, as describing a story is a less static, and therefore less formal, task than the description of independent images. The description of a story that unfolds across several pictures allows the participant to become focused on the story itself, instead of on the task at hand. This helps to ensure that their speech production does not become too formal, but instead approaches naturalistic speech as closely as possible.

named to ensure the production of the target vowels. Colour was used in the cartoons to place emphasis on objects, so as to increase the chances of participants mentioning them. In addition to the inclusion of separate objects which can be identified in the cartoons, care was taken to incorporate as many words as possible which could be described in a sentence, such as ‘he looks out the window’ or ‘the girl sits on her mother’s lap/with a girl on her lap’ (with the target words in bold face). These phrasings were incorporated as frequently as possible because the incorporation of a target sound in a sentence will result in a more naturalistic production than list-wise naming. Compare, for instance, the production of the target words in ‘and the cat danced with the broom’ and in ‘there is a cat, there is a broom’. In the latter instance, speech is more likely to become not only formal, but also unnaturally stressed. However, in addition to the attempt to elicit vowel production in naturalistic context as much as possible, a number of words were included in writing in the cartoons, to function as back-up options, should a participant not produce enough FOOT and GOOSE words in the rest of the retelling. These were only selected for analysis in those speakers who did not produce enough freely spoken instances of these vowels, but were, where necessary, included in the general analysis to ensure enough data points were available for analysis at group level. For instance, in the cartoon depicting a boy’s birthday and the presents he receives, one of the presents consists of a number of books with legible titles, which can be read out by the participants. Another example is the inclusion of signs in various cartoons, denoting the function of buildings (e.g. hotel, saloon) and names of characters (e.g. Brooke). The cartoons can be found in their entirety in the Appendix.

All cartoons were produced on A4 format and were laminated, to allow participants to pick them up and handle them in any way they desired during the retelling experiment, without fear of damaging them. All recordings were made with a noise-suppressing Logitech H390 headset, to ensure quality and comparability of the production results.

4.2 Perception task

a response from a set of presented alternatives. In discrimination tasks, participants are required to identify whether presented sets of stimuli are different or the same (see Strange & Schafer, 2008 for an overview of tasks used in speech perception research). Discrimination tasks have a higher memory load than identification tasks, and require a metalinguistic judgement, making them less direct in testing the relation between a speaker’s personal perception and production (M. Broersma, p.c.). For this experiment, a phonetic categorisation task was selected, consisting of recordings of single words without further context which are presented to the participant. After each word, the participant must choose whether the vowel he or she just heard in that word belongs to one vowel category, or to another.

4.2.1 Word type

For this particular task, nonce words were used. This was done to ensure that participants would not be influenced by connotations of existing words. When an existing word is perceived, in conditions in which the listener is aware of the fact that he or she will need to make a judgement, all knowledge available to the participant will most likely play a role in the production of this judgement. This means that the listener calls forth the spelling of the word, and will take into account the letter used for the spelling of the vowel when making his or her judgement. As it is well known that the English spelling system corresponds very poorly to its different vowel categories, this effect of spelling will most likely be a confounding factor in the production of a judgement. Additionally, the perception of an existing word might cause the participant to call forth memories of other speakers producing this word. As these speakers do not necessarily produce this word, and especially this vowel, in the same manner as the participant would, these memories can also distort the judgement of the participant in an experiment which seeks to elicit judgements based purely on the speaker’s own vowel categories.

next step in future research, but a problematic first step. In this experiment, specifically, nonce words of the C-V-C type were selected, in which all consonants are plosives. Plosives were chosen to limit the effect of the consonant on the vowel, as plosives are more robust and less susceptible to coarticulation than other consonant classes (McCully, 2009; Nathan, 2008). To control for any potential effect that the place of articulation of the plosives might have on the vowel, plosives with different places of articulation (i.e. bilabial, alveolar and velar) were used. After eliminating all plosive-V-plosive combinations which formed an existing word or sounded like an existing word when combined with one of the target vowels (i.e. DRESS, TRAP, FOOT, GOOSE), the following nonce word plosive combinations were selected for use in the perception task: /b_p/, /d_p/, /d_g/, /g_k/.

It must be noted that one participant has since pointed out that /gu:k/ is in fact not a nonce word, but a derogatory term used for the purpose of describing a Korean, obtained from the Korean pronunciation of their country, Hangook. However, this participant was the only one who pointed out that gook was in fact an existing word, which suggests that it is not widely known and is therefore unlikely to have skewed the results. In the case that other participants have also recognised this as an existing word, the confounding effects described above will have been very limited, as the spelling cannot influence the participant’s choice in this case (i.e. both FOOT and GOOSE words are generally spelled with ‘oo’) and the chances that participants have frequent memories of other speakers producing this particular word are very small. Unless a follow-up questionnaire is sent to all participants containing the question “On a scale of 1 to 10, how good is your racist vocabulary in the area of Koreans?” (participant 6, p.c.), the exact effect of this existing word on the participants’ judgement cannot be determined. It can, however, be reasoned that if such an effect existed at all, it is negligible.

4.2.2 Vowels

contained vowels of different percentages of prototypicality. This means that, for example on a scale between DRESS and TRAP, vowels were presented in the experiment that were placed at various points on this scale: 100% prototypical DRESS, 75% prototypical DRESS, 50% prototypical DRESS (i.e. a vowel produced with formant values halfway between prototypical DRESS and TRAP), 25% prototypical DRESS, and 0% prototypical DRESS, which corresponds to 100% prototypical TRAP. A similar scale of vowels was used to produce a FOOT-GOOSE continuum. In that continuum, the vowels at different points on the scale differ from each other not only with regard to formant values, but also with regard to vowel length, as GOOSE is a long vowel while FOOT is a short vowel. For each of the two vowel continuums, five vowels were thus produced. All ten vowels were produced in all four nonce word contexts, resulting in a total of 40 different nonce words to be judged by the participants.
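The size of the stimulus set follows directly from this design. As a minimal sketch, the Python snippet below enumerates the combinations. The four plosive frames are those listed in section 4.2.1; the variable names, and the convention of naming each continuum by a reference vowel whose prototypicality runs from 100% to 0%, are assumptions made purely for illustration.

```python
from itertools import product

# Plosive frames selected for the perception task (section 4.2.1).
FRAMES = ["b_p", "d_p", "d_g", "g_k"]
# Each continuum is described by a reference vowel: 100% = that vowel,
# 0% = the other vowel of the pair (illustrative convention).
CONTINUA = {"DRESS-TRAP": "DRESS", "FOOT-GOOSE": "GOOSE"}
# Percentage of prototypicality of the reference vowel at each step.
STEPS = [100, 75, 50, 25, 0]

# One stimulus per (continuum, step, frame) combination.
stimuli = [
    {"continuum": name, "reference": ref, "percent": step, "frame": frame}
    for (name, ref), step, frame in product(CONTINUA.items(), STEPS, FRAMES)
]

print(len(stimuli))  # 2 continua x 5 steps x 4 frames = 40
```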

4.2.3 Vowel category selection

judgements less reliable, as this increases the chance of participants simply making a random guess. Additionally, it would result in a difference between the DRESS-TRAP judgement task and the FOOT-GOOSE judgement task, making the results for the two vowel continuums less comparable. To avoid these influences of spelling, pictures were devised to represent the different vowel categories. For the DRESS-TRAP continuum, participants would be presented with a picture of a bed and of a cat, respectively, while for the FOOT-GOOSE continuum, they would see pictures of a cook and of a boot, respectively. These representations were chosen not only because they contained the target vowels, but also because they adhered to the C-V-C pattern with only plosives as consonants, ensuring that the vowels depicted in the images would be as representative as possible.

4.2.4 Distractor task

To ensure that the monotony of the task would not negatively affect participants’ judgement, a distractor task was devised. This task is a non-linguistic memory task. A non-linguistic task was selected to ensure that the distractor task would in no way influence the results of the judgement task by priming or otherwise activating specific language areas. The distractor task consists of four different symbols: a star, a circle, a square and a triangle, each of which is presented to the participant at a different moment during the task. After each nonce word in the judgement task, a symbol is presented in one of four colours: red, green, blue or yellow. It was established that these prototypical colours would not present any problems to potentially colour-blind participants (S. Gilbers, p.c.). Each symbol is presented in a different colour. After four nonce words, all four symbols and all four colours have been presented once. At this point, the task for the participant is to remember in which colour one of the four symbols, which is presented again in white, was depicted earlier.
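The structure of one distractor block can be sketched as follows. This is only an illustration of the constraints described above (four symbols, four colours, each used once per block, followed by a probe). The function name, the use of a random pairing and a random presentation order, and the random choice of probe are assumptions; the text does not state how pairings or probes were selected.

```python
import random

SYMBOLS = ["star", "circle", "square", "triangle"]
COLOURS = ["red", "green", "blue", "yellow"]

def make_distractor_block(rng):
    """Return the four (symbol, colour) displays of one block and its memory probe."""
    # Pair every symbol with a different colour, in a random presentation order
    # (random assignment is an assumption, not stated in the text).
    symbols = rng.sample(SYMBOLS, k=len(SYMBOLS))
    colours = rng.sample(COLOURS, k=len(COLOURS))
    displays = list(zip(symbols, colours))  # one display after each nonce word
    # After four nonce words, one symbol is shown again in white as the probe;
    # the participant has to recall the colour it was shown in earlier.
    probe = rng.choice(symbols)
    correct_colour = dict(displays)[probe]
    return displays, probe, correct_colour

displays, probe, correct_colour = make_distractor_block(random.Random(1))
```

Passing an explicit `random.Random` instance keeps a block reproducible, which is convenient when the same trial list has to be regenerated.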

4.2.5 Audio recordings

To produce the nonce words that would be presented to the participants during the perception experiment, recordings were made of all plosive contexts and all vowels. The target nonce words were produced by recording the plosive context with each of the four vowels (i.e., for the /b_p/ plosive context, recordings were made of /bɛp/, /bæp/, /bʊp/ and /buːp/), to ensure that any coarticulatory effects of the target vowels on the plosives were present in the recordings. This increased the naturalistic nature of the recordings. Originally, after these recordings had been made, slow gliding vowels on both continuums were to be recorded. From these gliding vowels, relevant sections containing formant values at 0%, 25%, 50%, 75% and 100% were to be extracted and spliced into the previously recorded plosive contexts to form the nonce words with vowels at different points on the scales between DRESS and TRAP and between FOOT and GOOSE. Given the fact that vowels between plosives have clear beginning and end points, splicing was possible for these words. This section discusses the selection of vowel characteristics, problems encountered in the production of the nonce words and subsequent adaptation of the methodology.
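As a small illustration of the recording list implied here, the sketch below fills each plosive frame with each of the four target vowels. The frames come from section 4.2.1 and the vowel symbols from the /b_p/ example above; the dictionary layout and names are assumptions for illustration.

```python
# Plosive frames (section 4.2.1) and the IPA symbols used in the /b_p/ example.
FRAMES = ["b_p", "d_p", "d_g", "g_k"]
VOWELS = {"DRESS": "ɛ", "TRAP": "æ", "FOOT": "ʊ", "GOOSE": "uː"}

# One base recording per frame and vowel: the underscore marks the vowel slot.
recordings = {
    (frame, vowel): "/" + frame.replace("_", ipa) + "/"
    for frame in FRAMES
    for vowel, ipa in VOWELS.items()
}

print(len(recordings))  # 4 frames x 4 vowels = 16 base recordings
```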

4.2.5.1 Relevant vowel characteristics

The same principle holds for all these Dutch native speakers: they are exposed to several varieties of English, all of which influence their own English.

For these reasons, the choice was made to base the formant values for the vowels in the perception task on a combination of General American (GA) and Standard Southern British English (SSBE). The choice was also made to use a Dutch native, highly proficient L2 speaker of English to produce the required recordings. This was done for two reasons. The first was to ensure compatibility with the large amount of L1 Dutch impacted English input the target participants receive. The second reason was that using a native English speaker would have resulted in a larger bias towards the variety of English that was native to that speaker.

For the vowels in the perception task, the formant values for SSBE were taken from Deterding (1997), who provides male and female average formant values for each of the eleven monophthong vowels present in the speech of ten different speakers (five male and five female) producing relatively free spoken data. The formant values for GA were originally based on Hillenbrand et al. (1995), who provide average formants and duration for twelve monophthongs produced by large numbers of male (N=45), female (N=48) and child (N=46) speakers producing words in a word list reading task. However, further inspection of the values found in this research revealed them to be unusual and incompatible with the SSBE formants. For example, the GA F2 for TRAP was higher than the F2 for DRESS, while this is generally not the case, and was also not the case for the British F2s. Averaging these values resulted in an almost complete lack of change along F2 in the DRESS-TRAP continuum. A suitable alternative was found in Yang (1996), who provides average formant values for thirteen monophthongs produced by ten male and ten female speakers. These formant values are also based on word list reading tasks, as reliable formant values based on free speech could not be found for GA.

instead of on values found in previous research. This option was chosen to ensure that the nonce words to be used in the perception task would sound natural. While it must be said that it may very well be the case that substantial differences in vowel length exist between representations of these vowels in various varieties of English, based on the lack of reliable information available in the literature, and the need for natural sounding data, this was the best option available for the present purposes. The average formant values found in the literature, and the formant values necessary for this project can be found in Tables 1 and 2, respectively. The formant values in Table 2 were calculated by taking the average values for SSBE and GA in Table 1 for each vowel (i.e. 100% DRESS and 100% GOOSE in Table 2 contain the prototypical DRESS and GOOSE formants, respectively, while 0% DRESS and 0% GOOSE contain the prototypical TRAP and FOOT formants, respectively). As the speaker used for the production of the nonce vowels was female, only the female values will be discussed.

As can clearly be seen from Table 2, the differences in formants between consecutive steps on the vowel scales are much smaller for the FOOT-GOOSE continuum than they are for the DRESS-TRAP continuum. This may make it harder for participants to distinguish between the vowels on the FOOT-GOOSE continuum. However, it must be noted that these vowels will have an additional length difference, which is not a factor in the DRESS-TRAP continuum.

         SSBE            GA
         F1     F2       F1         F2
DRESS    719    2063     631 (57)   2244 (190)
TRAP     1018   1799     825 (81)   2059 (208)
FOOT     410    1340     491 (56)   1486 (172)
GOOSE    328    1437     417 (29)   1511 (326)

Table 1: Average first and second formant values in Hz for female speakers in SSBE (Deterding, 1997) and GA (Yang, 1996) for the relevant vowels with standard deviations reported between brackets (SDs not available in Deterding, 1997).

         DRESS           GOOSE
         F1     F2       F1     F2
100%     675    2154     373    1474
75%      737    2098     393    1459
50%      799    2042     412    1444
25%      860    1985     432    1428
0%       922    1929     451    1413
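The relation between the two tables can be made explicit with a short calculation. The sketch below averages the SSBE and GA values from Table 1 and then interpolates linearly between the two endpoints of each continuum. The rounding rule is an assumption: the text does not state one, but rounding halves upwards, applied first to the averaged endpoints and then to each interpolated step, happens to reproduce the values in Table 2. The function names are illustrative.

```python
import math

# Female average formants (F1, F2) in Hz, copied from Table 1.
SSBE = {"DRESS": (719, 2063), "TRAP": (1018, 1799),
        "FOOT": (410, 1340), "GOOSE": (328, 1437)}
GA = {"DRESS": (631, 2244), "TRAP": (825, 2059),
      "FOOT": (491, 1486), "GOOSE": (417, 1511)}

def round_half_up(x):
    # Assumed rounding rule; Python's built-in round() rounds halves to even.
    return math.floor(x + 0.5)

def prototype(vowel):
    """Average of the SSBE and GA values for one vowel, as (F1, F2)."""
    return tuple(round_half_up((s + g) / 2) for s, g in zip(SSBE[vowel], GA[vowel]))

def continuum(full, zero, steps=(100, 75, 50, 25, 0)):
    """Interpolate linearly from the `zero` vowel (0%) to the `full` vowel (100%)."""
    hi, lo = prototype(full), prototype(zero)
    return {
        p: tuple(round_half_up(l + p / 100 * (h - l)) for h, l in zip(hi, lo))
        for p in steps
    }

dress_trap = continuum("DRESS", "TRAP")
foot_goose = continuum("GOOSE", "FOOT")
```

Under these assumptions, each step on the DRESS-TRAP continuum corresponds to roughly 62 Hz in F1, against roughly 20 Hz per step on the FOOT-GOOSE continuum, which is the difference in step size noted above.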

4.2.5.2 Nonce word recordings

All nonce word plosive contexts and gliding vowels were recorded in a recording studio at the University of Groningen. A speaker to produce these recordings was selected on the basis of two criteria. First, as described above, the speaker would have to be a Dutch native, highly proficient L2 English speaker who did not have any particularly noticeable regional accent. A proficient L2 English speaker was preferred over a native English speaker because the former’s English was considered to be more similar to both the English accent of the target population at large and to the English accent to which the target population is regularly exposed, namely that of proficient English with L1 Dutch influences. The second criterion was that the speaker would have to be unrecognisable to the participants. Should a participant recognise the speaker he or she is listening to, this would not only distract the participant, it would also make his or her judgement process different from that of participants who do not recognise the speaker. This is caused by a process similar to that described earlier, in which the participant will no longer base his or her judgements purely on the question “how would I produce these categories?”, but instead (also) on the question “how would the speaker produce these categories?”, a question which the participant will attempt to answer by bringing forth memories of the speaker.


vowels. Additionally, several gliding vowels, either from DRESS to TRAP and vice versa, or from GOOSE to FOOT and vice versa were recorded, to allow for subsequent splicing of the relevant part of the vowel into the pre-recorded plosive contexts.

4.2.5.3 Vowel splicing

Although the exact averages listed in Table 2 could not be found in the gliding vowels, it was possible to find excerpts of these vowels that contained formant values which fell within the range denoted by the SSBE and GA values. For all four vowels, it was possible to find five excerpts which together formed a range. These excerpts were spliced into the plosive contexts using PRAAT (Boersma & Weenink, 2014). The plosive contexts made it easy to determine the start and end points of the original vowels, which could then be replaced with the relevant excerpt selected from the gliding vowel. However, this did not result in natural sounding speech, mainly because the pitch contour of the spliced vowel radically differed from that of the original vowel. Even after manipulation of the pitch contour, the nonce words still did not sound acceptably natural. After the pitch manipulations, the nonce words were sent for judgement to a native English speaker without any formal linguistic training. This native speaker deemed them to be unacceptable, describing the words as sounding “unnatural, like I’m listening to a computer that can’t get it right” (C. Younger, p.c.).

4.2.5.4 Vowel manipulation

This outcome resulted in a revision of the proposed methodology. The gliding vowels were discarded and the plosive contexts, which had been recorded with each vowel, were used for vowel manipulation. The Praat Vocal Toolkit, “a Praat plugin with automated scripts for voice processing” (Corretge, 2012) was used to manipulate the first and second formant frequencies in the original recordings. To ensure that the manipulations of all four nonce words along both vowel continuums were as comparable as possible, the following procedure was used:


greatly between speakers and no reliable values could be found in the literature, vowel length is based on this speaker’s own production, to ensure the fabrication of natural sounding stimuli. The vowel lengths necessary for manipulation, based on average lengths found in this speaker’s vowel production, are listed in Table 3.

100% GOOSE    193
75% GOOSE     170
50% GOOSE     147
25% GOOSE     124
0% GOOSE      101

Table 3: Vowel length (in ms) needed for the FOOT-GOOSE vowel continuum in the perception experiment, based on average values produced by the recorded speaker.

Secondly, the first and second formants were measured for the vowel in all nonce words that had been produced with either a DRESS or a GOOSE vowel. On the basis of these measurements, one token per plosive context was selected for manipulation for each of those two vowels. Measurements were conducted in PRAAT (Boersma & Weenink, 2014) at the intensity peak in the vowel, as this is commonly accepted to produce stable measurements (D. Gilbers, p.c.). In addition to the formants, a general inventory of pitch contours was made as well; words with deviating (sharply rising, for instance) contours were discarded from further analysis, to ensure that pitch contour would not be a potential influence on participants’ judgement.


TRAP. For the FOOT-GOOSE continuum, the choice to use nonce words produced with GOOSE was made to accommodate the manipulation of vowel length, as GOOSE vowels are longer than FOOT vowels. Shortening a vowel is easier than lengthening it.

Once all formants in all DRESS and GOOSE nonce words had been measured, one nonce word per plosive context was selected, based on compatibility of the formants with the formants listed in Table 2. Specifically, the original vowels in the nonce words were compared to the formants listed for 50%. As this is the middle point along the vowel range, compatibility with that point ensured that manipulations could be as minimally invasive as possible. For the GOOSE vowels, the formants were not only compared to the 50% values; a nonce word could also only be selected for manipulation if its vowel was long enough to accommodate the length needed at 100%, as listed in Table 3.

The formant manipulation function in Praat Vocal Toolkit changes formants by recalculating their values, based on a combination of a given value to be entered into the program, the existing formants in the audio file and the interaction between these formants. It is therefore not possible to simply enter the desired formant values and have the toolkit produce exactly those formants. A certain amount of trial and error was involved in finding the formant values to be entered into the program for each manipulation, until nonce words containing the right formant values were produced. To ensure comparability, the vowels were manipulated in such a manner that the desired formant values could be found at the intensity peak. Table 4 displays both the formant values at the intensity peak and the average values across the whole vowels for each manipulated nonce word.


velar, respectively) as the coda plosives in the target contexts. As the pitch contours of the vowels were very similar in these instances, splicing resulted in natural sounding nonce words. After formant manipulation, the vowel length for all vowels in the FOOT-GOOSE continuum was adjusted to the desired values as listed in Table 3. After all manipulations were completed, the volume for all nonce word files was levelled using Adobe Audition (version 5.5).

                      Formants at intensity peak    Average formants
                      F1      F2                    F1      F2
Bep           100%    672     2154                  632     2088
              75%     735     2095                  671     2022
              50%     798     2045                  708     1954
              25%     861     1988                  796     1917
              0%      924     1927                  856     1889
Dep           100%    676     2155                  626     2090
              75%     736     2100                  666     2029
              50%     799     2043                  743     1988
              25%     861     1986                  812     1940
              0%      922     1929                  860     1871
Deg           100%    675     2151                  607     2228
              75%     737     2097                  654     2155
              50%     800     2041                  717     2112
              25%     859     1983                  775     2068
              0%      923     1930                  828     1986
Gek           100%    675     2153                  624     2145
              75%     738     2098                  662     2095
              50%     800     2043                  715     2015
              25%     860     1984                  788     1956
              0%      923     1928                  824     1902
Boop / Doop   100%    374     1475                  361     1501
              75%     394     1460                  374     1480
              50%     413     1445                  386     1440
              25%     432     1427                  398     1396
              0%      451     1414                  418     1390
Doog / Gook   100%    373     1474                  348     1416
              75%     394     1459                  361     1393
              50%     412     1444                  374     1390
              25%     434     1429                  405     1388
              0%      450     1413                  412     1378


4.3 Feasibility pilot study

At this point, a feasibility pilot study was conducted, using only the nonce words for the DRESS-TRAP continuum. The pilot study was conducted using E-prime, version 2. For each nonce word, a fixation cross was presented for 500 ms. This cross was followed by a first presentation of the audio stimulus, followed by a second fixation cross with a duration of 500 ms and subsequently a second presentation of the audio stimulus. This second presentation was accompanied by two pictures, one of a cat and one of a bed, both drawn in a similar style. The participants were then asked to choose the picture they thought best corresponded to the vowel they had just heard in the nonce word, answering the question “does the vowel you just heard sound more like the vowel in cat or like the vowel in bed?” They were instructed beforehand that they would have to make this choice for each word they heard; the question itself was not presented to them during the experiment. All twenty nonce words were presented once in random order. Reaction time was not measured. This feasibility pilot study was conducted to establish whether the experiment was not too far removed from natural speech conditions to allow the participants to distinguish between the vowels or, in more colloquial terms, to ensure it did not become one big blur. Two female participants (both aged 19 and current students of the English Language and Culture bachelor program at the University of Groningen) took part in this pilot study. Female participants were chosen for this pilot study to ensure maximum compatibility between the participants and the experimental set-up, as the recorded speaker was female as well. The results for these two participants can be found in Table 5.

          Participant 1                         Participant 2
          b_p   d_p   d_g   g_k   Total         b_p   d_p   d_g   g_k   Total
100%      D     D     D     D     100% D        D     D     D     D     100% D
75%       D     D     T     T     50% D         D     D     D     D     100% D
50%       T     T     D     D     50% D         T     D     T     T     25% D
25%       T     T     T     D     25% D         T     T     D     T     25% D
0%        T     T     T     T     0% D          D     D     T     T     50% D

Table 5: Perception results for the pilot study (D = “vowel was recognised as DRESS vowel”, T = “vowel was recognised as TRAP vowel”)


other are presented in succession, participants are still capable of logically judging these vowels, as is evidenced by the fact that vowels at 100% prototypical DRESS formant values are much more likely to be judged as DRESS (in 100% of the cases, for both participants) than the vowels at 0% of the scale, i.e. the vowels with prototypical TRAP values (in 0% and 50% of the cases, respectively) and b) even though participants can perceive that the vowels presented to them belong on a continuum between two vowels, their judgement is not a reflection of the continuum itself. The latter would have been the case if both participants had judged the vowels in a similar manner, with total scores of (approximately) 100%, 75%, 50%, 25% and 0%. The facts that both participants’ judgements differ from this ‘ideal’ scale and that their judgements differ from each other indicate that they are indeed capable of doing this experiment, and that they provide their personal judgements, which are presumably based on their own personal vowel categories.

4.4 Perception task construction


for the cook and boot pictures used for the FOOT-GOOSE stimuli. To ensure that all these adjustments could be incorporated in the perception experiment, the stimuli were not presented in random order.

The experiment was set up as follows: per nonce word, the participants would be presented with a fixation cross, which had a duration of 500 ms, after which the audio stimulus was presented for the first time. During this first presentation, no visual stimulus appeared on the screen. The first presentation was followed by another fixation cross with a duration of 500 ms. After this fixation cross, the audio stimulus was presented again, together with the two images. The participant would at this point have to choose one of the two images as a representation of the vowel they had just heard in the audio stimulus. There was no time limit on this choice. Once the participant had made a choice by pressing a corresponding key on the keyboard, one of the four symbols in one of the four colours belonging to the distractor task described earlier would be presented for 500 ms. After this image had been presented, the experiment would automatically continue on to the first fixation cross belonging to the next nonce word.

4.4.1 Distractor task construction

After four nonce words and distractor symbols had been presented, all four distractor symbols and colours had been presented once. Therefore, after the participants had selected their preferred picture for the fifth nonce word, one of the four previously displayed distractor symbols would be displayed again, with a duration of 500 ms. This time, the distractor symbol was presented in white. Afterwards, the participants would be presented with two images of paint splatters (see the Appendix for all images used in the perception experiment) in two different colours. The participants would then have to choose the colour in which the distractor symbol – which they had just seen in white – had previously been displayed. For this choice there was also no time limit.
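The trial structure and the distractor task described above can be summarised as an ordered list of events per block of five nonce words. The Python sketch below is an illustration only: the event labels, the symbol and colour names, and the way the foil colour is drawn are all assumptions, and the experiment itself was built in E-prime, not in Python.

```python
import random

FIXATION_MS = 500   # duration of each fixation cross
SYMBOL_MS = 500     # duration of each distractor symbol

def build_block(nonce_words, symbols, colours, rng):
    """Return the ordered events for one block of five nonce words."""
    if len(nonce_words) != 5 or len(symbols) != 4 or len(colours) != 4:
        raise ValueError("expected 5 nonce words, 4 symbols and 4 colours")
    # Give every symbol its own colour, so each occurs exactly once.
    shuffled = list(colours)
    rng.shuffle(shuffled)
    pairs = list(zip(symbols, shuffled))
    events = []
    for i, word in enumerate(nonce_words):
        events.append(("fixation", FIXATION_MS))
        events.append(("audio", word))              # first presentation, no images
        events.append(("fixation", FIXATION_MS))
        events.append(("audio_with_images", word))  # second presentation
        events.append(("vowel_choice", word))       # untimed key press
        if i < 4:
            symbol, colour = pairs[i]
            events.append(("symbol", symbol, colour, SYMBOL_MS))
        else:
            # Fifth word: show one earlier symbol in white, then ask its colour.
            symbol, colour = rng.choice(pairs)
            foil = rng.choice([c for c in colours if c != colour])
            events.append(("symbol", symbol, "white", SYMBOL_MS))
            events.append(("colour_choice", colour, foil))  # untimed
    return events

block = build_block(
    ["bep", "dep", "deg", "gek", "boop"],              # example order only
    ["symbol_a", "symbol_b", "symbol_c", "symbol_d"],  # placeholder names
    ["red", "blue", "green", "yellow"],                # placeholder colours
    random.Random(1),
)
for event in block:
    print(event)
```

Writing the block out this way makes it easy to verify that every symbol and colour occurs once before the colour question is asked.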


4.4.2 Image construction

All images used, for both the vowel categories and the distractor items, were carefully adapted to maximise clarity and to minimise characteristics that might influence the participants. Although the four images used for the vowel categories (cat, bed, cook, boot) did not have the same shape, they were resized in such a manner that the product of the height and the width of each image was (virtually) the same number, so that their perceived size would be equal for the participants. This was done to ensure that the size of the pictures would not influence the participants’ judgement. All four symbols used in the perception task had a thick black border, to signal that all four belonged to the same category of symbols and to ensure that all four symbols would still be recognisable when presented in white, against the white background used in E-prime experiments. In those screens where participants had to choose a colour in the distractor task, the different colours were depicted using identical paint splatters of identical size. These paint splatters did not have a black border, to signal that these were not symbols, but only referred to the colours depicted.
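One way to obtain images whose height multiplied by width gives the same number is to scale each image by the square root of the ratio between a target area and its current area. The Python sketch below assumes that the aspect ratio of each image was preserved, which is not stated in the text; the pixel sizes and the target area are invented for illustration.

```python
import math

def scale_to_area(width, height, target_area):
    """Scale an image to a target area while keeping its aspect ratio."""
    factor = math.sqrt(target_area / (width * height))
    return width * factor, height * factor

# Hypothetical pixel sizes; the actual image dimensions are not reported.
originals = {
    "cat": (400, 300),
    "bed": (600, 250),
    "cook": (300, 500),
    "boot": (350, 350),
}
target = 90000  # hypothetical common area in square pixels
resized = {name: scale_to_area(w, h, target)
           for name, (w, h) in originals.items()}
for name, (w, h) in resized.items():
    print(name, round(w), round(h))
```

Rounding to whole pixels is what makes the resulting areas only virtually, and not exactly, equal.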

4.5 Background questionnaire

Each participant was asked to answer a number of personal background questions prior to the start of the experiment.

1) Each participant was asked to provide their current age

2) Each participant was asked to provide the age at which they began to acquire English as a second language

3) Each participant was asked to provide the age at which they started studying at university


university will coincide with the age at which they began to study English and therefore learned to pay attention to their English production in a different and more thorough manner than had been asked of them before that moment. For those participants who had not studied English at university (see question 4), this moment was still maintained, both for comparability’s sake and because most Dutch university studies have some aspects of English, be it in classroom settings or in the required reading, that go above and beyond the level required in secondary school.

4) Each participant was asked whether they studied or had previously studied English at university. If this was not the case, they were asked what they did study/had studied.

5) Each participant was asked whether they had spent any more time than an average holiday in an English speaking country.

Immersion contexts are commonly considered to be the most effective contexts in which to acquire a second language successfully, as the amount of L1 use and exposure is drastically reduced in these contexts. This might result in deviating performances for those participants who have experienced an immersion context. Although it must be noted that prolonged exposure in an immersion context is not the only relevant factor in L2 exposure, it is a clearly distinct one. For all participants, various degrees of exposure to the L2 through media and contact with English speaking friends and/or relatives affect their input, a factor which cannot be readily controlled for.

6) Each participant was asked whether they were left or right handed


7) Each participant was asked whether they had any form of hearing, eye sight, or speech defects, not counting eye sight defects which could be corrected with glasses or contact lenses.

Prior to the start of the experiments, it was ensured that each participant did not have any physical defects which might compromise their production or perception. Defective eye sight which could be corrected with glasses or contact lenses was not considered a problem, as these participants were still perfectly capable of properly seeing the visual stimuli.

During participant selection, it was made clear that participants were needed who were native speakers of Dutch and who had acquired English as their second language, not from birth. As all Dutch native speakers come into contact with other languages, especially French and German during secondary school, no Dutch native speaker can be said to be truly monolingual, or, in this case, none of the participants can be said to be truly bilingual, instead of multilingual. For this reason, no further questions regarding the participants’ linguistic backgrounds were asked.

4.6 Experiment instructions


participants were presented with an introduction screen in which the entire experiment, both the main experiment and the distractor task (which was not described as a distractor task, but merely as a second task), was explained again in writing. This was done to ensure that all participants understood what was expected of them. Participants who felt they did not need this written instruction after the one they had already received verbally could skim and skip this screen. In both the written and the spoken instruction, it was stressed that the participants’ reactions would not be timed. This choice was made to ensure that the participants would not make mistakes under pressure, as there is no way of determining afterwards whether a deviating result is due to deviating judgement or an error.

All instructions, both spoken and written, were in Dutch, as were the pause screens between the blocks and the screen indicating the end of the experiment. Dutch was selected for two reasons: the first was to ensure that the spoken English of the instructions would not prime the participants towards basing their judgements on the vowel categories in the English they had just heard spoken, instead of their own vowel categories. The second was that the use of Dutch would activate the participants’ Dutch as much as possible. That in turn means that the language system in which they do not have separate vowel categories for DRESS and for TRAP, as well as for FOOT and for GOOSE is activated as much as possible, making this task as difficult as possible. This increases the potential of establishing whether, under any circumstances, speakers of a second language, even if they are advanced speakers, make use of the first language vowel categories when processing the second language.

Before the start of the experiment, participants were given two practice nonce words (pep and doot, both at 100%) and subsequent distractor stimuli, with a distractor question after both nonce words at the end of the practice session, which allowed them to understand the order of presentation of the stimuli and the choices they would have to make. After the practice session, participants were given the opportunity to ask any questions they may still have had at that point.


in the cartoons, without having seen them. These instructions, as well as any communication during the picture description task, were given in English, to ensure that the participants would provide English that was as fluent as possible.

4.7 Participants

Participants were recruited personally, via email, via recruitment texts on social media and various relevant University of Groningen-Nestor course pages and via word of mouth. All means of recruitment allowed potential participants to immediately see which time slots for testing were still available and make an appointment online. Participants’ English proficiency was not specifically tested; however, it was stated clearly during recruitment that participants needed to be able to comfortably use English at an academic level. All participants were Dutch monolingual native speakers who acquired English after early childhood. In total, 33 native Dutch highly proficient L2 English speakers participated in the experiments. They were tested in a quiet environment. All participants first provided the information relevant for the personal background questionnaire, after which they participated in the perception experiment. The personal background questions were asked during a very brief informal interview, and were asked in Dutch, as this provided an extra opportunity to activate the participants’ Dutch language (see above). After the perception experiment, the participants were asked to switch to speaking English for the final part, in which they provided spoken English samples during the picture description task. During this task, the general spoken language was English, for the interviewer as well as the participants, to encourage the participants to speak English naturally. Afterwards, participants were thanked for their participation and ‘paid’ in chocolate. On average, an entire session lasted half an hour.

4.8 Data extraction

For the perception task, button presses per nonce word were extracted from E-prime, translated to indicate which vowel category was selected and placed in logical order. For each of the two vowel continuums, percentages of vowel category selection were calculated for each of the five points along the range. The average of these percentages was used as an indication of the turning point between the two vowel categories in each continuum per participant. Later on, these turning points were also calculated separately for each of the four plosive contexts.
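The turning point calculation can be sketched in a few lines of Python. The sketch assumes that the percentage counted at each point is the percentage of responses for the category at the 100% end of the continuum, which is how the totals in Table 5 are expressed; the function name is illustrative, and the pilot judgements of participant 1 are used purely as example input.

```python
def preferred_turning_point(responses, target):
    """Mean, over continuum steps, of the percentage of `target` responses.

    `responses` maps each continuum step to the list of categories chosen
    for the nonce words at that step.
    """
    percentages = [
        100.0 * choices.count(target) / len(choices)
        for choices in responses.values()
    ]
    return sum(percentages) / len(percentages)

# Pilot judgements of participant 1 from Table 5 (D = DRESS, T = TRAP),
# in the order b_p, d_p, d_g, g_k.
pilot_p1 = {
    100: ["D", "D", "D", "D"],
    75:  ["D", "D", "T", "T"],
    50:  ["T", "T", "D", "D"],
    25:  ["T", "T", "T", "D"],
    0:   ["T", "T", "T", "T"],
}
ptp = preferred_turning_point(pilot_p1, "D")
print(ptp)
```

Restricting each list to the responses for a single plosive context would give the per-context turning points in the same way.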

For the production task, twenty tokens from each of the four vowel categories were selected from the recordings. All tokens were taken from stressed syllables. For each token, the first and second formants were measured at the intensity peak using PRAAT (Boersma & Weenink, 2014). In addition to the first and second formants, vowel length was measured as well for the FOOT and GOOSE vowels. To allow for comparison between speakers, all formants were converted to the auditory Bark scale using the formula suggested by Traunmüller (1990): B = (26.81 × F) / (1960 + F) − 0.53, in which F is a formant value in Hertz and B is the corresponding value on the Bark scale.
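As a minimal sketch, the Traunmüller (1990) conversion given above can be written as a small Python function. The function name is illustrative, and the two example frequencies are the 100% and 0% DRESS F1 targets from Table 2.

```python
def hz_to_bark(f_hz):
    """Convert a frequency in Hz to Bark (Traunmüller, 1990)."""
    return (26.81 * f_hz) / (1960.0 + f_hz) - 0.53

# Example: the 100% and 0% DRESS F1 targets from Table 2.
for hz in (675, 922):
    print(hz, round(hz_to_bark(hz), 2))
```

Because the conversion is monotonic, the ordering of formant values is preserved; only the spacing between them changes.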

5 Results

5.1 Background questionnaire


participants were grouped based on age at testing, which was either above or below 30. This cut-off point was selected for practical purposes, based on an age gap existing within the participant group, which clearly divided participants into a young (mean age = 21.93, St. D. = 1.65) and an older (mean age = 46.60, St. D. = 10.53) age group. This corresponds to long LoL2A for the older participants and shorter LoL2A for the younger participants. Finally, a very large majority (N= 30) of the participants was right handed.

5.2 Left or right side preference

An independent samples t-test comparing all judgements produced for nonce words with the (for lack of a better description) ‘dominant target’ image on the left to all judgements produced for nonce words with the ‘dominant target’ image on the right revealed no significant difference between the two conditions for DRESS-TRAP (t(1318)= .055, p=.956) or for FOOT-GOOSE (t(1318)= -.166, p=.868). Individual independent samples t-tests revealed a significant difference between ‘target left’ and ‘target right’ stimuli for only one participant for the DRESS-TRAP continuum (t(38)= 2.333, p=.025) and for only one participant for the FOOT-GOOSE continuum (t(38)= 2.390, p=.022). These are two different participants, each of whom displays an influence of target image position for one vowel continuum and not for the other, so this cannot be considered a reliable effect. As only one participant out of 33 displays a significant difference for each continuum, this difference is negligible and can be treated as a coincidence. As such, the judgement results for the left and right sided conditions will not be treated separately.
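For reference, the t statistic and degrees of freedom of an independent samples t-test with pooled variance can be computed as in the Python sketch below. The coding of judgements as 1 and 0, the equal split of 20 judgements per condition and the data themselves are assumptions made for illustration; only the resulting 38 degrees of freedom match the per-participant tests reported above. Obtaining the p-value would additionally require the t distribution from a statistics package.

```python
import math
from statistics import mean, variance

def independent_t(sample_a, sample_b):
    """Student's t for two independent samples, assuming equal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, df

# Made-up judgements for one participant:
# 1 = dominant target image chosen, 0 = other image chosen.
target_left = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1]
target_right = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1]
t_value, df = independent_t(target_left, target_right)
print(round(t_value, 3), df)
```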

5.3 Vowel categories


15    t(38)= -3.923, p<.001     t(38)= 4.448, p<.001     t(38)= 1.761, p=.086    t(38)= 1.429, p=.161     t(38)= -6.300, p<.001
16    t(38)= -4.770, p<.001     t(38)= 6.911, p<.001     t(38)= 3.200, p=.003    t(38)= -1.808, p=.078    t(38)= -3.722, p=.001
17    t(38)= -5.817, p<.001     t(38)= 4.501, p<.001     t(38)= 2.450, p=.019    t(38)= -.424, p=.674     t(38)= -3.722, p=.001
18    t(38)= -10.021, p<.001    t(38)= 8.132, p<.001     t(38)= 3.078, p=.004    t(38)= -3.232, p=.003    t(38)= -8.166, p<.001
19    t(38)= -7.827, p<.001     t(38)= 7.632, p<.001     t(38)= 4.080, p<.001    t(38)= -4.957, p<.001    t(38)= -3.557, p=.001
20    t(38)= -3.693, p=.001     t(38)= -.400, p=.692     t(38)= 1.704, p=.097    t(38)= .057, p=.954      t(38)= -6.112, p<.001
21    t(38)= -7.218, p<.001     t(38)= 5.737, p<.001     t(38)= 5.102, p<.001    t(38)= .174, p=.863      t(38)= -6.121, p<.001
22    t(38)= -8.175, p<.001     t(38)= 3.516, p=.001     t(38)= 6.545, p<.001    t(38)= .154, p=.879      t(38)= -6.187, p<.001
23    t(38)= -6.042, p<.001     t(38)= 5.587, p<.001     t(38)= 2.327, p=.025    t(38)= .516, p=.609      t(38)= -5.992, p<.001
24    t(38)= -2.519, p=.014     t(38)= 1.363, p=.181     t(38)= 4.677, p<.001    t(38)= 2.034, p=.049     t(38)= -6.527, p<.001
25    t(38)= -8.949, p<.001     t(38)= 2.777, p=.008     t(38)= 2.775, p=.009    t(38)= -2.494, p=.017    t(38)= -6.035, p<.001
26    t(38)= -9.192, p<.001     t(38)= 6.566, p<.001     t(38)= 3.280, p=.002    t(38)= -2.149, p=.038    t(38)= -4.415, p<.001
27    t(38)= -6.880, p<.001     t(38)= 3.346, p=.002     t(38)= 2.773, p=.009    t(38)= -2.029, p=.050    t(38)= -5.787, p<.001
28    t(38)= -4.734, p<.001     t(38)= 2.433, p=.020     t(38)= 4.820, p<.001    t(38)= -2.753, p=.009    t(38)= -6.723, p<.001
29    t(38)= -8.359, p<.001     t(38)= 5.099, p<.001     t(35)= 3.390, p=.002    t(35)= -5.344, p<.001    t(35)= -3.683, p=.001
30    t(38)= -6.647, p<.001     t(38)= 5.504, p<.001     t(38)= .663, p=.511     t(38)= -1.779, p=.083    t(38)= -3.602, p=.001
31    t(38)= -8.637, p<.001     t(38)= 4.674, p<.001     t(38)= 4.501, p<.001    t(38)= -.369, p=.715     t(38)= -4.316, p<.001
32    t(38)= -9.925, p<.001     t(38)= 10.411, p<.001    t(38)= 2.622, p=.013    t(38)= -1.656, p=.106    t(38)= -6.622, p<.001
33    t(38)= -6.111, p<.001     t(38)= 12.889, p<.001    t(38)= .719, p=.477     t(38)= -.840, p=.406     t(38)= -5.934, p<.001

Table 6: independent sample t-test results per speaker, indicating (non)significant differences between F1 and F2 for both DRESS/TRAP and FOOT/GOOSE as well as for the differences in vowel length for


5.4 Correlations between production and perception

5.4.1 Young female participants

As the results from the personal background questionnaire have shown, variation on several factors exists within the participant group. As all of those varying factors might potentially influence both the participants’ production and their perception, an initial analysis of the whole group, encompassing all forms of variation, would most likely obscure patterns which can be found when that variation is controlled for. For this reason, the results of several subgroups of the participant group were analysed separately. The first and largest relatively homogeneous group was that of the young (aged under 30) female participants (N= 22). Descriptive statistics for this group can be found in Table 7.

Correlations were calculated between the average preferred turning point in perception, calculated per vowel continuum, per participant and their production data. For the DRESS-TRAP continuum, significant negative correlations were found between the preferred turning point and the production of DRESS F2 (Pearson R= -.202, n=440,

                                      Mean      St. D.    Minimum   Maximum   Range
DRESS F1                              5.715     .743      2.67      8.05      5.37
DRESS F2                              12.892    .534      11.45     14.52     3.07
TRAP F1                               7.020     .733      5.05      9.16      4.11
TRAP F2                               12.138    .590      10.40     13.78     3.38
FOOT F1                               4.859     .641      2.82      7.07      4.25
FOOT F2                               10.574    1.137     7.32      12.96     5.64
FOOT Length                           94        34.81     31        256       225
GOOSE F1                              4.383     .515      2.73      6.34      3.61
GOOSE F2                              10.990    1.053     7.63      13.14     5.50
GOOSE Length                          164       57.90     48        368       320
DRESS-TRAP preferred turning point    44.0      13.31     15.0      72.5      57.5
FOOT-GOOSE preferred turning point    57.0      12.34     37.5      90.0      52.5

Table 7: Descriptive statistics for the young female participants, with the formants in Bark, the vowel length in ms and the preferred turning points in percentages along the respective vowel continuums.


both FOOT (Pearson R= -.129, N=437, p=.007) and GOOSE (Pearson R= -.137, N=434, p=.004) and significant positive correlations were found between the preferred turning point and the vowel length produced for both FOOT (Pearson R= .108, N=437, p=.023) and GOOSE (Pearson R= .292, N=434, p<.001). Figures 1, 2 and 3 depict the correlations found for this group.
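The sample sizes reported for these correlations (up to 440, i.e. 22 participants with 20 tokens each) suggest that every produced token was paired with the preferred turning point of the speaker who produced it. This is an inference from the reported numbers. Under that assumption, and with made-up data, the Pearson correlation could be computed as in the following Python sketch; the participant labels and values are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

# Made-up data: each participant has one turning point (%) and several
# GOOSE vowel lengths (ms); the turning point is repeated for every token.
participants = {
    "p1": {"turning_point": 45.0, "goose_ms": [150, 160, 148]},
    "p2": {"turning_point": 60.0, "goose_ms": [190, 175, 182]},
    "p3": {"turning_point": 52.5, "goose_ms": [158, 171, 166]},
}
xs, ys = [], []
for data in participants.values():
    for length in data["goose_ms"]:
        xs.append(data["turning_point"])
        ys.append(length)
r = pearson_r(xs, ys)
print(round(r, 3))
```

Pairing at token level inflates N relative to the number of participants, which is one reason why weak correlations can still reach significance.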

Figure 1: correlations between the preferred turning point in percentages for the FOOT-GOOSE continuum


Figure 2: correlations between the preferred turning point in percentages for the DRESS-TRAP continuum and F2 production in Bark for DRESS (in blue) and for TRAP (in green), for the young females.


Although most correlations found for this subgroup are highly significant, the correlations are weak. Further division of the young female participants into subgroups of students who did or did not study English revealed the influence of the participants’ study. Correlations between the average preferred turning point per participant, calculated for each vowel-continuum, and the vowel production data were calculated separately for the young female speakers who studied English (N= 16), and those who did not (N= 6).

5.4.2 Young female non-English students

The descriptive statistics for the group can be found in Table 8. For the DRESS-TRAP continuum, strong significant negative correlations were found between the preferred turning point and the production of DRESS F2 (Pearson R= -.399, n=120, p<.001) and between the preferred turning point and the production of TRAP F2 (Pearson R=-.445, n=120, p<.001). For the FOOT-GOOSE continuum, moderate significant negative correlations were found between the preferred turning point and the F1 production of both FOOT (Pearson R= -.318, N=120, p<.001) and GOOSE (Pearson R= -.318, N=120,

                                      Mean      St. D.    Minimum   Maximum   Range
DRESS F1                              5.782     .766      3.96      8.05      4.09
DRESS F2                              12.763    .545      11.45     13.89     2.40
TRAP F1                               6.733     .635      5.41      8.28      2.87
TRAP F2                               12.453    .487      10.66     13.38     2.72
FOOT F1                               4.768     .704      3.54      7.07      3.53
FOOT F2                               10.490    1.211     7.32      12.91     5.59
FOOT Length                           95        36.81     38        256       218
GOOSE F1                              4.205     .487      2.86      5.30      2.44
GOOSE F2                              10.545    1.136     7.63      12.79     5.16
GOOSE Length                          180       68.31     68        368       300
DRESS-TRAP preferred turning point    44.0      15.25     27.5      72.5      45.0
FOOT-GOOSE preferred turning point    60.0      15.02     47.5      90.0      42.5


                          Mean     St. D.    Minimum   Maximum   Range
DRESS-TRAP PTP /b_p/      16.7     18.94     0.0       50.0      50.0
DRESS-TRAP PTP /d_p/      43.3     18.03     20.0      70.0      50.0
DRESS-TRAP PTP /d_g/      53.3     23.67     10.0      80.0      70.0
DRESS-TRAP PTP /g_k/      61.7     24.20     20.0      90.0      70.0
FOOT-GOOSE PTP /b_p/      41.6     24.20     0.0       80.0      80.0
FOOT-GOOSE PTP /d_p/      45.0     25.77     10.0      90.0      80.0
FOOT-GOOSE PTP /d_g/      70.0     25.93     30.0      100.0     70.0
FOOT-GOOSE PTP /g_k/      85.0     15.06     60.0      100.0     40.0

Table 9: Descriptive statistics for preferred turning points (PTP) per plosive context (in percentages) for the young female non-English studying participants.

P<.001) and between the preferred turning point and the F2 production of GOOSE (Pearson R= -.272, N=120, p=.003). Significant positive correlations were found between the preferred turning point and the vowel length produced for both vowels. A moderate correlation was found for FOOT (Pearson R= .267, N=120, p=.003) and a strong correlation for GOOSE (Pearson R= .448, N=120, p<.001).


(Pearson R= -.192, N=120, p=.036) and /d_g/ (Pearson R= -.345, N=120, p<.001) contexts) and GOOSE (in /b_p/ (Pearson R= .258, N=120, p=.004), /d_p/ (Pearson R= -.265, N=120, p=.003), and /d_g/ (Pearson R= -.210, N=120, p=.021) plosive contexts) could be found. Significant negative correlations could also be found for GOOSE F2 in /d_g/ (Pearson R= -.275, N=120, p=.002) and /g_k/ (Pearson R= -.235, N=120, p=.010) plosive contexts, while FOOT F2 revealed a significant positive correlation for the /d_p/ plosive context (Pearson R= .233, N=120, p=.011). Significant positive correlations between preferred turning point and vowel length could be found for all four plosive contexts with GOOSE (/b_p/: Pearson R= .286, N=120, p=.002, /d_p/: Pearson R= .246, N=120, p=.007, /d_g/: Pearson R= .365, N=120, p<.001, /g_k/: Pearson R= .280, N=120, p=.002) and three out of four plosive contexts with FOOT (/b_p/: Pearson R= .230, N=120, p=.011, /d_g/: Pearson R= .239, N=120, p=.008, /g_k/: Pearson R= .187, N=120, p=.040). Figures 4-8 display these findings.

Figure 4: correlations between the preferred turning point in percentages for the DRESS-TRAP continuum


Figure 5: correlations between the preferred turning point in percentages for the DRESS-TRAP continuum and F2 production in Bark for the /d_g/ and /g_k/ plosive contexts for DRESS (ranging from light blue at bilabial to dark blue at velar) and for the /d_g/ and /g_k/ plosive contexts for TRAP (ranging from light green at bilabial to dark green at velar), for the young, non-English studying females. /d_g/= △, /g_k/= ◊.

Figure 6: correlations between the preferred turning point in percentages for the FOOT-GOOSE continuum


Figure 7: correlations between the preferred turning point in percentages for the FOOT-GOOSE continuum and F2 production in Bark for the /d_p/ plosive context for FOOT (ranging from light red at bilabial to dark red at velar) and for the /d_g/ and /g_k/ plosive contexts for GOOSE (ranging from light orange at bilabial to dark orange at velar), for the young, non-English studying females. /d_p/= □, /d_g/= △, /g_k/= ◊.


5.4.3 Young female English students

Descriptive statistics for this group can be found in Table 10. For the DRESS-TRAP continuum, a weak negative correlation was found between DRESS F2 and preferred turning point (Pearson R= -.114, N=320, p=.041). For the FOOT-GOOSE continuum, only a weak significant positive correlation was found between the preferred turning point and the vowel length produced for GOOSE (Pearson R= .160, N=314, p=.005).
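The correlation statistics reported in this section can be illustrated with a minimal sketch. It is not the analysis script used for this thesis: the function names `pearson_r` and `approx_two_tailed_p` are hypothetical, and the p-value is approximated with the Fisher z-transformation and a normal distribution, whereas statistical packages typically use a t-test with N-2 degrees of freedom. For the sample sizes reported here the two approaches should give very similar values.

```python
import math
from statistics import NormalDist


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two equal-length sequences with at least 3 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def approx_two_tailed_p(r, n):
    """Approximate two-tailed p-value for a correlation r based on n pairs.

    Assumption: Fisher z-transformation with a normal approximation,
    not the exact t-test that statistical packages usually report.
    """
    if n <= 3 or abs(r) >= 1:
        raise ValueError("need n > 3 and -1 < r < 1")
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))
```

Under these assumptions, `approx_two_tailed_p(-0.114, 320)` gives roughly .04, which is in line with the p=.041 reported above for DRESS F2 and the preferred turning point.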


Measure                                  Mean   St. D.  Minimum  Maximum    Range
DRESS F1                                5.690     .733     2.67     7.92     5.25
DRESS F2                               12.942     .521    11.54    14.52     2.98
TRAP F1                                 7.130     .739     5.05     9.16     4.11
TRAP F2                                12.017     .583    10.40    13.78     3.38
FOOT F1                                 4.894     .612     2.82     7.06     4.24
FOOT F2                                10.606    1.107     7.38    12.96     5.59
FOOT Length                                94    34.06       31      240      209
GOOSE F1                                4.452     .509     2.73     6.34     3.61
GOOSE F2                               11.161     .968     8.16    13.14     4.98
GOOSE Length                              157    52.11       48      347      299
DRESS-TRAP preferred turning point       43.8    12.51     15.0     67.5     52.5
FOOT-GOOSE preferred turning point       56.3    10.95     37.5     80.0     42.5
DRESS-TRAP PTP /b_p/                     26.2    13.36     10.0     50.0     40.0
DRESS-TRAP PTP /d_p/                     42.8    16.56     10.0     70.0     60.0
DRESS-TRAP PTP /d_g/                     49.3    21.32      0.0     80.0     80.0
DRESS-TRAP PTP /g_k/                     56.9    18.17     30.0     90.0     60.0
FOOT-GOOSE PTP /b_p/                     35.9    23.00      0.0     70.0     70.0
FOOT-GOOSE PTP /d_p/                     39.2    23.15      0.0     90.0     90.0
FOOT-GOOSE PTP /d_g/                     68.8    21.44     30.0    100.0     70.0
FOOT-GOOSE PTP /g_k/                     81.3    18.11     40.0    100.0     60.0
