The influence of voice familiarity on early word learning: an electrophysiological study

Academic year: 2021

Mathilde de Smit

Bachelor thesis: Psychobiology, University of Amsterdam
Supervisor: dr. C.M.M. (Caroline) Junge
Second assessor: dr. E. (Eleni) Spyropoulou
June 19, 2020

Abstract

Infants begin to learn words during their first year of life. This study investigates how voice familiarity influences early word learning. Eleven-month-old infants were trained and tested on pseudo-word-object combinations, with the pseudo words pronounced either by their own mother (familiar voice condition) or by an unfamiliar female voice (unfamiliar voice condition). Word learning was measured using electroencephalography (EEG), specifically by examining differences in event-related potential (ERP) responses. We assessed three ERP components that are associated with the three steps of word learning: the Nc with object recognition, the N200-500 with word form recognition and the N400 with concept-to-word mapping. The results revealed that the infants showed effects of object recognition and word form recognition, but not of concept-to-word mapping. Furthermore, voice familiarity had no effect on any of the three steps of word learning, as there were no differences between the familiar voice condition and the unfamiliar voice condition. In sum, our research provides counterevidence for the

Introduction

A miraculous process occurs in the early development of children: the acquisition of language. This process unfolds at a remarkable pace despite its complexity; it is so complex that even artificial intelligence approaches have not succeeded in creating a computer capable of learning language (Zue, 2000). A central process in language acquisition is word learning, which can be described as ‘generating associative links between words and their references by noticing the co-occurrence between a certain word and a certain object’ (van Rooijen, Bekkers, & Junge, 2019, p. 38). Speech perception is important for this process, as it is the main channel through which infants are exposed to novel words. Speech perception depends both on an integrated brain system, which ensures that infants already have biases for speech perception (Werker, 2018), and on experience (Kuhl, 2004). More specifically, speaker familiarity, which refers to how familiar infants are with the voices they are exposed to, has been found to be important for word learning (Pierrehumbert, 2003).

The most important voice for infants is without doubt that of the mother, as this is the voice with which infants have had the most experience. Even in the womb, fetuses are able to hear their mother’s voice and distinguish it from other voices (Kisilevsky et al., 2009). Shortly after birth, neonates already show a preference for the maternal voice over the voice of another female (DeCasper & Fifer, 1980). Research also suggests an association between the mother’s voice and the development of social and emotional functioning: in children in middle childhood, social communication skills were predicted by the strength of connectivity between multiple brain regions while they perceived their mother’s voice (Abrams et al., 2016).

Moreover, the mother’s voice has been found to play a profound role in language development. This can be seen in the brain very early: research has demonstrated that the mother’s voice and heartbeat affect the growth and development of the auditory cortex in infants born extremely prematurely (Webb, Heller, Benson, & Lahav, 2015). These results indicate that the auditory cortex adapts to maternal speech and that the mother’s voice thus plays a role in the early shaping of language areas in the brain. This is also suggested by Dehaene-Lambertz et al. (2010), who examined the organization of brain activity in response to the mother’s voice in 2-month-old infants. They found an advantage for maternal speech relative to unfamiliar speech in the left posterior temporal lobe, which is associated with speech processing. Both studies thus reveal an important role for the mother’s voice in the development of language in the brain.

As mentioned before, speaker familiarity is important for word learning (Pierrehumbert, 2003), and multiple studies show an effect of the mother’s voice on processes in early word learning (e.g. van Rooijen et al., 2019). One of these processes is word form recognition, which refers to the ability to recognize that a cluster of sounds together forms a word and to identify that word. Although studies have shown a beneficial effect of the mother’s voice on word recognition (e.g. Barker & Newman, 2004), little is known about the effect of voice familiarity on word learning. The eye-tracking study by van Rooijen et al. (2019) did demonstrate that 24-month-old infants learn novel words more easily when listening to their own mother than when listening to unfamiliar speakers. However, this was a behavioral study, and no electrophysiological research into voice familiarity and word learning has been conducted yet. Such research is important because it provides insight into the processes underlying word learning. This knowledge could be used to design better intervention strategies aimed at boosting early word learning. To add to this knowledge, this study aims to answer the following research question: how does voice familiarity influence early word learning?

To investigate this, we conducted an experiment with 11-month-old infants in which they were trained and tested on pseudo-word-object combinations. We used a between-subjects design with two voice familiarity conditions: the infants were presented with the pseudo words pronounced either by their own mother (the familiar voice condition) or by an unfamiliar female voice (the unfamiliar voice condition). Pre-recorded stimuli were used for this purpose. Word learning was measured using electroencephalography (EEG), specifically by examining several event-related potential (ERP) components. This method allowed us to examine the processes underlying word learning, in which we distinguish three steps: (1) object recognition; (2) word form recognition; and (3) concept-to-word mapping (Junge, Cutler, & Hagoort, 2012; Waxman & Lidz, 2002). Each of these steps is associated with an infant ERP component, which we consider separately.

The first step in word learning is object recognition, which entails identifying an object and being able to categorize it. An example is generalizing cups that differ in appearance and features to a single category: cups. The ERP component associated with object recognition is the Nc, a negative signal found in fronto-central areas in response to all types of visual stimuli, but not to auditory stimuli (de Haan, 2007). More specifically, this component is usually more negative in response to novel visual stimuli than to familiar visual stimuli, and the Nc is therefore associated with attention and memory (Ackles & Cook, 2007). Earlier research has found that in 6-to-12-month-old infants the Nc peaks around 400-600ms after stimulus onset and that the component is significantly attenuated by the repetition of visual stimuli (Junge et al., 2012). We therefore expect to see the same effect in this study. However, we hypothesize that maternal speech has no effect on the Nc, as the Nc reflects visual rather than auditory processing. In addition, in the experimental design the auditory stimulus (pseudo word) is played 1000ms after picture onset; because the Nc occurs earlier, it is unlikely that the auditory stimuli affect the Nc.

The next important skill in word learning is word form recognition, which refers to the ability to identify a word based on its form. For instance, the Dutch word ‘poes’ means the same as the English word ‘cat’, although the words differ in their form. Research from Barker & Newman (2004) provides evidence that the mother’s voice facilitates the process of word form recognition in infants.

To assess word form recognition, we consider the N200-500, a broad component that occurs 200-500ms after word onset and is associated with auditory processing. In infants this component is usually largest at anterior and lateral electrodes, and enhanced negativity is found in response to familiar words compared to unfamiliar words (Friedrich & Friederici, 2008; Junge et al., 2012). Based on these results, the N200-500 is considered a measure of familiarity and therefore of word form recognition. Research further reveals that this word form familiarity effect is also found for pseudo words (Friedrich & Friederici, 2008). Based on this and on findings that voice familiarity enhances word recognition (Barker & Newman, 2004; Houston & Jusczyk, 2003), we hypothesize two things. Firstly, we expect to see an N200-500 effect in response to repetition of the pseudo words in all infants. Secondly, we hypothesize that maternal speech will facilitate word form recognition, and we therefore expect to see a larger N200-500 effect in the familiar voice condition than in the unfamiliar voice condition.

Finally, we consider the last crucial skill in word learning: concept-to-word mapping. This skill entails creating associations between words and objects, so that a connection between the phonological and semantic representations is established. The N400 is associated with concept-to-word mapping, and this component is thus expected to provide insight into word learning. The N400 is a negative deflection peaking around 400ms after stimulus onset and is most pronounced at posterior electrodes (Kutas & Federmeier, 2011). Numerous studies have been conducted on the N400 since it was first discovered by Kutas & Hillyard (1980). In their study, a significant increase in negativity was found in the EEG signal around 400ms after people read a semantically inappropriate word in a sentence. Since then, the N400 has been seen as an indication of semantic processing, and more specifically as a reflection of the ease with which a word can be integrated into a certain context (Kutas & Federmeier, 2011). Many studies demonstrate that the N400 effect can already occur in infants; it has been found, for example, in 6-month-olds (Friedrich & Friederici, 2011), 9-month-olds (Junge et al., 2012; Parise & Csibra, 2012) and 14-month-olds (Friedrich & Friederici, 2008). Therefore, we expect to find an N400 effect in response to the incongruent object-word pairings.

Furthermore, we cautiously hypothesize that maternal speech facilitates concept-to-word mapping in a similar manner as word form recognition, and we therefore expect to see a larger N400 effect in the familiar voice condition than in the unfamiliar voice condition. This is supported by research that found a positive link between maternal speech and word learning (e.g. Parise & Csibra, 2012). However, we must be careful when making hypotheses about the N400, as a recent review by Morgan, van der Meer, Vulchanova, Blasi, & Baggio (2020) pointed out several factors that lead to inconsistent findings concerning the N400. For instance, there are multiple methodological inconsistencies between studies examining the N400, such as different target time intervals, different experimental set-ups and different channel localizations. This makes it harder to interpret and compare research on the N400 effect. Furthermore, age and vocabulary knowledge have been found to modulate the amplitude, latency and topographical distribution of the N400 component (Morgan et al., 2020). We must keep these factors in mind when considering the N400. Still, we speculate that voice familiarity affects concept-to-word mapping and thus the N400.

To summarize, we expect that infants will be capable of all three steps of word learning and that this will be reflected in the ERP responses. Furthermore, we expect that voice familiarity will enhance only the latter two steps of word learning: word form recognition and concept-to-word mapping.

Methods

Participants

A total of 43 11-month-old infants participated in this study (mean age: 333 days; range: 293-379; 23 girls¹). Each infant was assigned either to the familiar voice condition (n = 21) or to the unfamiliar voice condition (n = 22). Participants were recruited through letters sent via the municipality. All infants were healthy, born between 37 and 42 weeks of pregnancy, and came from monolingual Dutch households. Prior to participation, parents signed an informed consent form and completed a general questionnaire, which confirmed that none of the infants had a family history of neurological or language impairments. After the experiment, the infants received a small book in appreciation of their participation.

Materials

For the visual stimuli, eight pictures of stuffed toys were used which were distinct in color and shape (Figure 1).

1 Part of this study has been described before as Blommers (2015); our study includes more

Figure 1: The visual stimuli

These toys were chosen as they do not resemble any object that the infants were already familiar with. In addition to these unfamiliar objects, real objects were presented with which the infants were familiar prior to the experiment. These familiar visual stimuli consisted of pictures of a baby, a cat, a cow and a dog. All pictures were sized approximately 20x20cm and were presented on a screen against a dark grey background.

For the auditory stimuli, the following eight distinct, bisyllabic, trochaic Dutch pseudo words were used: /bœymiŋ/, /dεibǝl/, /funi/, /xemǝr/, /kavǝn/, /mikǝl/, /pola/ and /tεpǝr/. These words are phonotactically legal but do not exist in Dutch, which made it possible to control for the infants’ prior knowledge. In addition to the pseudo words, there were auditory stimuli of real words that matched the real objects. These were the Dutch words ‘baby’ (baby), ‘poes’ (cat), ‘koe’ (cow) and ‘hond’ (dog). All of the words were preceded by the Dutch definite article ‘de’.

Prior to testing, the auditory stimuli were collected by making recordings of the mother’s voice. Half of the mothers of the infants were asked to read a word list out loud, as only half of the infants were assigned to the familiar voice condition. Each recording was thus used twice: once for the child of the mother and once for another infant who was in the

occurrences of each pseudo word and 4 occurrences of each real word. The mothers were asked to read the words at a slow pace and as if they were talking to their infants. From each recording we chose 12 tokens for each pseudo word and 4 tokens for each real word. The recordings were cut, equalized to 65dB and word onset was marked using Praat software (Boersma & Weenink, 2001).

Experimental design

The experiment consisted of a total of 136 word-picture trials. These trials were divided into one practice block of 8 trials, containing the real words and real objects, and four blocks of 32 pseudo word-picture trials each. In the practice block each word-object combination was presented twice. The aim of these practice trials was to clarify the referential relationship between the words and the pictures.

All trials had the same structure, as outlined in Figure 2. Pictures appeared on the screen and remained for 3000ms, with the word presented 1000ms after picture onset. We used different onsets for the pictures and the words, as this enabled us to study object recognition and word form recognition separately. The variance between the different speakers and different tokens led to a varying word offset. However, to ensure that word offset always preceded picture offset, picture offset occurred 2000ms after word onset. Each trial ended with an inter-stimulus interval (ISI) of 750ms, during which a plain light grey screen was presented. During this ISI, an attention getter could be played. The duration of each trial was 3750ms and the total duration of the experiment was 8.5 minutes.
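The trial timing and totals reported here can be checked with a short sketch. The constant and function names below are illustrative assumptions and are not taken from the original Presentation® script; only the durations and trial counts come from the text.

```python
# Illustrative sketch: names are hypothetical, numbers are taken from the text.
PICTURE_DURATION_MS = 3000   # picture stays on screen for 3000 ms
WORD_ONSET_MS = 1000         # word starts 1000 ms after picture onset
ISI_MS = 750                 # inter-stimulus interval after picture offset

PRACTICE_TRIALS = 8
BLOCKS = 4
TRIALS_PER_BLOCK = 32


def trial_events(trial_start_ms=0):
    """Return (event name, time in ms) pairs for a single trial."""
    picture_onset = trial_start_ms
    word_onset = picture_onset + WORD_ONSET_MS
    picture_offset = picture_onset + PICTURE_DURATION_MS
    trial_end = picture_offset + ISI_MS
    return [
        ("picture_onset", picture_onset),
        ("word_onset", word_onset),
        ("picture_offset", picture_offset),
        ("trial_end", trial_end),
    ]


total_trials = PRACTICE_TRIALS + BLOCKS * TRIALS_PER_BLOCK
trial_duration_ms = PICTURE_DURATION_MS + ISI_MS
total_minutes = total_trials * trial_duration_ms / 60000

print(total_trials, trial_duration_ms, total_minutes)  # 136 3750 8.5
```

The computed values agree with the reported 136 trials, 3750ms per trial and 8.5 minutes in total, and the picture offset falls 2000ms after word onset.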

Figure 2: Timing of a trial

There were four blocks of pseudo word-object trials, and in each block the infants were trained and tested on two word-object combinations. Figure 3 shows the structure of one block. Each block consisted of 16 trials in the training phase and 16 trials in the test phase. During the training phase the trials were organized in an ABAB order, so that the 16 trials were divided into four groups of four trials. For each trial a different token of the word was used, but the pictures were the same for each word-object combination. During the test phase, the infants were again presented 8 times with each of the two word-object combinations. However, half of the trials were congruent, so that the word and picture matched (e.g. the blue toy with the word ‘funi’), and half of the trials were incongruent, so that the word and picture did not match (e.g. the blue toy with the word ‘dεibǝl’). In the test phase we used four tokens per word, so that each token was presented twice: once in a congruent trial and once in an incongruent trial. The order of presentation was pseudo-randomized: no more than two trials with the same picture, word or trial type (congruent vs. incongruent) could appear consecutively. To control for order effects, four presentation-order lists were created, which differed from each other in word-object combinations and order of presentation.
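The pseudo-randomization constraint for the test phase can be illustrated with a minimal sketch. It is not the procedure used to create the four presentation-order lists, which the text does not describe; the randomized backtracking search, the labels "A" and "B" and all function names are assumptions made for illustration, and word tokens are left out.

```python
import random
from collections import Counter


def breaks_rule(order):
    """True if the last three trials share a picture, a word or a trial type."""
    if len(order) < 3:
        return False
    last = order[-3:]
    pictures = {picture for picture, word in last}
    words = {word for picture, word in last}
    types = {picture == word for picture, word in last}  # True = congruent
    return len(pictures) == 1 or len(words) == 1 or len(types) == 1


def build_test_order(rng):
    """Order 16 test trials (4 per picture-word pairing) under the constraint."""
    remaining = Counter({(p, w): 4 for p in "AB" for w in "AB"})
    order = []

    def extend():
        if len(order) == 16:
            return True
        candidates = [kind for kind, n in remaining.items() if n > 0]
        rng.shuffle(candidates)
        for kind in candidates:
            order.append(kind)
            remaining[kind] -= 1
            if not breaks_rule(order) and extend():
                return True
            # Undo the choice and try the next candidate (backtracking).
            order.pop()
            remaining[kind] += 1
        return False

    if not extend():
        raise RuntimeError("no valid order found")
    return order


test_order = build_test_order(random.Random(0))
for picture, word in test_order:
    print(picture, word, "congruent" if picture == word else "incongruent")
```

Backtracking is used here instead of reshuffling until a valid order appears, so the search always terminates with a valid sequence. One exists, for example the repeating cycle A-A, A-B, B-B, B-A.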

Figure 3: The outlining of a block containing a training phase and a test phase. Underlined words indicate the incongruent word-object combinations. Subscripted numbers indicate the token number.

The experiment was written and run using Presentation® software (Version 17.210.08.14, www.neurobs.com). The experiment was programmed to automatically send a marker to the EEG signal at picture onset and word onset, which enabled us to define the ERP time windows.

Procedure

Infants were either assigned to the familiar voice condition, in which they heard the pseudo words pronounced by their own mother, or to the unfamiliar voice condition, in which they heard the pseudo words pronounced by the mother of another infant.

We provided the parents with an information brochure prior to participation, which contained specific information on the study and its procedure. Before starting the experiment, parents had to fill in a general questionnaire and sign an informed consent. The experiment took place in a dimly lit, sound attenuated room. The infant was seated in a car seat, which was attached to a low table. On this table an Acer monitor of 27.5x34.5cm sat at a distance of 70cm from the infant. The parent and one of the experimenters each sat next to one side of the infant. Sessions were recorded with a Canon Legria HFG25 camera to allow for offline

EEG recordings and processing

EEG was recorded at a sampling rate of 16384Hz, using an infant-size ActiveTwo headcap with 32 inserted Signa Gel electrodes, placed according to the traditional 10/20 positioning. Electrodes were referenced to the left and right mastoids. In addition, an electrode was placed underneath the left eye to record eye blinks, which made it possible to filter eye blinks out of the signal.

Data were pre-processed by down-sampling to 512Hz and applying a band-pass filter of 0.1-30 Hz. Subsequently, the signal was segmented into epochs of 1200ms, starting 200ms before onset of the target word or target picture. Video recordings were used to reject the trials in which the infants had not looked at the screen. Furthermore, based on visual inspection we excluded trials that contained artifacts, such as drifts, eye movements or amplitudes exceeding 250μV. Finally, participants had to have at least 10 artifact-free trials in a condition (1st-4th instances of a picture and 5th-8th instances of a picture; 1st-4th instances of a word and 5th-8th instances of a word; match and mismatch) to be included in statistical analyses.
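The numbers in this pre-processing description can be made concrete with a small sketch. All names are hypothetical, and the inclusion rule is read here as requiring at least 10 artifact-free trials in every condition of a contrast, which is an assumption about the intended meaning.

```python
# Illustrative sketch: names are hypothetical, numbers are taken from the text.
RAW_HZ = 16384        # recording sampling rate
TARGET_HZ = 512       # sampling rate after down-sampling
EPOCH_MS = 1200       # epoch length
PRE_ONSET_MS = 200    # epoch starts this long before stimulus onset
MIN_TRIALS = 10       # minimum artifact-free trials per condition

decimation_factor = RAW_HZ // TARGET_HZ            # keep 1 sample in every 32
samples_per_epoch = EPOCH_MS * TARGET_HZ // 1000   # whole samples per epoch


def epoch_bounds_ms(onset_ms):
    """Start and end of the epoch around a picture or word onset marker."""
    start = onset_ms - PRE_ONSET_MS
    return start, start + EPOCH_MS


def is_included(clean_trial_counts):
    """Assumed rule: every condition needs at least MIN_TRIALS clean trials."""
    return all(n >= MIN_TRIALS for n in clean_trial_counts.values())


print(decimation_factor, samples_per_epoch)         # 32 614
print(epoch_bounds_ms(1000))                         # (800, 2000)
print(is_included({"match": 12, "mismatch": 9}))     # False
```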

Statistical analysis

We assessed the data using repeated measures analyses of variance (repeated measures ANOVA). Mean amplitudes were calculated per electrode over selected time windows: 300-800ms after picture onset for the Nc, 200-600ms after word onset for the N200-500 and 400-600ms after word onset for the N400. We chose these time windows based on prior knowledge (Junge et al., 2012) and on visual inspection.
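As a sketch of how a mean amplitude over such a time window could be computed for one electrode, assume an epoch stored as a list of samples at 512Hz that starts 200ms before stimulus onset. The function names and the synthetic epoch are illustrative assumptions; the time windows are those given above.

```python
# Illustrative sketch: names and the synthetic data are hypothetical.
SAMPLING_RATE_HZ = 512
EPOCH_START_MS = -200   # epoch begins 200 ms before stimulus onset

WINDOWS_MS = {
    "Nc": (300, 800),         # relative to picture onset
    "N200-500": (200, 600),   # relative to word onset
    "N400": (400, 600),       # relative to word onset
}


def time_to_sample(time_ms):
    """Convert a time relative to stimulus onset into a sample index."""
    return round((time_ms - EPOCH_START_MS) * SAMPLING_RATE_HZ / 1000)


def window_to_samples(start_ms, end_ms):
    return time_to_sample(start_ms), time_to_sample(end_ms)


def mean_amplitude(epoch, component):
    """Mean of the samples that fall inside the component's time window."""
    first, last = window_to_samples(*WINDOWS_MS[component])
    window = epoch[first:last]
    return sum(window) / len(window)


# Synthetic epoch: flat at 0, with a -5 plateau from 300 to 800 ms after onset.
epoch = [0.0] * 256 + [-5.0] * 256 + [0.0] * 102
print(window_to_samples(300, 800))   # (256, 512)
print(mean_amplitude(epoch, "Nc"))   # -5.0
```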

We performed repeated measures ANOVAs for each of the three time windows. For the training phase we assessed picture repetition effects and word repetition effects separately. For both, we compared the mean amplitudes of the 1st-4th instances with those of the 5th-8th instances of either the word or the picture, with voice condition (familiar vs. unfamiliar) as between-subjects factor. For the test phase we assessed the match-mismatch effects by comparing the mean amplitudes of the congruent trials with those of the incongruent trials, again with voice condition as between-subjects factor. In addition, we performed analyses with the following within-subject factors: hemisphere (2) and relevant electrodes per ERP (for the Nc: F3, FZ, F4, FC5, FC1, FC2, FC6, C3, CZ, C4; for the N200-500: F7, F3, FZ, F4, F8, FC5, FC6, T7, T8; and for the N400: CP5, CP1, CP2, CP6, P7, P3, PZ, P4, P8). Adding these within-subject factors allowed for statistical analysis of the distribution of effects. We report F-ratios with uncorrected degrees of freedom, together with p-values corrected using the Huynh-Feldt epsilon correction.
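To make the structure of the central test concrete, consider a simplified case in which amplitudes are averaged over electrodes, leaving one two-level within-subject factor (e.g. 1st-4th vs. 5th-8th instances) and voice condition as between-subjects factor. In that case the F-ratio for the interaction equals the squared independent-samples t statistic on the per-infant difference scores, with degrees of freedom (1, N - 2), which gives (1, 41) for 43 infants. The sketch below implements this simplified case; it is not the full analysis with electrode and hemisphere factors, the function name is hypothetical, and the example numbers are made up.

```python
from statistics import mean, variance


def interaction_f(diff_familiar, diff_unfamiliar):
    """F for the 2 (within) x 2 (between) interaction, computed from
    per-infant difference scores (e.g. early minus late repetitions)."""
    n1, n2 = len(diff_familiar), len(diff_unfamiliar)
    df_error = n1 + n2 - 2
    pooled_variance = (
        (n1 - 1) * variance(diff_familiar)
        + (n2 - 1) * variance(diff_unfamiliar)
    ) / df_error
    standard_error = (pooled_variance * (1 / n1 + 1 / n2)) ** 0.5
    t = (mean(diff_familiar) - mean(diff_unfamiliar)) / standard_error
    return t * t, (1, df_error)


# Made-up difference scores, for illustration only.
f_ratio, dof = interaction_f([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(round(f_ratio, 3), dof)  # 1.5 (1, 4)
```

With group sizes of 21 and 22, the same function returns degrees of freedom (1, 41), matching the values reported for the F-ratios in the results.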

Results

Training phase – object recognition

Figure 4 displays the grand average waveforms for the training phase, time locked to picture onset. Four waveforms per electrode are visible: the 1st-4th instances of a picture and the 5th-8th instances of a picture, each for the familiar voice condition and the unfamiliar voice condition. In this figure and the following grand average waveform figures, negativity is plotted upwards and the electrodes are displayed according to their position on the head: from left to right and from anterior (top) to posterior (bottom).

A broad deflection is visible in almost all electrodes for all conditions, which gets less negative with repetition. The deflection starts at 200-300ms after picture onset and peaks around 500ms with a negative polarity, which is visible for all electrodes, except for P7 and P8. No clear deflection is visible in P3 and P4.

Figure 4: The grand average waveforms for the training phase, time locked to picture onset, for the following four conditions: Blue = 1st-4th occurrences of picture, familiar voice condition; Green = 5th-8th occurrences of picture, familiar voice condition; Black = 1st-4th occurrences of picture, unfamiliar voice condition; Red = 5th-8th occurrences of picture, unfamiliar voice condition.

We performed statistical analyses over the time window of 300-800ms after picture onset. The first ANOVA, which analyzed all electrodes, reveals a significant effect of picture repetition (F(1,41) = 12.228, p = 0.001), but no significant interaction of picture repetition with voice familiarity (F(1,41) = 0.942, p = 0.337). A second analysis of the relevant electrodes for the Nc (in fronto-central areas) gave similar results: a significant effect of picture repetition (F(1,41) = 12.275, p = 0.001) but again no significant interaction of picture repetition with voice familiarity (F(1,41) = 1.188, p = 0.282). The repetition effect can also be seen in the grand average waveforms, in particular in the fronto-central areas. In general, the graphs of the 5th-8th occurrences (in green and red) lie under the graphs of the 1st-4th occurrences (in blue and black). The 1st-4th occurrences of a picture thus appear to have elicited a more negative signal than the 5th-8th occurrences of a picture. This is also visualized in Figure 5, which depicts the interaction between picture repetition and the voice condition (over all electrodes).

In addition, we assessed the hemispheres with an ANOVA; however, the results revealed no significant interaction between picture repetition and hemisphere (F(1,41) = 0.642, p = 0.428). Also, no significant interaction was found between picture repetition, hemisphere and voice familiarity (F(1,41) = 0.205, p = 0.653). Thus, no differences were found between the hemispheres.

Overall, we conclude that there is an Nc effect, based on the results of the analyses. This is confirmed by visual inspection. Interpretation of our findings follows in the discussion section below.

Figure 5: A plot showing the interaction between picture repetition and the two voice conditions, over all electrodes.

Training phase – word form recognition

Figure 6 depicts the grand average waveforms for the training phase, time locked to word onset, for the 1st-4th occurrences of a word and the 5th-8th occurrences of a word, for both the familiar voice condition and the unfamiliar voice condition. A slightly positive deflection is visible in the anterior and lateral electrodes, which starts at 100-200ms after word onset and peaks around 400ms. Subsequently, this positive deflection turns into a negative deflection at around 800ms after word onset. However, this trend is not visible in the more posterior electrodes, where hardly any deflections are visible. Overall, the graphs of the 5th-8th occurrences (in green and red) lie above the graphs of the 1st-4th occurrences (in blue and black), which indicates that repetition of words led to a less positive (i.e. more negative) signal. Figure 7 also substantiates this, as it shows a more negative average signal for the 5th-8th occurrences compared to the 1st-4th occurrences, for both voice conditions.

Figure 6: The grand average waveforms for the training phase, time locked to word onset, for the following four conditions: Blue = 1st-4th occurrences of word, familiar voice condition; Green = 5th-8th occurrences of word, familiar voice condition; Black = 1st-4th occurrences of word, unfamiliar voice condition; Red = 5th-8th occurrences of word, unfamiliar voice condition.

Figure 7: A plot showing the interaction between word repetition and the two voice conditions, over all electrodes.

Statistical analyses were performed over the time window of 200-600ms after word onset. The initial ANOVA over all electrodes confirms our visual inspection: there is a significant effect of word repetition (F(1,41) = 5.905, p = 0.020). A similar result is found in the ANOVA that assesses the relevant electrodes for the N200-500 (anterior and lateral electrodes): there is a significant effect of word repetition (F(1,41) = 8.639, p = 0.005). These results establish that the effect we can see in the grand average waveforms concerns the N200-500. However, both the overall analysis and the analysis over the relevant electrodes reveal that there is no significant interaction between word repetition and voice familiarity (overall: F(1,41) = 0.425, p = 0.518; relevant electrodes: F(1,41) = 0.667, p = 0.419).

Furthermore, the distribution of the word repetition effect was analyzed. Although this ANOVA showed no significant interaction between word repetition and hemisphere (F(1,41) = 2.988, p = 0.091), follow-up ANOVAs per hemisphere were performed because of the small p-value. These showed a significant effect of word repetition in the left hemisphere (F(1,41) = 8.521, p = 0.006), but not in the right hemisphere (F(1,41) = 2.326, p = 0.135). In neither hemisphere was there a significant interaction between word repetition and voice familiarity (left: F(1,41) = 0.005, p = 0.942; right: F(1,41) = 1.644, p = 0.207).

In sum, based on the results of the analyses we conclude that an N200-500 effect is present. Further interpretation follows in the discussion.

Test phase – concept-to-word mapping

Figure 8 depicts the grand average waveforms for the test phase trials, time locked to word onset. Per electrode four waveforms are visible: the congruent trials and the incongruent trials, each for the familiar voice condition and the unfamiliar voice condition. No clear trends are visible over all electrodes, although a negative deflection is visible in some electrodes, most pronounced in the lateral anterior electrodes (F7, FC5, T7, F8, FC6, T8). This negative deflection starts around 400ms after word onset. The remaining electrodes show less clear deflections.

Figure 8: The grand average waveforms for the test phase, time locked to word onset, for the following four conditions: Blue = congruent pairs, familiar voice condition; Green = incongruent pairs, familiar voice condition; Black = congruent pairs, unfamiliar voice condition; Red = incongruent pairs, unfamiliar voice condition.

The time window we used for statistical analyses is 400-600ms after word onset. The initial ANOVA over all electrodes shows no significant main effect of word-object congruency (F(1,41) = 0.494, p = 0.486) and no significant interaction of word-object congruency and voice familiarity (F(1,41) = 0.056, p = 0.814). These findings are visualized in Figure 9, which displays the interaction between congruency and both voice conditions. A second ANOVA assessed the pre-determined relevant electrodes for the N400, which are the posterior electrodes. This led to similar results as the initial ANOVA: no significant effect of word-object congruency was found (F(1,41) = 0.327, p = 0.571), nor a significant interaction between word-object congruency and voice familiarity (F(1,41) = 0.021, p = 0.886).

To investigate the topographical distribution of the effects, an ANOVA with the within-subject factor hemisphere was performed. This revealed no significant interaction between word-object congruency and hemisphere (F(1,41) = 0.557, p = 0.460) and no significant interaction between word-object congruency, hemisphere and voice condition (F(1,41) = 0.252, p = 0.618). To assess laterality, we performed an ANOVA per hemisphere, but again no significant interaction between word-object congruency and laterality was found (left: F(1,41) = 0.232, p = 0.633; right: F(1,41) = 1.331, p = 0.255).

Based on the results from the statistical analyses and visual inspection, we conclude that there is no clear N400 effect present. Further interpretation follows in the discussion.

Figure 9: A plot showing the interaction between congruency and the two voice conditions, over all electrodes.

Discussion

The aim of this study was to investigate how voice familiarity influences early word learning in 11-month-old infants. We investigated this using a pseudo word-object learning paradigm, in which infants heard either the voice of their own mother or an unfamiliar female voice. We assessed the ERPs related to the three steps in early word learning: object recognition, word form recognition and concept-to-word mapping. We will consider these three separately.

Object recognition and voice familiarity

To assess the influence of voice familiarity on object recognition, we analyzed the ERP signals of the training trials, time locked to picture onset. The results show an Nc effect in response to picture repetition, which is also visible in the grand average waveforms. The Nc effect is evenly distributed over the brain, as no differences between the hemispheres were found and an equally large Nc effect was found in the relevant electrodes for the Nc (in fronto-central areas). The finding of the Nc effect is in line with our hypothesis, as we expected, based on earlier research (Junge et al., 2012), that picture repetition would lead to an attenuation of the Nc component. These results indicate that the infants became familiar with the pictures and encoded them, as there is a difference between the infants’ familiarity with the objects during the 1st-4th occurrences and during the 5th-8th occurrences.

We did not find an interaction between voice familiarity and object recognition, which is also in line with our predictions. As discussed in the introduction, we hypothesized that voice familiarity would not influence object recognition because the Nc is a visual component and not an auditory component (de Haan, 2007). Furthermore, in our experimental design the auditory stimulus occurred 1000 ms after picture onset, whereas the Nc usually peaks 400-600 ms after onset of the visual stimulus (Junge et al., 2012). Our results confirm this: the Nc occurred between 200 and 800 ms after picture onset, so it had ended 200 ms before the spoken word began. An interaction between voice familiarity and object recognition was therefore unlikely.

In sum, we conclude that object recognition did occur in 11-month-old infants, but that voice familiarity does not influence this.

Word form recognition and voice familiarity

The current study investigated the influence of voice familiarity on word form recognition by assessing the ERP signals of the training trials, time locked to word onset. Our results reveal that there is an N200-500 effect in response to word repetition, when comparing the grand average ERPs of the 1st-4th occurrences with those of the 5th-8th occurrences. This effect was also found for the electrodes relevant for the N200-500, namely the anterior and lateral electrodes. No differences were found between the two hemispheres. These results are in line with our expectations. However, this was not the case for our findings regarding voice familiarity: the results revealed no interaction between voice familiarity and word form recognition, although we did expect to find one.

Considering the lack of an interaction between voice familiarity and word form recognition, it is remarkable that previous studies did show an enhancing effect of voice familiarity on word form recognition (Barker & Newman, 2004; Houston & Jusczyk, 2003). However, both studies used the Headturn Preference Procedure, which is a behavioral measurement. This possibly explains why our findings are inconsistent with those of Barker & Newman (2004) and Houston & Jusczyk (2003), considering that behavioral studies measure not only infants’ ability to discriminate but also their preference, while ERP studies only measure discrimination. For instance, it could be that the results of the above-mentioned behavioral studies are due to heightened attention of the infants in the familiar voice condition compared to the infants in the unfamiliar voice condition. It could therefore be the case that these studies merely show that infants prefer their own mother’s voice over an unfamiliar voice, and do not actually measure whether voice familiarity enhances word form recognition. ERP measurements examine the underlying processes of word learning instead of the outcomes; therefore, more electrophysiological research should be conducted into the influence of voice familiarity on word form recognition.


In sum, we conclude that word form recognition did occur in 11-month-old infants, but that voice familiarity does not influence this.

Concept-to-word mapping and voice familiarity

The last crucial step in word learning is concept-to-word mapping, which we assessed by analyzing the ERP signals of the test phase, time locked to word onset. We compared the ERPs of the congruent trials with those of the incongruent trials and found no significant differences between them. Similarly, we found no N400 effect in the posterior electrodes, which are considered to be the relevant area for the N400 in infants (Junge et al., 2012). Furthermore, no interactions between word-object congruency and hemisphere or laterality were found. We thus found no N400 effect and, although we were careful in hypothesizing about the N400, this is against our expectations. Based on research demonstrating that the N400 semantic priming effect does occur in infants (e.g., Friedrich & Friederici, 2008, 2011; Junge et al., 2012), we did expect to find this effect, at least in the familiar voice condition.

Considering these unexpected results, multiple factors possibly contributed to the inconsistency between our findings and hypotheses. First of all, we should keep in mind the uncertainties concerning the N400 in infants. As discussed before, several factors are of importance in studies of the N400, which regularly leads to inconsistent findings. Methodological inconsistencies are common, which makes it harder to interpret and compare research on the N400 effect (Morgan et al., 2020).

However, our hypotheses were based on studies using comparable methodologies, such as the use of novel pseudo-word-object combinations (Friedrich & Friederici, 2008) and the number of repetitions of the stimuli (Friedrich & Friederici, 2011; Junge et al., 2012). Still, our [...] an analysis over the relevant area for the N400, for which we used the posterior electrodes, based on previous research (Junge et al., 2012). However, other studies performed with different age groups revealed that the location of the N400 is more anterior in children, whereas it is indeed more central-parietal in adults (Atchley et al., 2006; Holcomb, Coffey, & Neville, 1992). Perhaps it would have been more likely to find an N400 effect in the anterior electrodes.

Another factor that could have been of influence is the infants’ stage of language development. Friedrich & Friederici (2010) found that infants with high early word production showed an N400 semantic priming effect at the age of 12 months, whereas 12-month-old infants with low early word production did not. This may also have played a role in the current study, because we made no distinction between infants with high and low early word production in our analyses. Future research should consider this factor, as it possibly contributes to the inconsistencies between studies regarding the N400 semantic priming effect.

Furthermore, the results demonstrated no interaction between voice familiarity and congruency of the word-object pairs. This is not in line with our hypothesis, as we expected that voice familiarity would facilitate the process of concept-to-word mapping, based on research which found that maternal speech has a facilitating effect on word learning (Parise & Csibra, 2012). However, we cannot draw firm conclusions regarding the influence of voice familiarity on concept-to-word mapping based on these results, as we found no N400 effect to begin with. The lack of an N400 effect in our results could be due to multiple factors, so we cannot say with certainty that one of the voice conditions had either a positive or negative effect on word learning. For now, we can only conclude that the infants were apparently not able to map the pseudo words onto the objects in our paradigm and that voice familiarity did not boost the mapping sufficiently. Possibly our paradigm was too hard for the 11-month-old infants and another paradigm would have been better, for instance a paradigm in which a distinction is made between rotated and consistent pairing. Future research should consider which paradigm is most suited for infants of a certain age when investigating word learning.

In sum, we conclude that in the current study concept-to-word mapping did not occur in the 11-month-old infants and that it is unclear what effect voice familiarity has on this process.

Conclusion

The aim of the current study was to investigate how voice familiarity influences early word learning in 11-month-old infants. We found that there is an Nc effect in response to picture repetition, and thus that object recognition occurred in the infants. Furthermore, a significant N200-500 effect was found for word repetition, which indicates that the infants became familiar with the word forms over repetition. We can thus conclude that the first two steps in early word learning (object recognition and word form recognition) are present in 11-month-old infants. However, we cannot conclude this for concept-to-word mapping, the last step in word learning, as we found no N400 effect. As mentioned before, multiple factors may have contributed to this result, and these should be carefully considered in future research.

Strikingly, the results showed that voice familiarity had no effect at all in our investigation. This was expected for the process of object recognition, but not for word form recognition and concept-to-word mapping. We thus conclude that voice familiarity does not influence early word learning in a paradigm like ours. However, more electrophysiological research should be conducted into this, as only behavioral research has been performed on this so far.


Overall, the current study was the first to investigate how voice familiarity influences early word learning with electrophysiological measurements. Our research provides counterevidence for the suggestion that voice familiarity enhances language acquisition in infants.


References

Abrams, D. A., Chen, T., Odriozola, P., Cheng, K. M., Baker, A. E., Padmanabhan, A., … Menon, V. (2016). Neural circuits underlying mother’s voice perception predict social communication abilities in children. Proceedings of the National Academy of Sciences of the United States of America, 113(22), 6295–6300.

Ackles, P. K., & Cook, K. G. (2007). Attention or memory? Effects of familiarity and novelty on the Nc component of event-related brain potentials in six-month-old infants. International Journal of Neuroscience, 117(6), 837–867.

Atchley, R. A., Rice, M. L., Betz, S. K., Kwasny, K. M., Sereno, J. A., & Jongman, A. (2006). A comparison of semantic and syntactic event related potentials generated by children and adults. Brain and Language, 99(3), 236–246.

Barker, B. A., & Newman, R. S. (2004). Listen to your mother! The role of talker familiarity in infant streaming. Cognition, 94(2), 45–53.

Blommers, K. H. (2015). The influence of voice familiarity on early word learning (Master's thesis).

Boersma, P. P. G., & Weenink, D. J. M. (2015). Praat: doing phonetics by computer. http://www.praat.org.

DeCasper, A. J., & Fifer, W. P. (1980). Of human bonding: Newborns prefer their mothers' voices. Science, 208(4448), 1174-1176.

de Haan, M. (2007). Infant EEG and event-related potentials. New York: Psychology Press.

Dehaene-Lambertz, G., Montavont, A., Jobert, A., Allirol, L., Dubois, J., Hertz-Pannier, L., & Dehaene, S. (2010). Language or music, mother or Mozart? Structural and environmental influences on infants’ language networks. Brain and Language, 114(2), 53–65.

Friedrich, M., & Friederici, A. D. (2008). Neurophysiological correlates of online word learning in 14-month-old infants. NeuroReport, 19(18), 1757–1761.

Friedrich, M., & Friederici, A. D. (2010). Maturing brain mechanisms and developing behavioral language skills. Brain and Language, 114(2), 66–71.

Friedrich, M., & Friederici, A. D. (2011). Word learning in 6-month-olds: Fast encoding-weak retention. Journal of Cognitive Neuroscience, 23(11), 3228–3240.

Holcomb, P. J., Coffey, S. A., & Neville, H. J. (1992). Visual and auditory sentence processing: A developmental analysis using event-related brain potentials.


Houston, D. M., & Jusczyk, P. W. (2003). Infants' long-term memory for the sound patterns of words and voices. Journal of Experimental Psychology: Human Perception and Performance, 29(6), 1143–1154.

Junge, C., Cutler, A., & Hagoort, P. (2012). Electrophysiological evidence of early word learning. Neuropsychologia, 50(14), 3702–3712.

Kisilevsky, B. S., Hains, S. M. J., Brown, C. A., Lee, C. T., Cowperthwaite, B., Stutzman, S. S., … Wang, Z. (2009). Fetal sensitivity to properties of maternal speech and language. Infant Behavior and Development, 32(1), 59–71.

Kuhl, P. K. (2004). Early language acquisition: Cracking the speech code. Nature Reviews Neuroscience, 5(11), 831–843.

Kutas, M., & Federmeier, K. D. (2011). Thirty years and counting: Finding meaning in the N400 component of the Event-Related brain Potential (ERP). Annual Review of Psychology, 62(1), 621–647.

Kutas, M., & Hillyard, S. A. (1980). Reading senseless sentences: Brain potentials reflect semantic incongruity. Science, 207(4427), 203-205.

Morgan, E. U., van der Meer, A., Vulchanova, M., Blasi, D. E., & Baggio, G. (2020). Meaning before grammar: A review of ERP experiments on the neurodevelopmental origins of semantic processing. Psychonomic Bulletin and Review, 1-24.

Parise, E., & Csibra, G. (2012). Electrophysiological Evidence for the Understanding of Maternal Speech by 9-Month-Old Infants. Psychological Science, 23(7), 728–733.

Pierrehumbert, J. B. (2003). Phonetic diversity, statistical learning, and acquisition of phonology. Language and Speech, 46(2–3), 115–154.

van Rooijen, R., Bekkers, E., & Junge, C. (2019). Beneficial effects of the mother’s voice on infants’ novel word learning. Infancy, 24(6), 838–856.

Waxman, S., & Lidz, J. L. (2002). Early word learning. In D. Kuhn & R. Siegler (Eds.), Handbook of child psychology (6th ed.) (pp. 299–335). Hoboken, NJ: Wiley.

Webb, A. R., Heller, H. T., Benson, C. B., & Lahav, A. (2015). Mother’s voice and heartbeat sounds elicit auditory plasticity in the human brain before full gestation. Proceedings of the National Academy of Sciences of the United States of America, 112(10), 3152–3157.

Werker, J. F. (2018). Perceptual beginnings to language acquisition. Applied Psycholinguistics, 39(4), 703–728.
