
Pitch and Pitch Contour Processing in Musical and Linguistic Stimuli

A Literature Overview and an EEG Experiment

Abstract: The goal of this thesis is to provide an in-depth overview of research into pitch processing and pitch contour processing, focusing on the overlap between musical and linguistic stimuli. In the literature overview it is argued that the different mechanisms involved in pitch and pitch contour processing are neither language nor music specific, but are specific to certain aspects of the sound signal. These aspects can be, but are not necessarily, present in both music and spoken language. The EEG experiment tested whether the pitch contour present in musical stimuli affects linguistic pitch contour processing. Mandarin, a tonal language, was used for the stimuli. Participants were exposed to either congruent or incongruent combinations of rising and falling pitch contours in spoken language and music. The N400 component was used as benchmark. No indications of simultaneous processing were found. However, an unexpected difference between the processing of rising and of falling pitch contours emerged; results indicate that this difference stems from the linguistic stimuli.

Johan Tangerman
University of Amsterdam, Master thesis
Student ID: 10194231
Contact: johan.tangerman@student.uva.nl
Supervisor Experiment: Joey Weidema
Supervisor Thesis: Makiko Sadakata

Foreword

During my Master in Musicology at the University of Amsterdam I was lucky enough to have the opportunity to assist in an experiment conducted by the Institute for Logic, Language and Computation (ILLC) and to write my Master thesis about it. The experiment, conducted by Joey Weidema, focused on the relation between music and language processing, a field of research in which I developed an interest during my Bachelor's. My Bachelor thesis focused on the O.P.E.R.A. hypothesis, proposed by Patel in 2011, and only fueled this fascination with language and music processing.

So when I was given the opportunity to assist in this study I immediately took it. Witnessing and participating in the process of designing and executing an EEG experiment like this one has been a great learning experience but also, and especially, a lot of fun. I have been able to further develop skills that will be helpful in a future academic career. My supervisors for this thesis, Joey Weidema and Makiko Sadakata, have been very helpful and I am grateful for their role in this Master thesis.

Reader guidance

The thesis is separated into two parts. The first part provides the literature overview and consists of two chapters. The second part describes the EEG experiment, the methodology used and the results found; it consists of three chapters. The final chapter is the discussion, where I further discuss the reviewed literature and the experimental results. Ideas for future research are provided in this chapter as well. The thesis ends with an overall conclusion.

The literature overview is partly meant as an introduction to the fields of music cognition, language cognition and especially pitch processing. Readers already familiar with these fields may not need to read the complete literature overview; I would then recommend reading sections 2.4 and 2.5 as an introduction to the EEG experiment.

Table of Contents

Abstract
Foreword
Reader guidance
Table of Contents
Introduction
Part 1: Literature Overview
Chapter 1: Pitch Processing
1.1 Introduction
1.2 Neuroimaging studies
1.3 Pitch Processing Specialization
1.4 Summary and Conclusion
Chapter 2: Overlap Between Music and Language Processing
2.1 Introduction
2.2 Cross-domain Training Effects
2.2.1 Music Training
2.2.2 Tonal Language Experience
2.2.3 O.P.E.R.A. Hypothesis
2.3 Neural Activation Patterns
2.4 Simultaneous processing
2.5 Summary and Conclusion
Part 2: The Experiment
Chapter 3: The Current Study
Chapter 4: Method
4.1 Participants
4.2 Materials
4.3 Procedure
4.4 EEG Acquisition
4.5 Pre-processing
4.6 Statistical Analyses
Chapter 5: Results
Chapter 6: Discussion
6.1 The Experiment
6.2 Serendipity
6.3 Future Research
Conclusion
Bibliography
Appendix
Figures 1 and 2; Tables 1 through 4

Introduction

The Italian 20th-century composer Luciano Berio described music as "everything that one listens to with the intention of listening to music" (Berio, Dalmonte, & Varga, 1985, p. 19). Berio tries to define music, but more interestingly he suggests that listening to music is something entirely different from listening to other sounds, such as spoken language. This raises the question of how one listens when listening with the intention of listening to music. In other words, how do our brains process music? Honing et al. (2015) in a way elaborated on Berio's description of music. Their definition of music is based on the ability to process music, an ability they named musicality. Musicality is, according to Honing et al. (2015, p. 2), "a natural, spontaneously developing set of traits based on and constrained by our cognitive and biological system". Music is then described as "a social and cultural construct based on that very musicality" (Honing et al., 2015, p. 2).

According to the definition provided by Honing et al. (2015), musicality is composed of a series of different traits. In the article four of these traits are defined: relative pitch processing, tonal encoding of pitch, beat perception and metrical encoding of rhythm. Relative pitch processing enables someone to identify a sequence of tones according to the pitch intervals between them (Ngo, Vu, & Strybel, 2016). We can still recognize a melody when it is played in a higher or lower register, or when the melody starts on a different note, because the intervals, the relations between the notes, stay the same. The tonal encoding of pitch is the sorting of tones as notes in a scale (Peretz & Coltheart, 2003). Many music styles make use of a hierarchical arrangement of tones in which the position of a tone has a certain meaning or function. For instance, Western music culture traditionally uses a scale of seven notes of which the tonic is the first note and the tonal center; the tonic in a tonal piece of music creates a feeling of conclusion. Beat perception is the ability to hear a recurring pulse in music (Honing, 2012), even when the actual beat is not physically present. Metrical encoding of rhythm is the hierarchical structuring of a rhythm in recurring meters (Fitch, 2013). This mechanism enables us to make music together and to dance to music.

Pitch processing will be the main focus of this thesis, which consists of a literature overview and an EEG experiment. The experiment focuses on the ability to process pitch contour, an aspect of relative pitch processing. The pitch contour of a sound is the shape of its pitch heights over time and involves the direction of the pitch. The processing of pitch contour is important in building up expectancies in music. However, pitch contour processing is not only important for music processing but for language processing as well. In most Western languages the pitch contour of spoken language communicates emotions; this use of pitch contour is called prosody. In so-called tonal languages the pitch contour with which a word is uttered can determine its meaning (Yip, 2002). Two-thirds of the world's languages are tonal languages (Yip, 2002).

The possible overlap between the processing of music and language, both in general and specifically for pitch, has been studied intensively. From an efficiency point of view one might expect that the pitch contours present in music and in language are processed by the same neural mechanisms. The overall tendency of these studies is that there is overlap between the two, but within boundaries: certain aspects of the processing of music and language overlap fully, others overlap partly and some do not overlap at all (Patel, 2014; Steinbeis & Koelsch, 2008). Pitch contour processing has been studied as well, and an overlap between music and language processing has indeed been found (Bidelman, Gandour, & Krishnan, 2011a; Bidelman, Hutka, & Moreno, 2013; Fujioka, Trainor, Ross, Kakigi, & Pantev, 2004; Moreno & Besson, 2005; Schön, Magne, & Besson, 2004; Stevens, Keller, & Tyler, 2011; Thompson, Schellenberg, & Husain, 2004). The literature overview will discuss these studies among others to provide a theoretical background for the EEG experiment. The goal of the experiment is to test whether it is possible to process pitch contour in language and in music simultaneously and whether the processing of linguistic stimuli is influenced by musical stimuli. The first would provide new insights into pitch contour processing and the latter would provide strong evidence for extensive overlap between the two types of pitch contour processing.

The goal of this thesis is to provide both an introduction to the fields of music cognition, language cognition and pitch processing, and new insights and arguments for those already acquainted with these fields of study. The EEG experiment makes use of a unique methodology that promises interesting new directions for future research.


Part 1: Literature Overview

Chapter 1: Pitch Processing

1.1 Introduction

Pitch processing starts when sound waves reach the eardrum. Movement of the eardrum is passed on to the ossicles, which in turn bring the perilymphatic fluid in the cochlea into motion. The hair cells in the cochlea register the movement of the perilymph and convert it to an electrical signal that is then sent through the nervous system. This signal is then processed by a variety of neural mechanisms, each processing aspects of the sound signal, which leads to the conscious experience of sound.

One of the aspects of sound that is processed extensively, and is especially important for the understanding of music and language, is pitch. Pitch is the perceptual correlate of the frequency with which a sound wave vibrates and is experienced as being low or high. A single sound wave can often be decomposed into multiple frequencies: a fundamental with several overtones. These overtones determine the quality or timbre of a sound. The pitch height of a tone is already determined in the ear, in the cochlea to be precise. The cochlea follows a tonotopic organization and its hair cells activate for different frequencies. The initial structuring of sound into pitches is thus already done before the signal reaches the brain. This literature overview will discuss aspects of the further processing of pitch information.
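To make the relation between fundamental, overtones and timbre concrete, the following minimal sketch (in Python with NumPy; it is not part of the original study, and the frequency and amplitude values are arbitrary illustrative choices) synthesizes a complex tone whose perceived pitch follows the fundamental while the overtone amplitudes shape the timbre:

    import numpy as np

    fs = 44100                      # sample rate in Hz
    t = np.arange(0, 1.0, 1 / fs)   # one second of time stamps
    f0 = 220.0                      # fundamental frequency (the A below middle C)

    # A complex tone is the sum of a fundamental and overtones at integer
    # multiples of f0. The overtone amplitudes (here 1/n) shape the timbre;
    # the perceived pitch follows f0.
    tone = sum((1 / n) * np.sin(2 * np.pi * n * f0 * t) for n in range(1, 6))
    tone /= np.max(np.abs(tone))    # normalize to avoid clipping

Changing only the amplitude pattern of the overtones changes the timbre while the perceived pitch stays the same.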

1.2 Neuroimaging studies

Lesion studies suggest that pitch is processed differently in the left and right hemispheres of the brain (Zatorre & Belin, 2001). These studies show that lesions affecting pitch processing in the right or left hemisphere result in different impairments. Zatorre and Belin (2001) suggest that the left and right hemisphere focus on different properties of sound while processing pitch, namely on temporal and on spectral properties respectively. The spectral properties of a sound refer to the frequencies present in the sound; the temporal properties refer to the changes of pitch in the sound signal over time. Zatorre and Belin (2001) designed an experiment to test this hypothesis. Participants listened to stimuli in which either the temporal information was constant while the spectral information fluctuated, or the spectral information was constant while the temporal information fluctuated. While the participants listened to the stimuli, a PET (Positron Emission Tomography) scan was made. PET is a neuroimaging technique in which participants are injected with a radioactive tracer that is carried by the blood; active brain areas attract more oxygen-rich blood, so by tracing the substance the scan can show which areas of the brain are active. Zatorre and Belin (2001) found that pitch processing mechanisms in the right hemisphere were more active when the spectral features of the stimuli changed, while pitch processing mechanisms in the left hemisphere were more active when the temporal features changed. This study demonstrates how sound can be processed according to different aspects of the sound signal. Further studies demonstrate this in more detail.

Pitch processing is hierarchical, starting in the cochlear nucleus and running up to the auditory cortex (Griffiths, Büchel, Frackowiak, & Patterson, 1998; Griffiths, Uppenkamp, Johnsrude, Josephs, & Patterson, 2001). However, it seems that a part of the auditory cortex, Heschl's gyrus, or at least its lateral side, is the center of pitch processing (Krumbholz, Patterson, & Seither-Preisler, 2003; Patterson, Uppenkamp, Johnsrude, & Griffiths, 2002). Patterson and colleagues (2002) designed an fMRI (functional magnetic resonance imaging) study, fMRI being a neuroimaging technique with a high spatial resolution, in which participants heard subsequent bursts of broadband noise and tonal versions of that noise. The tonal noise bursts were created by pasting an exact copy of the noise onto the original; a weak pitch is then introduced to the noise by delaying the copy slightly compared to the original. These tonal noise stimuli are called iterated rippled noise (IRN). IRN is described as sounding like a "cracked bassoon" by Griffiths and colleagues (1998; see Mathias (2014) for additional information and examples of IRN). Patterson and colleagues (2002) found that the lateral side of Heschl's gyrus was only activated for IRN and not for broadband noise. This indicates that Heschl's gyrus plays a central role in pitch processing.
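The delay-and-add construction of IRN can be summarized in a few lines of code. The sketch below (Python with NumPy; the delay, gain and iteration count are illustrative assumptions, not the parameters of the cited studies) repeatedly adds a delayed copy of a noise signal to itself, which introduces a weak pitch at roughly 1/delay Hz while the sound keeps its noisy character:

    import numpy as np

    def iterated_rippled_noise(duration_s=1.0, fs=44100,
                               delay_s=1 / 125, gain=1.0, iterations=8):
        """Delay-and-add network: each pass adds a delayed copy of the
        signal to itself. The result is still noise-like but carries a
        weak pitch at about 1/delay_s Hz (here ~125 Hz)."""
        rng = np.random.default_rng(0)
        y = rng.standard_normal(int(duration_s * fs))
        d = int(round(delay_s * fs))
        for _ in range(iterations):
            delayed = np.concatenate([np.zeros(d), y[:-d]])
            y = y + gain * delayed
        return y / np.max(np.abs(y))    # normalize to avoid clipping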

MEG (magnetoencephalography) is a neuroimaging technique with a strong temporal resolution and a weak spatial resolution. The N100m is an MEG component, an identified phenomenon in the MEG signal, that occurs early after an event and is associated with sound onset (Näätänen & Picton, 1987). It was originally hypothesized that the N100m was evoked by the introduction of tonality to a sound, though extensive research did not yield any evidence for this hypothesis (Näätänen & Picton, 1987). The N100m is evoked by multiple sound processing mechanisms, not all of which focus on pitch (Krumbholz et al., 2003). However, Krumbholz and colleagues (2003) argue that IRN can be used to separate the part of the N100m component that is evoked by pitch processing from the rest of the component. They focused on temporal pitch processing: they manipulated temporal aspects of broadband noise on the smallest scale to introduce pitch to the noise without changing any other aspect of the signal. This way they were able to single out the N100m elicited by the tonal noise. Initial processing of pitch thus happens early after the onset of the sound. Even though MEG has a weak spatial resolution, the results of the experiment by Krumbholz and colleagues (2003) again point in the direction of Heschl's gyrus as the center of pitch processing.

A more recent study, however, yields different results. An fMRI study using six types of tonal stimuli found more pitch-related activation in the planum temporale, a part of Wernicke's area, than in the lateral side of Heschl's gyrus (Hall & Plack, 2009). However, the authors used IRN among the stimuli as well and found the same result as Krumbholz and colleagues (2003) for the IRN stimuli, that is, strong activation in the lateral side of Heschl's gyrus. The other five tonal stimuli were all variations on sine tones with sound properties added or removed. The authors also found activation related to pitch processing in the temporo-parieto-occipital junction and the prefrontal cortex. This leads to the assumption that there are several brain areas related to pitch processing, each sensitive to an aspect of the sound signal (Hall & Plack, 2009).

Butler and Trainor (2012) devised a neuroimaging experiment that made use of IRN and complex harmonic stimuli. EEG (electroencephalography) was used as neuroimaging technique; EEG is comparable to MEG in that both have high temporal but low spatial resolution and both produce results that consist of identifiable components. The authors studied the P1 and N1 components, early EEG components that relate to processing in the primary and secondary auditory cortices respectively. The EEG experiment of Butler and Trainor (2012) aimed at finding differences between pitch processing of IRN stimuli and complex harmonic stimuli. Besides the P1 and N1 differences, the authors were also interested in whether the tonal stimuli were processed differently in early change-detection processes. To this end the stimuli were designed to evoke the mismatch negativity (MMN) component, a component associated with unexpected abnormalities within a sound signal (Näätänen, Pakarinen, Rinne, & Takegata, 2004). The authors found that the P1 and N1 components occurred later for IRN stimuli than for complex harmonic stimuli. Results also indicate that the P1 and N1 elicited by the IRN stimuli originated dorsally to the area where the P1 and N1 for harmonic complex stimuli originated. However, there was no difference between the MMN components evoked by IRN stimuli and those evoked by complex harmonic stimuli. These results suggest that the two studied types of tonal stimuli are processed both differently and similarly over the course of the cognitive analysis of the sound signals: differently at least for the processes related to the P1 and N1 components, and similarly for the processes related to the MMN component.

1.3 Pitch Processing Specialization

Melody in music and prosody in language are the types of tonal input we encounter most in everyday life. However, the types of pitch changes in melody and prosody are fundamentally different. Language prosody consists of small and continuous pitch changes, while melodies, at least in Western tonal music, consist mostly of larger and discrete pitch differences.

The hemispheres of the brain rely on different properties of sound to process pitch. The hemispheres are also biased towards music and language processing: language processing tends to be lateralized to the left hemisphere and music processing to the right hemisphere (Zatorre, 2001). Recall that the left hemisphere relies more on temporal properties for pitch processing and the right hemisphere more on spectral properties. Zatorre and Belin (2001) convincingly argue that the processing of the smaller and continuous pitch differences present in language prosody benefits from mechanisms focused on temporal properties. In spoken language pitch changes occur often, but these changes are small and usually continuous; while the spectral differences in spoken language are small, the temporal differences are relatively large. Focusing on temporal rather than spectral properties for language processing is thus more efficient for the involved neural mechanisms. Just as convincingly, they state that the efficiency of music processing benefits from a focus on spectral properties. The spectral differences in music are larger than those in spoken language, at least in the case of the English language and Western tonal music, and therefore stand out more. However, it is important to keep in mind that these left and right hemisphere specializations are not exclusive: music and spectral properties of sound are also processed in the left hemisphere, just as language and temporal properties of sound are also processed in the right hemisphere.

Besides the preferences of the left and right hemisphere, we have seen that sound is processed by different neural mechanisms depending on the stimuli (Butler & Trainor, 2012). Alho and colleagues (1996) demonstrated this with an MEG study comparing the neural reactions to simple and complex sounds. They evoked MMNm components, the MEG counterpart of the MMN, by introducing changes in the sound signals, and found differences in the MMNm for simple and complex stimuli. Another study tested for processing differences between semantically different stimuli, namely consonant and dissonant harmonic stimuli (Foo et al., 2016). Foo and colleagues (2016) did find different activation patterns for consonant and dissonant harmonic stimuli. What is interesting about these results is that consonance and dissonance are not natural distinctions but cultural ideas. This means that cultural experience has an effect on how stimuli are processed.

Another type of study that demonstrates how pitch processing depends on the stimuli is the lesion study. After acquiring brain damage, patients can experience specific cognitive problems while other cognitive functions remain intact. These lesions, combined with modern neuroimaging techniques that can determine the exact location of the brain damage, can tell us much about the functionality of specific brain areas. A much-studied example is congenital amusia, a disorder of music processing (Ayotte, Peretz, & Hyde, 2002; Peretz & Hyde, 2003). In the beginning of the research into amusia, no apparent language impairments were found in patients. This led Peretz and Coltheart (2003) to hypothesize that music and language processing mechanisms are isolated from each other. More recent studies have shown that amusia does cause specific pitch processing impairments in language processing (Hutchins, Gosselin, & Peretz, 2010; Tillmann et al., 2011). However, amusia does not always affect everyday use of language, while it does affect everyday use of music. Amusia is thus not music specific, but it does affect the processing of stimulus aspects that are more prominently present in music than in language.

1.4 Summary and Conclusion

The purpose of this chapter was to introduce the structure of pitch processing. The studies discussed in the second section demonstrate how pitch is processed by a variety of neural mechanisms. The chapter then described studies that convincingly argue that these mechanisms have a bias towards particular aspects of the sound signal; even the left and right hemispheres of the brain demonstrate such a bias. Lesion studies initially pointed to the idea that pitch processing mechanisms separate between musical and linguistic stimuli. However, further research has shown that the different mechanisms specialize in aspects of pitch that can be more prominently present in either music or language. To conclude: pitch processing mechanisms are not music or language specific, but they are stimulus specific.


Chapter 2: Overlap between Music and Language Processing

2.1 Introduction

Language is mainly processed in the left hemisphere of the brain, while music is mainly processed in the right hemisphere. However, 'mainly' is used deliberately here, because this separation is not absolute: language is also processed in the right hemisphere and music in the left. The brain areas involved with the processing of language might be activated by musical stimuli and vice versa. In this chapter I will describe studies demonstrating the overlap between music and language processing. The studies discussed are divided into three categories according to their methodology: (1) cross-domain training effect studies, (2) neural activation comparison studies and (3) studies on the interaction of simultaneously presented stimuli.

2.2 Cross-domain Training Effects

In general, cross-domain training effects provide strong evidence for overlap between cognitive domains. When training in one cognitive domain produces measurable results in another domain, there must be at least a partial overlap between the neural resources of the two domains. In practice this leads to experiments comparing matched participants who have comparable training and experience in one domain of interest but differ in training and experience in another. The participants are compared on tasks related to the domain in which they had comparable training and experience. Both the cross-domain training effects of music on language and of language on music have been studied extensively using such methods.

2.2.1 Music Training

Probably the most studied comparison is pitch processing in musicians versus non-musicians. Wayman et al. (1992) performed an EEG study in which they compared the ERPs elicited by sine tones in non-musicians and in musicians with and without absolute pitch. The authors designed a so-called "oddball" experiment, meaning that participants are initially familiarized with a sequence of stimuli, after which deviating stimuli are introduced into the sequence at random points. This allows the researcher to study how the brain reacts to the new stimuli. In this case, participants listened to a sequence of sine tones in which deviating tones were introduced at random. Wayman et al. (1992) found that the P3 EEG component, which is often associated with the oddball paradigm, had a shorter latency for musicians than for non-musicians and was shortest for musicians with absolute pitch. This indicates that musicians are capable of detecting deviating tones earlier than non-musicians. Similar results were found in an experiment that studied the ERPs of musicians and non-musicians elicited by manipulations of the fundamental frequencies of both musical and linguistic stimuli (Schön et al., 2004). They found again that musicians process pitch changes faster than non-musicians; this study demonstrated that this is also the case for pitch processing of linguistic stimuli. Using a variety of interval lengths, Tervaniemi et al. (2005) found that musicians not only react faster to even the smallest pitch changes compared to non-musicians but also more accurately.
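As a minimal illustration of how such an oddball sequence might be constructed (a sketch in Python; the trial count, deviant probability and spacing rule are illustrative assumptions, not the parameters of the cited studies):

    import random

    def oddball_sequence(n_trials=400, deviant_prob=0.15, min_gap=2, seed=1):
        """Generate a standard/deviant trial order for an oddball block.
        Deviants occur at random positions but never within min_gap trials
        of each other, so the standard can be re-established in between."""
        random.seed(seed)
        sequence, since_last = [], min_gap
        for _ in range(n_trials):
            if since_last >= min_gap and random.random() < deviant_prob:
                sequence.append("deviant")
                since_last = 0
            else:
                sequence.append("standard")
                since_last += 1
        return sequence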

Marques et al. (2007) tested the ability of musicians and non-musicians to identify pitch changes in unfamiliar languages. French-speaking participants were tested using Portuguese sentences. The final word of each sentence was either pronounced naturally or manipulated to deviate from the rest of the sentence; the deviations could be either small or large. EEG recordings were made to study the ability to process these deviations. The results were comparable between musicians and non-musicians for the large deviations. However, musicians were faster and more accurate than non-musicians in recognizing the small deviations. A study by Magne, Schön and Besson (2006) found these cross-domain training effects to be already present in young children. They tested 8-year-olds on their ability to distinguish between natural and manipulated target words using EEG recordings, comparable with the method used by Marques et al. (2007). Half of the children followed extracurricular music lessons while the other half took part in other types of extracurricular activities. The authors found that the children who received musical training already showed a faster reaction to the manipulated deviating target words than children without musical training. To test whether familiarity with the stimuli has any effect on pitch processing, Brattico, Näätänen and Tervaniemi (2002) designed an EEG study using three types of stimuli, again with both musicians and non-musicians as participants. The three stimulus types were categorized as familiar, unfamiliar and no-context. An oddball design was used and random infrequent deviations were added to the three stimulus conditions. The familiar stimuli were made up of tones belonging to the Western A major scale; in the deviant variation of this condition, the frequency of the third scale degree was changed such that the major scale became a minor scale. The unfamiliar condition consisted of tones of arithmetically determined frequencies, chosen so as not to sound like any Western musical scale; for the deviant form, the frequency of the same third position was changed in a similar fashion. The no-context condition consisted of a single sine tone and the deviation was a similar frequency change as in the other two conditions. Again it was found that musicians reacted faster to deviant conditions than non-musicians. However, both groups were sensitive to the differences between the stimuli: the deviant conditions all elicited mismatch negativity components, but of different amplitudes. The most and least negative MMN components were found in the familiar condition and in the no-context condition respectively. The authors hypothesized that this was due to higher levels of attention for stimuli of the familiar condition compared to the other two, and of the unfamiliar condition compared to the no-context condition.

We have seen an effect of music training on pitch processing, but does music training influence pitch contour processing as well? Fujioka et al. (2004) designed an MEG experiment to compare musicians and non-musicians on their ability to process pitch contours and pitch intervals. An oddball paradigm was used to compare the participants' reactions to the deviant stimuli. The authors found larger MMNm components (the MEG counterpart of the MMN component in EEG recordings) in musicians for both pitch contour deviant and pitch interval deviant stimuli compared to non-musicians. For the musicians the MMNm components for pitch interval deviant stimuli were larger than for pitch contour deviant stimuli. This was not the case for the non-musicians, for whom the MMNm was comparable for the two types of deviant stimuli.

A study by Thompson, Schellenberg and Husain (2004) shows that the enhancements due to the cross-domain training effect can actually be beneficial in everyday use of language. In English, like other non-tonal languages, pitch and pitch contour in speech are used to communicate information beyond the meaning of the words. This information is often of an emotional nature but can also be used, for example, to indicate a question. Prosody is the term for speech aspects that carry meaning independently of the meaning of the words. Thompson, Schellenberg and Husain (2004) compared musicians and non-musicians on their ability to identify the emotional prosody with which a sentence was uttered, and found that musicians were better at this than non-musicians.


2.2.2 Tonal Language Experience

Studying the cross-domain training effects of linguistic pitch processing on other domains is problematic because functionally everyone is an expert language user, be it in one or in several languages or dialects. However, tonal languages provide the opportunity to study cross-domain training effects of language expertise on pitch processing. In a tonal language meaning is partly determined by pitch contour (Yip, 2002); words that consist of the same syllables can have different meanings depending on the pitch contour with which they are pronounced. An experiment by Krishnan, Gandour and Bidelman (2010) demonstrates how these differences affect pitch processing. Tonal and non-tonal language speaking participants listened to Mandarin and Thai words while EEG recordings were made. The authors calculated the FFR (frequency-following response), an electrophysiological measure of auditory brainstem activity, and found that the FFRs of tonal language speakers followed pitch contour information more accurately than those of non-tonal language speakers. Auditory brainstem processing is an early stage of sound processing; this study thus demonstrates that early sound processing is already influenced by language experience.

Bidelman, Gandour and Krishnan (2010) further tested the pitch processing enhancements present in tonal language speakers compared to non-tonal language speakers. In this study they compared Mandarin Chinese speakers with English-speaking musicians and non-musicians. They used IRN to represent a musical interval and a lexical tone, and they used the FFR to test pitch-tracking accuracy. Both tonal language speakers and musicians outperformed non-musicians. The authors also found differences between the musicians and the tonal language speakers: tonal language speakers outperformed the musicians in sections with rapidly changing pitches, while musicians showed more robust neural representation of pitch than tonal language speakers. However, both musicians and tonal language speakers showed enhancements for both the musical and the lexical style IRN stimuli compared to the non-musicians. This suggests that the cross-domain training effect between music and language is present in both directions but dependent on the type of experience.

To further study the difference between pitch processing in tonal language speakers and musicians, the authors conducted two follow-up studies (Bidelman et al., 2011a; Bidelman, Gandour, & Krishnan, 2011b) using the FFR. Bidelman et al. (2011b) describe an experiment testing gliding pitch processing in Mandarin Chinese speakers and English-speaking musicians. The stimuli used were based on gliding tone sequences common in Mandarin Chinese. They found enhancements in the FFR of musicians for scale tones present in the stimuli; this was not the case for the Mandarin Chinese speakers. These results suggest that musicians and tonal language speakers both have enhanced pitch processing compared to non-tonal language speaking non-musicians. However, due to their different experiences with pitch, musicians and tonal language speakers do not process pitch in entirely the same fashion. Musicians are trained to process pitch according to fixed musical scales, while tonal language speakers are trained to process pitch contour directions in gliding intervals. These findings imply that experience with music and experience with tonal language influence different pitch processing mechanisms. The authors also tested the reaction of the same participants to tuned and detuned musical chords (Bidelman et al., 2011a). Both electrophysiological and behavioral measures were taken. They found enhancements in the FFR for processing musical chords in both musicians and tonal language speakers. However, musicians convincingly outperformed both the tonal language speakers and the non-musicians in the behavioral tasks. The authors' interpretation is that the Mandarin Chinese speakers have enhanced pitch perception at the sensory level but do not show enhancement at the cognitive level because the information is not behaviorally relevant to them.

(14)

14

Does a combination of musicianship and speaking a tonal language enhance pitch processing even further? Tang et al. (2016) compared Mandarin Chinese-speaking musicians and non-musicians on their ability to process pitch. They used an oddball design and collected both behavioral results and EEG recordings. The authors found increased MMN amplitudes and faster reactions to the deviant tones for the musician group compared to the non-musician group. The results show that even though both groups have enhanced pitch processing due to tonal language experience, musicianship strengthens this ability even further. This is not unexpected, as we have seen that the enhancements resulting from speaking a tonal language and from music training are not the same; they can thus complement each other. Unfortunately, there are no studies comparing musicians with a non-tonal language background with musicians with a tonal language background.

A broader experiment by Bidelman, Hutka and Moreno (2013) tested Cantonese-speaking individuals and English-speaking musicians and non-musicians not just on pitch processing skills but also on fluid intelligence and working memory. Again they found that musicianship and tonal language experience positively affect pitch processing. They also found enhancements of working memory in both Cantonese speakers and musicians. Apparently music and tonal language experience also affect other cognitive domains.

The last study discussed in this section tested the ability of tonal language speakers to identify pitch contour in both linguistic and musical stimuli (Stevens et al., 2011). In this behavioral experiment the authors tested speakers of Thai (a tonal language) and speakers of English on their reaction time and accuracy while determining the direction of the pitch contour of Thai and English spoken stimuli and of musical stimuli. The Thai-speaking participants outperformed the English-speaking participants on both accuracy and reaction time.

2.2.3 O.P.E.R.A. Hypothesis

To explain the cross-domain training effects, Patel proposed the O.P.E.R.A. hypothesis (overlap, precision, emotion, repetition, attention; Patel, 2011, 2012, 2014). Patel argues that the cross-domain training effect is due to overlap of the neural mechanisms responsible for processing music and language. If a skill in one domain is trained with precision and attention, and if the training is repeated often enough and with positive emotions, the skill can positively enhance a comparable skill in the other domain. The hypothesis also states that even though the neural mechanisms that process music and language overlap, the neural representations of the two do not. This might help explain cases of amusia without apparent language problems: it is the processing mechanisms that overlap, not the notion and representation of music and language.

The O.P.E.R.A. hypothesis supports the idea that the processing mechanisms involved in processing music and language are specific to aspects of the sound signal rather than to music or language as such. It also lends this idea additional leverage by separating stimulus-specific processing mechanisms from domain-specific neural representations.

2.3 Neural Activation Patterns

Neuroimaging techniques make it possible to compare neural activation patterns directly. This section discusses both EEG and fMRI studies in which reactions to linguistic and musical stimuli were compared.

Patel et al. (1998) designed an EEG experiment to compare the neural reactions of participants to syntactical errors in musical and in linguistic stimuli. The authors focused especially on the P600 component, an EEG component associated with syntactical errors in language. They found statistically indistinguishable P600 amplitudes and scalp distributions for the syntactical errors in musical and linguistic stimuli. These results provide a strong argument for overlap in music and language processing. However, the authors also identified a music-specific negative EEG component between 300 and 400 ms. This unexpected negativity was found to originate from the right anterior-temporal lobe and was therefore named RATN (right anterior-temporal negativity). The RATN bears similarities to the LAN component (left anterior negativity), an EEG component associated with linguistic grammatical processing. The LAN is, as the name suggests, evoked in the left hemisphere, in contrast to the RATN. These findings point to a similar process that has a hemisphere bias for linguistic and musical stimuli.

Another EEG experiment, which focused not on syntactic but on semantic processing, was done by Koelsch et al. (2004). They primed target words with either a spoken sentence or a music excerpt; the meaning of the sentence or music excerpt was either congruent or incongruent with the target word. The target word was presented visually. Whether stimulus combinations were congruent or incongruent was determined in a pre-study. The authors studied the N400 component elicited by the target word; the N400 is most often linked to semantic violations in language (Kutas & Federmeier, 2011). They found that both incongruent linguistic and incongruent musical primes produced a more negative N400 than congruent primes. No differences were found in latency, amplitude or scalp distribution between the N400 components resulting from incongruent musical and incongruent linguistic primes. In a follow-up study, Steinbeis and Koelsch (2008) further investigated the differences between semantic processing of music and language. They tested the semantic priming effects of music on language and of language on music using both EEG and fMRI. As stimuli they used consonant or dissonant chords and spoken words. They again found that an incongruent musical prime elicits a strong N400 to the target word, and they found the same for incongruent linguistic primes on target chords. However, the fMRI data showed both overlap and significant differences between the areas of activation for the linguistic and the musical targets: the middle temporal gyrus was activated for target words while the right anterior superior temporal sulcus was activated for target chords. These results suggest that though semantic aspects of music and language are processed in a similar, partly overlapping way, they are not processed by entirely the same neural mechanisms.

Levitin and Menon (2003) performed an fMRI experiment to compare neural activation patterns for classical music excerpts and scrambled versions of these excerpts. The purpose of the experiment was to define cortical areas involved in processing structure in music. They did not find any difference in activation patterns between the stimuli, but they did find activation in areas associated with linguistic processing, especially in Brodmann area 47. Brodmann area 47 is part of the left inferior frontal cortex and had been hypothesized to process structure in spoken and signed language. The findings of Levitin and Menon (2003), however, suggest that Brodmann area 47 is involved in structure processing more generally rather than in language-specific structure processing. A study directly comparing fMRI recordings for music and language stimuli was done by Rogalsky et al. (2011). As stimuli they used sentences, scrambled sentences and short melodies. They found large areas of overlap between the activation patterns for the three types of stimuli. However, distinct differences between the musical and the linguistic stimuli were found as well: sentences elicited ventrolateral activation patterns while the melodies evoked dorsomedial activation patterns extending into the parietal lobe. Interestingly, the authors also found distinguishable activation patterns within the areas of overlapping activation, suggesting that even though the same neural mechanisms are involved, the actual processes still differ.

2.4 Simultaneous processing

In the EEG experiment that is the subject of the second part of this thesis, musical and linguistic stimuli are presented simultaneously. This methodology is not often used, but this section describes three studies that have presented musical and linguistic stimuli simultaneously.

Reineke (1981) had both musicians and non-musicians listen to auditory sequences of either digits or tones. Three tasks were designed to test processing differences between musicians and non-musicians. The first two tasks were of similar design, except that one used digit sequences and the other musical sequences. Two different sequences were presented simultaneously to the left and right ear respectively, after which a third sequence was provided; participants had to judge whether the third sequence was equal to one of the two previously heard sequences. During the third task participants were presented with a digit sequence and a musical sequence simultaneously, again divided over the two ears. The third sequence, provided after the initial two, could be either a digit or a musical sequence, and again participants were asked to judge whether this sequence was the same as one of the previously heard sequences. Musicians outperformed non-musicians on the musical sequence tasks; the groups performed comparably on the digit sequence task. However, the third task was easier for both groups. Simultaneously presented linguistic and musical sequences apparently elicit less processing interference than simultaneously presented sequences of just musical or just linguistic stimuli. These results might be interpreted as evidence against the theory that there is overlap between music and language processing; however, it may also be that it is easier for the overlapping neural mechanisms to distinguish between the two types of sequences. Reineke (1981) also found a right-ear preference for the digit sequences for all participants and a left-ear preference for the musical stimuli for the musicians. This is in concordance with the hemisphere bias for music and language processing: the signal from the left ear is processed in the right hemisphere and vice versa.

The second study was designed by Koelsch et al. (2005). In this study participants were simultaneously exposed to auditory chord sequences and visually presented sentences. The last chord of each sequence could be syntactically regular or irregular, while the last word of each sentence could be syntactically correct or incorrect, or semantically expected or unexpected. Participants were asked to ignore the music and focus on reading the sentence. EEG recordings were made during the experiment. The music-syntactically irregular chords elicited ERAN components (early right anterior negativity) while the syntactically incorrect words elicited LAN components (left anterior negativity). When irregular chords and incorrect words were presented simultaneously, the LAN was reduced. The N400 component resulting from semantically unexpected words was not influenced by either regular or irregular chords. To test whether the found effect of irregular chords on incorrect words is due to overlap in syntactic processing or to a general effect of deviant stimuli, the authors designed a second experiment. In this experiment chords were replaced with single tones; the last tone of the sequence could be either a standard tone or a physically deviant tone, known to cause an MMN component in the EEG signal. Otherwise the experiment was identical to the previous one. No interaction was found between the deviating tones and the incorrect words. This indicates that the interaction between the neural mechanisms responsible for the ERAN and LAN components is specific to syntactic processing.

The third and last study to be described in this section is an EEG experiment that did not use ERPs but focused on oscillatory aspects of the EEG recording (Carrus, Koelsch, & Bhattacharya, 2011). The stimuli and procedure used were comparable to those of the previously described study by Koelsch et al. (2005). The authors focused on the low-frequency aspects of the EEG signal: delta and theta waveforms.

For musically irregular chords they found an early decrease in power in the theta waves and a late power increase in both delta and theta waves. For incorrect words they found the same late increase of power, but not the early decrease found for irregular chords. When irregular chords and incorrect words were presented simultaneously, the late power increase was diminished. For semantically unexpected words, the authors found a later increase in power, specifically in the delta and theta waves elicited in posterior regions. This increase was slightly diminished when the unexpected words were presented simultaneously with irregular chords. These results demonstrate an interaction between music syntax and language syntax processing, as was also found by Koelsch et al. (2005).

2.5 Summary and Conclusion

Three research methodologies, with corresponding studies, were described in this chapter. A stimulus-dependent training effect between music and language was found convincingly across the literature. Neuroimaging techniques show a clear, though not absolute, overlap between the activation patterns elicited by musical and linguistic stimuli. Finally, interactions between simultaneously presented musical and linguistic stimuli have been found; whether this interaction takes the form of interference or facilitation is unclear and possibly stimulus dependent.

It is argued that pitch in music and pitch in language are processed by overlapping neural mechanisms. The studies discussed in this chapter all demonstrate overlap and interaction between music and language processing. Pitch processing mechanisms are stimulus dependent in that they process certain specific aspects of the sound signal. These aspects are not necessarily present in music and in language in the same way, which is why we do find differences between music and language processing. The O.P.E.R.A. hypothesis states that the neural representations of language and music are separate but that they are indeed processed by the same resources.

Few studies have assessed the ability to process music and language simultaneously. Results indicate that, when presented simultaneously, music and language do interact, but whether this leads to interference or facilitation is unclear. The experiment described in part two of this thesis will further investigate the ability of the overlapping neural mechanisms to process the pitch contour present in both music and language simultaneously.


Part 2: The Experiment

Chapter 3: The Current Study

The current study investigates how we process pitch contour in music and in language when these are presented simultaneously. As discussed, simultaneous processing of music and language is not a methodology that is often used. The aim of the current study was to test whether the pitch contour of a music excerpt could influence the processing of the pitch contour of linguistic stimuli when the two are presented simultaneously.

The processing of pitch contour in language is especially important in tonal languages, where the meaning of a word depends on the pitch contour with which it is uttered (Yip, 2002). Mandarin Chinese, a tonal language, thus provides a natural starting point for this experiment. Pitches change over a continuous scale in Mandarin, much like a glissando, a glide from one note to another, in music. The simultaneously presented stimuli in the current experiment therefore consist of a Mandarin word and a glissando.

During the experiment, participants were exposed to different combinations of stimuli while an EEG recording was made. The stimuli consisted of a tonal melody ending on a glissando. Simultaneously with the glissando, a spoken Mandarin word, the target word, was presented. The direction of the glissando was either congruent or incongruent with the pitch contour of the target word. The target word itself was either pronounced normally or digitally manipulated to sound monotone, causing the word to become delexicalised.

EEG (electroencephalography) is a noninvasive neuroimaging technique with a high temporal and a low spatial resolution. Electrodes placed on the scalp measure the synaptic activity of neurons when enough neurons activate in synchrony (Berger, 1924). This means that EEG can determine when the brain reacts to something and what the signal sent by the neurons looks like. However, a single EEG measurement of a single event gives little meaningful information: EEG is very sensitive to noise, since it measures not just the neural reaction to the experimental stimuli but all neural activity in the brain that is strong enough to be measured. The solution is to include as many similar events in the experiment as possible and then average out the noise. The result is called an ERP (event-related potential; Bressler, 2002). An ERP is visualized as a waveform and consists of several components, which are basically the positive and negative peaks of the wave (Bressler, 2002).
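A minimal sketch of this averaging logic (in Python with NumPy; the trial, channel and sample counts are arbitrary illustrative values, and random numbers stand in for recorded data):

    import numpy as np

    # Single-trial EEG segments, each time-locked to stimulus onset:
    # a trials x channels x samples array (random data as a stand-in).
    rng = np.random.default_rng(0)
    n_trials, n_channels, n_samples = 120, 64, 512
    epochs = rng.standard_normal((n_trials, n_channels, n_samples))

    # Averaging across trials cancels activity that is not time-locked
    # to the event; what remains is the event-related potential (ERP).
    # The signal-to-noise ratio grows roughly with sqrt(n_trials).
    erp = epochs.mean(axis=0)    # channels x samples waveform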

A substantial number of studies have established associations between components and specific stimuli, events or behaviors (Bressler, 2002; Kutas & Federmeier, 2011). Examples of identified EEG components are the mismatch negativity (MMN), which is associated with deviant sounds (Näätänen et al., 2004), and the P600, which is associated with grammatical and other syntactical errors in language (Patel et al., 1998).

The experiment was designed to test the influence of the pitch contour of a melody on the processing of the pitch contour of a spoken word. The EEG component that will be compared between the events is the N400. The N400 is associated with semantic violations in language (Kutas & Federmeier, 2011). When semantic expectations are violated, the N400 component tends to be more negative compared to when the expectations are fulfilled. This has also been found for semantic violations in music (Koelsch et al., 2004).

If there is an influence of music on language when processed simultaneously, as hypothesized, a significant difference between the N400 for congruent and for incongruent stimuli is expected. When the N400 is similar for both types of stimuli, it is likely that there is no influence of the musical stimuli's pitch contour on the pitch contour processing of language. The monotone target word condition is used for the second hypothesis, which predicts that the pitch contour of the musical stimuli will provide pitch contour information for the processing of the flat pitch contour linguistic stimuli. This would be


Chapter 4: Method

4.1 Participants

The experiment was completed by 21 participants, all native speakers of Mandarin Chinese; the results of the first 4 participants were not included in the overall results because of an error in the experiment. 17 participants completed the corrected version of the experiment (7 males, 10 females, average age 25.3, SD = 3.6). All participants were living in the Netherlands at the time of the experiment and spoke at least English as a secondary language. The participants filled in a short questionnaire, compiled for this study, to assess their musical background. None of them had received any formal music training in the past five years. The participants received a compensation of 30 euro for participating in the experiment. Participants provided formal written consent before the start of the experiment. The ethics committee of the Faculty of Humanities of the University of Amsterdam approved the study.

4.2 Materials

The linguistic stimuli were Mandarin words. 36 pairs of disyllabic nouns were chosen; the words in each pair differed only in the pitch contour of the second syllable. The selected words were also matched on frequency of use, compared using the SUBTLEX-CH database (all ps > .24; Cai & Brysbaert, 2010). A female native speaker was recorded while speaking the words. The computer software Praat (Boersma & Weenink, 2017) was used to create the monotone manipulated stimuli. The written characters of the Mandarin words were also used as visual stimuli during the experiment.

The musical stimuli consisted of four different short melodies synthesized in MIDI format. The melodies were made up of 8 notes and all four melodies were taken from a diatonic scale. The stimuli were constructed such that the last tone of the melody was presented at the same time as the last syllable of the target word. The last tone could be a rising or falling glissando, a continuous glide from one pitch to another. FluidSynth software (Henningsson, Green, & Lopez-Cabanillas, 2017) was used to make the melodies sound like a flute.
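The actual stimuli were MIDI melodies rendered with FluidSynth; purely as an illustration of what a rising or falling glissando is as a signal, the sketch below (Python with NumPy; the frequencies and duration are arbitrary assumptions) synthesizes a continuous exponential pitch glide:

    import numpy as np

    def glissando(f_start, f_end, duration_s=0.6, fs=44100):
        """Continuous pitch glide: the instantaneous frequency moves
        exponentially from f_start to f_end (a constant change in
        musical pitch per unit time); the phase is its running integral."""
        t = np.arange(0, duration_s, 1 / fs)
        freq = f_start * (f_end / f_start) ** (t / duration_s)
        phase = 2 * np.pi * np.cumsum(freq) / fs
        return np.sin(phase)

    rising = glissando(330.0, 440.0)     # e.g. paired with a rising-tone word
    falling = glissando(440.0, 330.0)    # e.g. paired with a falling-tone word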

4.3 Procedure

Upon entering the lab, participants were asked to fill in the music background questionnaire. After completing it, the participants started the experiment in a soundproof room. The visual stimuli were presented in white characters on a black 15-inch computer screen; the audio stimuli were presented through loudspeakers. Participants were instructed to move as little as possible during the experiment. The participants were unaware of the goal of the experiment.

During the experiment participants saw a Mandarin Chinese character pictured on the computer screen, after which the melody and target word were presented. Subsequently the participants were asked to judge whether the spoken target word was the same as the character shown on the screen. The answer was given using a keyboard. Participants thus had to focus on the target word rather than the melody. The experiment took about 120 minutes. In between the experimental blocks, the participants were allowed to take a break.

Each trial was preceded by an asterisk on the screen. 500 ms after the onset of the asterisk, the Mandarin character was displayed for 500 ms. Then the audio stimuli were presented while the screen was set to black: the melody lasted 3,000 ms and was followed by another 1,000 ms interval. Participants then had 2,000 ms to answer whether the on-screen character was the same as the spoken target word. Between trials there was a 1,000 ms interlude.
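The presentation software is not named here; purely as an illustration of the trial timeline just described, the following is a minimal sketch in PsychoPy, in which the window setup, key mapping and file names are assumptions.

```python
# Trial-timeline sketch (presentation software, key mapping and file names assumed).
# Requires: pip install psychopy
from psychopy import core, event, sound, visual

win = visual.Window(color='black', fullscr=True)
fixation = visual.TextStim(win, text='*', color='white')
character = visual.TextStim(win, text='...', color='white')  # Mandarin character of this trial
audio = sound.Sound('trial_audio.wav')  # melody mixed with the spoken target word

fixation.draw(); win.flip(); core.wait(0.5)    # asterisk, 500 ms
character.draw(); win.flip(); core.wait(0.5)   # character, 500 ms
win.flip()                                     # black screen during audio
audio.play(); core.wait(4.0)                   # 3,000 ms melody + 1,000 ms interval
keys = event.waitKeys(maxWait=2.0, keyList=['y', 'n'])  # same-word judgment, 2,000 ms
core.wait(1.0)                                 # 1,000 ms inter-trial interlude
```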


4.4 EEG Acquisition

For the EEG recording, 64 Ag–AgCl electrodes were used, distributed over the scalp according to the international 10-20 system (Homan, Herman, & Purdy, 1987). A Biosemi ActiveTwo AD-box with a band pass of 0.1 – 100 Hz was used to amplify the EEG signal (Brooks & List, 2006). To control for eye movements, four electrodes were attached around the eyes: two on the outer canthi of both eyes, and two on the infra- and supraorbital areas of the right eye. Two additional electrodes were placed on the left and right mastoids to function as a reference.

4.5 Pre-processing

For the initial data analyses, the EEGLAB software for Matlab was used. The data were digitized at 512 Hz. The electrodes on the left and right mastoids were set as reference, and the data were filtered with a band pass of .01 – 30 Hz. Events with high levels of noise were manually removed for every participant, as were channels with high noise interference; the results of one participant were left out entirely because of high noise levels. After filtering, an independent component analysis (ICA) was performed to remove eye blinks and other muscular artifacts. Deleted channels were replaced by the average of neighboring channels. ERPs for every experimental condition were calculated from these data using ERPLAB.
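Preprocessing was done in EEGLAB and ERPLAB; purely as an illustration of the pipeline described above, the sketch below shows equivalent steps in MNE-Python. The file name, mastoid channel labels, excluded components and bad channel are assumptions, and MNE interpolates bad channels with spherical splines rather than a plain neighbor average.

```python
# Preprocessing sketch in MNE-Python (illustrative equivalent of the EEGLAB pipeline;
# file, channel and component choices are assumptions).
# Requires: pip install mne
import mne

raw = mne.io.read_raw_bdf('subject01.bdf', preload=True)  # BioSemi recording
raw.set_eeg_reference(['M1', 'M2'])      # re-reference to the mastoid electrodes
raw.filter(l_freq=0.01, h_freq=30.0)     # 0.01-30 Hz band pass

# ICA to remove eye blinks and other muscular artifacts.
ica = mne.preprocessing.ICA(n_components=20, random_state=42)
ica.fit(raw)
ica.exclude = [0, 1]                     # components marked as artifacts after inspection
ica.apply(raw)

# Interpolate channels that were removed because of noise.
raw.info['bads'] = ['T7']                # hypothetical noisy channel
raw.interpolate_bads()

# Epoch around target-word onset and average per condition to obtain the ERPs.
events = mne.find_events(raw)
epochs = mne.Epochs(raw, events, tmin=-0.2, tmax=0.8, baseline=(None, 0))
erp = epochs.average()
```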

4.6 Statistical Analyses

The average amplitude of the ERP in the N400 time window (350 ms - 450 ms) was extracted and further processed in R. The ERPs were plotted to visualize the data and the direction of the differences. Due to differences between the rising and falling pitch contour stimuli, visible in Figure 1 and Figure 2, the data were separated into a falling pitch contour group and a rising pitch contour group, divided according to the pitch contour of the linguistic stimuli. The electrodes were divided into nine clusters according to their location on the scalp, determined by three levels of laterality (left side, midline and right side) and three levels of caudality (anterior, central, posterior). The nine clusters are named as follows: AL (anterior left), AM (anterior midline), AR (anterior right), CL (central left), CM (central midline), CR (central right), PL (posterior left), PM (posterior midline) and PR (posterior right).
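The actual electrode-to-cluster assignment is not listed here; the mapping below is a hypothetical illustration of the laterality-by-caudality scheme using standard 10-20 labels, together with a helper that averages the N400-window amplitude within one cluster.

```python
# Hypothetical electrode-to-cluster mapping (the actual assignment is not listed;
# these 10-20 labels are illustrative only).
CLUSTERS = {
    'AL': ['F7', 'F3', 'FC5'], 'AM': ['Fz', 'FCz'], 'AR': ['F8', 'F4', 'FC6'],
    'CL': ['T7', 'C3', 'CP5'], 'CM': ['Cz', 'CPz'], 'CR': ['T8', 'C4', 'CP6'],
    'PL': ['P7', 'P3', 'O1'],  'PM': ['Pz', 'Oz'],  'PR': ['P8', 'P4', 'O2'],
}

def cluster_mean(amplitudes, cluster):
    """Average the 350-450 ms mean amplitude over the electrodes of one cluster.

    `amplitudes` maps electrode labels to their mean amplitude in microvolts.
    """
    channels = CLUSTERS[cluster]
    return sum(amplitudes[ch] for ch in channels) / len(channels)
```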

Two series of three-way repeated measures ANOVAs, each consisting of two ANOVAs, were performed to compare the different events. The first series compared the main effect of the congruent and incongruent stimuli and included the factors laterality and caudality. This created a three-factor ANOVA in which the first factor has two levels (congruent, incongruent), the second has three levels (left side, midline, right side) and the third has three levels as well (anterior, central, posterior). This ANOVA was performed twice, once for the rising target word conditions and once for the falling target word conditions. The second series of ANOVAs added the flat word condition to the former design, creating a three-factor ANOVA in which the first factor now has three levels (congruent, incongruent, flat word) while the other two factors remain unchanged. Again, this ANOVA was performed twice, once for each pitch contour of the target word. The pitch contours of the melodies in the flat word condition were matched with the pitch contours of the target words in the other two conditions.
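The ANOVAs were run in R; as an illustration of the same condition-by-laterality-by-caudality design, here is a sketch using the AnovaRM class from the Python statsmodels package. It assumes a long-format table, with a hypothetical file name, holding one mean amplitude per participant and cell. Note that AnovaRM does not apply the Greenhouse-Geisser correction reported in Chapter 5; that correction would have to be computed separately.

```python
# Repeated measures ANOVA sketch (the thesis used R; this shows the same
# condition x laterality x caudality design with statsmodels).
# Requires: pip install pandas statsmodels
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Assumed long format: one row per participant x condition x laterality x caudality,
# with the mean 350-450 ms amplitude in the column 'amplitude'.
df = pd.read_csv('n400_rising_targets.csv')  # hypothetical file

model = AnovaRM(
    df,
    depvar='amplitude',
    subject='participant',
    within=['condition', 'laterality', 'caudality'],
    aggregate_func='mean',
)
print(model.fit())  # F, degrees of freedom and p per main effect and interaction
```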


Chapter 5: Results

The ERPs elicited by the different stimulus conditions are plotted in Figures 1 and 2. An unexpected difference between the pitch contour directions is notable in these figures: the ERPs elicited by the falling pitch contour stimuli tend to be more positive overall than those elicited by the rising pitch contour stimuli. The ERPs in the flat word conditions do not seem to differ.

The first series of three-way repeated measures ANOVAs found no significant effects for the falling target word events, but did find a significant interaction effect between condition and laterality for the rising target word events. Mauchly's test indicated that the assumption of sphericity was violated (W = 0.65, p = 0.05); the degrees of freedom were therefore corrected using Greenhouse-Geisser estimates of sphericity: F(1.48, 22.23) = 6.73, p = 0.01, ƞ² = 0.01. A Bonferroni post hoc test was used to further examine this interaction and revealed a significant difference, p = 0.01, between the incongruent left side and incongruent midline conditions. Figure 1 shows that the N400 is more negative for the incongruent midline condition.

The second series of three-way repeated measures ANOVAs found two significant main effects and one significant interaction effect. For the rising target word group there was a significant interaction effect between condition and laterality and a significant main effect of caudality. The F statistics for the interaction effect were F(4, 60) = 5.28, p = 0.02, ƞ² = 0.02. A Bonferroni post hoc test indicated significant differences between the incongruent left side and incongruent midline conditions, p = 0.03, and between the flat word midline and incongruent left side conditions, p = 0.04. As stated, the N400 for the incongruent midline condition is more negative than the N400 for the incongruent left side condition; Figure 1 also shows that the N400 for the flat word midline condition is more negative than for the incongruent left side condition. For the main effect of caudality, Mauchly's test indicated a violation of sphericity (W = 0.59, p = 0.03). After the degrees of freedom were corrected using Greenhouse-Geisser estimates, the result was F(1.42, 21.30) = 6.48, p = 0.01, ƞ² = 0.03. A Bonferroni post hoc test found significant differences between posterior and anterior, p < 0.001, and between posterior and central, p < 0.001; Figure 1 shows a more negative N400 for both the anterior and the central levels than for the posterior level.

For the falling target word group, a significant main effect of condition was found. Mauchly's test was violated here as well (W = 0.51, p = 0.01); after correction of the degrees of freedom, the F statistics were F(1.34, 20.17) = 5.81, p = 0.02, ƞ² = 0.13. A Bonferroni post hoc test found significant differences between the congruent and flat word conditions, p < 0.001, and between the incongruent and flat word conditions, p < 0.001. Figure 2 shows that the N400 for the flat word conditions is more negative than for both the congruent and incongruent conditions.
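As an illustration of the sphericity check and Greenhouse-Geisser correction applied above, here is a sketch using the Python pingouin package (the actual analyses were run in R). pingouin's rm_anova supports at most two within-subject factors, so only the condition-by-laterality part of the design is shown; the data file is hypothetical.

```python
# Sphericity check and Greenhouse-Geisser correction sketch (illustrative only;
# pingouin handles at most two within factors, so the design is reduced here).
# Requires: pip install pandas pingouin
import pandas as pd
import pingouin as pg

df = pd.read_csv('n400_rising_targets.csv')  # hypothetical long-format file

# Mauchly's test of sphericity for the laterality factor.
spher = pg.sphericity(df, dv='amplitude', subject='participant', within='laterality')
print(spher.W, spher.pval)

# Repeated measures ANOVA with GG-corrected p-values where sphericity is violated.
aov = pg.rm_anova(data=df, dv='amplitude', within=['condition', 'laterality'],
                  subject='participant', correction=True)
print(aov[['Source', 'F', 'p-unc', 'p-GG-corr']])
```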

The complete results of the two series of three-way repeated measures ANOVAs are given in Tables 1, 2, 3 and 4.


Chapter 6: Discussion

6.1 The Experiment

The goal of the current experiment was to better understand the way neural mechanisms process pitch contour. More precisely, the experiment was designed to test how these mechanisms simultaneously process pitch contour in music and language stimuli, and whether the pitch contour of the musical stimuli influences the processing of the linguistic stimuli.

Native speakers of Mandarin were asked to listen to Mandarin target words while a melody was played simultaneously; the pitch contour of the melody was either congruent or incongruent with the pitch contour of the target word. An extra test condition was created by digitally manipulating the target word to sound monotone. The analyses focused on the N400, an ERP component associated with unexpected events, particularly in language (Kutas & Federmeier, 2011). The first hypothesis predicted that if the pitch contour of the musical stimuli influences the processing of the linguistic stimuli, the N400 will be stronger for incongruent than for congruent stimuli. The second hypothesis predicted a similar N400 for the monotone condition as for the congruent condition, which would mean that the pitch contour of the melody can replace the missing pitch contour of the target word.

The analysis of the collected data yielded some interesting effects. However, the hypothesized difference between congruent and incongruent stimuli was not convincingly found. A difference between conditions was found for the falling target words, but further testing showed that this difference lay between the flat word condition and the other two conditions, not between the congruent and incongruent conditions. These results probably reflect not the difference between conditions but the difference between pitch contours, which I will discuss in the next section. The only possible indications of a difference were the interaction effects between condition and laterality found for the rising target words, both with and without the inclusion of the flat word condition. A subsequent analysis of these findings indicates that the midline level of laterality is the source of the N400 for the incongruent stimuli. This might be different for the congruent stimuli, although no possible origin for the N400 elicited by congruent stimuli was found, which makes it impossible to claim any difference between congruent and incongruent stimuli. Regarding the second hypothesis, a comparable N400 for the flat word condition and the congruent condition was found in the rising target word group, though this N400 was also not significantly different from the incongruent condition. In the falling target word group, the flat word condition differed significantly from both the congruent and incongruent conditions. This leads to the conclusion that the second hypothesis was not confirmed by the data either.

The results do indicate that the N400 elicited by rising linguistic pitch contours is largest over midline and anterior electrode sites.

6.2 Serendipity

While analyzing the results of the EEG experiment, an unexpected phenomenon arose from the data: a large asymmetry between the ERPs elicited by the rising and the falling pitch contours. From about 100 ms after the onset of the target word, the ERPs elicited by the falling pitch contour stimuli became more positive and the ERPs elicited by the rising pitch contour stimuli became more negative. Interestingly, the asymmetry appears to be evoked by the pitch contour of the target word and not by the pitch contour of the melody: the deviation depends on the pitch contour direction of the target word, not on that of the melody. The response to the flat word stimuli behaves similarly to the rising target word stimuli, again without being affected by the direction of the pitch contour of the melody. This strongly implies that the effect is language specific, or perhaps tonal language specific. Alternatively, it could be a consequence of the participants being asked to focus on the target word and ignore the melody.

It is arguable that separate processing pathways for different pitch contour directions would make pitch contour processing less demanding. This would especially benefit the processing of pitch contour in language, since the use of pitch in language is typically smaller in range and more continuous than in music. This makes the pitch contours smaller and possibly more demanding to process.

6.3 Future Research

The hypotheses were not confirmed: no indication of any influence of music on language processing was found, nor any confirmation that the pitch contour of music can replace a missing pitch contour in language. However, more research is needed before these hypotheses can be disregarded completely. The current study compared the results in the time frame of the N400, which is just one EEG component that might provide evidence for the influence of music on language processing. Another possible candidate is the P600, which is associated with grammatical and other syntactic errors in language (Patel et al., 1998). It is possible that the incongruent combinations of stimuli elicit a syntactic error rather than a semantic error.

Alterations to the methodology might also yield more insight in future research. In the current study, participants were instructed to focus on the linguistic stimuli, in accordance with the hypotheses. However, it is possible that interaction effects are enhanced when the participant is not asked to focus on either type of stimulus. Asking the participant to focus on the musical stimuli instead is another possibility for further studying the simultaneous processing of music and language.

The linguistic stimuli in the current study consisted of single target words, and expectancy was created by the on-screen character presented at the beginning of each trial; using full sentences in future research could create stronger expectancy, which might elicit a more robust N400 component (Kutas & Federmeier, 2011). Another way to alter the linguistic stimuli is by using more than two pitch contour types; Mandarin, for example, uses four pitch contours (Yip, 2002). However, even though using multiple pitch contours might make the experiment more representative of real-world listening, it also complicates the experimental design to a great extent, which is important to keep in mind when devising such an experiment.

The unexpected asymmetry between the processing of rising and falling pitch contours needs to be studied more extensively before any claims can be made about its cause. I have mentioned possible explanations for this asymmetry; to determine the most viable one, a series of tests is proposed. Firstly, the results need to be reproduced. To determine whether the asymmetry in processing is specific to tonal language speakers, a similar experiment needs to be performed with non-tonal language speakers. Next, it is important to test whether the asymmetry is language specific, general to pitch contour processing, or even a result of the simultaneous presentation of the stimuli; to this end, participants need to be tested in a similar setting with individually presented linguistic and musical stimuli. The current experiment already demonstrates that the processing asymmetry is not due to the musical pitch contour, since the asymmetry was determined by the pitch contour of the target word. If the proposed tests do find evidence for the processing asymmetry, there are six possible outcomes for its cause: (1) it is tonal language experience specific and language specific, (2) it is tonal language experience specific and general for pitch contour processing, (3) it is tonal language experience specific and simultaneous processing specific, (4) it does not depend on tonal language experience and is language specific, (5) it does not depend on tonal language experience and is simultaneous processing specific, or (6) it does not depend on tonal language experience and is general for pitch contour processing.

Finally, it is important to determine whether this separation in the processing of different pitch contours is innate or learned through experience. If the processing asymmetry were found to be specific to tonal language experience, that would already provide an argument for the learned account, and subsequently testing whether second language speakers of a tonal language show the same processing asymmetry would strengthen it further. If the prior experiments showed that the processing difference is in fact present in both tonal and non-tonal language speakers, it would be harder to determine whether it is learned or innate: the processing asymmetry would seem to be the default and thus innate, but it could very well turn out to be the result of early input as an infant. The best way to study this would be to examine how infants, as young as possible, process pitch contour in different stimuli. EEG studies are possible with infants but of course come with their own difficulties.

References
