
Are neural expressive phrasing mechanisms shared by music and language?

A review of neurophysiological evidence supporting the musilanguage model

A master thesis by Merwin E. Olthof, University of Amsterdam

Abstract

The evolutionary relationship between language and music is highly debated. The musilanguage model states that music and language originate from a common precursor that is neither music nor language. This review focuses on the potential existence of neural substrates for the referential processing of one of the key elements of such a musilanguage: expressive phrasing. After an introduction of the musilanguage model, the neural pathways for the processing of expressive phrasing in language and music are discussed. This process seems to involve at least the brainstem, the superior temporal sulcus (STS), the superior temporal gyrus (STG), the (right) inferior frontal gyrus (IFG) and the orbitofrontal cortex. The role of the amygdala in this process is unclear but is also discussed. Although the existence of shared neural substrates does not necessarily imply that these substrates were also present in a musilanguage, the present evidence does not falsify the existence of a musilanguage. This makes adopting this evolutionary model highly efficient in terms of theoretical economy.

Master programme: MSc Brain & Cognitive Sciences
Track: Cognitive Science
Date: 19th of August 2015
Student ID: 5974097
Supervisor: Joey Weidema, MA
Co-assessor: Prof. Dr. Henkjan Honing


Table of Contents

1. Introduction
1.1. The origins of music and language
1.2. Five competing theories about the origins of language and music
1.3. Implications of shared neural substrates for expressive phrasing mechanisms in music and language
2. Theoretical Framework
2.1. The musilanguage model
2.2. Dual-natured processing of vocal emotional meaning: subcortical and cortical emotive pitch processing
3. Basic pitch encoding in the brainstem
3.1. (Lexical) tone processing
3.2. Involvement of the brainstem in (lexical) tone processing
4. The referential pathway for the processing of vocal emotional stimuli
4.1. Emotional prosody processing: a theoretical framework
4.2. STS & STG involvement in the processing of an auditory stimulus
4.3. Involvement of inferior frontal gyrus
4.4. Involvement of orbitofrontal cortex
4.5. Involvement of amygdala
5. Discussion
5.1. Evidence for a musilanguage model
5.2. Future research

1. Introduction

1.1. The origins of music and language.

Although vocal emotional communication is a skill that is broadly shared in the animal world (Fitch & Züberbühler, 2013; Ackermann et al., 2014), humans are unique in the ability to vocally convey emotional states, and even arbitrary meanings, in a referential manner by means of music and vocal language. Both music and vocal language can therefore be regarded as vocal communication systems. Since both music and vocal language are temporally organized and reach our brains in the form of frequencies (McMullen & Saffran, 2004), which distinguishes them from other communication systems such as sign language, one might expect certain similarities between the two faculties. Indeed, some features of music are shared with vocal language, while others are music- or language-specific (Tallal & Gaab, 2006). These similarities and differences raise the question of the extent to which music and language are evolutionarily related.

At some point in evolutionary history, early hominid species must have gradually shifted from a very basic vocal emotional communication system to a more advanced system that gave them an adaptive advantage. Brown (2000) has argued that language and music have a common precursor, musilanguage, that was neither music nor language. This system could infer emotive meaning from combinations of tones, not just in a reflexive way but also in a referential manner. To identify which common features of music and language constitute such a musilanguage, and which ones have evolved in a homologous manner in later stages of evolution, is one of the major challenges of developing such a model. Brown has identified certain features of language and music as ancestral, which means they were already present in musilanguage. For now it is sufficient to say that this musilanguage was capable of conveying (emotional) meaning by expressively phrasing individual lexical tonal elements and combining them in a meaningful but non-hierarchical way.

This thesis critically reviews the neurophysiological evidence for a musilanguage available in the domains of music and language. It is likely that if music and language have a common precursor, we would find shared neural mechanisms for some aspects of music and language processing. I hypothesize that this is indeed the case. Specifically, I would expect to find evidence of a vocal emotional communication loop, a neural pathway that is both directly (acoustically), and indirectly (referentially) involved in the processing of emotional meaning of a sequence of lexical tones. Due to space limitations, in this review, I will focus on the referential pathway only.

If there is indeed any neural overlap in the referential emotive processing of music and language, this could favor a musilanguage model over other structural evolutionary models. Mapping these neural substrates would then be highly efficient in terms of theoretical economy (Juslin & Laukka, 2003). Knowledge of general vocal emotional communication mechanisms might even contribute to a better understanding of other types of emotional processing, for example in the visual domain (e.g. face perception). This in turn promotes interdisciplinary research. Interdisciplinary research might be more fruitful than a more traditional disciplinary approach, since the latter often greatly underestimates the complexity of (evolutionary) research questions by overlooking the involvement of other disciplines.

Emotion is an especially complex and interdisciplinary topic. Before talking about emotive processing mechanisms in music and language, it is therefore important to adopt a definition of emotion. In this review, for the sake of brevity, I adopt a rather general one. According to Scherer (2001), “emotion […] can be defined as a brief, delimited time period or episode in the life of an organism when there is an emergent pattern of synchronization between several components (and related bodily subsystems) in the preparation of adaptive responses to relevant events as defined by their behavioral meaning” (Scherer, 2001, p. 2). One way to evoke such adaptive responses is by using vocal emotional communication systems such as music or language. We know that music and language use the same emotional acoustic cues (Juslin & Laukka, 2003). Both music and vocal language are able to convey emotions by means of phrasing, intonation and various other attributes (Brown, 2000). Therefore it is likely that a musilanguage also conveyed emotions using these same features.

Because of the similarities between language and music, it is not surprising that their origin has intrigued scholars for a long time (e.g. Darwin, 1871). As a result, five influential models for the origins of music and language have been developed over the years (Brown, 2000), which will be introduced below. Before introducing these models, however, it is important to clarify that this thesis is about structural rather than functional evolutionary theories. Functional theories (and models) look at the why of an evolutionary trait: they consider possible adaptive roles for traits, for example in music. In contrast, structural models focus on the what. These models try to separate so-called shared ancestral properties, which were already present in a common precursor of language and music, from properties that are not shared or might be shared only as a result of analogous evolution (Brown, 2000). In all of the existing theories about the origins of music and language, these shared properties are interpreted differently.

1.2. Five competing theories about the origins of language and music.

According to Brown (2000), theories on the origins of music and language can be categorized into five types of evolutionary scenarios, each of which could account for the similarities between music and language. Two options regard one faculty as the outgrowth of the other: the music outgrowth model regards music as a possible outgrowth of language, and the language outgrowth model regards language as a possible outgrowth of music. Although at first sight both outgrowth models seem viable options that could account for the similarities between language and music, a pragmatic problem swiftly arises when one tries to find evidence for these types of models. The two outgrowth models focus on what is unique in music or language, rather than on the similarities. This results in, as Brown describes it, “endless semantic qualifications as to what constitutes an ancestral musical property versus what constitutes an ancestral linguistic property” (Brown, 2000, p. 277).

Another option is the parallel model. In this model music and language have evolved in a homologous but parallel manner, and it therefore assumes that there are no shared ancestral features. However, it is highly unlikely that the similarities that were mentioned have arisen purely by chance. There are simply too many similarities between music and language to back up such a claim. This is especially true if the exact same neural mechanisms are indeed used in music and language processing.

A fourth theory is mainly based on the slightly outdated hypothesis of hemispheric specialization, which states that for the majority of people language can be localized in the left hemisphere of the brain, whereas music can be localized in the right hemisphere. In this model, the binding model, the commonalities or slightly analogous features may well have evolved through continuous interaction between two discrete domains, music and language. Evidence for such lateralization mainly comes from lesion studies. For example, speaking abilities may be impaired while singing the exact same words is unaffected, which is often the case in aphasia (Yamadori et al., 1977). Likewise, in congenital amusia, the proper use of pitch in singing and pitch perception can be severely impaired, while musical properties of speech such as speech prosody are intact (Ayotte et al., 2002). However, in a more recent experiment, speech prosody processing has been found to be impaired in amusics (Patel et al., 2008). Thus, the evidence on hemispheric specialization for music and language is inconsistent, which makes the binding model unlikely.

Lastly, there is the common precursor model (e.g. Darwin, 1871; Brown, 2000). These musilanguage models hypothesize that there is a common precursor for music and language that is neither musical nor linguistic but can still be regarded as a complex vocal emotional communication system. In this paper I adopt Brown's version of the musilanguage model as a framework, as it has two advantages over the other models. Firstly, it allows us to look at similarities rather than differences between music and language. Secondly, unlike other models (e.g. Lerdahl & Jackendoff, 1983; Katz & Pesetsky, 2011), it is a mechanistic rather than a metaphorical model. Brown identifies four expressive phrasing mechanisms whose presence and neural substrates can be investigated in both music and language. In this review I will focus on the neural pathways for the processing of these expressive phrasing mechanisms.

1.3. Implications of shared neural substrates for expressive phrasing mechanisms in music and language.

It is important to note that the existence of shared neural substrates does not necessarily imply that these were also present in our hypothesized precursor. Perhaps both modalities have simply started to use the same domain-general neural mechanisms for different purposes. However, the fact that some of these properties share a neural mechanism does make the existence of a common precursor more likely, especially because both music and language can be regarded as vocal emotional communication systems.

This is the first review to investigate whether there is any neural basis for Brown's musilanguage model. The stages that together constitute Brown's model will serve as a framework for the thesis in a step-by-step fashion. In the following section I will first outline Brown's musilanguage model. Subsequently, in the third section, lexical tone encoding in the brainstem will be discussed, as it is a first step in the processing of vocal emotive meaning. The fourth section will outline the referential pathway for emotive tone processing in music and language. Lastly, the combined evidence is critically reviewed in the discussion.

2. Theoretical framework.

2.1. The musilanguage model.

When reviewing the evidence regarding a certain framework, it is important to first clarify the features of that framework. In essence, the musilanguage stage of the framework consists of a two-step evolutionary progression (see Figure 1). The first step is a hominid referential emotive vocalization system. This system is capable of processing the (emotional) meaning of sound in two ways: sound emotion and sound reference. The former allows certain acoustic patterns to be used to convey emotions, whereas in the latter, arbitrary sound patterns have a symbolic or emotional meaning. In this dual-natured musilanguage stage, hominids could not only perceive emotive meaning in sound in a fast, reflex-like way by assessing pitch qualities (Peretz et al., 2013), an ability which is hypothesized to be shared with nonhuman primates (Fitch, 2013); they could also derive meaning at a higher-order processing level, most likely more slowly and in a more attention-dependent way, which makes musilanguage different from primate or other mammalian vocal communication systems. Sound emotion might influence emotion perception in a bottom-up fashion via direct subcortical pathways, whereas sound reference does this in a top-down fashion by means of a cortical pathway. Referential meaning is achieved by analysis of discrete pitch vocalizations that have individual semantic and emotive meaning. If we look for the neural substrates for (lexical) tone processing in language and music, we might thus be able to find out whether this is indeed a possible feature of a common precursor of language and music. Evidence on this matter is discussed in section 3.

Figure 1. In essence, the musilanguage stage of the framework consists of a two-step evolutionary progression. The first is a (hominid) referential emotive vocalization system. This system is capable of processing (emotional) meaning from lexical tones in two ways: sound emotion and sound reference. A second evolutionary step completes the musilanguage stage as hypothesized by Brown (2000). In this step, combinatorial syntax and expressive phrasing abilities are added to the mix. Eventually, musilanguage (although this remains debatable) diverged into two distinct but interacting faculties that might still share some of the neural substrates that were also used in musilanguage. Figure taken from Brown (2000), p. 295.

A second evolutionary step completes the musilanguage model. In this step, both individual tonal elements and sequences of tonal elements can have lexical and emotional meaning; combinatorial syntax and expressive phrasing abilities are added. Combinatorial syntax allows the formation of simple combinations of the lexical-tonal elements that were introduced in step one. This combinatorial skill results in both local and global levels of meaning. On the local level, the relations between the individual elements may have semantic or emotive meaning. On the global level, a sequence of lexical tones may have meaning as a whole, in addition to the meanings of its individual elements. I will not go into combinatorial syntax any further, as it is beyond the scope of this review, in which I focus on lexical tone processing and expressive phrasing.

The second step in Brown's musilanguage model is completed by adding expressive phrasing abilities. Expressive phrasing is an emotional rather than a syntactic feature. It might therefore use an emotional referential (cortical) pathway to give a certain lexical-tonal element, or a sequence of such elements, extra (emotive) meaning. Like combinatorial syntax, it acts on both the local and the global level. On the local level, individual lexical-tonal elements can be stressed to give them additional emotive meaning or importance. This modification of individual elements is described as local sentic modulation. On the global level, melodic and rhythmic contour or intonation can convey emotive meaning. The general contour of a phrase, and the affective information it can contain, is called global sentic modulation. For example, by increasing fundamental frequency, tempo and volume, one might convey an angry message rather than a neutral one. Together, global and local sentic modulation make up affective prosody, which has been widely investigated in neurophysiological studies, some of which will be discussed later in this review.
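
To make the idea of global sentic modulation concrete, the sketch below manipulates exactly the three cues mentioned above (fundamental frequency, tempo and volume) on a recorded utterance. It is only an illustration of the acoustic operations involved, not a procedure used in any of the studies reviewed here; the file names, the amount of pitch shift and the other parameter values are arbitrary assumptions.

# Illustrative sketch: an "angrier" rendering of a neutral utterance by
# raising fundamental frequency, tempo and volume. File names and parameter
# values are hypothetical.
import librosa
import soundfile as sf

y, sr = librosa.load("neutral_utterance.wav", sr=None)    # hypothetical input

y_mod = librosa.effects.pitch_shift(y, sr=sr, n_steps=3)  # raise F0 by 3 semitones
y_mod = librosa.effects.time_stretch(y_mod, rate=1.15)    # speed up by 15%
y_mod = y_mod * 2.0                                       # roughly 6 dB louder

peak = float(max(abs(y_mod.max()), abs(y_mod.min())))     # avoid clipping
if peak > 1.0:
    y_mod = y_mod / peak

sf.write("angry_utterance.wav", y_mod, sr)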

In summary, musilanguage allows expressive phrasing by means of intonation of a sequence of lexical-tonal elements or by stressing individual elements in that sequence, resulting in the ability to convey (emotive) meaning on a global and a local scale. The pitch changes that allow this expressive phrasing can also be found in language and, to a certain degree, in music. If we look for the neural substrates for these features in language and music, we might be able to find out whether they could have been part of a musilanguage.

2.2. Dual-natured processing of vocal emotional meaning: subcortical and cortical emotive pitch processing.

One of the qualities of musilanguage, compared to vocal emotional communication systems in nonhuman animals, is that it allows direct acoustic as well as indirect referential processing of lexical tonal elements. In this regard, the musilanguage model fits nicely with the reflexive/referential (or associative) model of musical emotion processing developed by Peretz and colleagues (2011). Peretz and colleagues state that musical emotion processing is dual in nature, with bidirectional influences between the two processing routes. One route is a direct (automatic), subcortical way of processing acoustic sounds, in which attention does not play a role. The other route is a cortical one. Although it also starts at the auditory cortex, this route is (partly) dependent on attention and evaluates the emotional qualities of a musical stimulus. This review will not discuss the complete subcortical (direct) route, which is shared with nonhuman primates and other mammals (Fitch & Züberbühler, 2013). Rather, it focuses on the referential pathway, starting with basic pitch encoding in the brainstem.

3. Basic pitch encoding in the brainstem.

3.1. (Lexical) tone processing.

The processing of pitch variations can be regarded as a fundamental feature of both music and language (Alexander et al., 2005). In music, intonation, the stressing of certain notes, or a tempo change within a musical phrase can induce affective responses. In speech, on the other hand, fundamental frequency changes ('intonation') may contain more than just emotional information. Approximately two thirds of the world's languages are tone languages² (Yip, 2002), indicating that lexical information is often, but not always, enclosed in pitch variation. In tone languages, intonation can also change the lexical meaning of a word (e.g. Deutsch et al., 2004; Luo et al., 2006). This means that the same word can have a different semantic meaning depending on pitch height. This requires tone language speakers to be very sensitive to pitch changes, a skill that musicians also possess. Thus, in language, as in the hypothesized musilanguage, a tone can contain lexical information. The fact that music does not contain lexical information does not necessarily argue against a musilanguage model. Perhaps, in music, just as in non-tonal languages, the capacity is simply not used.

Therefore, it still makes sense to investigate the neural substrates for early pitch processing in both music and language. If music and tone languages make use of the same neural substrates, one might find transfer effects from expertise in one domain to the other (e.g. Wong et al., 2007; Schön et al., 2004). A key structure involved in pitch encoding in both language and music is the brainstem (e.g. Parbery-Clark, Skoe & Kraus, 2009). The following subsection reviews neurophysiological evidence of the brainstem's involvement in both linguistic and musical tone processing.

² Examples of tonal languages used in the studies presented in this review are Chinese languages such as Mandarin, and Thai.

3.2. Involvement of the brainstem in (lexical) tone processing.

The auditory brainstem (inferior colliculus) is thought to be involved in the first steps of pitch encoding (Juslin & Västfjäll, 2008). There is evidence that the robustness of brainstem encoding is related to cortical sensitivity to pitch information, although it is not clear whether this relationship is bottom-up, top-down, or bidirectional (Banai et al., 2005). Before outlining the neural pathway for referential emotive tone processing, it therefore makes sense to first describe the involvement of the brainstem in musical and linguistic pitch processing. If there is indeed evidence for shared brainstem encoding in both domains, this might indicate that a musilanguage made use of this structure as well. Several studies have found such transfer effects from the musical to the linguistic domain and vice versa (e.g. Bidelman et al., 2011; Wong et al., 2007), some of which are reported below.

Brainstem encoding in response to an auditory stimulus can be assessed by looking at the Frequency Following Response (FFR). This is an evoked response that can be measured with EEG and that encodes the energy at the stimulus' fundamental frequency (Krishnan, 2006). It is thought to originate in the previously mentioned inferior colliculus.
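
As a rough illustration of how the strength of F0 encoding can be read off an FFR, the sketch below runs a simple spectral analysis on a simulated averaged FFR epoch and compares the power near the stimulus' fundamental frequency with the surrounding background. The sampling rate, F0 and noise level are made-up values, and this is not the analysis pipeline of any specific study discussed here.

# Toy example: quantify F0 encoding in a simulated averaged FFR epoch by
# comparing spectral power at the stimulus' fundamental frequency with the
# surrounding background. All values are illustrative assumptions.
import numpy as np

fs = 10000                          # EEG sampling rate in Hz (assumed)
t = np.arange(0, 0.25, 1 / fs)      # 250 ms analysis window

rng = np.random.default_rng(0)
f0 = 120.0                          # stimulus fundamental frequency (assumed)
ffr = 0.1 * np.sin(2 * np.pi * f0 * t) + rng.normal(0, 0.5, t.size)

# Power spectrum of the (windowed) FFR.
spectrum = np.abs(np.fft.rfft(ffr * np.hanning(ffr.size))) ** 2
freqs = np.fft.rfftfreq(ffr.size, d=1 / fs)

# Power in a narrow band around F0 relative to the rest of the speech range.
f0_band = (freqs > f0 - 5) & (freqs < f0 + 5)
background = (freqs > 50) & (freqs < 500) & ~f0_band
f0_snr = spectrum[f0_band].mean() / spectrum[background].mean()
print(f"F0 power relative to background: {f0_snr:.1f}x")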

Evidence of shared brainstem encoding comes from studies by Krishnan and colleagues (Krishnan & Gandour, 2009). In an EEG study, FFR responses of Chinese and English speakers to four Mandarin tonal contours, generated by an iterated rippled noise algorithm³ in a non-speech context, were compared. Whole-contour pitch tracking accuracy and whole-contour pitch strength were extracted from the FFR. Additionally, spectral analysis⁴ of the FFR was used to assess pitch-relevant harmonic representation in the participants. Compared to the English non-tonal language speakers, the Chinese tonal language speakers showed a more robust FFR representation in response to the stimuli. The authors concluded that this subcortical processing mechanism is not necessarily speech-specific. This again suggests that other modalities such as music (or, in this case, non-speech) make use of this pitch encoding mechanism. This mechanism thus seems to be, evolutionarily speaking, a rather old and domain-general one that is nevertheless a first essential step in (sub)cortical processing of pitch (Banai et al., 2005).

³ The iterated rippled noise (IRN) algorithm generates dynamic pitch contours that can be regarded as representative of contours present in natural speech (Swaminathan et al., 2008).

⁴ Spectral analysis is a method that allows extraction of power-frequency information from, for example, the FFR. The output shows the power of each frequency in a stimulus or FFR response.
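
Footnote 3 describes the iterated rippled noise used as a non-speech pitch stimulus. The sketch below shows the basic delay-and-add idea behind IRN with a fixed delay, which produces a pitch percept at 1/delay; the dynamic contours used in the studies above additionally vary the delay over time. Parameter values are arbitrary, and this is only an illustration of the principle, not the stimulus-generation code of those studies.

# Minimal iterated rippled noise (IRN) sketch: repeatedly delay the signal and
# add it back onto itself, creating a pitch at 1/delay while keeping a noisy,
# non-speech character. Values below are illustrative assumptions.
import numpy as np

def iterated_rippled_noise(duration=0.5, fs=44100, pitch_hz=150.0,
                           iterations=8, gain=1.0, seed=0):
    rng = np.random.default_rng(seed)
    out = rng.normal(0, 1, int(duration * fs))       # broadband noise
    delay = int(round(fs / pitch_hz))                 # delay in samples = 1 / pitch
    for _ in range(iterations):
        delayed = np.zeros_like(out)
        delayed[delay:] = out[:-delay]
        out = out + gain * delayed                    # delay-and-add iteration
    return out / np.max(np.abs(out))                  # normalise to +/- 1

stimulus = iterated_rippled_noise()                   # ~150 Hz pitch percept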

Another study that used the FFR to determine the effects of musical training on tone encoding was done by Wong et al. (2007). In this experiment, linguistic tonal patterns were presented to amateur musicians and non-musicians who did not speak a tone language. The amateur musicians performed better on all kinds of linguistic pitch encoding tasks, such as pitch tracking, pitch discrimination and pitch identification, and the stimuli elicited a more robust FFR in their brainstems. These results indicate that some of the very basic phonological encoding mechanisms might be shared by language and music.

A pivotal question is whether the transfer effects mentioned above work bidirectionally. In other words, does tone language experience enhance musical pitch processing? Bidelman and colleagues (2011) investigated this question by presenting rippled-noise homologues of musical pitch intervals to English-speaking musicians, English-speaking non-musicians and Chinese speakers. As in the studies mentioned previously, the FFR was measured, and pitch tracking accuracy and pitch strength were extracted. Both the English musicians and the Chinese speakers outperformed the English non-musicians on the task and showed a more robust brainstem representation in the FFR, indicating that transfer (enhancement) effects between musical and tone language experience are bidirectional.
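
The pitch tracking accuracy measure extracted in these FFR studies can be pictured as follows: derive an F0 contour from the recorded FFR and correlate it with the known F0 contour of the stimulus. The sketch below does this with a simple short-time autocorrelation pitch tracker; it is an illustration of the measure rather than the actual analysis pipeline of Krishnan, Wong or Bidelman and colleagues, and the window length, step size and F0 search range are assumed values.

# Illustrative "pitch tracking accuracy": correlate an F0 contour estimated
# from the FFR with the stimulus F0 contour. Window length, step size and
# F0 search range are assumed values.
import numpy as np

def f0_contour(signal, fs, win=0.040, step=0.010, fmin=80.0, fmax=400.0):
    """Estimate F0 per window from the autocorrelation peak."""
    win_n, step_n = int(win * fs), int(step * fs)
    f0s = []
    for start in range(0, len(signal) - win_n, step_n):
        frame = signal[start:start + win_n] - signal[start:start + win_n].mean()
        ac = np.correlate(frame, frame, mode="full")[win_n - 1:]
        lo, hi = int(fs / fmax), int(fs / fmin)
        lag = lo + int(np.argmax(ac[lo:hi]))
        f0s.append(fs / lag)
    return np.array(f0s)

def pitch_tracking_accuracy(ffr, stimulus, fs):
    """Pearson correlation between FFR-derived and stimulus F0 contours."""
    a, b = f0_contour(ffr, fs), f0_contour(stimulus, fs)
    n = min(len(a), len(b))
    return float(np.corrcoef(a[:n], b[:n])[0, 1])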

Juslin and Västfjäll (2008) have indicated that the automatic pitch encoding processes that are regulated in the brainstem are highly sensitive to learning and training. Musicians and tone language speakers seem to be more sensitive to pitch changes than non-musicians who do not speak a tone language, due to extensive ear and brainstem training (Krishnan & Gandour, 2014). In line with Juslin and Västfjäll's theory, however, both expert groups generally show even more robust brainstem encoding if the stimuli used in the task resemble, for musicians and tonal speakers respectively, musical or tonal-linguistic contours. This is likely because FFRs for the two types of stimuli have slightly different acoustic properties (in other words, different tones and tonal relationships are used in music compared to tone languages). The encoding process is the same, which explains the bidirectional transfer effects, but the brainstem apparently becomes even more sensitive to the particular acoustic tone profiles it is exposed to.

In light of the musilanguage framework it is likely that the brainstem played a key role in the tracking and encoding of the lexical tonal elements present in musilanguage. Once sequences of tonal elements have a referential emotive or lexical meaning, processing them properly becomes more valuable for an organism: it allows one to infer the emotional and motivational state of another human simply by listening to the sounds he or she is making. This goes beyond the simple reflexive emotive meaning that other species can attribute to sounds that are present in nature (e.g. vocalizations of other animals, the sound of thunder).

The brainstem encoding step described above is one of the first of many steps in the processing of (lexical) tonal elements. The question therefore remains which brain pathways were responsible for the further processing of the individual (lexical) tonal elements of musilanguage. In the following section, the neurophysiological evidence for the referential pathway of emotional vocal stimulus perception is discussed by comparing affective prosody processing and contour processing in music.

4. The referential pathway for the processing of vocal emotional stimuli.

4.1. Emotional prosody processing: a theoretical framework.

Very basic emotional communication is an ability that is to a certain extent shared with non-human mammals (Fitch & Züberbühler, 2013). The vocal communication abilities of non-human mammals mostly seem to be limited to the reflexive pathway. Due to space limitations, I will not further discuss the reflexive processing of vocal emotional stimuli. Rather, I will focus on the uniquely human referential pathway that allows attentional, top-down influence on the processing of lexical tonal elements. It is important here to separate the processing of emotive tone from the processing of lexical tone, although in musilanguage both types of information were enclosed in individual lexical tones and in sequences of them. Musilanguage was innovative compared to pre-musilanguage communication systems in the sense that it could do both types of processing using referential pathways. Some of the structures in these pathways are solely involved in general referential pitch processing, whereas others are domain-general emotive processing structures.

Both music and language are capable of evoking emotions by means of intonation (Brown, 2000). How we say or sing something is therefore often more important than the content (Wildgruber et al., 2006). In vocal music, intonation of certain notes or of a sequence of notes is used to convey the intended emotional content. If one sings a piece in a monotone way, this is often considered dull; once intonation and stressing of certain notes or contours are added, this gives emotive meaning to the piece. In language, likewise, a speaker's arousal-related mood such as anger or joy can shape the 'tone' of spoken language (emotive/affective speech prosody). Emotional prosody can be defined as an (emotional) vocal effect in speech and non-speech utterances that can extend over a sequence of tones (Grandjean & Frühholz, 2013). These modifications can, for example, be increases in the pitch height or intensity of the general contour or of individual elements. Considering their presence in both music and language, intonation and stressing of (sequences of) tones are good candidates for domain-shared features of vocal emotional communication.

The neural pathways of emotional prosody processing have recently received more attention (e.g. Grandjean et al., 2005; Wildgruber et al., 2006). Some of this work, mainly EEG and fMRI studies, will now be discussed. Figure 2 shows a rough overview of the neural pathway involved in processing emotional prosody.

Figure 2. Schematic overview (right hemisphere) of cortical and subcortical regions involved in the decoding of emotional prosody. After basic auditory processing in primary and non-primary auditory cortex (AC), the middle superior temporal gyrus (mSTG) and superior temporal sulcus (STS) decode emotional cues from stimuli, independent of attention. Next, the emotional cues are evaluated by frontal regions such as the inferior frontal gyrus (IFG) and orbitofrontal cortex (OFC). These cortical structures 'inform' the amygdala (AMG) of this emotional assessment in a top-down fashion. Figure taken from Grandjean et al. (2013).

4.2. STS & STG involvement in the processing of an auditory stimulus.

After initial pitch processing in the auditory cortices and the brainstem, the next step in the processing of an auditory emotional stimulus is done by the superior temporal sulcus (STS) and the superior temporal gyrus (STG) (Schirmer & Kotz, 2006; Grandjean & Frühholz, 2013).

In some studies, substrates of local sentic modulation (intonation/stress) were investigated by manipulating the intonation or accents of isolated (pseudo)words. For example, in an influential study by Grandjean and colleagues (2005), adults listened to meaningless but word-like utterances with neutral or angry prosody while in an fMRI scanner. At the same time, the effect of attention was assessed by means of a dichotic listening paradigm in which the participants were told to ignore one of the incoming auditory streams. The auditory stimuli presented to the left and right ears were spoken with neutral/neutral, angry/neutral or neutral/angry intonation, respectively, and were presented in a counterbalanced fashion. Angry prosody elicited increased activity in the superior temporal gyrus (STG) and superior temporal sulcus (STS). This increased activity was present independent of attention, indicating that this is an automatic process.

Evidence for involvement of both the STS and the STG in (emotional) music processing comes from two fMRI studies by Altenmüller et al. (2014). Altenmüller and colleagues presented short excerpts of film music (known for eliciting emotional responses) and compared the BOLD responses⁵ to these excerpts with the BOLD responses to silence. STS and STG areas were found to be more active in the music conditions than in the silent conditions, suggesting that these areas are involved in the processing of musical sequences.

Some studies have found increased STG activation during pitch incongruity detection in music and, to a lesser extent, in language. Nan and Friederici (2013) used fMRI to determine the role of the STG in pitch processing in Mandarin-speaking musicians. These participants were presented with four-note musical phrases, half of which ended incongruously as the result of a minimal acoustic modulation of the final note. They were also presented with quadrisyllabic Chinese phrases. Like the musical phrases, half of the Chinese phrases were altered: their final tonal contours were adjusted, resulting in phrases that were either only semantically incongruous or both semantically and syntactically incongruous. The right STG was active in the incongruous conditions for both music and, to a lesser extent, language. In light of the prediction that the right STG is involved in the emotional processing of prosody and musical tone intonation, the lower activation during this particular linguistic task is not surprising, because the task reflected lexical/semantic rather than emotional processing. These results nevertheless support involvement of the right STG in a musilanguage, because this structure seems to be involved in musical processing (Nan & Friederici, 2013) as well as in the processing of affective prosody (Grandjean et al., 2005).

⁵ fMRI measures the blood oxygen level dependent (BOLD) signal to assess brain activation. It is based on the assumption that whenever a brain region becomes more active, it uses oxygen from the blood, which in turn results in an increased flow of oxygenated blood towards that region to restore oxygen levels.

More evidence for the involvement of the STG comes from an fMRI experiment by Armony and colleagues (2015). Participants were presented with novel musical clips (played on piano or violin) and with nonlinguistic vocalizations. The STG was more active for the musical excerpts than for the vocalizations. This seems to be evidence for domain-specificity. However, the vocalizations used in this experiment might have been processed subcortically rather than cortically (through STS and STG). As such vocalizations are often processed automatically, and are therefore likely processed through the subcortical route, it is not surprising that they do not elicit as much activation in structures involved in the cortical processing of vocal emotions, such as the STS and STG.

In another music experiment, Koelsch and colleagues (2002) presented chord sequences to participants. Every now and then, these sequences contained deviant instrumental sounds, inducing an emotional reaction in the listener. Compared to the expected chord sequences, these unexpected auditory events elicited higher activation in both the STS and the STG, again suggesting that it was the emotional unexpectedness that increased activation in these structures.

It is important to note that all of the above experiments that found evidence of STS and STG involvement in music used musical stimuli in the form of instrumental chord sequences or complete musical excerpts. These types of musical stimuli elicit activation in the STS and STG because nowadays they are highly emotional stimuli; however, such stimuli did not exist in the time of our hypothesized musilanguage (I will elaborate on this point of critique in the discussion). These studies therefore cannot serve as direct evidence of STS/STG involvement in musilanguage. Even so, the results show that the activation of the STS and STG is not limited to auditory linguistic emotional processing. Unsurprisingly, it has been hypothesized that the main task of these structures is a domain-general one: detecting unexpected emotional content (Ueda et al., 2003). Thus, taking all results together, the right STS and right STG are hypothesized to be key structures in affective prosody processing in the musilanguage stage. Activity in these brain areas is mostly driven by the emotional quality of the stimulus and is proposed to be bottom-up. Evidence suggests that right STS and STG activation depends strongly on stimulus intensity (Ethofer et al., 2009) and fundamental frequency (Wiethoff et al., 2008; Frühholz & Gschwind, 2015), features that are also present in music. Importantly, STS and STG activation seems to be present for both local and global aspects of affective prosody and music processing; these local and global aspects are key elements of the musilanguage model. STS and STG activation most likely reflects a domain-general detection of unexpected emotional content (Ueda et al., 2003), so their involvement is not evidence for a musilanguage per se. However, a musilanguage would have needed such an emotional expectancy detector. It is therefore likely that both the right STS and the right STG were involved in the emotional processing of lexical-tonal elements in musilanguage.

4.3. Involvement of inferior frontal gyrus.

A next step in the processing of an emotional auditory stimulus is the conscious evaluation of the emotional properties of the stimulus. Two (bilateral) structures, the inferior frontal gyrus and the orbitofrontal cortex, are thought to be involved in this process. Several studies have investigated the role of the inferior frontal gyrus (IFG) in the processing of emotional prosody. The evidence for involvement of this structure mainly comes from fMRI and DTI studies.

For instance, in an interesting fMRI training study by Rota et al. (2009), participants received real-time neurofeedback to enhance right IFG activity. Before and after their training, the participants were given a linguistic task in which the objective was to determine the affective prosodic category (happy, sad, angry or neutral) of a sentence based on its intonation. Examination of the blood oxygen level dependent signal in the right IFG before and after neurofeedback revealed that the training resulted in increased oxygenation in the right IFG, and this increase correlated with higher performance on the prosodic task after neurofeedback training. The authors concluded that the right IFG is indeed involved in the processing of affective prosody. In terms of the musilanguage framework, these results can be interpreted as evidence for right IFG involvement in the conscious detection of expressive phrasing of lexical-tonal elements on a global scale (global sentic modulation).

Evidence of IFG involvement in the processing of emotional prosody on a local level does not come from fMRI studies alone. Another method to determine the involvement of structures such as the IFG during cognitive processes is DTI fiber tracking. This technique makes use of the fact that different types of tissue have different water molecule diffusion rates (Alexander et al., 2007). MRI can be used to map these diffusion rates as a function of spatial location, and in this way DTI fiber tracking allows researchers to investigate which white matter tracts are used during a cognitive task. Frühholz and Gschwind (2015) used DTI fiber tracking to determine white matter fiber connectivity during the processing of affective speech. Four speech-like but meaningless words were spoken in either a neutral or an angry tone. Angry prosody showed the strongest white matter connections from the auditory cortex, through the right STS and STG, to the right inferior frontal gyrus. This result suggests that this right dorsal pathway is involved in the processing of emotional prosody.
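
To give an intuition for the measurement principle behind DTI (not for the tractography analysis of Frühholz and Gschwind itself), the toy sketch below evaluates the standard diffusion-tensor signal model S = S0 * exp(-b * g^T D g): the MR signal is attenuated more when the diffusion gradient points along a direction in which water diffuses easily, such as along a fiber bundle. All numbers are illustrative assumptions.

# Toy illustration of the diffusion-tensor signal model underlying DTI.
# A fiber oriented along x lets water diffuse easily along x and poorly
# across it, so the signal is attenuated more for a gradient along x.
import numpy as np

S0 = 1.0                                   # signal without diffusion weighting
b = 1000.0                                 # b-value in s/mm^2 (assumed)
D = np.diag([1.7e-3, 0.3e-3, 0.3e-3])      # diffusion tensor, units mm^2/s

for label, g in [("along fiber", np.array([1.0, 0.0, 0.0])),
                 ("across fiber", np.array([0.0, 1.0, 0.0]))]:
    S = S0 * np.exp(-b * g @ D @ g)        # S = S0 * exp(-b * g^T D g)
    print(f"{label}: S = {S:.3f}")

# Sampling many gradient directions per voxel lets one estimate D, and fiber
# tracking then follows the tensor's principal direction from voxel to voxel.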

More evidence for such a (hemispheric) lateralization effect comes from Altenmüller's study (2014), which was described earlier. In that study the right inferior frontal gyrus showed higher activation in response to recognized musical excerpts than to new musical excerpts. This apparent lateralization effect seems to be present for both music and language processing, which can be explained in terms of a general right-hemispheric predominance for (emotional) pitch processing (Friederici & Alter, 2004). According to Friederici and Alter, the left hemisphere, on the other hand, is more involved in semantic and syntactic processing of acoustic stimuli.

Combining the evidence presented above, the right inferior frontal gyrus seems to be involved in both affective prosody and musical emotion processing on both the local and the global level. The right IFG has been hypothesized to be involved in attentional control (Hampshire et al., 2010). It is recruited when important cues are detected, and might thus not be emotion-specific. As emotional cues are generally important for an organism, it is likely that, if a musilanguage existed, this structure was involved in the (conscious) detection of emotive value in auditory (vocal) stimuli. However, the left IFG (Broca’s area) might have also been involved in musilanguage, as tones had lexical meaning as well as emotional meaning. To assess this hypothesis further however, it is necessary to conduct more experiments with tone language speakers, as in tone languages, a tone can also have both lexical and emotional meaning.

4.4. Involvement of orbitofrontal cortex.

The next, domain-general step in the processing of emotional vocal stimuli is the conscious evaluation of their emotional content. Schirmer and Kotz (2006) have proposed that this evaluation process is located in the orbitofrontal cortex (OFC).

Several studies have pointed towards involvement of the OFC during conscious processing of emotional auditory stimuli. Wildgruber et al. (2004), for instance, investigated the neural substrates for processing affective prosody on a global (sentence) level. In this fMRI experiment, subjects were asked to pick one of two sentences as an answer to the question “Which sentence sounds more excited?”, which was supposed to reflect emotional prosody discrimination ability. In a similar way (selection of one of two sentences), linguistic discrimination ability was tested. Compared to linguistic processing, affective prosody processing elicited more activation in the OFC. Moreover, the effects were dependent on attention. The OFC therefore seems to be especially involved in the conscious evaluation of affective prosody, a skill that would be very useful in a musilanguage because it allows humans to influence their emotional reaction to auditory stimuli such as angry prosody.

More evidence for this hypothesis comes from a study by Sander et al. (2005). In a dichotic listening experiment, meaningless but word-like utterances spoken with angry or neutral prosody were presented to both ears (neutral/neutral, angry/neutral or neutral/angry on the left/right side). At the same time, BOLD levels were measured with fMRI. Participants were instructed to attend to one of the auditory streams, allowing the researchers to assess the influence of voluntary attention on prosody processing. Whereas STS and amygdala activation in response to the stimuli spoken with angry prosody was independent of attention, OFC activation increased significantly when the stimulus was voluntarily attended. This result suggests that the evaluation of local prosodic features also recruits OFC regions.

The orbitofrontal cortex also seems to be involved in the conscious emotional evaluation of musical stimuli. In an fMRI study by Menon and Levitin (2005), 350-millisecond-long excerpts of musical pieces were presented to participants, who then had to rate the pleasantness of these excerpts. Pleasantness was correlated with increased OFC activation. It is unclear whether this experiment assessed local or global emotional processing of the musical excerpts, because the excerpts were all taken from the same piece but had very different intonation and stress properties. The results did show, however, that musical stimuli that are judged as pleasant recruit the OFC. This can be explained by the fact that the orbitofrontal cortex has also been found to be involved in general decision processes regarding hedonic reward (Kringelbach, 2005).

The idea that the OFC is involved in the general conscious evaluation of emotional stimuli is affirmed by other studies that have investigated musical emotional processing. Lehne and colleagues (2013), for example, recently investigated the neural substrates of tension flow and resolution in music. According to Lehne and colleagues, musical tension flow is a key mediator in the processing of musical emotions, because it is a phenomenon that is highly related to expectancy and prediction processes: violations and confirmations of musical predictions and expectancies can result in emotional reactions. Participants were asked to continuously rate tension levels in four piano pieces while in an MRI scanner. Analysis of the BOLD signal showed increased activation in the OFC and amygdala during periods of increasing tension in the four piano pieces, again suggesting that the OFC is highly involved in a general conscious evaluation process for emotional stimuli.

Taking the evidence together, the OFC is implicated in the conscious evaluation of emotional stimuli, such as emotional prosody and emotional musical pieces. The exact role of the OFC, however, is as yet unclear. Price (1999) proposed that the OFC supports domain-general integration of emotional responses with sensory information. Bottom-up, this structure seems to receive input from the subcortical route; top-down, it outputs to a structure in the subcortical route, the amygdala (Lehne et al., 2013).

These results imply that the OFC is highly involved in the conscious assessment of the emotional value of a(n auditory) stimulus. As this seems to be a domain-general mechanism, it is likely that a hypothesized musilanguage would have made use of this same structure in the referential pathway to consciously evaluate the emotional properties of a vocal message, as this conscious processing is an essential difference between musilanguage and other vocalization systems. More research that fractionates musical features more finely is needed to strengthen this claim.

4.5. Involvement of amygdala.

Once the orbitofrontal cortex has evaluated the emotional properties of a stimulus, it sends feedback to the amygdala (Schirmer & Kotz, 2006), a structure that has long been implicated in emotion processing. It is therefore hypothesized that this structure is also involved in the processing of musical emotions and affective prosody. However, the neurophysiological evidence on its involvement in these processes is inconsistent.

Besides the previously mentioned study by Sander and colleagues (2005), in which angry prosody triggered amygdala activation, some other studies have found that the amygdala seems to be involved in the processing of emotional auditory stimuli. In an fMRI study, Bach et al. (2008) presented sounds of either rising, falling, or constant intensity to their participants in a counterbalanced fashion. The amygdala showed higher activation during the processing of the sounds presented with rising intensity. Stressing a syllable or note can also be viewed as increasing its intensity; the results therefore imply that the amygdala might at least be involved in the processing of local prosodic features.

Contrasting evidence in the linguistic domain, however, comes from, amongst others, a study by Pourtois and colleagues (2005). This study used positron emission tomography to assess amygdala activation in response to disyllabic words spoken in an angry or happy voice. The authors did not find any substantial amygdala activation for either of the two conditions. This result argues against involvement of the amygdala in the processing of local prosodic features.

When combining the evidence of amygdala involvement presented above, the results seem rather inconsistent. In music processing, its role is even less clear: there is little direct evidence for the involvement of the amygdala in music processing, and the evidence that does exist is mostly anecdotal.

Gosselin et al. (2007) investigated music processing in a patient who suffered from amygdala lesions. The authors asked the patient to rate the intensity of fear, happiness, peacefulness and sadness present in computer-generated instrumental music excerpts. The behavioral results suggested that she was perfectly capable of recognizing happy music. In contrast, recognition of sad and scary music was impaired, suggesting that the amygdala is involved in the recognition of these types of emotions in music.

Overall, evidence on the involvement of the amygdala in emotional processing is inconsistent, and its exact function is as yet unclear (Grandjean & Frühholz, 2013). It seems to receive both bottom-up input from the STS and STG and top-down input from the orbitofrontal cortex (Kotz et al., 2013). Some studies have found that functional activation in the amygdala seems to be independent of attentional focus and might support involuntary processing of, and orienting to, biologically important information (Grandjean et al., 2005; Ousdal et al., 2008). In general, the results are tentative. It is clear, however, that the amygdala is an essential structure in emotional processing due to its functional connections with structures all over the brain. Besides its projections to the OFC, it is also functionally connected to structures that regulate vital bodily functions; in this way the amygdala can (in)directly influence the neurophysiological state of an animal. Although the amygdala has been found to be increasingly active during the processing of vocalizations such as laughter (Armony et al., 2015), it might not be involved in the referential processing of music and language. Until more (non-case-study) research is done, it is impossible to make any statement regarding the involvement of the amygdala in a hypothesized musilanguage.

5. Discussion.

5.1. Evidence for a musilanguage model.

This review investigated the neurophysiological evidence for a common precursor of language and music in the form of a musilanguage that is neither of the two. This musilanguage consists of a two-step evolutionary progression. The first step is a referential emotive communication system that is capable of processing sound emotive meaning and sound emotive reference. This system uses lexical-tonal elements that have these properties and that can be (syntactically) combined and stressed or intonated (expressive phrasing).

A model by Grandjean and Frühholz (2013) for the further processing of emotional meaning in the form of affective prosody or music was adopted in this review. In general, all structures that play a role in the referential processing of emotional prosody also seem to be active in musical (pitch) processing, whether this role is specific to pitch processing or domain-general.

After initial processing in the primary and secondary auditory cortices, (lexical) tonal elements (pitches) are encoded in the brainstem. fMRI and EEG experiments have shown bidirectional transfer effects of musical and tonal-linguistic expertise, indicating that this structure is indeed a shared substrate for music and language. As the brainstem is (evolutionarily) a relatively old structure, it is likely that in musilanguage, too, it was responsible for the initial encoding of an acoustic profile. Because learning seems to shape the sensitivity of brainstem encoding, the brainstem was probably especially sensitive to the acoustic profiles that were used in musilanguage, which could have been substantially different from those used in music and language.

After brainstem encoding, the emotional content of the auditory stimulus is implicitly evaluated. The evidence on emotional prosody and musical sequences supports involvement of the superior temporal gyri and sulci. However, nearly all of the studies that assessed the involvement of these structures used deviant or congruous/incongruous stimuli. The involvement of these structures could therefore also be limited to an implicit, domain-general mechanism of expectancy management. Nevertheless, it is likely that if a musilanguage existed, the STG and STS were involved in the referential processing of auditory emotive stimuli, whether as a 'simple' domain-general or as an auditory-specific surprise detector.

The inferior frontal gyri also seem to be a shared neural substrate for music and language. Some evidence points towards lateralization of processing in both modalities. An alternative explanation for this apparent lateralization is that the right hemisphere is more involved in emotional (pitch) processing, whereas the left hemisphere is more involved in lexical (semantic) processing. In general there are bilateral IFG activations in response to both affective prosody in language and musical intonation; depending on task requirements, one of the two will show predominance. The right inferior frontal gyrus seems to be involved in the conscious evaluation of emotional cues. In musilanguage, it could have had the same function. It is therefore hypothesized to have been involved in the conscious emotional evaluation of local and global sentic modulation in musilanguage.

After processing in the IFG, a decision about the emotional valence of the auditory stimulus has to be made. The orbitofrontal cortex is thought to be involved in this process. It is active in the processing of both emotional music and affective prosody, and it is therefore likely that this structure was also used in a musilanguage to evaluate the (emotional) reward value of vocal stimuli. This structure evaluates the incoming stimuli and then 'decides' on the course of action. After the decision on the value of an emotional stimulus, the OFC informs the amygdala of its assessment, although studies on amygdala involvement in the processing of emotional music and affective prosody are inconclusive. The role of the amygdala in a musilanguage is therefore also unclear. Depending on the emotional assessment, it could have directly influenced the organism's neurophysiological state to ensure an appropriate behavioral response to a musilinguistic message.

Before interpreting shared substrates as proof of a musilanguage, it is important to realize that even if music and language have some shared substrates, this does not mean that they are one and the same domain. However, the large number of structures that appear to be shared in the emotional processing of music and language does make the existence of a common precursor more likely: a precursor that, compared to language, is a simpler vocal emotional system involving prosody and lexical tone processing. This is especially true if an entire pathway (such as the one described in this review) appears to be shared. It is unlikely that this high degree of overlap in neural substrates has arisen purely by chance and that the two modalities are completely unrelated.

The purpose of this review, however, was not to prove that a musilanguage has existed. Rather, the evidence reported here does not falsify the existence of a precursor of language and music. A valid question, then, is whether this shared pathway favors a musilanguage model more than, for example, a music or language outgrowth model. The short answer is: it does not. However, after adopting a musilanguage model, mapping these neural substrates is highly efficient in terms of theoretical economy (Juslin & Laukka, 2003), as mentioned in the introduction. Adopting a musilanguage model also avoids the endless discussion about which of the two came first. Instead, it focuses on what is ancestrally shared between the two domains and assumes that these shared features were present in a common precursor.

It is important to realize that this thesis reviewed the evidence for shared expressive phrasing mechanisms in a very simplified manner. For example, brain structures and their involvement in emotional processing were presented as though the processing of musical and prosodic features is linear. This is a highly simplified interpretation of what really happens during the processing of such features. In truth, the processing of emotional auditory features is a dynamic process with bidirectional connections that provide feedback between the various structures. As a result, the brain is very capable of decoding complex (continuous) stimuli online.

A second way in which this review greatly (but perhaps unjustly) simplifies the emotional processing of auditory stimuli is by focusing purely on the perception of affective prosody and music, thus completely disregarding production. For the sake of brevity, any structure whose involvement in the emotional processing of music (such as the basal ganglia; Peretz et al., 2013) or language might find its origin in production was not discussed. Similarly, due to space limitations, temporal processing was not discussed, although rhythm and speaking tempo are certainly part of emotional prosody and of the emotional content of music.

A further challenge is that the stimuli used in neuroscientific studies often entail more than just the basic components described by Brown. Even though emotional linguistic prosody makes use of intonation and stress, like musilanguage, it is more than just those two components. It remains a linguistic phenomenon, and such studies can therefore only show that linguistic prosody as a whole engages a given brain mechanism; they cannot tell us which component of linguistic prosody is responsible for the observed activation of brain structures. I would therefore like to emphasize once more the need to fractionate phenomena such as music and language.

Another problem is that most recent studies used fMRI to assess involvement in auditory emotional processing. fMRI has a high spatial but a low temporal resolution, while both language and music are of a highly temporal nature. It is especially hard to separate local from global processing using fMRI.
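A rough back-of-the-envelope calculation illustrates the problem (the numbers below are assumed typical values, not figures taken from the studies discussed): with a repetition time of about two seconds, a single fMRI volume integrates over many syllables or notes, so local (event-level) and global (phrase-level) pitch modulations are collapsed into one measurement.

```python
# Illustrative arithmetic only; TR, speaking rate, and tempo are assumed typical values.
tr_s = 2.0               # assumed fMRI repetition time (seconds per volume)
syllables_per_s = 5.0    # rough conversational speaking rate
notes_per_s = 4.0        # e.g., eighth notes at 120 BPM

print(f"~{tr_s * syllables_per_s:.0f} syllables per fMRI volume")  # ~10
print(f"~{tr_s * notes_per_s:.0f} notes per fMRI volume")          # ~8
# EEG/MEG, by contrast, sample every millisecond or faster and can resolve single events.
```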

One last point of critique on some of the studies is that they used artificial stimuli (speech stimuli with no semantic meaning, for example), which makes the ecological validity of those experiments questionable. On the other side of the spectrum, there are studies that use, for example, musical excerpts. When a certain structure is involved in the processing of a musical excerpt, it is not clear which specific feature of the musical piece is processed in this structure. Researchers should therefore fractionate music into its constituent components. Music is a broad concept that involves not only vocal emotional communication but also, for example, instrumental music, which some of the studies used as stimuli. The perception of these sounds might nevertheless make use of the same mechanisms that are also used in vocal emotional communication (or in a musilanguage, for that matter).

Despite these points of critique, looking for neural substrates is one of the most promising ways of testing evolutionary models such as Brown's musilanguage model. After all, cognition does not fossilize (Lewontin, 1998), and it is therefore impossible to test evolutionary models directly. The research methods used to look for neural substrates of linguistic or musical features, however, leave considerable room for improvement.

Comparative research on music and language processing could improve in various ways. Firstly, as the spatial resolution of neuroimaging techniques increases, the involvement of certain (parts of) structures in auditory emotional processing can be determined more precisely. In addition, more research should focus on structural and functional connections, for example using DTI, rather than just looking at activation. Such an approach would do more justice to the complexity of acoustic processing and would also shed more light on the question to what extent music and language processing are shared.
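For readers unfamiliar with DTI: tract-based connectivity measures rest on fitting a diffusion tensor per voxel from the signal attenuation measured along multiple gradient directions, with fractional anisotropy (FA) summarizing how directional the diffusion is (Alexander et al., 2007). The sketch below shows that core model on synthetic data; the parameter values are assumptions, and real analyses would use dedicated software packages rather than this minimal fit.

```python
import numpy as np

def fit_tensor(signals, s0, bvecs, bval):
    """Log-linear least-squares fit of a single-voxel diffusion tensor.

    signals : measured signal per gradient direction
    s0      : signal without diffusion weighting
    bvecs   : (N, 3) unit gradient directions
    bval    : b-value in s/mm^2
    """
    gx, gy, gz = bvecs[:, 0], bvecs[:, 1], bvecs[:, 2]
    # Design matrix for the six unique tensor elements.
    B = bval * np.column_stack([gx**2, gy**2, gz**2,
                                2 * gx * gy, 2 * gx * gz, 2 * gy * gz])
    d = np.linalg.lstsq(B, -np.log(signals / s0), rcond=None)[0]
    return np.array([[d[0], d[3], d[4]],
                     [d[3], d[1], d[5]],
                     [d[4], d[5], d[2]]])

def fractional_anisotropy(D):
    ev = np.linalg.eigvalsh(D)
    md = ev.mean()
    return np.sqrt(1.5 * np.sum((ev - md) ** 2) / np.sum(ev ** 2))

# Toy voxel: diffusion much faster along x than y/z, as in a coherent fiber bundle.
rng = np.random.default_rng(0)
bvecs = rng.normal(size=(30, 3))
bvecs /= np.linalg.norm(bvecs, axis=1, keepdims=True)
D_true = np.diag([1.7e-3, 0.3e-3, 0.3e-3])                     # mm^2/s
signals = 1.0 * np.exp(-1000 * np.einsum('ij,jk,ik->i', bvecs, D_true, bvecs))
D_hat = fit_tensor(signals, 1.0, bvecs, 1000)
print(round(fractional_anisotropy(D_hat), 2))                   # high FA (~0.8): anisotropic
```

Tractography then follows the principal eigenvector of these tensors from voxel to voxel, which is what allows anatomical connections between, for example, temporal and frontal regions to be reconstructed.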

Secondly, future research should look at components that make up language and music, and compare the neural substrates of these features. In other words, it should focus on shared ancestral properties of music and language. For example, studies should separately investigate the global and local aspects of prosody and music processing. An example would be to look at the processing of intonation in music and language.
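As a simple illustration of what fractionating intonation into a directly comparable component could look like at the stimulus level, the sketch below resamples two f0 contours (for example, one from a spoken and one from a sung rendition of a phrase) to a common length and correlates them; the function name and the toy contours are my own assumptions, not materials from the studies reviewed.

```python
import numpy as np

def contour_similarity(f0_a, f0_b, n_points=50):
    """Pearson correlation between two f0 contours resampled to a common length.

    A crude way to quantify how similar the intonation of two phrases is;
    contour extraction itself could use the autocorrelation sketch given earlier.
    """
    grid = np.linspace(0, 1, n_points)
    a = np.interp(grid, np.linspace(0, 1, len(f0_a)), f0_a)
    b = np.interp(grid, np.linspace(0, 1, len(f0_b)), f0_b)
    return np.corrcoef(a, b)[0, 1]

# Toy contours: a rising spoken phrase vs. a rising-then-falling sung phrase (Hz).
spoken = np.linspace(150, 220, 40)
sung = np.concatenate([np.linspace(200, 300, 35), np.linspace(300, 250, 25)])
print(round(contour_similarity(spoken, sung), 2))
```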

Thirdly, researchers interested in affective prosody processing should more often include speakers of tone languages as participants in their studies. Tone-language speakers are interesting because, just as in musilanguage, a tone can carry emotional as well as lexical meaning for them. Comparing the processing of musical and affective-prosodic features between tone-language speakers and non-tone-language speakers is therefore especially interesting.

Fourthly, in this review I have only investigated the existence of shared neural substrates for one component of musilanguage. Combinatorial syntax and semantic meaning were completely disregarded, although these, like expressive phrasing, are essential components of the musilanguage. Future studies could investigate the evidence for shared neural substrates of these features in a manner similar to the one used in this review.

Although we will never know exactly how language and music have evolved, assuming that language and music have a common precursor (that is neither language nor music) is nevertheless a fruitful way to investigate their origins. These two uniquely human abilities are very much alike in some respects, but very different in others. Having reviewed the present evidence for shared neural substrates of language and music, and having found that this evidence does not falsify the use of expressive phrasing mechanisms in a musilanguage, it is an exciting prospect that there is still much to be gained in this field of research.

References

Ackermann, H., Hage, S. R., & Ziegler, W. (2014). Brain mechanisms of acoustic communication in humans and nonhuman primates: An evolutionary perspective. Behavioral and Brain Sciences, 37(6), 529-546.

Alexander, J., Wong, P. C. M., & Bradlow, A. (2005). Lexical tone perception in musicians and nonmusicians. Paper presented at the Interspeech 2005 (Eurospeech) 9th European Conference on Speech Communication and Technology, Lisbon, September 2005.

Alexander, A. L., Lee, J. E., Lazar, M., & Field, A. S. (2007). Diffusion Tensor Imaging of the Brain. Neurotherapeutics: The Journal of the American Society for Experimental NeuroTherapeutics, 4(3), 316-329.

Alexander, N., Klucken, T., Koppe, G., Osinsky, R., Walter, B., Vaitl, D., ... & Hennig, J. (2012). Interaction of the serotonin transporter-linked polymorphic region and environmental adversity: increased amygdala-hypothalamus connectivity as a potential mechanism linking neural and endocrine hyperreactivity. Biological Psychiatry, 72(1), 49-56.

Altenmüller, E., Siggel, S., Mohammadi, B., Samii, A., & Münte, T. F. (2014). Play it again, Sam: brain correlates of emotional music recognition. Frontiers in Psychology, 5, 114.

Armony, J. L., Aubé, W., Angulo-Perkins, A., Peretz, I., & Concha, L. (2015). The specificity of neural responses to music and their relation to voice processing: An fMRI-adaptation study. Neuroscience letters, 593, 35-39.

Ayotte, J., Peretz, I., & Hyde, K. (2002). Congenital amusia. Brain, 125(2), 238-251.

Bach, D. R., Schächinger, H., Neuhoff, J. G., Esposito, F., Di Salle, F., Lehmann, C., ... & Seifritz, E. (2008). Rising sound intensity: an intrinsic warning cue activating the amygdala. Cerebral Cortex, 18(1), 145-150.

Townsend, D. W., Valk, P. E., & Maisey, M. N. (2005). Positron emission tomography. Springer-Verlag London Limited.

Banai, K., Nicol, T., Zecker, S. G., & Kraus, N. (2005). Brainstem timing: implications for cortical processing and literacy. The Journal of Neuroscience, 25(43), 9850-9857.

Besson, M., & Schön, D. (2001). Comparison between language and music. Annals of the New York Academy of Sciences, 930(1), 232-258.

Bidelman, G. M., Gandour, J. T., & Krishnan, A. (2011). Cross-domain effects of music and language experience on the representation of pitch in the human auditory brainstem. Journal of Cognitive Neuroscience, 23(2), 425-434.

Brown, S. (2000). The “musilanguage” model of music evolution. In N. L. Wallin, B. Merker & S. Brown (Eds.), The origins of music (pp. 271-300). Cambridge, MA: MIT Press.

Darwin, C. (1871/2006). The Descent of Man, and Selection in Relation to Sex. In E. O. Wilson (Ed.), From so simple a beginning: The four great books of Charles Darwin. New York: W. W. Norton.

Deutsch, D., Henthorn, T., & Dolson, M. (2004). Absolute pitch, speech, and tone language: some experiments and a proposed framework. Music Perception: An Interdisciplinary Journal, 21(3), 339-356.

Ethofer, T., Kreifelts, B., Wiethoff, S., Wolf, J., Grodd, W., Vuilleumier, P., & Wildgruber, D. (2009). Differential influences of emotion, task, and novelty on brain regions underlying the processing of speech melody. Journal of cognitive neuroscience, 21(7), 1255-1268.

Fitch, W. T. (2006). The biology and evolution of music: A comparative perspective. Cognition, 100(1), 173-215.

Fitch, W. T., & Zuberbühler, K. (2013). Primate precursors to human language: beyond discontinuity. In E. Altenmüller, S. Schmidt & E. Zimmermann (Eds.), The Evolution of Emotional Communication: From Sounds in Nonhuman Mammals to Speech and Music in Man (pp. 26-48). Oxford, UK: Oxford University Press.

Friederici, A. D., & Alter, K. (2004). Lateralization of auditory language functions: a dynamic dual pathway model. Brain and Language, 89(2), 267-276.

Frühholz, S., Gschwind, M., & Grandjean, D. (2015). Bilateral dorsal and ventral fiber pathways for the processing of affective prosody identified by probabilistic fiber tracking. NeuroImage, 109, 27-34.

Gosselin, N., Peretz, I., Johnsen, E., & Adolphs, R. (2007). Amygdala damage impairs emotion recognition from music. Neuropsychologia, 45(2), 236-244.

Grandjean, D., & Frühholz, S. (2013). An integrative model of brain processes for the decoding of emotional prosody. In E. Altenmüller, S. Schmidt & E. Zimmermann (Eds.), The Evolution of Emotional Communication: From Sounds in Nonhuman Mammals to Speech and Music in Man (pp. 211-228). Oxford, UK: Oxford University Press.

Grandjean, D., Sander, D., Pourtois, G., Schwartz, S., Seghier, M. L., Scherer, K. R., & Vuilleumier, P. (2005). The voices of wrath: brain responses to angry prosody in meaningless speech. Nature neuroscience, 8(2), 145-146.

Hampshire, A., Chamberlain, S. R., Monti, M. M., Duncan, J., & Owen, A. M. (2010). The role of the right inferior frontal gyrus: inhibition and attentional control. Neuroimage, 50(3), 1313-1319.

Juslin, P. N., & Laukka, P. (2003). Communication of emotions in vocal expression and music performance: Different channels, same code?. Psychological bulletin, 129(5), 770.

Juslin, P. N., & Västfjäll, D. (2008). Emotional responses to music: The need to consider underlying mechanisms. Behavioral and brain sciences, 31(05), 559-575.

Katz, J., & Pesetsky, D. (2011). The identity thesis for language and music. URL: http://ling.auf.net/lingBuzz/000959.

Koelsch, S., Gunter, T. C., Cramon, D. Y. V., Zysset, S., Lohmann, G., & Friederici, A. D. (2002). Bach speaks: a cortical "language-network" serves the processing of music. Neuroimage, 17(2), 956-966.

Kotz, S. A., Hasting, A. S., & Paulmann, S. (2013). On the orbito-striatal interface in (acoustic) emotional processing. In E. Altenmüller, S. Schmidt & E. Zimmermann (Eds.), The Evolution of Emotional Communication: From Sounds in Nonhuman Mammals to Speech and Music in Man (pp. 229-253). Oxford, UK: Oxford University Press.

Kringelbach, M. L. (2005). The human orbitofrontal cortex: linking reward to hedonic experience. Nature Reviews Neuroscience, 6(9), 691-702.

Krishnan A. (2006). Human frequency following response. In R. F. Burkard, M. Don & J. J. Eggermont (Eds.), Auditory evoked potentials: Basic principles and clinical application, (pp. 313-335). Baltimore: Lippincott Williams & Wilkins.

Krishnan, A., & Gandour, J. T. (2014). Language experience shapes processing of pitch relevant information in the human brainstem and auditory cortex: electrophysiological evidence. Acoustics Australia/Australian Acoustical Society, 42(3), 166.

Krishnan, A., Swaminathan, J., & Gandour, J. T. (2009). Experience-dependent enhancement of linguistic pitch representation in the brainstem is not specific to a speech context. Journal of Cognitive Neuroscience, 21(6), 1092-1105.

Lehne, M., Rohrmeier, M., & Koelsch, S. (2013). Tension-related activity in the orbitofrontal cortex and amygdala: an fMRI study with music. Social cognitive and affective neuroscience, nst141.

Lerdahl, F., & Jackendoff, R. (1983). An overview of hierarchical structure in music. Music Perception, 229-252.

Lewontin, R. C. (1998). The evolution of cognition: Questions we will never answer. In D. Scarborough & S. Sternberg (Eds.), Methods, models, and conceptual issues: An invitation to cognitive science, 4, 107–132. Cambridge, MA: MIT Press.
