Master Thesis
Title:
Shared Syntactic Processing Resources of Music and Language: a Brain Imaging Study
Author:
Richard Kunert
Cognitive Science Center Amsterdam, University of Amsterdam
rikunert@gmail.com
Supervisor:
Peter Hagoort
Max Planck Institute for Psycholinguistics, Nijmegen &
Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen,
Peter.Hagoort@mpi.nl
Co-Assessor:
Rens Bod
Institute for Logic, Language and Computation, University of Amsterdam
rens.bod@gmail.com
UvA Representative:
Titia van Zuijen
Afdeling Pedagogiek, Onderwijskunde en Lerarenopleiding, University of Amsterdam
T.L.vanZuijen@uva.nl
Shared Syntactic Processing Resources of Music and Language: a Brain Imaging Study
Richard Kunert
Roel Willems
Daniel Casasanto
Aniruddh D. Patel
Peter Hagoort
ABSTRACT
Music and language have been proposed to share basic syntactic integration resources. This
study aimed to find out where in the brain these shared resources reside. As opposed to
previous studies we did not simply look for a conjunction of music and language processing but
instead used a design whereby language processing directly interacts with music processing.
Participants heard songs containing subject-extracted or object-extracted relative clauses
whose critical verb was sung in-key, out-of-key, or unusually loudly. The latter was used to
control for attention capture effects and activated the right hemisphere’s inferior frontal gyrus.
The interaction between language syntax and music harmony, on the other hand, behaved
differently and could be localised in the anterior part of the left hemisphere’s inferior frontal
gyrus: Broca’s area pars triangularis and pars orbitalis. These findings provide direct evidence
for the competition of two different cognitive domains for high-level neural integration resources.
That this supramodal syntactic integration area was found in the anterior part of Broca’s area
rather than in other brain areas associated with both music and language syntax is discussed in
light of theories about Broca’s area function.
INTRODUCTION
Music and language are human abilities whose shared mental and neural underpinnings are
increasingly well understood. One proposed area of overlap lies in the syntactic domain.
Syntactic processing – whether in language or in music – involves the integration of discrete
elements (e.g., words in language, tones/chords in music) into higher order structures (e.g.,
sentences in language and harmonic sequences in music) according to a set of syntactic
principles. The present study aims to localise where in the brain music and language syntactic
integration processes share basic neural resources.
Music syntax as defined in this paper is harmonic in nature (for details see Patel, 2008).
Harmony refers to expectations which are based on the statistical regularities present in
Western tonal music (see Tillmann, Bharucha, & Bigand, 2000). In detail, in the Western
tradition, every musical key, e.g., C-major, is associated with a hierarchy that applies to the
twelve octave-equivalent pitch classes (referred to as the tones C, C#, D, D#, E, F, F#, G, G#,
A, A#, and B). For the purpose of this article it suffices to say that part of the hierarchical
organisation is a distinction between in-key and out-of-key tones. The seven tones which make
up each scale, e.g., for C-major: C, D, E, F, G, A, and B, are more likely to co-occur and, thus,
more stable than the five tones which are not part of a given scale, e.g., for C-major: C#, D#,
F#, G#, and A# (Krumhansl, 1979; Krumhansl & Kessler, 1982). In this sense, harmonic
sequences can be said to be structured in that they create different expectations for different tones:
the higher a tone sits in the hierarchy, the more it is expected to occur in a sequence.
It is important to bear in mind that tone and key perception influence each other. Tones
are perceived in terms of the harmonic context but the harmonic context, in turn, is also
dependent on the incoming tones (Bharucha, 1987; Krumhansl & Kessler, 1982). In this way,
in-key tones are easy to integrate into an established context while out-of-key tones are not easy
to integrate and, thus, could point to a change in harmony. Normal listeners without formal
musical training incorporate these and similar regularities in their music perception (Bigand &
Poulin-Charronnat, 2006).
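To make the in-key/out-of-key distinction concrete, the following minimal sketch (an illustration added here, not part of the study's materials) classifies a pitch class as in-key or out-of-key for a given major key:

```python
# Minimal illustrative sketch: classify a pitch class as in-key or out-of-key
# for a given major key (not part of the study's stimulus construction).

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_SCALE_STEPS = [0, 2, 4, 5, 7, 9, 11]  # semitone pattern of a major scale

def major_scale(tonic):
    """Return the seven pitch classes of the major scale built on `tonic`."""
    root = PITCH_CLASSES.index(tonic)
    return {PITCH_CLASSES[(root + step) % 12] for step in MAJOR_SCALE_STEPS}

def is_in_key(tone, tonic):
    """True if `tone` belongs to the major key of `tonic`, False otherwise."""
    return tone in major_scale(tonic)

# Example for C-major, as in the text:
assert major_scale("C") == {"C", "D", "E", "F", "G", "A", "B"}
assert is_in_key("G", "C")        # in-key tone
assert not is_in_key("F#", "C")   # out-of-key tone
```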
Patel (2008) proposed that the process of syntactic integration of elements is shared
between music and language. This shared syntactic integration resource hypothesis (SSIRH)
was based on the finding that these two cognitive domains elicit a similar event related brain
potential (ERP) component in response to structural violations: the P600 (Patel, Gibson, Ratner,
Besson, & Holcomb, 1998).
Beyond similar effects in music and language processing alone, Patel’s (2008) SSIRH
predicted that when music and language are presented simultaneously their concurrent
processing demands can interfere with each other. This prediction is supported by both
behavioural and electrophysiological findings. When structural integration is hard in both
domains, reading times and speech comprehension suffer more than when only the language or
only the music dimension is difficult to process (Fedorenko, Patel, Casasanto,
Winawer, & Gibson, 2009; Slevc, Rosenberg, & Patel, 2009; see also Hoch, Poulin-Charronnat,
& Tillmann, 2011). Similarly, simultaneous deviations in music and language have also been
shown to interact in terms of the same EEG potentials. The left anterior negativity (LAN) elicited
by linguistic, syntactic anomalies was reduced if presented with a concurrent harmonic deviation
(Koelsch, Gunter, Wittfoth, & Sammler, 2005; Steinbeis & Koelsch, 2008; see also Carrus,
Koelsch, & Bhattacharya, 2011). Furthermore, the early right anterior negativity (ERAN) elicited
by harmonic irregularities was reduced with a concurrent syntactic language violation (Maidhof
& Koelsch, 2011; Steinbeis & Koelsch, 2008; but see Koelsch et al., 2005). This strongly
suggests that syntactic processing in music and language compete for the same neural
resources.
It has been shown that this competition is not located at the level of general attention
processes. In most music-language interference studies – including the present one –
participants are told to ignore the musical dimension and concentrate on the language (but see
Maidhof & Koelsch, 2011; Steinbeis & Koelsch, 2008). One could hypothesise that changes in
linguistic processing occur during music syntax violations simply because harmonic irregularities
are salient events which draw attention away from the language task. However, when attention
is drawn to the musical domain by non-syntactic means, e.g., by loudness increases or timbral
changes, similar behavioural or neural interactions as those elicited by harmonic violations are
not found (Fedorenko et al., 2009; Koelsch et al., 2005; Slevc et al., 2009).
The Present Study
The brain areas underlying these interaction effects are unclear. To date no
music-language interference study has shown the location of the aforementioned behavioural and
electrophysiological interaction effects. Still, a comparison of the localisation results of
experiments investigating either music harmony or language syntax shows a number of
overlapping regions. Firstly, based on brain lesion data (Patel, Iversen, Wassenaar, & Hagoort,
2008) Patel’s (2008) SSIRH predicts Broca’s area in the left inferior frontal gyrus to show an
interaction effect. This is supported by brain lesion work (Drai & Grodzinsky, 2006; Sammler,
Koelsch, & Friederici, 2011), EEG/MEG studies (Friederici, Wang, Herrmann, Maess, & Oertel,
2000; Maess, Koelsch, Gunter, & Friederici, 2001; Villarreal, Brattico, Leino, Ostergaard, &
Vuust, 2011), as well as fMRI experiments (Bookheimer, 2002; Embick, Marantz, Miyashita,
O'Neil, & Sakai, 2000; Kaan & Swaab, 2002; Koelsch et al., 2002; Koelsch, Fritz, Schulze,
Alsop, & Schlaug, 2005; Tillmann et al., 2006).
Secondly, the right hemisphere homologue of Broca’s area is found across syntax
studies of music and language using either EEG/MEG (Friederici et al., 2000; Maess et al.,
2001; Villarreal et al., 2011) or fMRI (Embick et al., 2000; Koelsch et al., 2002; Koelsch et al.,
2005; Tillmann et al., 2006). Thirdly, the bilateral superior temporal gyrus is another region
activated in response to syntax irregularities in either music or language whether measured
electrophysiologically (Sammler et al., 2009; Sammler et al., 2013) or haemodynamically
(Embick et al., 2000; Kaan & Swaab, 2002; Koelsch et al., 2002; Koelsch et al., 2005; Tillmann
et al., 2006).
Still, whether these regions are the locus of basic music-language overlap is difficult to
answer based on the available findings (Peretz & Zatorre, 2005). One problem relates to the
high degree of anatomical variability in the frontal lobes, rendering comparisons of activation
sites across experiments difficult (Amunts et al., 1999; Fischl et al., 2008; Juch, Zimine, Seghier,
Lazeyras, & Fasel, 2005); designs testing music and language within the same individuals are therefore
warranted. Moreover, with separate music and language experiments different neural
generators in the same brain tissue could also be responsible for any spatial overlap in a given
region.
One recent fMRI study by Rogalsky, Rong, Saberi, and Hickok (2011) attempted to
avoid some of these shortcomings by presenting either simple melodies or meaningless
sentences to the same participants. Regions of overlap between music (vs. rest) and speech
(vs. rest) were found in the bilateral superior temporal gyrus, especially primary auditory cortex,
but not in other regions associated with language syntax processing such as Broca’s area.
However, in a multivariate pattern analysis the two modalities could still be distinguished based
on differential activation patterns in overlapping regions of activation. The authors interpreted
this as reflecting different acoustic characteristics of music and speech which are processed
differently in auditory cortex. However, whether this invalidates the proposal for shared syntactic
integration resources between language and music can be questioned. To be brief, it is trivial to
propose that any auditory input – including language and music – is processed first by the same
low-level sensory processing regions. The crucial question is whether shared resources also exist
for specific higher-level cognitive operations. To answer this question, an investigation of the
areas known to be involved in the cognitive operation in
question – syntax in the case of this study – is necessary. As a result we adopted a region of
interest approach focussing specifically on the aforementioned syntax areas.
Furthermore, by adopting an interaction paradigm (Fedorenko et al., 2009) in a brain
imaging setting we are able to go beyond insights into topographical overlap between music and
language processing. Instead, any location of the music-language interaction has to exhibit at
least partially shared neural resources recruited by both cognitive domains. This contrasts with
a topographical overlap which could be due to a local aggregation of functionally independent
modules.
In order to elicit this interaction, we manipulated both music harmony and language
syntax. Participants heard songs containing either a syntactically easy construction with
only a local dependency (SR: subject-extracted relative clause) or a difficult construction
with a non-local dependency (OR: object-extracted relative clause; see Gibson, 1998).
Sentences were sung a cappella (unaccompanied) and the critical word which disambiguated
between these two linguistic options was either sung on a regular tone (in-key tone which is
easy to integrate in the prevailing harmonic context) or on an irregular tone (out-of-key tone
which is not easy to integrate harmonically). Thus, the time point of integration difficulty in music
was aligned with the one in language. A previous behavioural study in English using a similar
design showed an interaction between linguistic and musical conditions in terms of sentence
comprehension (Fedorenko et al., 2009).
This approach is superior to previous studies in this field in the following respects. As
opposed to the aforementioned experiments investigating music-language syntax interactions
with brain measures (Carrus et al., 2011; Koelsch et al., 2005; Maidhof & Koelsch, 2011;
Steinbeis & Koelsch, 2008) our syntactic manipulation did not introduce morphosyntactic rule
violations but instead used two syntactically legal constructions of different integration difficulties
(SR vs. OR). Thus, error monitoring mechanisms – suggested by Rogalsky et al. (2011) to be
the source of previously reported overlapping brain processes – cannot account for our findings.
In other words, rather than investigating shared syntactic integration resources used under the
exceptional circumstances of error processing, we investigated the basic shared machinery
used also when processing grammatically legal stimuli.
Secondly, we control for the possibility that an attentional mechanism explains any interactive pattern
by including an auditory anomaly condition in which the manipulated syllable is presented 10 dB louder than
normal. This is necessary because many of the brain areas associated with music and language
syntax processing are also associated with the bottom-up attention network, including the
bilateral inferior frontal gyrus (Corbetta & Shulman, 2002; Fox, Corbetta, Snyder, Vincent, &
Raichle, 2006) and the superior temporal gyrus (Downar, Crawley, Mikulis, & Davis, 2000; Fox
et al., 2006). Including an auditory anomaly condition allows us to check (a) whether this control
condition does indeed draw attention as previously claimed (Fedorenko et al., 2009) and (b)
whether the interaction of music and language syntax processing is generally attentional in
nature or specifically located at the level of syntax.
If, as predicted by the SSIRH (Patel, 2008), music and language truly use shared brain
resources for syntactic processing, one would expect there to be one or more brain areas
sensitive to the superadditive processing difficulty when integration is challenging in both
domains. Previous research suggests Broca’s area to be one candidate region. Furthermore,
this locus is not predicted to be sensitive to the interaction between language syntax and a
perceptually salient loudness increase at the critical sentence position, as the latter is
acoustic rather than syntactic in nature.
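In terms of the measured response, this prediction corresponds to a positive interaction contrast. The following schematic (with made-up numbers, purely for illustration) spells out the superadditivity criterion:

```python
# Schematic illustration of the predicted superadditive interaction
# (the numbers below are hypothetical, not data from this study).

def interaction_contrast(or_out, sr_out, or_in, sr_in):
    """Positive when the OR-minus-SR difficulty cost is larger with an out-of-key tone."""
    return (or_out - sr_out) - (or_in - sr_in)

# e.g., hypothetical mean responses in a region of interest:
print(interaction_contrast(1.2, 0.5, 0.8, 0.6))  # 0.5 > 0: superadditive pattern
```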
MATERIALS AND METHODS
Participants
Nineteen healthy participants were included in the final analysis (mean age = 22 years, range 18 –
27). No subject had a known history of neurological, language related or hearing problems and
all had normal or corrected-to-normal vision. Five participants were excluded due to technical
difficulties (2) or excessive movement (3). The remaining 7 men and 12 women were all right
handed, native speakers of Dutch with at most six years of formal musical training (mean = 1.9
years). All were naive as to the purpose of the study and were paid for their participation.
Informed consent was obtained from all participants and the study was approved by the local
ethics committee.
Design and Materials
Design
We used a within-subjects 2 (language: subject-extracted relative clauses vs. object-extracted
relative-clauses) × 3 (music: critical note in-key vs. out-of-key vs. auditory anomaly) design.
The language material consisted of 120 sets of sentences, each in two versions as
shown in (1).
(1)
(1a) Subject-extracted (SR)
De dichter die de schrijvers aanmoedigde juichte zeer fanatiek.
Literal: The poet[singular] that the writers[plural] encouraged[singular] cheered[singular] very fanatically.
Correct: The poet that encouraged the writers cheered very fanatically.
(1b) Object-extracted (OR)
De dichters die de schrijver aanmoedigde juichten zeer fanatiek.
Literal: The poets[plural] that the writer[singular] encouraged[singular] cheered[plural] very fanatically.
Correct: The poets that the writer encouraged cheered very fanatically.
The example stimulus in (1) shows that the subject of the matrix clause differed in number from
the noun phrase in the relative clause. Therefore, the number agreement of the relative clause
verb (‘aanmoedigde’) ensured that participants interpreted the relative clause as
subject-extracted (number agreement with matrix clause noun phrase) or as object-extracted (number
agreement with relative clause noun phrase). Note that the critical word which disambiguated
between the two conditions was identical in position and form across different sentence
versions.
Each of these two sentence versions was combined with three versions of a melody
(in-key, out-of-key, auditory anomaly). The three music versions differed only in the tone sung on
the stressed syllable of the disambiguating relative clause verb in terms of pitch (in-key vs.
out-of-key; see Figure 1) or loudness (in-key vs. auditory anomaly). All melodies were composed
specifically for this study by a professional composer (Jason Rosenberg) and recorded by a
trained Dutch singer (Jan-Mathijs Schoffelen).
Figure 1. A sample melody (in the key of C-major). The top system shows the in-key version in
which no note is off the C-major scale. The bottom system shows the out-of-key version in which only the tone coinciding with the stressed syllable of the relative clause verb (highlighted) is not part of the C-major scale.
Sentences
Sentences were on average 10 (SD = 1.3) words long with the disambiguating relative clause
verb always being the sixth word. The matrix subject was plural in half of the SR sentences, i.e.
the plurality of the first noun phrase was not indicative of the linguistic condition. Each verb and
sentence ending was used for two sets which differed in terms of their noun phrases. Between
these two set pairs the plurality of the first noun phrase in each condition was different.
Melodies
Melodies were rhythmically diverse and on average 10.2 seconds long (SD = 1.3) at a tempo of
70 beats per minute, i.e. a quarter note corresponded to a nominal duration of 857ms. The
beginning of each melody established a strong sense of key. Both the in-key and the out-of-key
conditions were in the same key and differed only by one note. This critical tone – coinciding
with the stressed syllable of the relative clause verb – was either part of the established key
(in-key) or not (out-of-key) and always a quarter note in length. Each of the twelve major keys was
used 10 times (10 × 12 = 120 sets). Tones were in the baritone range, i.e. between F#2 (92.5
Hz) and E4 (329.6 Hz).
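As a cross-check (an illustration, not part of the original stimulus preparation), the quoted note duration and pitch frequencies follow directly from the tempo and from equal-temperament tuning:

```python
# Illustrative cross-check of the numbers quoted above (not the study's own code).

# Tempo: 70 beats per minute -> nominal duration of one quarter note in milliseconds.
quarter_note_ms = 60_000 / 70
print(round(quarter_note_ms))      # 857

# Equal temperament: frequency of a MIDI note number relative to A4 = 440 Hz (MIDI 69).
def midi_to_hz(midi_note):
    return 440.0 * 2 ** ((midi_note - 69) / 12)

print(round(midi_to_hz(42), 1))    # 92.5 Hz  -> F#2, lower end of the baritone range
print(round(midi_to_hz(64), 1))    # 329.6 Hz -> E4, upper end of the baritone range
```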
Stimulus Recording
All stimuli were recorded in a soundproof room at the Max Planck Institute in Nijmegen. First,
each song (four per set) was recorded separately in each of the linguistic and harmonic
conditions. Afterwards, all recordings were normalized for loudness level. Next, steps were
taken to control for acoustic cues prior to the critical verb. Of the four recordings per set, one
from each linguistic condition (e.g., SR/in-key and OR/out-of-key) was kept without modification.
To arrive at the other harmonic version of each language condition, the audio signal of the
critical verb in the remaining two recordings (e.g., SR/out-of-key and OR/in-key) was copied into
the corresponding stream of the kept recordings. This exchange of verb recording effectively
changed the music condition of the stimulus (e.g., from SR/in-key to SR/out-of-key). After this
splicing step the new song signal was adjusted in order to avoid the audibility of the verb
recording exchange. To exclude any possible systematic influence of this processing step it was
ensured that an equal number of in-key and out-of-key recordings were left unchanged. Next,
the auditory anomaly condition of each sentence was created. Of the resulting four files the
in-key versions were chosen and the critical tone’s loudness was increased by 10dB in line with
Fedorenko et al. (2009). All audio manipulations were done with the programme Audacity
version 1.3 (audacity.sourceforge.net).
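The splicing and loudness steps described above can be approximated in a few lines of audio processing. The following is a rough sketch with hypothetical file names and timings; the actual stimuli were prepared in Audacity and additionally smoothed around the splice points:

```python
# Rough sketch of the verb splicing and +10 dB steps described above (illustrative only;
# the study used Audacity and additional smoothing around splice boundaries).
import numpy as np
import soundfile as sf  # assumes the (py)soundfile package is available

def splice_verb(target_wav, donor_wav, out_wav, verb_start_s, verb_end_s):
    """Copy the critical-verb segment from `donor_wav` into `target_wav`."""
    target, sr = sf.read(target_wav)
    donor, sr_donor = sf.read(donor_wav)
    assert sr == sr_donor, "recordings must share a sampling rate"
    i0, i1 = int(verb_start_s * sr), int(verb_end_s * sr)
    spliced = target.copy()
    spliced[i0:i1] = donor[i0:i1]   # verb occupies the same position in both recordings
    sf.write(out_wav, spliced, sr)

def boost_critical_tone(in_wav, out_wav, tone_start_s, tone_end_s, gain_db=10.0):
    """Create the auditory anomaly version by raising the critical tone by `gain_db`."""
    audio, sr = sf.read(in_wav)
    i0, i1 = int(tone_start_s * sr), int(tone_end_s * sr)
    boosted = audio.copy()
    boosted[i0:i1] *= 10 ** (gain_db / 20)   # +10 dB = amplitude factor of about 3.16
    sf.write(out_wav, np.clip(boosted, -1.0, 1.0), sr)
```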
fMRI data acquisition and processing
Data acquisition
The experiment was carried out on a 1.5-Tesla MRI scanner (Siemens Avanto, Siemens
Medical Systems, Erlangen, Germany). 33 axial slices were acquired (3.5mm × 3.5mm in-plane
resolution, 3 mm slice thickness, 0.51mm slice spacing, field of view [FOV] = 224mm) covering
the whole brain. We used a single-shot echo-planar imaging (EPI) sequence (repetition time
[TR] = 2140ms, echo time [TE] = 40ms, 90° flip-angle [FA]). After the first of two functional runs
a 3-D T1 scan was acquired (176 slices per slab, voxel size = 1mm × 1mm × 1mm, TR =
2250ms, TE = 2.95ms, FA = 15°, sagittal orientation).
fMRI analysis
Analysis was carried out using SPM8 (www.fil.ion.ucl.ac.uk/spm). The first five volumes were
discarded to avoid equilibrium effects. In order to compensate for small head movements the
remaining images were realigned. Data were spatially smoothed using an 8 mm FWHM
Gaussian kernel. Each functional dataset was co-registered to the participant’s high-resolution
anatomical image. Afterwards, the co-registered EPI dataset was normalised to Montreal
Neurological Institute (MNI) space by linear scaling. The time series were high-pass filtered
with a cut-off period of 128 seconds.
The statistical evaluation was performed using the general linear model. The design
matrix was generated with a synthetic haemodynamic response function modelled on the
manipulated song region, i.e. from the start of the critical verb until the end of the song. We
separately modelled the six conditions of interest and included two nuisance regressors to
capture the effect of functional scanning run as well as 18 nuisance regressors extracting
variability explained by linear motion, quadratic motion and the first derivative of linear motion
(Lund, Norgaard, Rostrup, Rowe, & Paulson, 2005). Contrast maps were generated for each
participant at the first level. Because the individual functional datasets were all aligned to the
same stereotactic reference space, a random effects group analysis was then performed at the
second level using SPM8. For the whole brain analysis, no cluster emerged for any of the main
effects or their interaction with a probability of p<.05, corrected for multiple comparisons using
Gaussian random field theory and false discovery rate adjustment (q = .05; Chumbley & Friston,
2009).
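To make the motion confound model explicit, the following minimal sketch (an illustration, assuming the six realignment parameters are available as an N-scans × 6 array) builds the 18 nuisance regressors described above:

```python
# Minimal sketch of the 18 motion nuisance regressors: the six realignment parameters,
# their squares, and the first derivatives of the linear terms (cf. Lund et al., 2005).
import numpy as np

def motion_confounds(rp):
    """`rp` is assumed to be an N_scans x 6 array of realignment parameters."""
    linear = rp                                    # 3 translations + 3 rotations
    quadratic = rp ** 2                            # quadratic motion
    derivative = np.vstack([np.zeros((1, rp.shape[1])), np.diff(rp, axis=0)])
    return np.hstack([linear, quadratic, derivative])   # N_scans x 18

# Shape check with random parameters:
assert motion_confounds(np.random.randn(200, 6)).shape == (200, 18)
```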
We derived anatomically defined regions of interest (ROI) from the Automated
Anatomical Labelling library (Tzourio-Mazoyer et al., 2002). The chosen ROIs are those where
overlapping activation sites between music harmony and language syntax have been reported
(see Introduction): bilateral superior temporal gyrus and bilateral Broca’s area divided into pars
opercularis, pars triangularis, and pars orbitalis. The Marseille ROI toolbox version 0.42 (Brett,
Anton, Valabregue, & Poline, 2002) was used to derive average activation levels across voxels
in each ROI based on contrast values generated during the first level analysis with SPM8.
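For illustration, the ROI averaging step amounts to the following operation (a sketch using nibabel with hypothetical file names; the analysis itself was carried out with the MarsBaR toolbox in SPM8):

```python
# Illustrative sketch of averaging a first-level contrast image within an anatomical ROI
# (the study used the MarsBaR toolbox; file names below are hypothetical).
import nibabel as nib
import numpy as np

def roi_mean(contrast_nii, roi_mask_nii):
    """Mean contrast value across all voxels inside a binary ROI mask."""
    contrast = nib.load(contrast_nii).get_fdata()
    mask = nib.load(roi_mask_nii).get_fdata() > 0   # assumes both images are on the same grid
    return float(np.nanmean(contrast[mask]))

# Hypothetical usage: one value per participant, condition, and ROI, e.g.
# roi_mean("con_OR_outofkey_sub01.nii", "AAL_Frontal_Inf_Tri_L.nii")
```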
Inferential analyses on the ROI data were carried out using random permutation based
tests which require no parametric assumptions and have been found to be very powerful tests
for neuroimaging data (Nichols & Holmes, 2002). For the dependent t-test this amounts to
creating a null hypothesis t-distribution by randomly applying condition labels to data points
within each participant 20,000 times and testing the effect of interest on the randomised data
each time. The proportion of randomly obtained t-values equal to or greater than the true t-value
represents the likelihood of obtaining the t-statistic under the null hypothesis, i.e. the p-value.
Similarly, the random permutation based ANOVA randomised labels within each participant but
otherwise in an unrestricted way across experimental factors (Manly, 2007). Planned
comparisons were Bonferroni corrected and only the corrected p-values are reported.
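For the dependent t-test, the permutation scheme described above can be sketched as follows (assuming one contrast value per participant and condition; swapping the two condition labels within a participant is equivalent to flipping the sign of that participant's difference). The actual analysis, including the permutation ANOVA, was more involved:

```python
# Minimal sketch of the within-participant permutation test for a dependent t-test.
import numpy as np
from scipy import stats

def permutation_paired_t(cond_a, cond_b, n_perm=20_000, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    diffs = np.asarray(cond_a) - np.asarray(cond_b)     # one difference per participant
    t_true = stats.ttest_1samp(diffs, 0.0).statistic
    null_t = np.empty(n_perm)
    for i in range(n_perm):
        signs = rng.choice([-1, 1], size=diffs.size)    # random relabelling per participant
        null_t[i] = stats.ttest_1samp(diffs * signs, 0.0).statistic
    # p-value: proportion of permuted t-values at least as large as the observed one
    return t_true, float(np.mean(null_t >= t_true))
```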
Procedure
The experiment was run using Presentation software (http://www.neurobs.com, version 16.2).
The auditory stimuli were played to the participant using MR-compatible Sensimetrics Insert
Earphones (Model S14) at a comfortable level. Of each stimulus set, each participant heard only
one music version but both linguistic versions, i.e. a total of 240 trials. The stimuli were ordered
randomly with the following constraints: (1) no more than three times the same correct answer,
(2) no more than three times the same music condition, (3) no more than three times the same
language condition, (4) at least ten trials between any stimulus set’s SR and OR versions, (5) at
least ten trials between any two songs with the same verb and sentence ending. Every three
participants a new pseudorandomized stimulus order was used. Within each such
participant-triplet, for each trial the musical condition was counterbalanced. Thus, before participant
rejection, the three different music conditions were balanced across participants in terms of trial
position.
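For illustration, the ordering constraints above can be expressed as a simple checker over a candidate trial sequence (a sketch with assumed trial field names; this is not the randomisation script actually used, and constraints 1-3 are interpreted as limits on consecutive repetitions):

```python
# Sketch of a checker for the pseudorandomization constraints listed above (illustrative;
# the field names 'answer', 'music', 'language', 'set', 'verb_ending' are assumptions).

def max_run(values):
    """Length of the longest run of identical consecutive values."""
    longest, run = 1, 1
    for prev, cur in zip(values, values[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest

def satisfies_constraints(trials):
    # (1)-(3): no more than three consecutive trials with the same correct answer,
    # music condition, or language condition.
    for key in ("answer", "music", "language"):
        if max_run([t[key] for t in trials]) > 3:
            return False
    # (4): at least ten trials between a stimulus set's SR and OR versions.
    # (5): at least ten trials between any two songs sharing a verb and sentence ending.
    for key in ("set", "verb_ending"):
        last_seen = {}
        for i, t in enumerate(trials):
            if t[key] in last_seen and i - last_seen[t[key]] < 10:
                return False
            last_seen[t[key]] = i
    return True
```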
Participants were asked to concentrate on the linguistic dimension of the songs. They
were not asked to do a musical task. The experiment was organised as follows. Four example
trials preceded the experimental session. Experimental trials were divided into eight blocks of
30. After four blocks participants could rest for approximately ten minutes while an anatomical
MRI scan was acquired.
Each trial was organised as follows. After a song was played a comprehension prompt
was displayed visually through a projector from outside the scanner room. Subjects saw it
through a mirror attached to the head-coil. Within 10,000ms they had to press a button to
indicate whether the prompt matched the preceding song’s sentence or not. Except for the
example trials, no feedback was given. In order to ensure that participants would process the
full sentences, half the comprehension prompts checked for matrix clause understanding. The
other half focussed on the relative clause. Because of concerns about a possible verb matching
strategy whereby a comprehension prompt is true if it includes the same verb form as in the
song, we also created (1) more challenging passive prompts and (2) prompts with ‘someone’
(‘iemand’) as a singular subject possibly representing either a plural or a singular noun phrase in
the song. Within each comprehension prompt version half the prompts matched the songs.
Stimulus onset was jittered with respect to volume acquisition by varying the intertrial
interval between 3500 and 6000ms in steps of 1ms. During the intertrial interval as well as
during the song presentation a fixation cross was displayed centrally. An experimental session
lasted approximately 100 minutes.
RESULTS
Behavioural Results
Overall, participants were 77% accurate in their comprehension prompt judgements (range
60.17% - 90.01%). All scored above the maximal performance level expected by chance (56%
correct). Figure 2 shows the mean accuracies across the six conditions pooled over all
comprehension prompts. A 2(prompt type: matrix or relative clause) × 2(language) × 3(music)
dependent ANOVA revealed three effects. First, there was a main effect of prompt type
[F(1,20)=165.727, p<.001, ηp² = .892], such that prompts targeting main clause understanding
were easier to answer (87%) than prompts targeting relative clause understanding (66%).
Furthermore, a main effect of linguistic condition was found [F(1,20)=51.987, p<.001, ηp² = .722],
indicating that prompts after SR sentences were answered more accurately (84%) than those
after OR sentences (69%). Finally, these two main effects interacted [F(1,20)=52.308,
p<.001, ηp² = .723]. Follow-up t-tests revealed that the difference between SR and OR sentences
is significant for both kinds of prompts, albeit larger for those targeting relative clause
comprehension [t(20)=7.523, p<.01] than those targeting main clause comprehension
[t(20)=4.060, p<.01]. It should be noted that both in the overall ANOVA and in a separate
ANOVA analysing only relative clause comprehension prompts, there was no main effect of
music, nor did music interact with any of the other conditions (ps>.3).
fMRI Results
Figure 3. fMRI Results in the Right Hemisphere. A) ROI location. B) The music main effect.
Note that the zero point of the BOLD signal in B), i.e. the implicit baseline, is not informative as it includes song beginnings and challenging comprehension prompts rather than just rest. Error = SEM. P-values of follow-up t-tests are Bonferroni adjusted.
Separate randomisation-based two-way ANOVAs revealed a main effect of music in right Broca’s area
pars triangularis (F = 3.41, p<.05; see Figure 3). This music main effect reflects marginally
greater activation in the auditory anomaly condition compared to the out-of-key condition
(t=2.45; p<.07). The contrast with the in-key condition did not reach significance
(t=1.89; p>.2), nor did the in-key vs. out-of-key contrast (t<1). Left Broca’s area pars opercularis
showed a main effect of language (F=5.11, p<.05) indicating greater activation in the OR
condition compared to the SR condition; see Figure 4B.
Figure 4. fMRI Results in the Left Hemisphere. A) ROI location. B) The language main effect.
C) The language × music interaction. Note that the zero point of the BOLD signal in B), i.e. the implicit baseline, is not informative. Error = SEM. P-values of follow-up t-tests are Bonferroni adjusted.
Crucially, the language × music interaction was found in left Broca’s area pars
triangularis (F=4.30, p<.05) and left Broca’s area pars orbitalis (F=5.98, p<.01). Follow-up t-tests
showed that these interactions emerged because the OR>SR contrast only reached
significance in the out-of-key condition (left Broca’s area pars triangularis: t=3.02, p<.02; left
Broca’s area pars orbitalis: t=3.24, p<.02). We also found a language × music interaction in the
left superior temporal gyrus (F=3.82, p<.05). However, in this case, there was only a marginal,
reversed simple language main effect in the auditory anomaly condition. In neither of the other
two music conditions did we find a significant simple language main effect in the left superior
temporal gyrus (ps>.05).