Master Thesis
Title:
Shared Syntactic Processing Resources of Music and Language: a Brain Imaging Study
Author:
Richard Kunert
Cognitive Science Center Amsterdam, University of Amsterdam
rikunert@gmail.com
Supervisor:
Peter Hagoort
Max Planck Institute for Psycholinguistics, Nijmegen &
Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen,
Peter.Hagoort@mpi.nl
Co-Assessor:
Rens Bod
Institute for Logic, Language and Computation, University of Amsterdam
rens.bod@gmail.com
UvA Representative:
Titia van Zuijen
Afdeling Pedagogiek, Onderwijskunde en Lerarenopleiding, University of Amsterdam
T.L.vanZuijen@uva.nl
Shared Syntactic Processing Resources of Music and Language: a Brain Imaging Study
Richard Kunert
Roel Willems
Daniel Casasanto
Aniruddh D. Patel
Peter Hagoort
ABSTRACT
Music and language have been proposed to share basic syntactic integration resources. This
study aimed to find out where in the brain these shared resources reside. As opposed to
previous studies we did not simply look for a conjunction of music and language processing but
instead used a design whereby language processing directly interacts with music processing.
Participants heard songs containing subject-extracted or object-extracted relative clauses
whose critical verb was sung in-key, out-of-key, or unusually loudly. The latter was used to
control for attention capture effects and activated the right hemisphere’s inferior frontal gyrus.
The interaction between language syntax and music harmony, on the other hand, behaved
differently and could be localised in the anterior part of the left hemisphere’s inferior frontal
gyrus: Broca’s area pars triangularis and pars orbitalis. These findings provide direct evidence
for the competition of two different cognitive domains for high-level neural integration resources.
That this supramodal syntactic integration area was found in the anterior part of Broca’s area
rather than in other brain areas associated with both music and language syntax is discussed in
light of theories about Broca’s area function.
INTRODUCTION
Music and language are human abilities whose shared mental and neural underpinnings are
increasingly well understood. One proposed area of overlap lies in the syntactic domain.
Syntactic processing – whether in language or in music – involves the integration of discrete
elements (e.g., words in language, tones/chords in music) into higher order structures (e.g.,
sentences in language and harmonic sequences in music) according to a set of syntactic
principles. The present study aims to localise where in the brain music and language syntactic
integration processes share basic neural resources.
Music syntax as defined in this paper is harmonic in nature (for details see Patel, 2008).
Harmony refers to expectations which are based on the statistical regularities present in
Western tonal music (see Tillmann, Bharucha, & Bigand, 2000). In detail, in the Western
tradition, every musical key, e.g., C-major, is associated with a hierarchy that applies to the
twelve octave-equivalent pitch classes (referred to as the tones C, C#, D, D#, E, F, F#, G, G#,
A, A#, and B). For the purpose of this article it suffices to say that part of the hierarchical
organisation is a distinction between in-key and out-of-key tones. The seven tones which make
up each scale, e.g., for C-major: C, D, E, F, G, A, and B, are more likely to co-occur and, thus,
more stable than the five tones which are not part of a given scale, e.g., for C-major: C#, D#,
F#, G#, and A# (Krumhansl, 1979; Krumhansl & Kessler, 1982). In this sense, harmonic
sequences can be said to be structured in that they create different expectations for different tones:
the higher a tone sits in the hierarchy, the more it is expected to occur in a sequence.
It is important to bear in mind that tone and key perception influence each other. Tones
are perceived in terms of the harmonic context but the harmonic context, in turn, is also
dependent on the incoming tones (Bharucha, 1987; Krumhansl & Kessler, 1982). In this way,
in-key tones are easy to integrate into an established context while out-of-key tones are not easy
to integrate and, thus, could point to a change in harmony. Normal listeners without formal
musical training incorporate these and similar regularities in their music perception (Bigand &
Poulin-Charronnat, 2006).
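To make the in-key/out-of-key distinction concrete, the following minimal sketch (an illustration added here, not part of the study's materials) classifies a pitch class as in-key or out-of-key for a given major key:

```python
# Minimal illustrative sketch: classify a pitch class as in-key or out-of-key
# for a given major key (not part of the study's stimulus construction).

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_SCALE_STEPS = [0, 2, 4, 5, 7, 9, 11]  # semitone pattern of a major scale

def major_scale(tonic):
    """Return the seven pitch classes of the major scale built on `tonic`."""
    root = PITCH_CLASSES.index(tonic)
    return {PITCH_CLASSES[(root + step) % 12] for step in MAJOR_SCALE_STEPS}

def is_in_key(tone, tonic):
    """True if `tone` belongs to the major key of `tonic`, False otherwise."""
    return tone in major_scale(tonic)

# Example for C-major, as in the text:
assert major_scale("C") == {"C", "D", "E", "F", "G", "A", "B"}
assert is_in_key("G", "C")        # in-key tone
assert not is_in_key("F#", "C")   # out-of-key tone
```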
Patel (2008) proposed that the process of syntactic integration of elements is shared
between music and language. This shared syntactic integration resource hypothesis (SSIRH)
was based on the finding that these two cognitive domains elicit a similar event related brain
potential (ERP) component in response to structural violations: the P600 (Patel, Gibson, Ratner,
Besson, & Holcomb, 1998).
Beyond similar effects in music and language processing alone, Patel’s (2008) SSIRH
predicted that when music and language are presented simultaneously their concurrent
processing demands can interfere with each other. This prediction is supported by both
behavioural and electrophysiological findings. When structural integration is hard in both
domains, reading times and speech comprehension suffer more than when only the language or
only the music dimension is difficult to process (Fedorenko, Patel, Casasanto,
Winawer, & Gibson, 2009; Slevc, Rosenberg, & Patel, 2009; see also Hoch, Poulin-Charronnat,
& Tillmann, 2011). Similarly, simultaneous deviations in music and language have also been
shown to interact in terms of the same EEG potentials. The left anterior negativity (LAN) elicited
by linguistic, syntactic anomalies was reduced if presented with a concurrent harmonic deviation
(Koelsch, Gunter, Wittfoth, & Sammler, 2005; Steinbeis & Koelsch, 2008; see also Carrus,
Koelsch, & Bhattacharya, 2011). Furthermore, the early right anterior negativity (ERAN) elicited
by harmonic irregularities was reduced with a concurrent syntactic language violation (Maidhof
& Koelsch, 2011; Steinbeis & Koelsch, 2008; but see Koelsch et al., 2005). This strongly
suggests that syntactic processing in music and language compete for the same neural
resources.
It has been shown that this competition is not located at the level of general attention
processes. In most music-language interference studies – including the present one –
participants are told to ignore the musical dimension and concentrate on the language (but see
Maidhof & Koelsch, 2011; Steinbeis & Koelsch, 2008). One could hypothesise that changes in
linguistic processing occur during music syntax violations simply because harmonic irregularities
are salient events which draw attention away from the language task. However, when attention
is drawn to the musical domain by non-syntactic means, e.g., by loudness increases or timbral
changes, similar behavioural or neural interactions as those elicited by harmonic violations are
not found (Fedorenko et al., 2009; Koelsch et al., 2005; Slevc et al., 2009).
The Present Study
The brain areas underlying these interaction effects are unclear. To date no
music-language interference study has shown the location of the aforementioned behavioural and
electrophysiological interaction effects. Still, a comparison of the localisation results of
experiments investigating either music harmony or language syntax shows a number of
overlapping regions. Firstly, based on brain lesion data (Patel, Iversen, Wassenaar, & Hagoort,
2008) Patel’s (2008) SSIRH predicts Broca’s area in the left inferior frontal gyrus to show an
interaction effect. This is supported by brain lesion work (Drai & Grodzinsky, 2006; Sammler,
Koelsch, & Friederici, 2011), EEG/MEG studies (Friederici, Wang, Herrmann, Maess, & Oertel,
2000; Maess, Koelsch, Gunter, & Friederici, 2001; Villarreal, Brattico, Leino, Ostergaard, &
Vuust, 2011), as well as fMRI experiments (Bookheimer, 2002; Embick, Marantz, Miyashita,
O'Neil, & Sakai, 2000; Kaan & Swaab, 2002; Koelsch et al., 2002; Koelsch, Fritz, Schulze,
Alsop, & Schlaug, 2005; Tillmann et al., 2006).
Secondly, the right hemisphere homologue of Broca’s area is found across syntax
studies of music and language using either EEG/MEG (Friederici et al., 2000; Maess et al.,
2001; Villarreal et al., 2011) or fMRI (Embick et al., 2000; Koelsch et al., 2002; Koelsch et al.,
2005; Tillmann et al., 2006). Thirdly, the bilateral superior temporal gyrus is another region
activated in response to syntax irregularities in either music or language whether measured
electrophysiologically (Sammler et al., 2009; Sammler et al., 2013) or haemodynamically
(Embick et al., 2000; Kaan & Swaab, 2002; Koelsch et al., 2002; Koelsch et al., 2005; Tillmann
et al., 2006).
Still, whether these regions are the locus of basic music-language overlap is difficult to
answer based on the available findings (Peretz & Zatorre, 2005). One problem relates to the
high degree of anatomical variability in the frontal lobes, rendering comparisons of activation
sites across experiments difficult (Amunts et al., 1999; Fischl et al., 2008; Juch, Zimine, Seghier,
Lazeyras, & Fasel, 2005); designs testing music and language within the same individuals are therefore
warranted. Moreover, with separate music and language experiments different neural
generators in the same brain tissue could also be responsible for any spatial overlap in a given
region.
One recent fMRI study by Rogalsky, Rong, Saberi, and Hickok (2011) attempted to
avoid some of these shortcomings by presenting either simple melodies or meaningless
sentences to the same participants. Regions of overlap between music (vs. rest) and speech
(vs. rest) were found in the bilateral superior temporal gyrus, especially primary auditory cortex,
but not in other regions associated with language syntax processing such as Broca’s area.
However, in a multivariate pattern analysis the two modalities could still be distinguished based
on differential activation patterns in overlapping regions of activation. The authors interpreted
this as reflecting different acoustic characteristics of music and speech which are processed
differently in auditory cortex. However, whether this invalidates the proposal for shared syntactic
integration resources between language and music can be questioned. To be brief, it is trivial to
propose that any auditory input – including language and music – is processed first by the same
low-level sensory processing regions. The crucial question is whether shared resources also exist
for specific higher-level cognitive operations. To answer this question, an investigation of the
areas known to be involved in the cognitive operation in
question – syntax in the case of this study – is necessary. As a result we adopted a region of
interest approach focussing specifically on the aforementioned syntax areas.
Furthermore, by adopting an interaction paradigm (Fedorenko et al., 2009) in a brain
imaging setting we are able to go beyond insights into topographical overlap between music and
language processing. Instead, any location of the music-language interaction has to exhibit at
least partially shared neural resources recruited by both cognitive domains. This contrasts with
a topographical overlap which could be due to a local aggregation of functionally independent
modules.
In order to elicit this interaction, we manipulated both music harmony and language
syntax. Participants heard songs containing either a syntactically easy construction with
only a local dependency (SR: subject-extracted relative clause) or a difficult construction
with a non-local dependency (OR: object-extracted relative clause; see Gibson, 1998).
Sentences were sung a cappella (unaccompanied) and the critical word which disambiguated
between these two linguistic options was either sung on a regular tone (in-key tone which is
easy to integrate in the prevailing harmonic context) or on an irregular tone (out-of-key tone
which is not easy to integrate harmonically). Thus, the time point of integration difficulty in music
was aligned with the one in language. A previous behavioural study in English using a similar
design showed an interaction between linguistic and musical conditions in terms of sentence
comprehension (Fedorenko et al., 2009).
This approach is superior to previous studies in this field in the following respects. As
opposed to the aforementioned experiments investigating music-language syntax interactions
with brain measures (Carrus et al., 2011; Koelsch et al., 2005; Maidhof & Koelsch, 2011;
Steinbeis & Koelsch, 2008) our syntactic manipulation did not introduce morphosyntactic rule
violations but instead used two syntactically legal constructions of different integration difficulties
(SR vs. OR). Thus, error monitoring mechanisms – suggested by Rogalsky et al. (2011) to be
the source of previously reported overlapping brain processes – cannot account for our findings.
In other words, rather than investigating shared syntactic integration resources used under the
exceptional circumstances of error processing, we investigated the basic shared machinery
used also when processing grammatically legal stimuli.
Secondly, we control for the possibility that an attentional mechanism explains any interactive pattern
by including an auditory anomaly condition in which the manipulated syllable is presented 10 dB louder than
normal. This is necessary because many of the brain areas associated with music and language
syntax processing are also associated with the bottom-up attention network, including the
bilateral inferior frontal gyrus (Corbetta & Shulman, 2002; Fox, Corbetta, Snyder, Vincent, &
Raichle, 2006) and the superior temporal gyrus (Downar, Crawley, Mikulis, & Davis, 2000; Fox
et al., 2006). Including an auditory anomaly condition allows us to check (a) whether this control
condition does indeed draw attention as previously claimed (Fedorenko et al., 2009) and (b)
whether the interaction of music and language syntax processing is generally attentional in
nature or specifically located at the level of syntax.
If, as predicted by the SSIRH (Patel, 2008), music and language truly use shared brain
resources for syntactic processing, one would expect there to be one or more brain areas
sensitive to the superadditive processing difficulty when integration is challenging in both
domains. Previous research suggests Broca’s area to be one candidate region. Furthermore,
this locus is not predicted to be sensitive to the interaction between language syntax and a
perceptually salient loudness increase at the critical sentence position, as the latter is
acoustic rather than syntactic in nature.
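In terms of the measured response, this prediction corresponds to a positive interaction contrast. The following schematic (with made-up numbers, purely for illustration) spells out the superadditivity criterion:

```python
# Schematic illustration of the predicted superadditive interaction
# (the numbers below are hypothetical, not data from this study).

def interaction_contrast(or_out, sr_out, or_in, sr_in):
    """Positive when the OR-minus-SR difficulty cost is larger with an out-of-key tone."""
    return (or_out - sr_out) - (or_in - sr_in)

# e.g., hypothetical mean responses in a region of interest:
print(interaction_contrast(1.2, 0.5, 0.8, 0.6))  # 0.5 > 0: superadditive pattern
```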
MATERIALS AND METHODS
Participants
Nineteen healthy participants were included in the final analysis (mean age = 22 years, range 18 –
27). No subject had a known history of neurological, language related or hearing problems and
all had normal or corrected-to-normal vision. Five participants were excluded due to technical
difficulties (2) or excessive movement (3). The remaining 7 men and 12 women were all right
handed, native speakers of Dutch with at most six years of formal musical training (mean = 1.9
years). All were naive as to the purpose of the study and were paid for their participation.
Informed consent was obtained from all participants and the study was approved by the local
ethics committee.
Design and Materials
Design
We used a within-subjects 2 (language: subject-extracted relative clauses vs. object-extracted
relative-clauses) × 3 (music: critical note in-key vs. out-of-key vs. auditory anomaly) design.
The language material consisted of 120 sets of sentences, each in two versions as
shown in (1).
(1)
(1a) Subject-extracted (SR)
De dichter die de schrijvers aanmoedigde juichte zeer fanatiek.
Literal: The poet[singular] that the writers[plural] encouraged[singular] cheered[singular] very fanatically.
Correct: The poet that encouraged the writers cheered very fanatically.
(1b) Object-extracted (OR)
De dichters die de schrijver aanmoedigde juichten zeer fanatiek.
Literal: The poets[plural] that the writer[singular] encouraged[singular] cheered[plural] very fanatically.
Correct: The poets that the writer encouraged cheered very fanatically.
The example stimulus in (1) shows that the subject of the matrix clause differed in number from
the noun phrase in the relative clause. Therefore, the number agreement of the relative clause
verb (‘aanmoedigde’) ensured that participants interpreted the relative clause as
subject-extracted (number agreement with matrix clause noun phrase) or as object-extracted (number
agreement with relative clause noun phrase). Note that the critical word which disambiguated
between the two conditions was identical in position and form across different sentence
versions.
Each of these two sentence versions was combined with three versions of a melody
(in-key, out-of-key, auditory anomaly). The three music versions differed only in the tone sung on
the stressed syllable of the disambiguating relative clause verb in terms of pitch (in-key vs.
out-of-key; see Figure 1) or loudness (in-key vs. auditory anomaly). All melodies were composed
specifically for this study by a professional composer (Jason Rosenberg) and recorded by a
trained Dutch singer (Jan-Mathijs Schoffelen).
Figure 1. A sample melody (in the key of C-major). The top system shows the in-key version in
which no note is off the C-major scale. The bottom system shows the out-of-key version in which only the tone coinciding with the stressed syllable of the relative clause verb (highlighted) is not part of the C-major scale.
Sentences
Sentences were on average 10 (SD = 1.3) words long with the disambiguating relative clause
verb always being the sixth word. The matrix subject was plural in half of the SR sentences, i.e.
the plurality of the first noun phrase was not indicative of the linguistic condition. Each verb and
sentence ending was used for two sets which differed in terms of their noun phrases. Between
these two set pairs the plurality of the first noun phrase in each condition was different.
Melodies
Melodies were rhythmically diverse and on average 10.2 seconds long (SD = 1.3) at a tempo of
70 beats per minute, i.e. a quarter note corresponded to a nominal duration of 857ms. The
beginning of each melody established a strong sense of key. Both the in-key and the out-of-key
conditions were in the same key and differed only by one note. This critical tone – coinciding
with the stressed syllable of the relative clause verb – was either part of the established key
(in-key) or not (out-of-key) and always a quarter note in length. Each of the twelve major keys was
used 10 times (10 × 12 = 120 sets). Tones were in the baritone range, i.e. between F#2 (92.5
Hz) and E4 (329.6 Hz).
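As a cross-check (an illustration, not part of the original stimulus preparation), the quoted note duration and pitch frequencies follow directly from the tempo and from equal-temperament tuning:

```python
# Illustrative cross-check of the numbers quoted above (not the study's own code).

# Tempo: 70 beats per minute -> nominal duration of one quarter note in milliseconds.
quarter_note_ms = 60_000 / 70
print(round(quarter_note_ms))      # 857

# Equal temperament: frequency of a MIDI note number relative to A4 = 440 Hz (MIDI 69).
def midi_to_hz(midi_note):
    return 440.0 * 2 ** ((midi_note - 69) / 12)

print(round(midi_to_hz(42), 1))    # 92.5 Hz  -> F#2, lower end of the baritone range
print(round(midi_to_hz(64), 1))    # 329.6 Hz -> E4, upper end of the baritone range
```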
Stimulus Recording
All stimuli were recorded in a soundproof room at the Max Planck Institute in Nijmegen. First,
each song (four per set) was recorded separately in each of the linguistic and harmonic
conditions. Afterwards, all recordings were normalized for loudness level. Next, steps were
taken to control for acoustic cues prior to the critical verb. Of the four recordings per set, one
from each linguistic condition (e.g., SR/in-key and OR/out-of-key) was kept without modification.
To arrive at the other harmonic version of each language condition, the audio signal of the
critical verb in the remaining two recordings (e.g., SR/out-of-key and OR/in-key) was copied into
the corresponding stream of the kept recordings. This exchange of verb recording effectively
changed the music condition of the stimulus (e.g., from SR/in-key to SR/out-of-key). After this
splicing step the new song signal was adjusted in order to avoid the audibility of the verb
recording exchange. To exclude any possible systematic influence of this processing step it was
ensured that an equal number of in-key and out-of-key recordings were left unchanged. Next,
the auditory anomaly condition of each sentence was created. Of the resulting four files the
in-key versions were chosen and the critical tone’s loudness was increased by 10dB in line with
Fedorenko et al. (2009). All audio manipulations were done with the programme Audacity
version 1.3 (audacity.sourceforge.net).
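The splicing and loudness steps described above can be approximated in a few lines of audio processing. The following is a rough sketch with hypothetical file names and timings; the actual stimuli were prepared in Audacity and additionally smoothed around the splice points:

```python
# Rough sketch of the verb splicing and +10 dB steps described above (illustrative only;
# the study used Audacity and additional smoothing around splice boundaries).
import numpy as np
import soundfile as sf  # assumes the (py)soundfile package is available

def splice_verb(target_wav, donor_wav, out_wav, verb_start_s, verb_end_s):
    """Copy the critical-verb segment from `donor_wav` into `target_wav`."""
    target, sr = sf.read(target_wav)
    donor, sr_donor = sf.read(donor_wav)
    assert sr == sr_donor, "recordings must share a sampling rate"
    i0, i1 = int(verb_start_s * sr), int(verb_end_s * sr)
    spliced = target.copy()
    spliced[i0:i1] = donor[i0:i1]   # verb occupies the same position in both recordings
    sf.write(out_wav, spliced, sr)

def boost_critical_tone(in_wav, out_wav, tone_start_s, tone_end_s, gain_db=10.0):
    """Create the auditory anomaly version by raising the critical tone by `gain_db`."""
    audio, sr = sf.read(in_wav)
    i0, i1 = int(tone_start_s * sr), int(tone_end_s * sr)
    boosted = audio.copy()
    boosted[i0:i1] *= 10 ** (gain_db / 20)   # +10 dB = amplitude factor of about 3.16
    sf.write(out_wav, np.clip(boosted, -1.0, 1.0), sr)
```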
fMRI data acquisition and processing
Data acquisition
The experiment was carried out on a 1.5-Tesla MRI scanner (Siemens Avanto, Siemens
Medical Systems, Erlangen, Germany). 33 axial slices were acquired (3.5mm × 3.5mm in-plane
resolution, 3 mm slice thickness, 0.51mm slice spacing, field of view [FOV] = 224mm) covering
the whole brain. We used a single-shot echo-planar imaging (EPI) sequence (repetition time
[TR] = 2140ms, echo time [TE] = 40ms, 90° flip-angle [FA]). After the first of two functional runs
a 3-D T1 scan was acquired (176 slices per slab, voxel size = 1mm × 1mm × 1mm, TR =
2250ms, TE = 2.95ms, FA = 15°, sagittal orientation).
fMRI analysis
Analysis was carried out using SPM8 (www.fil.ion.ucl.ac.uk/spm). The first five volumes were
discarded to avoid equilibrium effects. In order to compensate for small head movements the
remaining images were realigned. Data were spatially smoothed using an 8 mm FWHM
Gaussian kernel. Each functional dataset was co-registered to the participant’s high-resolution
anatomical image. Afterwards, the co-registered EPI dataset was normalised to Montreal
Neurological Institute (MNI) space by linear scaling. The time series were high-pass filtered
with a cut-off period of 128 seconds.
The statistical evaluation was performed using the general linear model. The design
matrix was generated with a synthetic haemodynamic response function modelled on the
manipulated song region, i.e. from the start of the critical verb until the end of the song. We
separately modelled the six conditions of interest and included two nuisance regressors to
capture the effect of functional scanning run as well as 18 nuisance regressors extracting
variability explained by linear motion, quadratic motion and the first derivative of linear motion
(Lund, Norgaard, Rostrup, Rowe, & Paulson, 2005). Contrast maps were generated for each
participant at the first level. Because the individual functional datasets were all aligned to the
same stereotactic reference space, a random effects group analysis was then performed at the
second level using SPM8. For the whole brain analysis, no cluster emerged for any of the main
effects or their interaction with a probability of p<.05, corrected for multiple comparisons using
Gaussian random field theory and false discovery rate adjustment (q = .05; Chumbley & Friston,
2009).
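To make the motion confound model explicit, the following minimal sketch (an illustration, assuming the six realignment parameters are available as an N-scans × 6 array) builds the 18 nuisance regressors described above:

```python
# Minimal sketch of the 18 motion nuisance regressors: the six realignment parameters,
# their squares, and the first derivatives of the linear terms (cf. Lund et al., 2005).
import numpy as np

def motion_confounds(rp):
    """`rp` is assumed to be an N_scans x 6 array of realignment parameters."""
    linear = rp                                    # 3 translations + 3 rotations
    quadratic = rp ** 2                            # quadratic motion
    derivative = np.vstack([np.zeros((1, rp.shape[1])), np.diff(rp, axis=0)])
    return np.hstack([linear, quadratic, derivative])   # N_scans x 18

# Shape check with random parameters:
assert motion_confounds(np.random.randn(200, 6)).shape == (200, 18)
```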
We derived anatomically defined regions of interest (ROI) from the Automated
Anatomical Labelling library (Tzourio-Mazoyer et al., 2002). The chosen ROIs are those where
overlapping activation sites between music harmony and language syntax have been reported
(see Introduction): bilateral superior temporal gyrus and bilateral Broca’s area divided into pars
opercularis, pars triangularis, and pars orbitalis. The Marseille ROI toolbox version 0.42 (Brett,
Anton, Valabregue, & Poline, 2002) was used to derive average activation levels across voxels
in each ROI based on contrast values generated during the first level analysis with SPM8.
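For illustration, the ROI averaging step amounts to the following operation (a sketch using nibabel with hypothetical file names; the analysis itself was carried out with the MarsBaR toolbox in SPM8):

```python
# Illustrative sketch of averaging a first-level contrast image within an anatomical ROI
# (the study used the MarsBaR toolbox; file names below are hypothetical).
import nibabel as nib
import numpy as np

def roi_mean(contrast_nii, roi_mask_nii):
    """Mean contrast value across all voxels inside a binary ROI mask."""
    contrast = nib.load(contrast_nii).get_fdata()
    mask = nib.load(roi_mask_nii).get_fdata() > 0   # assumes both images are on the same grid
    return float(np.nanmean(contrast[mask]))

# Hypothetical usage: one value per participant, condition, and ROI, e.g.
# roi_mean("con_OR_outofkey_sub01.nii", "AAL_Frontal_Inf_Tri_L.nii")
```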
Inferential analyses on the ROI data were carried out using random permutation based
tests which require no parametric assumptions and have been found to be very powerful tests
for neuroimaging data (Nichols & Holmes, 2002). For the dependent t-test this amounts to
creating a null hypothesis t-distribution by randomly applying condition labels to data points
within each participant 20,000 times and testing the effect of interest on the randomised data
each time. The proportion of randomly obtained t-values equal to or greater than the true t-value
represents the likelihood of obtaining the t-statistic under the null hypothesis, i.e. the p-value.
Similarly, the random permutation based ANOVA randomised labels within each participant but
otherwise in an unrestricted way across experimental factors (Manly, 2007). Planned
comparisons were Bonferroni corrected and only the corrected p-values are reported.
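For the dependent t-test, the permutation scheme described above can be sketched as follows (assuming one contrast value per participant and condition; swapping the two condition labels within a participant is equivalent to flipping the sign of that participant's difference). The actual analysis, including the permutation ANOVA, was more involved:

```python
# Minimal sketch of the within-participant permutation test for a dependent t-test.
import numpy as np
from scipy import stats

def permutation_paired_t(cond_a, cond_b, n_perm=20_000, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    diffs = np.asarray(cond_a) - np.asarray(cond_b)     # one difference per participant
    t_true = stats.ttest_1samp(diffs, 0.0).statistic
    null_t = np.empty(n_perm)
    for i in range(n_perm):
        signs = rng.choice([-1, 1], size=diffs.size)    # random relabelling per participant
        null_t[i] = stats.ttest_1samp(diffs * signs, 0.0).statistic
    # p-value: proportion of permuted t-values at least as large as the observed one
    return t_true, float(np.mean(null_t >= t_true))
```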
Procedure
The experiment was run using Presentation software (http://www.neurobs.com, version 16.2).
The auditory stimuli were played to the participant using MR-compatible Sensimetrics Insert
Earphones (Model S14) at a comfortable level. Of each stimulus set, each participant heard only
one music version but both linguistic versions, i.e. a total of 240 trials. The stimuli were ordered
randomly with the following constraints: (1) no more than three times the same correct answer,
(2) no more than three times the same music condition, (3) no more than three times the same
language condition, (4) at least ten trials between any stimulus set’s SR and OR versions, (5) at
least ten trials between any two songs with the same verb and sentence ending. Every three
participants a new pseudorandomized stimulus order was used. Within each such
participant-triplet, for each trial the musical condition was counterbalanced. Thus, before participant
rejection, the three different music conditions were balanced across participants in terms of trial
position.
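For illustration, the ordering constraints above can be expressed as a simple checker over a candidate trial sequence (a sketch with assumed trial field names; this is not the randomisation script actually used, and constraints 1-3 are interpreted as limits on consecutive repetitions):

```python
# Sketch of a checker for the pseudorandomization constraints listed above (illustrative;
# the field names 'answer', 'music', 'language', 'set', 'verb_ending' are assumptions).

def max_run(values):
    """Length of the longest run of identical consecutive values."""
    longest, run = 1, 1
    for prev, cur in zip(values, values[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest

def satisfies_constraints(trials):
    # (1)-(3): no more than three consecutive trials with the same correct answer,
    # music condition, or language condition.
    for key in ("answer", "music", "language"):
        if max_run([t[key] for t in trials]) > 3:
            return False
    # (4): at least ten trials between a stimulus set's SR and OR versions.
    # (5): at least ten trials between any two songs sharing a verb and sentence ending.
    for key in ("set", "verb_ending"):
        last_seen = {}
        for i, t in enumerate(trials):
            if t[key] in last_seen and i - last_seen[t[key]] < 10:
                return False
            last_seen[t[key]] = i
    return True
```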
Participants were asked to concentrate on the linguistic dimension of the songs. They
were not asked to do a musical task. The experiment was organised as follows. Four example
trials preceded the experimental session. Experimental trials were divided into eight blocks of
30. After four blocks participants could rest for approximately ten minutes while an anatomical
MRI scan was acquired.
Each trial was organised as follows. After a song was played a comprehension prompt
was displayed visually through a projector from outside the scanner room. Subjects saw it
through a mirror attached to the head-coil. Within 10,000ms they had to press a button to
indicate whether the prompt matched the preceding song’s sentence or not. Except for the
example trials, no feedback was given. In order to ensure that participants would process the
full sentences, half the comprehension prompts checked for matrix clause understanding. The
other half focussed on the relative clause. Because of concerns about a possible verb matching
strategy whereby a comprehension prompt is true if it includes the same verb form as in the
song, we also created (1) more challenging passive prompts and (2) prompts with ‘someone’
(‘iemand’) as a singular subject possibly representing either a plural or a singular noun phrase in
the song. Within each comprehension prompt version half the prompts matched the songs.
Stimulus onset was jittered with respect to volume acquisition by varying the intertrial
interval between 3500 and 6000ms in steps of 1ms. During the intertrial interval as well as
during the song presentation a fixation cross was displayed centrally. An experimental session
lasted approximately 100 minutes.
RESULTS
Behavioural Results
Overall, participants were 77% accurate in their comprehension prompt judgements (range
60.17% - 90.01%). All scored above the maximal performance level expected by chance (56%
correct). Figure 2 shows the mean accuracies across the six conditions pooled over all
comprehension prompts. A 2(prompt type: matrix or relative clause) × 2(language) × 3(music)
dependent ANOVA revealed three effects. First, there was a main effect of prompt type
[F(1,20)=165.727, p<.001, ηp² = .892], such that prompts targeting main clause understanding
were easier to answer (87%) than prompts targeting relative clause understanding (66%).
Furthermore, a main effect of linguistic condition was found [F(1,20)=51.987, p<.001, ηp² = .722],
indicating that prompts after SR sentences were answered more accurately (84%) than those
after OR sentences (69%). Finally, these two main effects interacted [F(1,20)=52.308,
p<.001, ηp² = .723]. Follow-up t-tests revealed that the difference between SR and OR sentences
is significant for both kinds of prompts, albeit larger for those targeting relative clause
comprehension [t(20)=7.523, p<.01] than those targeting main clause comprehension
[t(20)=4.060, p<.01]. It should be noted that both in the overall ANOVA and in a separate
ANOVA analysing only relative clause comprehension prompts, there was no main effect of
music, nor did music interact with any of the other conditions (ps>.3).
fMRI Results
Figure 3. fMRI Results in the Right Hemisphere. A) ROI location. B) The music main effect.
Note that the zero point of the BOLD signal in B), i.e. the implicit baseline, is not informative as it includes song beginnings and challenging comprehension prompts rather than just rest. Error = SEM. P-values of follow-up t-tests are Bonferroni adjusted.
Separate randomisation-based two-way ANOVAs revealed a main effect of music in right Broca’s area
pars triangularis (F = 3.41, p<.05; see Figure 3). This music main effect reflects marginally
greater activation in the auditory anomaly condition compared to the out-of-key condition
(t=2.45; p<.07). The contrast with the in-key condition did not reach significance
(t=1.89; p>.2), nor did the in-key vs. out-of-key contrast (t<1). Left Broca’s area pars opercularis
showed a main effect of language (F=5.11, p<.05) indicating greater activation in the OR
condition compared to the SR condition; see Figure 4B.
Figure 4. fMRI Results in the Left Hemisphere. A) ROI location. B) The language main effect.
C) The language × music interaction. Note that the zero point of the BOLD signal in B), i.e. the implicit baseline, is not informative. Error = SEM. P-values of follow-up t-tests are Bonferroni adjusted.
Crucially, the language × music interaction was found in left Broca’s area pars
triangularis (F=4.30, p<.05) and left Broca’s area pars orbitalis (F=5.98, p<.01). Follow-up t-tests
showed that these interactions emerged because the OR>SR contrast only reached
significance in the out-of-key condition (left Broca’s area pars triangularis: t=3.02, p<.02; left
Broca’s area pars orbitalis: t=3.24, p<.02). We also found a language × music interaction in the
left superior temporal gyrus (F=3.82, p<.05). However, in this case, there was only a marginal,
reversed simple language main effect in the auditory anomaly condition. In neither of the other
two music conditions did we find a significant simple language main effect in the left superior
temporal gyrus (ps>.05).