Research Project Report
The neurodynamics of auditory letter
perception in blind and sighted humans
Massachusetts Institute of Technology
Computer Science and Artificial Intelligence Laboratory
Oliva Lab for Computational Perception & Cognition
17 February 2015 - 17 August 2015 (36 EC)
Research Master Brain and Cognitive Sciences
Cognitive Neuroscience track
University of Amsterdam
Student ID 10629653
Author
Verena R. Sommer, BSc
Supervisor
Santani Teng, PhD
Abstract
Information can reach the brain via multiple modalities. The current experiment accompanies a main study investigating whether stimuli containing the same information in different sensory modalities are represented by the brain in the same way, i.e. modality-independently. In magnetoencephalography (MEG) experiments we found differences in the neural dynamics of Braille reading in blind versus visual reading in sighted participants. To investigate whether these dissimilarities are due to processing information from different modalities or to inherent population differences, here we presented identical stimuli, i.e. auditory letters, to both the blind and the sighted group, the results of which are presented in this paper. Using multivariate pattern analyses (MVPA) and representational similarity analyses (RSA), we decoded letter identity from the MEG responses, and thus compared the brain representations of blind and sighted subjects listening to spoken letters. We found major similarities in (1) the time courses of decoding accuracies, (2) the temporal generalization behavior of the classifiers, and (3) the correlations of representational patterns of letter processing. These results reveal great homogeneity in the neurodynamics of auditory letter perception in blind and sighted humans, suggesting that there are modality-dependent differences in processing information from tactile (Braille) and visual stimuli, rather than intrinsic population differences between the blind and sighted.
Keywords: multivariate pattern analysis (MVPA), representational similarity analysis (RSA), magnetoencephalography (MEG), linear pattern classification, letter decoding, time-time analysis, multimodal sensory processing, auditory perception, neural representation, modality dependence
Contents
1 Introduction
2 Materials and Methods
  2.1 Subjects
  2.2 Experimental design
  2.3 MEG preparation and acquisition
  2.4 Statistical analysis
    2.4.1 Multivariate analysis of MEG data
    2.4.2 Temporal generalization (time-time decoding)
    2.4.3 Within- and between-group correlations
    2.4.4 Behavioral performance
    2.4.5 Significance testing
3 Results
  3.1 Time course of letter decoding
  3.2 Temporal generalization of letter decoding
    3.2.1 Transient neural activity
    3.2.2 Persistent neural activity
  3.3 Comparing letter representation within and between groups
  3.4 Behavioral performance
4 Discussion
Glossary
1 Introduction
How the brain processes information from different sensory modalities is an essential and frequently studied question in cognitive neuroscience. There has been great advancement in unraveling the neural mechanisms underlying, e.g., visual and auditory perception, both at the unimodal and the multimodal level. Areas associated with visual letter perception (inferior occipital-temporal cortex), auditory letter perception (superior temporal cortex), and the multimodal integration of visual and auditory letter processing (superior temporal sulcus and gyrus) have been identified (van Atteveldt et al., 2004). Furthermore, modality-dependent processing differences have been studied in extreme cases of sensory loss, such as blindness and deafness. Braille reading in blind humans is among numerous non-visual tasks associated with activity in "visual" cortical regions (Merabet and Pascual-Leone, 2010; Pascual-Leone et al., 2005). Moreover, it has been shown that tactile discrimination tasks activate primary and secondary visual cortices in blind Braille readers but not in sighted subjects (Sadato et al., 1996).
However, all these results from prior neuroimaging studies are based on univariate analyses and do not allow conclusions about the representational structure of information carried by those brain regions. In 2001, Haxby et al. first described multivariate pattern analysis (MVPA) of functional magnetic resonance imaging (fMRI) data as an alternative to traditional univariate analysis, e.g. with a general linear model (GLM) (Haxby et al., 2001). This enabled the investigation of how information is encoded in neural activity patterns, rather than simply where in the brain particular processes are performed (Haxby, 2012). Furthermore, MVPA allows the comparison of different response patterns and the investigation of their similarity or dissimilarity, a method called representational similarity analysis (RSA) (Kriegeskorte et al., 2008). Several studies have successfully used representational dissimilarity matrices (RDMs) to analyze brain data beyond univariate patterns (Kriegeskorte et al., 2008). These methods are now applied in numerous studies, mainly using machine-learning classifiers to decode brain activity data, i.e. to show that stimulus or task information can be reconstructed (decoded) from non-invasive measures of brain activity (Haynes and Rees, 2006; Tong and Pratte, 2012). These decoding techniques have primarily been used in fMRI studies to characterize the information carried by particular regions (Haxby, 2012; Norman et al., 2006). Recently, classifiers have also been applied to time-resolved methods, including neural recordings, electroencephalography (EEG), and magnetoencephalography (MEG), to characterize the temporal information carried by a given representation (Cichy et al., 2014; Carlson et al., 2013; Isik et al., 2014; King and Dehaene, 2014). The way trained pattern classifiers generalize over time and conditions can provide an understanding of the temporal structure of information processing (King and Dehaene, 2014). These techniques have been used to study the activity patterns associated with numerous brain functions, especially sensory processing, for example vision (Carlson et al., 2013) and audition (King and Dehaene, 2014).
What has not been examined before is whether these methods also allow comparison of the structure of information assessed through different sensory modalities, such as the tactile, visual, and auditory modality, in both sighted and blind humans. Furthermore, very few studies have investigated the neural mechanisms of letter recognition in general, and Braille-related neural activation in the blind in particular, using MVPA or similarity analysis. With our research we addressed these questions and investigated the modality dependence of temporal information structure. The present study complements a larger study comparing Braille and visual letter processing (preliminary results in Teng et al. (2015)). In the main study, we compared the MEG responses to letter presentation in different modalities between blind and sighted groups. We reasoned that if Braille-visual processing differences reflected sensory processing rather than inherent group differences, then presenting the same modality to both groups should elicit strongly similar responses. Thus, we conducted an MEG study investigating auditory letter processing in both blind and sighted subjects using the same general paradigm as in the main study.
2 Materials and Methods
2.1 Subjects
Fourteen volunteers (seven female) with self-reported normal hearing participated in the study,
comprising seven subjects (three female) in the sighted group and seven subjects (four female)
in the blind group. The subjects’ age ranged from 18 to 36 years (mean ± SD: 28.07 ± 5.06
years). Table 1 shows a detailed summary of the subjects’ demographic data. All blind
subjects were congenitally blind and proficient Braille readers. All sighted subjects had normal or corrected-to-normal vision. All participants were native, bilingual, or very early learners of English.
Table 1: Demographic data of all, sighted, and blind participants in the experiment. Min = minimum, Max = maximum, SD = standard deviation.
                    Total     Sighted   Blind
                    (n=14)    (n=7)     (n=7)
Gender   Female     7         3         4
         Male       7         4         3
Age      Min        18        18        24
         Max        36        36        34
         Mean       28.07     28.57     27.57
         SD         5.06      6.63      3.31
The subjects gave written informed consent after getting written and verbal information
about the experimental procedures. For the blind subjects the consent form was available in
digital form that they could read with a screen-reader before the day of the experiment, as well
as in Braille once they were in the laboratory. Participants were financially compensated for
taking part in the study and free to withdraw from the experiment at any time. The study was
reviewed and approved by the Committee on the Use of Humans as Experimental Subjects
(COUHES) at the Massachusetts Institute of Technology (MIT). Sighted participants were recruited via the Department of Brain and Cognitive Sciences at MIT, and blind participants were recruited via contacts from previous experiments and by word-of-mouth advertising. All of the blind and three of the sighted subjects had already participated in the Braille and visual portions of the study, respectively.
2.2 Experimental design
Stimuli. Stimuli were audio recordings of twelve spoken letters of the American English alphabet (namely B, C, D, E, L, M, N, O, V, X, Y, Z), with E and O as target letters for a discrimination task. We used this subselection of the alphabet to gain greater power through more repetitions per letter. The same subselection approach was also used in the Braille part of the main study. The duration of the non-target letter presentations ranged between 349 ms and 500 ms depending on the letter (mean: 423.2 ms). The target letter sounds lasted 342 ms and 360 ms, respectively.
Procedure. The letters were presented in random order, with target stimuli (letters E and O) occurring every three, four, or five (on average every four) trials. In this way, each run contained on average 25 target stimuli, randomly selected between E and O for each occurrence. The task was to respond to each target letter with a button press; target stimulus trials were excluded from further analyses. Each of the ten non-target letters appeared ten times per run, i.e. 100 non-target letters per run in total. Stimulus onset asynchrony (SOA) was 1000 to 1100 ms after a non-target letter, and 2000 ms after a target letter. Eye-tracking served as an additional control of the sighted participants' vigilance. Eye-tracking was not performed on blind subjects because they may lack eyes or show involuntary eye movements. After each run, the subjects could take a break and decide on their own when to continue with the next run. The experiment contained twelve runs in total and, including subject preparation, lasted approximately one hour.
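For concreteness, the following MATLAB sketch illustrates how a trial sequence satisfying these constraints could be generated (an assumed illustration, not the actual presentation code; variable names are hypothetical):

% Sketch: build one run's trial sequence (assumption, not the actual stimulus code).
nonTargets = {'B','C','D','L','M','N','V','X','Y','Z'};   % the 10 non-target letters
targets    = {'E','O'};                                    % the 2 target letters
seq = repmat(nonTargets, 1, 10);                           % each non-target 10 times -> 100 trials
seq = seq(randperm(numel(seq)));                           % shuffle the non-target order
trialSeq = {};
i = 1;
while i <= numel(seq)
    gap = randi([3 5]);                                    % 3-5 non-targets before the next target
    trialSeq = [trialSeq, seq(i:min(i+gap-1, numel(seq))), targets(randi(2))];
    i = i + gap;
end
% trialSeq now holds roughly 125 letters: 100 non-targets plus, on average, ~25 targets.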
Figure 1: Experimental design. Example of a sequence of events of two letter stimuli: B (non-target stimulus) and O (target stimulus). A run starts with the button press by the subject and comprises presentation of on average 125 stimuli with stimulus onset asynchrony (SOA) between 1000 and 1100 ms after a non-target letter, and 2000 ms after a target letter. Button press is desired after a target stimulus (grey button icon) and required to start a run (black button icon). Loudspeaker icons indicate whether or not a stimulus is presented to the subject. The experiment comprises twelve runs. For blind subjects, the instructions were given and confirmed verbally before the start of the experiment.
2.3 MEG preparation and acquisition
Data acquisition took place in the magnetoencephalography (MEG) laboratory of the McGovern Institute for Brain Research at MIT. After giving written consent, subjects were prepared for the MEG experiment. Five head position indicator coils were placed on the subjects' heads, and their positions, as well as the positions of three fiducials and multiple other points over the head surface, were digitized. After that, participants were positioned in the MEG scanner. Continuous MEG signals from 306 channels (204 planar gradiometers, 102 magnetometers) were recorded and filtered between 0.03 and 330 Hz. We preprocessed the raw data with spatiotemporal filters (Maxfilter software, Elekta, Stockholm). Data preprocessing and analysis were performed with Brainstorm (Tadel et al., 2011) and custom analysis code in MATLAB (Release 2014a, The MathWorks, Inc., Natick, MA, USA). We extracted MEG trials with a 200 ms baseline and 1000 ms post-stimulus period (i.e. 1201 ms length), removed the baseline mean of each channel, and applied a low-pass filter at 30 Hz.
2.4 Statistical analysis
2.4.1 Multivariate analysis of MEG data
Multivariate analysis of MEG data (see Figure 2) was performed using linear support vector machines (SVM; Chang and Lin (2011); Müller et al. (2001)). SVM analysis was conducted for each participant independently. MEG data were arranged in the form of 306-dimensional measurement vectors for each time point (200 ms before to 1000 ms after stimulus onset), generating pattern vectors for each time point and condition. To improve the signal-to-noise ratio and manage computational load, the pattern vectors were subaveraged in random groups of 10, and these N subaverages were used in further analyses. The SVM classifier was trained to decode any two conditions (letters) pairwise using supervised learning with a leave-one-out cross-validation procedure. That is, for each time point and condition pair, the training set included N-1 pattern vectors (subaverages), whereas the remaining Nth pattern vector served as the testing set. We evaluated the classifier's performance in separating the two conditions, and repeated this procedure 100 times by randomly reassigning the data to training and testing sets. The average accuracy over the 100 permutations served as the pairwise decoding accuracy for that time point. In this way, the overall decoding accuracy of the classifier was assessed for each pair of the ten stimulus conditions. From this, we generated a 10 x 10 matrix of pairwise decoding accuracies for each time point. The higher the decoding accuracy for two conditions, the more dissimilar the conditions are; the matrix can therefore be interpreted as a representational dissimilarity matrix (RDM). The mean of the pairwise accuracies at each time point can then be plotted to obtain the time course of classification accuracy, together with the standard deviation of the mean of all pairwise accuracies at each time point.
Figure 2: Multivariate analysis of MEG data. The values from all 306 channels of the MEG raw data at time point t were extracted, and arranged in the pattern vector of t. In this way, one pattern vector per trial per condition was generated, for each pairwise comparison. To classify between two conditions, e.g. letters M and V, the pattern vectors (subaverages) were randomly assigned to a training set and a testing set via leave-one-out cross-validation. The support vector machine (SVM) learned to discriminate each pair of conditions with the training set and its accuracy was tested on the testing set. This procedure was repeated 100 times, each time randomly assigning new training and testing sets. The classification accuracy for each pair was mapped in a representational dissimilarity matrix (RDM) of time point t. By averaging the decoding accuracies of all pairs the overall accuracy of time point t can be calculated. Doing this for all time points resulted in the time course of classification accuracy. Figure adapted and modified from Cichy et al. (2014) and Teng et al. (2015).
2.4.2 Temporal generalization (time-time decoding)
In addition to the one-dimensional analysis of letter decodability (see 2.4.1), we also investigated the presence and disparity of transient and persistent neural activity during letter perception. Persistent representations could be important for preserving the outcomes of particular neural processes. To do so, we conducted a time-time decoding analysis (Cichy et al., 2014), also called the temporal generalization method (King and Dehaene, 2014), as illustrated in Figure 3. Because the one-dimensional time course analysis trains and tests at the same time point, it only measures instantaneous decodability, i.e. decodability at that time point, whereas the time-time analysis can distinguish dynamics over time.
Figure 3: Time-time decoding. The values from all 306 channels of the MEG raw data at time points tx and ty were extracted and arranged in pattern vectors. We trained an SVM to discriminate between the brain responses of each pair of conditions at time point tx and tested its accuracy on responses to the same letters at a different time point ty. All pairs of conditions were classified and averaged to obtain the overall decoding accuracy. The average decoding accuracy was mapped at point (tx, ty) of a time-time decoding matrix. This was done for all pairs of time points.
To distinguish between MEG signals induced by persistent and transient neural representations, we trained an SVM at one time point and tested it at the other time points (see Figure 3). This shows how well the classifier can generalize over time, and thus whether persistent similarities exist in the auditory representations of letter perception. We performed all pairwise classifications of conditions and obtained one MEG decoding matrix for every pair of training time point and testing time point. Averaging across the decoding matrix for each train-test combination, and then repeating the procedure for all train-test combinations of time points, resulted in a time-time decoding matrix (temporal generalization matrix).
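The corresponding loop structure might look as follows; this is a sketch for a single condition pair, omitting the 100 random reassignments for brevity, and the array layout is an assumption rather than the authors' actual implementation.

% Sketch: temporal generalization matrix for one condition pair.
% dataA, dataB: [N x 306 x T] subaveraged pattern vectors over T time points.
T  = size(dataA, 3);
tg = zeros(T, T);                               % time-time decoding matrix
for tx = 1:T                                    % training time
    Xtr = [dataA(1:end-1, :, tx); dataB(1:end-1, :, tx)];              % train at tx
    ytr = [ones(size(dataA, 1) - 1, 1); 2*ones(size(dataB, 1) - 1, 1)];
    mdl = fitcsvm(Xtr, ytr, 'KernelFunction', 'linear');
    for ty = 1:T                                % testing (generalization) time
        Xte = [dataA(end, :, ty); dataB(end, :, ty)];                   % test at ty
        tg(tx, ty) = mean(predict(mdl, Xte) == [1; 2]);
    end
end
% Averaging such matrices over all condition pairs gives the temporal generalization
% matrix (Figure 7); its diagonal reproduces the ordinary decoding time course.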
A time-time decoding matrix yields detailed information about the temporal structure of
information processing in the brain. Figure 4 illustrates two possible outcomes of the
time-time analysis, namely underlying transient and persistent neural activity (for more examples
and simulations see King and Dehaene (2014)). A diagonal-shaped decoding performance
demonstrates high decoding accuracy at small distances between training and testing time
points, and poorer accuracy at larger distances, i.e. generalization over only a small amount of time. This means that the neural processes are dynamically changing over time, i.e. transient.
A square-shaped time-time decoding performance illustrates high classification accuracy for
nearby as well as distant time points, indicating greater temporal generalization and thus more
sustained underlying brain processes.
Figure 4: Conceptual sketch of temporally persistent and transient decoding performance in a time-time decoding matrix. The different generalization behaviors of a classifier illustrate distinct temporal structures of the underlying brain processes. Figure adapted and modified from Teng et al. (2015).
2.4.3 Within- and between-group correlations
The time-time method works within individuals, but because the MEG signal is highly sensitive to small variations in brain topography and head position, it is inappropriate for classification across individuals. Thus, we compared the MEG decoding matrices between subjects to assess across-subject similarities in the neural representation of letters.
For this, we performed Spearman rank correlations, which are conservative and do not assume normally distributed data, within and between the blind and sighted groups. The averaged MEG representational dissimilarity matrices (RDMs) of each time point (from 200 ms before until 1000 ms after stimulus presentation) yield the time course of classification accuracy (see 2.4.1). We correlated the unaveraged RDMs of each pair of subjects across time points and thus obtained a correlation matrix for each pair of subjects (see Figure 5). We then averaged the correlation matrices to obtain the mean within- and between-group correlations. In this way, we can investigate if and when representations are (in-)consistent within a group and (dis-)similar across groups.
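Schematically, the correlation matrix for one subject pair could be computed as follows. This is a sketch assuming each subject's unaveraged RDMs are stored as a 10 x 10 x T array; corr with the 'Spearman' option is from the Statistics and Machine Learning Toolbox, and this is not the authors' actual code.

% Sketch: Spearman correlation matrix for one pair of subjects.
% rdm1, rdm2: [10 x 10 x T] RDMs (pairwise decoding accuracies) of two subjects.
T    = size(rdm1, 3);
mask = logical(tril(ones(10), -1));              % lower triangle: the 45 letter pairs
C    = zeros(T, T);
for t1 = 1:T                                     % time point in subject 1
    v1 = rdm1(:, :, t1);
    v1 = v1(mask);                               % vectorize the 45 pairwise accuracies
    for t2 = 1:T                                 % time point in subject 2
        v2 = rdm2(:, :, t2);
        v2 = v2(mask);
        C(t1, t2) = corr(v1, v2, 'Type', 'Spearman');   % Spearman's rho
    end
end
% Averaging C over all subject pairs within a group gives the within-group maps
% (Figure 8); pairing subjects across groups gives the between-group map (Figure 9).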
2.4.4 Behavioral performance
The subjects’ task was to respond to every occurrence of the letters E and O by button press.
To measure the participants’ performance, we calculated the hit, correct rejection, false alarm,
and miss rates in percent, and from this the sensitivity index d'. The task mainly served as a control of the participants' vigilance.
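For reference, the sensitivity index is computed from the hit and false-alarm rates as d' = z(hit) - z(false alarm); a minimal MATLAB sketch (norminv is from the Statistics and Machine Learning Toolbox; the variable names are hypothetical):

% Sketch: sensitivity index d' for one subject.
% hitRate, faRate: proportions in (0, 1); rates of exactly 0 or 1 would need
% a standard correction before applying the inverse normal transform.
dprime = norminv(hitRate) - norminv(faRate);     % z(hit) - z(false alarm)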
Figure 5: Spearman correlation of two subjects. The representational dissimilarity matrices (RDMs) of the classified MEG data from each pair of subjects (with one subject on the x-axis and the other subject on the y-axis) were correlated across time points, resulting in a color-coded correlation matrix with warmer colors representing higher correlation coefficients (Spearman's ρ). Figure from Teng et al. (2015).
2.4.5 Significance testing
To test for significance in the MEG data analyses, we used non-parametric statistical tests that do not make assumptions about the data distribution. For the MEG decoding time courses, we performed Wilcoxon signed-rank tests of the null hypothesis of no experimental effect, i.e. 50% (chance level) decoding accuracy, at each time point (significance level α = 0.05). We corrected for multiple comparisons using the false discovery rate (FDR; Benjamini and Hochberg (1995)). We defined the threshold for significant decodability of neural activity as ten consecutive time points (10 ms) of decoding accuracies significantly above 50%, and analogously, ten consecutive time points of decoding accuracies not significantly above 50% as the threshold for breaks in decodability (Safford et al., 2010). By bootstrapping, we estimated mean peak latencies and onsets of decodability and report the mean ± standard deviation of the bootstrap distribution.
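Schematically, the per-time-point test with FDR correction could look as follows (a sketch, not the authors' code; signrank is MATLAB's Wilcoxon signed-rank test, and the Benjamini-Hochberg step is written out explicitly rather than relying on a toolbox function):

% Sketch: test decoding accuracy against 50% chance at each time point,
% with Benjamini-Hochberg FDR correction at q = 0.05.
% acc: [nSubjects x T] decoding accuracies (in percent) per subject and time point.
T = size(acc, 2);
p = zeros(1, T);
for t = 1:T
    p(t) = signrank(acc(:, t), 50);              % H0: median accuracy equals 50%
end
q = 0.05;
[ps, order] = sort(p);                           % p-values in ascending order
crit = (1:T) / T * q;                            % Benjamini-Hochberg thresholds k/T * q
kmax = find(ps <= crit, 1, 'last');              % largest rank passing its threshold
sig  = false(1, T);
if ~isempty(kmax)
    sig(order(1:kmax)) = true;                   % time points significant after FDR
end
% Onsets of decodability would then require at least ten consecutive significant
% time points, following the criterion described above.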
To compare the MEG decoding time courses between groups we used a two-sided Wilcoxon rank-sum test (Mann-Whitney U test). To test for significant within- and between-group correlations between patterns of decoding accuracies (RDMs), we performed Wilcoxon signed-rank tests at each time point combination with the null hypothesis of no correlation, i.e. ρ = 0 (significance level α = 0.01). FDR was used to correct for multiple comparisons.
To test for group differences in the subjects' task performance, two-sided Student's t-tests were performed with a significance level of α = 0.05.
All analyses and tests were performed in MATLAB (Release 2014a or 2015a, The MathWorks, Inc., Natick, MA, USA).
3 Results
3.1 Time course of letter decoding
The average time courses of decoding accuracies show strong similarities in auditory letter
classification between the sighted and the blind group (Figure 6A), whereas they show large
differences between the visual and Braille letter classification from our main study (Figure
6B). Before stimulus presentation, the average decoding accuracies fluctuate around chance
level (50%). Just after stimulus onset (0 ms) the mean auditory letter classification accuracies
increase steeply, reaching significance at 29 ± 19.3 ms (SD of the bootstrap distribution) for
the sighted, and 49 ± 24.9 ms for the blind. This is the time point when the MEG signals
begin to distinguish between the different letter conditions. Peak decodability, i.e. the time
point for which the individual auditory representations are most distinctive in terms of linear
separability, is reached at 103 ± 7.5 ms in the sighted group with 84.7 ± 6.3% decoding
accuracy. After that, the classification accuracy in the sighted group decreases gradually, with
interruptions in significance at 626-672 ms and 756-841 ms, until the letters are not decodable
any more after 862 ms. In contrast, the blind group's decoding accuracy reaches a first peak of 77.3 ± 10.3% at 105 ± 59.6 ms, followed by a short decline and a second, lower peak of 75 ± 6.1% at 256 ms. Overall, the blind group exhibits significantly lower decoding accuracies than the sighted group (two-sided Wilcoxon rank-sum test: p = 0.0085), without a difference in variation (p = 0.0877).
In contrast to the auditory letter classification, visual and Braille letter classifications show
large dissimilarities in the overall decoding time courses (Figure 6B). Whereas visual letter
decoding exhibits a more rapid onset, higher peak decodability, and a steep fall, the Braille decoding time course rises and falls more gradually.
Figure 6: Time courses of the overall SVM classifications. A The average auditory letter decoding accuracies of 7 sighted (left) and 7 blind (right) subjects plotted from 200 ms before to 1000 ms after stimulus onset (0 ms). Blue and red bar on top denote time points significantly (p < 0.05) different from 50% chance level (dotted line), shading indicates standard deviation of the mean, gray bar at the bottom illustrates mean stimulus duration. B The average letter decoding time course of 13 sighted subjects reading visual letters (left) and 11 blind subjects reading Braille letters (right) from the main study.
3.2 Temporal generalization of letter decoding
3.2.1 Transient neural activity
The mean time-time decoding matrices of the sighted and blind participants (Figure 7) show a predominantly diagonal shape, with decoding accuracies that are higher near the diagonal than at more distant time points, and thus neural activity during letter processing is mainly transient and dynamic (King and Dehaene, 2014). The diagonals of the time-time decoding matrices illustrate the same information as the time course of decoding accuracy described in the previous section, i.e. training and testing at the same time points (see 3.1). In this form of graphical presentation it is especially apparent that after the first peak at approximately 100 ms, the classification accuracy gradually decreases during and beyond stimulus presentation in the sighted group, whereas in the blind group it first gradually decreases, then peaks a second time at approximately 250 ms, and then declines again.
Figure 7: Time-time MEG decoding matrices. Heat map of the average temporal general-ization of letter classification for 7 sighted (left) and 7 blind (right) subjects. The color code illustrates decoding accuracy of the SVM classifier for 200 ms before to 1000 ms after stimulus onset (0 ms) of training time (x-axis) and testing (generalization) time (y-axis), with warmer colors representing higher classification accuracy. The dark bars at the axes show average stimulus duration.
3.2.2 Persistent neural activity
In addition to transient neural activity, we also detected evidence for persistent neural activity, which is visible in the time-time decoding matrices at time point combinations away from the diagonal (Figure 7). Overall, the decoding accuracies away from the diagonal are at chance level (50%). However, at time point combinations between approximately 300 ms and 600 ms for the sighted, and between approximately 350 ms and 700 ms for the blind, the diagonals broaden, indicating higher temporal generalization toward the end of and after stimulus presentation. In addition to a wider diagonal, a square of lighter blue (higher decoding accuracies) as compared to the rest of the time points away from the diagonal is observable in the blind group from approximately 350 ms until 1000 ms. Also only apparent in the blind group is a small increase of temporal generalization at the time of the second peak decodability (at ∼250 ms).
Additionally, small regions of below-chance predictions are observable away from the diagonal shortly after the onset of stimulus presentation. In the sighted group, these dips of decodability are relatively deep (down to 28.6% at 145 ms training time and 90 ms testing time) and short, whereas in the blind group the decrease is less extreme and more sustained, lasting until approximately 300 ms after stimulus onset.
3.3 Comparing letter representation within and between groups
We conducted Spearman rank correlations of the letter classifications across time within the sighted and blind groups, respectively, as well as between the two groups. The resulting matrices of the average correlations (Spearman's ρ) of each pair of subjects and the corresponding significance matrices are shown in Figure 8 for the within-group comparisons and in Figure 9 for the between-group comparison.
Consistency within the sighted group
Correlation within the sighted group is significantly elevated between approximately 50 ms and 150 ms after stimulus onset (see the square-shaped significance cluster in Figure 8B). Maximal similarity in letter representation between the sighted subjects occurs at 100 ms after stimulus onset (ρ = 0.61 ± 0.15). Besides this cluster of significance shortly after stimulus onset, there are several smaller clusters of significant correlation later on, mostly around the diagonal. The correlation is generally higher during stimulus presentation. The last time point of significant correlation is at ∼570 ms with ρ = 0.18 (± 0.12).
Consistency within the blind group
The auditory letter representations of the blind subjects start to correlate significantly at approximately 100 ms after stimulus onset, with a maximal correlation of ρ = 0.41 (± 0.23) at 105 ms. A relatively large cluster of significant correlation is observable between approximately 100 ms and 190 ms on the x-axis and approximately 100 ms and 250 ms on the y-axis. This asymmetric correlation cluster could indicate one-sided similarity of different time points in the course of letter perception among the blind subjects. However, it could also be noise, because the same subjects can appear on both the x- and the y-axis in the pairwise correlations. It is possible that more asymmetrical RDM correlations indicate more general variability in the population. The letter representations in the blind correlate with each other at more time points during and after stimulus presentation, also relatively far away from the diagonal, as compared to the sighted group.
Figure 8: A Within-group correlations. The average correlations between patterns of decoding accuracies (RDMs) across time of each pair of subjects within the sighted (left; n = 7) and the blind (right; n = 7) group plotted as heat maps with warmer colors representing higher correlation coefficients ρ. The dark bars at the axes show average stimulus duration. (Note the different color coding keys for the sighted and blind correlations.) B Significance of within-group correlations. The outcomes of corrected Wilcoxon signed-rank tests at each time point of the within-sighted (left) and within-blind (right) correlations are illustrated with yellow color denoting significance (p < 0.01).
Similarity between the sighted and blind groups
The correlation of letter representations between the sighted and blind groups reveals pronounced similarity, with a maximal correlation of ρ = 0.49. In general, the sighted and blind groups show significantly elevated correlation around the diagonal between approximately 50 ms and 450 ms. Furthermore, this cluster of significance also includes time points more distant from the diagonal, namely at approximately 100-200 ms for the sighted participants and 200-450 ms for the blind participants, but not vice versa (see the triangle-shaped correlation and significance cluster in Figure 9).
In addition to this large cluster, significant correlation occurs at numerous other time points,
indicating large similarity between auditory letter perception in sighted and blind subjects at
many time points throughout and beyond stimulus presentation.
Figure 9: A Between-group correlation. The average correlations between patterns of de-coding accuracies (RDMs) across time of each pair of subjects between sighted (x-axis; n = 7) and blind (y-axis; n = 7) plotted as heat maps with warmer colors representing higher correlation coefficients ρ. The dark bars at the axes show average stimulus duration. B Signif-icance of between-group correlations. The outcomes of corrected Wilcoxon signed-rank tests at each time point of the between-group correlations are illustrated with yellow color denoting significance (p < 0.01) and green denoting no significance (n.s.).
3.4 Behavioral performance
The subjects performed well in the vigilance task, indicating that they were consciously listening to the spoken letters at all times. One sighted subject's behavioral data were not recorded correctly due to a computer failure and had to be excluded from the analysis. Figure 10 shows
the average rates of hits, correct rejections, false alarms, and misses, as well as the sensitivity
index d’ for the sighted and blind participants, respectively. The two groups did not differ
significantly in any of the performance measures.
Figure 10: Task performance of the sighted and blind group. Average hit, correct rejection, false alarm, and miss rate in percent (left), and sensitivity index d’ (right) for the sighted (n = 6) and blind (n = 7) group, respectively. Error bars indicate standard deviation of the mean. Differences between the groups are not significant.
4 Discussion
Using multivariate pattern analysis (MVPA) and representational similarity analysis (RSA)
methods on MEG data, we decoded the identity of auditory spoken letters presented to blind
and sighted humans. We showed that letters were distinguishable significantly above chance
from stimulus onset until after stimulus offset. In the main study, which compared letter perception in the tactile and visual modalities between blind and sighted subjects, respectively, we hypothesized that the observed differences between Braille and visual letter processing were due to processing in different modalities, and not due to general differences between the brains of the two populations. Hence, we reasoned that processing of identical stimuli, namely auditory letters, should be similar between the two groups.
Consistent with our predictions, we found similar structures in the decoding time courses of blind and sighted subjects. The overall shapes of the time courses are very similar, especially in comparison to the relatively large differences between the decoding time courses of Braille versus visual letters (Teng et al., 2015). Both the blind and the sighted decoding profiles show a rapid and large increase in decodability right after stimulus onset, followed by a less steep falloff after approximately 100 ms until decodability is back at baseline after stimulus offset. This suggests equivalent temporal structures of the neural representations in the blind and the sighted while listening to spoken letters. However, we found some differences as well. The blind decoding time course exhibits a second, lower peak shortly after the initial peak decodability that is not apparent in the sighted. This might reflect an additional processing step in the course of stimulus recognition that is not present in the sighted group, or a peak decodability that is generally more plateau-like and sustained over time, in contrast to the clearer peak followed by a rapid falloff in the sighted group. Nonetheless, this difference is relatively small, and larger group sizes are needed to elaborate on this phenomenon and to determine its meaning. Furthermore, the blind group's decoding accuracies are overall slightly but significantly lower than the sighted group's decoding accuracies. This might be due, for example, to different possible comorbidities accompanying blindness. In addition, the blind subjects cannot fixate, whereas the sighted were asked to do so during the scan. This experimental difference might thus have led to noisier signals and lower classification accuracies.
In addition to the similar time courses of letter decoding, the time-time analyses of the two groups also revealed similarity in the temporal generalization behavior of the letter classifier: it primarily generalizes well to close and poorly to distant time points, with increased generalization to distant time points at the end of and after stimulus presentation. This indicates that both the blind and the sighted show primarily transient neural activity during auditory letter recognition, with increased persistent activity at the end of and after the stimulus. These slower dynamics later on suggest a process that maintains the neural representations over longer times, for instance, a working memory process.
One intriguing finding in the temporal generalization analysis is the presence of clusters of decoding accuracies below chance. These occur at time points off the diagonal in the first 100 to 200 ms of stimulus presentation. The phenomenon of below-chance learning has been reported in previous work (King and Dehaene, 2014; Carlson et al., 2011; Carlson et al., 2013; Nikolić et al., 2009); however, its meaning is currently unclear. Possible interpretations are oscillatory or reversing neural dynamics, in which activity patterns reappear with inverted polarity (King and Dehaene, 2014). Nevertheless, it remains poorly understood when and why such inversions occur, and this will be an interesting question for future research.
Besides decoding time courses and temporal generalizations, the RDM correlations between the blind and sighted groups demonstrate great similarities in the representational patterns. Significant correlation at many time points during and beyond stimulus presentation suggests that blind and sighted subjects represent auditory letters in a similar fashion. The main cluster of high correlation occurs during letter presentation and is triangle-shaped, with predominance above the diagonal. The asymmetric shape illustrates that early auditory letter processing in the sighted is similar to early as well as later processing in the blind, whereas the reverse is not the case. This could indicate that the same or very similar brain processes can occur at different time points in blind versus sighted participants. This asymmetry suggests that while activity elicited by the same stimulus and task may exhibit subtly different time courses in blind and sighted participants, it can involve the same or very similar neural representations. Taken together, all three means of comparison, the time courses of decoding accuracies, the temporal generalizations, and the correlations of representational patterns of letter processing, show similarities between the sighted and blind groups.
Supplementary to the presented decoding analysis of temporal information structure, future research may also provide an understanding of the spatial sources of the decoded signal. This could be achieved by, for instance, source reconstruction from the MEG signal followed by decoding at defined regions of interest, to resolve in which brain areas the decodable information originates (King and Dehaene, 2014), or by combining MEG and fMRI in the representational similarity analyses, thereby integrating analyses of brain activity in time and space (Cichy et al., 2014).
In conclusion, our results demonstrate great homogeneity in the neural dynamics of auditory letter processing in blind and sighted humans, suggesting that differences in Braille and visual letter reading are not due to inherent population differences. Integrating the results from investigating the brain mechanisms involved in processing multimodal stimuli, namely visual, Braille, and auditory letters, aims to answer whether such stimuli are represented in a modality-independent way and, if so, how stimulus information is transformed from modality-specific to modality-independent representations. So far, preliminary results from the complete study (Teng et al., 2015) suggest both disparate and common dynamic processes between Braille and visual reading, as well as similar components of processing occurring at different times specific to the modality. Since we showed that presenting auditory spoken letters to both groups elicited strongly similar responses, the differences are unlikely to be inherent group differences. The present research thus contributes promising new insights into the nature of multimodal object recognition and the temporal structure of information processing in blind and sighted humans.
Glossary
EEG Electroencephalography
FDR False Discovery Rate
fMRI Functional Magnetic Resonance Imaging
GLM General Linear Model
MEG Magnetoencephalography
MVPA Multivariate Pattern Analysis
RDM Representational Dissimilarity Matrix
RSA Representational Similarity Analysis
SOA Stimulus Onset Asynchrony
References
Benjamini Y, Hochberg Y (1995) Controlling the False Discovery Rate: A Practical and
Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society. Series B
(Methodological) 57:289 – 300.
Carlson T, Tovar D, Alink A, Kriegeskorte N (2013) Representational dynamics of object
vision: The first 1000 ms. Journal of Vision 13:1–19.
Carlson TA, Hogendoorn H, Kanai R, Mesik J, Turret J (2011) High temporal resolution
decoding of object position and category. Journal of vision 11:1–17.
Chang CC, Lin CJ (2011) LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology 2. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
Cichy RM, Pantazis D, Oliva A (2014) Resolving human object recognition in space and
time. Nature neuroscience 17:455–62.
Haxby JV, Gobbini MI, Furey ML, Ishai A, Schouten JL, Pietrini P (2001) Distributed and
overlapping representations of faces and objects in ventral temporal cortex. Science (New
York, N.Y.) 293:2425–2430.
Haxby JV (2012) Multivariate pattern analysis of fMRI: The early beginnings.
NeuroImage 62:852–855.
Haynes JD, Rees G (2006) Decoding mental states from brain activity in humans. Nature
reviews. Neuroscience 7:523–34.
Isik L, Meyers EM, Leibo JZ, Poggio T (2014) The dynamics of invariant object recognition
in the human visual system. Journal of neurophysiology 111:91–102.
King JR, Dehaene S (2014) Characterizing the dynamics of mental representations: The temporal generalization method. Trends in Cognitive Sciences 18:203–210.
Kriegeskorte N, Mur M, Bandettini P (2008) Representational similarity analysis -
connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience 2:1–28.
Merabet LB, Pascual-Leone A (2010) Neural reorganization following sensory loss: the
opportunity of change. Nature Reviews Neuroscience 11:44–52.
Müller KR, Mika S, Rätsch G, Tsuda K, Schölkopf B (2001) An introduction to kernel-based learning algorithms. IEEE Transactions on Neural Networks 12:181–201.
Nikolić D, Häusler S, Singer W, Maass W (2009) Distributed fading memory for stimulus
properties in the primary visual cortex. PLoS biology 7:1–19.
Norman KA, Polyn SM, Detre GJ, Haxby JV (2006) Beyond mind-reading: multi-voxel
pattern analysis of fMRI data. Trends in Cognitive Sciences 10:424–430.
Pascual-Leone A, Amedi A, Fregni F, Merabet LB (2005) The plastic human brain cortex.
Annual review of neuroscience 28:377–401.
Sadato N, Pascual-Leone A, Grafman J, Ibañez V, Deiber MP, Dold G, Hallett M (1996) Activation of the primary visual cortex by Braille reading in blind subjects. Nature 380:526–528.
Safford AS, Hussey EA, Parasuraman R, Thompson JC (2010) Object-based attentional modulation of biological motion processing: Spatiotemporal dynamics using functional magnetic resonance imaging and electroencephalography. Journal of Neuroscience 30:9064–9073.
Tadel F, Baillet S, Mosher JC, Pantazis D, Leahy RM (2011) Brainstorm: A user-friendly
application for MEG/EEG analysis. Computational Intelligence and Neuroscience 2011.
Teng S, Cichy RM, Pantazis D, Sommer V, Oliva A (2015) The neural dynamics of letter perception in blind and sighted readers. Poster session presented at the Society for Neuroscience.
Tong F, Pratte MS (2012) Decoding patterns of human brain activity. Annual review of
psychology 63:483–509.
van Atteveldt N, Formisano E, Goebel R, Blomert L (2004) Integration of letters and speech sounds in the human brain. Neuron 43:271–282.