Research Project Report
The neurodynamics of auditory letter
perception in blind and sighted humans
Massachusetts Institute of Technology
Computer Science and Artificial Intelligence Laboratory
Oliva Lab for Computational Perception & Cognition
17 February 2015 - 17 August 2015 (36 EC)
Research Master Brain and Cognitive Sciences
Cognitive Neuroscience track
University of Amsterdam
Student ID 10629653
Author
Verena R. Sommer, BSc
Supervisor
Santani Teng, PhD
Abstract
Information can reach the brain via multiple modalities. The current experiment accompanies a main study investigating whether stimuli containing the same information in different sensory modalities are represented by the brain in the same way, i.e. modality-independently. In magnetoencephalography (MEG) experiments we found differences in the neural dynamics of Braille reading in blind versus visual reading in sighted participants. To investigate whether these dissimilarities are due to processing information from different modalities or to inherent population differences, here we presented identical stimuli, i.e. auditory letters, to both the blind and the sighted group, the results of which are presented in this paper. Using multivariate pattern analyses (MVPA) and representational similarity analyses (RSA), we decoded letter identity from the MEG responses, and thus compared the brain representations of blind and sighted subjects listening to spoken letters. We found major similarities in (1) the time courses of decoding accuracies, (2) the temporal generalization behavior of the classifiers, and (3) the correlations of representational patterns of letter processing. These results reveal great homogeneity in the neurodynamics of auditory letter perception in blind and sighted humans, suggesting that there are modality-dependent differences in processing information from tactile (Braille) and visual stimuli, rather than intrinsic population differences between the blind and sighted.
Keywords: multivariate pattern analysis (MVPA), representational similarity analysis (RSA), magnetoencephalography (MEG), linear pattern classification, letter decoding, time-time analysis, multimodal sensory processing, auditory perception, neural representation, modality dependence
Contents
1 Introduction
2 Materials and Methods
  2.1 Subjects
  2.2 Experimental design
  2.3 MEG preparation and acquisition
  2.4 Statistical analysis
    2.4.1 Multivariate analysis of MEG data
    2.4.2 Temporal generalization (time-time decoding)
    2.4.3 Within- and between-group correlations
    2.4.4 Behavioral performance
    2.4.5 Significance testing
3 Results
  3.1 Time course of letter decoding
  3.2 Temporal generalization of letter decoding
    3.2.1 Transient neural activity
    3.2.2 Persistent neural activity
  3.3 Comparing letter representation within and between groups
  3.4 Behavioral performance
4 Discussion
Glossary
1 Introduction
How the brain processes information from different sensory modalities is an essential and frequently studied question in cognitive neuroscience. There has been great advancement in unraveling the neural mechanisms underlying, e.g., visual and auditory perception, both at the unimodal and the multimodal level. Areas associated with visual letter perception (inferior occipital-temporal cortex), auditory letter perception (superior temporal cortex), and the multimodal integration of visual and auditory letter processing (superior temporal sulcus and gyrus) have been identified (van Atteveldt et al., 2004). Furthermore, modality-dependent processing differences have been studied in extreme cases of sensory loss, such as blindness and deafness. Braille reading in blind humans is among numerous non-visual tasks associated with activity in "visual" cortical regions (Merabet and Pascual-Leone, 2010; Pascual-Leone et al., 2005). Moreover, it has been shown that tactile discrimination tasks activate primary and secondary visual cortices in blind Braille readers but not in sighted subjects (Sadato et al., 1996).
However, all these results from prior neuroimaging studies are based on univariate analyses and do not allow conclusions about the representational structure of information carried by those brain regions. In 2001, Haxby et al. first described multivariate pattern analysis (MVPA) of functional magnetic resonance imaging (fMRI) data as an alternative to traditional univariate analysis, e.g. with a general linear model (GLM) (Haxby et al., 2001). This enabled the investigation of how information is encoded in neural activity patterns, rather than simply where in the brain particular processes are performed (Haxby, 2012). Furthermore, MVPA allows the comparison of different response patterns and the investigation of their similarity or dissimilarity, a method called representational similarity analysis (RSA) (Kriegeskorte et al., 2008). Several studies have successfully used representational dissimilarity matrices (RDMs) to analyze brain data beyond univariate patterns (Kriegeskorte et al., 2008). These methods are now applied in numerous studies, mainly using machine-learning classifiers to decode brain activity data, i.e. to show that stimulus or task information can be reconstructed (decoded) from non-invasive measures of brain activity (Haynes and Rees, 2006; Tong and Pratte, 2012). These decoding techniques have primarily been used in fMRI studies to characterize the information carried by particular regions (Haxby, 2012; Norman et al., 2006). Recently, classifiers have also been applied to time-resolved methods, including neural recordings, electroencephalography (EEG), and magnetoencephalography (MEG), to characterize the temporal information carried by a given representation (Cichy et al., 2014; Carlson et al., 2013; Isik et al., 2014; King and Dehaene, 2014). The way trained pattern classifiers generalize over time and conditions can provide an understanding of the temporal structure of information processing (King and Dehaene, 2014). These techniques have been used to study the activity patterns associated with numerous brain functions, especially sensory processing, for example vision (Carlson et al., 2013) and audition (King and Dehaene, 2014).
What has not been examined before is whether these methods also allow comparison of the structure of information assessed through different sensory modalities, such as the tactile, visual, and auditory modality, in both sighted and blind humans. Furthermore, very few studies have investigated the neural mechanisms of letter recognition in general, and Braille-related neural activation in the blind in particular, using MVPA or similarity analysis. With our research we addressed these questions and investigated the modality dependence of temporal information structure. The present study complements a larger study comparing Braille and visual letter processing (preliminary results in Teng et al. (2015)). In the main study, we compared the MEG responses to letter presentation in different modalities between blind and sighted groups. We reasoned that if Braille-visual processing differences reflected sensory processing rather than inherent group differences, then presenting the same modality to both groups should elicit strongly similar responses. Thus, we conducted an MEG study investigating auditory letter processing in both blind and sighted subjects using the same general paradigm as in the main study.
2 Materials and Methods
2.1 Subjects
Fourteen volunteers (seven female) with self-reported normal hearing participated in the study,
comprising seven subjects (three female) in the sighted group and seven subjects (four female)
in the blind group. The subjects’ age ranged from 18 to 36 years (mean ± SD: 28.07 ± 5.06
years). Table 1 shows a detailed summary of the subjects’ demographic data. All blind
subjects were congenitally blind and proficient Braille readers. All sighted subjects had normal or corrected-to-normal vision. All participants were native, bilingual, or very early learners of English.
Table 1: Demographic data of all, sighted, and blind participants in the experiment. Min = minimum, Max = maximum, SD = standard deviation.
                    Total     Sighted   Blind
                    (n=14)    (n=7)     (n=7)
Gender   Female     7         3         4
         Male       7         4         3
Age      Min        18        18        24
         Max        36        36        34
         Mean       28.07     28.57     27.57
         SD         5.06      6.63      3.31
The subjects gave written informed consent after getting written and verbal information
about the experimental procedures. For the blind subjects the consent form was available in
digital form that they could read with a screen-reader before the day of the experiment, as well
as in Braille once they were in the laboratory. Participants were financially compensated for
taking part in the study and free to withdraw from the experiment at any time. The study was
reviewed and approved by the Committee on the Use of Humans as Experimental Subjects
(COUHES) at the Massachusetts Institute of Technology (MIT). Sighted participants were recruited via the Department of Brain and Cognitive Sciences at MIT, and blind participants were recruited via contacts from previous experiments and by word-of-mouth advertising. All of the blind and three of the sighted subjects had already participated in the Braille and visual portions of the study, respectively.
2.2 Experimental design
Stimuli. Stimuli were audio recordings of twelve spoken letters of the American English alphabet (namely B, C, D, E, L, M, N, O, V, X, Y, Z), with E and O as target letters for a discrimination task. We used this subselection of the alphabet to gain greater power through more repetitions per letter. The same subselection approach was also used in the Braille part of the main study. The duration of the non-target letter presentations ranged between 349 ms and 500 ms depending on the letter (mean: 423.2 ms). The target letter sounds lasted 342 ms and 360 ms, respectively.
Procedure. The letters were presented in random order, with target stimuli (letters E and O) occurring every three, four, or five (on average every four) trials. In this way, each run contained on average 25 target stimuli, randomly selected between E and O for each occurrence. The task was to respond to each target letter with a button press; target stimulus trials were excluded from further analyses. Each of the ten non-target letters appeared ten times per run, i.e. 100 non-target letters per run in total. Stimulus onset asynchrony (SOA) was 1000 to 1100 ms after a non-target letter, and 2000 ms after a target letter. Eye-tracking served as an additional control of the sighted participants' vigilance. Eye-tracking was not performed on blind subjects because they may lack eyes or show involuntary eye movements. After each run, the subjects could take a break and decide on their own when to continue with the next run. The experiment contained twelve runs in total and, including subject preparation, lasted approximately one hour.
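For concreteness, the following MATLAB sketch illustrates how a trial sequence satisfying these constraints could be generated (an assumed illustration, not the actual presentation code; variable names are hypothetical):

% Sketch: build one run's trial sequence (assumption, not the actual stimulus code).
nonTargets = {'B','C','D','L','M','N','V','X','Y','Z'};   % the 10 non-target letters
targets    = {'E','O'};                                    % the 2 target letters
seq = repmat(nonTargets, 1, 10);                           % each non-target 10 times -> 100 trials
seq = seq(randperm(numel(seq)));                           % shuffle the non-target order
trialSeq = {};
i = 1;
while i <= numel(seq)
    gap = randi([3 5]);                                    % 3-5 non-targets before the next target
    trialSeq = [trialSeq, seq(i:min(i+gap-1, numel(seq))), targets(randi(2))];
    i = i + gap;
end
% trialSeq now holds roughly 125 letters: 100 non-targets plus, on average, ~25 targets.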
Figure 1: Experimental design. Example of a sequence of events of two letter stimuli: B (non-target stimulus) and O (target stimulus). A run starts with the button press by the subject and comprises presentation of on average 125 stimuli with stimulus onset asynchrony (SOA) between 1000 and 1100 ms after a non-target letter, and 2000 ms after a target letter. Button press is desired after a target stimulus (grey button icon) and required to start a run (black button icon). Loudspeaker icons indicate whether or not a stimulus is presented to the subject. The experiment comprises twelve runs. For blind subjects, the instructions were given and confirmed verbally before the start of the experiment.
2.3 MEG preparation and acquisition
Data acquisition took place in the magnetoencephalography (MEG) laboratory of the McGovern Institute for Brain Research at MIT. After giving written consent, subjects were prepared for the MEG experiment. Five head position indicator coils were placed on the subjects' heads, and their positions, as well as the positions of three fiducials and multiple other points over the head surface, were digitized. After that, participants were positioned in the MEG scanner. Continuous MEG signals from 306 channels (204 planar gradiometers, 102 magnetometers) were recorded and filtered between 0.03 and 330 Hz. We preprocessed the raw data with spatiotemporal filters (Maxfilter software, Elekta, Stockholm). Data preprocessing and analysis were performed with Brainstorm (Tadel et al., 2011) and custom analysis code in MATLAB (Release 2014a, The MathWorks, Inc., Natick, MA, USA). We extracted MEG trials with a 200 ms baseline and 1000 ms post-stimulus period (i.e. 1201 ms length), removed the baseline mean of each channel, and applied a low-pass filter at 30 Hz.
2.4 Statistical analysis
2.4.1 Multivariate analysis of MEG data
Multivariate analysis of MEG data (see Figure 2) was performed using linear support vector machines (SVM; Chang and Lin (2011); Müller et al. (2001)). SVM analysis was conducted for each participant independently. MEG data were arranged in the form of 306-dimensional measurement vectors for each time point (200 ms before to 1000 ms after stimulus onset), generating pattern vectors for each time point and condition. To improve the signal-to-noise ratio and manage computational load, the pattern vectors were subaveraged in random groups of 10, and these N subaverages were used in further analyses. The SVM classifier was trained to decode any two conditions (letters) pairwise using supervised learning with a leave-one-out cross-validation procedure. That is, for each time point and condition pair, the training set included N-1 pattern vectors (subaverages), whereas the remaining Nth pattern vector served as the testing set. We evaluated the classifier's performance in separating the two conditions, and repeated this procedure 100 times by randomly reassigning the data to training and testing sets. The average accuracy over the 100 permutations served as the pairwise decoding accuracy for that time point. In this way, the overall decoding accuracy of the classifier was assessed for each pair of the ten stimulus conditions. From this, we generated a 10 x 10 matrix of pairwise decoding accuracies for each time point. The higher the decoding accuracy for two conditions, the more dissimilar the conditions are; the matrix can therefore be interpreted as a representational dissimilarity matrix (RDM). The mean of the pairwise accuracies at each time point can then be plotted to obtain the time course of classification accuracy, together with the standard deviation of the mean of all pairwise accuracies at each time point.
Figure 2: Multivariate analysis of MEG data. The values from all 306 channels of the MEG raw data at time point t were extracted, and arranged in the pattern vector of t. In this way, one pattern vector per trial per condition was generated, for each pairwise comparison. To classify between two conditions, e.g. letters M and V, the pattern vectors (subaverages) were randomly assigned to a training set and a testing set via leave-one-out cross-validation. The support vector machine (SVM) learned to discriminate each pair of conditions with the training set and its accuracy was tested on the testing set. This procedure was repeated 100 times, each time randomly assigning new training and testing sets. The classification accuracy for each pair was mapped in a representational dissimilarity matrix (RDM) of time point t. By averaging the decoding accuracies of all pairs the overall accuracy of time point t can be calculated. Doing this for all time points resulted in the time course of classification accuracy. Figure adapted and modified from Cichy et al. (2014) and Teng et al. (2015).
2.4.2 Temporal generalization (time-time decoding)
In addition to the one-dimensional analysis of letter decodability (see 2.4.1), we also investigated the presence and disparity of transient and persistent neural activity during letter perception. Persistent representations could be important for preserving the outcomes of particular neural processes. To do so, we conducted a time-time decoding analysis (Cichy et al., 2014), also called the temporal generalization method (King and Dehaene, 2014), as illustrated in Figure 3. Because the one-dimensional time course analysis trains and tests at the same time point, it only measures instantaneous decodability, i.e. decodability at that time point, whereas the time-time analysis can distinguish dynamics over time.
Figure 3: Time-time decoding. The values from all 306 channels of the MEG raw data at time points tx and ty were extracted and arranged in pattern vectors. We trained an SVM to discriminate between the brain responses of each pair of conditions at time point tx and tested its accuracy on responses to the same letters at a different time point ty. All pairs of conditions were classified and averaged to obtain the overall decoding accuracy. The average decoding accuracy was mapped at point (tx, ty) of a time-time decoding matrix. This was done for all pairs of time points.
To distinguish between MEG signals induced by persistent and transient neural representations, we trained an SVM at one time point and tested it at the other time points (see Figure 3). This shows how well the classifier can generalize over time, and thus whether persistent similarities exist in the auditory representations of letter perception. We performed all pairwise classifications of conditions and obtained one MEG decoding matrix for every pair of training time point and testing time point. Averaging across the decoding matrix for each train-test combination, and then repeating the procedure for all train-test combinations of time points, resulted in a time-time decoding matrix (temporal generalization matrix).
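The corresponding loop structure might look as follows; this is a sketch for a single condition pair, omitting the 100 random reassignments for brevity, and the array layout is an assumption rather than the authors' actual implementation.

% Sketch: temporal generalization matrix for one condition pair.
% dataA, dataB: [N x 306 x T] subaveraged pattern vectors over T time points.
T  = size(dataA, 3);
tg = zeros(T, T);                               % time-time decoding matrix
for tx = 1:T                                    % training time
    Xtr = [dataA(1:end-1, :, tx); dataB(1:end-1, :, tx)];              % train at tx
    ytr = [ones(size(dataA, 1) - 1, 1); 2*ones(size(dataB, 1) - 1, 1)];
    mdl = fitcsvm(Xtr, ytr, 'KernelFunction', 'linear');
    for ty = 1:T                                % testing (generalization) time
        Xte = [dataA(end, :, ty); dataB(end, :, ty)];                   % test at ty
        tg(tx, ty) = mean(predict(mdl, Xte) == [1; 2]);
    end
end
% Averaging such matrices over all condition pairs gives the temporal generalization
% matrix (Figure 7); its diagonal reproduces the ordinary decoding time course.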
A time-time decoding matrix yields detailed information about the temporal structure of
information processing in the brain. Figure 4 illustrates two possible outcomes of the
time-time analysis, namely underlying transient and persistent neural activity (for more examples
and simulations see King and Dehaene (2014)). A diagonal-shaped decoding performance
demonstrates high decoding accuracy at small distances between training and testing time
points, and poorer accuracy at larger distances, i.e. generalization over only a small amount of time. This means that the neural processes are dynamically changing over time, i.e. transient.
A square-shaped time-time decoding performance illustrates high classification accuracy for
nearby as well as distant time points, indicating greater temporal generalization and thus more
sustained underlying brain processes.
Figure 4: Conceptual sketch of temporally persistent and transient decoding performance in a time-time decoding matrix. The different generalization behaviors of a classifier illustrate distinct temporal structures of the underlying brain processes. Figure adapted and modified from Teng et al. (2015).
2.4.3 Within- and between-group correlations
The time-time method works within individuals, but because the MEG signal is highly sensitive to small variations in brain topography and head position, it is inappropriate for classification across individuals. Thus, we compared the MEG decoding matrices between subjects to assess across-subject similarities in the neural representation of letters.
For this, we performed Spearman rank correlations, which are conservative and do not assume normally distributed data, within and between the blind and sighted groups. The averaged MEG representational dissimilarity matrices (RDMs) of each time point (from 200 ms before until 1000 ms after stimulus presentation) yield the time course of classification accuracy (see 2.4.1). We correlated the unaveraged RDMs of each pair of subjects across time points and thus obtained a correlation matrix for each pair of subjects (see Figure 5). We then averaged the correlation matrices to obtain the mean within- and between-group correlations. In this way, we can investigate if and when representations are (in-)consistent within a group and (dis-)similar across groups.
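Schematically, the correlation matrix for one subject pair could be computed as follows. This is a sketch assuming each subject's unaveraged RDMs are stored as a 10 x 10 x T array; corr with the 'Spearman' option is from the Statistics and Machine Learning Toolbox, and this is not the authors' actual code.

% Sketch: Spearman correlation matrix for one pair of subjects.
% rdm1, rdm2: [10 x 10 x T] RDMs (pairwise decoding accuracies) of two subjects.
T    = size(rdm1, 3);
mask = logical(tril(ones(10), -1));              % lower triangle: the 45 letter pairs
C    = zeros(T, T);
for t1 = 1:T                                     % time point in subject 1
    v1 = rdm1(:, :, t1);
    v1 = v1(mask);                               % vectorize the 45 pairwise accuracies
    for t2 = 1:T                                 % time point in subject 2
        v2 = rdm2(:, :, t2);
        v2 = v2(mask);
        C(t1, t2) = corr(v1, v2, 'Type', 'Spearman');   % Spearman's rho
    end
end
% Averaging C over all subject pairs within a group gives the within-group maps
% (Figure 8); pairing subjects across groups gives the between-group map (Figure 9).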
2.4.4 Behavioral performance
The subjects’ task was to respond to every occurrence of the letters E and O by button press.
To measure the participants’ performance, we calculated the hit, correct rejection, false alarm,
and miss rates in percent, and from this the sensitivity index d'. The task mainly served as a control of the participants' vigilance.
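For reference, the sensitivity index is computed from the hit and false-alarm rates as d' = z(hit) - z(false alarm); a minimal MATLAB sketch (norminv is from the Statistics and Machine Learning Toolbox; the variable names are hypothetical):

% Sketch: sensitivity index d' for one subject.
% hitRate, faRate: proportions in (0, 1); rates of exactly 0 or 1 would need
% a standard correction before applying the inverse normal transform.
dprime = norminv(hitRate) - norminv(faRate);     % z(hit) - z(false alarm)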
Figure 5: Spearman correlation of two subjects. The representational dissimilarity matrices (RDMs) of the classified MEG data from each pair of subjects (with one subject on the x-axis and the other subject on the y-axis) were correlated across time points, resulting in a color-coded correlation matrix with warmer colors representing higher correlation coefficients (Spearman's ρ). Figure from Teng et al. (2015).
2.4.5 Significance testing
To test for significance in the MEG data analyses, we used non-parametric statistical tests that do not make assumptions about the data distribution. For the MEG decoding time courses, we performed Wilcoxon signed-rank tests of the null hypothesis of no experimental effect, i.e. 50% (chance level) decoding accuracy, at each time point (significance level α = 0.05). We corrected for multiple comparisons using the false discovery rate (FDR; Benjamini and Hochberg (1995)). We defined the threshold for significant decodability of neural activity as ten consecutive time points (10 ms) of decoding accuracies significantly above 50%, and analogously, ten consecutive time points of decoding accuracies not significantly above 50% as the threshold for breaks in decodability (Safford et al., 2010). By bootstrapping, we estimated mean peak latencies and onsets of decodability and report the mean ± standard deviation of the bootstrap distribution.
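Schematically, the per-time-point test with FDR correction could look as follows (a sketch, not the authors' code; signrank is MATLAB's Wilcoxon signed-rank test, and the Benjamini-Hochberg step is written out explicitly rather than relying on a toolbox function):

% Sketch: test decoding accuracy against 50% chance at each time point,
% with Benjamini-Hochberg FDR correction at q = 0.05.
% acc: [nSubjects x T] decoding accuracies (in percent) per subject and time point.
T = size(acc, 2);
p = zeros(1, T);
for t = 1:T
    p(t) = signrank(acc(:, t), 50);              % H0: median accuracy equals 50%
end
q = 0.05;
[ps, order] = sort(p);                           % p-values in ascending order
crit = (1:T) / T * q;                            % Benjamini-Hochberg thresholds k/T * q
kmax = find(ps <= crit, 1, 'last');              % largest rank passing its threshold
sig  = false(1, T);
if ~isempty(kmax)
    sig(order(1:kmax)) = true;                   % time points significant after FDR
end
% Onsets of decodability would then require at least ten consecutive significant
% time points, following the criterion described above.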
To compare the MEG decoding time courses between groups we used a two-sided Wilcoxon rank-sum test (Mann-Whitney U test). To test for significant within- and between-group correlations between patterns of decoding accuracies (RDMs), we performed Wilcoxon signed-rank tests at each time point combination with the null hypothesis of no correlation, i.e. ρ = 0 (significance level α = 0.01). FDR was used to correct for multiple comparisons.
To test for group differences in the subjects' task performance, two-sided Student's t-tests were performed with a significance level of α = 0.05.
All analyses and tests were performed in MATLAB (Release 2014a or 2015a, The MathWorks, Inc., Natick, MA, USA).
3 Results
3.1 Time course of letter decoding
The average time courses of decoding accuracies show strong similarities in auditory letter
classification between the sighted and the blind group (Figure 6A), whereas they show large
differences between the visual and Braille letter classification from our main study (Figure
6B). Before stimulus presentation, the average decoding accuracies fluctuate around chance
level (50%). Just after stimulus onset (0 ms) the mean auditory letter classification accuracies
increase steeply, reaching significance at 29 ± 19.3 ms (SD of the bootstrap distribution) for
the sighted, and 49 ± 24.9 ms for the blind. This is the time point when the MEG signals
begin to distinguish between the different letter conditions. Peak decodability, i.e. the time
point for which the individual auditory representations are most distinctive in terms of linear
separability, is reached at 103 ± 7.5 ms in the sighted group with 84.7 ± 6.3% decoding
accuracy. After that, the classification accuracy in the sighted group decreases gradually, with
interruptions in significance at 626-672 ms and 756-841 ms, until the letters are not decodable
any more after 862 ms. In contrast, the blind group's decoding accuracy reaches a first peak of 77.3 ± 10.3% at 105 ± 59.6 ms, followed by a short decline and a second, lower peak of 75 ± 6.1% at 256 ms. Overall, the blind group exhibits significantly lower decoding accuracies than the sighted group (two-sided Wilcoxon rank-sum test: p = 0.0085), without a difference in variation (p = 0.0877).
In contrast to the auditory letter classification, visual and Braille letter classifications show
large dissimilarities in the overall decoding time courses (Figure 6B). Whereas visual letter
decoding exhibits a more rapid onset, higher peak decodability, and a steep fall, the Braille decoding time course rises and falls more gradually.
Figure 6: Time courses of the overall SVM classifications. A The average auditory letter decoding accuracies of 7 sighted (left) and 7 blind (right) subjects plotted from 200 ms before to 1000 ms after stimulus onset (0 ms). Blue and red bar on top denote time points significantly (p < 0.05) different from 50% chance level (dotted line), shading indicates standard deviation of the mean, gray bar at the bottom illustrates mean stimulus duration. B The average letter decoding time course of 13 sighted subjects reading visual letters (left) and 11 blind subjects reading Braille letters (right) from the main study.
3.2 Temporal generalization of letter decoding
3.2.1 Transient neural activity
The mean time-time decoding matrices of the sighted and blind participants (Figure 7) show a predominantly diagonal shape, with decoding accuracies that are higher near the diagonal than at more distant time points, and thus neural activity during letter processing is mainly transient and dynamic (King and Dehaene, 2014). The diagonals of the time-time decoding matrices illustrate the same information as the time course of decoding accuracy described in the previous section, i.e. training and testing at the same time points (see 3.1). In this form of graphical presentation it is especially apparent that after the first peak at approximately 100 ms, the classification accuracy gradually decreases during and beyond stimulus presentation in the sighted group, whereas in the blind group it first gradually decreases, then peaks a second time at approximately 250 ms, and then declines again.
Figure 7: Time-time MEG decoding matrices. Heat map of the average temporal general-ization of letter classification for 7 sighted (left) and 7 blind (right) subjects. The color code illustrates decoding accuracy of the SVM classifier for 200 ms before to 1000 ms after stimulus onset (0 ms) of training time (x-axis) and testing (generalization) time (y-axis), with warmer colors representing higher classification accuracy. The dark bars at the axes show average stimulus duration.
3.2.2 Persistent neural activity
In addition to transient neural activity, we also detected evidence for persistent neural activity, which is visible in the time-time decoding matrices at time point combinations away from the diagonal (Figure 7). Overall, the decoding accuracies away from the diagonal are at chance level (50%). However, at time point combinations between approximately 300 ms and 600 ms for the sighted, and between approximately 350 ms and 700 ms for the blind, the diagonals broaden, indicating higher temporal generalization toward the end of and after stimulus presentation. In addition to a wider diagonal, a square of lighter blue (higher decoding accuracies) as compared to the rest of the time points away from the diagonal is observable in the blind group from approximately 350 ms until 1000 ms. Also only apparent in the blind group is a small increase of temporal generalization at the time of the second peak decodability (at ∼250 ms).
Additionally, small regions of below-chance predictions are observable away from the diagonal shortly after the onset of stimulus presentation. In the sighted group, these dips of decodability are relatively deep (down to 28.6% at 145 ms training time and 90 ms testing time) and short, whereas in the blind group the decrease is less extreme and more sustained, lasting until approximately 300 ms after stimulus onset.
3.3 Comparing letter representation within and between groups
We conducted Spearman rank correlations of the letter classifications across time within the sighted and blind groups, respectively, as well as between the two groups. The resulting matrices of the average correlations (Spearman's ρ) of each pair of subjects and the corresponding significance matrices are shown in Figure 8 for the within-group comparisons and in Figure 9 for the between-group comparison.
Consistency within the sighted group
Correlation within the sighted group is significantly elevated between approximately 50 ms and 150 ms after stimulus onset (see the square-shaped significance cluster in Figure 8B). Maximal similarity in letter representation between the sighted subjects occurs at 100 ms after stimulus onset (ρ = 0.61 ± 0.15). Besides this cluster of significance shortly after stimulus onset, there are several smaller clusters of significant correlation later on, mostly around the diagonal. The correlation is generally higher during stimulus presentation. The last time point of significant correlation is at ∼570 ms with ρ = 0.18 (± 0.12).
Consistency within the blind group
The auditory letter representations of the blind subjects start to correlate significantly at approximately 100 ms after stimulus onset, with a maximal correlation of ρ = 0.41 (± 0.23) at 105 ms. A relatively large cluster of significant correlation is observable between approximately 100 ms and 190 ms on the x-axis and approximately 100 ms and 250 ms on the y-axis. This asymmetric correlation cluster could indicate one-sided similarity of different time points in the course of letter perception among the blind subjects. However, it could also be noise, because the same subjects can appear on both the x- and the y-axis in the pairwise correlations. It is possible that more asymmetrical RDM correlations indicate more general variability in the population. The letter representations in the blind correlate with each other at more time points during and after stimulus presentation, also relatively far away from the diagonal, as compared to the sighted group.
Figure 8: A Within-group correlations. The average correlations between patterns of decoding accuracies (RDMs) across time of each pair of subjects within the sighted (left; n = 7) and the blind (right; n = 7) group plotted as heat maps with warmer colors representing higher correlation coefficients ρ. The dark bars at the axes show average stimulus duration. (Note the different color coding keys for the sighted and blind correlations.) B Significance of within-group correlations. The outcomes of corrected Wilcoxon signed-rank tests at each time point of the within-sighted (left) and within-blind (right) correlations are illustrated with yellow color denoting significance (p < 0.01).
Similarity between the sighted and blind groups
The correlation of letter representations between the sighted and blind groups reveals pronounced similarity, with a maximal correlation of ρ = 0.49. In general, the sighted and blind groups show significantly elevated correlation around the diagonal between approximately 50 ms and 450 ms. Furthermore, this cluster of significance also includes time points more distant from the diagonal, namely at approximately 100-200 ms for the sighted participants and 200-450 ms for the blind participants, but not vice versa (see the triangle-shaped correlation and significance cluster in Figure 9).
In addition to this large cluster, significant correlation occurs at numerous other time points,
indicating large similarity between auditory letter perception in sighted and blind subjects at
many time points throughout and beyond stimulus presentation.
Figure 9: A Between-group correlation. The average correlations between patterns of de-coding accuracies (RDMs) across time of each pair of subjects between sighted (x-axis; n = 7) and blind (y-axis; n = 7) plotted as heat maps with warmer colors representing higher correlation coefficients ρ. The dark bars at the axes show average stimulus duration. B Signif-icance of between-group correlations. The outcomes of corrected Wilcoxon signed-rank tests at each time point of the between-group correlations are illustrated with yellow color denoting significance (p < 0.01) and green denoting no significance (n.s.).
3.4 Behavioral performance
The subjects performed well in the vigilance task, indicating that they were consciously listening to the spoken letters at all times. One sighted subject's behavioral data were not recorded correctly due to a computer failure and had to be excluded from the analysis. Figure 10 shows
the average rates of hits, correct rejections, false alarms, and misses, as well as the sensitivity
index d’ for the sighted and blind participants, respectively. The two groups did not differ
significantly in any of the performance measures.
Figure 10: Task performance of the sighted and blind group. Average hit, correct rejection, false alarm, and miss rate in percent (left), and sensitivity index d’ (right) for the sighted (n = 6) and blind (n = 7) group, respectively. Error bars indicate standard deviation of the mean. Differences between the groups are not significant.
4 Discussion
Using multivariate pattern analysis (MVPA) and representational similarity analysis (RSA)
methods on MEG data, we decoded the identity of auditory spoken letters presented to blind
and sighted humans. We showed that letters were distinguishable significantly above chance
from stimulus onset until after stimulus offset. In the main study, which compared letter perception in the tactile and visual modalities between blind and sighted subjects, respectively, we hypothesized that the observed differences between Braille and visual letter processing were due to processing in different modalities, and not due to general differences between the brains of the two populations. Hence, we reasoned that processing of identical stimuli, namely auditory letters, should be similar between the two groups.
Consistent with our predictions, we found similar structures in the decoding time courses of blind and sighted subjects. The overall shapes of the time courses are very similar, especially in comparison to the relatively large differences between the decoding time courses of Braille versus visual letters (Teng et al., 2015). Both the blind and the sighted decoding profiles show a rapid and large increase in decodability right after stimulus onset, followed by a less steep falloff after approximately 100 ms until decodability is back at baseline after stimulus offset. This suggests equivalent temporal structures of the neural representations in the blind and the sighted while listening to spoken letters. However, we found some differences as well. The blind decoding time course exhibits a second, lower peak shortly after the initial peak decodability that is not apparent in the sighted. This might reflect an additional processing step in the course of stimulus recognition that is not present in the sighted group, or a peak decodability that is generally more plateau-like and sustained over time, in contrast to the clearer peak followed by a rapid falloff in the sighted group. Nonetheless, this difference is relatively small, and larger group sizes are needed to elaborate on this phenomenon and to determine its meaning. Furthermore, the blind group's decoding accuracies are overall slightly but significantly lower than the sighted group's decoding accuracies. This might be due, for example, to different possible comorbidities accompanying blindness. In addition, the blind subjects cannot fixate, whereas the sighted were asked to do so during the scan. This experimental difference might thus have led to noisier signals and lower classification accuracies.
In addition to the similar time courses of letter decoding, the time-time analyses of the two groups also revealed similarity in the temporal generalization behavior of the letter classifier: it primarily generalizes well to close and poorly to distant time points, with increased generalization to distant time points at the end of and after stimulus presentation. This indicates that both the blind and the sighted show primarily transient neural activity during auditory letter recognition, with increased persistent activity at the end of and after the stimulus. These slower dynamics later on suggest a process that maintains the neural representations over longer times, for instance, a working memory process.
One intriguing finding in the temporal generalization analysis is the presence of clusters of decoding accuracies below chance. These occur at time points off the diagonal in the first 100 to 200 ms of stimulus presentation. The phenomenon of below-chance learning has been reported in previous work (King and Dehaene, 2014; Carlson et al., 2011; Carlson et al., 2013; Nikolić et al., 2009); however, its meaning is currently unclear. Possible interpretations are oscillatory or reversing neural dynamics, in which activity patterns reappear with inverted polarity (King and Dehaene, 2014). Nevertheless, it remains poorly understood when and why such inversions occur, and this will be an interesting question for future research.
Besides decoding time courses and temporal generalizations, the RDM correlations between the blind and sighted groups demonstrate great similarities in the representational patterns. Significant correlation at many time points during and beyond stimulus presentation suggests that blind and sighted subjects represent auditory letters in a similar fashion. The main cluster of high correlation occurs during letter presentation and is triangle-shaped, with predominance above the diagonal. The asymmetric shape illustrates that early auditory letter processing in the sighted is similar to early as well as later processing in the blind, whereas the reverse is not the case. This could indicate that the same or very similar brain processes can occur at different time points in blind versus sighted participants. This asymmetry suggests that while activity elicited by the same stimulus and task may exhibit subtly different time courses in blind and sighted participants, it can involve the same or very similar neural representations. Taken together, all three means of comparison, the time courses of decoding accuracies, the temporal generalizations, and the correlations of representational patterns of letter processing, show similarities between the sighted and blind groups.
Supplementary to the presented decoding analysis of temporal information structure, future research may also provide an understanding of the spatial sources of the decoded signal. This could be achieved by, for instance, source reconstruction from the MEG signal followed by decoding at defined regions of interest, to resolve in which brain areas the decodable information originates (King and Dehaene, 2014), or by combining MEG and fMRI in the representational similarity analyses, thereby integrating analyses of brain activity in time and space (Cichy et al., 2014).
In conclusion, our results demonstrate great homogeneity in the neural dynamics of auditory letter processing in blind and sighted humans, suggesting that differences in Braille and visual letter reading are not due to inherent population differences. Integrating the results from investigating the brain mechanisms involved in processing multimodal stimuli, namely visual, Braille, and auditory letters, aims to answer whether such stimuli are represented in a modality-independent way and, if so, how stimulus information is transformed from modality-specific to modality-independent representations. So far, preliminary results from the complete study (Teng et al., 2015) suggest both disparate and common dynamic processes between Braille and visual reading, as well as similar components of processing occurring at different times specific to the modality. Since we showed that presenting auditory spoken letters to both groups elicited strongly similar responses, the differences are unlikely to be inherent group differences. The present research thus contributes promising new insights into the nature of multimodal object recognition and the temporal structure of information processing in blind and sighted humans.
Glossary
EEG Electroencephalography
FDR False Discovery Rate
fMRI Functional Magnetic Resonance Imaging
GLM General Linear Model
MEG Magnetoencephalography
MVPA Multivariate Pattern Analysis
RDM Representational Dissimilarity Matrix
RSA Representational Similarity Analysis
SOA Stimulus Onset Asynchrony
References
Benjamini Y, Hochberg Y (1995) Controlling the False Discovery Rate: A Practical and
Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society. Series B
(Methodological) 57:289 – 300.
Carlson T, Tovar D, Alink A, Kriegeskorte N (2013) Representational dynamics of object
vision: The first 1000 ms. Journal of Vision 13:1–19.
Carlson TA, Hogendoorn H, Kanai R, Mesik J, Turret J (2011) High temporal resolution
decoding of object position and category. Journal of vision 11:1–17.
Chang CC, Lin CJ (2011) LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology 2. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
Cichy RM, Pantazis D, Oliva A (2014) Resolving human object recognition in space and
time. Nature neuroscience 17:455–62.
Haxby JV, Gobbini MI, Furey ML, Ishai A, Schouten JL, Pietrini P (2001) Distributed and
overlapping representations of faces and objects in ventral temporal cortex. Science (New
York, N.Y.) 293:2425–2430.
Haxby JV (2012) Multivariate pattern analysis of fMRI: The early beginnings.
NeuroImage 62:852–855.
Haynes JD, Rees G (2006) Decoding mental states from brain activity in humans. Nature
reviews. Neuroscience 7:523–34.
Isik L, Meyers EM, Leibo JZ, Poggio T (2014) The dynamics of invariant object recognition
in the human visual system. Journal of neurophysiology 111:91–102.
King JR, Dehaene S (2014) Characterizing the dynamics of mental representations: The temporal generalization method. Trends in Cognitive Sciences 18:203–210.
Kriegeskorte N, Mur M, Bandettini P (2008) Representational similarity analysis -
connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience 2:1–28.
Merabet LB, Pascual-Leone A (2010) Neural reorganization following sensory loss: the
opportunity of change. Nature Reviews Neuroscience 11:44–52.
Müller KR, Mika S, Rätsch G, Tsuda K, Schölkopf B (2001) An introduction to kernel-based learning algorithms. IEEE Transactions on Neural Networks 12:181–201.
Nikolić D, Häusler S, Singer W, Maass W (2009) Distributed fading memory for stimulus
properties in the primary visual cortex. PLoS biology 7:1–19.
Norman KA, Polyn SM, Detre GJ, Haxby JV (2006) Beyond mind-reading: multi-voxel
pattern analysis of fMRI data. Trends in Cognitive Sciences 10:424–430.
Pascual-Leone A, Amedi A, Fregni F, Merabet LB (2005) The plastic human brain cortex.
Annual review of neuroscience 28:377–401.
Sadato N, Pascual-Leone A, Grafman J, Ibañez V, Deiber MP, Dold G, Hallett M (1996) Activation of the primary visual cortex by Braille reading in blind subjects. Nature 380:526–528.
Safford AS, Hussey EA, Parasuraman R, Thompson JC (2010) Object-based attentional modulation of biological motion processing: Spatiotemporal dynamics using functional magnetic resonance imaging and electroencephalography. Journal of Neuroscience 30:9064–9073.
Tadel F, Baillet S, Mosher JC, Pantazis D, Leahy RM (2011) Brainstorm: A user-friendly
application for MEG/EEG analysis. Computational Intelligence and Neuroscience 2011.
Teng S, Cichy RM, Pantazis D, Sommer V, Oliva A (2015) The neural dynamics of letter perception in blind and sighted readers. Poster session presented at the Society for Neuroscience.
Tong F, Pratte MS (2012) Decoding patterns of human brain activity. Annual review of
psychology 63:483–509.
van Atteveldt N, Formisano E, Goebel R, Blomert L (2004) Integration of letters and speech sounds in the human brain. Neuron 43:271–282.