Neuronal Subpopulations Disrupt Coding of Orientation in V1 through Increased Correlated Trial-to-Trial Variability

Marleen Voorn, 10658645

Bachelor of Sciences Bèta-Gamma, Neurobiology University of Amsterdam

Supervision: Guido Meijer and Carien Lansink

Swammerdam Institute for Life Sciences, Cognitive and Systems Neuroscience June 2017, Amsterdam

KEYWORDS – neuronal population coding, orientation-selective neurons, Bayesian Maximum-Likelihood decoder, correlated trial-to-trial variability, noise correlations

ABSTRACT – Exposure to an identical visual stimulus never evokes an identical response in a neuron. This trial-to-trial variability, or noise, in the activity of neuronal populations is correlated in the brain. Because noise correlations influence neuronal activity patterns, they influence neuronal population coding as well. However, little is known about the beneficial or detrimental effects of noise correlations on population coding. In this research, we investigated whether there are neuronal populations in mouse V1 that disrupt the coding of orientation through increased correlated trial-to-trial variability. To measure the trial-by-trial fluctuations, or noise, in activity, we performed two-photon calcium imaging of GCaMP6m-expressing neuronal populations in mouse V1. First, we identified neuronal subpopulations that hamper population coding as the neurons with the lowest contribution to a decoder. We then observed increased noise correlations in these subpopulations, and removing the noise correlations in these subpopulations improved the decoding performance. We propose that there are neuronal subpopulations with increased noise correlations that have detrimental effects on neuronal population coding.

Introduction

The rodent visual pathway starts at the retina, where retinal ganglion cells project through the optic nerve as a bundle of axons. The projection fibers first cross sides at the optic chiasm, and from there the projections terminate in the lateral geniculate nucleus, or LGN. The LGN is a relay station where information about a visual stimulus from the retina is passed on to the primary visual cortex, or V1 (Goltstein, 2015). Across mammalian species, the tuning properties of neurons in V1 are preserved. Neurons in rodent V1, such as mouse V1, are selective for orientation in the same manner as in other species (Niell & Stryker, 2008), which means that the response of a neuron in mouse V1 depends on the orientation of the visual stimulus. This selectivity differs across neurons: some neurons respond very strongly to one orientation (their so-called preferred orientation), whereas other neurons respond weakly to all orientations. The orientation preference of neurons is stable across days (Montijn et al., 2016). However, the spatial distribution of neurons’ preferred orientations in V1 differs across species. In animals such as cats and monkeys, each neuron is tuned to similar orientations as its nearest neighbors. This columnar organization is thought to be absent in the rodent visual cortex, implying that the preferred orientations of neurons in mouse V1 are arranged in a ‘salt and pepper’ fashion (Ohki et al., 2005).

The brain encodes information about a visual stimulus in the activity patterns that the stimulus evokes in neuronal populations. Thus, encoding describes the relationship between a visual stimulus and neuronal activity patterns. To mimic this population coding, a decoding algorithm can be used (Pitkow & Angelaki, 2017). Ideally, the decoding algorithm should closely resemble the read-out on which neurons in mouse V1 rely. Unfortunately, this real algorithm is not known, and the best alternative is to use an optimal artificial algorithm. The Bayesian Maximum-Likelihood decoder has been reported to outperform other decoding algorithms (Montijn et al., 2014). Furthermore, this approach gives a better estimate than neural networks, since Bayes’ rule makes 8.0% errors where neural networks make 9.5%-11.1% errors (Ripley, 1996). With reasonably small sample sizes (from 28 neurons), the Bayesian ML decoder achieves >90% correctly decoded orientations (Montijn et al., 2014). The purpose of a decoding algorithm is to determine which stimulus was presented based on neuronal population activity patterns.

The difficulty in unravelling the process underlying population coding is that neurons are noisy. Exposure to identical stimuli never evokes identical neuronal population activity patterns. Consequently, trial-to-trial fluctuations, often called noise, are present in each neuron’s activity. This noise makes the decoding output a probabilistic function (Averbeck et al., 2006). Because noise influences neuronal population activity patterns in response to a visual stimulus, the question arises how this noise affects computations and therefore the decoding of the visual stimulus. Little is known about the possible beneficial or detrimental effects of noise on population coding.

In the brain, the noise appears to be correlated, which implies that the variability in the response of one neuron around its average is correlated with the variability in the response of other neurons (Averbeck et al., 2006). Noise correlations between neurons are relatively stable across days (Montijn et al., 2016). Noise correlations usually have mean values between 0.01 and 0.2, so they are on average positive (Kohn et al., 2016). This implies that a subset of neurons simultaneously responds more strongly or more weakly than their average response to the stimulus.

Noise is present and influences population coding, but the correlations in this noise determine whether the effect on coding is beneficial or detrimental (Franke et al., 2016). Noise correlations can favor population coding when they shape the responses evoked by different stimuli in such a way that the distinction between the stimuli improves. However, noise correlations do not always imply an increase in information about stimuli (Moreno-Bote et al., 2014). They can also impede population coding by causing more overlap between the responses to different stimuli (Averbeck et al., 2006): different stimuli become more difficult to distinguish from each other, which impedes the coding of stimuli. The mechanism underlying these detrimental effects of noise correlations on population coding has been investigated in many computational studies (Moreno-Bote et al., 2014; Kohn et al., 2016). Experimental evidence regarding these correlations is still lacking. In this research, we aimed to demonstrate with experimental data the detrimental effects of noise correlations on population coding of orientation. To preserve the noise correlation structure between neuronal responses, the activity of neuronal populations in mouse V1 was measured simultaneously. We used in vivo two-photon excitation microscopy in awake mice, because this technique allowed us to measure the activity of many neurons at the same time (Kerr & Denk, 2008).

Intuitively, not all neurons contain an equal amount of information about a visual stimulus. To obtain insight into each neuron’s contribution to the decoding of an orientation, we varied the neurons given to the decoder and compared the decoding performance, expressed as the percentage of correctly decoded orientations. We hypothesize that the neuronal subpopulations with the lowest contribution to the decoder are the neurons that hamper the coding of orientation in V1 through increased correlated trial-to-trial variability, or noise correlations.

Methods

Animals and surgical procedures

All experimental procedures were approved by the animal ethics committee of the University of Amsterdam. This experiment was done with 12 adult male C57BL/6 (Harlan) mice. The animals were kept in a reversed day/night cycle to acquire data in their active phase.

Head-bar implantation

The surgical procedures were highly similar to those described earlier in Montijn et al. (2016, Supplemental Experimental Procedures). During the measurements, the animals were head-fixed by means of a chronic head-bar implanted above the visual cortex. First, the analgesic Buprenorphine (0.05 mg/kg bodyweight) was injected subcutaneously. After 30 minutes, the mice were anesthetized with isoflurane (3% in 100% O2). During surgery, the isoflurane was lowered to 1-2%. The skin on top of the head was removed with fine scissors, and the custom-built titanium head-bar was attached with dental cement (C&B Superbond, Sun Medical, Japan) to the skull above the visual cortex, ~400 mm caudal and ~2.5 mm lateral from bregma. Finally, cyanoacrylate glue (Loctite 401, Henkel, Germany) was applied to the skull to avoid infections.

To locate the visual cortex, intrinsic optical signal (IOS) imaging was performed. Mice passively viewed moving visual gratings under anesthesia (0.8% isoflurane) and the visual cortex was determined and marked.

Virus infection and craniotomy

One week after the head-bar implantation, the mice were injected with 200-300 nl of the viral construct AAV1-Syn-GCaMP6m-WPRE (Penn Vector Core, PA, USA). The viral construct encoded the protein GCaMP6m, a calcium indicator that becomes fluorescent when it binds calcium. For the injection, a small hole of ~0.1 mm was made in the skull with a dental drill. A mineral-oil backfilled glass capillary pipette operated by a Nanoject II Auto-Nanoliter injector (Drummond Scientific, PA, USA) was used for injection at a depth of 600-700 µm below the dura, 1 mm anterior of the target imaging site. After the virus injection, a circular craniotomy of 3 mm in diameter was created. Finally, to avoid regrowth of the skull, a double-layered cover glass was attached to the skull using cyanoacrylate glue (Loctite 401, Henkel, Germany). The bottom layer of the glass fitted precisely in the craniotomy (⌀ 3 mm, thickness ~300 µm); the top layer had a larger diameter (⌀ 5 mm, thickness ~150 µm). The animals recovered for one week before the experiment started.

Apparatus (Two-photon excitation microscopy)

Two-photon imaging recordings were performed in vivo using a Leica SP5 resonant laser-scanning microscope and SpectraPhysics Mai Tai High Performance Mode Locked Ti:Sapphire laser. The frame size of the imaging was a square region of 150x150 µm (512 x 512 pixels) with a sampling frequency of 12.7 Hz.

The laser’s wavelength was set to 910 nm (infrared) to excite the GCaMP6m molecules. When two photons, each with a long wavelength and low energy, reach one fluorophore simultaneously, their combined energy excites the fluorophore. The fluorophore loses some energy before emitting (the remaining energy is lower than the summed energy of the two photons, but higher than the energy of one photon), so the emitted light has a shorter wavelength than the excitation light. The fluorescence emission was therefore filtered between 500 and 550 nm and captured as green fluorescent light.

Stimulus presentations

During the two-photon imaging recordings, mice (n=12) were awake and head-fixed and were passively viewing the stimuli. The visual stimuli consisted of bi-directionally moving square-wave drifting gratings in 8 different orientations, ranging from 0° to 157.5° with steps of 22.5°. Per different orientation 10 or 12 repetitions were performed: a total of 80 or 96 trials per recording. For this analysis, only one session per animal was used.

One visual stimulus lasted for 3 seconds; after 1.5 seconds the direction of the drifting grating switched by 180 degrees. The drifting gratings had a diameter of 60 retinal degrees and a spatial frequency of 0.05 cycles/degree. The temporal frequency was set to 1 Hz. Between stimulus presentations, a blank inter-trial interval (isoluminant grey screen) was presented for 5 seconds. Thus, the time between the onsets of two consecutive stimuli was 8 seconds. To prevent edge effects at the boundaries of the circular window, all stimuli were displayed within a cosine-ramped window.

The visual stimuli were presented on a 15 inch TFT screen with a refresh rate of 60 Hz, which was placed 16 centimeters from the mouse’s eye to achieve a diameter of 60 retinal degrees as stated above. Stimulus presentation was controlled by MATLAB using the PsychToolbox extension. The microscope setup was connected to a field-programmable gate array (OpalKelly XEM3001) and interfaced with the stimulus computer to synchronize the timing of the visual stimulation with the frame acquisition.

Two-photon data

The observed difference in fluorescence because of the neuron’s activity is computed by Equation 1.

\[ \frac{\Delta F}{F_0} = \frac{F_{stimulus} - F_{reference}}{F_{reference}} \]

Equation 1

The activity during the stimulus presentation (Fstimulus) is determined by taking the absolute average fluorescence during the stimulus presentation (duration of 3 seconds). The baseline (Freference) of one trial is determined by taking the absolute average fluorescence during a time window of 30 seconds before the imaging started. This returned the difference in fluorescence relative to the fluorescence baseline per stimulus trial.
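As an illustration of Equation 1, the following minimal Python sketch computes one ΔF/F0 value per trial from a raw fluorescence trace. It is not the analysis code used in this study. The function name, its arguments, and the choice to take the 30-second reference window directly before each stimulus onset are assumptions made for the example.

```python
import numpy as np

def delta_f_over_f0(trace, stim_onsets, frame_rate=12.7,
                    stim_duration_s=3.0, baseline_s=30.0):
    """Return one dF/F0 value per trial, following Equation 1.

    trace       : 1-D array with the raw fluorescence of one neuron (one value per frame)
    stim_onsets : frame indices at which each stimulus starts (hypothetical input)
    """
    trace = np.asarray(trace, dtype=float)
    n_stim = int(round(stim_duration_s * frame_rate))  # frames in the stimulus window
    n_base = int(round(baseline_s * frame_rate))       # frames in the reference window
    responses = []
    for onset in stim_onsets:
        f_stim = trace[onset:onset + n_stim].mean()
        # Assumption: the reference window directly precedes stimulus onset.
        f_ref = trace[max(0, onset - n_base):onset].mean()
        responses.append((f_stim - f_ref) / f_ref)
    return np.array(responses)
```

Each onset must leave at least one frame before it, otherwise the reference window is empty.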

OSI computation

The presented stimulus orientation that evoked the highest trial-averaged ΔF/F0 activity was defined as the preferred orientation of that neuron. The orientation selectivity index (OSI) of a neuron was computed with Equation 2. To put the activity at the preferred orientation (Rpreferred) in proportion, the ΔF/F0 activity averaged over all trials of the orthogonal orientation was used, denoted Rorthogonal.

\[ OSI = \frac{R_{preferred} - R_{orthogonal}}{R_{preferred} + R_{orthogonal}} \]

Equation 2
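A minimal Python sketch of Equation 2 is given here for illustration. It averages the ΔF/F0 responses per orientation, takes the orientation with the highest average as the preferred orientation, and uses the orientation 90° away as the orthogonal one. The function and variable names are hypothetical and not taken from the analysis code of this study.

```python
import numpy as np

# The 8 presented orientations: 0, 22.5, ..., 157.5 degrees
ORIENTATIONS = np.arange(0.0, 180.0, 22.5)

def tuning_and_osi(responses, stim_orientations):
    """Return (tuning curve, preferred orientation, OSI) for one neuron.

    responses         : dF/F0 value per trial
    stim_orientations : presented orientation (degrees) per trial
    """
    responses = np.asarray(responses, dtype=float)
    stim_orientations = np.asarray(stim_orientations, dtype=float)
    # Average response per orientation (the 8-element tuning vector)
    tuning = np.array([responses[stim_orientations == o].mean()
                       for o in ORIENTATIONS])
    pref_idx = int(np.argmax(tuning))
    ortho_idx = (pref_idx + 4) % 8  # 90 degrees equals 4 steps of 22.5 degrees
    r_pref, r_ortho = tuning[pref_idx], tuning[ortho_idx]
    osi = (r_pref - r_ortho) / (r_pref + r_ortho)
    return tuning, ORIENTATIONS[pref_idx], osi
```

For example, a neuron that responds three times as strongly to its preferred orientation as to the orthogonal orientation obtains an OSI of (3 - 1) / (3 + 1) = 0.5.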

Since Rorthogonal was always lower than Rpreferred, this resulted in an OSI value between 0 and 1.

Tuning curve analysis

Tuning curves were made by first averaging all ΔF/F0 values of the trials in which the same stimulus type was presented. In this study, we included 10 or 12 stimulus trials per orientation. Since our stimuli consisted of orientations between 0° and 157.5° in steps of 22.5°, we had 8 different stimulus types and therefore obtained a vector with 8 averaged ΔF/F0 responses per neuron. Example tuning curves visualizing this average response vector are shown in Figure 3E.

Bayesian Maximum-Likelihood decoding

The decoding algorithm used to predict the presented orientation was the Bayesian Maximum-Likelihood (ML) decoder. It is based on Bayes’ rule, given by Equation 3.

\[ P(\theta \mid A_{pop}) = \frac{P(A_{pop} \mid \theta) \, P(\theta)}{P(A_{pop})} \]

Equation 3

In words, Bayes’ rule can be read as:

posterior probability = (likelihood * prior probability) / normalization term

For computational purposes, we assumed the prior probability (P(θ)) to be equal for all orientations, since there is little difference in prior probabilities of normal and uniform distributions (Jacobs et al., 2009, Figure 3C). In short, the Bayesian ML decoder estimates for every orientation the probability that it was the presented stimulus orientation, and then returns the orientation with the highest probability as the presented stimulus orientation.

First, the likelihood (P(Apop|θ)) must be estimated for all orientations. The likelihood represents the probability P of observing the population activity Apop when orientation θ is the presented stimulus. Here, θ is a categorical value between 0° and 157.5° in steps of 22.5°. For each neuron, the mean activity and standard deviation are derived from all trials (10 or 12 per neuron) in which the presented stimulus had the given orientation (for example θ = 0°). From this mean and standard deviation, a Gaussian distribution is constructed, for example for neuron i and θ = 0°. It is important to emphasize that this results in a normal distribution of responses to one orientation, not a tuning curve.

With the measured activity of that neuron, Ai, the probability of θ = 0° being the presented orientation can be read from the Gaussian distribution. Assuming that the neurons respond independently (one probability does not influence the next), the population posterior probability for an orientation was calculated as the product of the posterior probabilities for that orientation over all neurons (Equation 4). The posterior probability (P(θ|Apop)) holds the probability P of the orientation θ, given the measured activity of the population.

\[ P(\theta \mid A_{pop}) = \prod_{i=1}^{n} P(\theta \mid A_i) \]

Equation 4

The orientation with the highest posterior probability following from Equation 4 is read out as the decoded orientation.
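The decoding steps of Equations 3 and 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: log-probabilities are summed instead of multiplying probabilities (which gives the same decoded orientation but avoids numerical underflow), a small floor is placed on the standard deviation, and performance is evaluated with leave-one-trial-out cross-validation, a choice the text above does not specify. All function names are hypothetical.

```python
import numpy as np

def fit_gaussians(train_resp, train_labels, classes):
    """Mean and SD per orientation and neuron; train_resp has shape (trials, neurons)."""
    mu = np.array([train_resp[train_labels == c].mean(axis=0) for c in classes])
    sd = np.array([train_resp[train_labels == c].std(axis=0, ddof=1) for c in classes])
    return mu, np.maximum(sd, 1e-6)  # floor avoids division by zero (assumption)

def decode(trial_resp, mu, sd, classes):
    """Return the orientation with the highest posterior under a uniform prior."""
    # Log Gaussian likelihood per orientation and neuron; constants shared by
    # all orientations are dropped because they do not change the argmax.
    log_like = -0.5 * ((trial_resp - mu) / sd) ** 2 - np.log(sd)
    # Summing logs over neurons corresponds to the product in Equation 4.
    log_post = log_like.sum(axis=1)
    return classes[int(np.argmax(log_post))]

def leave_one_out_accuracy(resp, labels, classes):
    """Fraction of trials decoded correctly when each trial is held out in turn."""
    correct = 0
    for t in range(len(labels)):
        keep = np.arange(len(labels)) != t
        mu, sd = fit_gaussians(resp[keep], labels[keep], classes)
        correct += decode(resp[t], mu, sd, classes) == labels[t]
    return correct / len(labels)
```

With a uniform prior, normalizing each neuron's posterior over orientations only rescales the product by a factor that is identical for all orientations, so the sketch works directly with likelihoods.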

Determination of contribution to the decoder per neuron

The following approach is the same as in Montijn et al. (2014). The best and worst contributing neurons were determined by a bootstrapping procedure, followed by a jackknifing procedure. The procedures were applied to obtain insight into the contribution of a neuron’s response to the decoding performance. Bootstrapping was performed with random samples of 15 neurons drawn from the population and was iterated 500 times per session; in each iteration a new random sample of 15 neurons was selected.

A jackknifing procedure was performed after every bootstrap resample. The aim of the jackknifing procedure was to determine the decoding performance with and without a neuron. To quantify each neuron’s contribution to decoding, Equation 5 was used for every neuron i. Here, S is the sample size (in this case 15), D is the decoding performance of the whole sample, and D¬i is the decoding performance of the sample without neuron i.

\[ D_i = S \, D - (S - 1) \, D_{\neg i} \]

Equation 5

The decoding index for every neuron i (Di) was determined by leaving this neuron out of the subgroup and predicting the presented orientations with the ML decoder. If the decoding performance is very low without neuron i, meaning neuron i has a great impact on the decoder, D¬i would be low. Therefore, Di would be very high. Following from this, neurons were ranked on their contribution. Neurons with the highest values were considered the best contributing neurons, and neurons with the lowest values the worst contributing neurons.
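The bootstrap and jackknife procedure around Equation 5 can be sketched as below. This is an illustrative Python sketch, not the code used in this study. The compact decoder inside it, the leave-one-trial-out evaluation, and the averaging of Di over all bootstrap samples that contained neuron i are assumptions, and all names are hypothetical.

```python
import numpy as np

def decoding_accuracy(resp, labels):
    """Leave-one-trial-out accuracy of an independent-Gaussian ML decoder."""
    classes = np.unique(labels)
    hits = 0
    for t in range(len(labels)):
        keep = np.arange(len(labels)) != t
        tr, tl = resp[keep], labels[keep]
        mu = np.array([tr[tl == c].mean(axis=0) for c in classes])
        sd = np.array([tr[tl == c].std(axis=0, ddof=1) for c in classes])
        sd = np.maximum(sd, 1e-6)  # floor avoids division by zero (assumption)
        log_post = (-0.5 * ((resp[t] - mu) / sd) ** 2 - np.log(sd)).sum(axis=1)
        hits += classes[np.argmax(log_post)] == labels[t]
    return hits / len(labels)

def neuron_contributions(resp, labels, sample_size=15, n_iter=500, seed=0):
    """Average D_i (Equation 5) per neuron over bootstrap samples containing it."""
    rng = np.random.default_rng(seed)
    n_neurons = resp.shape[1]
    total = np.zeros(n_neurons)
    count = np.zeros(n_neurons)
    for _ in range(n_iter):
        # Bootstrap step: draw a random sample of neurons
        sample = rng.choice(n_neurons, size=sample_size, replace=False)
        d_full = decoding_accuracy(resp[:, sample], labels)
        # Jackknife step: leave each sampled neuron out in turn
        for k, neuron in enumerate(sample):
            rest = np.delete(sample, k)
            d_without = decoding_accuracy(resp[:, rest], labels)
            total[neuron] += sample_size * d_full - (sample_size - 1) * d_without
            count[neuron] += 1
    return total / np.maximum(count, 1)  # neurons never sampled get 0
```

Sorting the returned values in descending order gives a ranking from best to worst contributing neurons.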

Noise correlations

Noise correlations indicate the similarity in trial-by-trial response variability between neurons (Montijn et al., 2014). To determine the noise correlation of a neuronal pair, first a response vector is made per neuron for each orientation θ, as shown in Equation 6. This vector has length t, which is 10 or 12, the number of trials we presented per orientation θ.

\[ R_{\theta} = [R_{\theta,1}, \ldots, R_{\theta,t}] \]

Equation 6

This results in 8 response vectors per neuron, one for each orientation θ. Because our purpose is to compare single noise correlation values, we took the mean over all presented orientations (0° to 157.5° in steps of 22.5°). The single noise correlation for a neuronal pair i and j is then calculated with Equation 7.

\[ \rho_{i,j} = \frac{\sum_{\theta=0}^{157.5} corr(R_{i,\theta}, R_{j,\theta})}{8} \]

Equation 7
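As an illustration of Equations 6 and 7, the sketch below computes the noise correlation for every pair of neurons by correlating their responses within each orientation and then averaging over orientations. It assumes that corr denotes the Pearson correlation coefficient, which the text does not state explicitly. The function name is hypothetical.

```python
import numpy as np

def noise_correlation_matrix(resp, labels):
    """Pairwise noise correlations averaged over orientations.

    resp   : array of shape (trials, neurons) with dF/F0 responses
    labels : presented orientation per trial
    Returns an array of shape (neurons, neurons).
    """
    classes = np.unique(labels)
    n_neurons = resp.shape[1]
    corr_sum = np.zeros((n_neurons, n_neurons))
    for c in classes:
        # Rows are the trials of one orientation, columns are neurons, so each
        # column is one response vector as in Equation 6.
        corr_sum += np.corrcoef(resp[labels == c], rowvar=False)
    return corr_sum / len(classes)
```

Because the correlation is computed within one orientation at a time, differences in mean response between orientations (the tuning) do not enter the result.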

Results

We measured the responses of neurons in mouse V1, using two-photon microscopy in head-fixed mice expressing GCaMP6m (n=12; Figure 1A, left). Responses of neuronal populations were measured while presenting black-and-white gratings with 100% contrast in eight different orientations between 0 degrees and 157.5 degrees in steps of 22.5 degrees (Figure 1A, right). With two-photon microscopy we were able to visualize and determine the activity of multiple neurons at the same time in one imaging plane (Figure 1B). The number of simultaneously measured neurons varied between 130 and 194 per mouse. The injected viral construct encoded the protein GCaMP6m, a calcium indicator that becomes fluorescent when calcium binds. The fluorescence provided a suitable measure of the activity of each neuron, given by ΔF/F0 (Chen et al., 2013). Because the neurons had different orientation preferences, the activity evoked by the different orientations varied per neuron. The fluorescence traces of two example neurons are shown in Figure 1C.

FIGURE 1 – Overview of experimental set-up and obtained two-photon microscopy data of GCaMP6 expressing neurons in mouse V1

A) Head-fixed mice (n=12) were positioned in front of a screen on which black-and-white (100% contrast) gratings were presented in eight different orientations, drifting in two different directions per trial. B) Example of a two-photon imaging plane in mouse V1. C) Two example ΔF/F0 traces of the neurons marked in B. Presented orientations are represented in the same color code as in A. The neurons differ in the orientations they respond to: neuron 1 responded strongly to 0 and 157.5 degrees (cyan and purple), while neuron 2 responded strongly to only 157.5 degrees (purple).

The different responses evoked by different orientations can be explained by the orientation selectivity of neurons. The tuning of a neuron is quantified by the orientation selectivity index (OSI). We defined OSI as (Rpreferred – Rorthogonal) / (Rpreferred + Rorthogonal), which implies that an equal response to all orientations gives an OSI of 0, and perfect orientation selectivity gives an OSI of 1. The distribution of observed OSI values is shown in Figure 2A. Only about one third of the measured neuronal population in mouse V1 consisted of highly orientation-selective neurons, which is in line with earlier findings that ~60% of a population are non-oriented neurons (Hübener, 2003). To investigate the neuronal population coding of orientations, only the responses of highly orientation-selective neurons were relevant. Therefore, only neurons with an OSI value above a threshold of 0.5 were included in further analysis (32.5%, n=616; Figure 2A). From now on, when referring to the neuronal population or all neurons, only neurons with OSI values above 0.5 are considered. To emphasize, the remaining neurons were relatively highly orientation-selective: their response to the preferred orientation was at least three times their response to the orthogonal orientation (a ratio of 3:1 or higher). However, this selectivity held no information about which orientation they were selectively responding to. The preferred orientations were uniformly distributed across the population, although a small peak was observed at 67.5 degrees (Figure 2B). Neurons in mouse V1 appear to be organized in a salt-and-pepper fashion (Ohki et al., 2005), which was also the spatial distribution we found (Figure 2C).

FIGURE 2 – Orientation selectivity index (OSI) of neuronal population with their preferred orientations and spatial distributions

A) About one third (32.5%) of the neuronal population showed an OSI above 0.5. Only these neurons (n=616, varying per mouse between 38 and 69) were included in further analysis. B) Preferred orientations of neurons with OSI>0.5 were uniformly distributed, showing a small peak at 67.5 degrees. C) Same color code as in Figure 1A (right). Imaging plane of one example session, illustrating the spatial distribution of the neurons with their preferred orientation. Preferred orientations of neurons with OSI>0.5 are organized in a salt-and-pepper way.

Neuronal subpopulations with worst contributing neurons decrease the decoding performance

When a stimulus with an orientation between 0 and 157.5 degrees is presented, the neuronal population responds in its own unique manner. Therefore, the population code can reliably distinguish between different orientations. To infer the presented orientation from the population activity, we used the Bayesian Maximum-Likelihood decoder (for a complete description, see Methods). Each neuron’s contribution to returning the correct presented orientation was quantified by a bootstrapping procedure, followed by a jackknifing procedure. In short, random groups of 15 neurons were selected and the neurons were left out one by one while the decoder predicted the presented stimuli. The percentage of correctly predicted orientations of the subgroup containing the neuron was compared to that of the subgroup without the neuron; the contribution of that neuron was then determined as the difference in decoding performance (for a complete description, see Methods). Thus, neurons with a high contribution were important for correctly decoding the orientation.

Contribution and OSI did not show a correlation, indicating that a high contribution does not automatically imply a high OSI (Figure 3A). Interestingly, we also found neurons that showed a high OSI, but appeared to have a low contribution; these neurons are located at the bottom right in Figure 3A.

To get more insight into the importance of a neuron’s contribution to decoding orientations, the contributions of the neuronal populations per mouse were sorted in descending order. The information provided to the decoder (given as neuronal activity) was cumulatively increased, starting with the activity of the first neuron (thus the neuron with the highest contribution). Adding the next neuron increases the information about the presented stimulus, so the decoding performance is expected to increase (Figure 3C). We observed that the curve representing the decoding performance increased steeply at the beginning, demonstrating that only a small fraction of neurons contributed highly to the decoding performance. This is substantiated by the distribution of the contributions, in which the majority of neurons showed a relatively low contribution (Figure 3B).

One might intuitively expect that after the quick increase of the decoding performance, the curve will flatten (around 95% for the best decoding performance) and not significantly increase or decrease anymore. The decoder has already received the greatest amount of information from the best contributing neurons, so the information, and therefore the decoding performance, will not change considerably. However, Figure 3C clearly demonstrates that all decoding performances decrease at the end: adding the neurons with the lowest contribution affects the decoder in such a way that the overall decoding performance goes down. We observed this effect consistently in all mice (n=12). In other words, the decoder decodes more accurately without the information of these neurons. This demonstrates that the subpopulations of neurons with the lowest contribution to the decoder are the neurons that hamper population coding of orientation.

Worst contributing neurons are less orientation-selective and show more variation in activity around preferred orientations

Two neuronal subpopulations were distinguished for further analysis: one subpopulation consisted of the fifteen neurons with the highest contribution to the decoding performance per mouse (n=180), and one subpopulation consisted of the fifteen neurons with the lowest contribution to the decoding performance per mouse (n=180). We found that the average OSI of the two neuronal subpopulations differed significantly, being 0.7817 and 0.6519 respectively (Figure 3D; Paired t-test: p<0.001).

To further identify tuning properties of the neuronal subpopulations, differences and similarities in tuning of the subpopulations were analyzed. When exploring the responses of the neurons, we found a range of response types. The responses included neurons that were highly selective for stimulus orientation, as well as neurons that were less selective for stimulus orientation. Importantly, “less selective neurons” in this research were still neurons with an OSI above 0.5. Examples of two neurons’ responses to different orientations are shown in Figure 3E. The preferred orientation of the example neuron on the left (45 degrees) can easily be deduced from the tuning curve. However, the example neuron on the right did not considerably change its activity in response to different orientations.

The tuning curves shown only include the presented orientations. To estimate the exact preferred orientations, a Von Mises fit was applied. The information given to the decoder quickly saturated (Figure 3C), and this effect was consistent in all mice, suggesting that all orientations should be covered by each neuronal subpopulation. We tested this assumption by computing the circular variance (S = 1 – R) of the exact preferred orientations of the fifteen best and the fifteen worst contributing neurons. The lower the mean resultant vector length R, the more widely the data were distributed; thus a high circular variance represented broadly distributed data. The circular variance did not differ consistently between the best (most contributing) and worst (least contributing) neurons: in 6 out of 12 mice (50%) the circular variance of the fifteen best contributing neurons was higher than that of the fifteen worst contributing neurons (Figure 3F). We concluded that a wide range of preferred orientations was present in both subpopulations.
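The circular variance S = 1 – R can be sketched in Python as follows. Because orientations repeat every 180 degrees, this sketch doubles the angles before computing the mean resultant vector length R; that is a common convention for orientation data but an assumption here, since the text does not state how R was computed. The function name is hypothetical.

```python
import numpy as np

def circular_variance(pref_orientations_deg):
    """Circular variance S = 1 - R of preferred orientations (degrees, period 180)."""
    theta = np.deg2rad(np.asarray(pref_orientations_deg, dtype=float))
    # Assumption: angles are doubled so that 0 and 180 degrees coincide.
    r = np.abs(np.mean(np.exp(2j * theta)))  # mean resultant vector length R
    return 1.0 - r
```

A group of neurons that all prefer the same orientation gives S = 0, whereas preferred orientations spread evenly over all orientations give S = 1.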

To determine the sharpness of tuning, the bandwidth of all fitted tuning curves was examined. The wider the bandwidth, the more activity the neuron showed around its peak value (thus around its preferred orientation), and the less sharply the neuron was tuned. In 8 out of 12 mice (66.7%) the mean bandwidth of the worst contributing neurons was higher than that of the best contributing neurons, and in one case this difference was significant (Figure 3G). Overall, however, we did not find a clear difference in sharpness of tuning between the neuronal subpopulations.

To reveal the fluctuations in responses at the neurons’ preferred orientations, these responses were analyzed with the coefficient of variation (COV), defined as the standard deviation divided by the mean. The neuronal subpopulations with the best contributing neurons showed less variation around the mean in response to their preferred orientations than the neuronal subpopulations with the worst contributing neurons (Paired t-test, p<0.05; Figure 3H).

FIGURE 3 – Decoding performance decreases when cumulatively adding neurons with the lowest contribution. Comparison of two neuronal subpopulations (fifteen highest and fifteen lowest contribution) revealed that the worst neurons are less orientation-selective and show more variation in activity around their preferred orientations

A) Neurons with high OSI show both high and low values for contribution. No correlation is observed between OSI and contribution, demonstrating that some neurons show low contribution but high OSI. B) Distribution of the contribution of neurons. A minor part of the neuronal population consists of highly contributing neurons. C) The decoding performance is plotted per mouse (n=12) when adding neurons with the highest contribution first. Adding the neurons with the lowest contribution causes the decoding performance to decrease. D) The mean OSI of the best neurons and the worst neurons in C differs significantly (Paired t-test, p<0.05); mean OSI values are 0.7817 and 0.6519 respectively. E) Left: Tuning curve of an example neuron that shows high orientation selectivity. Right: Tuning curve of an example neuron that does not show orientation selectivity. F) The circular variance of the preferred orientations in the two subpopulations did not show consistent differences, so no conclusion can be drawn about a difference in the spread of the preferred orientations. G) The bandwidth was higher in the lowest contribution neurons in 66.7% of the mice; however, great variation was observed. H) The coefficients of variation of the best and worst neurons are significantly different (Paired t-test, p<0.05), demonstrating that there is higher variation around the mean response to the preferred orientations in the worst neurons.

The neuronal subpopulations of best and worst contributing neurons are not spatially clustered

It has been shown that clustering of tuning similarity occurs in mouse V1, and this effect is still observed after restricting the analysis to sharply tuned neurons (Ringach et al., 2016). Therefore, a potential clustering of the best contributing and worst contributing neurons was examined. As an exception, all neurons were included in the analysis of spatial clustering (n=1895), so that the result was not affected by the OSI selection. Figure 4A shows an example imaging plane in which the fifteen best and fifteen worst neurons are marked in green and orange respectively. The clustering was investigated by examining the identity of the fifteen nearest neighbors per neuron, thus the fifteen neurons with the lowest cortical distances from that neuron. The cortical distance between a pair of neurons is defined as the distance between the centers of their cell bodies in the imaging plane (Ringach et al., 2016).

For the best neurons, the mean number of neighbors from the same subpopulation was 1.4833; for the worst neurons, it was 1.4333 (Figure 4B, solid lines). To determine whether these values reflect spatial clustering, we shuffled the labels of the neurons (best, worst or remaining) in 500 repeats. In each repeat, the number of same-label neighbors among the fifteen nearest neighbors was counted. The resulting distributions for randomly assigned labels are shown in Figure 4B, with significance bounds set at 0.05 and 0.95 (Figure 4B, dashed lines). We found no clustering in the measured spatial distributions: the observed number of same-label neighbors did not differ significantly from that of random spatial distributions.
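The label-shuffling procedure can be sketched as follows. This is a minimal illustration in Python with NumPy, not the analysis code used in this study. The function names, the array layout (`xy` holding cell-body coordinates, `labels` holding 'best', 'worst' or 'rest') and the use of empirical quantiles for the 0.05 and 0.95 bounds are assumptions.

```python
import numpy as np

def nearest_neighbors(xy, k=15):
    """Indices of the k nearest neighbors of every neuron (rows of xy are x, y positions)."""
    diff = xy[:, None, :] - xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))   # pairwise cortical distances
    np.fill_diagonal(dist, np.inf)             # a neuron is not its own neighbor
    return np.argsort(dist, axis=1)[:, :k]

def mean_same_neighbors(nearest, labels, target):
    """Mean number of neighbors labeled `target`, averaged over neurons labeled `target`."""
    is_target = labels == target
    return is_target[nearest].sum(axis=1)[is_target].mean()

def clustering_test(xy, labels, target, k=15, n_repeats=500, seed=0):
    """Observed same-label neighbor count and the 0.05/0.95 bounds under shuffled labels."""
    rng = np.random.default_rng(seed)
    nearest = nearest_neighbors(xy, k)
    observed = mean_same_neighbors(nearest, labels, target)
    # Shuffling the labels keeps the positions and group sizes but breaks any spatial structure
    null = np.array([
        mean_same_neighbors(nearest, rng.permutation(labels), target)
        for _ in range(n_repeats)
    ])
    lower, upper = np.quantile(null, [0.05, 0.95])
    return observed, lower, upper
```

Under these assumptions, a subpopulation would be called clustered when `observed` exceeds `upper`, and not clustered when it falls between `lower` and `upper`, as was the case for both subpopulations here.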

FIGURE 4 – Spatial clustering of the best and worst neurons is not observed

A) Imaging plane with the spatial distribution of the two neuronal subpopulations and the remaining neurons. An example imaging plane is shown; for the analysis, all neurons (OSI between 0 and 1) were included (n=1895, varying between 130 and 194 per mouse). B) The number of neighbors from the same group (best: 1.4833, worst: 1.4333; solid lines) among the fifteen nearest neighbors was compared with random spatial distributions (500 repeats). Dashed lines indicate the significance bounds at 0.05 and 0.95. The number of same-group neighbors does not differ significantly from random distributions. For both subpopulations, no spatial clustering was found.

Removal of noise correlations in worst contributing neurons increases the decoding performance

To address the decrease in decoding performance caused by the group of neurons with the lowest contribution (as observed in Figure 3C), we considered the effect of experimentally measured correlations on neuronal population coding. Correlations are thought to affect the information encoded by a neuronal population, and to have a particularly strong impact on the portion of information that can be extracted with simple and biologically realizable algorithms (Kohn et al., 2016). To quantify this effect, the performance of the Bayesian Maximum-Likelihood decoder on the real neuronal population activity was compared with its performance on randomly shuffled data (Figure 5A). The shuffled population activity was obtained by randomizing trials across repetitions of the same stimulus type, that is, stimuli with the same orientation. Because each neuron’s activity was only exchanged between trials of the same stimulus type, the neurons’ mean responses to each orientation were unaffected; only the neurons’ correlations with other neurons across trials were destroyed. Hence, this procedure preserved stimulus tuning but destroyed noise correlation structures (Montijn et al., 2016). When all measured neurons were taken into account, no difference between real and shuffled data was found, indicating that noise correlations in the entire neuronal population did not influence the decoding performance (t-test, p>0.05; Figure 5A, red).
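The shuffling step can be illustrated with a short sketch. It assumes a response matrix of shape (trials, neurons) and one orientation label per trial; the function name and the array layout are hypothetical and are not taken from the analysis code of this study.

```python
import numpy as np

def shuffle_within_stimulus(responses, stimulus, seed=0):
    """Permute trials independently per neuron, within each stimulus type.

    responses: array (n_trials, n_neurons); stimulus: array (n_trials,) of orientation labels.
    """
    rng = np.random.default_rng(seed)
    shuffled = responses.copy()
    for s in np.unique(stimulus):
        trials = np.flatnonzero(stimulus == s)
        for neuron in range(responses.shape[1]):
            # Each neuron gets its own trial order, so its tuning (mean response per
            # orientation) is kept while its trial-by-trial covariation with other
            # neurons is broken.
            shuffled[trials, neuron] = responses[rng.permutation(trials), neuron]
    return shuffled
```

Because every neuron keeps exactly the same set of responses per orientation, any change in decoding performance between `responses` and the shuffled copy can be attributed to the correlation structure rather than to tuning.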


However, when only the best or only the worst neurons were included, we did find a large effect on the decoding performance. A negative value corresponds to better decoding performance when the data are shuffled and noise correlations are destroyed; a positive value corresponds to better decoding performance with the real data, in which noise correlations are intact. The neuronal subpopulations with the best neurons consisted of the fifteen neurons with the highest contribution to the decoder per mouse (n=180). We expected that removing the noise correlations from these neurons would remove valuable information and would therefore decrease the decoding performance. This was indeed the case: the noise correlations of the best contributing neurons improved the decoding performance (t-test, p<0.001; Figure 5A, green). In contrast, when we included only the worst neurons (n=180), the decoding performance was 40.4861% higher with the data from which noise correlations were removed than with the real data (t-test, p<0.001; Figure 5A, orange). This means that the coding of orientation improved when the noise correlations of the neurons with the lowest contribution were removed.

Noise correlations in worst contributing neurons are higher than in best contributing neurons

Noise correlations are trial-to-trial fluctuations that are shared by pairs of neurons. We used an index of the mean shared trial-by-trial variability over all stimulus orientations as the definition of noise correlations (see Methods, Noise correlations). To evaluate the noise correlations in more detail, we computed matrices of pairwise Pearson correlations for the fifteen best and the fifteen worst neurons of each mouse (Figure 5B). The noise correlation of a neuron with itself is by definition 1 (Figure 5B, red diagonal); these values were excluded when computing the mean noise correlation. Noise correlations usually have mean values between 0.01 and 0.2 (Kohn et al., 2016), which agrees with our findings. We found a significant difference in mean noise correlations (NCs) between the best neurons and the worst neurons (Paired t-test, p<0.01; Figure 5C). We therefore conclude that the neuronal subpopulations that contributed least to the decoding of orientation have significantly higher NCs than the subpopulations that contributed most.
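A pairwise noise correlation matrix and its mean can be computed along the following lines. This sketch assumes that the Pearson correlation is computed per orientation and then averaged over orientations; the exact index used in this study is defined in the Methods and may differ. Function and variable names are hypothetical.

```python
import numpy as np

def noise_correlation_matrix(responses, stimulus):
    """Pairwise Pearson correlations across trials, averaged over stimulus orientations.

    responses: array (n_trials, n_neurons); stimulus: array (n_trials,) of orientation labels.
    """
    per_orientation = [
        np.corrcoef(responses[stimulus == s], rowvar=False)  # columns are neurons
        for s in np.unique(stimulus)
    ]
    return np.mean(per_orientation, axis=0)

def mean_noise_correlation(nc):
    """Mean of the off-diagonal entries; the diagonal (self-correlation of 1) is excluded."""
    off_diagonal = ~np.eye(nc.shape[0], dtype=bool)
    return nc[off_diagonal].mean()
```

Computing the correlation within each orientation removes the contribution of tuning, so that only shared trial-to-trial variability remains.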

FIGURE 5 – Effects of noise correlations on Bayesian Maximum-Likelihood decoding performance

A) The decoding performance of the Bayesian Maximum-Likelihood decoder on the measured datasets is compared with its performance on shuffled datasets, in which noise correlation structures were destroyed. Shuffling did not affect the decoding performance of the entire neuronal population. A decrease of 12.8993% was found when noise correlations of the best neurons were removed (t-test, p<0.001). An increase of 40.4861% was found when noise correlations of the worst neurons were removed (t-test, p<0.001), which indicates a detrimental effect of NCs on decoding. B) Left: Correlation matrices of NCs between the best neurons, measured with Pearson correlations. Right: Same as left, for the neuronal subpopulations with the worst neurons. C) Same color code as in A. Neuronal subpopulations with the lowest contribution show increased noise correlations (Paired t-test, p<0.01). Observed mean NCs of the best and worst neuronal subpopulations were 0.0286 and 0.0669, respectively.


Discussion

In this study, we measured the activity of neurons in mouse V1 with in vivo two-photon calcium imaging while mice passively viewed oriented, moving square-wave grating stimuli. The neurons were sorted by their contribution to the prediction of the presented stimulus orientation by the Bayesian Maximum-Likelihood decoder. When the information (i.e. measured activity) given to the decoder was limited to the fifteen best contributing neurons, the decoding performance increased quickly (Figure 3C). This is consistent with Franke et al. (2016), who found that the coding error decreased quickly as the number of included neurons increased from one to twenty. This implies that the information of only a few neurons, those with the highest contribution to the decoder, is needed to optimally decode the orientation of a visual stimulus. However, when the worst contributing neurons were added to the information given to the decoder, the decoding performance declined (Figure 3C). This means that orientation is decoded better without the information from these neurons. We conclude that the neurons with the lowest contribution to the decoder form the neuronal subpopulations that impede the decoding of orientation.

We expected the worst contributing neurons to be the least orientation-selective (located at the bottom left in Figure 3B). Interestingly, the neurons that disturbed the population activity for decoding the correct stimulus were not the least orientation-selective neurons in the population. Although we observed a significant difference in OSI between the best contributing and worst contributing neurons (Figure 3D), the neurons with the lowest contribution were not the neurons with the lowest OSI. Thus, it was not a lack of orientation selectivity that made them the worst contributing neurons. However, neurons can be highly selective for orientation and still show high variation in activity around their preferred orientations. For the worst contributing neurons, we observed higher variation around their preferred orientations than for the best contributing neurons (Figure 3H). We suggest that a high coefficient of variation in activity around the preferred orientation can cause orientation-selective neurons to disrupt the population coding of orientation. We introduced the concept of trial-to-trial variability, or noise, as a source of limited information. Noise itself is not considered beneficial or detrimental to coding: noise is present, and correlations can shape it in a way that favors or impedes coding (Franke et al., 2016). We suspected that the decrease in decoding performance was due to interactions between these neurons that perturb the decoding of the presented stimulus. We observed that the worst contributing neurons have higher noise correlations than the best contributing neurons (Figure 5C), and conclude that these neurons have higher correlated trial-to-trial variability.

To determine whether the increased noise correlations in the worst contributing neurons impeded the coding of orientation, we artificially removed the noise correlations in this neuronal subpopulation. We destroyed the noise correlations by randomizing the measured activity across trials of the same stimulus type, that is, the same presented orientation. With this shuffling of trials, each neuron’s mean response to a given orientation remained the same; only the correlations between the trial-to-trial variability of neurons, the noise correlations, were removed. We confirmed that the noise correlations in the neurons with the lowest contribution are detrimental: their removal resulted in a significant increase in decoding performance (Figure 5A, orange). This implies that the increased noise correlations in the worst contributing neurons disrupt population coding, because removing them improves the decoding performance.

To summarize, we showed that there are neuronal subpopulations that disrupt coding of orientation in V1 through increased correlated trial-to-trial variability. First, we identified the neuronal subpopulations that hamper coding of orientation as the neurons with the lowest contribution to decoding. Next, we concluded that these subpopulations have higher noise correlations, and that the removal of these increased noise correlations favors coding. We demonstrated that decoding of orientation improved without the noise correlations in the neurons with the lowest contribution. Therefore, we showed with experimental data from mouse V1 that noise correlations can have a detrimental effect on coding of orientation.

Encoded information increases when removing NCs: an explanatory model

The phenomenon that noise correlations are detrimental by limiting information has been considered in computational approaches (Averbeck et al., 2006; Moreno-Bote et al., 2014; Kohn et al., 2016), and we suggest that our study demonstrates the existence of detrimental noise correlations with experimental data. The detrimental noise correlations we propose can be explained by a model similar to the one described by Averbeck et al. (2006). When the signal correlation between a neuronal pair is higher than 0, both neurons of the pair respond more strongly to stimulus 2 than to stimulus 1 (Figure 6, left). When the signal correlation is smaller than 0, the opposite holds: only one neuron of the pair responds more strongly to stimulus 2 than to stimulus 1 (Figure 6, right). By introducing NCs with the same sign as the signal correlations (thus positive NCs on the left and negative NCs on the right), the response distributions become ellipse-shaped and elongated along the direction that separates the two stimuli (Figure 6, top). By artificially removing NCs, which we achieved by shuffling the trials of responses to the same stimulus type, the NC structure disappears and the response distributions become circular (Figure 6, bottom). Removing the NCs reduces the overlap between the response distributions of the neuronal pair to the two stimuli. This means that the amount of information present in the shuffled responses is higher than in the original responses. Because these NCs reduce the information, they can be considered detrimental. We observed that the decoding performance increased when the NCs of the worst contributing neurons were destroyed (Figure 5A), and based on this model, we suggest that the NCs in these subpopulations are structured such that they increase the overlap between the responses to different stimuli.
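The geometry of this two-neuron model can be made concrete with a small numerical sketch. It assumes Gaussian response distributions with equal variance for both neurons and both stimuli, and it measures how well the two stimuli can be separated with the squared discriminability d'² of the optimal linear readout. The function name and the parameter values are illustrative assumptions, not quantities from this study.

```python
import numpy as np

def discriminability(delta_mu, rho, sigma=1.0):
    """Squared d' between two stimuli for a neuronal pair with noise correlation rho.

    delta_mu: difference in mean response (stimulus 2 minus stimulus 1) per neuron.
    """
    delta_mu = np.asarray(delta_mu, dtype=float)
    cov = sigma ** 2 * np.array([[1.0, rho], [rho, 1.0]])
    # d'^2 = delta_mu^T * cov^-1 * delta_mu
    return float(delta_mu @ np.linalg.solve(cov, delta_mu))

positive_signal = [1.0, 1.0]    # both neurons respond more strongly to stimulus 2
negative_signal = [1.0, -1.0]   # only one neuron responds more strongly to stimulus 2

for label, delta_mu, rho in [
    ("positive signal, positive NC", positive_signal, 0.5),
    ("positive signal, shuffled", positive_signal, 0.0),
    ("negative signal, negative NC", negative_signal, -0.5),
    ("negative signal, shuffled", negative_signal, 0.0),
]:
    print(f"{label}: d'^2 = {discriminability(delta_mu, rho):.3f}")
```

For a mean difference of (1, 1) and unit variance, the expression reduces to d'² = 2 / (1 + ρ). Within this simplified model, setting ρ to zero, as shuffling does, therefore increases discriminability whenever the noise correlation is aligned with the direction in which the mean responses differ.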

FIGURE 6 – Effects of correlations on information encoding

This model shows that removing noise correlations increases the amount of encoded information. Unshuffled responses (top) and shuffled responses, in which noise correlations are removed (bottom), are shown. The dashed line indicates the optimal decision boundary for distinguishing the two stimuli: every response of a neuronal pair below the dashed line is classified as stimulus 1, and every response above the dashed line is classified as stimulus 2. When noise correlations are artificially removed, the ellipses transform into circles. This results in less overlap between the responses to the two stimuli, so responses are more often classified as the correct stimulus.

It remains unclear how exactly the interaction between the neurons gives rise to detrimental NCs. The worst contributing neurons were not found to be spatially clustered (Figure 4B); therefore, their spatial distribution cannot be identified as the source of the detrimental NCs. A potential explanation is the formation of neuronal assemblies, in which neurons receive, besides information from the stimulus, information from an unknown “common source” as well (Harris, 2005). The presence of unobserved variables reflects an important aspect of information processing. Correlations generated by these unobserved variables would not survive shuffling, and cannot be detected by stimulus reconstruction. Therefore, they can only lower the quality of reconstruction of a sensory stimulus (Harris, 2005). This is in line with our main finding that decoding performance is decreased when the worst contributing neurons are included, and that the NCs of these neurons have a detrimental effect on population coding.

Information-limiting correlations: differential correlations

The brain does not simply encode information and leave it untouched. Instead, it transforms this information to make it more useful. However, through this transformation, noise correlation patterns can change such that they become indistinguishable from the signal pattern across the population (Pitkow & Angelaki, 2017). These are thought to be the only “information-limiting correlations”, and are also called differential correlations (Moreno-Bote et al., 2014; Kohn et al., 2016). Differential correlations cause population responses to identical stimuli to fluctuate in the same manner as the changes in population response that would be caused by presenting another stimulus (Kohn et al., 2016). Therefore, differential correlations impede coding, because they make different stimuli harder to distinguish.
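The information-limiting effect of differential correlations can be illustrated numerically. The sketch below follows the formulation of Moreno-Bote et al. (2014), in which the noise covariance is written as Σ = Σ0 + ε f′f′ᵀ, where f′ is the derivative of the population tuning curves with respect to the stimulus, and information is measured as linear Fisher information. The function names, the population size and the value of ε are illustrative assumptions and do not come from the data of this study.

```python
import numpy as np

def linear_fisher_information(f_prime, cov):
    """Linear Fisher information: f'^T * cov^-1 * f'."""
    return float(f_prime @ np.linalg.solve(cov, f_prime))

def add_differential_correlations(cov0, f_prime, eps):
    """Add noise that fluctuates along the signal direction f' with strength eps."""
    return cov0 + eps * np.outer(f_prime, f_prime)

rng = np.random.default_rng(0)
n_neurons = 50
f_prime = rng.normal(size=n_neurons)   # hypothetical tuning-curve derivatives
cov0 = np.eye(n_neurons)               # independent noise without differential correlations
eps = 0.05

info_without = linear_fisher_information(f_prime, cov0)
info_with = linear_fisher_information(
    f_prime, add_differential_correlations(cov0, f_prime, eps)
)
print(f"without differential correlations: {info_without:.2f}")
print(f"with differential correlations:    {info_with:.2f} (upper bound 1/eps = {1 / eps:.2f})")
```

In this formulation the information with differential correlations equals I0 / (1 + ε I0), where I0 is the information without them, so it can never exceed 1/ε regardless of how many neurons are added.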

Neuronal population coding is not only shaped by pairwise correlations between neurons; multi-dimensional correlations might also be important for neuronal populations that represent a visual stimulus. However, it has been shown that multi-dimensional population codes are robust to the response variability (or noise) of single neurons (Montijn et al., 2016). Further research is needed to investigate whether multi-dimensional correlations in neuronal populations can also impede population coding.

References

Averbeck, B. B., Latham, P. E., & Pouget, A. (2006). Neural correlations, population coding and computation. Nature Reviews Neuroscience, 7(5), 358-366.

Chen, T. W., Wardill, T. J., Sun, Y., Pulver, S. R., Renninger, S. L., Baohan, A., ... & Looger, L. L. (2013). Ultrasensitive fluorescent proteins for imaging neuronal activity. Nature, 499(7458), 295-300.

Denman, D. J. & Contreras, D. (2013). The structure of pairwise correlation in mouse primary visual cortex reveals functional organization in the absence of an orientation map. Cerebral Cortex, 24(10), 2707-2720.

Franke, F., Fiscella, M., Sevelev, M., Roska, B., Hierlemann, A., & da Silveira, R. A. (2016). Structures of neural correlation and how they favor coding. Neuron, 89(2), 409-422.

Goltstein, P. M., Coffey, E. B. J., Roelfsema, P. R., & Pennartz, C. M. A. (2013). In vivo two-photon Ca2+ imaging reveals selective reward effects on stimulus specific assemblies in mouse visual cortex. The Journal of Neuroscience, 33(28), 11540-55.

Goltstein, P. M. (2015). Effects of associative learning and cortical state on early visual processing in the brain (Unpublished doctoral thesis). University of Amsterdam, Amsterdam, The Netherlands.

Harris, K. D. (2005). Neural signatures of cell assembly organization. Nature Reviews Neuroscience, 6(5), 399-407.

Hübener, M. (2003). Mouse visual cortex. Current Opinion in Neurobiology, 13(4), 413-420.

Jacobs, A. L., Fridman, G., Douglas, R. M., Alam, N. M., Latham, P. E., Prusky, G. T., & Nirenberg, S. (2009). Ruling out and ruling in neural codes. Proceedings of the National Academy of Sciences, 106(14), 5936-5941.

Kerr, J. N. D. & Denk, W. (2008). Imaging in vivo: watching the brain in action. Nature Reviews Neuroscience, 9(3), 195-205.

Kohn, A., Coen-Cagli, R., Kanitscheider, I. & Pouget, A. (2016). Correlations and neuronal population information. Annual Review of Neuroscience, 39, 237-256.

Montijn, J. S., Vinck, M., & Pennartz, C. M. A. (2014). Population coding in mouse visual cortex: response reliability and dissociability of stimulus tuning and noise correlation. Frontiers in Computational Neuroscience, 8(58).

Montijn, J. S., Meijer, G. T., Lansink, C. S., & Pennartz, C. M. A. (2016). Population-level neural codes are robust to single neuron variability from a multidimensional coding perspective. Cell Reports, 16(9), 2486-98.


Moreno-Bote, R., Beck, J., Kanitscheider, I., Pitkow, X., Latham, P., & Pouget, A. (2014). Information-limiting correlations. Nature Neuroscience, 17(10), 1410-1417.

Niell, C. M. & Stryker, M. P. (2008). Highly selective receptive fields in mouse visual cortex. The Journal of Neuroscience, 28(30), 7520-7536.

Ohki, K., Chung, S., Ch’ng, Y. H., Kara, P., & Reid, R. C. (2005). Functional imaging with cellular resolution reveals precise micro-architecture in visual cortex. Nature, 433(7026), 597-603.

Pitkow, X., & Angelaki, D. E. (2017). Inference in the Brain: Statistics Flowing in Redundant Population Codes. Neuron, 94(5), 943-953.

Piston, D., Fellers, T. J., & Davidson, M. W. (2016). Multiphoton microscopy – Fundamentals and applications in multiphoton excitation microscopy. Retrieved from https://www.microscopyu.com/techniques/multi-photon/multiphoton-microscopy.

Ripley, B. D. (1996). Pattern recognition via neural networks. a volume of Oxford Graduate Lectures on Neural Networks, title to be decided. Oxford University Press.[See http://www. stats. ox. ac. uk/ripley/papers. html.].