
The Effects and Values of Variables in Spiking Neuron Models for the Auditory Nerve

T.C.A.M. Zwaneveld

July 25, 2020

Master Thesis


Abstract

Neurons are the building blocks of the highly efficient information transmission in the body. In this thesis we will take a look at models of spiking neurons. As a neuron receives an input, its membrane potential will increase and may reach a threshold; when it does, the neuron emits a spike. Using these spikes, the next neuron then estimates the input signal of the former neuron. A number of parameters are used to create this estimation. We will optimize some of these variables both in a straightforward and in an evolutionary manner. We will then suggest values for the parameters in the model of the high, medium and low frequency auditory neurons in the cochlea of the inner ear. In order to do so we will use speech data from the TIMIT data set as well as a model of the cochlea created by Zilany et al. For one of the variables we find that the lower the characteristic frequency, the higher the value we find for it. For the membrane time constant and the resting threshold constant we find that they do not depend on the characteristic frequency.

Title: The Effects and Values of Variables in Spiking Neuron Models for the Auditory Nerve
Author: T.C.A.M. Zwaneveld, tcam.zwaneveld@gmail.com, 6114571
Daily supervisor: Prof. dr. S.M. Bohté
Examiner: dr. R. Quax
Second assessor: Prof. dr. S.M. Bohté
Technical expert: dr. G.J. Stephens
Deadline: July 25, 2020

Computational Science Lab
University of Amsterdam
Science Park 904, 1098 XH Amsterdam
https://uva.computationalscience.nl/

Centrum Wiskunde & Informatica
Science Park 123, 1098 XG Amsterdam
https://www.cwi.nl/


Contents

1 Introduction
   1.1 Neurons
2 Spiking neuron models
   2.1 Integrate-and-Fire Model
   2.2 Spike Response Model
3 Exploring the model
   3.1 Information rate
      3.1.1 Estimating the signal
      3.1.2 Computing the noise in the estimation
      3.1.3 Estimating the Signal to Noise Ratio
      3.1.4 Computing the information rate
   3.2 Coding efficiency
      3.2.1 Information capacity of the signal
      3.2.2 Information capacity of a spiking neuron model
   3.3 Some examples
      3.3.1 Example: White noise
      3.3.2 Example: Toy signal
4 A real signal
   4.1 The human ear
   4.2 Pre-processing
      4.2.1 Re-sampling
      4.2.2 Cochlea model settings
   4.3 Example
      4.3.1 High frequency neuron
      4.3.2 Medium frequency neuron
      4.3.3 Low frequency neuron
      4.3.4 Conclusions
5 Effects of variables
   5.1 The membrane time constant
      5.1.1 Sum of squared errors
      5.1.2 Information rate
      5.1.3 Information capacity and coding efficiency
   5.2 The smoothing constant
      5.2.1 Sum of squared errors
      5.2.2 Information rate
      5.2.3 Information capacity and coding efficiency
   5.3 The threshold constant
      5.3.1 Sum of squared errors
      5.3.2 Information rate
      5.3.3 Information capacity and coding efficiency
   5.4 The optimized combination
      5.4.1 Noisy signal coding efficiency
      5.4.2 Conclusions
   5.5 Values for the real signal
      5.5.1 High frequency neuron
      5.5.2 Medium frequency neuron
      5.5.3 Low frequency neuron
      5.5.4 Conclusions
6 Modifying the model
   6.1 Refractoriness
      6.1.1 Absolute refractory
      6.1.2 Relative refractory
      6.1.3 Dynamic threshold
   6.2 Adaptivity
      6.2.1 Dynamic threshold
      6.2.2 Additive threshold
      6.2.3 Multiplicative threshold
      6.2.4 Fast and efficient membrane
   6.3 Results
      6.3.1 Toy signal
      6.3.2 High frequency neuron
      6.3.3 Medium frequency neuron
      6.3.4 Low frequency neuron
      6.3.5 Conclusions
7 Optimization
   7.1 Evolutionary algorithm
   7.2 Results
      7.2.1 High frequency neuron
      7.2.2 Medium frequency neuron
      7.2.3 Low frequency neuron
      7.2.4 Conclusions


1 Introduction

Every organism, big and small, consists of parts, tiny parts, called cells. Every cell has its own purpose. Some cells are part of the skin, others are part of the muscles in your legs or part of your bones: every function of your body needs its own cells, and each of these types of cells is different. In this thesis we will be looking at the cells that are part of the nervous system: nerve cells or neurons, more particularly spiking neurons. In chapter 2 we will introduce two spiking neuron models and compare them. We will take one of these models and in chapter 3 define features of spiking neuron models. We walk through these features using a toy signal, which we shall also be using during the rest of the thesis. After this we will introduce a real signal in chapter 4 and compare the features of a high, a medium and a low frequency neuron to each other and to the toy signal. Then, in chapter 5, we take a closer look at some of the variables in the neuron model: we look at their individual effects and try to find an optimized combination. Chapter 6 introduces a number of changes to the model and ends in a new model, for which in chapter 7 we optimize some variables using an evolutionary algorithm for the three types of neurons. But first let us take a closer look at the nerve cells.

1.1 Neurons

Nerve cells are the building blocks of the nervous system. These neurons are found all over your body. There are different types of neurons: sensory neurons, motor neurons and inter-neurons. When you touch something, you feel that you touch it. This is because the sensory neurons ‘sense’ that you touch something and send a signal via inter-neurons to your brain, so that you know (or feel) that you just touched something. This all happens very fast, so the communication of the neurons must also be very fast. The same thing happens anywhere in your body: touching, feeling, seeing, hearing, every sense you have is a signal sent from a sensory neuron, through other neurons, to your brain where you register the sensing. But you don't only sense; sometimes you also have to react. Then the brain sends a signal through motor neurons to the muscles so that you can walk away, lift a cup, do something.

A signal enters a neuron through its dendrites. It is possible for a neuron to have multiple signals entering. The signal goes through the cell body (or soma) via the axon and synapses to the dendrites of a possible next neuron, and so on. The intermediate neurons are the inter-neurons; they relay signals in both directions. In this thesis we will be focusing on sensory neurons, in particular the auditory sensory neurons in the inner ear.


2 Spiking neuron models

Figure 2.1: Spiking behavior of the membrane potential of a neuron ([37]).

A neuron is like a black box: the neuron receives an input signal and sends an output. This makes it interesting for us to find out what happens in between, which is why we try to make neuron models. Fortunately a lot is already known about these models, so we don't have to invent everything ourselves. The models we are interested in are models of spiking neurons. Both motor and sensory neurons are examples of spiking neurons. When such a neuron receives an input, its membrane potential will increase and may reach a threshold. When it does, it emits a spike. This behavior of the neuron's membrane potential is shown in figure 2.1. The next (post-synaptic) neuron then uses these spikes to estimate the input signal of the former (pre-synaptic) one, but here we focus on the first neuron.

Formally, if we let θ(t) denote the threshold and u(t) the membrane potential of the neuron, then the neuron will fire a spike at time t^(f) if

$$u(t^{(f)}) \geq \theta(t^{(f)}) \quad \text{and} \quad \left.\frac{du(t)}{dt}\right|_{t=t^{(f)}} > 0.$$

The last expression means that the threshold must be crossed from below in order to fire a spike. In practice this is always the case because the membrane potential resets after crossing the threshold, so we could also omit the second expression and only take the first one as the definition of a spike. All spikes together form the output of the neuron, which we call a spike train. Because of the characteristic spiking behavior of the neuron, the shape of the potential surrounding a spike does not carry much extra information. What does carry the information are the moments at which the spikes occur: the spike train, which can be coded as a binary string.

Some models of spiking neurons are already known. We will discuss (and implement) two of the models described in [10]: the Integrate-and-Fire Model and the Spike Response Model. We will see that for some particular settings these two models are very similar.


2.1 Integrate-and-Fire Model

When looking up information about spiking neuron models, the first thing one encounters is probably the (leaky) Integrate-and-Fire Model. This model is based on a first order linear ordinary differential equation for the membrane potential of the neuron:

$$\frac{du}{dt} = \frac{-u(t) + R\,I(t)}{\tau_m}.$$

In this model, u is the membrane potential, R denotes the linear resistance, I corresponds to the input current and τm represents the membrane time constant of the neuron. The model is named after the fact that it is based on an integration of the input, and when the potential reaches a certain threshold the neuron fires a spike. The Integrate-and-Fire Model does not focus on capturing the actual shape of the membrane potential. It encapsulates the creation of spike trains, but since that is where the information is stored, a spiking neuron model does not really have to do much more than just that. So let us suppose that after firing a spike at time t̂ the membrane potential returns to its resting value ur; then for any continuous input current I(t) we can solve the differential equation and find a direct solution for the membrane potential (up until the next spike). Starting from the differential equation we put the equation into the right form:

$$\frac{du}{dt} = \frac{-u(t) + R\,I(t)}{\tau_m} \quad\Longleftrightarrow\quad \frac{du}{dt} + \frac{u(t)}{\tau_m} = \frac{R\,I(t)}{\tau_m}.$$

Now let us first solve the homogeneous part of the equation,

$$\frac{du}{dt} + \frac{u(t)}{\tau_m} = 0.$$

Dividing by u gives

$$\frac{1}{u(t)}\frac{du}{dt} + \frac{1}{\tau_m} = 0,$$

and integrating both sides from t̂ to t, with Č some constant of integration,

$$\check{C} = \int_{\hat{t}}^{t} \left(\frac{1}{u(t)}\frac{du}{dt} + \frac{1}{\tau_m}\right) dt = \int \frac{1}{u(t)}\,du + \frac{t - \hat{t}}{\tau_m},$$

so that

$$\ln|u(t)| = \check{C} - \frac{t - \hat{t}}{\tau_m} \quad\Longrightarrow\quad u(t) = \pm e^{\check{C}}\, e^{-\frac{t - \hat{t}}{\tau_m}} = C\, e^{-\frac{t - \hat{t}}{\tau_m}},$$

where C is a different constant.

If the right-hand side of the differential equation were equal to 0, we would be done at this point. But this is not the case, so let us complete the search for the direct formula of the membrane potential. In order to do this we claim that the constant C is not a constant, but instead a function C(t). Filling this in in the first equation gives

$$\begin{aligned}
\frac{1}{\tau_m} R\,I(t) &= \frac{du}{dt} + \frac{u(t)}{\tau_m}
= \frac{d}{dt}\!\left(C(t)\, e^{-\frac{t-\hat{t}}{\tau_m}}\right) + \frac{1}{\tau_m} C(t)\, e^{-\frac{t-\hat{t}}{\tau_m}} \\
&= \frac{dC(t)}{dt}\, e^{-\frac{t-\hat{t}}{\tau_m}} - \frac{C(t)}{\tau_m}\, e^{-\frac{t-\hat{t}}{\tau_m}} + \frac{1}{\tau_m} C(t)\, e^{-\frac{t-\hat{t}}{\tau_m}}
= \frac{dC(t)}{dt}\, e^{-\frac{t-\hat{t}}{\tau_m}},
\end{aligned}$$

so that

$$\frac{dC(t)}{dt} = \frac{R}{\tau_m}\, I(t)\, e^{\frac{t-\hat{t}}{\tau_m}}.$$

Now let us integrate to find C(t) and then put it into the formula for u(t). With K some constant and substituting s = t − x,

$$C(t) = \int_{\hat{t}}^{t} \frac{R}{\tau_m}\, I(x)\, e^{\frac{x-\hat{t}}{\tau_m}}\, dx + K
= \frac{R}{\tau_m} \int_{t-\hat{t}}^{0} -I(t-s)\, e^{\frac{t-s-\hat{t}}{\tau_m}}\, ds + K
= \frac{R}{\tau_m} \int_{0}^{t-\hat{t}} I(t-s)\, e^{\frac{t-s-\hat{t}}{\tau_m}}\, ds + K,$$

and therefore

$$u(t) = C(t)\, e^{-\frac{t-\hat{t}}{\tau_m}}
= \frac{R}{\tau_m} \int_{0}^{t-\hat{t}} I(t-s)\, e^{\frac{t-s-\hat{t}}{\tau_m}}\, e^{-\frac{t-\hat{t}}{\tau_m}}\, ds + K\, e^{-\frac{t-\hat{t}}{\tau_m}}
= \frac{R}{\tau_m} \int_{0}^{t-\hat{t}} I(t-s)\, e^{-\frac{s}{\tau_m}}\, ds + K\, e^{-\frac{t-\hat{t}}{\tau_m}}.$$

All that is left is filling in the initial value condition u(t̂) = ur to find the value of K:

$$u_r = u(\hat{t}) = \frac{R}{\tau_m} \int_{0}^{\hat{t}-\hat{t}} I(\hat{t}-s)\, e^{-\frac{s}{\tau_m}}\, ds + K\, e^{-\frac{\hat{t}-\hat{t}}{\tau_m}} = \frac{R}{\tau_m} \cdot 0 + K \cdot 1 = K.$$

Thus the direct formula for the membrane potential in the Integrate-and-Fire Model is

$$u(t) = u_r\, e^{-\frac{t-\hat{t}}{\tau_m}} + \frac{R}{\tau_m} \int_{0}^{t-\hat{t}} I(t-s)\, e^{-\frac{s}{\tau_m}}\, ds.$$


In the implementation of this model we use the differential version of this formula together with the Euler Forward method:

$$u(t+1) = u(t) + \frac{du}{dt} = u(t) + \frac{-u(t) + R\,I(t)}{\tau_m}.$$

This behavior of the membrane potential is shown in figure 2.2 on the left. In figure 2.2a the signal is not strong enough to create a spike and the membrane potential gradually returns to its resting value. Figure 2.2c shows a signal that does lead to spikes. The sudden return of the membrane potential to the resting value after each spike is shown here as well.
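To make the update rule concrete, the following is a minimal sketch (not the thesis code) of an Integrate-and-Fire simulation using the Euler Forward step above, with a hard reset to the resting value after each spike. The constants mirror the toy-signal settings of table 3.1; the constant input current is an arbitrary illustration.

```python
import numpy as np

def simulate_lif(I, tau_m=40.0, R=1.0, theta=0.2, u_rest=0.0001):
    """Leaky Integrate-and-Fire with Euler Forward steps of one sample (1 ms).

    I : input current, one value per time step.
    Returns the membrane potential trace and the spike times (in steps).
    """
    u = np.full(len(I), u_rest)
    spikes = []
    for t in range(len(I) - 1):
        # Euler Forward: u(t+1) = u(t) + (-u(t) + R*I(t)) / tau_m
        u[t + 1] = u[t] + (-u[t] + R * I[t]) / tau_m
        if u[t + 1] >= theta:        # threshold reached from below
            spikes.append(t + 1)
            u[t + 1] = u_rest        # hard reset to the resting value
    return u, spikes

# Arbitrary constant input, strong enough to produce spikes
u, spikes = simulate_lif(np.full(1000, 10.0))
print(len(spikes), "spikes")
```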

Now let us take a look at another well-known spiking neuron model called the Spike Response Model.

2.2 Spike Response Model

For the information a neuron carries, it does not really matter what exactly happens to the membrane potential after a spike; however, taking this part into account makes for a more interesting and accurate model. One possible way of doing so is the Spike Response Model. This model is made up of kernels that describe the response of the membrane potential to the spike, which is where the name Spike Response Model comes from. Another word for kernel is filter, or function; we will use these terms interchangeably. In the version of the Spike Response Model we discuss here, the influence of input arriving at a (postsynaptic) neuron as output from other (presynaptic) neurons is not taken into account, which makes for a simpler model than the one discussed in [10]:

$$u(t) = \eta(t - \hat{t}) + \int_{0}^{\infty} \kappa(t - \hat{t}, s)\, I(t - s)\, ds.$$

Similar to the Integrate-and-Fire Model, u represents the membrane potential of the neuron, t̂ the last firing time and I the input current. The other symbols, η and κ, are both kernels. Just like in the Integrate-and-Fire Model, everything depends on the time passed since the last spike was fired. A major difference is that the Spike Response Model takes into account what happens to the membrane potential after firing the spike. Later we will see that the Spike Response Model is in fact a generalization of the Integrate-and-Fire Model. But first let us discuss the kernels in the Spike Response Model.

After firing a spike the membrane potential goes through a trajectory which, if there is no input, is similar every time. The Spike Response Model captures this trajectory in the kernel η, which therefore represents a standard form of the membrane response. It is likely that an input would trigger a reaction of the membrane potential.


Figure 2.2: The reaction of the membrane potential to a spiking input current. (a) Integrate-and-Fire Model without spikes. (b) Spike Response Model without spikes. (c) Integrate-and-Fire Model with spikes. (d) Spike Response Model with spikes.

This reaction derails the membrane potential from its usual trajectory η. The influence of the input on the membrane potential is captured by the κ kernel. Thus the κ kernel describes a linear response to the input current.

When we choose κ and η in a particular way, we get the solution of the ordinary differ-ential equation that defines the Integrate-and-Fire Model:

$$\kappa(t - \hat{t}, s) = \frac{R}{\tau_m}\, e^{-\frac{s}{\tau_m}}\, \mathbb{1}_{[0,\, t-\hat{t}]}(s), \qquad \eta(t - \hat{t}) = u_r\, e^{-\frac{t-\hat{t}}{\tau_m}}.$$

In practice we let R = 1. In the formula for κ, the indicator function 𝟙 makes sure that s does not exceed the time passed since the last spike, and therefore creates the correct boundaries of the integral. It is defined as

$$\mathbb{1}_{[a,b]}(x) = \begin{cases} 1 & x \in [a, b] \\ 0 & \text{else.} \end{cases}$$

In figure 2.2 both the Integrate-and-Fire Model and the Spike Response Model are shown. In the top figures (2.2a and 2.2b) the signal is not strong enough to create a spike and in both models the membrane potential responds in the same way. In the bottom figures (2.2c and 2.2d) spiking occurs, and this is where the difference between the two models shows.


Where the Spike Response Model lets the membrane potential relax back to its resting value in the same way as in the top part, the Integrate-and-Fire Model just resets directly after emitting a spike. Another difference between the two models is the timing of the spikes: the Integrate-and-Fire Model appears to have a small delay and fires a little later than the corresponding Spike Response Model. For the purpose of this thesis this is not a problem. A neuron encodes a signal into spikes and sends these to the next neuron, which decodes the spikes to estimate the corresponding signal. This next neuron does not know when the signal started, and therefore the estimation is stationary. The precise timing of the spikes does not matter; what is important are the inter-spike intervals: the times between two consecutive spikes.


3 Exploring the model

The model we have built so far is fairly simple. Of course there are known improvements ([39], [11]), some of which we will discuss in chapter 6. But in order to do that, we state in chapter 5 what it means for one model to be better than another and use this to tune some variables of the current model with some examples. In this chapter we look into the methods needed to create such a distinction.

In real life a neuron is optimized through evolution with respect to a number of things. Every spike costs energy but holds bits of information. The cost per bit is different in different situations and cells ([32], [18]). But since the theory of metabolic cost is beyond the scope of this thesis, we will focus on coding efficiency instead. In order to calculate this we will need the information rate: the number of bits per spike or per unit time. Before we get into this, let us introduce a bit of information theory. When talking about bits and information in a theoretical setting it is inevitable to also talk about entropy. According to [16], ‘entropy is a measure of uncertainty associated with a random variable. It measures the average information content one is missing when one does not know the value of the random variable.’ It is the quantification of surprise. To get a feeling for entropy, let us use the example of a coin. When flipping a fair coin the probability of getting heads is 50%: you are not surprised to get heads, but would also not be surprised if the coin turned up tails. The entropy of this coin is equal to one bit. Now if the coin were not fair, say 80% heads and 20% tails, you would not at all be surprised by heads, and very surprised if it turned up tails. The entropy of this coin is approximately 0.72 bits. This unfair coin contains less information about the unknown value (heads or tails) than the fair one. Therefore the entropy can also be described as the amount of information one gains when learning the value of a random variable. The term entropy also has a mathematical definition:

For any probability distribution X with values a in some set A, let p_a be the probability of X = a. Then the (Shannon) entropy is defined as

$$H(X) = \sum_{a \in A} p_a \log_2\!\left(\frac{1}{p_a}\right) = -\sum_{a \in A} p_a \log_2 p_a.$$
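As a quick sanity check of the coin example, a few lines of Python reproduce the quoted values (1 bit for the fair coin, roughly 0.72 bits for the 80/20 coin):

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a discrete probability distribution."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                     # by convention 0 * log2(0) = 0
    return float(-np.sum(p * np.log2(p)))

print(entropy([0.5, 0.5]))   # fair coin   -> 1.0 bit
print(entropy([0.8, 0.2]))   # unfair coin -> ~0.722 bits
```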

In a theoretical setting this definition can be used to prove all kinds of theorems about information in general as well as in spiking neurons. Since this is not the focus of this thesis, we will not be discussing those. But we will be using a little bit of information theory in the quantification of our model. Let us start by looking at the speed of the information: the information rate.


3.1 Information rate

The information rate in a model can be measured in two ways: bits per (milli)second and bits per spike. It is easy to switch from one to the other when we know the spike rate: the number of spikes per (milli)second. What is most important is that the information rate is the amount of information per unit time; whether the unit is seconds or spikes, and the rate therefore time-based or event-based, does not really matter. In [28] the authors describe a way to estimate the information rate starting from the input and output of the spiking neuron model: the signal and the spike train. In order to get there, we need to perform a couple of steps. First we have to estimate the signal from the spike train. Then we need to compute the (effective) noise in order to be able to estimate the signal to noise ratio. Once we have that, we can compute the information rate of the estimated signal and thus the information rate in our model. These steps are explained in the rest of this section.

3.1.1 Estimating the signal

In their paper ([39]) Zambrano & Bohté describe how to construct an estimation of the input signal from the spike train and the threshold values at the times of spiking by using height kernels. The formula they use for this approximation is

ˆ S(t) =X ti<t ζ(t − ti) with ζ(t − ti) = ν(ISI) · θ(ti) · η(t − ti).

In this estimation η(t − ti) is computed as before: an exponential with time constant τm.

Note that the time constant used here is the membrane time constant of the neuron, and that the after-spike trajectory function from before is used to build the estimation. The correction factor ν(ISI) is approximated as a simple linear function. However, we found that the correction factor does not need to be linear: the contribution of the inter-spike interval (ISI) turns out to be extremely small, so we approximate it by a constant. The estimation is therefore as follows:

ˆ S(t) =X ti<t ζ(t − ti) =X ti<t ν(ISI) · θ(ti) · η(t − ti) =X ti<t ν · θ(ti) · ure(− t−ti τm ).

For now let us postpone the scaling and put ν equal to 1. After creating this estimation, it is smoothed. We implement this as a convolution with a smoothing filter φ, which is an exponential function with amplitude 1/τsmooth and time constant τsmooth:

$$\bar{S}(t) = (\phi \star \hat{S})(t) = \int_{-\infty}^{\infty} \frac{1}{\tau_{\mathrm{smooth}}}\, e^{-\frac{s}{\tau_{\mathrm{smooth}}}}\, \hat{S}(t - s)\, ds.$$

In the code we use the numpy function convolve for this purpose. After smoothing we find that the estimation is longer than necessary, that it is shifted, and that it is not of the right height. Biologically speaking the shift could mean that the neuron needs a little time to create the estimation. But when comparing the estimation to the signal we want to dismiss this delay, because it does not really matter when the signal started. This is called stationarity. We find the right amount of delay by doing a simple line search over the first part of the convoluted estimation, up until the first spike. We then cut off the estimation up to this point, and from this point plus the length of the signal, so that the signal and the estimation are again of the same length. At this point we only have to re-scale the estimation, which is also done quite easily. Now that we have found the estimation, we want to know how much noise it contains.
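A minimal sketch of this decoding step, under the assumption ν = 1 and a constant threshold: the raw estimate is a sum of decaying exponentials placed at the spike times, which is then smoothed with an exponential filter via numpy's convolve. The spike times below are just the first few toy-signal spikes; the shift and re-scaling steps are left out.

```python
import numpy as np

def estimate_from_spikes(spike_times, thetas, T, tau_m=40.0, u_r=0.0001, nu=1.0):
    """Raw estimate: S_hat(t) = sum over t_i < t of nu * theta(t_i) * u_r * exp(-(t - t_i)/tau_m)."""
    t = np.arange(T, dtype=float)
    s_hat = np.zeros(T)
    for t_i, th in zip(spike_times, thetas):
        decay = t > t_i
        s_hat[decay] += nu * th * u_r * np.exp(-(t[decay] - t_i) / tau_m)
    return s_hat

def smooth(s_hat, tau_smooth=100.0):
    """Convolve with an exponential filter of amplitude 1/tau_smooth."""
    s = np.arange(5 * int(tau_smooth), dtype=float)
    phi = np.exp(-s / tau_smooth) / tau_smooth
    return np.convolve(s_hat, phi)       # longer than s_hat: shift, cut and re-scale afterwards

spike_times = [322, 368, 403, 433, 460]  # first few toy-signal spike times
s_hat = estimate_from_spikes(spike_times, [0.2] * len(spike_times), T=1000)
s_bar = smooth(s_hat)
print(len(s_hat), len(s_bar))            # the convolved version is longer than the signal
```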

3.1.2 Computing the noise in the estimation

In order to compute the noise, we first have to transform the signal and its estimation into a more usable form. For this we use Fourier analysis. The part we use is the Fourier transform, which for any continuous function f (of time) is defined as

$$f_n = \tilde{f}(\omega_n) = \frac{1}{T} \int_{0}^{T} f(t)\, e^{i \omega_n t}\, dt.$$

Here, [0, T] is the domain of the function and ωn denotes the n-th unit of frequency, such that ωnT = 2πn holds. We call the complex number f_n the n-th Fourier coefficient of f.

If we wanted to transform the signal back, we would need the inverse of the Fourier transform, which is defined as

$$f(t) = \sum_{n=-\infty}^{\infty} f_n \cdot e^{-i \omega_n t}.$$

Note that we can view the Fourier transform as a different representation of the function. A function and its Fourier transform are called a Fourier pair.

Now, let us put the Fourier analysis to use. Our goal is to determine the effective noise in an estimate of the signal. By definition the estimation is equal to the signal plus noise; stated like this, there is no distinction between systematic and effective noise. To get this distinction we need a different approach: we refer the noise to the input, i.e. we assume that the conversion from the spikes to the estimation is noiseless ([14]). After applying the Fourier transformation to the signal and the estimation, this looks like

$$\tilde{\hat{s}}(\omega) = g(\omega)\, \tilde{s}(\omega) + \tilde{n}_{\mathrm{eff}}(\omega).$$


Here g(ω) is the gain. To determine the gain and the effective noise we cut the signal and its estimation into segments of length τ0. Then we take the Fourier transform of each segment for every frequency ω, creating a family of sets of Fourier coefficients:

$$\left\{ \tilde{s}(\omega_n),\, \tilde{\hat{s}}(\omega_n) \right\}_m, \qquad m, n \in \mathbb{N},\ m \leq M, \quad \text{where } M = \frac{T_0}{\tau_0}.$$

In order to make sure that M is a natural number, we pad the signal, which has length T, with zeros such that the new length of the signal, T0, and the length of the segments, τ0, are both powers of 2. For every ω we can now create a scatter plot of the real parts of s̃(ω) versus the real parts of the estimation's coefficients and find the best linear fit through the dots. The slope of this line is g(ω) and is called the gain. Finally we can determine the effective noise per segment:

$$\tilde{n}_{\mathrm{eff}}(\omega)_m = \frac{\tilde{\hat{s}}(\omega)_m}{g(\omega)} - \tilde{s}(\omega)_m.$$

Note that ñ_eff(ω)_m is again a complex value, just like s̃(ω)_m and the estimation's Fourier coefficient.
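The segmentation, the per-frequency linear fit that yields the gain g(ω), and the effective-noise coefficients can be sketched as follows. This is one possible reading of the procedure, using np.polyfit on the real parts for the linear fit; frequencies where the fitted gain is close to zero would need extra care, and the random arrays at the bottom only stand in for the zero-padded signal and estimation.

```python
import numpy as np

def effective_noise(signal, estimate, seg_len=128):
    """Gain g(w) and effective-noise coefficients n_eff(w)_m from M segments of length seg_len."""
    M = len(signal) // seg_len
    s  = np.fft.fft(signal[:M * seg_len].reshape(M, seg_len), axis=1)     # signal coefficients per segment
    sh = np.fft.fft(estimate[:M * seg_len].reshape(M, seg_len), axis=1)   # estimate coefficients per segment
    gain = np.empty(seg_len)
    for k in range(seg_len):
        # slope of the best linear fit through the scatter of real parts
        gain[k] = np.polyfit(s[:, k].real, sh[:, k].real, 1)[0]
    n_eff = sh / gain - s               # n_eff(w)_m = estimate(w)_m / g(w) - signal(w)_m
    return gain, n_eff

# Toy usage with random placeholder data
signal = np.random.randn(4096)
estimate = 0.8 * signal + 0.1 * np.random.randn(4096)
gain, n_eff = effective_noise(signal, estimate)
```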

3.1.3 Estimating the Signal to Noise Ratio

The signal to noise ratio is in fact the ratio between the signal power spectrum and the power spectrum of the effective noise:

$$\mathrm{SNR}(\omega) = \frac{S(\omega)}{N_{\mathrm{eff}}(\omega)}.$$

So what we need to do, is determine the power spectrum of the signal and the power spectrum of the effective noise. Then we can calculate the signal to noise ratio.

The power spectrum is a way of stating the amount of power in the signal for every frequency. So to find a power spectrum it makes sense to use a representation of the signal in terms of frequencies: this is why we use the Fourier transform. We can then take the variance of the component with frequency ω to find the power in this frequency. Another approach is by first determining the auto-correlation matrix of the signal and applying the Wiener-Khinchin Theorem, which states that the auto-correlation function and the power spectrum form a Fourier pair. In this case the auto-correlation function is the Fourier Transform of the power spectrum. So by using the inverse of the Fourier transform on the auto-correlation we find the power spectrum. We will start by fully explaining these methods.

Starting from the auto-correlation matrix

Before we use auto-correlation, let us first define it. The auto-correlation of a signal is the correlation between the signal and itself, where the correlation of two signals s1 and s2 is defined as

$$C(s_1, s_2, t_1, t_2) = \int_{-\infty}^{\infty} s_1(t_1)\, s_2(t_2)\, dt_2.$$


Hence, the auto-correlation of a signal s(t) is the average of the product of the shifted signal with the same non-shifted signal:

$$AC(s, t_1, t_2) = C(s, s, t_1, t_2) = \int_{-\infty}^{\infty} s(t_1)\, s(t_2)\, dt_2 = \int_{-\infty}^{\infty} s(t_2 - (t_2 - t_1))\, s(t_2)\, dt_2 = AC(s, t_2 - t_1) = \int_{-\infty}^{\infty} s(t - t')\, s(t)\, dt = AC(s, t').$$

It is easy to see that the auto-correlation of the signal only depends on the difference in time. Also, the signal itself is clear from the context, so we can omit it from our function. Therefore we can write AC(s, t') = AC(t'), where t' equals the difference in time. Now that we have shown that the signal is stationary, we are allowed to use the Wiener-Khinchin theorem, which states

$$AC(t') = \int_{-\infty}^{\infty} \frac{1}{2\pi}\, S(\omega)\, e^{-i\omega t'}\, d\omega.$$

Thus the auto-correlation of a signal and its power spectrum form a Fourier pair. Using this we can now calculate the power spectrum by taking the inverse of the Fourier transform:

$$S(\omega) = \int_{-\infty}^{\infty} AC(t')\, e^{i\omega t'}\, dt'.$$

In practice, however, we find that this method does not work. The time interval of real data is not of infinite length, which means that the integration is done over a finite interval. Since the proof of the Wiener-Khinchin theorem uses the infiniteness of the time interval, we cannot use this theorem when approximating the power spectrum of a signal.

Fortunately, there is another way to approximate the signal power spectrum, called a periodogram [1]; in the limit T → ∞ the periodogram is equal to the power spectrum. This method looks a lot like the one we use to find the power spectrum of the effective noise. We can use the scipy.signal function periodogram to find the results. In order to use the same frequencies as we used with the effective noise, we need to set the option nfft to the segment length τ0 and set return_onesided to False. This way the sum of the power over all frequencies equals 1, creating a power spectral density. But since this is symmetrical, it suffices to only use the non-negative part.

Using the Fourier transform


The power spectrum of the effective noise is found by taking, for every frequency, the variance (which equals the mean of the squared absolute values, since the mean is zero) of the Fourier components and normalizing (dividing) this by the window length τ0:

$$N_{\mathrm{eff}}(\omega) = \frac{1}{\tau_0} \left\langle |\tilde{n}_{\mathrm{eff}}(\omega)|^2 \right\rangle = \frac{1}{\tau_0}\, \frac{1}{M} \sum_{m=1}^{M} |\tilde{n}_{\mathrm{eff}}(\omega)_m|^2 = \frac{1}{T_0} \sum_{m=1}^{M} |\tilde{n}_{\mathrm{eff}}(\omega)_m|^2.$$

The whole procedure of cutting the noise into fragments, calculating its Fourier transform and its spectrum, and finally averaging over the different segments is known as Bartlett's method, or the method of averaged periodograms ([6]).

In practice it does not really matter which method one uses (either the periodogram or the fast Fourier transform from numpy); we can turn one into the other using a couple of easy steps. Since it turns out that using fft from numpy.fft is faster, we will be using that in the code for both the effective noise spectrum and the signal spectrum. Although the Fourier transform is symmetrical because the signal is real-valued, we will be using the full version and not only the non-negative part. If we were interested in taking only the non-negative part, rfft from the same package would do the trick. Once the Fourier transform of the signal is computed, we need to follow almost the same steps as we did for the effective noise: take the square of each absolute value and, since we did not create different windows here, divide by the length of the zero-padded signal. Now that we have calculated both power spectra, we can determine the signal to noise ratio at any frequency by dividing the signal power at this frequency by its effective noise power.
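A sketch of this fft-based route: one FFT of the zero-padded signal for the signal spectrum, Bartlett averaging of the segment spectra for the noise, and their ratio as the SNR. The n_eff array is assumed to come from the segmentation step of section 3.1.2, and matching the two normalization conventions exactly (1/T0 versus 1/τ0, and numpy's FFT scaling) is left to the reader.

```python
import numpy as np

def power_spectra_and_snr(signal_padded, n_eff, seg_len=128):
    """Signal power spectrum, effective-noise power spectrum and their ratio per frequency."""
    T0 = len(signal_padded)
    # Signal spectrum: one FFT of the whole zero-padded signal, squared and divided by its length
    S = np.abs(np.fft.fft(signal_padded)) ** 2 / T0
    # Noise spectrum: average of |n_eff|^2 over the M segments, divided by the window length
    N_eff = np.mean(np.abs(n_eff) ** 2, axis=0) / seg_len
    # The signal spectrum has T0 bins and the noise spectrum seg_len bins;
    # keeping every (T0 // seg_len)-th signal bin lines the frequencies up
    S = S[:: T0 // seg_len]
    return S, N_eff, S / N_eff
```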

3.1.4 Computing the information rate

In this last step we compute the information rate in bits per millisecond by putting the signal to noise ratio into the formula

$$R_{\mathrm{info}} = \frac{1}{2} \int_{-\infty}^{\infty} \frac{1}{2\pi} \log_2\!\big(1 + \mathrm{SNR}(\omega)\big)\, d\omega.$$

Now multiply by 1000 to get the information rate in bits per second. Note that this version of the information rate only works for infinitely long signals, which we do not have. In this thesis we are working with finite, discretized signals with a sampling frequency of 1000 samples per second; therefore the frequencies that can be described are those up to 500 Hz. Because of the finiteness and discreteness of our signals, the information rate formula becomes

$$R_{\mathrm{info}} = \frac{1}{T} \sum_{\omega_n} \frac{1}{2} \log_2\!\big(1 + \mathrm{SNR}(\omega_n)\big).$$


This is the sum of the information in each frequency (the total information in the signal) divided by the length of the signal. We are allowed to simply take the sum because we are working with Fourier transforms, and therefore the frequencies, and the information at them, are independent of each other. Because we have also created a spike train, and with that a spike rate, we can use this to compute an information rate in bits per spike.
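The discrete formula translates directly into a couple of lines; here T is the signal length in milliseconds, the SNR array holds one value per retained frequency, and the spike count converts the result into bits per spike. The numbers in the demo call are placeholders.

```python
import numpy as np

def information_rate(snr, T_ms, n_spikes):
    """Lower bound on the information rate from the per-frequency SNR values."""
    bits_per_ms = np.sum(0.5 * np.log2(1.0 + snr)) / T_ms
    bits_per_sec = 1000.0 * bits_per_ms
    bits_per_spike = bits_per_ms * T_ms / n_spikes   # total bits divided by the number of spikes
    return bits_per_sec, bits_per_spike

# Placeholder numbers, just to show the call
print(information_rate(np.array([12.0, 8.0, 3.0, 0.5]), T_ms=3500.0, n_spikes=143))
```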

3.2 Coding efficiency

Now that we have computed the information rate, we would like to know what this number means, relatively speaking. So we would like to be able to compare it with some kind of maximum. One thing to compare it with is the total number of bits divided by the total length of the signal: the information capacity. Since we are using a specific kind of model, spiking neurons, we would also like to be able to compare the information rate with the capacity of this kind of model; for simple models we will discuss how to do this in section 3.2.2. The ratio between the information rate and the information capacity is the coding efficiency. In information theoretical terms, the coding efficiency is the proportion of a neuron's output entropy that provides information about the neuron's input ([32]). An optimal coding efficiency would have a value of 1, which means that every variation in the spike train corresponds to a unique change in the input signal ([28]).

3.2.1 Information capacity of the signal

One way of determining the information capacity is by calculating the information rate of the signal with almost no noise. It is not possible to do this without any noise, because then the signal to noise ratio is infinitely large and therefore the information rate becomes infinitely large and we would still not have a clue of what is a reasonable value for the information rate.

3.2.2 Information capacity of a spiking neuron model

We could also compare the calculated information rate with the highest information rate possible in a spiking model. How to do this is explained by MacKay and McCulloch in [22] and we will briefly discuss it here.

Assuming that every spike is of the same height, MacKay and McCulloch state that the information capacity boils down to the product of the mean signal frequency and the average information content per signal, which formally looks like

$$C_{\mathrm{info}} = \frac{2}{T_m + T_r} \log_2\!\frac{T_m - T_r}{\Delta T}.$$

Here Tm is the maximum possible amount of time between two spikes, Tr is the minimum possible time between two spikes (the refractory period), and ∆T is the precision with which spike times are measured, in our case one millisecond. It is clear that the information capacity is determined by the values of the time constants of the model, so optimizing the information capacity is the same as optimizing these constants. In order to do this, let us choose ∆T and define

$$T_r = r\,\Delta T, \qquad T_m = m\,\Delta T.$$

Given the refractory period time constant, we can optimize the information capacity by solving

$$\frac{m + r}{m - r} = \ln(m - r).$$

In practice, solving this equation is more difficult than it looks. There is no direct way to solve the equation and it is not an equation easily solvable by hand. We could choose to let a program like Wolfram Alpha ([38]) do the math, but this would require us to stop the program, delaying the entire process. We suggest a different approach:

Using the minimum and maximum inter-spike intervals

Once the spikes have been produced by the model, one can define Tr to be the smallest inter-spike interval and Tm to be the largest one. This value of Tm might not be optimal, but it is the maximum observed time between two spikes. All that is left to do is fill in the values and calculate the information capacity Cinfo.

Using the spike rate

In their paper, MacKay and McCulloch used basic information theory as a starting point. When calculating the information capacity we can do the same thing, but instead of finding optimized values with respect to the minimal spacing between two spikes, we can use the number of spikes per millisecond, the spike rate, as our basis. On average the spike rate, Rspike, is the probability of there being a spike at any time. Since the information capacity is approximated by the entropy per unit time ([28], [33]), and assuming that all spikes are of equal height, we find

$$C_{\mathrm{info}} = H/T = -R_{\mathrm{spike}} \log_2 R_{\mathrm{spike}} - (1 - R_{\mathrm{spike}}) \log_2(1 - R_{\mathrm{spike}}).$$

This is the entropy per millisecond and yet another way of estimating the information capacity of a spiking model.

For each of these methods we find the corresponding coding efficiency through

$$\epsilon = \frac{R_{\mathrm{info}}}{C_{\mathrm{info}}}.$$


3.3 Some examples

Now that we know how to determine the information rate and coding efficiency, let us put the words into action and actually show some examples in order to really get what is going on. We will start with an ’easy’ one determined by a Gaussian distribution. We will discuss this in a theoretical way. Then we will move on to a more realistic signal, but still a toy signal, which we shall put through the Spike Response Model to determine the spike train. Then we will go through all above mentioned steps in order to find the information rate and eventually the coding efficiencies.

3.3.1 Example: White noise

Let us suppose we have a signal only consisting of Gaussian white noise. This means that every sample is drawn from a Gaussian distribution with mean 0 and variance σ². Let us call this signal s. Because we know the distribution of the random signal, we can approximate it by the mean, since this is the expectation and also the best approximation of a random signal with respect to the sum of squared errors ([28]). Therefore let us estimate the signal by the zero-signal, and call this ŝ. Now the estimation error is equal to the signal itself, which is not the same as the effective noise. The noise in the estimation is the complete error: the systematic and the effective noise combined. The effective noise measures the scatter of the Fourier components, which is how the systematic errors are removed. Since the power spectrum of the estimation error is equal to the power spectrum of the signal, we do not know anything about the information rate, just that it is non-negative, which is obvious:

$$R_{\mathrm{info}} \geq \frac{1}{2} \int_{-\infty}^{\infty} \frac{1}{2\pi} \log_2\!\left(\frac{S(\omega)}{N_{\mathrm{est}}(\omega)}\right) d\omega = \frac{1}{4\pi} \int_{-\infty}^{\infty} \log_2\!\left(\frac{S(\omega)}{S(\omega)}\right) d\omega = \frac{1}{4\pi} \int_{-\infty}^{\infty} \log_2 1\, d\omega = 0.$$

3.3.2 Example: Toy signal

In this section we will calculate the coding efficiency of a toy signal. We will be using this signal throughout the coming chapters. Since we don’t know anything about a distribution for the signal, we need to put it through the basic Spike Response Model in order to create a spike train: {ti} = {322, 368, 403, 433, 460, . . . }. The constant values

we have used to create this spike train can be found in table 3.1. The threshold value is at this point always the same, so at every spike θ(ti) = 0.2. Putting the formula for the smoothed estimation to use, we find

$$\bar{S}(t) = \nu \int \frac{1}{\tau_{\mathrm{smooth}}}\, e^{-\frac{s}{\tau_{\mathrm{smooth}}}} \sum_{t_i < t-s} \theta(t_i) \cdot u_r\, e^{-\frac{t-s-t_i}{\tau_m}}\, ds = 1.1325 \int \frac{1}{100}\, e^{-\frac{s}{100}} \sum_{t_i < t-s} 0.2 \cdot 0.0001\, e^{-\frac{t-s-t_i}{40}}\, ds.$$

Figure 3.5: Scatter plots of the estimation errors (panels a, b and c).

In figures 3.1, 3.2, 3.3 and 3.4 the steps leading to the final estimation are shown. Here the sum of squared errors is 9.92.

Now that we have the estimation, let us lengthen the domain of the signal from T to T0 by zero-padding the signal until it has a length that is a power of 2, in this case T0 = 4096 = 2^12. Now we split this new domain of the zero-padded signal into M segments. If we take τ0 to be a power of 2 as well, 2^7 = 128, then M becomes a natural number:

$$M = \frac{T_0}{\tau_0} = \frac{4096}{128} = 32.$$

By plotting the Fourier transform of the signal against the Fourier transform of the estimation in a scatter plot, we can find the gain at every frequency. Some of these plots are shown in figure 3.5.

Our next step in the process of determining the information rate is to estimate the signal to noise ratio. For this we need to find the power spectra of both the signal and the effective noise. This is done in the way described in section 3.1.3: For the signal we use the numpy function fft, square each value and divide by the signal length to find the power spectrum, for the noise we take the effective noise, square that and calculate the average over the segments. In figure 3.6 these spectra show that for higher frequencies the noise has significantly more power than the signal. We only take the frequencies into account where this is not the case. Namely, the frequencies where the signal power is at least one-tenth of the noise power, i.e. where the signal to noise ratio is at least 0.1. Therefore we cut everything above the frequency of 5.13 Hz.

Note that the signal to noise ratio still depends on the frequency. In order to remove this dependency we add 1, take the binary logarithm and integrate (sum) over all frequencies. Now we divide by the length of the signal, and we have found a lower bound of the information rate.


Figure 3.1: The estimation of the toy signal created using 143 spikes.

Figure 3.2: Convoluted estimation.

Figure 3.3: Convoluted estimation, shifted and cut.

Table 3.1: Variable settings in the model.

    variable    value
    τm          40
    θ0          0.2
    ur          0.0001
    τsmooth     100
    delay       127
    ν           1.325


Figure 3.6: (a) Complete power spectra of the signal and the noise. (b) Power spectra of the signal and the noise cut from 5.13 Hz. (c) Complete signal to noise ratio. (d) Signal to noise ratio cut from 5.13 Hz.


The result is

$$R_{\mathrm{info}} \geq \frac{1}{T} \sum_{\omega_n} \frac{1}{2} \log_2\!\big(1 + \mathrm{SNR}(\omega_n)\big) \approx 28.1 \text{ bits/sec} \approx 0.688 \text{ bits/spike}.$$

We can put this number in a bit more perspective when taking the information capacity into account and looking at the coding efficiency instead.

As we have mentioned before, we can calculate the information capacity in a number of different ways. Let us start by looking at the information capacity of the signal.

Information capacity of the signal

The information capacity of the signal is essentially the maximum information rate for the signal. It is calculated as the information rate when the estimation is almost the same as the signal we are estimating. As the estimation we use smoothed Gaussian random noise where σ = 0.0005 and where µ equals the signal. So we follow the above steps for calculating the information rate, where we use a cutoff frequency of 7.8125 Hz (this is the first time SNR < 0.1) and we find that the information capacity of the signal is

$$C_{\mathrm{info}} = R^{S}_{\mathrm{info}} \approx 43.2 \text{ bits/sec}.$$

This leaves us with a coding efficiency of 0.650.

Information capacity of a spiking neuron model

In [22] MacKay and McCulloch state that when a minimum for the inter-spike interval is known, one can calculate an optimal maximum for this interval. This eventually leads to an optimal information capacity. We have stated that this is not the way to go, since it is quite a hassle to find this optimal maximum. But we can do this once for theoretical purposes and use the value we find here in the coming sections.

When a neuron fires a spike, there is some time before the next spike can happen. This time in between is at least 1 millisecond, but a neuron cannot sustain firing extremely fast for a long time. So let us assume that the minimum inter-spike interval is 2 milliseconds. The equation to solve becomes

$$\frac{m + 2}{m - 2} = \ln(m - 2).$$

Feeding this to Wolfram Alpha gives us the approximate solution of m ≈ 7.57239, which is about 8 milliseconds. With this information, the minimum inter-spike interval being 2 milliseconds and the maximum being 8 milliseconds, we find an information capacity of approximately 517.0 bits/sec and a coding efficiency of 0.054.
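As an aside (this is not the route taken in the thesis), the same root can also be found numerically, for instance with scipy, which reproduces the quoted value m ≈ 7.57:

```python
import numpy as np
from scipy.optimize import brentq

r = 2.0   # minimum inter-spike interval in milliseconds

# Root of (m + r)/(m - r) - ln(m - r) = 0, bracketed safely away from m = r
m = brentq(lambda m: (m + r) / (m - r) - np.log(m - r), 4.0, 20.0)
print(m)   # ~7.5724
```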


Information capacity of our model

As we discussed, there are two more ways to estimate the information capacity in our model. The first method uses the theory above, but instead of determining the optimal maximum inter-spike interval from the theoretical minimum inter-spike interval, we take the maximum and minimum inter-spike intervals occurring in the spike train we created using our model. The second method goes back to basic information theory, using the spike rate as an estimation for the probability of spiking at any given moment.

• Using the maximum and minimum inter-spike intervals

With the settings as mentioned at the start of this example, the maximum inter-spike interval, m, equals 45 and the minimum inter-spike interval, r, equals 8. When working with milliseconds, ∆T is 1, since this is the spacing we have been using all along. Putting these values into the formula for the information capacity gives us

$$C_{\mathrm{info}} = \frac{2}{T_m + T_r} \log_2\!\frac{T_m - T_r}{\Delta T} = \frac{2}{53} \log_2(37) \approx 0.1966 \text{ bits/msec} = 196.6 \text{ bits/sec},$$

and the coding efficiency, ε, becomes 0.143.

• Using the spike rate

Starting with the spike train, we find that the spike rate is equal to approximately 40.9 spikes per second, which means that on average there is a 0.0409 chance of there being a spike in a certain millisecond. Putting the information theory to use we find:

$$C_{\mathrm{info}} = -R_{\mathrm{spike}} \log_2 R_{\mathrm{spike}} - (1 - R_{\mathrm{spike}}) \log_2(1 - R_{\mathrm{spike}}) \approx -0.0409 \log_2(0.0409) - 0.9591 \log_2(0.9591) \approx 0.2462 \text{ bits/msec} = 246.2 \text{ bits/sec}.$$

With this information capacity we get a coding efficiency of 0.114.
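Both capacity estimates and the resulting efficiencies for the toy example can be recomputed in a few lines, using the numbers quoted above:

```python
import numpy as np

R_info = 28.1                    # bits/sec, the lower bound found for the toy signal
T_m, T_r, dT = 45.0, 8.0, 1.0    # observed max/min inter-spike interval and spacing (ms)

# MacKay & McCulloch with the observed inter-spike intervals
C_isi = 2.0 / (T_m + T_r) * np.log2((T_m - T_r) / dT) * 1000.0   # ~196.6 bits/sec
# Entropy per millisecond from the spike rate
p = 0.0409                       # probability of a spike in any given millisecond
C_rate = (-p * np.log2(p) - (1 - p) * np.log2(1 - p)) * 1000.0   # ~246 bits/sec

print(C_isi, R_info / C_isi)     # ~196.6 and ~0.143
print(C_rate, R_info / C_rate)   # ~246  and ~0.114
```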

Because these coding efficiencies are not optimal, there is room for improvement. In chapter 5 we will take a look at the effects of some variables used in this model. We will try to determine better values for these variables as well. As this model is very basic, we will change it in chapter 6. But first let us introduce a real signal.


4 A real signal

In the previous chapter we have seen how to create different types of coding efficiencies. In the examples we focused on a toy signal. In this chapter we will be focusing on a real signal, as depicted in figure 4.1. This figure displays the amplitude of the change in air pressure of a man saying 'Don't ask me to carry an oily rag like that.' Just the word 'that' is shown in figure 4.2, where we see the high frequency of the sound more clearly. This makes sense, as we hear sounds between 15 and 20,000 Hz. We will discuss the workings of the human ear in the next section and finish this chapter with an example of our model with this signal. The signal displayed in figure 4.1 is part of a data set of speech data collected by SRI International, Texas Instruments and MIT, called the TIMIT data set. The set and how it was collected is described by Zue et al. ([43]). To collect all data in the TIMIT data set, a total of 630 people, each from one of eight dialect regions, were asked to read 10 sentences out loud. Two sentences were the same for every person; these were used for calibration and were expected to bring out significant dialect differences. The sentence we use here is one of these calibration sentences. In this case it is spoken by a male speaker who has moved around a lot.

4.1 The human ear

Figure 4.3: Diagram showing the structure of the human ear, detailing the parts of the outer, middle, and inner ear from [13].

The sound of a person speaking is something we can hear. The ear consists of three main parts: the outer ear, the middle ear and the inner ear, as shown in figure 4.3. A sound signal is picked up by the outer ear and gets sent through the middle ear into the inner ear, where the inner hair cells in the cochlea translate the sound into electro-chemical impulses that are sent to the brain ([13]). At the beginning of the cochlea the higher frequencies are picked up by the inner hair cells, while the lower frequencies are picked up by the inner hair cells at the far end of the cochlea. There are a total of 31,000-32,000 inner hair cells in the cochlea, each having their own center frequency and bandwidth of frequencies they are sensitive to ([30]). One might say that the ear, specifically the cochlea, works as an actual Fourier transform ([17]). The inner hair cells are the spiking (sensory) neurons that we will be modelling in this thesis.


Figure 4.1: A man saying: ’Don’t ask me to carry an oily rag like that.’


Figure 4.4: Schematic diagram of the model for the auditory periphery, from [41].

4.2 Pre-processing

The sound travels quite a track from the outer ear to the inner hair cells. The computer model for this track is covered by a package called cochlea, created by Zilany et al. ([42], [41]). Their model is discussed in the next subsections. In this section we will discuss everything we need to do in order to create a signal that can be put through our spike generating model. Before we can actually put the data to use, we need to do some pre-processing. In the end we want to build a spiking neuron model that estimates the envelope of the sound in an efficient way using spikes. We cannot put the files as we have them into the cochlea model immediately. One of the reasons is that Python does not recognize the .WAV sound files from the TIMIT data set as actual sound files. So first we use the Sndfile function from the scikits.audiolab package, which makes the libsndfile library we need more usable [4]. The Sndfile function turns the .WAV file from the data set into a numpy array.

4.2.1 Re-sampling

The part that we modelled from the inner hair cell is the spike generator, so everything up until that point we have to find somewhere else. Fortunately Zilany et al. created a model that does exactly this part: a model of the middle and inner ear described in [40], [42] and [41]. As input we use the sound wave of a man saying 'Don't ask me to carry an oily rag like that.'


In the model the sound wave passes through a middle ear filter and then moves on to the inner ear, where the inner hair cells are located. After the inner hair cells it moves through a synapse model. From there on we use our own model as the spike generator. The cochlea model is shown schematically in figure 4.4; the details are described in the papers. Zilany et al. created a spike generator as well, but in order to be able to compare this signal with the toy signal from section 3.3.2 we will be using the model discussed in chapter 3.

Figure 4.5: The first millisecond of the signal, re-sampled.

When we try to use the model, an error occurs, as it requires an input with a sample frequency between 100 and 500 kHz while our sound is sampled at 16 kHz. So we have to re-sample the sound array before putting it through Zilany's model. For this we use the resample function from scipy.signal to re-sample our sound to 160 kHz. Figure 4.5 illustrates the re-sampling of the first millisecond of the signal. Note that nothing is being said here yet, which is clear from the very small change in amplitude.

4.2.2 Cochlea model settings

Table 4.1: Variable settings for the cochlea model.

    variable    value
    cf          15,000 / 1,500 / 150
    fs          160,000
    species     human
    cohc        1.
    cihc        1.
    vihc        inner hair cell function output
    anf_type    hsr
    powerlaw    approximate
    ffGn        False

Now we can put the signal through the cochlea model. For this we will use the settings as mentioned in table 4.1. Once we have put our sound through the inner hair cell and synapse simulations we will calculate the envelope of the output, as an auditory neuron estimates the envelope of a signal instead of the signal itself [28]. At this point our sound array has a sample frequency of 160 kHz and the height of the sound is varying a lot. Because the model has a precision of 1 millisecond, let us split the output of the synapse into batches of 400 samples, take the highest value in every batch as a reference point, and take a linear interpolation of these points such that the sample frequency equals 1000. Now we can normalize this signal and it is ready to be used in our own spike generating model.
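The pre-processing around the cochlea model can be sketched as follows: re-sample the 16 kHz recording to 160 kHz before the model, and afterwards turn the synapse output into a normalized 1 kHz envelope by taking the maximum of every batch of 400 samples and interpolating linearly. The cochlea-model call itself is omitted here (see [41], [42]); synapse_out stands for its output, and placing the interpolation points at the batch centres is an assumption of this sketch.

```python
import numpy as np
from scipy.signal import resample

def upsample_to_160khz(sound_16khz):
    """Re-sample a 16 kHz recording to 160 kHz for the cochlea model."""
    return resample(sound_16khz, len(sound_16khz) * 10)

def envelope_1khz(synapse_out, batch=400, fs_in=160000.0, fs_out=1000.0):
    """Normalized envelope of the synapse output on a 1 kHz grid."""
    n_batches = len(synapse_out) // batch
    peaks = synapse_out[:n_batches * batch].reshape(n_batches, batch).max(axis=1)
    t_peaks = (np.arange(n_batches) + 0.5) * batch / fs_in    # batch centres, in seconds
    t_out = np.arange(0.0, t_peaks[-1], 1.0 / fs_out)         # 1 ms grid
    env = np.interp(t_out, t_peaks, peaks)                    # linear interpolation
    return env / env.max()                                    # normalize
```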

4.3 Example

As mentioned in the beginning of this chapter, we finish by putting the sound of an American man who has moved around a lot, a so-called army-brat, saying the sentence: ‘Don’t ask me to carry an oily rag like that.’, through the cochlea model and using our


spiking neuron model as the spike generator. We will do this for a high, a medium and a low frequency neuron with characteristic frequencies (cf) of 15000 Hz, 1500 Hz and 150 Hz respectively. Recall that each neuron responds to a bandwidth of frequencies. The characteristic frequency is the frequency to which the neuron is most sensitive.

4.3.1 High frequency neuron

The characteristic frequency of the neuron in this section is 15000 Hz. We take the sound and put it through the cochlea model as described in the previous sections. The output of the inner hair cell part and of the synapse are shown in figures 4.6 and 4.7 respectively. Although the output of the inner hair cell is really small, the output of the synapse is not. The envelope we take as input (after normalization) for our Spike Response Model is also shown here. Note that the output of the inner hair cell as well as the output of the synapse look a lot like the amplitude of the sound as shown in figure 4.1, aside from the size of the values. This is because this neuron is most sensitive to frequencies around 15000 Hz and the original sound frequency is not far from that. Next we will use the variables from the toy signal (table 3.1) to create a spike train, which consists of 19 spikes. Of course the values of delay and ν are different, because these are created within the model and depend on the signal and the spike train. Eventually this results in the estimation shown in figure 4.8, which has a sum of squared errors of 35.2. A big difference between this and the toy signal from section 3.3.2 is that here the estimation is smoother than the signal, whereas in the toy signal this was the other way around. Now we create the effective noise, for which we again use τ0 = 128, and use this to create the power spectrum shown in figure 4.9. This nicely shows that the power spectra of the effective noise and the input are roughly in the same range. Even though this is the case, we still only use the lower frequencies (up until where the signal to noise ratio drops below 0.1).

Table 4.2: Information rates and coding efficiencies.

    Information rate    per sec        71.1
                        per spike      8.89
    Coding efficiency   noisy          0.853
                        optimal        0.138
                        inter-spike    2.03
                        spike rate     1.06

Figure 4.10 shows these lower frequencies of the power spectrum, which are used to calculate the information rate and finally the different coding efficiencies. These values are shown in table 4.2. Compared to our toy signal, these values are quite high; both the coding efficiency created using the inter-spike intervals and the one created using the spike rate are even above 1. In the next two sections we will do the same for a medium frequency auditory neuron and a low frequency auditory neuron.

4.3.2 Medium frequency neuron

As we did in the previous section, we start with the sound of an army brat saying the sentence 'Don't ask me to carry an oily rag like that.' and put it through the cochlea model. This time we use the center frequency of 1500 Hz, which represents a neuron around the middle of the cochlea. Figures 4.11 and 4.12 show the output of the inner hair cell and of the synapse.


High frequency neuron

Figure 4.6: Output inner hair cell. Figure 4.7: Output of the synapse and its envelope.

Figure 4.8: Final estimation: convoluted, shifted, cut and re-scaled, created using 19 spikes.


Medium frequency neuron

Figure 4.11: Output inner hair cell. Figure 4.12: Output of the synapse and its envelope.

Figure 4.13: Final estimation: convoluted, shifted, cut and re-scaled, created using 24 spikes.


The fluctuation in these outputs is a lot higher and the contours look a lot less like the input sound than they did with the high frequency neuron. Again we will use the variables from the toy signal (table 3.1) to create a spike train, which consists of 24 spikes. With this we find the estimation shown in figure 4.13. The sum of squared errors is 31.0, compared to 32.5 for the high frequency neuron. Although the normalized envelope is overall lower than before, there are more spikes, as more of the signal is centered around the middle.

Table 4.3: Information rates and coding efficiencies.

    Information rate    per sec        66.3
                        per spike      6.56
    Coding efficiency   noisy          0.773
                        optimal        0.18
                        inter-spike    1.48
                        spike rate     0.813

The power spectrum of both the signal and the effective noise is shown in figure 4.14. The frequencies we use for the information rate and finally the different coding efficiencies are shown in figure 4.15. These values are shown in table 4.3. As we use a slightly smaller part of the power spectrum, it is expected that these values are lower than the results from the high frequency neuron, although we did use the same cutoff rule.

4.3.3 Low frequency neuron

In this last section we discuss what happens if we take not a high frequency neuron or a medium frequency neuron but a low frequency neuron, with a characteristic frequency of 150 Hz. This is a neuron that is situated at the apex (far end) of the cochlea. Again the outputs of the inner hair cell and synapse are shown in figures 4.16 and 4.17. As there are more small values in the output of the synapse, the envelope becomes relatively more variant. Because of this and the smoothness of the estimation, the sum of squared errors is higher than in the previous sections: 46.7. The 23 spikes and the estimation created by our model are shown in figure 4.18.

Table 4.4: Information rates and coding efficiencies.

    Information rate    per sec        80.7
                        per spike      8.34
    Coding efficiency   noisy          0.884
                        optimal        0.156
                        inter-spike    1.52
                        spike rate     1.03

The power spectrum of both the signal and the effective noise is shown in figure 4.19. The part of the power spectra we use to calculate the information rate and finally the different coding efficiencies is shown in figure 4.20. The final results can be found in table 4.4. The information rate per second is higher due to the higher cutoff in the frequency domain, and so are the coding efficiencies created using the noisy signal and the theoretically optimal one. As there are more spikes used, the information rate in bits per spike is lower.

4.3.4 Conclusions

The normalized output of the synapse fluctuates more as the characteristic frequency decreases. In all cases the power spectra of the signal and the effective noise have roughly the same range and therefore overlap. This is a little unexpected, as the estimation is smoother than the input in the model.


Low frequency neuron

Figure 4.16: Output inner hair cell. Figure 4.17: Output of the synapse and its envelope.

Figure 4.18: Final estimation: convoluted, shifted, cut and re-scaled, created using 23 spikes.


This implies that the smoothing constant is probably not at its best value at 100 milliseconds; a smaller value might be better when working with a real signal. The membrane time constant and the threshold constant should possibly be smaller as well, as most of the signal is not captured by the estimation. The general effects of each of these three variables will be discussed in the next chapter, where we will also try to find optimized values for them.


5 Effects of variables

In this chapter we will zoom in on the effects that some of the constants in the model have on the sum of squared errors, the information rate, and the information capacities and their coding efficiencies. As in the previous chapters, we will choose the cutoff frequency to be the first frequency at which the signal to noise ratio drops below 0.1. The variables we will focus on are the membrane time constant and the smoothing constant. In preparation for the next chapter we will also look into the basic threshold value. In order to be able to choose the settings with which we shall continue this thesis, we will define a performance measure at the end of this chapter.
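To make the cutoff rule concrete, the minimal sketch below estimates the information rate from the power spectra of the signal and the effective noise, integrating up to the first frequency where the SNR drops below 0.1. The function name, the raw periodogram-style spectra and the NumPy-only implementation are illustrative assumptions, not the exact code used in this thesis.

    import numpy as np

    def information_rate(signal, noise, fs, snr_cutoff=0.1):
        """Bits per second from the power spectra of signal and effective noise,
        integrating log2(1 + SNR) up to the first frequency where SNR < snr_cutoff."""
        freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
        p_signal = np.abs(np.fft.rfft(signal)) ** 2
        p_noise = np.abs(np.fft.rfft(noise)) ** 2
        snr = p_signal / (p_noise + 1e-12)

        # first frequency bin where the SNR drops below the cutoff
        below = np.where(snr < snr_cutoff)[0]
        cut = below[0] if len(below) > 0 else len(snr)

        df = freqs[1] - freqs[0]
        return np.sum(np.log2(1.0 + snr[:cut])) * df

Dividing the returned rate by the number of spikes per second then gives the information rate in bits per spike.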

5.1 The membrane time constant

The variable that is most used throughout the model is the membrane time constant, τm. Because of this, it is not unlikely that this constant has a big effect on the results we have seen in the examples from sections 3.3.2 and 4.3. So what happens if we change this constant and make it variable? The first thing to look at is the range of the membrane time constant. Fortunately the people of NeuroElectro ([35]) have gathered a lot of the values reported in the literature for the constants found in neurons. They provide, for instance, a nice overview of the values found for the membrane time constant. The lowest value they list is 0.26 milliseconds ([24]) and the highest 158 milliseconds ([36]). This sets the range we will look at: from 0 to 160 milliseconds. Note that at some point we use this constant as the denominator in a fraction, and since dividing by 0 is not a good idea, we start our exploration at τm = 1 millisecond.
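A minimal sketch of how such a sweep over the membrane time constant could be organised is shown below; run_model and estimate_signal are hypothetical helper names standing in for the spike generation and estimation steps of chapter 3, and signal is the input from the toy example.

    import numpy as np

    tau_m_values = np.arange(1, 161)  # membrane time constants in milliseconds

    results = []
    for tau_m in tau_m_values:
        spikes = run_model(signal, tau_m=tau_m)          # hypothetical: generate a spike train
        estimate = estimate_signal(spikes, tau_m=tau_m)  # hypothetical: reconstruct the signal
        sse = np.sum((signal - estimate) ** 2)           # sum of squared errors
        results.append((tau_m, sse, len(spikes)))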

5.1.1 Sum of squared errors

The membrane time constant has an effect on the estimation itself, and therefore on the quality of the estimation as well. As one would expect, the lower the error, the better the estimation. Since the estimation is sometimes higher than the signal and sometimes lower, we have to view the error as the absolute difference between the two. In computational science we usually go one step further and look at the sum of squared errors (SSE) instead of the sum of absolute errors; larger errors then have a bigger impact, which acts as a penalty on large deviations. As shown in figure 5.1a, when the membrane time constant increases, this sum of squared errors decays and then stays low, around a value of 8.5. This means that any value higher than 22 milliseconds for the membrane time constant is probably an acceptable choice with respect to the sum of squared errors. The error is smallest (7.02) at a membrane time constant of 87 milliseconds.
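With x(t) the input signal and x̂(t) the estimation at sample time t (notation introduced here only for readability), the sum of squared errors over the duration of the stimulus is

    SSE = \sum_t \left( x(t) - \hat{x}(t) \right)^2 .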


(a) The effect of the membrane time constant on the sum of squared errors.

(b) The estimation created using the best membrane time constant with respect to the sum of squared errors: τm = 87 milliseconds. 64 spikes were needed to create this estimation.

Figure 5.1

If neurons were just about creating good estimations, then we would be done: we would have found the right value for the membrane time constant for this model. But neurons are about creating good estimations in an efficient way, so let us look at some other effects of this constant.

5.1.2 Information rate

As we mentioned before, the membrane time constant affects the estimation. This is because it determines the speed of the decay and therefore influences the times of spiking. The estimation in turn affects the noise and therefore its Fourier transform, which we use when calculating the effective noise. Because of this, the membrane time constant affects the signal to (effective) noise ratio (SNR) and eventually the information rate. But how exactly is the information rate affected by the membrane time constant? This is shown in figure 5.2 for both the bits per second (5.2a) and the bits per spike (5.2b). We see that the information rate in bits per second generally goes up to about 28 bits per second and fluctuates there. Again, this fluctuation starts from a membrane time constant of about 22 milliseconds. The highest value of this information rate, 31.0 bits per second, also happens to occur at this same value of the membrane time constant. Although the information rate in bits per spike is also affected by the membrane time constant, its behavior is different from that of the SSE and of the information rate in bits per second, as the number of bits per spike mostly keeps increasing. The reason for this is that the number of spikes goes down in what looks like an exponential fashion as the value of the membrane time constant goes up, as shown in figure 5.2c. The estimation created with the best value of the membrane time constant with respect to the number of bits per spike is shown in figure 5.2d; only 34 spikes were needed to create it.


(a) Information rate in bits per second for variable membrane time constant values.

(b) Information rate in bits per spike for variable membrane time constant values.

(c) The number of spikes needed to create the estimation for variable values of the membrane time constant.

(d) The estimation created using the best value for the membrane time constant with respect to the information rate in bits per spike: τm = 160 milliseconds.

Figure 5.2

5.1.3 Information capacity and coding efficiency

In order to get the coding efficiency, we need to divide the information rate by the information capacity. In section 3.2 we explained four ways of calculating the information capacity: by using a noisy signal, by using the theoretical optimum proposed by MacKay and McCulloch ([22]), by applying their theory to the actual inter-spike intervals encountered, and by using the concept of entropy. The first two are the same for every value of τm, since neither the noise nor the theoretical values are influenced by changing it. The moments of spiking, and therefore the inter-spike intervals, are influenced by the membrane time constant, which means that the latter two are affected by τm. This leads to the effects on the information capacity shown in figure 5.3a.
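In symbols, each coding efficiency in this section is simply the ratio of the information rate R to one of these four information capacities C (the notation is introduced here only for readability):

    \eta = \frac{R}{C}, \qquad C \in \{ C_{\text{noisy}},\; C_{\text{optimal}},\; C_{\text{inter-spike}},\; C_{\text{spike rate}} \}.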

Because the inter-spike interval and spike rate information capacities generally go down, the corresponding rate-capacity ratios (the coding efficiencies) generally go up. Thus, when using the minimum and maximum inter-spike intervals or the entropy-based capacity, it holds in general that the higher the membrane time constant, the higher the coding efficiency. For these, the highest coding efficiencies occur at membrane time constants of 149 and 157 milliseconds, with values of 0.421 and 0.356 for the inter-spike interval and spike rate efficiencies respectively. The estimations created using these values are shown in figure 5.4.


(a) How the membrane time constant affects the different information capacities (and how it does not).

(b) The effect of the membrane time constant on the coding efficiencies.

Figure 5.3

(a) The estimation created using the best membrane time constant with respect to the inter-spike coding efficiency: τm = 149 milliseconds. 37 spikes were needed to create this estimation.

(b) The estimation created using the best membrane time constant with respect to the spike rate coding efficiency: τm = 157 milliseconds. 35 spikes were needed to create this estimation.

Figure 5.4

The noisy-signal and theoretically optimal information capacities do not depend on the membrane time constant, and therefore their coding efficiencies follow the same pattern as the information rate, as shown in figure 5.3b. Note that the capacity created using the noisy signal is very low. The reason is that the noisy signal estimation went through the same smoothing technique as our spiky estimation. This technique, described in section 3.1.1, uses convolution to smooth the estimation and also removes the unwanted (higher) frequencies before the information rate is calculated. This results in a low information capacity and therefore a high coding efficiency for the noisy signal, 0.717 at its best. The theoretically optimal coding efficiency has a top value of 0.060.


5.2 The smoothing constant

Another constant which affects our estimation is the smoothing constant. In this section we will find out what happens if we change its value while leaving all other values as in the example of section 3.3.2. For reasons of time and computation we will take steps of 5 milliseconds: 1, 5, 10, . . . , 245, 250 milliseconds. For the same reason as before we will not use 0 as a value here; instead we start at 1. Let us start by taking a look at the effect of the smoothing constant on the sum of squared errors.

5.2.1 Sum of squared errors

The smoothing constant only comes into play once the spike train has already been made; throughout this section the estimations are therefore all created from the same spike train of 143 spikes. Nonetheless the estimation changes when this constant is modified, so the sum of squared errors is affected by the smoothing constant, as shown in figure 5.5a.

(a) The effect of the smoothing constant on the sum of squared errors.

(b) The estimation created using the best smoothing constant with respect to the sum of squared errors: τsmooth = 150 milliseconds.

Figure 5.5

As is clear from this figure, the choice of 100 milliseconds is not optimal with respect to the sum of squared errors. A value of 150 milliseconds is better suited in this respect, with an SSE of 9.35. The estimation corresponding to this value is shown in figure 5.5b.
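A minimal sketch of this sweep is given below. It assumes the spike train is fixed and that, as in section 3.1.1, the estimation is obtained by convolving the spike train with a smoothing kernel; the exponential kernel, the sampling rate and the omission of the shifting and rescaling steps are illustrative simplifications, not the exact implementation of the model.

    import numpy as np

    fs = 16000                                  # assumed sampling rate (Hz)
    tau_values = [1] + list(range(5, 251, 5))   # smoothing constants in milliseconds

    sse_per_tau = {}
    for tau_smooth in tau_values:
        # illustrative exponential smoothing kernel with time constant tau_smooth
        tau_s = tau_smooth / 1000.0                  # milliseconds to seconds
        t = np.arange(0.0, 5.0 * tau_s, 1.0 / fs)    # kernel support of ~5 time constants
        kernel = np.exp(-t / tau_s)
        kernel /= kernel.sum()

        # spike_train is the fixed 0/1 array of 143 spikes; signal is the input
        estimate = np.convolve(spike_train, kernel, mode="same")
        sse_per_tau[tau_smooth] = np.sum((signal - estimate) ** 2)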

5.2.2 Information rate

Because the smoothing constant affects the estimation, it also impacts the effective noise and therefore the information rate. How this constant affects the information rate is shown in figure 5.6.


(a) The effect of the smoothing constant on the number of bits per second.

(b) The effect of the smoothing constant on the number of bits per spike.

Figure 5.6: Effect on the information rate.

Here we find that the number of bits per second and the number of bits per spike follow the same pattern. The reason is that the spike train is not affected by the smoothing constant, so the difference between the two is just a constant factor. The highest information rate occurs at a smoothing constant of 145 milliseconds, where it equals 28.4 bits per second and 0.696 bits per spike. The effect on the estimation is shown in figure 5.7.
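This constant factor is just the spike rate: with N spikes emitted during a stimulus of duration T seconds, the two rates are related by

    R_{\text{per spike}} = \frac{R_{\text{per second}}}{N / T}.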

5.2.3 Information capacity and coding efficiency

Figure 5.8: The (non-)effect of the smoothing constant on the different information capacities.

As is clear from figure 5.8, the only information capacity that is affected is the one created using the noisy signal. For the most part, this capacity tends to decrease as the smoothing constant increases. This leads to an overall increase in the corresponding coding efficiency, which is shown in figure 5.9a. In this figure there are two things that need some extra attention. The first is that the other coding efficiencies appear to be constant. This is not the case: their information capacities are constant, but the information rate is not, so these coding efficiencies follow the same pattern as the information rate. This is shown in figure 5.9b. The second thing that needs our attention is the fact that the noisy signal coding efficiency exceeds the upper boundary of 1; at a smoothing constant of 250 it reaches a value of 1.13. The explanation is quite simple: when the information capacity of the noisy signal is created, the estimation of the noisy signal is smoothed in the same way the spiky estimation is smoothed, using the same smoothing constant. A coding efficiency larger than 1 implies that the final spiky estimation is a better estimation of the signal than the final noisy estimation, which is the case here.


(a) The effect of the smoothing constant on the different coding efficiencies.

(b) The coding efficiencies without the noisy signal coding efficiency.

Figure 5.9

Figure 5.10: The signal, noisy estimation and spike-based estimation for a smoothing constant of 250 milliseconds.

The sum of squared errors of the noisy estimation is 12.8, whereas the sum of squared errors of our spiky estimation equals 9.92 at this point. Both estimations of the signal are shown in figure 5.10. The estimation from our model outperforms the noisy estimation from a smoothing constant of 190 milliseconds onward.

5.3 The threshold constant

In the next chapter we will be making some changes to the model we are currently using. One of these changes involves a closer look at the threshold. In the current model the threshold is static: a constant that does not change. In the next chapter we will, among other things, look at dynamic thresholds, but before we do that it is good to know the influence of the threshold as a constant, which is what we discuss in this section. As before we keep all values as in table 3.1 and only change the variable at hand: θ0. The threshold value of a neuron is one of the things that determines when the neuron emits a spike. When this value is too low, the neuron spikes a lot; it is then unable to estimate the higher parts of the signal and may overestimate the lower parts. Moreover, a neuron that spikes a lot is not very efficient, which contradicts the very idea of a neuron. If the threshold value is too high, the neuron does not spike often enough, perhaps not at all, and the estimation, if there is one, will not be useful ([39]). These effects are shown in figure 5.11. Let us now explore the effect of the threshold constant on the sum of squared errors, the information rates and the coding efficiencies, as we did in the previous sections.
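To make the effect of a static threshold concrete, the toy sketch below counts the spikes of a leaky integrate-and-fire style membrane for different values of θ0. The reset-to-zero behaviour, the unitless parameters and the random input drive are simplifications chosen for illustration only; they do not reproduce the exact model of chapter 2.

    import numpy as np

    def count_spikes(inp, theta0, tau_m=20.0, dt=1.0):
        """Leaky integration of the input; a spike is counted and the potential
        reset whenever the membrane potential reaches the static threshold theta0."""
        v, n_spikes = 0.0, 0
        for x in inp:
            v += dt * (-v / tau_m + x)   # leaky integration step
            if v >= theta0:
                n_spikes += 1
                v = 0.0                  # simple reset
        return n_spikes

    rng = np.random.default_rng(0)
    inp = np.abs(rng.normal(size=2000))  # toy non-negative input drive

    for theta0 in [0.5, 5.0, 50.0]:
        print(theta0, count_spikes(inp, theta0))

A very low threshold makes the toy neuron spike on almost every step, while a threshold far above the steady-state potential yields no spikes at all, mirroring the two failure modes described above.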
