
UvA-DARE is a service provided by the library of the University of Amsterdam (https://dare.uva.nl)

New methods for the analysis of trial-to-trial variability in neuroimaging studies

Weeda, W.D.

Publication date

2012

Document Version

Final published version

Link to publication

Citation for published version (APA):

Weeda, W. D. (2012). New methods for the analysis of trial-to-trial variability in neuroimaging studies.

General rights

It is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), other than for strictly personal, individual use, unless the work is under an open content license (like Creative Commons).

Disclaimer/Complaints regulations

If you believe that digital publication of certain material infringes any of your rights or (privacy) interests, please let the Library know, stating your reasons. In case of a legitimate complaint, the Library will make the material inaccessible and/or remove it from the website. Please use Ask the Library (https://uba.uva.nl/en/contact) or send a letter to: Library of the University of Amsterdam, Secretariat, Singel 425, 1012 WP Amsterdam, The Netherlands. You will be contacted as soon as possible.


New Methods for the Analysis of Trial-to-Trial Variability in Neuroimaging Studies

Cover design by Wouter D. Weeda
ISBN: 978-94-6191-212-1

This thesis was funded by a Netherlands’ Organization for Scientific Research (NWO) VIDI grant awarded to Hilde Huizenga and a NWO VENI grant awarded to Raoul Grasman.

New Methods for the Analysis of Trial-to-Trial Variability in Neuroimaging Studies

ACADEMIC DISSERTATION

to obtain the degree of doctor at the University of Amsterdam, by authority of the Rector Magnificus, Prof. dr. D.C. van den Boom, before a committee appointed by the Doctorate Board, to be defended in public in the Agnietenkapel on Wednesday, 28 March 2012, at 12:00

by

Wouter Daniël Weeda, born in Haarlem


Supervisor: Prof. dr. M.W. van der Molen

Co-supervisors: Dr. H.M. Huizenga, Dr. L.J. Waldorp, Dr. R.P.P.P. Grasman

Other members: Dr. C.F. Beckmann, Prof. dr. C.V. Dolan, Prof. dr. M.C.M. De Gunst, Prof. dr. K.R. Ridderinkhof, Prof. dr. S.A.R.B. Rombouts


Contents

1 Introduction
1.1 A general model of trial-to-trial variability
1.1.1 Reaction time (RT)
1.1.2 Electroencephalography (EEG)
1.1.3 Functional Magnetic Resonance Imaging (fMRI)
1.2 Why trial-to-trial variability is important
1.3 Outline
1.4 Articles resulting from this thesis

2 Empirical Support for a Drift Diffusion Model Account of the Worst Performance Rule
2.1 Introduction
2.1.1 The drift diffusion model
2.2 Method
2.2.1 Participants
2.2.2 Intelligence test
2.2.3 Response time task
2.3 Results
2.3.1 Preprocessing
2.3.2 Worst Performance Rule
2.3.3 Drift diffusion model parameters
2.4 Discussion
2.5 Supplementary Materials
2.5.1 Characteristics of excluded participants
2.5.2 Behavioral measures
2.5.3 Drift diffusion model fits

3 Simultaneous Estimation of Waveform, Amplitude, and Latency of Single-Trial EEG/MEG Data
3.1 Introduction
3.2 Methods
3.2.1 Model
3.2.2 Parameter estimation
3.2.3 Estimating single-trial amplitude and latency
3.2.4 Multiple signals
3.2.5 Model selection
3.3 Simulations
3.3.1 Results
3.4 Empirical application
3.4.1 Methods
3.4.2 Results
3.5 Discussion

4 Activated Region Fitting: A Robust High-Power Method for the Analysis of fMRI Data
4.1 Introduction
4.2 Method
4.2.1 Input
4.2.2 Spatial model
4.2.3 Estimation
4.2.4 Model selection
4.2.5 Variance estimation
4.2.6 Hypothesis testing
4.2.7 Starting values
4.2.8 Simulations
4.3 Results
4.3.1 Parameter recovery
4.3.2 Variance estimation
4.3.3 Power analysis
4.3.4 Real data example
4.4 Discussion
4.4.1 Acknowledgements
4.5 Appendix A. Noise simulations
4.5.1 Procedure
4.6 Appendix B. Real data tables
4.7 Appendix C. Algorithm overview

5 Functional Connectivity Analysis of fMRI Data Using Parameterized Regions-of-Interest
5.1 Introduction
5.2 Method
5.2.1 Connectivity analysis
5.2.2 Single-trial amplitude estimation
5.2.3 Connectivity estimation
5.2.4 Testing differences in connectivity
5.2.5 Simulations
5.2.6 ARF connectivity estimation and standard methods
5.2.7 Power
5.3 Results
5.3.1 Power
5.3.2 False positive rate
5.4 Empirical application
5.4.1 Results
5.5 Discussion
5.6 Acknowledgements
5.7 Appendix A. Extension to full volume fMRI analysis
5.7.1 Spatial model
5.8 Appendix B. Raw time-series versus trial-by-trial amplitude
5.9 Appendix C. Consistency of correlation estimates
5.10 Supplementary materials

6 arf3DS4: An Integrated Framework for Localization and Connectivity Analysis of fMRI Data
6.1 Introduction
6.2 Activated Region Fitting
6.2.2 Spatial model
6.2.3 Parameter estimation
6.2.4 Model selection
6.2.5 Hypothesis testing
6.2.6 Procedure
6.2.7 Connectivity analysis
6.3 The arf3DS4 package
6.3.1 Setting up an experiment
6.3.2 Creating and customizing a model
6.3.3 Fitting a model
6.3.4 Finding an optimal model
6.3.5 Connectivity analysis
6.4 The ARF example data
6.4.1 The experiment structure
6.4.2 Getting a feel for the data
6.4.3 Fitting the model
6.4.4 Selecting an optimal model
6.4.5 Connectivity analysis
6.5 Empirical data
6.6 Conclusion
6.7 Acknowledgements
6.8 Appendix A. NIfTI files and the fmri.data class
6.8.1 Visualizing fMRI data
6.9 Appendix B. S4-classes

7.1 Variability in neuroimaging data
7.1.1 Simultaneous Estimation of Waveform Amplitude and Latency
7.1.2 Activated Region Fitting
7.2 Why the analysis of trial-to-trial variability is important

References

Summary in Dutch / Samenvatting in het Nederlands


Introduction

The working of the human brain has intrigued scientists for centuries: how are humans able to perform complex cognitive tasks like facial recognition, problem solving, or even simple tasks like pressing a button (which of course can also be seen as a very complex act)? Cognitive psychologists are usually interested in disentangling these tasks into more fundamental psychological processes. From the early days of F.C. Donders, who pioneered the systematic exploration of human cognition (Donders, 1969), reaction times have been (and still are) the most widely used outcome measure in cognitive psychology. In recent years more direct measures have become available in the form of electro-/magneto-encephalography (EEG/MEG) and functional magnetic resonance imaging (fMRI), allowing researchers to study the underlying mechanisms of cognition more directly.

A general observation in almost all studies of cognitive functioning, independent of the outcome measure, is that within-subject (or intraindividual) performance is variable (Fiske and Rice, 1955; Arieli et al., 1996; Aguirre et al., 1998). In other words, the response of a single subject varies from trial to trial. Traditionally this variability is seen as measurement error and is therefore ignored or discarded from analyses (Jensen, 1992; Jung et al., 2001; Duann et al., 2002). There is, however, a large (and growing) body of evidence suggesting that this variability conveys important information (MacDonald et al., 2009).

For example, reaction time (RT) studies have shown that increased variability in RT corresponds to lower intelligence (Baumeister, 1998; Jensen, 1992; Hultsch et al., 2002), lower cognitive performance (Bielak et al., 2010), older age (Deary, 2005; Lövdén et al., 2007), and clinical conditions like Attention Deficit Hyperactivity Disorder (ADHD) and autism (Geurts et al., 2008). These differences are also evident when analyzing neuroimaging studies using, for example, Event Related Potentials (ERPs), stimulus-locked deflections in EEG signals. For example, high-intelligence participants show less trial-to-trial variability in both the amplitude (Fjell and Walhovd, 2007) and the latency (Saville et al., 2011) of ERPs. Increased variation is observed in clinical groups diagnosed with Alzheimer's disease (Hogan et al., 2006) or ADHD (Lazzaro et al., 1997), and in older subjects (Fein and Turetsky, 1989; Fjell et al., 2009). In addition, and somewhat surprisingly, higher variation in the amplitude of fMRI BOLD responses, the brain's metabolic response to a stimulus, is correlated with younger age (Garrett et al., 2011).

These studies suggest that variability in responding conveys important information on how the brain performs cognitive tasks (MacDonald et al., 2006; Moy et al., 2011), more so than the average response alone. Analysis methods should therefore take this variability into account, which can be a daunting task since noise has to be separated from true variability in the signal of interest. This thesis therefore concerns new methods for the analysis of trial-to-trial variability, specifically for the analysis of reaction time (RT), electroencephalography (EEG), and functional magnetic resonance imaging (fMRI) data.

1.1 A general model of trial-to-trial variability

Variability in reaction time and neuroimaging data can have two underlying sources. First, variability may stem from measurement error. Second, variability may reflect trial-to-trial variability of the actual signal; the latter is the variability of interest. The major hurdle in the analysis of variability is thus discerning noise variance from variance in the signal of interest at each single trial. To quantify this distinction between signal and noise we pose a general model. In this model the measured data on each trial are seen as a function of an underlying neurological or psychological process and measurement error. More formally this can be stated as:

y_i = f(d, s_i) + ε_i    (1.1)

In Eq. 1.1, y_i is the measured response to a stimulus on trial i = 1, ..., N. The underlying process is described by a non-linear function f(d, s_i) with deterministic parameters d and stochastic parameters s_i¹. The deterministic parameters define characteristics of the response that are equal across trials. The stochastic parameters can have different values at each trial and therefore define characteristics that are variable across trials. Finally, ε_i denotes measurement error. In the case of reaction time data, y_i is a single value, namely the reaction time on trial i. The function f can be defined as the process underlying this reaction time, that is, the process of evaluating the stimulus, making the decision, and executing the response. In the case of EEG/MEG data, y_i is a time-series of the brain's response on trial i (usually a short time-window around stimulus presentation and response), with f characterizing the shape of this response over time. For fMRI data, y_i is a (2- or 3-dimensional) image of the brain's response on trial i, with f modeling the spatial pattern of this image.

The key to characterizing trial-to-trial variability is the estimation of the signal f(d, s_i) at each trial. This requires estimation of the deterministic parameters d and the trial-specific parameters s_i. Since the parameters d are equal over trials, their estimation benefits from multiple trials. The parameters s_i are variable over trials and cannot benefit from multiple trials, as they have to be estimated at each trial separately. Given the high levels of noise in single-trial data (Faisal et al., 2008; Parrish et al., 2000; Kuriki et al., 1994), this makes estimating the stochastic parameters very difficult. The next paragraphs give a detailed description of how single-trial parameters are estimated in RT, EEG, and fMRI data.

¹ The terms deterministic and stochastic are used to indicate parameters that are fixed over trials and parameters that vary over trials, respectively. The terms are not used in their statistical sense of fixed and random variables.
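As a concrete illustration, the general model of Eq. 1.1 can be simulated in a few lines. This is a minimal sketch with invented values: the deterministic part d is taken to be a fixed Gaussian-shaped response, and the stochastic part s_i a trial-varying amplitude.

```python
import numpy as np

rng = np.random.default_rng(0)

n_trials, n_samples = 50, 200
t = np.linspace(0, 1, n_samples)

def f(d, s):
    """f(d, s_i): deterministic shape d (center, width), stochastic amplitude s_i."""
    center, width = d        # equal across trials
    return s * np.exp(-0.5 * ((t - center) / width) ** 2)

d = (0.4, 0.05)                                      # fixed across trials
s = rng.normal(loc=1.0, scale=0.3, size=n_trials)    # trial-to-trial amplitudes
noise = rng.normal(scale=0.5, size=(n_trials, n_samples))

# y_i = f(d, s_i) + eps_i  (Eq. 1.1)
y = np.array([f(d, s_i) for s_i in s]) + noise

# Averaging over trials recovers the shape (d) but hides the variability in s.
avg = y.mean(axis=0)
```

Averaging suppresses the noise ε_i by a factor of √N, which is why d is easy to estimate and the single-trial s_i are not.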


1.1.1 Reaction time (RT)

Reaction time is one of the most widely used outcome measures in cognitive psychology. It is used, for example, in Simon tasks (Simon, 1969), Eriksen flanker tasks (Eriksen and Schultz, 1979), Go/No-go tasks (Mesulam, 1985), and Choice Reaction Time (CRT) tasks (Hick, 1952; Hyman, 1953). Usually, variability in reaction times is attributed to measurement error (Jensen, 1992). Therefore, in most studies, reaction time is summarized (e.g., by calculating the median or mean) to get a better measure of the 'real' reaction time, thereby ignoring trial-to-trial variability.

The easiest way to characterize variability in reaction times is simply to compute the standard deviation of the entire response time distribution (instead of calculating the mean of this distribution). This approach, albeit successful in reaction time studies (van Ravenzwaaij et al., 2011), ignores the possibility that reaction times reflect the timing of underlying (cognitive) processes (Sternberg, 1969). That is, variability of reaction times does not necessarily say anything about variability of the underlying processes. Delineating the processes underlying reaction times, and characterizing their variability, may therefore be beneficial.
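In code, this simplest summary amounts to nothing more than the following (the RT distribution is synthetic; a right-skewed lognormal shape is only an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical RTs (in ms) for one subject on 100 trials of the same condition.
rts = rng.lognormal(mean=6.2, sigma=0.25, size=100)

mean_rt = rts.mean()       # the usual summary; discards trial-to-trial variability
sd_rt = rts.std(ddof=1)    # the simplest variability measure over the distribution
```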

In order to extract the underlying processes, a model of these processes has to be assumed. Multiple models are available, but one that has proven to be very versatile is the "drift diffusion model" (DDM; Ratcliff, 1978). This model for two-choice decision tasks translates the reaction time distributions for correct and error responses into psychological processes like quality of information accumulation, response caution, and non-decision related processes. In addition, the DDM can account for important phenomena often found in reaction time studies (Ratcliff et al., 2008; Wagenmakers et al., 2005; Wagenmakers and Brown, 2007; Schmiedek et al., 2009; van Ravenzwaaij et al., 2011).

The DDM assumes that evidence for one of two choices accumulates over time at each trial via a noisy random walk process (gray lines in Fig. 1.1) until one of two response boundaries is reached (horizontal lines 0 and a in Fig. 1.1). The rate of this accumulation (drift-rate) at each trial is defined as coming from a normal distribution with mean ν (bold arrow in Fig. 1.1) and standard deviation η. The width of the


Figure 1.1: Schematic representation of the drift diffusion model. The gray lines indicate noisy information accumulation from starting point z to boundary a or 0; the solid arrow indicates the mean rate of information accumulation ν. On the top and bottom are the hypothesized response time distributions. Total RT equals T_er plus the decision time.

boundaries (boundary separation) is defined by parameter a. The non-decision time at each trial, i.e., the portion of the reaction time that cannot be explained by the diffusion process, is defined as coming from a uniform distribution with mean T_er and width S_t.
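The accumulation process described above can be sketched in a small simulation. This is a toy illustration of the random walk, not the estimation procedure used in the DDM literature, and all parameter values are invented:

```python
import numpy as np

rng = np.random.default_rng(2)

def ddm_trial(v, a, z, ter, dt=0.001, noise_sd=0.1):
    """Simulate one trial: noisy evidence accumulation from z to boundary 0 or a."""
    x, t = z, 0.0
    while 0.0 < x < a:
        x += v * dt + noise_sd * np.sqrt(dt) * rng.standard_normal()
        t += dt
    correct = x >= a
    return ter + t, correct        # total RT = non-decision time + decision time

# Trial-to-trial variability: the drift rate is drawn anew on every trial
# from a normal distribution with mean v = 0.25 and standard deviation eta = 0.1.
rts, accs = zip(*[ddm_trial(rng.normal(0.25, 0.1), a=0.08, z=0.04, ter=0.3)
                  for _ in range(200)])
```

Because the drift rate varies over trials, the simulated RTs show a right-skewed distribution with a slow tail, exactly the kind of within-subject variability the DDM decomposes.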

The DDM estimates parameters from the reaction time distributions of correct and incorrect responses. This makes direct estimation of the characteristics of the underlying process at each trial impossible. However, variability of the underlying process is captured by the η and S_t parameters. To relate the diffusion model to our general model, let y_i be the reaction time on trial i. The underlying process f can be characterized as the diffusion process with stochastic parameters s coming from a normal distribution of drift rate (with ν and η) and a uniform distribution of non-decision time (with T_er and S_t). The deterministic parameter d is defined as boundary separation a.

The added value of using an explicit model to analyze reaction times is that different cognitive processes can be discerned from reaction time data, instead of only descriptive measures like the mean or standard deviation. Furthermore, the variability of these cognitive processes can be estimated, allowing researchers to test specific hypotheses about them (for example, that aging subjects have higher variability in the quality of information accumulation). There is evidence that these cognitive processes have a neurological basis. Specifically, the quality of information accumulation has been shown to be related to trial-to-trial variability in EEG (Ratcliff et al., 2009). Neuroimaging methods like EEG thus allow researchers to further unravel these underlying cognitive processes.

1.1.2 Electroencephalography (EEG)

Electroencephalography (EEG) captures electromagnetic changes in the brain by measuring this signal via electrodes on the scalp. The signal is sampled at a high rate, allowing changes to be recorded with millisecond precision. EEG data consist of a time-series measured at each electrode on the scalp. The amount of noise in EEG data is often much larger than the signal (Jung et al., 2000), making it difficult to estimate characteristics of this signal at each trial.

EEG analysis is usually performed on data locked to the presentation of a stimulus or an overt response to this stimulus. Given the high amount of noise in single-trial EEG data, the analysis requires multiple measurements of the same condition to clearly identify these evoked or event-related potentials (ERPs) (Glaser and Ruchkin, 1976). The single trials are then averaged to filter out random noise. Thereafter, the resulting ERP is examined for pronounced positive or negative deflections (amplitude), occurring at specific latencies across different electrodes (McCarthy and Wood, 1985). These deflections (often referred to as complexes) are subsequently linked to different cognitive processes.

In estimating single-trial EEG data there are two steps to be taken. The first is to estimate the shape of the ERP, obtained by averaging stimulus- or response-locked EEG data. The second is to estimate the single-trial characteristics of the ERP (or a specific complex of interest), more specifically, its amplitude and latency at each trial. This second step requires a correct estimate of the shape of the ERP; given that shape, estimates of amplitude and latency at each single trial can be obtained.


In terms of our general model, the single-trial EEG data of a single electrode, y_i, are modeled by a neural process f determined by deterministic parameters d, which model the shape of the ERP, and stochastic parameters s_i, which determine the amplitude and the latency of this ERP at each trial.
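A minimal simulation of this single-trial model, assuming a fixed waveform that is scaled (amplitude) and shifted (latency) on every trial; the waveform, jitter magnitudes, and noise level are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

fs = 250                                   # sampling rate (Hz)
t = np.arange(0, 0.8, 1 / fs)              # an 800 ms epoch
n_trials = 60

def erp_shape(t):
    """Deterministic part d: a fixed bump peaking at 300 ms."""
    return np.exp(-0.5 * ((t - 0.3) / 0.05) ** 2)

# Stochastic parameters s_i: per-trial amplitude and latency shift.
amps = rng.normal(1.0, 0.2, n_trials)
lats = rng.normal(0.0, 0.03, n_trials)     # latency jitter in seconds

trials = np.array([a * erp_shape(t - l) for a, l in zip(amps, lats)])
trials += rng.normal(scale=1.0, size=trials.shape)   # single-trial noise >> signal

# The trial average: latency jitter smears the peak, biasing the ERP shape.
erp = trials.mean(axis=0)
```

The last line illustrates the bias mentioned below: with latency jitter, the averaged ERP is wider and lower than the true single-trial waveform.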

The major advantage of using single-trial parameters in EEG analysis is that they give a more detailed description of the underlying neurological process. An additional advantage is that they can improve estimation of the shape of the ERP (Woody, 1967; Handy, 2004; Saville et al., 2011). Furthermore, the variability of single-trial parameters can be used to highlight individual differences in brain responses between people.

1.1.3 Functional Magnetic Resonance Imaging (fMRI)

Functional Magnetic Resonance Imaging (fMRI) visualizes brain activity by contrasting the ratio of oxygenated versus deoxygenated blood, termed the blood oxygenation level dependent (BOLD) contrast (Ogawa et al., 1990; Logothetis et al., 2001). An fMRI scanner measures this activity by dividing the brain into small cubes (voxels) of approximately 3 × 3 × 3 mm and measuring changes in BOLD contrast within these voxels at multiple points in time (Boynton et al., 1996; Glover, 1999).

The main goal in (task-related) fMRI analysis is localizing regions in the brain that are active during a specific cognitive process (Friston, 2011). In recent years studies have not only focused on localizing active brain regions, but also on describing the interactions between these regions. These so-called connectivity analyses focus on the covariation of activity of regions over time (functional connectivity) or on establishing directionality (i.e., causal relations) between active regions (effective connectivity).

The key to localizing regions of brain activity is to estimate the spatial extent (i.e., shape) of these regions. Since fMRI analysis is performed on each voxel separately, estimating the extent of a region is usually done by counting active voxels and depicting regions of these voxels graphically. Improvements can be made by posing a spatial function to model the spatial extent of activated brain regions (see, for example, Hartvig, 2002; Weeda et al., 2009).


In terms of our general model, the spatial map reflecting the brain's response to a stimulus, y_i, is modeled by a spatial function f determined by deterministic parameters d, which model the spatial extent of each active brain region, and stochastic parameters s_i, which determine the height of activity² of these regions at each single trial.
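A toy version of such a spatial model, assuming a single Gaussian-shaped region with a trial-varying activation height. This is a simplified sketch with invented values, not the thesis's actual Activated Region Fitting model:

```python
import numpy as np

rng = np.random.default_rng(4)

nx = ny = 32                                # a small 2D slice of voxels
x, y = np.meshgrid(np.arange(nx), np.arange(ny))

def region(cx, cy, width):
    """Deterministic parameters d: location and spatial extent of one region."""
    return np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * width ** 2))

shape = region(cx=12, cy=20, width=3.0)     # fixed across trials

n_trials = 40
amps = rng.normal(2.0, 0.5, n_trials)       # stochastic s_i: activation height
maps = amps[:, None, None] * shape + rng.normal(size=(n_trials, ny, nx))

# A per-trial amplitude estimate: least-squares projection of each map onto
# the (known) region shape, pooling information over all voxels in the region.
est_amps = (maps * shape).sum(axis=(1, 2)) / (shape ** 2).sum()
```

The projection step shows the benefit of a spatial model: pooling over all voxels of the region gives far less noisy single-trial amplitudes than any individual voxel.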

An additional problem in localizing active brain regions is that, in standard analyses, all voxels are tested using a massive univariate method (Friston et al., 1995). That is, each voxel is tested separately for activity, and with often over 30,000 voxels included in the analysis, the chance of finding false positive results (marking a voxel as active while it actually is not) increases dramatically (Forman et al., 1995). To control the number of false positives (i.e., the familywise error rate), this type of analysis requires a correction for multiple comparisons. A major problem with these correction methods, however, is that they are often conservative (Nichols and Hayasaka, 2003), leading to decreased power to detect active brain regions.

Conservativeness of localization can be problematic since connectivity analyses (i.e., analyses of trial-to-trial variation), and especially effective connectivity analyses, require that all regions in a network associated with a particular task are known. Failing to include regions can bias connectivity analyses (Eichler, 2005). In addition, connectivity analyses require an estimate of single-trial amplitude for each region in the network. This can be problematic due to the low signal-to-noise ratio of single-trial data.

The major advantages of using a spatial function to model the spatial extent of active brain regions are increased power to detect activation and improved estimates of trial-to-trial variability (Weeda et al., 2011). The main use of trial-to-trial variability in fMRI analysis is to define functional or effective networks of localized brain regions. These can give more insight into the underlying mechanisms involved in a particular task, as they highlight the interactions between regions as opposed to just localizing them (Biswal et al., 1995). Effective connectivity goes a step further, as it can highlight directionality within these networks, showing which regions exert influence over others (Friston et al., 2003; Waldorp et al., 2011). Furthermore, estimates

² Due to the nature of fMRI data, the latency of the BOLD response may differ between trials. In contrast to EEG, this latency is not of interest in most neuroimaging studies. Therefore the latency of the BOLD response is usually taken into account within the analysis, or corrected for a priori.


of trial-to-trial variability can also be used to highlight differences between groups.

1.2 Why trial-to-trial variability is important

There are two main reasons why it is important to take trial-to-trial variability into account. First of all, trial-to-trial variability may better reflect the workings of the human brain. Evidence from multiple behavioral and neuroimaging studies has shown that intraindividual variability conveys important information on cognitive functioning. Reaction time studies have shown that taking this variability into account leads to better estimates of, for example, intelligence. Variability in this case thus seems to capture the underlying mechanism of cognitive processing better than the average alone.

This observation is further strengthened by EEG studies, where differences in variability are also observed between (clinical) groups. In this sense variability is an intrinsic part of cognitive functioning and may be a reflection of how the brain actually processes information. This may become apparent in connectivity studies, where not only the locations of regions involved in a cognitive process are of interest, but also the interactions between these regions. Cognitive processes are then defined not only by their location but also by their role within a functional network. Correct localization of active brain regions and estimation of their trial-to-trial amplitudes is therefore essential.

This highlights the second reason why trial-to-trial variability is important: from a methodological standpoint, taking variability into account is essential for correct analyses. This is for two reasons. First, not taking trial-to-trial variability into account, and thus only using the average, ignores the accuracy information of that average (contained in the trial-to-trial variability). For example, two runs of a similar condition can have the same average, but completely different trial-to-trial variability. Second, ignoring trial-to-trial variability can lead to biased estimates. For example, latency variation can distort the estimated shape of ERPs, and taking these differences into account can correct for this phenomenon.


To summarize, the correct estimation of trial-to-trial variability may not only better reflect the mechanisms underlying cognitive functioning, it is also essential for a correct analysis of reaction time and neuroimaging data. For these reasons, this thesis focuses on the development of new methods to estimate single-trial characteristics of neuroimaging data.

1.3 Outline

The outline of this thesis is as follows. Chapter 2 discusses a reaction time experiment in which evidence of a commonly found effect in reaction time studies, namely that the slowest reaction times are a better predictor of cognitive functioning than the fastest reaction times, is not found when looking solely at reaction times, but is found when analyzing the underlying decision process. This chapter highlights the importance of taking trial-to-trial variability into account when analyzing reaction times and serves as proof-of-concept for the further chapters. As learned from the reaction time study in Chapter 2, the importance of assessing the properties of a neurological signal at the single-trial level is clear: trial-to-trial variability may convey more information than the average alone. Chapter 3 therefore introduces a new method for the analysis of single-trial EEG data. This method not only determines the waveform of an event-related potential (ERP) but also determines its amplitude and latency at each single trial. To further discern cognitive functioning, fMRI allows analysis of the location of, and interactions between, areas in the brain associated with a cognitive function. Chapter 4 therefore introduces a new method for the localization of active brain regions with fMRI that uses a spatial model for these regions. This method has increased power to detect activation and has clear advantages for further analyses used to estimate trial-to-trial variability. Chapter 5 extends the method of Chapter 4 to functional connectivity analysis, showing increased power and better estimates of trial-to-trial amplitude than standard approaches. Chapter 6 describes the accompanying software package of this method as implemented in R, and gives an overview of the method, its functions, and its extensions. The thesis ends with a summary and a concise discussion of the advantages and disadvantages of the methods.


1.4 Articles resulting from this thesis

Chapter 3

Weeda, W.D., Grasman, R.P.P.P., Waldorp, L.J., van de Laar, M.C., van der Molen, M.W., and Huizenga, H.M. (in revision). Simultaneous Estimation of Waveform Amplitude and Latency of Single-Trial EEG/MEG data. PLoS One.

Chapter 4

Weeda, W.D., Waldorp, L.J., Christoffels, I., and Huizenga, H.M. (2009). Activated Region Fitting: A Robust High-Power Method for the Analysis of fMRI Data. Human Brain Mapping, 30(2), 2595-2605.

Chapter 5

Weeda, W.D., Waldorp, L.J., Grasman, R.P.P.P., van Gaal, S., and Huizenga, H.M. (2011). Functional Connectivity Analysis of fMRI Data using Parameterized Regions-of-Interest. NeuroImage, 54(1), 410-416.

Chapter 6

Weeda, W.D., de Vos, F., Waldorp, L.J., Grasman, R.P.P.P., and Huizenga, H.M. (2011). arf3DS4: An Integrated Framework for Localization and Connectivity Analysis of fMRI Data. Journal of Statistical Software, 44(14), 1-33.

Other articles resulting from this Ph.D. project:

Van Duijvenvoorde, A.C.K., Figner, B., Jansen, B.R.J., Weeda, W.D., & Huizenga, H.M. (in preparation). Heuristic or Rational? The Neural Mechanisms Underlying Decision Strategies.

Harsay, H.A., Cohen, M., Spaan, M., Weeda, W.D., Nieuwenhuis, S., & Ridderinkhof, K.R. (submitted). Pupil Dilation Predicts Shifts Between Default-Mode and Task-Focused Brain Networks During Error Awareness.

Zeguers, M.H.T., Snellings, P., Tijms, J., Weeda, W.D., Tamboer, P., Bexkens, A., & Huizenga, H.M. (2011). Specifying Theories of Developmental Dyslexia: A Diffusion Model Analysis of Word Recognition. Developmental Science, 14(6), 1340-1354.


Empirical Support for a Drift Diffusion Model Account of the Worst Performance Rule

Abstract

The worst performance rule (WPR) states that people's slowest responses are the best predictors of their intelligence. Recently, Ratcliff et al. (2008) showed that, at least in theory, the WPR can be produced by a drift diffusion model (DDM) in which individual differences in IQ map onto individual differences in response caution or the rate with which information is extracted from the stimulus. The goal of this study was to test the drift diffusion model account of the WPR empirically and to see whether the WPR is affected by task manipulations (item difficulty and participant motivation). In a two-choice RT experiment with 46 high-school students, drift diffusion model analyses showed that high-IQ participants tended to have a higher drift rate (i.e., higher quality of information processing) and had, unexpectedly, a higher non-decision time (e.g., more time spent on execution of the motor response). We show that the WPR is present in the data, but that its presence is masked by the differences in non-decision time. Furthermore, we show that the WPR is not affected by task manipulations. Our results support the DDM account of the WPR, and show that a model-driven analysis of individual differences in IQ can be considerably more informative than a traditional analysis.


2.1 Introduction

Across a wide range of cognitive tasks, people with a high IQ tend to respond faster than people with a low IQ (Jensen, 2006). This regularity holds even for the most elementary tasks, such as when people have to decide which of two lines, presented side by side, is the longest. A general observation in these kinds of studies is that the statistical association with IQ is more pronounced for the slowest responses than it is for the fastest responses (Larson and Alderton, 1990). In other words, people's worst performance generally provides the best indication of their intelligence. The underlying mechanism of this "worst performance rule" (WPR) is not yet entirely understood, but the general consensus is that the WPR cannot be attributed to measurement errors or distributional confounds (Coyle, 2003). Instead, many researchers believe that the relation between IQ and instability in response time (RT) is caused by differences in neural structure or function (Jensen, 1992, 2006; Caryl, 1996; Deary and Caryl, 1997; Walhovd and Fjell, 2007; Fjell and Walhovd, 2007).

Recently, Ratcliff et al. (2008) argued that the WPR is consistent with the "drift diffusion model" (DDM, Ratcliff, 1978; Wagenmakers et al., 2008). This model assumes that people make decisions by accumulating noisy samples of information from the stimulus until reaching a predefined threshold of evidence. In the DDM, the observed behavior (i.e., response times and accuracy) is decomposed into latent psychological processes such as quality of information processing, response caution, a priori bias, and non-decision time (e.g., peripheral processes such as the execution of motor commands). Ratcliff et al. (2008) showed that, at least in theory, the DDM automatically generates the WPR when individual differences in IQ are reflected through individual differences in quality of information processing or in response caution1.

In this chapter, we provide an empirical test of the validity of the DDM account of the WPR. We administered a simple two-choice RT task to high-school students of different aptitude and then applied the DDM to the data. We expected to observe the WPR and we expected that IQ would be positively associated with quality of information processing and/or response caution.

1 In order to produce the WPR, the drift diffusion model requires that quality of information processing is higher, and response caution is lower, in participants with a higher IQ. In addition, the DDM also requires that the non-decision time component varies randomly from participant to participant (Ratcliff et al., 2008).

In addition we investigated whether the WPR is a general phenomenon or whether it is affected by task difficulty or the presence or absence of motivational incentives. If the WPR is a general phenomenon, it should be observed irrespective of manipulations of task difficulty and motivation. In addition, the association between IQ and quality of information processing and/or response caution should then not be affected by these manipulations.

The outline of this paper is as follows. First we discuss the DDM and its most important parameters. Then we briefly recapitulate the relation between the DDM and the WPR (Ratcliff et al., 2008). Next we describe the method of our study, the data, and the results of the model analyses. We end with a brief discussion.

2.1.1 The drift diffusion model

The drift diffusion model for two-choice decisions (e.g., Ratcliff, 1978; Ratcliff and Rouder, 2000; Ratcliff, 2006; Ratcliff et al., 2006, 2007; Voss et al., 2004; Wagenmakers et al., 2008) assumes that noisy stimulus information is accumulated over time until one of two response criteria is reached, after which a response is initiated. Figure 1.1 provides a schematic representation. In Figure 1.1, the gray lines each indicate the information accumulation for a single trial, starting at z, with four accumulations terminating at the correct boundary a and one terminating at the incorrect boundary 0. The mean rate with which the information accumulates is indicated by the black arrow marked v and is termed drift-rate. This parameter is interpreted as the quality of information processing: high values of v lead to responses that are both fast and accurate. The distance between the lower boundary 0 and the upper boundary a indicates boundary separation and is interpreted as response caution: high values of a lead to responses that are slow and accurate (i.e., high values of a reduce the impact of chance fluctuations on the outcome of the decision process). The starting point z reflects a priori bias – for instance, when a sequence of 100 trials contains 80 stimuli of category A and only 20 stimuli of category B, the participant is likely to expect another category A stimulus and move the starting point accordingly. When the starting point is not equidistant from the response boundaries, responses will be fast and accurate for one stimulus category, but slow and inaccurate for the other. The final parameter shown in Figure 1.1 is termed Ter and captures the component of response time that is unrelated to the decision process (e.g., the time it takes to encode the stimulus and to move a hand to press a button). The response time of a single trial consists of the sum of the non-decision time Ter and the decision time. In addition to the parameters shown in Figure 1.1, the DDM contains additional parameters that reflect the trial-to-trial variability in v, z, and Ter. All DDM parameters are estimated from the observed error rates and the observed response time distributions, both for correct and incorrect responses (Ratcliff, 2002).

The drift diffusion model account of the WPR

The DDM makes several qualitative predictions that hold regardless of the specific value of its parameters (Ratcliff, 2002). One of the most fundamental predictions concerns the change in the shape of the RT distributions as a function of drift-rate and boundary separation (e.g., Ratcliff et al., 2008; Wagenmakers et al., 2005; Wagenmakers and Brown, 2007). In particular, the model predicts that when responses slow down because of a lower drift-rate or a larger boundary separation, the RT distributions will show an increasing right-skew. That is, a change in either drift-rate or boundary separation produces relatively small effects for the fast responses, and relatively large effects for the slow responses. It is this inherent property of the DDM that allows it to produce the WPR.
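This prediction, that slowing the process through a lower drift-rate stretches the slow tail of the RT distribution far more than its fast edge, can be checked with a forward simulation of the model. A sketch with arbitrary parameter values:

```python
import random

def ddm_rt(v, a, ter=0.3, dt=0.002, s=0.1):
    """One unbiased (z = a/2) DDM trial; returns the RT in seconds."""
    x, t = a / 2.0, 0.0
    step_sd = s * dt ** 0.5
    while 0.0 < x < a:
        x += v * dt + random.gauss(0.0, step_sd)
        t += dt
    return ter + t

def quantile(xs, p):
    xs = sorted(xs)
    return xs[min(len(xs) - 1, int(p * len(xs)))]

random.seed(3)
high_drift = [ddm_rt(v=0.35, a=0.11) for _ in range(400)]
low_drift = [ddm_rt(v=0.15, a=0.11) for _ in range(400)]

# Lowering the drift-rate moves the .9 quantile much more than the .1
# quantile: the increasing right-skew that underlies the WPR.
shift_tail = quantile(low_drift, 0.9) - quantile(high_drift, 0.9)
shift_head = quantile(low_drift, 0.1) - quantile(high_drift, 0.1)
```

Both quantiles shift when drift-rate drops, but the tail shift dominates, which is exactly why the slowest responses carry the most information about drift-rate.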

Specifically, Ratcliff et al. (2008) showed that the DDM produces the WPR provided that individual differences in IQ map on to individual differences in drift-rate or boundary separation (and provided that there exist some random individual differences in non-decision time).

The following study was designed to test whether the DDM’s theoretical account of the WPR holds up against empirical data.


2.2 Method

2.2.1 Participants

Fifty-two high school students (28 female, age 15-17) participated for monetary reward.

2.2.2 Intelligence test

We measured IQ using Raven's Advanced Progressive Matrices (Raven et al., 1998). The test consisted of 36 items and was time limited to 20 minutes. The time-limited version of the test has been shown to be a good predictor of the version without a time limit (Hamel and Schmittmann, 2006). The final score of a participant was the number of correct answers.

2.2.3 Response time task

The stimuli of the response time task were taken from the π-paradigm (Vickers et al., 1972; Jensen, 1998). The stimuli of the π-paradigm consist of three lines placed in a π-like configuration (two vertical legs and one horizontal), with one of the vertical legs longer than the other (see Figure 2.1).

For this study the two vertical lines differed in length, the difference being 20 pixels ("easy items"), 10 pixels ("medium items"), or 5 pixels ("difficult items"). To minimize expectancy effects, the π was presented either in its normal orientation or upside down. This yielded 12 stimuli in total (left/right leg longer (2), difference in length (3), and inverted/normal position (2)). The order of the stimuli was randomized. To see whether the IQ – reaction time relation was affected by motivational factors, the experiment included two conditions with different instructions. To motivate participants, they were told that a monetary reward could be obtained if their performance was within the top three of all their classmates' performances. After that, they were told that only the trials in the real condition would be used to rate their performance; trials in the practice condition would be discarded.

Figure 2.1: Examples of the stimuli used in the experiment. Stimulus difficulty was varied on three levels. Note that the longer leg could be left or right, and that the π could be presented in its normal orientation or upside-down.

2.2.4 Procedure

Participants were tested in groups of at most five high-school students in a quiet classroom. Three instructors were present in the room at all times. First the Raven's Advanced Progressive Matrices was explained and administered for 20 minutes, with instructions to be as accurate and fast as possible. Next the instruction for the response time task was given and the task was administered for 25 minutes. Participants started with 100 "practice" trials and 100 "real" trials. These blocks were repeated until the end of the session. Between blocks a message indicated whether a practice block or a real block was up next. A single trial consisted of a cross shown in the center of the screen for 500 ms, the stimulus for 150 ms, and a mask for 500 ms. Participants were instructed to press the "z" key on the left of the keyboard when the left vertical line of the stimulus was longer than the right one, and the "m" key on the right of the keyboard when the right vertical line was longer than the left one. Participants were instructed to be as fast and accurate as possible.


2.3 Results

2.3.1 Preprocessing

Six participants were excluded from the analysis. One participant was excluded due to not completing the intelligence test. Four participants were excluded due to extremely low scores (lower than 5 correct answers) on the intelligence test (values fell outside 1.5 times the interquartile range (IQR) of IQ scores). Finally, one participant was excluded due to an overall percentage correct of around 24%. This left 46 participants for further analysis. Characteristics of the excluded participants can be found in the Supplementary Materials (2.5.1, Table 2.2). Data were collapsed over left/right and normal/inverted positions. The raw data were checked for outliers: RTs lower than 250 ms and higher than 2500 ms were removed (Ratcliff and Tuerlinckx, 2002). On average 7.7% of trials were removed, leaving a mean of 455 (sd = 87) completed trials per participant. The mean score on the Raven's was 16.30 (sd = 3.94). Behavioral results can be found in the Supplementary Materials (2.5.2, Table 2.3).
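The outlier rule just described amounts to a simple filter on the raw RTs; a sketch with hypothetical data:

```python
def trim_rts(rts, lower=250, upper=2500):
    """Drop RTs (in ms) outside [lower, upper] (cf. Ratcliff & Tuerlinckx, 2002)."""
    kept = [rt for rt in rts if lower <= rt <= upper]
    removed_pct = 100.0 * (len(rts) - len(kept)) / len(rts)
    return kept, removed_pct

kept, pct = trim_rts([120, 430, 455, 510, 2900, 610])
# kept == [430, 455, 510, 610]
```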

2.3.2 Worst Performance Rule

The WPR was first reported by Larson and Alderton (1990). They divided the observed response times for individual participants into different "RT bands", ranging from slow to fast, computed the mean RT for each band, and correlated these mean RTs with several measures of intelligence. Figure 2.2 plots a subset of the original results reported in Larson and Alderton (1990). It is clear from Figure 2.2 that the correlation with IQ is always negative and is most pronounced at the highest RTs. In this article we used a procedure similar to that of Larson and Alderton (1990): we divided the RT distribution into quantiles, from fastest to slowest (with the lower quantiles measuring the fast responses and the higher quantiles measuring the slow responses). The WPR predicts that correlations become more pronounced with increasing quantile, and that all quantiles correlate negatively with IQ.
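This banding analysis can be written compactly. A sketch, where `rts_per_subject` and `iq` are hypothetical per-participant data:

```python
import statistics

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def wpr_correlations(rts_per_subject, iq, quantiles=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Correlate each RT quantile with IQ across participants
    (cf. Larson & Alderton, 1990)."""
    corrs = []
    for q in quantiles:
        band = []
        for rts in rts_per_subject:
            srt = sorted(rts)
            band.append(srt[min(len(srt) - 1, int(q * len(srt)))])
        corrs.append(pearson(band, iq))
    return corrs

# Toy data in which the slow tail stretches as "IQ" decreases:
rts_per_subject = [[300, 320, 400, 700],    # IQ 120
                   [310, 330, 420, 900],    # IQ 105
                   [320, 350, 500, 1200]]   # IQ 90
iq = [120, 105, 90]
corrs = wpr_correlations(rts_per_subject, iq, quantiles=(0.25, 0.75))
```

In this toy example both bands correlate negatively with IQ, and the slow band more strongly so, which is the WPR pattern.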


Figure 2.2: RT band correlations with intelligence. Figure based on data reported in Larson and Alderton (1990, Table 4, p. 317).

Figure 2.3 shows the correlation of IQ with RT quantiles for easy, medium, and difficult items (black lines) in the practice and real conditions. As the WPR predicts, the correlation between RT quantiles and IQ appears to decrease with increasing quantile. However, the first quantiles correlate positively with IQ, and the negative correlation for the slowest quantiles is not convincing. This effect is even more pronounced in the real condition than in the practice condition. In the real condition all quantiles are positively associated with IQ. In sum, this study may at first glance appear to find only limited support for the presence of the WPR.

2.3.3 Drift diffusion model parameters

The DDM was fit to the data of each participant separately (Voss and Voss, 2007, 2008). In the model, the drift-rate parameter (v) was allowed to vary between item difficulty and motivation conditions. The boundary separation parameter (a) and non-decision time parameter (Ter) were constrained to be equal across item difficulty. The parameters that reflect trial-to-trial variability (η, variability of drift-rate; St, variability of non-decision time; Sz, variability of starting point) were constrained to be equal across item difficulty and motivation. The starting point parameter was constrained to be equidistant from both response boundaries (i.e., z = a/2). This pattern of constraints was motivated both by a desire for parsimony and by theoretical considerations of plausibility. Drift-rate may depend both on the difficulty of the stimulus and on the motivation of the participant. Boundary separation is predefined and should not be informed by task difficulty when items are presented in a random order, but boundary separation might differ between motivation conditions (i.e., participants might be more or less cautious when motivated to perform optimally). Non-decision time should likewise not be influenced by item difficulty but could be influenced by motivation (i.e., when motivated, participants might be more attentive and therefore make faster motor responses).

Figure 2.3: Correlation of IQ with RT quantiles for easy (solid line), medium (dashed line), and difficult (dotted line) items. Black lines indicate WPR based on uncorrected RT, gray lines indicate WPR based on Ter-corrected RT.

Table 2.1 shows the mean and standard deviation of the parameter estimates. Figure 2.4 shows the fit of the model to the data at the .5 quantile (i.e., the median). As can be seen from Figure 2.4, the model fits the data well. Deviance from the diagonal, an indication of model misfit, is mainly observed in incorrect conditions where there is a low number of responses. Fits for the other quantiles can be found in the Supplementary Materials (2.5.3, Figures 2.5 and 2.6).

Table 2.1: Mean and standard deviation of drift diffusion model parameters over participants. Bottom row shows correlations of drift diffusion model parameters with IQ. Note that values for z are omitted as these were fixed at z = a/2.

Motivation   Difficulty   v (sd)          a (sd)          Ter (sd)        η (sd)          Sz (sd)         St (sd)
Practice     Easy         3.477 (1.110)
             Medium       2.966 (0.987)
             Difficult    2.714 (1.045)
             Overall                      0.932 (0.288)   0.302 (0.045)
Real         Easy         3.459 (1.360)
             Medium       3.060 (1.108)
             Difficult    2.676 (1.052)
             Overall                      0.890 (0.232)   0.304 (0.046)
Overall                                                                   0.201 (0.242)   0.243 (0.155)   0.123 (0.073)
Correlation with IQ (p-value)                                             0.023 (0.881)   .004 (0.980)    −.114 (0.452)

Drift-rate

A repeated-measures ANOVA with item difficulty and motivation as within factors and IQ as continuous between-subjects factor indicated that there was a significant main effect of item difficulty (F (2, 88) = 32.96; p < .001): drift-rate decreased with increasing item difficulty. The effect of IQ tended to be significant (F (1, 44) = 3.99; p = .052): drift-rate was higher for participants with a higher IQ score. The main effect of motivation was not significant (F (1, 44) = .01; p = .916). There were no significant interaction effects between IQ and either difficulty (F (2, 88) = .39; p = .677) or motivation (F (1, 44) = .953; p = .334).


Figure 2.4: Fit for the drift diffusion model per condition. X-axis indicates observed RT quantile, y-axis indicates estimated RT quantile. Points (black) indicate correct responses, crosses (gray) indicate incorrect responses.

Boundary separation

A repeated-measures ANOVA with motivation as within factor and IQ as continuous between-subjects factor indicated that there was a trend towards an effect of motivation (F (1, 44) = 3.07; p = .087): boundary separation was higher in the practice than in the real condition. There was no main effect of IQ (F (1, 44) = .06; p = .816). There was no interaction effect between motivation and IQ (F (1, 44) = 1.69; p = .200).

Non-decision time

A repeated-measures ANOVA with motivation as within factor and IQ as continuous between-subjects factor indicated that there was no main effect of motivation (F (1, 44) = .12; p = .736). However, there was a main effect of IQ (F (1, 44) = 5.56; p = .023). Unexpectedly, high IQ participants were characterized by a longer non-decision time (r = .335; p = .023). Motivation and IQ did not interact (F (1, 44) = .23; p = .634). For the other DDM parameters (constrained across item difficulty and motivation), correlations with IQ were calculated. Table 2.1 (bottom row) shows that these parameters did not correlate significantly with IQ.

Worst Performance Rule

As mentioned earlier, we found a positive association of IQ and non-decision time, indicating that high IQ participants tended to react slower than low IQ participants. The effect of the non-decision parameter is to shift the entire RT distribution by an equal amount. This effect of non-decision time has two consequences: First, it will counteract the effects of drift-rate and boundary separation on reaction time, and, second, it will cause the fastest responses to be positively correlated with IQ (meaning that the very fastest responses of the high IQ participants will be slower than those of the low IQ participants) – an effect that could explain why we did not find the WPR in its pure form. This latter consequence is also evident from an inspection of Figure 2.3.

In order to test the hypothesis that the non-decision time component is effectively masking the WPR, we generated predictions of the DDM after assigning all participants the same value of non-decision time within each motivation condition (i.e., practice Ter = .302, real Ter = .304). All other parameters were left unchanged at their values from the original analysis. We then calculated quantile-correlation plots for the data predicted by the DDM with an equal non-decision time component. The result, shown by the gray lines in Figure 2.3, demonstrates evidence for the WPR: high IQ participants are faster than the low IQ participants in all quantiles, and the magnitude of the correlation increases with increasing quantile. In other words, the DDM reveals that the data are consistent with the WPR as long as the non-decision time does not contaminate the results.
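The logic of such a correction can be illustrated directly on per-participant RTs by removing each participant's estimated non-decision time before the banding analysis. This subtraction is a simplification of the model-based prediction approach described above, and all data below are hypothetical:

```python
import statistics

def pearson(x, y):
    mx, my = statistics.mean(x), statistics.mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def band(rts, p):
    srt = sorted(rts)
    return srt[min(len(srt) - 1, int(p * len(srt)))]

def wpr(rts_per_subject, iq, ter=None, quantiles=(0.2, 0.5, 0.8)):
    """Quantile-IQ correlations, optionally after removing each subject's Ter."""
    if ter is None:
        ter = [0.0] * len(rts_per_subject)
    return [pearson([band([rt - t0 for rt in rts], p)
                     for rts, t0 in zip(rts_per_subject, ter)], iq)
            for p in quantiles]

# Hypothetical data: the high-IQ participant has the fastest decision times
# but the slowest non-decision time, which masks the WPR in the raw RTs.
rts = [[0.55, 0.60, 0.70, 0.95],   # IQ 120, Ter 0.40
       [0.50, 0.58, 0.75, 1.10],   # IQ 105, Ter 0.33
       [0.48, 0.60, 0.82, 1.30]]   # IQ 90,  Ter 0.28
iq = [120, 105, 90]
ter = [0.40, 0.33, 0.28]
raw = wpr(rts, iq)             # fastest band correlates positively with IQ
corrected = wpr(rts, iq, ter)  # all bands correlate negatively with IQ
```

In the raw analysis the fastest band correlates positively with IQ, exactly the masking pattern described above; after equalizing the non-decision component, every band correlates negatively.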


2.4 Discussion

The Worst Performance Rule (WPR) states that the association between IQ and reaction time is most pronounced in an analysis that focuses on worst performance, that is, on the slowest responses. More specifically, the WPR states that the correlation between IQ and RT is negative, and that this correlation decreases with increasing RT. Ratcliff et al. (2008) provided a theoretical DDM account of this WPR, predicting that it should originate in the fact that high IQ is associated with either increased quality of information processing or decreased response caution. The purpose of the present study was to test whether the DDM account of the WPR holds up against empirical data. That is, we expected to observe the WPR and we expected that IQ would be positively associated with either increased quality of information processing or decreased response caution.

In a simple perceptual discrimination task, we found initially unconvincing evidence for the WPR: although correlations between IQ and RT decreased with increasing RT, the correlations were positive instead of negative. The DDM analysis indicated that IQ tended to be associated with quality of information processing: the higher the IQ, the higher the quality of information processing. IQ was not associated with response caution. Unexpectedly, IQ was also associated with non-decision time, that is, participants with a high IQ were characterized by slower non-decision related processes. This effect on non-decision time counteracted the effects of drift-rate on the WPR, with the overall result that direct evidence for the WPR was unconvincing. The data were in fact consistent with the WPR once the effect of non-decision time was corrected for. The present study therefore provides support for the DDM account of the WPR (Ratcliff et al., 2008): the WPR originates in the fact that IQ is associated with increased quality of information processing.

A second purpose of this study was to investigate whether the WPR is a general phenomenon or whether it is affected by manipulations of task difficulty or motivation. The results from the Ter-corrected analysis show that the WPR is not affected by difficulty: the slopes of the WPR are the same at each difficulty level. The WPR is also not affected by differences in motivation: the first quantiles correlate negatively with IQ


and the correlations increase in strength with increasing quantile. Moreover, there were no significant interaction effects between IQ and difficulty, or between IQ and motivation, on drift-rate or boundary separation; however, the absence of these effects might be due to the low power of this study given the small number of participants.

It is as yet unclear why lower IQ was associated with a shorter non-decision time, although this result is consistent with Schmiedek et al. (2007), who found that drift-rate and non-decision time correlated positively. The unexpected effect on Ter might explain why not all studies find the WPR (e.g., Salthouse, 1998). This possibility highlights one of the major advantages of using a diffusion model to analyze two-choice RT data: the model allows the researcher to draw substantive conclusions with respect to underlying psychological processes. This is especially useful when certain effects are masked in the behavioral measures, as was the case in the present data.

To summarize, we have shown that the DDM provides an account of IQ-related differences in response times that is much more detailed than the one provided by standard analysis. First, a higher quality of information processing tended to be associated with a higher IQ. This is consistent with results from most diffusion model studies regarding reaction times and IQ (Ratcliff et al., 2010; van Ravenzwaaij et al., 2011). Second, the worst performance rule is shown to be independent of task manipulations of item difficulty or participant motivation. Finally, the DDM is able not only to generate the worst performance rule when it is evidently in the data, but also to uncover the worst performance rule when it is masked by other effects.


2.5 Supplementary Materials.

2.5.1 Characteristics of excluded participants

Table 2.2: Characteristics of excluded participants.

ID   IQ   Age   Sex      MRT   SDRT   Pc    n
11   0    NA    female   450   108    .92   599
13   5    15    male     514   152    .98   393
18   NA   NA    female   487   229    .70   369
35   0    NA    male     381   77     .91   589
37   2    16    male     475   139    .94   585
48   17   15    female   332   46     .24   488

2.5.2 Behavioral measures

Table 2.3 shows the behavioral data, that is, mean reaction time (MRT), standard deviation of reaction time (SDRT), and percentage correct (Pc). A repeated-measures ANOVA with item difficulty and motivation condition as within factors and IQ as continuous between-subjects factor was performed. IQ score had no effect on MRT or SDRT, but did have a significant effect on accuracy: participants with higher IQ scores were more accurate (F (1, 44) = 4.199; p = .046).

Item difficulty had an effect on MRT (F (2, 88) = 21.71; p < .001): participants responded faster to the easy items than to the difficult items (p < .001) and the medium items (p < .001). Item difficulty had an effect on SDRT (F (2, 88) = 6.16; p = .003): participants' responses were less variable in the easy condition than in the difficult condition (p = .001). Item difficulty had an effect on Pc (F (2, 88) = 31.49; p < .001): participants were more accurate in the easy condition than in the medium (p < .001) and difficult conditions (p < .001).


(F (1, 44) = 4.01; p = .052): participants’ responses were slower and more variable in the practice condition than in the real condition.

There was a significant interaction between item difficulty and IQ on MRT (F (2, 88) = 3.36; p = .039): the correlation of MRT and IQ was positive in the easy (r = .044; p = .772) and medium (r = .028; p = .854) conditions and negative in the difficult (r = −.021; p = .892) condition, although all were non-significant. In addition there was a significant interaction between motivation and IQ on MRT (F (1, 44) = 6.13; p = .017): the correlation of MRT and IQ was positive in the real condition (r = .119; p = .433) and negative in the practice condition (r = −.097; p = .522), although both were non-significant. Finally, there was a significant interaction between item difficulty and motivation on MRT (F (2, 88) = 4.14; p = .019) and SDRT (F (2, 88) = 5.69; p = .005). In the real condition MRT increased linearly with item difficulty, while in the practice condition responses to the medium and difficult conditions showed no differences. In the practice condition SDRT increased between easy and medium items, whereas in the real condition this increase was observed between medium and difficult items.

Table 2.3: Mean and standard deviation (sd) of the behavioral measures (MRT, SDRT, and percentage correct (Pc)) for practice/real and difficulty conditions. MRT and SDRT are in milliseconds (ms), n = 46

Motivation        Difficulty   MRT (sd)         SDRT (sd)        Pc (sd)
Practice trials   Easy         440.60 (83.54)   110.96 (73.05)   0.91 (0.08)
                  Medium       456.16 (83.22)   138.79 (69.61)   0.88 (0.09)
                  Difficult    453.43 (92.21)   137.94 (84.43)   0.86 (0.10)
Real trials       Easy         427.01 (78.35)   111.56 (76.42)   0.90 (0.09)
                  Medium       434.39 (80.53)   103.99 (59.63)   0.87 (0.10)
                  Difficult    445.47 (83.64)   122.44 (74.19)   0.85 (0.11)


2.5.3 Drift diffusion model fits

Figure 2.5: Fit for the drift diffusion model per item difficulty for the practice trials. X-axis indicates observed RT quantile, y-axis indicates estimated RT quantile. Points (black) indicate correct responses, crosses (gray) indicate incorrect responses.


Figure 2.6: Fit for the drift diffusion model per item difficulty for the real trials. X-axis indicates observed RT quantile, y-axis indicates estimated RT quantile. Points (black) indicate correct responses, crosses (gray) indicate incorrect responses.


Simultaneous Estimation of Waveform, Amplitude, and Latency of Single-Trial EEG/MEG Data

Abstract The amplitude and latency of single-trial EEG/MEG signals may provide valuable information concerning human brain functioning. In this chapter we propose a new method to reliably estimate single-trial amplitude and latency of EEG/MEG signals. The advantages of the method are fourfold. First, no a-priori specified template function is required. Second, the method allows for multiple signals that may vary independently in amplitude and/or latency. Third, the method is less sensitive to noise as it models data with a parsimonious set of basis functions. Finally, the method is very fast since it is based on an iterative linear least squares algorithm. A simulation study shows that the method yields reliable estimates under different levels of latency variation and signal to noise ratios. Furthermore, it shows that the existence of multiple signals can be correctly determined. An application to empirical data from a choice reaction time study indicates that the method describes these data accurately.


3.1 Introduction

Single-trial amplitude and latency of EEG/MEG signals may contain valuable information concerning human brain functioning. Amplitude and latency of signals may change during the course of an experiment, for example due to learning or habituation. In addition, particular groups of subjects may be characterized by increased amplitude or latency variation. For example, increased latency variation may be associated with ADHD (Geurts et al., 2008), ageing (Fein and Turetsky, 1989; Fjell et al., 2009), and low intelligence scores (De Pascalis et al., 2008).

Studying inter-trial differences requires that estimates of single-trial amplitude and latency are accurate and reliable. This may be a daunting task given the complexity of EEG/MEG data: single-trial EEG/MEG data have a low signal-to-noise ratio (SNR, Fein and Turetsky, 1989), and are usually composed of signals from multiple brain processes (Gevins, 1984).

Several methods have been proposed to derive single-trial amplitudes and latencies. These methods differ in several ways. First, some methods require an a-priori template function, whereas other methods do not. That is, some methods require that the shape of the signal of interest is defined before analysis (for example, Woody, 1967; Mayhew et al., 2006). Second, some methods only allow for either amplitude or latency variation (for example, Pham et al., 1987), whereas others incorporate both types of variation (for example, Jaskowski and Verleger, 1999). Third, some methods assume that the data consist of one underlying signal (for example, Pham et al., 1987; Jaskowski and Verleger, 1999), whereas others allow multiple signals, each with their own amplitude and latency variation (for example, Mayhew et al., 2006). The latter is certainly an advantage, since it might very well be the case that some early signals do not show marked inter-trial variability whereas some later signals do. Fourth, some methods are susceptible to noise (cf. Jaskowski and Verleger, 2000), whereas in others this susceptibility is reduced by incorporating basis functions. The purpose of the present paper is to combine the strengths of all these methods into one framework: Simultaneous Waveform, Amplitude and Latency Estimation (SWALE). First, however, we review existing methods in more detail.


A common, and simple, approach to obtain single-trial estimates is peak picking. Peak picking entails smoothing of single-trial data with a low-pass filter and searching for the signal maximum within a specified time window to determine amplitude and latency in each trial (Picton et al., 2000). Advantages are that no template has to be defined, and that both amplitude and latency can be estimated. However, it is not possible to test whether multiple signals are present. Furthermore, the method is very susceptible to noise (Jaskowski and Verleger, 2000).
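Peak picking can be sketched in a few lines: smooth each trial (here a simple moving average stands in for a proper low-pass filter) and take the maximum within a search window. The window bounds, sampling rate, and noiseless toy trial below are all hypothetical:

```python
import math

def moving_average(x, w=5):
    """Crude low-pass filter: centered moving average with odd window w."""
    h = w // 2
    out = []
    for i in range(len(x)):
        seg = x[max(0, i - h): i + h + 1]
        out.append(sum(seg) / len(seg))
    return out

def peak_pick(trial, window, sfreq=250.0):
    """Return (latency_in_s, amplitude) of the smoothed maximum in the window."""
    lo, hi = window
    smooth = moving_average(trial)
    idx = max(range(lo, hi), key=lambda i: smooth[i])
    return idx / sfreq, smooth[idx]

# Toy single trial: a Gaussian bump peaking at sample 60
trial = [5.0 * math.exp(-((i - 60) ** 2) / 50.0) for i in range(100)]
latency, amplitude = peak_pick(trial, window=(40, 80))
# latency == 60 / 250 == 0.24 s
```

On real single-trial data the noise susceptibility mentioned above shows up immediately: a noise excursion inside the window can capture the maximum despite the smoothing.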

A different approach is to explicitly model the signal in each single trial. Pham et al. (1987) assume that an EEG/MEG trial can be modeled by a waveform with trial-specific latency. Parameters of the waveform and trial-specific latencies are estimated in the frequency domain. This method was extended by Jaskowski and Verleger (1999) to also allow estimation of trial-specific amplitudes. Major advantages are that no template is required, since the waveform is estimated, and that the method incorporates both trial-varying amplitudes and latencies. Disadvantages are that the method does not allow for multiple signals and that the method does not perform optimally in low SNR conditions (Jaskowski and Verleger, 2000).

A different modeling approach is based on a technique that is also used in the analysis of fMRI data. In fMRI analysis the haemodynamic response (i.e., the response of the brain to a stimulus) is often modeled by a waveform plus its first-order derivative (see Figure 3.1) to allow for differences in latency (Friston et al., 1998).

Mayhew et al. (2006) used this method to estimate single-trial EEG amplitude and latency in a multiple linear regression framework. By regressing the data on a-priori specified template functions and their derivatives, estimates of single-trial amplitude and latency are obtained. The method has the advantage that it can be used to estimate single-trial amplitude and latency of multiple signals. However, it requires that a template is specified for each signal. Mayhew et al. (2006) use the average data as a template; however, this template might be biased in the presence of large latency variation (Handy, 2004; Woody, 1967). Another disadvantage is that the template requires as many parameters as there are timepoints, and is therefore very susceptible to noise.


Figure 3.1: Modeling latency using waveform and its first-order derivative.
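A hedged sketch of this regression approach (the function name and sign convention are ours, not from Mayhew et al., 2006): each trial is regressed on a template and its derivative, and the ratio of the two coefficients gives a first-order latency estimate. For a shift to a later latency, y(t) ≈ a·f(t − l) ≈ a·f(t) − a·l·f′(t), so the latency is recovered as −b/a:

```python
import numpy as np

def template_regress(trial, template, dt):
    """Estimate single-trial amplitude and latency by multiple linear
    regression on a template and its first-order derivative."""
    d = np.gradient(template, dt)        # d(template)/dt
    X = np.column_stack([template, d])   # design matrix: [f, f']
    (a, b), *_ = np.linalg.lstsq(X, trial, rcond=None)
    return a, -b / a                     # amplitude, latency shift
```

This first-order approximation is only valid for latency shifts that are small relative to the width of the template, which is one reason large latency variation biases the approach.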

In the present paper we combine the aforementioned methods into the SWALE framework. More specifically, we extend the approach of Mayhew et al. (2006) such that no a-priori template is required and such that noise sensitivity is diminished. Instead of a template we estimate waveforms from the data (cf. Jaskowski and Verleger, 1999) and model these waveforms with a parsimonious set of basis functions that reduces noise sensitivity. The framework can be extended to model data with multiple waveforms and it can explicitly test for the necessary number of waveforms. Furthermore, estimation of parameters is embedded in an iterative least squares framework (de Munck et al., 2004) and is therefore very fast.

In the following we first explain the method in more detail: we formulate the model, outline parameter estimation, and indicate how the optimal number of basis functions and the optimal number of waveforms can be obtained by means of statistical model selection. Second, we report a simulation study on the characteristics of the method. Third, we illustrate the method with an analysis of empirical data obtained in a choice reaction time (CRT) experiment. Finally, we discuss advantages and limitations and provide some extensions.

3.2 Methods

The rationale of the SWALE framework is to model single EEG/MEG trials by the sum of (i) an overall waveform and (ii) its first-order derivative scaled by a parameter that depends on the trial-specific latency; both terms are in turn scaled by a trial-specific amplitude parameter (Figure 3.2). This model is easily extended to allow each trial to be described by multiple waveforms (each representing an underlying signal). We will first treat the single-waveform case and then extend it to multiple waveforms.

3.2.1 Model

The EEG/MEG data are in the (M × T) matrix Y consisting of m = 1, ..., M trials of length T. Each single trial y_m can now be modeled as a waveform plus its derivative:

y_m = a_m[(Qf) + l_m(Df)] + ε_m    (3.1)

In Eq. 3.1, Q is a (T × P) matrix containing the P basis functions, D is a (T × P) matrix containing the first-order derivatives of the basis functions, and f is a (P × 1) vector containing the P coefficients of the waveform. a_m is the trial-specific amplitude parameter and l_m is the trial-specific latency parameter. ε_m is the noise term, distributed as N(0, σ_ε).

In order to model all trials at once we rewrite Eq. 3.1. We first move f outside the brackets and replace a_m l_m with b_m. The model then becomes:

y_m = [a_m Q + a_m l_m D]f + ε_m  ⇔  y_m = [a_m Q + b_m D]f + ε_m    (3.2)


Figure 3.2: Effect of amplitude and latency parameters on the single-trial model. Solid black lines indicate the single-trial model. Dashed lines indicate the average model. Light grey lines indicate the data.


Stacking all M trials, the model can be rewritten as:

vec(Y) = [Q ⊗ a + D ⊗ b]f + ε (3.3)

vec(Y) contains the column-wise stacked data Y. a is an (M × 1) vector containing the single-trial amplitude parameters, and b is an (M × 1) vector containing the single-trial latency parameters. ⊗ denotes the Kronecker product.
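The Kronecker construction in Eq. 3.3 can be verified numerically. The following check (with arbitrary random matrices, and assuming vec(Y) stacks the columns of the (M × T) matrix Y) confirms that the stacked form reproduces the per-trial model of Eq. 3.2:

```python
import numpy as np

rng = np.random.default_rng(0)
M, T, P = 5, 100, 8
Q = rng.normal(size=(T, P))   # basis functions
D = rng.normal(size=(T, P))   # their derivatives
f = rng.normal(size=P)        # waveform coefficients
a = rng.normal(size=M)        # trial amplitude parameters
b = rng.normal(size=M)        # trial latency parameters (a_m * l_m)

# per-trial model, Eq. 3.2: y_m = [a_m Q + b_m D] f
Y = np.stack([(a[m] * Q + b[m] * D) @ f for m in range(M)])   # (M, T)

# stacked model, Eq. 3.3: vec(Y) = [Q (x) a + D (x) b] f
vecY = (np.kron(Q, a[:, None]) + np.kron(D, b[:, None])) @ f

assert np.allclose(Y.flatten(order="F"), vecY)   # column-wise vec(Y)
```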

In this model the type and number of (orthogonal) basis functions must be set a priori (matrix Q). Note that the sufficient number of basis functions can be determined via model selection (see 3.2.5). In general any set of flexible basis functions can be used to model the waveform. In the current implementation we use a set of orthogonal polynomial basis functions since they are flexible enough to describe the waveforms. Also, polynomial basis functions are easy to compute and their derivatives can be obtained analytically. By default the number of basis functions is set to 20.
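Such a basis can be constructed, for instance, from Legendre polynomials (one concrete choice of orthogonal polynomials; the text does not prescribe this particular family), with the derivatives in D obtained analytically:

```python
import numpy as np
from numpy.polynomial import legendre

def poly_basis(T, P):
    """Q (T x P): orthogonal (Legendre) polynomial basis sampled on [-1, 1];
    D (T x P): their analytic first-order derivatives."""
    t = np.linspace(-1.0, 1.0, T)
    Q = np.empty((T, P))
    D = np.empty((T, P))
    for p in range(P):
        c = np.zeros(p + 1)
        c[p] = 1.0                                   # p-th Legendre polynomial
        Q[:, p] = legendre.legval(t, c)
        D[:, p] = legendre.legval(t, legendre.legder(c))
    return Q, D

Q, D = poly_basis(200, 20)   # the default of 20 basis functions
```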

3.2.2 Parameter estimation

The SWALE model thus estimates both the waveform and the trial-specific amplitude and latency parameters from the data. Parameter estimation is split into two parts that are applied iteratively until convergence: estimation of the waveform (f) and estimation of the single-trial amplitude (a) and latency (b) parameters. Each part has a linear solution and thus can be solved easily. For the estimation of the waveform the least squares estimator is:

f̂ = ((Q ⊗ a + D ⊗ b)′(Q ⊗ a + D ⊗ b))⁻¹ (Q ⊗ a + D ⊗ b)′ vec(Y)    (3.4)

For estimation of single-trial amplitude and latency the least squares estimator is given by:

vec(â b̂) = ([I ⊗ (Qf̂) I ⊗ (Df̂)]′[I ⊗ (Qf̂) I ⊗ (Df̂)])⁻¹ [I ⊗ (Qf̂) I ⊗ (Df̂)]′ vec(Y)    (3.5)

where I is the (M × M) identity matrix.

The estimation procedure thus consists of two parts that can each be solved linearly. Applying these two parts iteratively leads to convergence of the overall solution (cf. de Munck et al., 2004). The detailed procedure is as follows (see Figure 3.3): First, the grand average of the data is used as the starting waveform for the iteration procedure. For this waveform the amplitude and latency parameters are estimated using Eq. 3.5. This set of amplitude and latency parameters is thereafter used to re-estimate the waveform using Eq. 3.4. This waveform will be slightly different from the starting waveform (but closer to the actual waveform). Subsequently, the amplitude and latency parameters for this updated waveform are estimated using Eq. 3.5. These steps (calculating amplitude/latency parameters and the waveform) are repeated until the decrease in the residual sum of squares is negligible. Note that the starting waveform does not need to have any relation with the estimated waveform.
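The alternating procedure can be sketched as follows. This is a simplified single-waveform illustration under our own conventions, not the authors' implementation: the latency parameters are returned as b/a in the time units of the derivative basis, and note the scale indeterminacy between f and a (only their product is identified), which a practical implementation would resolve by normalizing f after each iteration:

```python
import numpy as np

def swale_fit(Y, Q, D, max_iter=100, tol=1e-10):
    """Alternating least squares for the single-waveform model.
    Y: (M, T) matrix of trials; Q, D: (T, P) basis and derivative basis.
    Returns waveform coefficients f, amplitudes a, and latencies b/a."""
    M, T = Y.shape
    vecY = Y.flatten(order="F")              # vec(Y), columns stacked
    a, b = np.ones(M), np.zeros(M)           # start: grand-average waveform
    rss_old = np.inf
    for _ in range(max_iter):
        # Eq. 3.4: estimate the waveform given (a, b)
        X = np.kron(Q, a[:, None]) + np.kron(D, b[:, None])
        f, *_ = np.linalg.lstsq(X, vecY, rcond=None)
        # Eq. 3.5: estimate (a_m, b_m) for each trial given the waveform
        G = np.column_stack([Q @ f, D @ f])
        for m in range(M):
            a[m], b[m] = np.linalg.lstsq(G, Y[m], rcond=None)[0]
        # residual sum of squares; stop when the decrease is negligible
        rss = np.sum((Y - np.outer(a, Q @ f) - np.outer(b, D @ f)) ** 2)
        if rss_old - rss < tol:
            break
        rss_old = rss
    return f, a, b / a
```

Both updates are ordinary least squares problems, which is what makes each half of the iteration linear and fast.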


3.2.3 Estimating single-trial amplitude and latency

After convergence of the estimation procedure, single-trial estimates for a specific peak of interest can be obtained. To do so, one must first identify the peak of interest in the average model (by specifying a range). For example, in Figure 3.4 (left panel) the positive deflection at 300 ms is selected as the peak of interest. To obtain amplitude and latency estimates for this peak, the maximum/minimum deflection within the range is identified in each modeled single trial (Eq. 3.3; Figure 3.4, right panel, dark gray point). The time point of this maximum/minimum is the latency of the single trial; the value at this latency is taken as the amplitude of the single trial. This procedure can also be followed for models with multiple waveforms (see 3.2.4); the estimation is then performed for each waveform separately.

Figure 3.4: Single-trial detection. First a peak of interest must be identified in the averaged model by specifying a range (left panel, dark gray bar). Second, this range is used to estimate maxima/minima at each single trial (right panel, dark gray point).
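The peak extraction described above amounts to a windowed maximum/minimum over the model-reconstructed trials. A minimal sketch (our own function name; `polarity` selects maxima or minima):

```python
import numpy as np

def single_trial_peaks(Y_model, times, window, polarity=1):
    """Amplitude and latency of one peak per trial.
    Y_model: (M, T) modeled single trials (Eq. 3.3 reconstruction);
    window: (t0, t1) range around the peak in the average model;
    polarity: +1 to search for maxima, -1 for minima."""
    t0, t1 = window
    idx = np.where((times >= t0) & (times <= t1))[0]
    peak = idx[np.argmax(polarity * Y_model[:, idx], axis=1)]
    latency = times[peak]
    amplitude = Y_model[np.arange(Y_model.shape[0]), peak]
    return amplitude, latency
```

Because the search runs over the modeled (noise-reduced) trials rather than the raw data, it avoids the noise susceptibility of plain peak picking.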
