Human-Machine Communication Department of Artificial Intelligence University of Groningen, The Netherlands

Master of Science Thesis

Hor-I-Spell, a mental typewriter based on a beta-burst brain switch

by

Emil Kraaikamp

S1338994

Internal supervisor: prof. dr. L. Schomaker

External supervisor: prof. dr. R. de Jong

Groningen, March 2012

Abstract

A brain-computer interface (BCI) provides a non-muscular communication channel for people with and without disabilities. State-of-the-art EEG-based BCI spelling applications traditionally make use of two-class motor-imagery tasks, based on mu-band activity (8-12 Hz), to move a cursor and select letters presented on a computer screen. However, a continuing study at the University of Groningen has shown that beta-band activity (16-26 Hz) could be used as a reliable binary selection signal. In this study a non-invasive EEG-based BCI spelling application was developed based on such a single binary selection signal.

In the first stage of the study, several spelling applications were developed and tested in an extensive offline user study to find the best-performing application based on a single binary selection signal. In the second stage, this application, Hor-I-Spell, was applied in an EEG setting to develop a semi-automatic calibration method, which was then used to operate Hor-I-Spell in the final stage of the experiment.

Although two subjects were able to gain periodic control over the spelling application, this was not enough to operate the Hor-I-Spell spelling application reliably and let the subjects type sentences. Further research needs to be done to establish whether or not the performance of subjects can be improved after extensive training.

Acknowledgements

I would like to thank Ritske de Jong for his enthusiasm and invaluable advice throughout the entire project; Mark Span for his interesting sense of humor as well as his practical assistance during the EEG experiments; my roomy, Paolo Toffanin, for the many talks in the coffee room and beyond, as well as for his great expertise on EEG experiments; and Lambert Schomaker for his advice and feedback, especially during the early and final stages of the Master project.

And last but not least, my brother, sister and parents for their continued support throughout my many years of studies.

Chapter 1

Introduction

Brain-computer interfaces (BCIs) focus on providing a direct way of communication between the human brain and a computer. Instead of using motor movements to operate an application, thought alone is used to gain control over a specific task. One application of BCIs is to provide a way of communication for those with locked-in syndrome, as they have lost control over all motor movements [3].

Other applications include tasks like operating a prosthetic limb [36] or a motorised wheelchair [13]. The computer gaming industry has also shown interest in BCI systems and has developed them (in [35]).

Most BCI systems use electroencephalography (EEG) to record electrical activity along the scalp. This EEG data is then used to extract one or more control signals to allow the operation of a (software) device. Other neuroimaging measurements have also been used in BCI systems, such as functional Magnetic Resonance Imaging (fMRI) [38] and Near Infrared (NIR) [10], but EEG has an advantage in that it can measure cortical potentials in real time. This is important for achieving reliable control in a BCI system, as it reduces the time lag between the intention of the user and the action of the system. Other methods, such as electrocorticography (ECoG), are invasive and use electrodes placed directly on an exposed surface of the brain. They generally provide a higher spatial resolution and a higher signal-to-noise ratio than EEG, but for most applications they are not an option as surgery is required to place the electrodes.

The target application in a BCI system is often controlled using imagined motor movements, which are continuously classified by the BCI system into one of two categories. The Hex-O-Spell spelling application, for example, makes use of two motor imagery states to either move a cursor on a computer screen or select the letters pointed to by that cursor [4]. Hex-O-Spell can achieve typing speeds of up to 7 characters per minute, which does not sound impressive compared to traditional input devices, but it is in fact as fast as any BCI spelling application can currently get. The two control states are detected using machine-learning techniques on EEG data filtered in the mu-frequency range (8-12 Hz). They correspond to (de)synchronisation of the somatomotor cortex caused by imagined motor movements, and are slow in that the imagined movements take some time to become detectable by the system.

In a continuing research project at the University of Groningen a fast, binary control signal was found which could potentially be used as a selection signal in a BCI application. The control signal consists of a burst of activity in the beta frequency range (16-26 Hz) over the somatomotor cortex. The TriSpeller application [17] was developed in an attempt to incorporate this brain switch in a spelling application based on Hex-O-Spell. Motor imagery was used to steer a cursor in one of two directions using a classifier based on mu-band activity. The beta-rebound brain switch could then be used to make a selection by switching the motor imagery movements. Pilot experiments showed that the beta rebound could be detected reliably in 99% of the trials. However, due to time constraints the authors were unable to create a working spelling application. They did find that the beta rebound was especially strong when switching between motor imagery tasks, compared to just stopping a motor imagery task. This beta rebound signal was also detected in earlier studies [28], where it was found to relate to stopping (imagined) movements and could potentially be used to realise a brain switch. However, it had not been related to switching (imagined) motor movements.

In the current study an EEG-based BCI spelling application, Hor-I-Spell, will be developed based on the beta rebound signal to determine if it can indeed be used as a reliable control signal. It will be used as a selecting signal to select letters in a mental typewriting task.

1.1 Research question

In this study an application will be developed that makes use of the beta rebound signal to control an EEG-based BCI spelling application. In order to create a reliable beta rebound signal, the application will use alternating motor imagery tasks. Instead of using several control signals as in [39] and [17], a single control signal will be used to determine if the beta rebound signal can be used as a reliable brain switch. The study attempts to answer the following two research questions.

Can an EEG-based BCI spelling application be developed using a beta-burst selection signal which allows users to type letters, words, and sentences? And can the BCI world record typing speed of 7 characters per minute be improved with this new system?

To answer these questions, a new spelling application based on a single control signal will have to be developed. This spelling application will be tested in a non-BCI setting using an extensive user study before being integrated into an actual BCI system.

Chapter 2

Background

In this chapter background information on (sources of) brain activity, brain-computer interfaces (BCIs), and machine-learning techniques is provided. This information will be used to build the Hor-I-Spell BCI system described in subsequent chapters.

2.1 Brain Activity

This section describes the sources of brain activity, how this activity can be registered, and how imagined movements can be used to modulate brain activity to create a control signal for BCI applications.

2.1.1 Neurophysiology

The human brain consists of an incredibly complex network of roughly 100 billion neurons communicating with each other using both chemical and electrical signals. A neuron consists of a cell body, dendrites receiving information from other neurons (or sensors), and a single (branching) axon to send out information to other neurons (or effectors). Dendrites and axons are chemically connected via synapses, where neurotransmitters are used to either raise or lower the membrane potential of the cell. If the membrane potential is raised above a threshold of -50 mV, an action potential is fired along the axon, causing either excitation or inhibition of other neurons. On average a neuron has 7000 synaptic connections to other neurons.

The human brain can be divided into different regions where groups of neurons work together to process information. In BCI research the areas in the brain associated with the planning and execution of motor movements are often used, where large groups of neurons fire in rhythmic synchrony. These so-called idle rhythms are attenuated when motor engagement takes place. These areas show event-related (de)synchronisation (ERD/ERS) with maxima in the mu (around 10 Hz) and beta (around 20 Hz) frequency bands.

Somatomotor cortex

Figure 2.1: Penfield's homunculus, showing the relatively large areas in the motor cortex associated with hand movements compared to other body parts.

The somatomotor cortex, or primary motor cortex, is a region in the human brain located in the posterior portion of the frontal lobe. This area of the brain is associated with planning and executing motor movements. Even without actually performing motor movements, but only imagining them, areas in the motor cortex become activated. Both mu and beta-rhythm (de)synchronisations of the areas are associated with either real or imagined motor movements.

A mapping of the primary motor cortex has been created by activating different areas in the cortex [33]. This resulted in a graphical representation of the areas associated with motor movement, showing that hand movements in particular make up a relatively large part of the motor cortex. Imagined hand movements are therefore often used to control BCI applications (see figure 2.1). Other imagined movements that have been found to cause (de)synchronisation of mu and beta rhythms in the associated parts of the primary motor cortex include foot and tongue movements [26], but mostly left versus right hand movements or hand versus foot movements are used as a control signal in BCI applications.

2.1.2 Electroencephalography

Electroencephalography (EEG) is a non-invasive technique to measure brain activity by recording voltage fluctuations resulting from ionic current flows within the neurons of the brain. EEG does not have the ability to record activity from single neurons, but relies on large groups of neurons firing in synchrony. The technique relies on the use of electrodes placed on the scalp. Different types of electrodes exist to measure EEG activity, from traditional 'wet' electrodes using conductive gel, which require a long set-up time and can only be used for about two hours in a row without a significant drop in the signal-to-noise ratio [34], to the latest range of dry electrodes, which are easier to apply but are not yet in general use and have a worse signal-to-noise ratio [14].

Because EEG measures voltage potentials directly caused by the activity of groups of neurons, the technique has a high temporal resolution, making it suitable for BCI applications. However, as the electrodes are placed outside the skull, which smears out and reduces the quality of the electrical signal, the spatial resolution is relatively low compared to other techniques. EEG also has little sensitivity for sub-cortical activity. The low spatial resolution makes it difficult to detect activity originating in smaller brain regions. Detecting large hand movements is generally possible using EEG recordings, but it is extremely difficult to reliably detect activity originating in smaller areas, such as activity associated with individual finger movements.

Electrocorticography (ECoG) is a technique similar to EEG, but here the electrodes are placed under the skull. This increases the spatial resolution, making it possible, for example, to also detect activity in smaller regions caused by movements of individual fingers (in [18]). However, unlike EEG, ECoG is an invasive method which requires surgery to place the electrodes.

2.1.3 Motor Imagery

Motor imagery is a task that is often used to provide control signals for brain-computer interfaces. Instead of actually performing motor movements, the subject only imagines them. Motor imagery has been shown to activate somatomotor cortical regions in a similar way to actual motor movements, for both mu-rhythm and beta-rhythm (de)synchronisation [26] [1] [24]. Beta-rhythm (de)synchronisation can be expected after stopping or switching (imagined) motor movements [17] [27].

The large somatomotor-cortical regions associated with hand and arm movements make imagined hand and arm movements particularly suited for controlling BCI systems. Often left hand and arm movement is contrasted with right hand and arm movement to train a classifier on these two classes, although some subjects show better performance using hand-foot motor imagery instead [28] [39].

2.2 Brain-Computer Interface Research

Brain-computer interfaces (BCIs) are systems designed to provide a direct connection path between the human brain and a computer system, without using any muscle movements. One application of BCIs is to provide a way of communication for those with locked-in syndrome, as they have lost control over all motor movements [3]. Other applications include tasks like operating a prosthetic limb [36] or a motorised wheelchair [13], and the computer gaming industry has also shown interest in BCI systems (in [35]).

BCIs are asymmetrical closed-loop systems. The amount of information that can be transferred from brain to computer using a BCI system is much lower than what can be achieved using traditional, motor-based input methods, whilst the amount of input the user can process is still high (mostly using visual stimuli, but other sensory information can also be used). For example, the Hex-O-Spell application [39] can achieve maximum typing speeds of about 7 characters per minute, whilst the user receives continuous feedback on the computer screen showing the typed letters, the interface to select the letters, and direct feedback describing the current state of the BCI system. When operating a keyboard most people can easily achieve typing speeds of over 200 characters per minute.

Figure 2.2: Overview of the closed loop of a BCI system, consisting of a calibration phase to train a classifier and a feedback phase to classify the EEG data (from [6]).

The low information transfer rate in BCI systems is a direct result of both inaccuracies in the measurement of brain activity and the current state of the machine-learning algorithms that attempt to transform EEG data into control signals. To be able to produce a reliable control signal, either the BCI system has to be trained on the subject (i.e. let the machine do the learning [4]) or the subject has to be trained to reliably produce the predefined EEG components needed to operate the machine [3]. In recent years the former approach has been successful, reducing calibration times from several months to half an hour. However, it has been shown that a combination of the two methods can increase the information transfer rate for subjects that have difficulty achieving control over BCI systems [37]. See figure 2.2 for an overview of a BCI system.

2.2.1 BCI2000

Typical BCI systems are often built for specific hardware and for a specific task only. This makes it difficult to perform systematic studies. The BCI2000 framework [31] is a general-purpose brain-computer interface system, built to respond to this problem. It is designed to simplify the development of a brain-computer interface, and allow for interchangeability of both hardware and software modules.

The BCI2000 platform can be used on multiple operating systems and with different programming languages. BCI2000 consists of different modules communicating with each other via TCP/IP connections. There are three main modules: a source module, a signal-processing module, and an application module. The source module receives input from different brands of EEG recorders, and sends the output in a standardised way to the signal-processing module. Also included in BCI2000 is a signal-generation source module that can be used for debugging purposes by simulating EEG data.

The signal-processing module receives the data from the source module and transforms it into a control signal. The user is free to program this module in any programming language, as long as it follows the standard BCI2000 communication protocols. Among others, an example signal-processing module written in Matlab (The MathWorks Inc., 1984) is supplied, which allows the experimenter to perform any signal-processing computation that is possible in Matlab.

The output of the signal-processing module, the control signal, is sent to an application module. This application module can be anything from a controller for a prosthetic arm to a game of Pong or a program that gives locked-in patients a way of typing letters and forming sentences using only imagined movements.

The modules are controlled by an operator module that the experimenter can command. The operator module handles the communication between the modules, and provides feedback to the experimenter in the form of (filtered) EEG data and the control signals. The operator module can also receive additional markers indicating critical sections in the EEG data. Both the raw EEG data and the markers are saved to the hard drive for offline analysis. Several tools are supplied to analyse these stored recordings. The data can also be converted into several popular file formats for analysis with third-party software. Because the modules communicate over TCP/IP connections, they do not have to run on the same computer. This makes it possible, for example, to separate a CPU-intensive task such as signal processing from the application module displaying feedback to the user.

BCI2000 is an open-source framework. Improvements in the form of new or changed modules for new hardware or software are made regularly. There is a community of researchers providing support to other BCI researchers.

2.2.2 BCI Spelling Applications

There are many different kinds of BCI applications available, ranging from simple computer games and other software applications to hardware applications like controlling a robotic arm or a motorised wheelchair. This Master project focuses on creating a BCI spelling application based on motor imagery and EEG signals (Hor-I-Spell). Two such systems, which have greatly influenced the development of the Hor-I-Spell application, are discussed below.

Hex-O-Spell

In this section the Hex-O-Spell spelling application [39] [4] [8] will be discussed.

At the time of writing it is the fastest BCI spelling application available, able to achieve 7 characters per minute for some subjects.

Figure 2.3: The Hex-O-Spell control loop, indicating the transformation of a discrete user intention into a continuous variable which is fed back to the user (from [4]).

Hex-O-Spell tries to detect two control commands to allow a user to steer an arrow towards a letter or group of letters and make a selection. The subjects perform motor imagery to control the application. The application uses EEG data to continuously classify user-specific spatio-spectral changes in the mu-frequency band. See figure 2.3 for a schematic representation of the Hex-O-Spell control loop.

Before being able to operate the Hex-O-Spell application, users are required to perform a calibration task in which they perform alternating hand-hand or hand-foot motor imagery for 20 minutes. After this calibration phase a classifier based on Common Spatial Patterns (see 2.3.1) and Fisher’s Linear Discriminant Analysis (see 2.3.2) is trained to detect the two most discriminative motor imagery classes. Most of the time the left versus right hand imagery conditions gave the best classification results, but for some subjects hand versus foot imagery provided better classification accuracies. Training the classifier takes less than one minute; combined with the short calibration time, this allows naive users to quickly operate the spelling application. During the online spelling task, the last 0.5 to 1 second of EEG data is classified by the trained classifier every 40 ms. The output is smoothed over time and used as a continuous control signal for the spelling application.
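The timing of this online classification can be illustrated with a minimal sketch. It is not the Hex-O-Spell implementation: the sampling rate `fs`, the callable `classify`, and the use of exponential smoothing with factor `alpha` are assumptions made purely for the example, since the smoothing method is not specified here.

```python
from collections import deque

def sliding_outputs(samples, classify, fs=1000, window_s=1.0, step_s=0.04, alpha=0.1):
    """Yield a smoothed classifier output every `step_s` seconds.

    samples  : iterable of EEG samples (any object the classifier accepts)
    classify : hypothetical callable mapping a list of samples to a float
    fs       : assumed sampling rate in Hz
    alpha    : assumed exponential smoothing factor (0 < alpha <= 1)
    """
    window = deque(maxlen=int(round(window_s * fs)))
    step = int(round(step_s * fs))
    smoothed = None
    for i, sample in enumerate(samples, start=1):
        window.append(sample)
        # Classify once the window is full, then again every `step` samples.
        if len(window) == window.maxlen and (i - window.maxlen) % step == 0:
            raw = classify(list(window))
            smoothed = raw if smoothed is None else alpha * raw + (1 - alpha) * smoothed
            yield smoothed
```

A smaller `alpha` gives a steadier control signal at the cost of a slower response to a change in the imagined movement.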

Figure 2.4: Hex, a mobile text entry system controlled by tilt. From [40].

The Hex-O-Spell application was adapted from a text entry system for mobile devices called Hex [40]. The interface was designed in such a way that the user could select letters without actually touching the mobile device. By tilting the device, a ball rolls over a virtual landscape with six hexagons containing groups of characters that can be selected. If a group of characters is selected, each of those letters is placed in an empty hexagon the user can select. Hex feeds the output of a language model, which predicts the most likely future letters, to a dynamic system that alters the landscape so that likely text becomes easier to select, while a priori unlikely text requires additional evidence in the form of greater control effort. See figure 2.4.

Letters can be selected by a selection arrow that either moves clockwise to point to a (group of) letter(s), or extends to select a specific (group of) letter(s) (see figure 2.5). Smoothed outputs of the classifier are continuously used to either move or extend the arrow, making the program effectively timing-based. The moment at which the arrow switches from the rotating state to the selection state determines which selection is made. However, as a certain amount of evidence is required to actually make a selection (the time needed to extend the arrow to its fullest), it is possible for the application to revert to the moving state without selecting a letter. This makes the system fairly robust to incorrect classifications. The movement speed of the arrow and the time required to extend the arrow depend on the performance of the user and can be set by the experimenter [21].
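The rotate-or-extend behaviour can be sketched as a small update rule. This is an illustration only, not the Hex-O-Spell code: the control signal range of [-1, 1], the `threshold`, both speeds, and the choice to retract the arrow fully before it rotates again are all assumptions.

```python
def step_arrow(angle, length, control, dt, rot_speed=60.0, ext_speed=1.0, threshold=0.3):
    """Advance the arrow by one time step and report whether a selection is made.

    angle   : arrow direction in degrees (0-360)
    length  : arrow extension between 0.0 (retracted) and 1.0 (fully extended)
    control : smoothed classifier output, assumed to lie in [-1, 1]
    dt      : duration of the time step in seconds
    """
    selected = False
    if control > threshold:
        # "Extend" class: accumulate evidence for a selection.
        length = min(1.0, length + ext_speed * dt)
        selected = length >= 1.0
    else:
        # Retract first, so a brief misclassification does not select anything.
        length = max(0.0, length - ext_speed * dt)
        if control < -threshold and length == 0.0:
            # "Rotate" class: move clockwise once the arrow is fully retracted.
            angle = (angle + rot_speed * dt) % 360.0
    return angle, length, selected
```

Because a selection needs several consecutive "extend" steps, a short burst of the wrong class only extends the arrow partially and is undone as soon as the arrow retracts.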

Figure 2.5: The basic setup of Hex-O-Spell. The participant wears the EEG cap shown at the left. On screen, the layout of letters can be seen, along with the currently entered phrase, the state of the selection arrow, and the feedback bar which shows the filtered output of the classifier directly. (From [40]).

Throughout typing, the user is provided with feedback by graphically displaying the smoothed output of the classifier in a large vertical progress bar. The bottom part of the progress bar represents one class, which moves the selection arrow clockwise, and the top part of the progress bar represents the second class, which extends the arrow to select letters. Whenever the output of the classifier is undecided, the arrow maintains its momentum until it has retracted fully and stops.

In order to reduce the effort required to type a sentence and make full use of the limited bandwidth available in a BCI task, a language model is used to make it easier to select more likely letters. The language model is based on two prediction by partial matching (PPM) models [2] trained on a large newspaper corpus and several novels [39]. The first PPM model, PPM1, determines the probability of a letter given the previous two letters, whilst the second PPM model, PPM2, determines the next likely letter given all previously typed letters in the word. The two models are combined using a relative weighting that depends on the position of the letter in the word. The relative weight for PPM2 decreases linearly from 1 for the first letter to 0.5 for the sixth and all subsequent letters.
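The position-dependent weighting can be written out as a short sketch. The weight function follows the description above; the way the two distributions are blended (PPM1 receiving the complementary weight 1 - w) is an assumption, as the exact combination rule is not given here, and the function names are hypothetical.

```python
def ppm2_weight(position):
    """Relative weight of PPM2 for the letter at 1-based `position` in a word.

    Falls linearly from 1.0 at the first letter to 0.5 at the sixth letter
    and stays at 0.5 for all later letters.
    """
    if position < 1:
        raise ValueError("position is 1-based")
    return max(0.5, 1.0 - 0.1 * (position - 1))

def combine(p1, p2, position):
    """Blend two letter distributions given as dicts (letter -> probability).

    Assumption: PPM1 receives the complementary weight 1 - w.
    """
    w = ppm2_weight(position)
    letters = set(p1) | set(p2)
    return {c: w * p2.get(c, 0.0) + (1.0 - w) * p1.get(c, 0.0) for c in letters}
```

Under this assumption the first letter of a word is predicted by PPM2 alone, while from the sixth letter onwards both models contribute equally.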

The interface of Hex-O-Spell is adaptively changed according to the outcome of the language model. After a letter is typed, the arrow starts at the hexagon containing the most likely next letter. The letters within that hexagon are also ordered by likelihood.

In a demonstration at the world's largest IT fair, two subjects achieved typing speeds of between 4 and 7 characters per minute. These typing speeds are very competitive for non-invasive BCI, and are at a level that is viable for those without other means of communication.

TriSpeller

The TriSpeller application was developed at the University of Groningen, and was the first spelling application to implement a beta-burst control signal [17]. The interface is based on the Hex-O-Spell application, but instead of a variable starting position for the selection cursor, the cursor starts at the top of the screen. The cursor can move either clockwise or anti-clockwise, instead of in only one direction. Each hexagon contains 6 letters instead of 5, which frees one hexagon; this hexagon is placed at the bottom of the screen and contains the most likely word according to the language model.

TriSpeller uses motor imagery to modulate sensorimotor rhythms in the mu-band frequency range and distinguishes two motor-imagery classes to steer a cursor either clockwise or anti-clockwise. A third control signal was added based on a beta-burst arising after changing imagined motor movements. This third control signal was used to select letters.

The application uses a simple language model based on a dictionary of a given language. Given a set of already typed letters L, the next most likely letter is determined by collecting all the words that start with the typed letters. The letters following L in these words are counted and ordered by frequency. The interface is then re-ordered in such a way that the box containing the most likely letter is placed at the top. However, this language model does not take into account how often the words in the dictionary occur in real life. Rarely used words contribute just as much to the predictions made by the language model as frequently used words.
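The dictionary-based prediction described above can be sketched in a few lines. The function name and the toy word list are hypothetical and only serve to illustrate the counting step; this is not the TriSpeller code.

```python
from collections import Counter

def next_letter_ranking(words, typed):
    """Rank candidate next letters after the prefix `typed`.

    Every dictionary word counts once, regardless of how common it is in
    real text, which mirrors the limitation of the model described above.
    """
    counts = Counter(
        w[len(typed)] for w in words
        if w.startswith(typed) and len(w) > len(typed)
    )
    # Sort by descending count, then alphabetically so ties are deterministic.
    return sorted(counts, key=lambda c: (-counts[c], c))

words = ["the", "then", "this", "that", "tree", "to"]   # toy dictionary
ranking = next_letter_ranking(words, "t")               # 'h' is ranked first
```

In this toy dictionary four of the six words continue with 'h' after "t", so 'h' is ranked first even though a rarely used word would count just as heavily as a common one.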

Good control of the cursor was achieved using a combination of Common Spatial Patterns (see 2.3.1) and Fisher’s Linear Discriminant Analysis (see 2.3.2) on filtered EEG data when subjects performed left and right hand motor imagery.

Moreover, in pilot experiments using one subject a strong beta-rebound signal was detected when switching imagined motor movements in an alternating motor-imagery task (approximately 99% classification accuracy for movement direction), indicating that the beta-burst activity could be used as a binary selection signal. This beta rebound occurred about 1 second after stopping the motor-imagery task for one hand, when motor imagery started with the opposite hand. Unfortunately, due to time constraints the authors did not manage to create a working spelling application.

2.2.3 BCI Illiteracy

It has been known for some time that not everyone is capable of achieving control in an EEG-based BCI application. About 15-30% of the population is so-called BCI illiterate, a problem generic to all motor-imagery-based BCIs. Three different classes of BCI users can be distinguished: subjects for whom (I) a classifier could be trained and feedback could be performed with good accuracy; (II) a classifier could be trained but the transition from classification to feedback proved difficult; and (III) a classifier could not be trained.

Scientific consensus has not been reached on the BCI illiteracy problem, but some evidence suggests that with extensive training of the subjects, the performance of those in categories II and III could be improved [37].

2.3 Machine Learning

The following sections describe the machine-learning techniques used for the Hor-I-Spell application in a BCI setting (see chapter 4). The purpose of machine-learning techniques in EEG-based BCIs is two-fold. On the one hand, the techniques are used to adapt the BCI system to each individual user, extracting as much information from the EEG channels as possible and thereby increasing the performance of the system. On the other hand, machine-learning techniques also save time by allowing subjects to operate a BCI after only a short calibration session [4], instead of after months of extensive user training [3].

2.3.1 Common Spatial Patterns

Instead of using just two electrodes to record brain activity over the areas in the primary motor cortex associated with left and right hand and arm movements (e.g. electrodes C3 and C4 [41]), multi-channel EEG data will be recorded over a larger area of the motor cortex. This EEG data is spatially filtered using Common Spatial Patterns (CSP) to find the most discriminative patterns for two classes of motor imagery.

CSP is a supervised learning technique that is often used in BCI systems to spatially filter EEG data in order to find motor-imagery-induced ERS and ERD effects [4] [6] [19]. CSP increases the spatial resolution of EEG by taking advantage of the inherent correlation between neighbouring channels. EEG channels are decomposed into spatial patterns that are extracted from the data of two populations of EEG in a manner that maximises the differences in temporal variance [15] [29].

Given two distributions in a high-dimensional space (e.g. EEG channels), the CSP algorithm finds spatial filters that maximise the variance for one class and at the same time minimise the variance for the other class. To calculate CSP patterns, the EEG data is first filtered in the frequency range of interest. For example, if mu (8-12 Hz) or beta (16-26 Hz) rhythms are expected to be affected by the motor imagery task, the data is filtered in these frequency ranges. Both the mu and beta frequency bands are often used in motor-imagery BCI systems. High variance within a signal reflects strong rhythmic activity, while low variance represents weak rhythmic activity.

Figure 2.6: The common spatial pattern (CSP) algorithm finds spatial structures representing the optimal discrimination between two classes with respect to variance. At the left the weights of the spatial filters for each electrode are shown. In the middle the patterns are shown. The spatial filters represent the patterns, but their intricate weighting is essential to obtain signals that are optimally discriminative with respect to the variance. The 30-dimensional input consists of EEG data filtered in the beta-frequency range (16-26Hz). The top row shows the filters and patterns found on 100 trials during which subjects imagined a right hand movement. The bottom row shows the filters and patterns for an imagined left hand movement. At the right the average power over all trials is shown for both classes.

The CSP algorithm outputs a decomposition matrix W. The rows of W represent the stationary spatial filters, while the columns of W⁻¹ can be seen as the common spatial patterns which describe the EEG source locations. These patterns are used to verify the neurophysiological plausibility of the solution, while the intricate weighting of the filters takes into account potential noise sources and optimises the discrimination of both classes (see figure 2.6).
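One common way to compute such a decomposition matrix is to whiten the summed class covariances and then diagonalise the covariance of one class in the whitened space. The sketch below follows that textbook route using NumPy; it is not necessarily the procedure used in this study, and the trace normalisation of the per-trial covariances, the function name, and the array layout are assumptions.

```python
import numpy as np

def csp(trials_a, trials_b):
    """Compute CSP filters from two sets of band-pass filtered trials.

    trials_a, trials_b : arrays of shape (n_trials, n_channels, n_samples)
    Returns W (rows are spatial filters, sorted from highest to lowest
    relative variance for class A) and the patterns (columns of inv(W)).
    """
    def mean_cov(trials):
        # Average the trace-normalised spatial covariance over trials.
        covs = [x @ x.T / np.trace(x @ x.T) for x in trials]
        return np.mean(covs, axis=0)

    ca, cb = mean_cov(trials_a), mean_cov(trials_b)
    # Whiten the composite covariance so that P (ca + cb) P^T = I.
    evals, evecs = np.linalg.eigh(ca + cb)
    P = np.diag(evals ** -0.5) @ evecs.T
    # In the whitened space both classes share the same eigenvectors and
    # their eigenvalues sum to one.
    d, B = np.linalg.eigh(P @ ca @ P.T)
    order = np.argsort(d)[::-1]
    W = B[:, order].T @ P
    return W, np.linalg.inv(W)
```

With this ordering the first row of W yields a signal with high variance for class A and low variance for class B, and the last row does the opposite, which is why the filters at both ends of the matrix are the ones usually retained.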

In a BCI application these spatial filters are applied to reduce the dimensionality of multi-channel EEG data. After the CSP filters are applied to the EEG data, often only the two most distinctive time series are used, one representing each class of motor imagery. The EEG data can either be filtered (e.g. in the beta band) before applying the CSP spatial filters, or afterwards, without changing the functionality. However, filtering only the relevant channels after applying the CSP matrix is recommended, as that will be faster than filtering all the channels before applying the CSP matrix.

After the EEG data has been spatially filtered using the CSP matrix and band-pass filtered in a specific frequency band, features are extracted by calculating the log-transformed variance of the last second of EEG samples for the two most distinctive channels. This variance represents the power in the selected frequency band. These features are used first to train a linear classifier in a calibration session, where subjects are asked to perform specific motor imagery tasks, and then to classify EEG data during online testing.
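The feature extraction step described above can be illustrated with a minimal Python sketch. It assumes the two channels have already been CSP- and band-pass filtered; the function name, the 100 Hz sampling rate and the synthetic 20 Hz signals are assumptions made for this example and are not taken from the study.

```python
import math
import statistics

def log_variance_features(channels, sampling_rate_hz, window_s=1.0):
    """Return log(variance) of the last `window_s` seconds of each channel."""
    n = int(round(sampling_rate_hz * window_s))
    features = []
    for samples in channels:
        window = samples[-n:]  # keep only the most recent second of data
        features.append(math.log(statistics.pvariance(window)))
    return features

# Synthetic stand-ins for two filtered CSP channels (assumed 100 Hz sampling).
fs = 100
t = [i / fs for i in range(2 * fs)]
strong = [2.0 * math.sin(2 * math.pi * 20 * x) for x in t]  # strong beta rhythm
weak = [0.5 * math.sin(2 * math.pi * 20 * x) for x in t]    # weak beta rhythm
features = log_variance_features([strong, weak], fs)
```

The channel with the stronger rhythm yields the larger feature value, which is the property the linear classifier relies on.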

2.3.2 Fisher’s Linear Discriminant Analysis

Fisher’s Linear Discriminant Analysis (LDA) is a linear classifier that is often used in BCI research [23]. Linear classifiers use a hyperplane to separate data representing two classes according to the following formula.

y = w · x + b

where y can be used as a class label, with y < 0 defining one class and y > 0 the other, and x is an appropriate feature vector in n-dimensional space.

The normal vector of the hyperplane w and the threshold b need to be estimated from the training data. In a BCI session a projection of the unseen data x is calculated onto the direction of the normal w · x, to determine what class should be given to x according to the linear model.

In this study, 30-channel EEG data is filtered in a frequency range and linearly transformed using CSP to obtain two channels with optimal discriminatory power between two conditions. The log-transformed variance of the last second of data in these two channels is used as the feature vector x for the linear model.

Figure 2.7: A linear classifier is defined by a hyperplane’s normal vector w and an offset b. The decision boundary is represented by the thick line where w · x + b = 0. The margin of the linear classifier is the minimal distance of any training point to the hyperplane, and is represented by the dotted line.

LDA assumes that the data of both classes are normally distributed with equal covariance matrices. These assumptions have been fulfilled in several BCI experiments [5] [7]. The hyperplane is obtained by seeking a projection that maximises the distance between the means of the two classes and minimises the variance within each class (see figure 2.7). This is achieved by maximising the Rayleigh coefficient of between-class and within-class variance with respect to w [12].


LDA has low computational requirements making it useful for online BCI systems. It is a stable and robust classifier, not likely to overfit on the data.

LDA in combination with Common Spatial Patterns has been used in a great number of BCI applications [19] [11] [6].
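As an illustration of how w and b can be estimated for two-dimensional features, the following Python sketch uses the standard closed-form LDA solution: w is the inverse pooled within-class scatter matrix applied to the difference of the class means, and b places the boundary halfway between the means. The function names and the toy data are assumptions for this example, not the implementation used in this study.

```python
def mean_vector(rows):
    """Component-wise mean of a list of 2-D points."""
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(2)]

def scatter(rows, mean):
    """2x2 scatter matrix of the points around their class mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fit_lda(class_neg, class_pos):
    """Return (w, b) such that w . x + b < 0 for class_neg and > 0 for class_pos."""
    m0, m1 = mean_vector(class_neg), mean_vector(class_pos)
    s0, s1 = scatter(class_neg, m0), scatter(class_pos, m1)
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    mid = [(m0[0] + m1[0]) / 2, (m0[1] + m1[1]) / 2]
    b = -(w[0] * mid[0] + w[1] * mid[1])  # boundary halfway between the means
    return w, b

def decide(w, b, x):
    """Evaluate y = w . x + b for a feature vector x."""
    return w[0] * x[0] + w[1] * x[1] + b

# Toy log-variance features for two motor imagery classes (made-up values).
class_neg = [(-1.0, 1.2), (-0.8, 0.9), (-1.2, 1.0), (-0.9, 1.1)]
class_pos = [(1.0, -1.1), (0.9, -0.8), (1.2, -1.0), (0.8, -1.2)]
w, b = fit_lda(class_neg, class_pos)
```

Only the sign of `decide(w, b, x)` is needed to assign a class to unseen data.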

2.3.3 Leave-one-out Cross Validation

Leave-one-out cross validation [9] is a method to verify the generalisability of a classifier. The technique works by repeatedly taking one sample out of a training set, training the classifier on the remaining samples, and finally testing the classifier on the left-out sample. By averaging over all test cases, a nearly unbiased estimate of the classifier’s performance is obtained.

The advantage over other validation methods is that leave-one-out cross validation always uses the maximum number of training samples available, so each model is trained on almost the complete data set. However, it comes at a high computational cost, as a new model has to be calculated for each individual test sample.

As the CSP algorithm is susceptible to overfitting [30], leave-one-out cross validation will be used for both the construction of the CSP matrix and the training of the LDA classifier. This means that for each trial in the training set, a CSP/LDA classifier will be computed using all other trials and tested on the trial left out.
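The leave-one-out procedure can be sketched in a few lines of Python. To keep the example short, a simple nearest-class-mean classifier on one-dimensional features stands in for the CSP/LDA pipeline; that substitution, the function names and the toy data are assumptions made for this illustration.

```python
def leave_one_out_accuracy(samples, labels, train, predict):
    """Train on all samples but one, test on the left-out sample, and average."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = train(train_x, train_y)  # a new model for every left-out trial
        if predict(model, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)

def train_nearest_mean(xs, ys):
    """Stand-in classifier: store the mean feature value of each class."""
    means = {}
    for label in set(ys):
        values = [x for x, y in zip(xs, ys) if y == label]
        means[label] = sum(values) / len(values)
    return means

def predict_nearest_mean(means, x):
    """Assign the class whose mean is closest to x."""
    return min(means, key=lambda label: abs(x - means[label]))

samples = [0.1, 0.3, 0.2, 0.4, 1.1, 0.9, 1.2, 1.0]
labels = ["left"] * 4 + ["right"] * 4
accuracy = leave_one_out_accuracy(
    samples, labels, train_nearest_mean, predict_nearest_mean)
```

In the setting described above, `train` would compute both the CSP matrix and the LDA classifier from the remaining trials, so that the left-out trial never influences the spatial filters.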


Chapter 3

Hor-I-Spell

3.1 Introduction

In this chapter the design and testing of several graphical user interfaces for a BCI spelling application will be discussed. The graphical user interfaces (GUIs) are all variants on a single spelling application called Hor-I-Spell. The first part of the name describes the way the letters are represented, in a horizontal manner, while the I stands for the single selection signal that will be used to select (groups of) letters.

The spelling interfaces will be tested in an offline setting where subjects have control over the application using the keyboard, instead of in a real BCI setting where they have to control the application using motor imagery. This makes testing the application much easier, but it also means that the results in terms of typed characters per minute will be different from the actual BCI application.

However, it should give a good idea about what the limits of the spelling applications are, which versions are most promising to use in a BCI setting, and if and how the program should be improved.

3.2 GUI Design

The design of the BCI spelling application is largely based on the Hex-o-Spell application [4] discussed in the previous chapter. Instead of using multiple control signals to both point a cursor to and select a letter, Hor-I-Spell uses only a single control signal that stops the cursor. The movement of the cursor is automatic, but the speed at which the cursor moves depends on the performance of the user.

Because of these differences, the Hor-I-Spell interface shows several important changes compared to Hex-O-Spell, all of which will be discussed in this section.


Figure 3.1: Hex-o-Spell Application. Letters can be selected using a rotating and growing cursor.

3.2.1 Layout

Just like in Hex-O-Spell, all the characters in Hor-I-Spell are organised in groups.

In order to select a character, the group containing it must first be selected from the main selection menu; only then does it become possible to select the character from the second menu. Punctuation marks can be selected in the same fashion. To correct typing errors, a backspace character < is included that removes the previously typed letter. The space bar is represented by an underscore character.

Figure 3.2: Hor-I-Spell Application. Top: groups of characters can be selected in the main menu. The cursor is moving from left to right. Middle: single characters can be selected in the second menu, the cursor is moving from right to left. Bottom: A letter has been typed, the cursor is moving from left to right again and the main menu is shown.


Horizontal Representation

A pilot experiment involving the detection of the beta-burst signal in a switching motor imagery task [17] indicated that using a circular representation could cause problems. In the pilot experiment the subjects were instructed to imagine pushing a moving cursor around a circle. At the beginning of a trial, the cursor always started at the top of the circle, and then moved either left or right until it reached the bottom of the circle. In the following trial the cursor would move in the opposite direction. When the cursor was moving clockwise, the subjects had to imagine pushing it with their left hand, and when the cursor was moving counterclockwise, they had to imagine pushing it with their right hand. At the end of each trial a beta-burst signal related to the switching between both movements - pushing left versus pushing right - was detected, which predicted the direction of the trial. On occasion, however, a beta-burst signal was also detected halfway through a trial, when the cursor reached 3 or 9 o’clock. It was hypothesised that this was related to the visible change in the horizontal direction of the cursor at those locations: instead of continuously moving from left to right during a trial, the cursor would move from left to right, and then back to the left again.

Instead of using a circular representation like Hex-O-Spell (Figure 3.1), a horizontal representation was implemented to circumvent this problem of the cursor changing direction (Figure 3.2). The cursor switches direction after a selection is made. It always moves from left to right when selecting groups of characters, and from right to left when selecting single letters. When the cursor is moving from left to right, the user imagines pushing the cursor from left to right with their left hand, and when a selection needs to be made, the opposite movement is imagined to stop the cursor. If the selection is successful, the cursor switches direction, allowing the user to keep performing the same motor imagery task they used to stop the cursor.

The selection squares will light up whenever the cursor is moving above them, indicating to the user that the square can indeed be selected. If the cursor moves to the next square, the previous square will return to its original colour. If the cursor reaches the end of the screen before the user performed a selection, the cursor will start moving from its starting position again to give the user another opportunity to make a selection.

3.3 Language Model

The Hor-I-Spell language model is used to reduce the number of actions the user has to perform, by predicting the most likely letters given up to four previous letters. It uses a modified prediction by partial matching (PPM) model [2] that is trained on a list of the 5000 most frequent Dutch words in the Algemeen Nederlands Woordenboek (ANW) corpus [22]. In total this corpus contains more than 100 million Dutch words coming from Internet texts, newspaper articles and written literature.

The model stores the counts of all letter combinations from one to five characters, including spaces, in a hash table. The number of occurrences of the words in the ANW word list is also taken into account, so that the letter combinations in the most common Dutch word DE (5,680,697 occurrences) are weighted more than twelve times as heavily as the letter combinations in the word ALS (465,786 occurrences).

All letter combinations are counted once from the beginning of the word up to the fifth character, and once throughout the entire word, also from one to five characters, in effect giving two different language models, P_beginning(character) and P_all(character). Both models predict the most likely letter given 0 to 4 previous letters, but the first model P_beginning is only applied at the beginning of the word, while the second model P_all is always applied.

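The models are combined with a weighting factor such that

\[ P_{tot}(X_N) = 4 \, P_{beginning}(X_N) + P_{all}(X_N) \]

where each model P_m, with m either beginning or all, is a weighted sum of the predictions \hat{P}_m of its sub models for 0 to 4 previous letters:

\[
\begin{aligned}
P_m(X_N) ={} & \tfrac{1}{256} \hat{P}_m(X_N) + \tfrac{1}{64} \hat{P}_m(X_N \mid X_{N-1}) + \tfrac{1}{16} \hat{P}_m(X_N \mid X_{N-1}, X_{N-2}) \\
& + \tfrac{1}{4} \hat{P}_m(X_N \mid X_{N-1}, X_{N-2}, X_{N-3}) + \hat{P}_m(X_N \mid X_{N-1}, X_{N-2}, X_{N-3}, X_{N-4})
\end{aligned}
\]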

All the predictions of the sub models in the formula above were normalised such that the probabilities of all characters sum to 1 when the letter combination was valid, and were set to 0 otherwise.
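The weighted sum over context lengths can be illustrated with the following Python sketch. It covers only a single model that counts letter combinations throughout the whole word; the separate beginning-of-word model and the weighting factor of 4 are left out. The function names are assumptions, the counts for "de" and "als" are the ones quoted above, and the word "dan" with its count is a made-up entry added for the example.

```python
from collections import defaultdict

# Weights for context lengths of 0 to 4 previous letters.
WEIGHTS = [1 / 256, 1 / 64, 1 / 16, 1 / 4, 1.0]

def count_ngrams(weighted_words):
    """counts[context][next_char]: frequency-weighted counts for 0-4 letter contexts."""
    counts = defaultdict(lambda: defaultdict(float))
    for word, frequency in weighted_words:
        text = word + " "  # the trailing space marks the end of the word
        for end in range(len(text)):
            for k in range(min(end, 4) + 1):
                counts[text[end - k:end]][text[end]] += frequency
    return counts

def rank_characters(counts, typed):
    """Order characters from most to least likely given the letters typed so far."""
    scores = defaultdict(float)
    for k, weight in enumerate(WEIGHTS):
        if k > len(typed):
            break
        following = counts.get(typed[len(typed) - k:])
        if not following:
            continue  # invalid letter combination: this sub model contributes 0
        total = sum(following.values())  # normalise the sub model to sum to 1
        for char, count in following.items():
            scores[char] += weight * count / total
    return sorted(scores, key=scores.get, reverse=True)

# "dan" and its count are hypothetical; "de" and "als" use the quoted counts.
words = [("de", 5680697), ("als", 465786), ("dan", 400000)]
ranking = rank_characters(count_ngrams(words), "d")
```

With these three words, `ranking[0]` is `"e"` after typing "d", because the heavily weighted word "de" dominates the one-letter context.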

This language model is implemented in a Matlab (The MathWorks Inc., 1984) script that listens on a UDP port for parts of words, and returns a list of all characters from a to z, including the space character, ordered from most likely to least likely. The language model uses about 200 megabytes of RAM, and on average takes 1.3 milliseconds to parse a request and calculate all letter probabilities.

The language model was tested and fine-tuned on a combination of texts coming from random articles on the Dutch Wikipedia site [25] and from a Dutch blog containing less formal language. In total this text contained 9038 words with an average word length of 4.8 characters. Because the language model tries to predict 27 characters (all letters including the space character), the text was formatted in such a way that it only contained lowercase words and spaces, without any punctuation marks or other characters. Each word was followed by a single space. In total this provided 53,108 characters for the language model to predict. Without any letters given, the language model is only able to correctly predict the next letter 14% of the time. The top 4 of the language model, the four most likely letters, only contains the right prediction 35% of the time. This low performance is to be expected, because no information about the word to type is available yet, and the model only uses the a priori probability of all letters. When the language model is estimating the second letter up until the end of the word (including the trailing space), its first suggestion is correct 56% of the time. In that case the top 4 suggestions of the language model contain the correct letter 85% of the time (see Figure 3.3).

Figure 3.3: Performance of the language model. The numbers on the horizontal axis represent the number of guesses the language model is allowed in order to predict the next letter, given the letters typed so far. Without any letters typed, the model has difficulty predicting the first letter of the word (a). When more letters are given, the model performance increases (b-e). Image (f) contains the average performance of the language model excluding the first character.

The language model not only predicts the most likely next character, but also orders all characters from most likely to least likely. Where Hex-O-Spell makes it easier to select the square containing the most likely letter, Hor-I-Spell adds an extra square containing the four most likely letters.

This square is the first one that can be selected by the user, but is only visible after at least one letter of a word is typed. Unlike the contents of other squares, the contents of this ’most likely’ square is dynamic and will change whenever a new letter is typed in a word. To make sure the user can still process the contents, it was found that no more than four characters should be placed in this square.

Figure 3.4: After typing ’amster’, the only word possible according to the dictionary is ’amsterdam’, which is automatically completed.

The final part of the language model consists of a Dutch dictionary containing 372,487 words, which is used to automatically fill in (parts of) words when no other letter combinations are possible. To make sure the user notices the word completion algorithm, the sentence briefly flashes green whenever letters have been added to the current word (see Figure 3.4). If for whatever reason the user does not approve of the auto completion, the < character can be selected to undo the suggestion, and word completion is turned off for that particular word.
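One way to sketch this kind of word completion in Python is to extend the typed prefix to the longest prefix shared by all dictionary words that start with it. The function name and the three-word dictionary are assumptions for this example; the actual completion rules of Hor-I-Spell may differ in detail.

```python
import os

def autocomplete(prefix, dictionary):
    """Extend `prefix` as far as all matching dictionary words agree."""
    matches = [word for word in dictionary if word.startswith(prefix)]
    if not matches:
        return prefix  # unknown word: leave the typed letters untouched
    # os.path.commonprefix compares character by character, not path by path.
    return os.path.commonprefix(matches)

dictionary = ["amsterdam", "amstel", "appel"]  # tiny stand-in dictionary
completed = autocomplete("amster", dictionary)
partial = autocomplete("ams", dictionary)
```

Here "amster" is completed to "amsterdam", while "ams" is only extended to "amste", the part of the word on which the remaining candidates agree.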

3.4 GUI Test

In order to create the most suitable interface for a BCI spelling application, four different variations of the application discussed in the previous sections were analysed in an offline setting and tested in a user study. In the following sections the experimental setup and results will be discussed.

3.4.1 Method

Participants

One subject took part in a small pilot experiment. Because several technical problems were found in that experiment, the data from this participant were excluded from the analysis. Twelve new Dutch subjects (6 males, 6 females, age 24.3 ± 5.4 years) took part in the actual study. All subjects were paid €20. None of the subjects had prior experience with the presented task or with operating similar spelling applications. All subjects gave their written informed consent, and were allowed to stop the experiment at any time if they so desired, but all of them completed the entire experiment. Ethical approval was obtained from the ethics committee of the University of Groningen.

Table 3.1: Each of the twelve participants was assigned to one of six conditions indicating the order in which the different versions were presented during the experiment.

Participant   Order
1, 7          1234
2, 8          4132
3, 9          2134
4, 10         4231
5, 11         3124
6, 12         4321

Materials

The stimuli were presented on a laptop computer running Microsoft Windows 7.

The screen measured 15.6” diagonally and was operating at a resolution of 1366 x 768 and a refresh rate of 50 Hz. The participants were seated in a comfortable chair, with their heads placed approximately 70 cm in front of the monitor.

The graphical user interface (GUI) of the spelling application was written and compiled in Borland Delphi 7, and communicated with the Matlab letter server, which provided a list of the most likely letters given the letters typed so far in the current word.

The sentences the participants were asked to type were in Dutch and had a length between 29 and 63 characters, averaging 48 characters. The average word length in these sentences was 4.8 characters. See Appendix B for a complete list of the sentences.

A questionnaire containing six statements and one open question was used to receive feedback on the tested spelling applications. A six-point Likert scale was used to describe the range from total disagreement to total agreement on each of the six statements. The open question was used to find potential problems and other ways to improve the spelling application. The questionnaire can be found in Appendix A.

Design

Four versions of the spelling application were tested in a within subject design that can be seen in table 3.1.


Versions The Hex-o-Spell spelling application groups the letters alphabetically, and makes it easier to select both the group and the letters within that group according to which letter is most likely to be typed. But there are other ways of both grouping the letters and making use of the information the language model provides. The Hor-I-Spell interface was adapted to four different versions, differing only in the way the letters were ordered and in how the information about the next most likely letter(s) was used.

Figure 3.5: At the top the sentence the user has to type is shown, along with the sentence the user already typed. Then from top to bottom version 1 to 4 of the spelling application. The cursor is hovering above the next square the user has to select.

1. This version does not order the letters alphabetically, but instead uses an ordering by the shape of the letter. In [32] it was found that the shape of the letters is used when humans read Western script, and for the Hor-I-Spell application it was hypothesised that using this information to group letters would make it easier to find and select the letter the user wants to type. This version also presents the four most likely characters in a separate box at the beginning of the main menu. The order of the letters in the other boxes always remains the same, with the most frequently used letters in Dutch at the front.

2. This version differs from the first only in the way the letters are represented. Instead of using the shape of the letters, they are ordered alphabetically.

3. Similar to the second version, but this time the most likely characters are not being represented in an additional box. The information the language model provides is not being used.

4. The final version is similar to the Hex-O-Spell application, where the letters within each selection box are ordered by likelihood according to the language model, and the selection box containing the most likely letter is placed at the front. The order in which the selection boxes can be selected by the cursor remains the same, they are alphabetically ordered at all times.

The box containing the letters ’abcde’ will be placed before box ’fghij’, and so on. If, for example, the letter ’t’ is the most likely next letter, then the box containing this letter, box ’pqrst’, is placed at the first position. The letters in this box will be ordered by likelihood according to the language model, so the contents of this box will for example read ’trspq’. The letters in the box before that, ’klmno’, will also be ordered, and this box is placed at the far right of the screen. Because the cursor will leave the screen at the right end and come back at the left end, this ensures that the cyclic order of all the boxes remains the same.
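The arrangement used in version 4 can be sketched in Python as a rotation of the alphabetical boxes followed by a reordering of the letters inside each box. The function name, the last two boxes (’uvwxy’ and ’z_’) and the likelihood ranking are assumptions chosen so that the example reproduces the ’trspq’ case from the text; punctuation and the backspace character are ignored.

```python
def arrange_boxes(boxes, ranking):
    """Rotate the boxes so the most likely letter's box comes first,
    then order the letters within every box by likelihood."""
    position = {char: i for i, char in enumerate(ranking)}
    first = next(i for i, box in enumerate(boxes) if ranking[0] in box)
    rotated = boxes[first:] + boxes[:first]  # the cyclic order is preserved
    return ["".join(sorted(box, key=position.__getitem__)) for box in rotated]

# Alphabetical boxes; the last two groups are hypothetical.
boxes = ["abcde", "fghij", "klmno", "pqrst", "uvwxy", "z_"]
# Hypothetical language model output, most likely character first.
ranking = "tenrsadiolgkmhvbpwujzcf_qxy"
arranged = arrange_boxes(boxes, ranking)
```

For this ranking the first box reads ’trspq’ and the box ’klmno’, reordered to ’nolkm’, ends up at the far right.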

Each of these versions was tested by a simulation program typing a set of Dutch sentences listed in appendix B. The simulation was perfect in the sense that it did not make any typing errors, and selected each letter at the earliest possible moment. This resulted in a measurement of an average distance the cursor had to travel to select a letter, which can be used as a reference to test the performance of a human user. The results including a small summary of the four different versions can be found in table 3.2. Figure 3.5 shows the different versions after one letter has been typed.

Table 3.2: Summary of the tested spelling applications.

Version  Grouping  Selection help                                      Minimal cursor travelling distance per character
1        Shape     Most likely letters grouped in first selection box  1.54
2        Alphabet  Most likely letters grouped in first selection box  1.90
3        Alphabet  None                                                4.40
4        Alphabet  Most likely group of letters at the front.          1.84
                   Relative position between groups remains intact.
                   Letters within the groups also ordered.


Updating the Cursor Speed Because the number of selection squares visible on the screen varies, the speed of the cursor is expressed as the time it takes to cover one selection square, and not as the number of pixels it travels in a specific amount of time. This ensures that the user always has the same amount of time to process the information in each square.

Because the potential maximum typing speed of each version is one of the interesting factors of the study, the cursor speed automatically adapts to each user and version of the application by applying a tracking algorithm. Every time the user makes a correct selection, the speed of the cursor is increased by a small amount, but whenever an error is made the cursor speed is lowered by a larger amount. When these step sizes are set correctly, after some time the user will be operating the application close to its maximum potential.

Figure 3.6: A typical plot of how the speed of the cursor changes over time. The cursor speed is described in terms of ms per selection square. The entire X-axis corresponds to approximately 20 minutes in real time. Halfway through the experiment the speed is close to becoming stable.

Table 3.3: The results of the questionnaire on 1 - 6 scale (higher is better). V1 is the version with the letters ordered by shape and a separate top four selection box containing the most likely letters. V2 has the same top four box, but uses an alphabetical representation of the letters. V3 also uses an alphabetical representation but does not use the language model. And V4 is the horizontal equivalent of the Hex-O-Spell application.

Question                        V1    V2    V3    V4
Easy to get started             5.3   5.3   5.5   5.0
Easy to use                     5.0   4.9   5.4   4.6
Correcting typing errors        4.4   3.8   3.8   3.8
Automatic word prediction       5.0   4.9   4.9   4.4
Enjoyed operating the program   4.6   4.7   3.4   4.6
General impression              4.9   4.9   3.9   4.5
Average                         4.8   4.7   4.5   4.5

In a pilot study it was found that using a variable step size algorithm operating between a fixed minimum and maximum cursor speed worked best. This method ensured the speed changed gradually while still being able to reach the maximum speed after only several minutes of operation. Let step be a measure of the speed of the cursor, where step >= 0 and step = 0 means the cursor is running at the minimum speed, which is also the starting speed of each version of the spelling application. Then the current speed of the cursor T in milliseconds per square can be expressed by the following formula.

\[ T_{max} = 620\,\text{ms}, \qquad T_{min} = 210\,\text{ms} \]

\[ X = 2 \left( \frac{1}{1 + e^{-step}} - 0.5 \right) \]

\[ T = T_{max} - X \, (T_{max} - T_{min}) \]

When the user fails to make a selection the first time round in either the main or second selection menu, ∆step = −0.06 and the cursor speed is recalculated. If the user correctly selects a letter in the second selection menu, ∆step = +0.01 and the speed is also recalculated. See figure 3.6 for a typical example of how the speed of the cursor changes over time during an experiment.
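The tracking rule can be written out as a short Python sketch. The function names are assumptions, and clamping step at 0 is an inference from the condition step >= 0 rather than something stated explicitly.

```python
import math

T_MAX_MS = 620.0  # slowest cursor: time per selection square at step = 0
T_MIN_MS = 210.0  # fastest cursor: limit for large values of step

def square_time_ms(step):
    """Time in ms the cursor needs to cover one selection square."""
    x = 2.0 * (1.0 / (1.0 + math.exp(-step)) - 0.5)  # 0 at step = 0, tends to 1
    return T_MAX_MS - x * (T_MAX_MS - T_MIN_MS)

def update_step(step, correct):
    """Speed up slightly after a correct selection, slow down more after a miss."""
    step += 0.01 if correct else -0.06
    return max(step, 0.0)  # assumed clamp so that step >= 0 always holds

step = 0.0
for _ in range(50):  # fifty correct selections in a row
    step = update_step(step, correct=True)
time_after_50 = square_time_ms(step)
```

Because a miss lowers step six times as much as a correct selection raises it, the speed settles at a level where the user makes only occasional errors.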

Procedure

Before the testing of the applications started, the participants were given instructions on how to operate the spelling applications. The participants were instructed to type Dutch sentences presented at the top of the screen, by pressing the Ctrl key on the keyboard whenever the moving cursor was above the square that contained the letter to be selected. Before a specific version was tested, they were allowed several minutes of practice. When it was clear they understood the task, each application was tested for 20 minutes. Between typed sentences, a small break was allowed and the users had to press the Enter key to continue with the next sentence. After a specific version was tested, the subjects had 5 minutes to fill in a small questionnaire providing feedback on the application they just tested, after which the experiment continued with the next version. When the final questionnaire was filled in, the subject was debriefed and asked to give general feedback or provide ideas on how to improve the application.


Table 3.4: This table shows the maximum typing speed, the time the cursor takes to travel one square, the distance in selection boxes travelled per typed character, and the percentage of the potential used for each of the different versions. Perf shows the performance of the users in terms of cursor distance travelled compared to that of a perfect user as shown in table 3.2. A value of 100% indicates the user has not made any mistakes, but also that the user cannot type any faster given this application. The final column shows the number of typing corrections the user had to perform per typed character.

Version   Char/min   ms/square   Travelled   Perf   Error
1         33.5       299 ms      2.60        59%    3.6%
2         33.0       287 ms      3.00        63%    4.6%
3         23.3       333 ms      4.76        92%    2.1%
4         32.8       327 ms      2.52        73%    3.2%

3.4.2 Results

Table 3.3 shows the results of the six questions presented on the questionnaire, along with the average speed the participants were able to obtain in characters per minute excluding the first half of the typed sentences. The participants least liked working with version 3 of the program (average score 3.4), which is the version that does not use the language model, even though version 3 was rated the easiest version to use. All other versions are rated almost equally, with scores of 4.7, 4.6, and 4.6 for versions 2, 1, and 4 respectively.

The maximum number of characters per minute averaged across all subjects was highest for version 1 with 33.5 characters per minute, closely followed by versions 2 and 4. Version 3 is the slowest application with 23.3 characters per minute. The maximum attained typing speed of a single subject was 39.5 characters per minute, and was achieved with version 1, where the letters were ordered by shape. The sentence this subject typed was 66 characters long, and no words were automatically completed in this sentence.

When the average cursor distance travelled before a letter was selected is compared to that of a simulated user who does not make mistakes, a measure of the potential improvement for each version is obtained. This measure shows that versions 1 and 2 have the most room for improvement, as they were operated at 59% and 63% of their maximum potential; version 4 was operated at 73% of the maximum performance, and version 3 was operated close to the maximum performance at 92%.
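The performance measure can be reproduced from the two tables as the ratio between the minimal cursor distance of the simulated user (table 3.2) and the distance actually travelled by the participants (table 3.4). The short Python check below uses only those published values; the variable names are chosen for this example.

```python
# Minimal distance per character for the simulated, error-free user (table 3.2).
minimal = {1: 1.54, 2: 1.90, 3: 4.40, 4: 1.84}
# Average distance per character travelled by the participants (table 3.4).
travelled = {1: 2.60, 2: 3.00, 3: 4.76, 4: 2.52}

# Percentage of the maximum potential at which each version was operated.
performance = {v: round(100 * minimal[v] / travelled[v]) for v in minimal}
```

The ratios come out at 59%, 63%, 92% and 73% for versions 1 to 4, in line with the Perf column of table 3.4.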

The average distance travelled per typed letter in terms of selection squares (with 0 being the first selection square) is lowest for version 4, with 2.52 selection squares being crossed to type one letter. Versions 1 and 2 follow with 2.60 and 3.00 respectively, and the version where the most squares have to be travelled before selecting a letter is version 3 with 4.76 squares. The fewest mistakes were made with version 3, with 2.1% of the typed letters being a correctional backspace character. Versions 1, 2, and 4 of the spelling application show error rates of 3.6%, 4.6% and 3.2% respectively. The application that allowed for the highest cursor speed is version 2 with 287 ms per square, closely followed by version 1 with 299 ms per square. In versions 3 and 4 the cursor speed was 333 ms and 327 ms per square respectively.

In table 3.4 a summary of these results can be found.

The autofill function that automatically completes (parts of) words when no other options are available according to the dictionary on average saved 5.9% of the keystrokes that needed to be typed. In 65% of the cases only one letter was automatically added to the word (see figure 3.7). Although 4 participants did not particularly like the autofill option (a score of three or lower on the questionnaire, with one participant giving it the lowest possible score), most users thought it was a positive feature, with an average score of 4.7 out of 6 over all different versions.

Figure 3.7: The autofill mostly adds a single character to the word being typed, and saves the user from typing 5.9% of all the keystrokes.

The feedback given by the participants was generally positive, although several remarks were made on how to improve the application. See appendix C for a complete list of written feedback provided by the participants.

3.4.3 Discussion

The results of the questionnaire show that the users do not have a clear preference for the version of the application they liked to work with the most. But version 3 (alphabet without top selection) scores the lowest both on this measure and on the general impression question. Even though the users make few typing mistakes when using this spelling application, it has by far the lowest possible typing speed, and is therefore discarded as a potential typing application in an actual BCI setting. These results can be explained by the fact that no language model is being used to aid the selection of letters the user has to type. This means the cursor has to travel a longer distance to select the correct letter, and this takes extra time. Because the interface always remains the same, the users make few mistakes and operate the program very close to its maximum potential. In other words, there is hardly any room for improvement when the users are given more time to work with the application.

With the horizontal Hex-O-Spell representation, version 4, the users were able to type at almost the fastest rate of all applications, at 32.8 characters per minute. Even though the questionnaire showed the users enjoyed working with this application, the general impression scores were slightly lower than for versions 1 and 2. Version 4 shows that there is still some room for improvement, as the users operate this spelling application at 73% of the best possible selection performance. The speed of the cursor is slightly lower than for versions 1 and 2.

Overall, version 1 (letters ordered by shape with top selection) and version 2 (alphabetical ordering with top selection) show the highest scores on the questionnaire. The average typing speed slightly favours version 1, with 33.5 characters per minute over 33.0. Several users indicated they liked the selection box at the beginning containing the most likely letters, but the relatively low selection performance of 59% and 63% for versions 1 and 2 indicates that the selection box could be used more often. This indicates there is plenty of room for improvement, especially when the cursor slows down, as it likely will in an actual BCI setting, which gives the users more time to determine the target box.

With several hours of training with version 1, it was found that speeds of more than 55 characters per minute are possible. Because version 1 shows slightly more potential room for improvement compared to version 2, and also scores slightly higher both in the questionnaire results and in the number of characters per minute that can be typed, this version will be used as a basis for the BCI experiments described in the following chapter.

Chapter 4

BCI Experiments

The Hor-I-Spell application discussed in the previous chapter will now be tested in an online EEG-based Brain-Computer Interface setting. The experiments are performed in two phases which will be discussed in the following sections.

The first phase consists of developing and fine-tuning a calibration method.

The subjects perform several short calibration sessions for the online spelling task, in which they are instructed to imagine pushing and stopping a cursor moving across the screen, using two kinds of motor imagery to induce changes in the beta (and mu) frequency rhythms. Meanwhile, EEG signals are recorded along with markers indicating the target locations for the cursor, creating labelled examples of brain activity during the different mental tasks. The Hor-I-Spell application is already used in these sessions, but the program operates automatically.

The subjects do not yet have control over the application; they are instructed to imagine stopping the cursor in front of a marked target location containing the letter to type.

The data from the first phase of experiments are analysed manually in order to find a (semi-)automatic calibration procedure. This procedure will then be used in the second phase of the BCI experiments, where the subjects need to be in control of the cursor to type sentences. The goal is to let novice users complete a short (30-minute) motor imagery calibration session, after which they are able to freely type sentences in Hor-I-Spell using the best-fitting calibration model.

4.1 Calibration Experiments

Instead of using a separate, simple calibration task, such as randomised targets indicating which type of motor imagery the subjects have to perform (as used in [4]), it was hypothesised that using the actual application in the calibration experiment would provide generalisable calibration data. This way the participants can get used to the spelling application by seeing it in action, and the calibration task is also closer to the actual experiment and should therefore provide better training examples.

4.1.1 Calibration Hor-I-Spell

Some changes had to be made to the Hor-I-Spell application to allow it to be used in a calibration experiment. The application was altered so that markers are sent to the BCI2000 system; these encode the target location of the cursor, the direction of the cursor, and timing information indicating the start of a trial and the moment the cursor reaches the target. The markers are saved along with the unfiltered EEG data, providing a list of labelled examples that can be used to train a classifier using machine learning techniques.
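As an illustration of how such markers could turn a continuous recording into labelled examples, the following Python sketch assigns a label to every EEG sample. It is not the code used in this study: the `TrialMarker` record, its field names, and the `label_samples` function are hypothetical, and they do not correspond to BCI2000 state names. The sketch also assumes that samples between the trial start and the moment the cursor reaches the target count as "push", and that samples from that moment until the next trial start count as "stop".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TrialMarker:
    """Hypothetical marker record; the field names are illustrative only."""
    start_sample: int     # sample index at which the trial starts
    target_sample: int    # sample index at which the cursor reaches the target
    target_location: int  # index of the marked target location
    direction: str        # "left" or "right"


def label_samples(n_samples: int, markers: List[TrialMarker]) -> List[Optional[str]]:
    """Label each sample as 'push', 'stop', or None (outside any trial).

    Assumption: 'push' covers trial start up to the target marker, 'stop'
    covers the target marker up to the start of the next trial.
    """
    labels: List[Optional[str]] = [None] * n_samples
    ordered = sorted(markers, key=lambda m: m.start_sample)
    for i, marker in enumerate(ordered):
        # The stop period ends where the next trial starts (or at the end of the data).
        next_start = ordered[i + 1].start_sample if i + 1 < len(ordered) else n_samples
        for s in range(marker.start_sample, min(marker.target_sample, n_samples)):
            labels[s] = "push"
        for s in range(marker.target_sample, min(next_start, n_samples)):
            labels[s] = "stop"
    return labels


example_markers = [
    TrialMarker(start_sample=2, target_sample=5, target_location=3, direction="right"),
    TrialMarker(start_sample=8, target_sample=10, target_location=1, direction="left"),
]
example_labels = label_samples(12, example_markers)
```

In practice the labelled samples would be cut into windows and converted to spectral features before a classifier is trained; the sketch only shows the bookkeeping step.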

Hor-I-Spell was also changed to indicate the target location with a small square, allowing the subjects to prepare the motor imagery needed to stop the cursor in time instead of focusing on finding the next letter to type. For the duration of the calibration experiment the sentences are typed automatically, and a short pause of 2 seconds is added after the cursor stops and before the next trial begins. During this pause the screen remains the same, the cursor stays over the target location, and EEG data are still being recorded.

Figure 4.1: Changes to the Hor-I-Spell interface for the calibration phase. The target location containing the next letter to type is marked with a square. A small arc next to the cursor represents the hand of the user, who performs motor imagery by pushing the cursor towards the target location. Right before the target location, the users are instructed to stop the cursor with the right hand. Because the left and right trials alternate, the braking movement of the previous trial can be used as the pushing movement for the cursor in the next trial.
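The timing of the automatic calibration trials can be sketched as follows. This is a minimal illustration and not the original implementation: the function name `calibration_schedule`, the per-trial travel times, and the choice of "right" as the first direction are assumptions. Only the 2-second pause and the alternation between left and right trials are taken from the description above.

```python
from typing import List, Tuple

PAUSE_SECONDS = 2.0  # pause after the cursor stops, as described in the text


def calibration_schedule(
    travel_times: List[float], first_direction: str = "right"
) -> List[Tuple[float, float, str]]:
    """Return (trial_start, cursor_stop, direction) in seconds for each trial.

    travel_times holds the hypothetical time the cursor needs to reach the
    target in each trial; directions alternate from one trial to the next.
    """
    schedule = []
    t = 0.0
    direction = first_direction
    for travel in travel_times:
        stop = t + travel
        schedule.append((t, stop, direction))
        t = stop + PAUSE_SECONDS  # the next trial begins after the pause
        direction = "left" if direction == "right" else "right"
    return schedule


example_schedule = calibration_schedule([3.0, 1.5])
```

The interval between `cursor_stop` and the next `trial_start` is the pause during which the cursor rests over the target location while EEG data are still recorded.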

Other changes to the Hor-I-Spell interface are more subtle like adding a small