Early Recognition of Low Literacy from Tablet Data

Masters Thesis

Early Recognition of Low Literacy from Tablet Data

Author:

George-Viorel Vișniuc

Supervisor:

Dr. Maarten van Someren

Faculty of Science Masters Artificial Intelligence

Declaration of Authorship

I, George-Viorel Vișniuc, declare that this thesis titled ’Early Recognition of Low Literacy from Tablet Data’ and the work presented in it are my own. I confirm that:

• This work was done wholly or mainly while in candidature for a research degree at this University.

• Where any part of this thesis has previously been submitted for a degree or any other qualification at this University or any other institution, this has been clearly stated.

• Where I have consulted the published work of others, this is always clearly attributed.

• Where I have quoted from the work of others, the source is always given. With the exception of such quotations, this thesis is entirely my own work.

• I have acknowledged all main sources of help.

• Where the thesis is based on work done by myself jointly with others, I have made clear exactly what was done by others and what I have contributed myself.

Signed:

Date:

Abstract

Faculty of Science Masters Artificial Intelligence

Early Recognition of Low Literacy from Tablet Data by George-Viorel Vișniuc

The purpose of this research project is to identify the types of errors that students make when completing spelling exercises and to categorize them according to the cognitive ability they require. In the first phase of the project we classify the errors and seek to understand their root cause. In practical terms, this research allows specialists to diagnose spelling problems earlier, trace them back to their source and correct them.

Acknowledgements

Special thanks to the dedicated team at Leeruniek for supplying the data needed for this analysis and for providing specialist pedagogical insight.

. . .

Contents

Declaration of Authorship
Abstract
Acknowledgements
Contents
List of Figures
List of Tables

1 Introduction
  1.1 Background
  1.2 Hypothesis
  1.3 Terminology
    1.3.1 Spelling and Dyslexia
    1.3.2 Learning objectives
    1.3.3 Error definition
      1.3.3.1 Visual processing
      1.3.3.2 Phonological awareness deficiency
      1.3.3.3 Vocabulary and Category formation (rules)
      1.3.3.4 Error type formation
  1.4 Relation between learning objectives and error types
  1.5 Practical stages
  1.6 Research questions

2 Related work and current status
  2.1 History
  2.2 Intervention and modern approach
  2.3 Contribution

3 Dataset and Methodology
  3.1 Corpus structure
  3.2 Data cleaning
  3.3 Methodology and Technology
  3.4 Error classification
    3.4.1 Approach
    3.4.2 Accuracy calculation

4 Experiments and results
  4.1 Prior research and corpus analysis
    4.1.1 Word complexity
  4.2 What are the correlations/structure of learning objectives?
    4.2.1 Approach
    4.2.2 Setup
    4.2.3 Results
    4.2.4 Conclusion and comments
  4.3 How many misspelled words can overall be “explained” by the error types?
    4.3.1 Visual processing
    4.3.2 Phonological awareness deficiency
    4.3.3 Rules and Vocabulary
    4.3.4 Discussion
    4.3.5 Conclusion
  4.4 How “stable” are error types over time?
    4.4.1 Approach
      4.4.1.1 Time series
      4.4.1.2 Assumptions and preparation
      4.4.1.3 Measuring stability
    4.4.2 Results
  4.5 How predictable are errors of certain types from previous errors and how early?
    4.5.1 Approach
      4.5.1.1 Linear methods
      4.5.1.2 Results
      4.5.1.3 Non Linear methods
    4.5.2 Target set analysis

5 Conclusion and future work

A Learning objective list

List of Figures

1.1 Project flowchart
3.1 Number of errors in a single word
4.1 Accuracy and word length
4.2 Word length for error types
4.3 Word length distribution
4.4 Edit distance and number of characters
4.5 Learning objective correlation matrix
4.6 Learning objective correlation factor loading
4.7 Distribution of the visual processing errors
4.8 Visual processing errors based on location
4.9 Visual processing errors location and proportion
4.10 Distribution of the phonological errors
4.11 Phonological awareness errors based on location
4.12 Phonological awareness errors location and proportion
4.13 Phonological awareness errors number of syllables and proportion
4.14 Distribution of the vocabulary and rule errors
4.15 Vocabulary and rule errors based on location
4.16 Visual processing location distribution
4.17 Phonological location distribution
4.18 Rule and vocabulary location distribution
4.19 Error class distribution
4.20 Stability for all pupils
4.21 Prediction scores for the visual processing error type
4.22 Prediction scores for the phonological error type
4.23 Standard error comparison
4.24 Explained variance for the phonological error type using K-NN
4.25 Prediction scores for the rule and vocabulary error type
4.26 Rule and vocabulary scores
4.27 Visual processing scores
4.28 Phonological scores

List of Tables

1.1 Error example
1.2 Error pattern examples
4.1 Error class correlation in general
4.2 Error class correlation for target groups
4.3 Correlation between initial and final score
4.4 Regression results for target set
4.5 Regression results based on error rate
A.1 Learning objectives ta

Chapter 1

Introduction

1.1 Background

The use of tablet computers at primary schools makes it possible to collect data on pupil activities at a level of detail that was not possible before.

This opens up a wide range of possibilities for supporting teaching, such as the adaptive selection of teaching materials and tests. The data also open up many more possibilities, such as collecting and analyzing real-time information about pupils’ activities, for example spelling. In this project we focus on the early detection of spelling disabilities. The results are intended to make it possible to start interventions (ranging from extra exercises via remedial teaching to training by experts) at an early stage, preventing many future cognitive and emotional problems.

Literacy is the ability to read and write. The inability to do so is called illiteracy. Visual literacy also includes the ability to understand visual forms of communication such as body language, pictures, maps, and video. Evolving definitions of literacy often include all the symbol systems relevant to a particular community. Literacy encompasses a complex set of abilities to understand and use the dominant symbol systems of a culture for personal and community development. In a technological society, the concept of literacy is expanding to include the media and electronic text, in addition to alphabetic and number systems. These abilities vary in different social and cultural contexts according to need, demand and education. Deficiencies in literacy usually fall into several categories based on their severity. Low literacy, or functional illiteracy, occurs when reading and writing skills are inadequate to manage daily living and employment tasks that require reading skills beyond a basic level.

Dyslexia on the other hand is a brain-based type of learning disability that specifically impairs a person’s ability to read or write properly. These individuals typically read and write at levels significantly lower than expected despite having normal intelligence. Although the disorder varies from person to person, common characteristics among people with dyslexia are difficulty with phonological processing (the manipulation of sounds), spelling, and/or rapid visual-verbal responding.[1]

Dyslexia can manifest itself through several symptoms that have been medically validated; in some cases, people with dyslexia show unique symptoms for which they have developed remarkable compensation mechanisms. We suspect that the underlying deficiencies reside in cognitive abilities that have not fully developed because of environmental circumstances or other variables such as health issues [2].

Dyslexia is primarily associated with reading problems, but it can also affect writing, spelling and even speaking. Dysgraphia is associated with writing difficulties. Children with dysgraphia may struggle with handwriting, with organizing their thoughts on paper, or with both. Since the answers to the exercises that will be analyzed are not handwritten, we will not investigate dysgraphia further.

The symptoms of dyslexia are also visible in spelling tasks, but the analysis of spelling errors is complicated by the many factors that influence activities and performance (e.g. illness, the native language of the parents, reading habits) and by variation in curricula between schools (and teachers). Constructing a suitable data model is therefore quite challenging, given that the cause of the errors is not always the neurological disorder that we assume.

The goal is to formulate a strategy to improve literacy overall, to detect special cases such as dyslexic pupils, and to minimize the time between detecting an issue and treating it.

1.2 Hypothesis

Our assumptions are based on the idea that the literacy rate is directly related to the results from the exercises of a pupil. We take into account quantitative measures of the performance such as exercise accuracy and then try to define ways to recognize the possible causes of an error.

The main hypothesis is that the quality of the results and the nature of the errors are influenced by underlying cognitive processes reflected in the error patterns, their frequency and their behavior over time.

One key aspect of low literacy, and of the special case of dyslexia, is that it has not been successfully fitted within a framework that allows results to be replicated. This research seeks to verify current knowledge and to establish whether underlying patterns exist between neuro-cognitive features of the brain and their direct manifestation in writing and reading. For technical reasons, and because of the nature of the data at our disposal, we are only able to analyze written results derived from spelling exercises. During the experiments we answer several research questions and address the difficulties of low literacy through analysis of the data.

1.3 Terminology

1.3.1 Spelling and Dyslexia

In this section we explain how spelling, low literacy and its severe form (dyslexia) are related, and which spelling tasks are affected by the symptoms of low literacy. Spelling is difficult for many people, but there is much less research on spelling than there is on reading to tell us just how many people spell poorly or believe they spell poorly. Less is known about spelling competence in the general population than is known about reading achievement because there is no national test for spelling and many states do not test pupils’ spelling skills.

Almost all people with dyslexia, however, struggle with spelling and face serious obstacles in learning to cope with this aspect of their learning disability. The definition of dyslexia notes that individuals with dyslexia have “conspicuous problems” with spelling and writing, in spite of being capable in other areas and having a normal amount of classroom instruction. Many individuals with dyslexia learn to read fairly well, but difficulties with spelling (and handwriting) tend to persist throughout life, requiring instruction, accommodations, task modifications, and understanding from those who teach or work with the individual [3].

One common belief is that spelling problems stem from a poor visual memory for the sequences of letters in words. Recent research shows that a general kind of visual memory sometimes plays a relatively minor role in learning to spell but the visual memory problems of poor spellers are specific to memory for letters and words[4]. Spelling problems, like reading problems, originate from language learning weaknesses. Therefore, spelling reversals of easily confused letters such as b and d, or sequences of letters, such as “wnet” for “went” are manifestations of underlying language learning weaknesses rather than of a visually based problem[4].

If dyslexia is suspected, and the pupil is at the kindergarten or first-grade level, simple tests of phoneme awareness and letter naming can predict later spelling problems, just as they predict later reading problems. If a pupil is struggling to remember spelling words, a standardized test of spelling achievement with current national norms should be given to quantify just how serious the problem is. In addition, a spelling diagnostic test should be given to identify which sounds, syllable patterns, or meaningful parts the pupil does not understand or remember.

Spelling instruction that explores word structure, word origin, and word meaning is the most effective, even though pupils with dyslexia may still struggle with word recall [5]. Emphasizing memorization, by asking pupils to close their eyes and imagine the words or by asking them to write words multiple times until they “stick”, is only useful after pupils have been helped to understand why a word is spelled the way it is. Pupils who have learned the connections between speech sounds and written symbols, who perceive the recurring letter patterns in Dutch syllables, and who know about meaningful word parts are better at remembering whole words [6].

1.3.2 Learning objectives

School curricula are organized around learning objectives. For each learning objective, pupils are taught the material and then practice by doing exercises. Exercises come in different forms, such as single or multiple answers, filling in missing characters or writing a specific word. The answers are stored along with metadata such as the time taken, the number of tries, the Elo rating of the exercise and the length of its description. The Elo rating system is a method for calculating the relative skill levels of players in competitor-versus-competitor games such as chess. Adapted to spelling exercises, it is used to estimate the relative skill levels of the players, here the pupils [7].
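As an illustration of how an Elo rating can be applied to exercises, the following minimal sketch treats each answer as a "match" between a pupil and an exercise. It uses the standard chess formulation of Elo; the constants (a scale of 400 and K = 32), the function names and the idea of updating the exercise rating symmetrically are assumptions made for this example and are not taken from the system described in [7].

```python
def expected_score(pupil_rating, exercise_rating):
    """Probability of a correct answer under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((exercise_rating - pupil_rating) / 400.0))


def update_ratings(pupil_rating, exercise_rating, correct, k=32.0):
    """Return the new (pupil, exercise) ratings after a single answer.

    Assumption: the exercise plays the role of the opponent, so it loses
    exactly the rating points that the pupil gains, and vice versa.
    """
    outcome = 1.0 if correct else 0.0
    delta = k * (outcome - expected_score(pupil_rating, exercise_rating))
    return pupil_rating + delta, exercise_rating - delta


# A pupil and an exercise with equal ratings: a correct answer is worth k / 2.
new_pupil, new_exercise = update_ratings(1500.0, 1500.0, correct=True)
```

With this convention, an exercise that many pupils answer incorrectly drifts towards a higher rating, which gives a rough measure of its difficulty.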

A learning objective describes what pupils should know or be able to do at the end of the course that they could not do before.

Learning objectives are built around a certain set of rules, while also increasing the pupil’s vocabulary size. This allows us to track the effects of certain errors in future learning objectives. Such information helps determine whether pupils have not understood previous learning objectives or are struggling with the current one, making it possible to pinpoint recurrent issues before they become a permanent gap in the pupil’s education.

1.3.3 Error definition

Each learning objective can be characterized by a certain set of words that is used to teach a concept. For example, a learning objective can be that long vowels are written differently at the end of a syllable. Words that exhibit this property are then used as exercises. This gives rise both to specific types of errors that can only occur in this set of words and to general types of errors, such as typos or single-letter operations like replacing or omitting a letter.

A spelling error is an error in the conventionally accepted form of spelling a word. This is revealed after we compare the answer given by the pupil and the expected correct answer.

As a precaution, we tag an answer as erroneous only after cleaning the data and ensuring that typos caused by the hardware configuration, as well as deliberately incorrect answers, are not taken into consideration.

It is important to have a sufficient number of error categories that reflect the structure of the errors and, hopefully, their cause(s).

These errors index the maturity and specificity of the developing orthographic lexicon. A preponderance of orthographic errors would be consistent with difficulties in memorizing information relevant for particular items. In contrast, phonological or grammatical errors reflect difficulty in employing mechanisms that apply over large classes of items. Table 1.1 shows examples of the type of possible spelling errors that can occur in the results of the exercises.

Correct form    Erroneous form    Example
nm              mn                onmiddellijk - omniddellijk
v               f                 vouwfiets - fouwfiets
p               b                 pizza - bizza
ou              au                stout - staut

Table 1.1: Error example

In the first and third examples we see an error type in which visually similar letters are transposed or confused. In the second and fourth examples the error occurs because of the phonetic similarity of the letters. Note that p and b in the third example can cause both visual and phonetic errors.

Dutch has a moderately transparent orthography, in between German (transparent) and English (opaque), and a complex syllable structure [8]. Dutch spelling is influenced by four different linguistic domains, namely phonology, morphology, etymology, and Dutch syllable structure [9].

The idea that the presence of specific errors can be the result of certain cognitive processes has been hypothesized in the past [10], but given the state of technology at the time and the small number of pupils, no experiments were conducted using non-invasive techniques.

The spelling classifier built in this study broadly distinguishes between three error types, namely:

• Klankzuiver (i.e. orthographically transparent) and Bijna-klankzuiver (orthographically semi-transparent; in other words, letter clusters that have a fixed pronunciation which is not identical to the pronunciation of their individual sounds (e.g., bank), or simply letters that correspond to phonemes that are difficult to recognize or distinguish from similar sounds in the language (e.g., f/v)).

• Regelwoorden (orthographically transparent words, which are pronounced in a way that can be derived from specific Dutch spelling rules). An example of such a word is boom - bomen, where the rule for forming the plural has to be recalled.

• Weetwoorden (orthographically transparent words that are completely irregular). Examples of such words are alcohol and karamel, which are exceptions whose spelling must be memorized as is.

These different categories relate to the different strategies that need to be applied in order to spell words from these categories correctly (see Huizenga, 1997, for a more detailed discussion). Words from category 1 can be spelled correctly by applying phonological strategies, in other words by applying rules that relate to the sounds and sound structures of Dutch and their direct relation to Dutch orthography. Visual strategies, on the other hand, can be applied to retrieve the spellings of words belonging to category 3. Finally, rule-based strategies may be applied to deduce the spelling of words in category 2.

Using these error types we try to identify which neuro-cognitive process is the likely leading cause of an error. There are different methods of acquiring, storing and recalling information [11].

1.3.3.1 Visual processing

A visual processing disorder can cause difficulty in seeing the difference between two similar letters, shapes, or objects, or in noticing the similarities and differences between certain colors, shapes and patterns. Such a disorder can explain why a pupil may have trouble with learning.

Although exceptions can exist when the pupil has actual medical problems with sensory perception, this method can still be useful for pointing out anomalies. Common patterns associated with this deficiency are:

• Seeing lines of text merge together. We attribute this to the length of the word, and therefore check whether the pupil has difficulties with words that have many characters.

• Letters missing at the beginning or end of words.

• Letters missing in the middle of words.

• Parts of letters missing in a horizontal manner, such as all letters missing their top, middle or bottom.

• Transpositions and reversals of letters.

• Letters are present but not in their proper order (fv, mn, bd).

• Letters appear as their mirror image (pq, dt, pb, pd).
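Two of the patterns listed above, letters in the wrong order and mirror-image confusions, lend themselves to a simple automatic check. The sketch below is only an illustration: the function names are hypothetical, the letter pairs are copied from the list above, and treating each pair as unordered (p typed as b counts the same as b typed as p) is an assumption.

```python
# Letter pairs copied from the list above; treating them as unordered is an assumption.
ORDER_PAIRS = {frozenset("fv"), frozenset("mn"), frozenset("bd")}
MIRROR_PAIRS = {frozenset("pq"), frozenset("dt"), frozenset("pb"), frozenset("pd")}


def adjacent_transposition(correct, answer):
    """Return the two swapped letters (in their correct order) if `answer`
    equals `correct` with exactly one adjacent pair swapped, else None."""
    if len(correct) != len(answer):
        return None
    diff = [i for i, (c, a) in enumerate(zip(correct, answer)) if c != a]
    if len(diff) == 2 and diff[1] == diff[0] + 1:
        i, j = diff
        if correct[i] == answer[j] and correct[j] == answer[i]:
            return correct[i] + correct[j]
    return None


def mirror_substitutions(correct, answer):
    """List (expected, typed) letters that form a known mirror pair.
    Only words of equal length are compared position by position."""
    if len(correct) != len(answer):
        return []
    return [(c, a) for c, a in zip(correct, answer)
            if c != a and frozenset((c, a)) in MIRROR_PAIRS]


pair = adjacent_transposition("onmiddellijk", "omniddellijk")
is_known_order_error = pair is not None and frozenset(pair) in ORDER_PAIRS
```

A position-by-position comparison is deliberately naive: it cannot handle answers with missing or extra letters, for which an edit-script approach is more suitable.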

1.3.3.2 Phonological awareness deficiency

Phonological errors are defined as spelling errors that affect the pronunciation of the word, altering its phonological identity. Phonological errors are thought to reflect difficulty in representations and processes that are not specific to words and are ostensibly independent of lexical knowledge.

Auditory processing problems prevent pupils from hearing all the individual sounds in a word. As a result, they do not read by sounding out words.

Instead, they use alternative strategies: context clues (pictures and a predictable or familiar story), the shapes of words, and guessing based on the first letter or two. But their memories can hold only a limited number of words, so these strategies will fail them by third or fourth grade. Without the right type of help, they cannot progress any further.

Pupils with auditory dyslexia have difficulty processing the sounds of letters or groups of letters. Multiple sounds may be fused into a single sound. For example, the word ’back’ will be heard as a single sound rather than as something made up of the sounds ’b’ - ’aa’ - ’ck’. Alternatively, sounds may be reversed or jumbled, with the constituent parts not heard correctly, as in ’Kershmal’ instead of ’commercial’ or the classic ’pasghetti’ instead of ’spaghetti’.

The ear of a child with auditory dyslexia captures sound just fine, but the brain processes the input differently or less accurately. It is still a good idea to have the ears and eyes of a struggling reader tested by professionals as part of assessing any severe reading problem.

The auditory problem is related to the inability to use phonemes, although it is not clear exactly how. This may explain why a phonics program alone is inadequate for helping a dyslexic child: dyslexic children must be able to discern the sounds of language accurately before they can accurately break words up into syllable chunks.

There is much more research evidence that dyslexia is an auditory processing problem rather than a visual one, but like all complex things, there are usually multiple causes. Vision, ability to focus attention and other capacities almost certainly play a role.

1.3.3.3 Vocabulary and Category formation (rules)

This type of symptom is heavily influenced by memory processes. Individuals with dyslexia often have excellent long-term memory for experiences, locations, and faces, but poor memory for sequences, facts and information that has not been experienced. They think primarily in images and feelings rather than in sounds or words (little internal dialogue).

Individuals with dyslexia often have difficulty in understanding the sound structures of words. In particular, they struggle with a skill known as segmentation and blending: breaking up words into smaller segments (e.g., c- and -at) and putting them together. One explanation is that individuals with dyslexia have poor working memory, so they struggle to hold all the sound segments in their head while performing a spoonerism task (exchanging the first sounds of two words). Keeping two words active in our mental post-it note while trying to exchange their first letters proves much too difficult for most dyslexics.

So much of language learning relies on working memory. When we learn new words, we have to remember each sound segment, put the segments together, learn the meaning, and finally remember what the word looks like for future use. Someone with poor working memory, like a person with dyslexia, struggles because they simply do not have a big enough mental post-it note (working memory) to cope with all these steps.

1.3.3.4 Error type formation

Spelling is defined as the process or activity of writing or naming the letters of a word in the correct order. The task is to recall a word from memory and type the corresponding characters.

An error is defined as a sequence of operations between letters that describe the difference between the correct answer and the incorrect one. In order to classify an error an error pattern must be constructed.

The character operations we use are (-) displacement, where a letter of the correct word is missing from the answer, and (+) addition, where a letter appears in the answer that is not in the correct word. The (=) operator indicates no modification. When adding a new error type to the set of known ones, we therefore create a regular expression pattern that looks for these operations.

Examples can be seen in Table 1.2.

Pattern    Correct    Wrong
+b-p       pizza      bizza
=d-t$      wordt      word

Table 1.2: Error pattern examples

The first entry shows the letter p being replaced by the letter b. The second entry shows another rule, in which the position of the letters is taken into account: (t$) indicates that the operation happens at the end of the word. Using this method we are able to create, modify and expand the error types as we wish, which makes the approach applicable to any language.

The classifier that we have built is based upon the concepts taught in the learning objectives (Table A.1). For each of the categories we extracted the set of rules and the types of error most likely to occur in the learning objectives. The classifier started off with a general set of rules that can be found in any of the exercises of the learning objectives, such as single-letter transpositions (p - d), letters that have been doubled (o - oo) or double letters that have been reduced to a single letter (ss - s). Then, for each category discussed in Section 1.3.3, specific rules were formulated that were deemed relevant by the specialists on our team. For example, words containing -aai- can be misspelled as -aaij-, -aaj-, -aay-, -ai-, -aj-, -ay-, -aa-, -a-.
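The pattern notation of Table 1.2 can be sketched as follows. This is a minimal illustration, not the classifier built for the thesis: the edit script is derived with Python's difflib, the function names and the labels in the pattern table are hypothetical, and only the two regular expressions come from Table 1.2 (with the + sign escaped, since it is a special character in regular expressions).

```python
import re
from difflib import SequenceMatcher


def edit_script(correct, wrong):
    """Encode the difference between two spellings as =x, -x and +x operations."""
    ops = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, correct, wrong).get_opcodes():
        if tag == "equal":
            ops += ["=" + ch for ch in correct[i1:i2]]
        else:
            # For a replacement, list the added letters first and the dropped
            # letters second, so that pizza -> bizza starts with "+b-p".
            ops += ["+" + ch for ch in wrong[j1:j2]]
            ops += ["-" + ch for ch in correct[i1:i2]]
    return "".join(ops)


# Hypothetical pattern table: the regular expressions follow Table 1.2,
# the labels are illustrative only.
ERROR_PATTERNS = {
    "p written as b": re.compile(r"\+b-p"),
    "final t dropped after d": re.compile(r"=d-t$"),
}


def classify(correct, wrong):
    """Return the labels of all patterns that occur in the edit script."""
    script = edit_script(correct, wrong)
    return [label for label, pattern in ERROR_PATTERNS.items()
            if pattern.search(script)]


labels = classify("wordt", "word")
```

Because every pattern is an ordinary regular expression over the edit script, adding a new error type only requires adding an entry to the table, which matches the extensibility described above.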

1.4 Relation between learning objectives and error types

Learning objectives are designed, in most cases, to focus on and teach a single concept. This concept can also build upon knowledge from previous learning objectives. Because of this property, the exercises that make up a learning objective are densely packed with words chosen to build up that learning objective.

The error types are designed in accordance with the learning objectives, in the sense that they cover error patterns that can be seen in the exercises of a learning objective. The relation becomes obvious when we consider that a learning objective contains a set of exercises that increases the probability of an error pattern being found in the set of words used. We can therefore say that most of the error types are inspired by the structure of the learning objectives.

The difference between the learning objectives and the error types lies in the fact that multiple errors, of different types, can be found in the exercises of a single learning objective. This raises the question of why we use error types instead of simply using the learning objective results as an indicator of performance. Although the learning objectives summarize the activity and performance of a pupil quite well, we argue that they do not show the full story: more often than not the errors overlap, in the sense that within one exercise we find errors that can be traced to multiple learning objectives.

Using the error types is a more fundamental and detailed approach that allows us to trace and show exactly what type of errors a pupil is making, instead of trying to work out which learning objectives, or parts of them, the pupil has not understood correctly.

The practical value of this concept is that knowing exactly what type of errors are being made allows specific treatment of the problem, greatly reducing the time it takes both to identify the problem and to treat it.

1.5 Practical stages

Figure 1.1 presents the flow of the steps that will be taken to reach our diagnostic goal.

We can see the individual steps and the decisions that need to be taken when certain thresholds are exceeded or a rule applies. The decision flow has been designed with an actual future implementation by the company that provided the data in mind. Initially we acquire the data, perform sanitation if needed and alter the structure of the model by adding or removing types of error or remapping them to other cognitive processes.

The first part handles and prepares the incoming data. The acquisition of the data is not in our scope, but it represents the starting point for new iterations of the project, which are meant to improve the quality of the results.

If the features prove stable enough to use for predictive purposes, or for statistical analysis in general, we will try to see whether any of the pupils present future risks based on their current performance. If this is not possible because of low prediction scores, we either wait for new data to come in or decide, based on the severity of the situation, whether or not to apply treatment. The steps should be part of a feedback loop that allows the parameters to be tuned based on the results.

This decision model implies human involvement to approve certain decisions as we would like to avoid situations where penalties for mistreatment are high. The decision model in our case needs to provide feedback in the form of results that demonstrate improvement or decrease in accuracy for a pupil.

Thus, given a set of decisions or recommendations that the end user will put into application, the results will be reflected in the pupil’s performance. The feedback will allow the system to tune the parameters in order to provide the best decision.

1.6 Research questions

From a practical point of view we want to replicate the diagnostic procedure that is being used to deal with spelling problems. Replicating such a procedure in a semi-automatic way, which would still require human intervention, would represent a great improvement to the educational system.

1) The first step is to identify the possible causes of spelling errors. This is addressed through the following research questions:

• How many errors belong to the three types described in Section 1.3.3?

We assume that there is an underlying structure between the error types and a learning objective. By structure we mean the statistical relationship between the results of two or more learning objectives.

• What is the correlation/structure of learning objectives?
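To make the notion of a statistical relationship between learning objectives concrete, the sketch below computes pairwise Pearson correlations between per-pupil accuracies. It is only an illustration: the function names, the choice of Pearson correlation and the toy numbers are assumptions for this example and do not come from the thesis data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(scores):
    """`scores` maps a learning objective to a list of per-pupil accuracies,
    with the pupils listed in the same order for every objective."""
    names = sorted(scores)
    return {a: {b: pearson(scores[a], scores[b]) for b in names} for a in names}


# Invented accuracies for four pupils on three hypothetical learning objectives.
toy_scores = {
    "LO-1": [0.2, 0.4, 0.6, 0.8],
    "LO-2": [0.1, 0.2, 0.3, 0.4],
    "LO-3": [0.8, 0.6, 0.4, 0.2],
}
matrix = correlation_matrix(toy_scores)
```

In this toy data LO-1 and LO-2 rise and fall together across pupils, while LO-3 moves in the opposite direction, so the matrix contains values close to 1 and -1 respectively.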

2) The second step builds on the answers to the first research questions, which allow us to understand the root cause of spelling errors. We then go a step further and verify how well we can detect and predict possible candidates with recurrent deficiencies.

• How predictable are errors of certain types from previous errors?

  a. How early can they be predicted?

In order to predict future performance we will need to measure variables that present stability. By stability we imply the presence of recurrent patterns over time with as little variance as possible. Determining whether the error types are stable over time enables us to validate the error types and their relationship to the cognitive processes. Based on this assumption we expect to produce satisfactory prediction results that can be used for early assessment of pupil performance.
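One simple way to quantify "as little variance as possible" is to split a pupil's chronological answers into fixed-size windows and look at how much the error rate varies between windows. The sketch below is a rough proxy under that assumption; the function names and the window size are hypothetical, and it is not necessarily the stability measure used in the experiments.

```python
from statistics import pvariance


def window_error_rates(flags, window):
    """`flags` is a chronological list with 1 for an answer containing an error
    of the given type and 0 otherwise. Returns the error rate of each full
    window; a trailing partial window is ignored."""
    return [sum(flags[i:i + window]) / window
            for i in range(0, len(flags) - window + 1, window)]


def stability(flags, window):
    """Variance of the per-window error rate; lower values mean more stable."""
    return pvariance(window_error_rates(flags, window))


steady = [1, 0, 1, 0, 1, 0, 1, 0]      # same error rate in every window
shifting = [1, 1, 1, 1, 0, 0, 0, 0]    # errors concentrated at the start
steady_score = stability(steady, window=4)
shifting_score = stability(shifting, window=4)
```

Both toy pupils have the same overall error rate, but only the first one shows the recurrent pattern over time that stability is meant to capture.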

Chapter 2

Related work and current status

2.1 History

Identified by Oswald Berkhan in 1881 the term ’dyslexia’ was later coined in 1887 by Rudolf Berlin, an ophthalmologist practicing in Stuttgart, Germany. He used the term to refer to a case of a young boy who had a severe impairment in learning to read and write in spite of showing typical intellectual and physical abilities in all other respects.[12][13][14]

The first documented case of dyslexia treatment was described in Temple and Marshall (1983), where the subject manifested an estimated 7-year discrepancy in reading skills, also described as phonological dyslexia. This type of disability hinders one’s ability to read new or long words. Following this concept, Coltheart, Masterson, Byng, Prior, and Riddoch (1983) investigated reading performance on regular and irregular words, which revealed regularization disorders when subjects were confronted with new words or nonwords.[1][15][16][17]

Regarding the cognitive approach, Snowling, Stackhouse and Rack (1986) used Frith’s (1985) model of literacy acquisition to determine the relation between reading difficulties and cognitive features. Results have shown that the majority of subjects were halted in a literacy state that did not allow them to distinguish letters, thus making them prone to visual errors.[18] Phonological awareness is concluded to be influenced as early as the period of alphabetic skill acquisition. During this period the brain maps the printed form of words to the sound of the spoken word.[11]

Other forms of dyslexia were discovered, such as the one affecting the visual perception of words described in (Boder, 1973), but in most cases the findings were not replicated successfully due to the cognitive process that implicates visual processing. Recent studies have also considered behavioural and genetic components as contributing factors in developing phonological or surface dyslexia.[19]

2.2 Intervention and modern approach

Standard techniques developed to address pupils with difficulties are usually put into practice after the pupils have expressed severe or recurrent difficulties in understanding certain concepts or rules. The period between detection and action is arguably too long in most cases since it only delays taking effective measures that would seek to take care of the present deficiencies.

The treatment in such cases is tailored to the severity of the symptoms. Less invasive steps consist of simply placing emphasis on specific problems. For problems that are not superficial, standardized tests are required to establish what type of treatment the pupil must undergo. In severe cases the pupil can be removed from class and given a more intensive treatment. One of the problems with most of these procedures is their invasive approach, which may have unwanted effects given the pupil’s discontinuous presence in the classroom.

The modern approach to dealing with these issues is mainly developed in digital form, as aids that facilitate reading and spelling.

In [11] the focus is on the differences and relations between reading and writing performances. It is revealed that information is acquired via several routes, such as the visual or the auditory route.

2.3 Contribution

Unlike standard techniques that involve direct interaction with the pupil, the statistical nature of this analysis allows us to extract and interpret the information resulting from the exercises and possibly adjust the flow of information given to the pupils. The analysis is not invasive, since in some cases the pupil does not need to be exposed to treatment at all. This type of setup allows lecturers as well as experts in the field of psychology to have both an overview of the performance of the pupil and a detailed view of the development over time of the components that reflect the pupil’s strengths and difficulties.

Another novelty of this approach is the analysis method, which categorizes the errors based on their underlying cognitive features. The assumption is that these categories, besides giving a more informative overview of the deficiencies and overall progress of a pupil, also describe the fundamental processes responsible for spelling.

In order to do this we develop the instruments needed to perform the analysis. The non-invasive procedure is one of the novelties: pupils are not aware that they are being measured directly, which eliminates several factors that induce stress and cause mistakes.

Another contribution consists in conceptualizing the methodology for error classification based on their possible relation to the underlying cognitive processes. With the help of experts in the field of child psychology and native Dutch speakers we have been able to compile a comprehensive list of error patterns found in real life that describes difficulties of certain types when recurrent and stable in time.

As we will see in later chapters we use several modern techniques to interpret the results. Overall the project will model a process that facilitates decision making in a multitude of scenarios.

Chapter 3

Dataset and Methodology

3.1 Corpus structure

This section will offer insight with respect to the type of data available in the corpus, the structure and properties that make the analysis possible.

The current data covers 700 pupils, of which 587 are usable. The remaining 113 pupils either have insufficient data for a meaningful analysis or have results that do not belong to the Spelling learning objectives. Insufficient data can also be the result of absences from the classroom.

The pupils belong to 41 groups (classes) from 26 schools with an average number of 130 unique exercises per pupil. The relevant exercises are of course the ones belonging to the spelling learning objectives.

The corpus contains 1488340 entries that cover 16650 individual spelling exercises belonging to 32 learning objectives (520 individual exercises and 46510 corpus entries per learning objective on average). This means that in a single month a pupil does on average 2.66 learning objectives or 1383 exercises. We can see this data in table A.1. An exercise contains a number of important fields.

• unique identifier

• learning objective identifier

• correct answer

The total number of incorrect answers in the corpus is 409681, which represents approximately 27.5% of the corpus entries.


All fields are annotated with extra data regarding the answer given for an exercise, the time it took and whether the answer was correct.

3.2 Data cleaning

This step was performed for practical reasons related to the structure of the data. The purpose was to reduce the number of JOINs between tables when searching for data, and thus to speed up queries. Another practical aspect accomplished by this step is determining which set of pupils can be used for the experiments.

The data cleaning step consists of 2 parts:

• Creating structure; arrange the data such that it facilitates aggregation, search and makes data manipulation easier to handle in general.

• Cleaning; the strings that will be subjected to analysis are cleared of errors caused by software or hardware faults.

After sanitation, part of the corpus is sparse: the non-usable results removed from some pupils leave too little data, which excludes those pupils from the analysis step. The analysis requires homogeneous data in order to be valid and to be used for training and classification purposes.

In practical terms we created tables that contain information previously retrieved by joining other tables. The cleaning process eliminated special characters, converted accented vowels to their Latin equivalent (e.g. à → a) and removed spaces introduced by mistake (e.g. tre in → trein). Pupils who were absent from school and did not complete all the learning objectives had to be removed.
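The string cleaning described above can be sketched as follows. This is a minimal illustration and not the code used in the project: the function name clean_answer is hypothetical, and we assume that "special characters" means anything that is not a letter or a digit.

```python
import unicodedata


def clean_answer(raw: str) -> str:
    """Strip accents, special characters and stray spaces from an answer."""
    # Decompose accented characters, e.g. "à" becomes "a" + combining accent
    decomposed = unicodedata.normalize("NFKD", raw)
    # Drop the combining marks so that only the base letters remain
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Assumption: keep letters and digits only; this also removes spaces
    # introduced by mistake, as in "tre in"
    return "".join(ch for ch in base if ch.isalnum())


print(clean_answer("tre in"))
print(clean_answer("à"))
```

Applied to the two examples from the text, the function maps "tre in" to "trein" and "à" to "a".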

3.3 Methodology and Technology

Having defined which cognitive features can be inferred from the data, we seek to define the properties that characterize each feature. These properties represent measurements of certain elements in the data that are influenced by how developed a cognitive feature is. By matching the description of deficiencies found in specialist literature with the pupil data we hope to conclude the presence of such a deficiency. In order to perform data analysis and to run the experiments we made use of the following tools:


• For storage we use the format of the dataset handed to us, a relational database with MySQL as the database management system.

• To process the data we consider Python an excellent choice, given its flexibility in handling strings and dictionaries. The machine learning algorithms are implemented using the Scikit package.

• For visualizing the data we make use of Google Charts to create clear plots.

3.4 Error classification

After analysing the structure of the exercises and the type of words that define them, we establish a set of possible errors that might occur with higher frequency in a learning objective. The error assignment relates directly to factors such as word complexity. Word complexity describes the composition of a word and the characteristic difficulties that make it harder or easier to grasp. Such characteristics can be the length of the word or the number of vowel groups or consonants.

3.4.1 Approach

For this task we have created a classifier that takes into account word features and tries to match the errors found in a word to a list of known, structured error patterns. Each error pattern represents a sequence of operations that need to be applied to a letter or group of letters to obtain an error. This is achieved with regular expressions, which also allow us to check the position of the errors in the word.

The classifier takes as input a pair of strings that are initially compared to see if there are any differences before proceeding further.
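As an illustration of pattern matching with regular expressions, the sketch below checks whether a wrong answer can be obtained from the correct word by applying one known pattern at one position. The pattern list, the labels and the function name are hypothetical; only the confusion of p with b, d or q is taken from the visual processing examples in this thesis. The sketch handles a single error per word, whereas the actual classifier also deals with multiple errors.

```python
import re

# Hypothetical pattern list: (label, error type, regex on the correct word,
# letters written instead of the matched group)
ERROR_PATTERNS = [
    ("p_as_b", "visual processing", r"p", "b"),
    ("p_as_d", "visual processing", r"p", "d"),
    ("p_as_q", "visual processing", r"p", "q"),
]


def match_patterns(correct, given):
    """Return (label, error type, position) for every pattern that explains
    the given answer as a single modification of the correct word."""
    if correct == given:
        return []  # identical strings: nothing to classify
    matches = []
    for label, error_type, regex, replacement in ERROR_PATTERNS:
        for m in re.finditer(regex, correct):
            # Apply the pattern at this position only and compare
            candidate = correct[:m.start()] + replacement + correct[m.end():]
            if candidate == given:
                matches.append((label, error_type, m.start()))
    return matches


print(match_patterns("paard", "baard"))
```

The position returned with each match is the index in the correct word where the pattern was applied, which is what makes a later analysis by location possible.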

The first criterion for being classified is that the edit distance between the correct and the submitted answer should be smaller than the length of the correct answer itself. Otherwise the word has changed so much that it is either a totally different word or the displacement of characters can no longer be tracked. We do not classify this type of error for two reasons: first, the percentage of errors that have this property is low (0.0032% of the total number of errors); second, mapping such errors is not practical, given that they can have random letter structures.


If differences are found we analyze the output in the form of the operations needed for the correct string to take the same form as the given answer. The differences, defined as a set of operations, are checked against the collection described in 1.3.3.
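One way to obtain such a set of operations is sketched below with SequenceMatcher from the Python standard library. This is a stand-in chosen for illustration and is not necessarily how our classifier derives its operations; in particular, SequenceMatcher does not guarantee a minimal number of edits. The function name edit_operations is hypothetical.

```python
from difflib import SequenceMatcher


def edit_operations(correct, given):
    """List the operations that turn the correct word into the given answer.

    Each operation is (tag, position in correct word, removed, written),
    where tag is "replace", "delete" or "insert".
    """
    matcher = SequenceMatcher(a=correct, b=given, autojunk=False)
    operations = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # skip the parts that were spelled correctly
            operations.append((tag, i1, correct[i1:i2], given[j1:j2]))
    return operations


print(edit_operations("paard", "baard"))
print(edit_operations("boom", "bom"))
```

Each resulting tuple can then be compared against the known error patterns, using the position to tell where in the word the modification happened.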

The classification of the corpus is the step in our research where we collect all the wrong answers and try to categorize them using techniques that look at the operations at character level (displacement, addition) and translate them into categories that reflect certain deficiencies, such as visual processing or the failure to understand a rule that makes up a concept.

This experiment will help consolidate the validity of the error types created. To this end we analyze the results of pupils that are facing difficulties in one of the categories and check whether errors that belong to the same category are found together.

Out of the total number of errors in the corpus (409681), 33.85% (138682) have been classified. The relatively low classification coverage is due to the fact that a large portion of the observed errors cannot be fitted into a category, since they occur rarely and are particular to individuals. In other instances the errors were made intentionally, such as ’1234’, ’asdf’, ’qwerty’ or smiley faces ( :), :D ). The classification process consisted of retrieving all the answers in the corpus that were different from the correct answer and checking whether the error pattern that was found matched any of the error patterns (types) defined by us. In this process we take into account the fact that a word can contain multiple errors. The distribution of the number of errors per word can be seen in figure 3.1.

3.4.2 Accuracy calculation

Because of the diversity of the words used in the exercises of the learning objectives, the probability of making an error is different for each learning objective. Therefore the weight of an error varies based on the frequency of the words that present the opportunity for an error to occur. This opportunity is given by the presence of certain characters that can be misspelled.

To develop this idea further we take the following example. We are trying to analyze the stability of the errors that belong to the visual processing category, such as the exchange of p with the letter b. The number of occurrences of words that contain the letter p depends neither on the concept that is being taught in the learning objective nor on the number of exercises in it.


Figure 3.1: Number of errors in a single word

To address this issue we use learning objectives as delimiters and measure the accuracy for an error class over time. With this procedure we can view the behaviour in committing errors after every learning objective.

In practical terms we build a dictionary of counts for each entry of our error classes as we loop through the words that compose the exercises of a learning objective.

Thus, given a learning objective l, we would like to know the probability of an error occurring given an error pattern e. This is given by the number of errors derived from pattern e divided by the number of occurrences of the words that contain the pattern in the learning objective.

Let E(x) be the number of answers among x1, . . . , xn that contain pattern e and are not correct. Let L(w) be the number of words that belong to learning objective l and contain the error pattern e. Then

\[ p(e \mid l) = \frac{E(x)}{L(w)} \tag{3.1} \]
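The dictionary of counts and the resulting ratio can be sketched as follows for a single learning objective. The data layout is an assumption made for the example: every attempt is a pair of the correct word and the set of pattern labels found in the answer, and the opportunity for an error is the presence of a given letter group in the correct word. The labels and the function name are hypothetical.

```python
from collections import Counter


def pattern_error_rates(attempts, opportunities):
    """Estimate p(e | l) for every error pattern e in one learning objective.

    attempts: list of (correct_word, observed_patterns) pairs
    opportunities: dict mapping a pattern label to the letter group that
        must occur in the correct word for the error to be possible
    """
    errors = Counter()   # wrong answers that show pattern e
    chances = Counter()  # word occurrences that contain pattern e
    for correct_word, observed in attempts:
        for label, fragment in opportunities.items():
            if fragment in correct_word:
                chances[label] += 1
                if label in observed:
                    errors[label] += 1
    # Patterns without any opportunity in this objective are left out
    return {label: errors[label] / chances[label] for label in chances}


attempts = [("paard", {"p_as_b"}), ("pen", set()),
            ("lamp", set()), ("boom", set())]
rates = pattern_error_rates(attempts, {"p_as_b": "p"})
print(rates)
```

In this toy example three of the four words offer the opportunity to confuse p with b and one of them is actually misspelled in that way, so the estimate is one third.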


Chapter 4

Experiments and results

The goal of this chapter is to answer the research questions asked in Chapter 1. Following the flowchart presented in figure 1.1, we handle the analysis, the interpretation and the decision processes given the results of our experiments.

4.1 Prior research and corpus analysis

In this section we analyze the results of preliminary experiments to understand the contributing factors to error making, and how they correlate and influence each other. These factors, which are inferred from the results of the exercises, are relevant to understanding the underlying cause of low literacy and can help to better explain the nature of the spelling errors and how they influence pupil performance over time. This section constitutes prior research because it studies the effects of word complexity in general, without regard to the results of the classification process.

The purpose is to show what we assume is a normal relation between accuracy and word complexity, where the probability of making an error increases with word length and its complexity.

4.1.1 Word complexity

This section covers the results for word length analysis where we measure accuracy depending on word length (number of characters) and edit distance.

In this section we investigate the effect of the length of the word for all error types. Given a list of words, word length generally influences one’s ability to recall the longer items. Short-term preservation is impacted by the time needed to practice the words.

We seek to verify that long words have a negative effect on performance for all types, especially for the Rules and Vocabulary type, where it becomes more difficult to detect whether a rule was applied or to memorize a word that has a high complexity (compound words, low consonant to vowel ratio, etc.).

The word length effect represents a key feature that most current formal models of the immediate serial recall task attempt to explain. Consequently, the word length effect is seen very much as a crucial aspect of theorizing about short-term memory recall[20].

Word length effects are not a defining characteristic of short-term recall, but reflect differential processing that is independent of retention interval[21].

To test this we group all the words in the corpus by their character count and count the errors in each group.

Complex (long) words have more features to assemble at output than short words and as such there is a greater probability of an assembly error. Thus, from this perspective the word length effect is not attributable to decay, instead it is attributable to a form of interference.

To answer this research question we perform 2 experiments, each analysing a different aspect of word complexity:

• Change in accuracy given the number of characters

• Change in accuracy given the number of syllables

The accuracy is calculated in the following way.

Given all the words of length l in the corpus, we compute the ratio between the number of correct answers C and the total number of answers W. Later on we apply the same principle, grouped by either the length of the word or the number of syllables.

\[ A(l) = \frac{C}{W} \tag{4.1} \]

The following experiment measures the accuracy of pupil performance over all learning objectives. The setup consists of collecting the correct answer of every exercise and measuring its number of characters.


After collecting all answers we place the words into bins based on the different values of word lengths found in the corpus. For each bin we compute the number of correct and wrong answers after which we compute the accuracy.
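The binning step can be sketched as follows. The input layout, a list of pairs holding the correct word and whether the submitted answer was correct, is an assumption made for the example, and the function name is hypothetical.

```python
from collections import defaultdict


def accuracy_by_length(answers):
    """Compute the accuracy per word length.

    answers: iterable of (correct_word, is_correct) pairs
    """
    correct = defaultdict(int)  # correct answers per word length
    total = defaultdict(int)    # all answers per word length
    for word, is_correct in answers:
        total[len(word)] += 1
        correct[len(word)] += int(is_correct)
    # One bin per word length that occurs in the data
    return {length: correct[length] / total[length] for length in sorted(total)}


sample = [("kat", True), ("kat", False), ("trein", True), ("trein", True)]
print(accuracy_by_length(sample))
```

The same function can be reused for the syllable experiment by replacing len(word) with a syllable count.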

In figure 4.1 we can observe a trend that matches our expectations. This figure shows the accuracy for all exercises based on the length of the word. The probability of making an error increases with word length, and the accuracy for short words is significantly higher than for longer words (apart from very long words). The steady drop in accuracy as the number of characters increases indicates that pupils are more prone to make an error when the word is longer.

Figure 4.1: Accuracy and word length

Given our different error types, the probability of making an error of a certain type is not always dependent on the length of the word and its composition, as seen in figure 4.2. This figure shows what percentage of the total number of errors is represented by each error type, based on the length of the word. The majority of the errors happen in the range of 4 to 6 characters, with an equal distribution across the error types. In figure 4.3 we can also see the word length distribution for all the words in the corpus.

Another way to show this effect is to analyze how many modifications a word undergoes as its length increases. Building on the correlation analysis between accuracy and word length, we want to confirm that having more characters increases not only the probability of making an error but also the probability of making multiple errors within the same exercise or word. To perform this analysis we chose a metric that describes the modifications that occur to a word. To support our previous results, the edit distance should show the same behaviour: an increase in the edit distance with longer words.

Figure 4.2: Word length for error types

The Damerau–Levenshtein distance is a distance (string metric) between two strings, i.e., finite sequences of symbols, given by counting the minimum number of operations needed to transform one string into the other, where an operation is defined as an insertion, deletion, or substitution of a single character, or a transposition of two adjacent characters. Because the classifier is built around such operations between characters, we assume it to be a meaningful metric that further consolidates the idea that word length is an influencing factor. In our case we measure the edit distance between the correct answer and the submitted answer. To support our previous claim, the edit distance needs to show the same pattern as the accuracy, that is, it should grow as words get longer.
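For reference, a minimal sketch of such a distance is given below. It implements the restricted variant known as optimal string alignment, in which a transposed pair of characters is not edited again afterwards. For most spelling errors this gives the same value as the unrestricted Damerau–Levenshtein distance, but the two can differ; which variant was used for the experiments is not specified here. The function name is hypothetical.

```python
def osa_distance(a, b):
    """Edit distance with insertion, deletion, substitution and
    transposition of two adjacent characters (restricted variant)."""
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] is the distance between the first i characters of a
    # and the first j characters of b
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Two adjacent characters written in swapped order
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(osa_distance("trein", "trien"))
```

A swap such as "trein" written as "trien" counts as a single operation, whereas a plain Levenshtein distance would count it as two.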

Figure 4.4: Edit distance and word length

As expected, we can observe in figure 4.4 that the pattern fits the finding of the previous experiment: there is a steady increase in the edit distance, indicating multiple modifications to the word and thus the occurrence of multiple errors within the same word. The graph contains results from incorrect exercises only.

The conclusion is that the length of the word influences not only the probability of making an error but also the number of errors that can be made within a single word.


4.2 What are the correlations/structure of learning objectives?

In this research we strive to find the roots of spelling errors. We ask whether errors originate from sources such as poor comprehension of a learning objective, characteristics of the words that engage cognitive processes, or a combination of both. The same approach would not be possible using the error types, as they overlap in definition and are not equally distributed over time.

4.2.1 Approach

The goal of this research question is to determine whether the errors in a learning objective originate from bad performance in previous learning objectives or from misunderstanding of the current one. In other words, we would like to know how much the performance in previous learning objectives has influenced one’s current overall performance.

The structure of the exercises belonging to a single learning objective is often highly diverse. By structure we are referring to properties of a word such as the number of syllables, letters, vowel groups and consonant clusters that compose it.

As stated in the previous research question the structure of the word is a factor that leads to committing errors. The fact that there is no correlation between two learning objectives can be explained by analysis of the structure of the words and exercises in the learning objective. The concept of the learning objective might be delivered correctly but the choice of exercises should be tailored as well based on detected deficiencies of a pupil.

To perform this task we must understand how much each learning objective is accountable for the variance in the data. In other words, we perform an exploratory analysis of the correlation between any pair of learning objectives and select the relevant ones, those that contribute significantly to the accuracy of the spelling process.

The end result of the experiment will be used to determine whether, in a scenario where the pupil shows noticeably below-average performance, there is an underlying correlation between the learning objectives that we can use to assess the pupil’s current and future performance.

This represents both an alternative and an extra diagnosis tool in case the classification results do not suffice in determining the source of a deficiency.


4.2.2 Setup

The approach in this experiment is to create a training set where the variables are represented by the learning objectives and the data points by the pupils and their final accuracy for each of the learning objectives.

The training set consists of pupils that have completed the same 15 learning objectives. This number of learning objectives was chosen because the data is sparse: pupils either have missing results due to not attending, or their results were invalidated by the sanitation process.

Although this is not a crucial factor for the selection, we decided not to include pupils who had more than 80% of the data missing from the learning objective exercises.

Therefore in the end we have data collected from 530 pupils that have fully completed the same 15 learning objectives.

The value for each variable (learning objective) represents the final score of the pupil for that learning objective.
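The setup above amounts to a table with one row per pupil and one column per learning objective, from which pairwise correlations can be computed. The sketch below assumes the Pearson correlation coefficient, which is a choice made for the example, and uses made-up scores for three pupils and three learning objectives. The function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # Note: undefined (division by zero) when one column is constant
    return cov / sqrt(var_x * var_y)


def correlation_matrix(scores):
    """scores: one row per pupil, one column per learning objective."""
    columns = list(zip(*scores))  # transpose into one tuple per objective
    return [[pearson(ci, cj) for cj in columns] for ci in columns]


# Made-up final accuracies, for illustration only
scores = [[0.9, 0.8, 0.1],
          [0.7, 0.6, 0.3],
          [0.5, 0.4, 0.5]]
matrix = correlation_matrix(scores)
print(matrix[0][1], matrix[0][2])
```

In this toy table the first two objectives rise and fall together while the third moves in the opposite direction, which shows up as coefficients close to 1 and -1.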

4.2.3 Results

We would like to know which learning objectives are responsible for most of the variance in the data. In case of interventions, this will allow us to focus on the learning objectives that require more consolidation in order to increase future pupil accuracy.

In figure 4.5 we observe the correlation matrix, where the values represent the statistical relationship between any 2 learning objectives. This relationship indicates to what degree the performance in one learning objective influences performance in another learning objective.

From the results we deduce that the most strongly correlated learning objectives were the ones that teach more basic concepts, such as vowel and consonant pronunciation. For example: klinkerverenkeling, words ending in -d or -t with words containing (-)ei(-) or (-)ij(-) (trein, lijst).

The high factor loadings we observe mean that the information taught in these learning objectives is used often throughout the other learning objectives.


Figure 4.5: Learning objective correlation matrix

One initial assumption was that the learning objectives are built on top of each other and taught sequentially, which should have been reflected in the correlation table between the learning objectives. In figure 4.6, however, the results did not show an overall strong correlation.

Factor loading is the percent of variance in that variable (the final pupil performance) explained by the factor (the learning objective). This may indicate that some of the learning objectives believed to be more fundamental than the others did not strongly influence the performance in learning objectives that were taught afterwards.

This is due to the structure of the learning objectives that is not necessarily uniform and requires knowledge from multiple previous ones in order to correctly solve the exercises.

4.2.4 Conclusion and comments

In conclusion, knowing the relation between learning objectives plays an important role in determining the root of a problem in the sense that it is possible to trace back the learning objectives that are responsible for a pupil’s negative performance.

In a scenario where a pupil is having difficulties with his current learning objective we can gain insight about the nature of his problem by looking at the performance of the correlated learning objectives.


Figure 4.6: Learning objective correlation factor loading

4.3 How many misspelled words can overall be “explained” by the error types?

This section presents the output of the classification process in terms of coverage, that is, how many errors could be detected for each of the error types described in 3.4.1. This gives us an overview of the distribution of the pupils based on the type of errors they committed.

In order to better represent the nature of spelling errors we need to devise a meaningful method of describing an error rather than just stating its existence. We are therefore interested in understanding the composition of the letter or group of letters that was spelled erroneously.


The position of the errors is interesting to study to see if certain errors are more probable to happen at a specific position in the word and if that is dependent on the length of the word.

Using the approach discussed in 3.4.1, we now describe and analyze the results of the implementation of the error classifier.

4.3.1 Visual processing

The classification process revealed that 21% of the errors belong to the visual processing type.

In figure 4.7 we can observe the distribution of the number of errors in this type. As expected, most pupils commit on average a low number of errors (or none in some cases), followed by the exceptions where our area of focus is located. These exceptions will be part of further case studies where we will attempt to determine whether this pattern is a stable one.

Figure 4.7: Distribution of the visual processing errors

The next step is to study the location of the errors in order to understand, for each type, where the majority of the problems lie. For the visual processing type (figure 4.8) we observe that the primary source of errors lies at the beginning of the word, with 61.8%, followed by the middle and the end of the word. The middle of the word is the portion between the first and last characters of the word (applicable for words longer than 2 characters).
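The three locations can be made concrete with the small sketch below. It is one reading of the definition above, in which the start is the first character and the end is the last character of the word; the function names and the input layout are hypothetical.

```python
from collections import Counter


def error_location(index, word_length):
    """Map the position of an error to "start", "middle" or "end"."""
    if index == 0:
        return "start"
    if index == word_length - 1:
        return "end"
    return "middle"  # only reachable for words longer than 2 characters


def location_proportions(errors):
    """errors: non-empty iterable of (error_index, word_length) pairs."""
    counts = Counter(error_location(i, n) for i, n in errors)
    total = sum(counts.values())
    return {loc: counts[loc] / total for loc in ("start", "middle", "end")}


print(location_proportions([(0, 5), (0, 4), (2, 5), (3, 4)]))
```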


We performed a statistical test, the Chi-square Goodness of Fit test, to check whether the observed distribution of error locations could be explained by chance. The resulting p-value of the test is <0.00001, which makes the result significant at p <0.01. In other words, given that an error of this type is made, it is very unlikely that its location is random.
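A goodness of fit test of this kind can be sketched as follows. Two assumptions are made for the example: the null hypothesis is that errors are spread evenly over the three locations, and the counts are made up (only the share of roughly 62% at the start mirrors the reported result). With three categories the test has 2 degrees of freedom, for which the p-value has the closed form exp(-x/2), so no statistics library is needed.

```python
from math import exp


def chi_square_uniform(counts):
    """Chi-square statistic of observed counts against equal expected counts."""
    expected = sum(counts) / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)


def p_value_df2(statistic):
    """Upper tail probability of a chi-square distribution with exactly
    2 degrees of freedom, i.e. three categories."""
    return exp(-statistic / 2)


# Hypothetical counts of errors at the start, middle and end of the word
observed = [618, 200, 182]
statistic = chi_square_uniform(observed)
p_value = p_value_df2(statistic)
print(statistic, p_value)
```

For a different number of categories the closed form no longer applies and a general chi-square survival function would be needed.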

To support this we computed the probability of making an error based on where the error pattern for the Visual processing type is located.

We define loc as the location of the error pattern in the word (start, middle or end). Let C(e, loc) be the count of the errors of error type E produced at location loc. Let T(e) be the total number of errors produced at any location in the word. Using formula 3.1 we take into account the probability of that error type occurring.

\[ p(e \mid loc \mid E) = \frac{C(e, loc) \cdot p(e \mid E)}{T(e) \cdot p(e \mid E)} \tag{4.2} \]

Using 4.2 we find that the probability of an error occurring at the beginning of the word is 3.8%, in the middle 0.83% and at the end 0.84%.


Figure 4.9 shows the relation described above, where we can see a number of phenomena. All errors present a bigger proportion when the word has between 4 and 7 characters, regardless of the location of the error. The proportion of errors found at each location of the word increases with the number of characters, with little difference between locations.

The proportion expresses the percentage of errors for a location out of the total number of errors for the type (visual processing in this case).

Figure 4.9: Visual processing errors location and proportion

This indicates that there is no strong relation between the length of the word and the location where the error occurred, for the visual processing type.

4.3.2 Phonological awareness deficiency

The classification process revealed that 53% of the errors belong to the phonological awareness type and thus represent a big proportion of the total number of classified errors. As done previously, we have classified the errors based on their location. It is useful to take the location into account to understand whether the probability of making an error depends on the word length. The distribution of the errors in figure 4.10 is similar to the visual processing one: the majority of pupils make few errors, followed by a small problematic minority.


Figure 4.10: Distribution of the phonological errors

The nature of this experiment makes it so that we have a slight overlap in the error types where we use the same type of error to describe 2 different deficiencies. For example the letter p error in the visual processing type which can be visually mistaken with other letters such as b,d,q also constitutes an error in phonological awareness type where pupils confuse it for another sound like b. Therefore a small fraction of the overall distribution is shared by both types.

We next study the location of the errors within the word. In figure 4.11 we discover that the majority of the errors are found in the middle of the word, which is consistent with the fact that most phonemes are found in the middle of the word and are harder to recognize than the ones located at the beginning [22].

Unlike the visual processing type, where we did not observe any relevant difference in the relation between location and error proportion, in figure 4.12 we can see that the shorter the word, the more errors we have at the beginning. The opposite effect is found for the middle and end locations, which scale with the word length. The proportion for the middle location remains stable even for words longer than 7 characters, whereas the proportion of the errors located at the end decreases.

We reiterate the fact that most phonemes are located in the middle of the word which provides and explanation for this pattern.
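The breakdown of error locations by word length can be illustrated with a minimal Python sketch. It assumes that "beginning" means the first character, "end" the last character and "middle" everything in between; this definition, the function names and the sample words are hypothetical and are not taken from the classifier used in this study.

```python
from collections import Counter, defaultdict


def error_location(position, word_length):
    """Map a 0-based character position to 'beginning', 'middle' or 'end'.

    Assumption for this sketch: the first character is the beginning, the
    last character is the end, and everything in between is the middle.
    """
    if position == 0:
        return "beginning"
    if position == word_length - 1:
        return "end"
    return "middle"


def location_proportions_by_length(errors):
    """Compute, per word length, the share of errors at each location.

    errors: iterable of (word, error_position) pairs.
    Returns {word_length: {location: proportion}}.
    """
    counts = defaultdict(Counter)
    for word, position in errors:
        counts[len(word)][error_location(position, len(word))] += 1

    proportions = {}
    for length, counter in counts.items():
        total = sum(counter.values())
        proportions[length] = {loc: n / total for loc, n in counter.items()}
    return proportions


# Hypothetical errors: (target word, index of the misspelled character).
sample_errors = [
    ("bal", 0), ("bal", 0), ("bal", 2),
    ("schaap", 3), ("schaap", 2), ("schaap", 5),
]
location_result = location_proportions_by_length(sample_errors)
```

With this sample, two of the three errors in the 3-character word fall at the beginning, while two of the three errors in the 6-character word fall in the middle.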


Figure 4.11: Phonological awareness errors based on location


To better explain the relation between phonological errors and word length, we investigate whether the number of syllables influences the proportion of errors at each location. Given that the number of syllables is proportional to the number of characters, we expect to see the same pattern.

Figure 4.13: Phonological awareness errors number of syllables and proportion

In figure 4.13 we confirm the expected pattern: for single-syllable words the initial or final sound of the word is more likely to be mistaken, whereas for multi-syllable words the errors are mostly found in the middle.

4.3.3 Rules and Vocabulary

The classification process revealed that 26% of the errors belong to this error type, as seen in figure 4.14.

In this case the errors are found predominantly at the end and the beginning of the word, as seen in figure 4.15.

This is consistent with the fact that most words to which this type applies are subject to grammar rules that modify the word at these locations. Take for example the formation of the plural, which modifies the end of the word, or the addition of prefixes for certain rules.


Figure 4.14: Distribution of the vocabulary and rule errors


Figure 4.16: Visual processing location distribution

Figure 4.17: Phonological location distribution

Figure 4.18: Rule and vocabulary location distribution

Figures 4.16, 4.17 and 4.18 give a clearer view of the individual differences. They address the following question: given an opportunity to make an error, that is, a word that contains one of the patterns our classifier searches for, what is the probability that the error is committed at a specific location?

The results are consistent with the previous observations for each type: the probability of an error occurring at a specific location is influenced by the error type being searched for.
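One way to read this probability is as the number of errors divided by the number of opportunities, computed separately for each error type and location. The sketch below follows that reading; the input format, the function name and the sample data are assumptions made for illustration and do not reproduce the classifier used in this study.

```python
from collections import Counter


def location_error_probability(attempts):
    """Estimate P(error | error type, location) from observed opportunities.

    attempts: iterable of (error_type, location, is_error) triples, one per
    opportunity, i.e. each time a pupil spelled a word whose pattern of the
    given type sits at the given location (hypothetical input format).
    Returns {(error_type, location): errors / opportunities}.
    """
    opportunities = Counter()
    errors = Counter()
    for error_type, location, is_error in attempts:
        key = (error_type, location)
        opportunities[key] += 1
        if is_error:
            errors[key] += 1
    return {key: errors[key] / n for key, n in opportunities.items()}


# Hypothetical opportunities, not real pupil data.
sample_attempts = [
    ("phonological", "middle", True),
    ("phonological", "middle", True),
    ("phonological", "middle", False),
    ("phonological", "beginning", False),
    ("rule_vocabulary", "end", True),
    ("rule_vocabulary", "end", False),
]
location_probabilities = location_error_probability(sample_attempts)
```

Normalizing by opportunities rather than by the total number of errors keeps a location from looking error-prone merely because many patterns happen to occur there.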

Figure 4.19: Error class distribution

Figure 4.19 presents the overall distribution of the classified errors based on the location where the error was found in the word. Given the high number of errors of the phonological type, the percentage of errors belonging to this class remains high for all word lengths, reinforcing our observations in the previous experiment, where we saw a drop in accuracy as the word length increases.


The visual processing type seems to affect mainly words of up to 5 characters, with a drop in proportion for longer words (beyond 7 characters). The figure shows an inverse relation between the visual processing class and the rule and vocabulary class: the proportion of errors of the former is higher for shorter words, while that of the latter is higher for longer words. This corresponds to our earlier observation that applying rules increases the length of the word.

4.3.4 Discussion

The overlap between error types strongly affects the accuracy of this study, since during classification 2 or more error classes will be assigned to the same pupil without discriminating which cognitive process is actually contributing to the spelling error.

              Visual   Phonological   R&V
Visual        -        0.77           0.41
Phonological  0.77     -              0.54
R&V           0.41     0.54           -

Table 4.1: Error class correlation in general

              Visual   Phonological   R&V
Visual        -        0.48           0.41
Phonological  0.65     -              0.85
R&V           0.69     0.85           -

Table 4.2: Error class correlation for target groups

In table 4.1 we can observe the effect of this noise on the correlation coefficient between the Visual and Phonological error classes, which overlap. In table 4.2 the correlation coefficient is calculated on a different set of pupils, the target group, which consists of the pupils who committed a high number of errors of a specific type.
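As an illustration of how such a table can be computed, the sketch below calculates pairwise correlations between per-pupil error counts. It assumes the Pearson correlation coefficient, which the text does not specify, and the counts shown are invented for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_table(counts):
    """Pairwise correlations between error classes.

    counts: {class_name: [errors per pupil]}, with pupils in the same order
    for every class (hypothetical input format).
    Returns {(class_a, class_b): coefficient} for all pairs with a != b.
    """
    names = list(counts)
    return {
        (a, b): pearson(counts[a], counts[b])
        for a in names
        for b in names
        if a != b
    }


# Invented per-pupil error counts for four pupils.
sample_counts = {
    "Visual": [1, 2, 3, 4],
    "Phonological": [2, 4, 6, 8],
    "R&V": [4, 3, 2, 1],
}
correlations = correlation_table(sample_counts)
```

A table computed this way on a single set of pupils is symmetric. The rows of table 4.2 differ from its columns, which would be consistent with each row being computed on its own target group, although the text does not state this explicitly.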

We assume that pupils who perform poorly overall have a higher probability of committing errors regardless of the underlying cognitive feature we investigate. Therefore we select pupils who perform above average overall and consistently have difficulties with a certain error type; we call these pupils the target group. The reason is that pupils who perform poorly on all error types are less difficult to diagnose than the target group.
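The selection of a target group can be sketched as a simple filter. The thresholds below are assumptions chosen for illustration: a pupil counts as performing above average when the total number of errors is at or below the mean over all pupils, and as having difficulties with an error type when the count for that type is above the mean for that type. The sketch does not model consistency over time, and the pupil identifiers and counts are invented.

```python
from statistics import mean


def target_group(error_counts, error_type):
    """Select pupils who do well overall but struggle with one error type.

    error_counts: {pupil_id: {error_type: number_of_errors}}.
    A pupil is selected when the total number of errors is at or below the
    mean total (above-average performance) and the count for `error_type`
    is above the mean for that type. Both thresholds are assumptions made
    for this sketch.
    """
    totals = {pupil: sum(c.values()) for pupil, c in error_counts.items()}
    mean_total = mean(totals.values())
    mean_type = mean(c.get(error_type, 0) for c in error_counts.values())
    return sorted(
        pupil
        for pupil, c in error_counts.items()
        if totals[pupil] <= mean_total and c.get(error_type, 0) > mean_type
    )


# Invented error counts per pupil and error type.
sample_pupils = {
    "a": {"visual": 1, "phonological": 1, "rule_vocabulary": 1},
    "b": {"visual": 6, "phonological": 0, "rule_vocabulary": 0},
    "c": {"visual": 8, "phonological": 9, "rule_vocabulary": 7},
    "d": {"visual": 0, "phonological": 2, "rule_vocabulary": 1},
}
visual_targets = target_group(sample_pupils, "visual")
phonological_targets = target_group(sample_pupils, "phonological")
```

In this sample, pupil "c" makes many errors of every type and is excluded, while pupil "b" makes few errors overall but a high number of visual errors and is the only pupil selected for that type.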


4.3.5 Conclusion

This section answered the question regarding the coverage of the error classes in terms of classification. We were able to classify 33% of the total errors found. For all error types the distribution of the number of errors per pupil follows an expected pattern with the majority of pupils having few errors.

Of the classified errors, 53% are of the phonological error type, which can be related to properties of the corpus (i.e. the density of words that present an error pattern of this type is higher). In this process we also looked into how word length affects each error type and conclude that there is no direct relation that increases the chances of making an error of a certain type.

4.4 How “stable” are error types over time?

Until now we have analyzed the validity of the type of errors and their relation to the pupil performance, measured in the number of errors. In this section we try to determine whether the pattern seen above represents a reliable feature that complies with 2 key conditions of reliability:

• Stability, where an error type is recurrent over time

• Predictability, where a stable error type can be analyzed early in time to determine its future development

The difference between the two conditions is that stability implies the performance of the student does not alter over time, whereas predictability can determine future performance that may or may not be stable.

We define the stability of an error type in the context of pupil performance as the recurrent appearance over time of the same error pattern throughout different learning objectives. We refer to an error type as stable if it persists (reappears) over time given different tasks. In other words, stability expresses to what degree the performance stays the same and the pupil does not remediate or recognize their errors.
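One possible way to quantify this notion of recurrence is sketched below. It is an illustrative measure under stated assumptions and not necessarily the one used in this study: for one pupil and one error type, it takes the learning objectives in chronological order and computes the fraction of objectives, after the first occurrence of the error type, in which the error type reappears. The function name and input format are hypothetical.

```python
def recurrence_rate(occurred):
    """Fraction of later learning objectives in which an error type reappears.

    occurred: list of booleans in chronological order, one per learning
    objective, True when the pupil made at least one error of the type
    (hypothetical input format).
    Returns None when the error type never occurs or when no objective
    follows its first occurrence, since recurrence is undefined then.
    """
    if True not in occurred:
        return None
    first = occurred.index(True)
    later = occurred[first + 1:]
    if not later:
        return None
    return sum(later) / len(later)


# Invented example: the error type appears in objectives 1, 2 and 4.
sample_rate = recurrence_rate([True, True, False, True])
```

A value close to 1 would indicate an error type that keeps reappearing across tasks, whereas a value close to 0 would suggest that the pupil corrected the error after its first appearance.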