
Acoustic and Perceptual Vowel Analyses of Dysarthric Speech in Parkinson’s Disease and Spinocerebellar Ataxia

Kristel Nijburg, S2753618

Master Thesis in Neurolinguistics

Faculty of Arts, University of Groningen
Supervisor: Dr. R. Jonkers


Abstract

In the history of dysarthria research, the focus has mainly been on the perceptual aspects of dysarthric speech, due to the highly influential articles by Darley, Aronson and Brown (1969a & 1969b). However, there has been a shift towards using acoustic measures such as vowel analysis, because of their ability to detect more subtle changes in speech and because they offer a more objective way of analyzing speech (e.g. Kent, Weismer, Kent, Vorperian, & Duffy, 1999). Several vowel measures have been developed that can be used to distinguish between dysarthric and non-dysarthric speech, and some have also been shown to correlate with speech intelligibility. Because of these promising results, we wanted to investigate whether it is also possible to differentiate between two types of dysarthria using vowel measures, and to see how these measures influence intelligibility of speech. Results indicated that the groups of people with Parkinson’s Disease and Spinocerebellar Ataxia did not differ on any of the vowel measures that were used. These disease groups also did not differ in vowel measures from a healthy control group. Vowel intelligibility in the group with Parkinson’s Disease was lower than in the healthy control group, but none of the vowel measures correlated with the intelligibility scores. Nevertheless, there seems to be a deviation in vowel production in at least the group of people with Parkinson’s Disease, given their low intelligibility, but we were not able to capture those differences with the vowel measures used in the current study.

Keywords: Parkinson’s Disease, Hypokinetic Dysarthria, Spinocerebellar Ataxia, Ataxic Dysarthria, Vowel Articulation, Acoustic Analysis, Intelligibility

Table of Contents

Abstract
1. Introduction
1.1. Dysarthria
1.2. Dysarthria Classification
1.2.1. Flaccid Dysarthria
1.2.2. Spastic Dysarthria
1.2.3. Ataxic Dysarthria
1.2.4. Hypokinetic Dysarthria
1.2.5. Hyperkinetic Dysarthria
1.2.6. Mixed Dysarthria
1.3. Mayo System Replications
1.4. Vowel Measurements
1.5. Intelligibility Testing
1.6. The Current Study
2. Methods
2.1. Participants
2.2. Data Collection
2.3. Vowel Annotations
2.4. Phonetic Analysis
2.4.1. Vowel Space Area
2.4.2. Vowel Articulation Index
2.4.3. F2i/F2u Ratio
2.4.4. F1/F2 Contrasts
2.4.5. F1/F2 Variability
2.5. Perceptual Task
2.6. Analysis Perceptual Task
3. Results
3.1. Acoustic Measurements
3.2. Perceptual Experiment
3.2.1. Objective vs. Subjective Intelligibility
3.2.2. Diagnosis vs. Accuracy
4. Discussion
4.1. Acoustic Measures
4.2. Perceptual Experiment
4.3. Correlations Acoustic Measures and Perceptual Experiment
4.4. Limitations and Future Recommendations
5. Conclusions
References


1. Introduction

Communication is something that most people see as a normal part of their daily life. However, even the slightest deviation in the control of speech can hugely impact this, and normal communication suddenly becomes an issue. This is the case for people with dysarthria, a general term for disturbances in speech muscle control occurring after damage to the central or peripheral nervous system (Darley, Aronson & Brown, 1969a). Darley et al. (1969a & 1969b) were the first to describe dysarthria subtypes, based on differences in the origin and outcome of these disturbances. Diagnosing dysarthria involves analyzing different aspects of speech, among which intelligibility. Intelligibility is important for the self-esteem of people with dysarthria; hence, well-grounded scientific research is needed to understand what influences intelligibility in different dysarthria types, so that speech therapy can be targeted effectively. Dysarthric speech is usually analyzed perceptually, although there has been a growth in studies analyzing acoustic measures such as formant frequencies in vowels (Kent & Kim, 2003). This is because acoustic analysis is less subjective than perceptual analysis, and vowel analysis in particular can reflect underlying difficulties in speech muscle control (Bunton et al., 2007; Kent, Weismer, Kent, Vorperian & Duffy, 1999; Liu, Tsao & Kuhl, 2005). Therefore, the aim of this thesis was to analyze vowels of people with dysarthria using formant frequency measures, and to study how these measures correlate with intelligibility. Two groups of people suffering from dysarthria were included, namely a group with Spinocerebellar Ataxia (SCA) and a group with Parkinson’s Disease (PD). They were compared to each other as well as to a healthy control group. In this introduction, an overview of current findings on dysarthria in general is given first, followed by descriptions of the clinical pictures and neural bases of the different dysarthria types. Then, the traditional way of analyzing and diagnosing dysarthric speech is reviewed, before discussing how analyzing vowels acoustically can be a valuable addition to the current perceptual methods.

1.1. Dysarthria

In the scientific field, dysarthria is a relatively new phenomenon. Until the first half of the twentieth century, the symptoms of dysarthria were only vaguely described in case studies, and not much was known about the causes and clinical picture of dysarthria (Duffy, 2006). From the 1960s onward, dysarthria gained more attention, partly due to the highly influential articles by Darley et al. (1969a & 1969b). These articles are to date still considered seminal contributions to the field, and many researchers adhere to their classification of dysarthria types. As mentioned before, dysarthria is an overarching term for different manifestations of disturbed speech motor control. Dysarthria is usually secondary to a (degenerative) neurological disease such as PD or Multiple Sclerosis (MS), or can be caused by traumatic brain injury or other damage (“Dysarthria Symptoms & Causes”, Mayo Clinic, n.d.). These brain disturbances can affect different parts of the brain related to motor functions, for instance the cerebellum, the basal ganglia, or the lower and upper motor neurons. Characteristics of dysarthric speech mostly express themselves in reduced movements, slowness of speech, and disturbed timing and coordination of movements (Darley et al., 1969a; Yunusova, Weismer, Westbury & Lindstrom, 2008). The control of the articulators – jaw, tongue and lips – is disturbed in dysarthria, which causes the speech muscle movements of dysarthric speakers to differ from those of non-speech-impaired speakers. How these movements differ depends on the type of dysarthria and will be discussed further below.

Not only studies on the biological basis of dysarthria and the characteristics of the speech have been carried out; there are also studies regarding the well-being of people with dysarthria. For example, Miller, Noble, Jones and Burn (2006) interviewed a group of people with PD. Findings suggested that the dysarthric speakers mostly had difficulty with conversations and interaction with others, and were highly aware of their own voice and intelligibility. This all influenced their socialization, leading to feelings of humiliation and of being ignored. Similar patient perspectives were described by Hartelius, Elmberg, Holm, Lövberg and Nikolaidis (2007), who administered a questionnaire among different groups of people with dysarthria due to progressive neurological disorders. These groups most often mentioned difficulties in participating in conversations, and reported a negative self-image. Confirming findings were obtained by Dickson, Barbour, Brady, Clark and Paton (2008), who interviewed people with dysarthria due to stroke lesions. Participants complained about stigmatization and negative changes in self-identity. Interestingly, though, dysarthria severity did not correlate with the outcomes in any of these three studies. However, Dickson et al. (2008) mentioned that their participants were focused on improving their intelligibility, independently of how intelligible they actually were. Thus, even though preventive and effective speech therapy is needed in the case of dysarthria, it should not overshadow daily-life difficulties, as the authors of these articles all stress (Dickson et al., 2008; Hartelius et al., 2007; Miller et al., 2006).


Table 1
Neurologic Correlates and Features of Dysarthria Types

Flaccid
- Lesion locus: Lower motor neuron (one or more of the following: cranial nerves V, VII, X, XI, XII; cervical and thoracic spinal nerves)
- Presumed distinguishing neurophysiologic impairment: Weakness
- Common associated neurologic signs: Atrophy, fasciculations, weakness, hypotonia, reduced reflexes
- Prominent/distinctive auditory-perceptual features: Continuous breathy-hoarse voice quality, diplophonia, reduced loudness, short phrases, stridor/audible inspiration, hypernasality, nasal emissions, imprecise consonants

Spastic
- Lesion locus: Upper motor neuron (usually bilateral)
- Presumed distinguishing neurophysiologic impairment: Spasticity
- Common associated neurologic signs: Hyperactive reflexes, pathologic reflexes, pseudobulbar affect, dysphagia, limb spasticity and weakness
- Prominent/distinctive auditory-perceptual features: Strained-harsh voice, monopitch and monoloudness, slow rate, slow and regular speech AMRs

Ataxic
- Lesion locus: Cerebellar control circuit
- Presumed distinguishing neurophysiologic impairment: Incoordination
- Common associated neurologic signs: Ataxia, dysmetria, hypotonia, intention tremor
- Prominent/distinctive auditory-perceptual features: Irregular articulatory breakdowns, irregular speech AMRs, excess and equal stress, excess loudness variation, distorted vowels

Hypokinetic
- Lesion locus: Basal ganglia control circuit
- Presumed distinguishing neurophysiologic impairment: Rigidity, reduced range of motion
- Common associated neurologic signs: Rigidity, bradykinesia, akinesia, resting tremor, postural abnormalities
- Prominent/distinctive auditory-perceptual features: Reduced loudness, monopitch and monoloudness, reduced stress, accelerated rate, short rushes of speech, rapid and blurred speech AMRs, repeated sounds, inappropriate silences

Hyperkinetic
- Lesion locus: Basal ganglia control circuit
- Presumed distinguishing neurophysiologic impairment: Quick-to-slow, regular or irregular involuntary movements
- Common associated neurologic signs: One or more of: chorea, dystonia, athetosis, dyskinesia, myoclonus, action myoclonus, tics, tremor
- Prominent/distinctive auditory-perceptual features: Highly variable, affecting one or more of the components of speech production, but often with significant rate and prosodic abnormalities; speech features consistent with the nature of the involuntary movements (e.g., voice tremor, spasmodic dysphonia, jaw/face dystonias, palatal-pharyngeal-laryngeal myoclonus)

Mixed
- Lesion locus: Two or more of the above
- Presumed distinguishing neurophysiologic impairment: Combinations of the above
- Common associated neurologic signs: Combinations of the above
- Prominent/distinctive auditory-perceptual features: Combinations of the above

Note. Adapted from ‘History, current practice, and future trends and goals’, by J. Duffy, in G. Weismer (Ed.), Motor speech disorders: essays for Ray Kent (pp. 9-11), 2006, San Diego, CA: Plural Publishing.


1.2. Dysarthria Classification

As mentioned earlier, Darley et al. (1969a & 1969b) were the first to classify dysarthria into subtypes, and patients are still assessed according to this classification method. The authors investigated speech in people suffering from one of seven neurological disorders, including parkinsonism, cerebellar lesions and amyotrophic lateral sclerosis (ALS). These groups of people were asked to read a text that is often used in speech analyses called the “Grandfather” passage, after which the three authors judged their speech using different speech dimensions. These dimensions consisted of descriptions of articulation, prosody, pitch, intelligibility, loudness, voice quality, and respiration. The subtypes that they identified were flaccid, spastic, ataxic, hypokinetic, and hyperkinetic dysarthria, as well as a mixed type dysarthria (Darley et al., 1969a). In Table 1, an overview can be found of the different kinds of dysarthria. In the following sections, more thorough descriptions will be given regarding these types, with focus on the types investigated in this study, namely hypokinetic and ataxic dysarthria.

1.2.1 Flaccid Dysarthria

According to Darley et al. (1969a), flaccid dysarthria is present in people who suffer from bulbar palsy due to lesions in the lower motor neuron region. Underlying causes could be Lyme disease, a tumor in the nerve roots, or bulbar ALS (Lomen-Hoerth, 2011), and signs are often muscle weakness or atrophy, affecting tongue and jaw movements (Tomik & Guiloff, 2010). According to Darley et al. (1969a), the lack of control over facial muscles most often results in hypernasality, imprecise consonant pronunciation, breathy voicing and monopitch. Furthermore, dimensions mostly correlated with intelligibility were imprecise consonants and distorted vowels.

1.2.2. Spastic Dysarthria

Darley et al. (1969a) assigned spastic dysarthria to people with pseudobulbar palsy, which results from bilateral upper motor neuron lesions. This can for instance lead to paralysis, spasticity, and hyperreflexia, in turn resulting in slow and limited tongue movements. Pseudobulbar palsy can occur due to diseases such as cerebral palsy, brain tumors, multiple sclerosis or multiple strokes (Darley et al., 1969a; Lomen-Hoerth, 2011). As is the case with flaccid dysarthria, the most deviant speech characteristic that Darley et al. (1969a) found in speech of people with spastic dysarthria was imprecise consonants, which also mostly correlated with intelligibility. Furthermore, signs of monopitch as well as reduced stress and distorted vowels were identified, which correlated with intelligibility as well.


When upper motor neurons are affected only on one side, a different type of dysarthria occurs, namely unilateral upper motor neuron dysarthria. Spasticity can also occur in this type of dysarthria, though the origin is different, and speech features are often only mild to moderate (Duffy, 2006).

1.2.3. Ataxic Dysarthria

As can be derived from the name, this type of dysarthria occurs with ataxia, which is caused by lesions in the cerebellum or its pathways (Spencer & Slowcomb, 2007). The cerebellum or ‘little brain’ is an immensely important part of the central nervous system, and holds more than 50% of the neurons in the entire brain (Miall, 2014). While the cerebellum is responsible for many different functions, it is mostly known for its role in motor control. Lesions in the cerebellum due to tumors, multiple sclerosis or strokes affect this motor programming and planning, resulting in a disorder called ataxia. Impairments due to ataxia mostly lie in the timing, force, range, and direction of movements, causing clumsiness and overshooting of movements (Miall, 2014). First symptoms of ataxia are often seen in affected gait and body equilibrium; later on, voluntary movements can become slow and uncontrolled due to unstable speed and strength, and tremors can occur during limb movements (Darley et al., 1969a; Miall, 2014).

Studies regarding ataxic dysarthria due to cerebellar lesions ascribe diverse characteristics to it, though overlapping features are often found for prosody and articulation (Spencer & Slowcomb, 2007). Darley et al. (1969a), for instance, name monopitch, monoloudness, and excess or equal stress as typical features of ataxic dysarthria, the latter of which is called ‘scanning speech’. Kent et al. (2000) extensively investigated perceptual and acoustic characteristics of ataxic dysarthria using different tasks. For example, in a diadochokinesis task, participants were slow and irregular, and in conversation, the number of uttered words differed among the participants. Slow speech was a common feature across the different participants and tasks, but other characteristics varied among tasks and participants. Spencer and Slowcomb (2007) remarked that the amount of motor programming needed to succeed in a task could also cause differences between tasks. Differences in speech characteristics between people with ataxic dysarthria could also arise from variability in subgroups within this population. For instance, one form of ataxia is spinocerebellar ataxia (SCA), an autosomal dominant hereditary ataxia (Schalling & Hartelius, 2013; Miall, 2014). Characteristics of dysarthria in SCA resemble those of ataxic dysarthria, but spastic characteristics can occur as well. All in all, a wide range of difficulties occurs in the speech of people with SCA, due to the heterogeneity within this group (Spencer & Slowcomb, 2007; Schalling, Hammarberg & Hartelius, 2007). There are around 35 types of SCA, and while, for instance, SCA1, 2 and 3 show more deviations in voice quality, people with SCA6 are more impaired in articulatory timing (Schalling & Hartelius, 2013). Thus, it is clear that while people suffering from lesions in the cerebellum all receive the label ‘ataxic’ for their type of dysarthria, perceptual characteristics are often divergent.

1.2.4. Hypokinetic Dysarthria

Hypokinetic dysarthria occurs in cases of parkinsonism, which results from lesions in the basal ganglia control circuit (Duffy, 2006). These lesions can occur after head trauma, toxin poisoning, metabolic diseases or neurodegeneration. An example of the latter is Parkinson’s Disease (PD), a progressive neurodegenerative disease that affects dopamine production. Besides the well-known resting tremor, movements and gait are often rigid and slow, and initiating movements can be quite a struggle for people with PD (Darley et al., 1969a). Furthermore, the range of movements is small and becomes even smaller as the disease progresses, which also applies to speech, which gradually deteriorates in around 75% of people with PD (Ho, Iansek, Marigliani, Bradshaw & Gates, 1998).

The defining dimensions that Darley et al. (1969a) found in hypokinetic dysarthria consisted of monopitch, monoloudness and reduced stress. However, their speakers with hypokinetic dysarthria did not display slowness of speech, being the only type of dysarthria to lack slow speech, according to the authors. Monoloudness, reduced stress and imprecise consonants were found to influence intelligibility (Darley et al., 1969a). In studies regarding PD, the term often used to describe lack of clear articulation is ‘articulatory undershoot’, caused by restricted movements of the articulators such as the jaw or tongue (Robertson & Hammerstad, 1996). However, according to Ho et al. (1998), dysarthria in PD patients first affects voice, followed by fluency of speech, and lastly by articulation.

1.2.5. Hyperkinetic Dysarthria

The counterpart of hypokinetic dysarthria is hyperkinetic dysarthria, originating in the basal ganglia as well. Darley et al. (1969a) described cases of dystonia and chorea as diseases leading to hyperkinetic dysarthria, though it could also be caused by disorders such as myoclonus, tremor and athetosis (Kent, Duffy, Slama, Kent & Clift, 2001). Common signs of hyperkinetic dysarthria are imprecise consonants, prolonged intervals, distorted vowels, and harsh voice (Darley et al., 1969a).


1.2.6. Mixed Dysarthria

The final category that Darley et al. (1969a) distinguished is a combination of dysarthrias. According to Darley et al. (1969a), this type is a combination of flaccid and spastic dysarthria, resulting from lesions in both the lower and upper motor neurons. This is for example the case in people who suffer from ALS (Tomik & Guiloff, 2010). Other combinations of dysarthria types can occur as well (Duffy, 2006).

1.3. Mayo System Replications

Darley et al. (1969a) used a combination of perceptual analysis and neurological assessment to differentiate types of dysarthria, often referred to as the Mayo System. After the introduction of this system, numerous studies have since used auditory-perceptual assessments as a means of diagnosing and differentiating dysarthria subtypes. Critical replications have also been done by researchers, for instance by Zyski and Weisiger (1987), who tested whether auditory-perceptual analysis alone would be sufficient to identify the types of dysarthria. Speech clinicians were asked to either indicate whether a ‘deviant dimension’ was present or not in 28 dysarthric speakers, or to write down the three most notable deviations. The first method led to a classification accuracy of only 19%, whereas two groups that applied the second method scored accuracy levels of 55% and 56%. Low accuracy of identifying correct dysarthria types was also found by Van der Graaff et al. (2009), who asked a group of neurologists, speech therapists, and neurology residents to rate dysarthric speech. Overall, the three groups were correct in identifying dysarthria types in around 40% of the cases. Van der Graaff et al. (2009) furthermore analyzed interrater agreement, and this was found to be only fair to moderate, though Darley et al. (1969a) wrote about adequate interrater agreement in their article. Bunton, Kent, Duffy, Rosenbek and Kent (2007) found somewhat higher levels of interrater agreement, although they remarked that auditory-perceptual ratings should not be used as the only way to diagnose dysarthria, and recommended more research in this area.

These equivocal results concerning the accuracy and interrater agreement of the Mayo System call for a different method as an addition to perceptual judgements. Perceptual judgements are sensitive to subjectivity, and unlike acoustic measurements, they cannot detect every aspect of speech (Bunton et al., 2007; Kent et al., 1999). In their review article, Kent et al. (1999) describe how acoustic studies are at once informative and challenging, though studies combining acoustic and perceptual information were rare at that time. However, they predicted that such studies would become more common in the future, and indeed, more and more dysarthria studies are using acoustic measures (Kent & Kim, 2003). One type of acoustic analysis is formant frequency analysis, which is less subjective than perceptual judgement, and changes in vowel production can reflect underlying difficulties in controlling the speech muscles (Liu et al., 2005). This is also what is analyzed in this study, and the measures used in these types of analyses are described in detail in the following section.

1.4. Vowel Measurements

When sounds are produced with our mouths, the movements of the jaw, lips and tongue influence the shapes of the cavities within the mouth and throat, creating resonance frequency bands (Rietveld & Van Heuven, 2009; Skodda, Visser & Schlegel, 2011). These resonance frequencies are called formants and range from roughly 0 to 5000 Hz. The first two formants (F1 and F2) are particularly involved in the production of vowels, and their frequencies change according to the movements of the tongue and jaw (Skodda et al., 2011). F1 increases when the jaw lowers, and decreases when the mouth closes. The second formant is influenced by the tongue: it increases when the tongue moves forward and decreases when the tongue moves backward. Thus, a vowel that is pronounced in the back of the mouth with a lowered jaw, such as /a:/, produces a high first formant and a low second formant. Vowels can easily be visualized in a vowel chart, where the x-axis displays the second formant and the y-axis displays the first formant (see Figure 1). At the extremities of the chart are the vowels /i/, /a:/, and /u/, and together they form a so-called ‘vowel triangle’.
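To illustrate this, the short R sketch below (R being the language used for the statistical scripts later in this thesis) draws such a vowel chart for rough, textbook-style formant values of the three corner vowels; these values are illustrative assumptions, not measurements from the present data.

# Illustrative (not measured) mean formant values in Hz for the corner vowels
vowels <- data.frame(vowel = c("i", "a", "u"),
                     F1 = c(300, 750, 350),    # jaw opening: low for /i/ and /u/, high for /a/
                     F2 = c(2200, 1300, 850))  # tongue advancement: high for /i/, low for /u/
# Conventional vowel chart: F2 on the x-axis, F1 on the y-axis, both axes reversed
plot(vowels$F2, vowels$F1, xlim = c(2500, 600), ylim = c(900, 200),
     xlab = "F2 (Hz)", ylab = "F1 (Hz)", pch = 19)
text(vowels$F2, vowels$F1, labels = vowels$vowel, pos = 3)
polygon(vowels$F2, vowels$F1, border = "grey40")  # the vowel triangle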


In dysarthria, control over the speech muscles is impaired, and less control over the tongue and jaw can cause vowels to be distorted. The three corner vowels are often used for vowel analyses, because they exist in almost every language and because of their extreme positions, which influence perception and formant frequency measures (Liu et al., 2005). Less control over the tongue causes these corner vowels to become less distinct, and in the case of hypokinetic dysarthria, the limited movement range of the speech muscles can cause the already mentioned articulatory undershoot, where vowels do not reach the intended formant frequency (Robertson & Hammerstad, 1996). For instance, for the vowel /a:/ the F1 may then be lower and the F2 higher than normal, causing the vowel to move closer to the central point of the vowel triangle (Kent & Kim, 2003). There are several ways to compare the formant frequencies of the corner vowels of one group to another. For example, Liu et al. (2005) calculated the Vowel Space Area (VSA) of young adults with cerebral palsy and dysarthria, using the following formula:

VSA = 0.5 · | F1i · (F2a - F2u) + F1a · (F2u - F2i) + F1u · (F2i - F2a) |

In this equation, F1n refers to the value of the first formant of a vowel, and F2n to the second formant of that vowel. The outcome of this equation captures the area between the corner vowels (in Hz²), where a lower value indicates a smaller VSA due to vowel centralization. In normal articulation and hyperarticulation, a higher VSA can be expected (Roy, Nissen, Dromey & Sapir, 2009). Liu et al. (2005) found that their group with dysarthria had a significantly lower value for the triangular VSA than the healthy control participants with whom they were compared. People with spastic dysarthria due to cerebral palsy participating in the study by Kim, Hasegawa-Johnson and Perlman (2011) also exhibited a reduced VSA, which furthermore significantly influenced their intelligibility. In other types of dysarthria, as studied by Lansford and Liss (2014a & 2014b), reduced VSA was also found. Their group of participants consisted of people with ataxic dysarthria, hypokinetic dysarthria due to PD, hyperkinetic dysarthria, and mixed dysarthria. As in the study by Kim et al. (2011), this influenced the intelligibility of the dysarthric speakers, though only for the female speakers (Lansford & Liss, 2014a & 2014b). Rusz et al. (2013) also found reduced VSAs for people with PD as opposed to healthy speakers, though their participants were all male.

Some researchers found no deviations in the VSA of dysarthric speakers at all, such as Sapir, Ramig, Spielman, Story and Fox (2007). In their article, they introduced a form of voice treatment and compared the speech of people with PD to that of healthy control participants before and after treatment. The authors analyzed vowel formants and used VSA as one of their measurements. They, however, did not find a smaller VSA for the group of people with PD before or after voice treatment. More studies lacked significant results regarding VSA, among them the study by Strinzel, Verkhodanova, Jalvingh, Jonkers and Coler (2017). The authors ran acoustic analyses among females and males with PD, and found no differences in VSA when comparing them to a gender-matched healthy control group. Lastly, Sapir, Ramig, Spielman and Fox (2010) also found no significant results regarding the VSA of people with and without PD. All in all, VSA seems to be of limited use in speech analysis of – at least – people with PD.

Because of the inconsistency in results and the occasional gender sensitivity, new metrics were proposed. One such variable is the vowel articulation index (VAI), which is said to minimize the effects of inter-speaker variability (Roy et al., 2009). Like VSA, it is based on the triangular vowel space, and it is computed with the following equation, taken from the article by Roy et al. (2009):

VAI = (F1a + F2i) / ((F1i + F1u) + (F2a + F2u))

In the case of vowel centralization, the numerator decreases and the denominator increases, resulting in a smaller VAI value (Roy et al., 2009). Sapir, Ramig, Spielman and Fox (2011) presented the newly developed VAI, contrasting it with VSA and the logarithmic version of VSA (lnVSA). According to Sapir et al. (2011), logarithmic scaling should reduce inter-speaker variability, and the authors expected the VAI to be the best predictor of differences between people with PD and people without speech impairments. This expectation was correct: there were no significant differences between groups for VSA, greater differences with lnVSA, and the largest significant differences when using VAI. More studies confirmed the advantage of VAI, for instance the one carried out by Skodda et al. (2011). VSA and VAI outcomes of people with PD were compared to see which variable would most appropriately reflect changes in vowel articulation. The authors found correlations between VSA and VAI, and reductions in both VSA and VAI were found for the male PD group compared to a healthy control group. However, although the VAI was significantly reduced in the female PD group as well, VSA was not. The authors concluded that VAI was a better tool than VSA for detecting subtle changes in vowel articulation, and would thus be more useful (Skodda et al., 2011). The fact that gender did not influence VAI confirmed the outcomes of Roy et al. (2009).


Other studies using VAI include one by Rusz et al. (2013), who used different tasks to elicit speech from male speakers with PD. Besides VSA, they found that VAI was reduced compared to a healthy control group. The earlier mentioned study by Strinzel et al. (2017) revealed significant differences between a group with PD and a healthy control group for VAI, contrary to VSA. Gender influences were found in the study by Skodda, Schlegel, Klockgether and Schmitz-Hübsch (2013), who studied vowel formants in speakers with SCA. In female speech, VAI was significantly reduced compared to healthy speech, but not in male speech. The authors ascribed these gender differences to the sexual dimorphism of the speech muscles and oral cavities (Skodda et al., 2013). Gender also influenced the outcomes of the other formant frequency measures that they used. Kim, Kent and Weismer (2011) also mentioned that formant frequencies were affected by gender.

Besides vowel measurements that take the entire triangular vowel space into account, other measurements have also been shown to be of aid when studying differences between dysarthric and normal speech. For instance, Sapir et al. (2010), who looked at the speech of individuals with PD and individuals with nonimpaired speech, used the ratio between the F2 of the vowels /i/ and /u/ to differentiate between these groups. According to the authors, this ratio is typically large in English, and the anterior-posterior movements of the tongue, as well as lip rounding, strongly influence it. The authors predicted that when articulatory undershoot is present, this ratio diminishes (Sapir et al., 2010), and indeed, their results displayed significant differences between the groups for this ratio. In another study, Sapir and colleagues (2007) found smaller ratios in PD as opposed to a healthy control group. Moreover, voice treatment targeting vowel articulation significantly improved this ratio. The Czech male speakers with PD in the study by Rusz et al. (2013) also exhibited a smaller ratio compared to the healthy control participants, which was confirmed for German speakers by Strinzel et al. (2017).

Lastly, the stability of vowel frequencies is sometimes used for acoustic analysis, using the standard deviations of the formant frequencies. This can be calculated per formant, by either combining the three corner vowels or analyzing them separately. Strinzel et al. (2017) did the former, and found that F2 variability differed significantly between the female PD speakers and the healthy control group. The male groups did not reveal any differences in vowel variability. Other researchers also found decreased formant steadiness in people with hypokinetic dysarthria compared to nonimpaired speakers (Beverly et al., 2008). Skodda et al. (2013) calculated the coefficient of variation to analyze variability, by dividing the standard deviation by the mean of all formant values and multiplying by one hundred. They did this for all formants separately and found higher F1i variability for males and females with SCA than for healthy controls. Moreover, in the male group this correlated with intelligibility. Kim et al. (2011) found a main effect of regression for both F1 and F2 variability in spastic dysarthria, which also influenced intelligibility.

1.5. Intelligibility Testing

As seen in the article by Darley et al. (1969a), different aspects of the speech can influence intelligibility. These aspects could be taken into account in speech intervention, which is often focused on increasing the intelligibility of dysarthric speech. Intelligibility can be measured using objective or subjective measurements, as outlined by Hustad (2006).

Objective intelligibility is often measured by transcribing words, or by choosing between different word options after hearing a word. The percentage of correct answers or transcriptions can then easily be calculated. There are also standardized tests to assess intelligibility, such as the Sentence Intelligibility Test (Yorkston, Beukelman & Tice, 1996), the Phoneme Intelligibility Test (Yorkston, Beukelman & Tice, 1999), and the Assessment of Intelligibility of Dysarthric Speech (Yorkston & Beukelman, 1981).

Subjective intelligibility means assigning a value to a heard word, often either on an interval scale or using percentage estimations (Hustad, 2006). The latter is often used in clinical settings. After finding inconsistent results in previous studies, Hustad (2006) described how she compared objective intelligibility through transcriptions with subjective intelligibility using percentage estimates. Listeners were instructed to transcribe sentences, and after each group of ten sentences, they were asked to estimate how many of the previously heard sentences they had found intelligible. The author found that the percentages of estimated intelligibility were lower than the percentage of correctly transcribed words. This indicates that subjective judgements of intelligibility are less reliable and that measuring intelligibility objectively is preferable.

For the current study, it was necessary to assess the intelligibility of the corner vowels rather than of entire words. One way of doing this is by using minimal pairs. For instance, Levy et al. (2016) investigated the intelligibility of children with spastic dysarthria. Because their goal was to look explicitly at vowel intelligibility, they instructed the children to repeat words that formed minimal pairs. These words contained contrastive vowels, such as ‘soap-soup’ [/o-u/]. Adults orthographically transcribed these minimal pairs and rated the intelligibility on a scale of 1-9. The authors found a discrepancy between the transcriptions and the ease-of-understanding ratings, and concluded that the former method is preferable, since other variables could influence intelligibility in subjective rating. This way of measuring objective vowel intelligibility with minimal pairs would also have been preferred in the current study, although the materials were not specifically designed for this approach. Van Wijngaarden (2001) studied the phoneme and sentence intelligibility of native and non-native speakers of Dutch. For the phoneme intelligibility, he used CVC-structured words, and for the vowels he gave the raters 15 word options to choose from. Only the middle vowel differed between the word options, making them minimal pairs, so that only the intelligibility of the vowel was measured. Therefore, the method by Van Wijngaarden (2001) was adapted for the current study, and will be described in further detail later in this thesis. Furthermore, subjective intelligibility scales were used for comparison with the objective judgements, to see whether the two differ, as in previous studies (Hustad, 2006; Levy et al., 2016).

1.6. The Current Study

Taken together, studies have thus far investigated acoustic measures in different dysarthrias, as well as the influence of vowel measures on intelligibility, though none have (successfully) combined these two aspects. Therefore, the aim of the current study was to differentiate between the dysarthria types occurring in PD (hypokinetic dysarthria) and SCA (ataxic dysarthria) as classified by Darley et al. (1969a), using the vowel formants of the three corner vowels, and to see how the corresponding vowel measures correlate with intelligibility. Vowel measures were used to analyze the speech of the speakers with PD, with SCA, and without neurological impairment. Next, the intelligibility of these three groups was judged by naïve listeners, after which correlations between the vowel measures and the intelligibility of the different dysarthrias were computed. The recordings of the dysarthric speakers came from the same database as used by Verkhodanova et al. (2019). After promising results in investigating the prosody of different dysarthria types, Verkhodanova et al. (2019) recommended further research into other acoustic cues, such as formant measurements, in light of different groups of dysarthria and its perception. The current study can be seen as a follow-up on their study, with the main goal of analyzing how the speech of people with different types of dysarthria varies in vowel formant deviations, as captured by acoustic analyses and intelligibility tests.

First, five different vowel measures were used to investigate differences in formant frequency between the three groups. These measures were (1) Vowel Space Area (VSA), (2) Vowel Articulation Index (VAI), (3) F2i/F2u ratio, (4) F1/F2 contrasts, and (5) F1/F2 variability. There were doubts about the applicability of VSA as a measure to differentiate between dysarthric and non-dysarthric speech, given the conflicting results of previous studies regarding VSA (Kim et al., 2011; Lansford & Liss, 2014a, 2014b; Rusz et al., 2013; Sapir et al., 2007; Sapir et al., 2011; Strinzel et al., 2017). However, general patterns can be found with VSA, and vowel triangles can easily be visualized through VSA (Skodda et al., 2013), which is why it was still used in this study. VAI was used as an alternative to capture differences in vowel space between dysarthria types, since it is generally less gender sensitive and has been shown to reveal differences regardless of dysarthria severity (Roy et al., 2009; Skodda et al., 2011). F2i/F2u ratios were measured as well, since previous studies have demonstrated smaller ratios for dysarthric speakers compared to non-dysarthric speakers, with centralized vowels as the underlying cause of small ratios (Rusz et al., 2013; Sapir et al., 2007; Sapir et al., 2010). Contrasts between the F1 of /a:/ and /i, u/, and between the F2 of /i/ and /a:, u/, were used to see whether differences in frequency between high and low vowels and between back and front vowels become smaller in dysarthric speakers. According to Kim et al. (2011), contrasts were present in most of their participants with dysarthria, though they did correlate with intelligibility. The fifth measure was F1 and F2 variability using standard deviations, which measures the stability with which a formant frequency is maintained within a vowel. For both hypokinetic and ataxic dysarthria, studies have shown reduced formant stability, i.e. increased variability (Beverly et al., 2008; Skodda et al., 2013; Strinzel et al., 2017), and thus it was interesting to investigate this in the current study as well.

In the second part of this study, a task was administered to naïve listeners to assess the level of intelligibility of each dysarthric and non-dysarthric speaker. Single-syllable words containing corner vowels, pronounced by these speakers, were used in this experiment. Adapting the method of Van Wijngaarden (2001), listeners heard a word and then chose between six word options that differed only in the vowel, so that only vowel intelligibility was studied. Levels of intelligibility were calculated from the number of accurate responses, and could then be used to study differences in intelligibility between the three groups of speakers. Subjective intelligibility was also measured to see whether it corresponded with the accuracy scores or differed from them, as in previous studies (Hustad, 2006; Levy et al., 2016).

Finally, the outcomes of the vowel measures were linked to the outcomes of the intelligibility experiment, to see whether any of the vowel measures correlated with intelligibility. Lansford and Liss (2014a) found that all vowel measures used in their study were linked to the intelligibility of male speakers, whereas other researchers found only specific measures to correlate with intelligibility (Kim et al., 2011; Skodda et al., 2013), or found no correlations at all (Strinzel et al., 2017). Hence, the current study will hopefully give more clarity with regard to these conflicting results.

2. Methods

2.1. Participants

Participants in this study were taking part in a larger research project on dysarthria by Verkhodanova et al. (2019). The participants were divided into three groups, two consisting of dysarthric speakers and one serving as a healthy control group. The first dysarthria group consisted of speakers with PD, and the speakers in the other group were suffering from SCA or a related disease (see Table 2 for specifications). The assumed corresponding dysarthria types were hypokinetic and ataxic dysarthria, respectively, with the classification based on the articles by Darley et al. (1969a & 1969b). All groups were age- and gender-matched, and each group consisted of eight native Dutch-speaking individuals, adding up to a total of 24 participants.

Table 2
Summary of Demographics

- Mean Age: HC 55;10, PD 55;1, SCA 51;11
- Gender (male:female): HC 4:4, PD 4:4, SCA 3:5
- Mean Duration of Disease: HC N.A., PD 13;5, SCA 6;1
- Diagnosis: HC N.A.; PD idiopathic Parkinson’s Disease; SCA spinocerebellar ataxia, adult form of Alexander disease, idiopathic late onset cerebellar ataxia, or multiple system atrophy with cerebellar ataxia
- Assumed Dysarthria Type: HC N.A., PD hypokinetic dysarthria, SCA ataxic dysarthria


A summary of the participant characteristics of each group can be found in Table 2. For complete descriptions of the individual characteristics, see Table A1 in Appendix A. Exclusion criteria for the dysarthric speakers were the following, as described in the article by Verkhodanova et al. (2019): scores lower than 26 on the Dutch Mini-Mental State Examination (MMSE; Kok & Verheij, 2002); impaired vision and/or hearing; brain damage due to stroke resulting in other language and/or (motor) speech impairments besides dysarthria (aphasia or apraxia of speech). The healthy controls were also excluded if they scored lower than 26 on the MMSE or if they were experiencing language and/or (motor) speech impairments.

2.2. Data Collection

For the collection of speech, different tasks were administered to the participants for the purposes of the larger research project by Verkhodanova et al. (2019). The tasks were audio recorded and were administered either in a quiet room in the University Medical Centre Groningen or at the participants’ homes. Before starting with the recordings, participants read and signed an informed consent form. After this, they filled in a questionnaire which included questions concerning the exclusion criteria, as well as questions about medication use and their daily use of language. They were told that all speech would be recorded during the remaining time with the examiner. The recording device used was a TASCAM DR100, combined with an external Sennheiser e865 microphone, which was placed in front of the participant at a distance of 40 cm (Verkhodanova et al., 2019).

During the speech collection, the MMSE was administered first, and participants had to score at least 26 in order to continue with the speech tasks of the data collection. This cutoff score was chosen to ensure that participants did not show any signs of dementia. After this, participants were interviewed, and were asked to produce a prolonged phonation of the vowel /a:/, to describe pictures and a video, to read a text, and to do tasks that assessed intonation. Administering the entire test battery took between 25 and 35 minutes. Participants did not receive any compensation for taking part in the study.

2.3. Vowel Annotations

Because of the broad purposes of the data collection, each task focused on a different aspect of speech. For this study, free speech collected through interviews, semi-free speech collected through picture and video description tasks, and read speech were used.


By using these samples, more natural sounding speech was used, which is the preferable way of analyzing vowels (Skodda et al., 2013). The recordings were carefully analyzed in the computer program Praat (Boersma & Weenink, 2020), an advanced program specifically designed for analyzing speech. The corner vowels that were suitable for this study were annotated, following the strict rules composed by Strinzel et al. (2017):

- Vowels were at least 40 milliseconds long and the most stable part of the vowel was selected.

- Vowels could not be preceded by nasals, voiced consonants or other vowels, due to coarticulation effects. Exceptions were made for voiced consonants produced at the same place as the intended vowel, which would therefore not influence the vowel formants.

- Vowels were only included when they were part of intelligible and clear phonated words, excluding for instance whispered speech.

- When a vowel was followed by nasals, glides or other vowels, the end boundary of the vowel was moved closer to the beginning of the vowel to avoid coarticulation effects.

2.4. Phonetic Analysis

After carefully annotating all eligible vowels, the values of the first two formants of these vowels were measured using Praat scripts. With the retrieved values, the vowel measures could be calculated to analyze differences between the three groups. These vowel measures were chosen based on previous research showing that they were sensitive enough to detect differences between groups or that they influenced intelligibility. Calculations were done using statistical scripts written in R.
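The scripts themselves are not reproduced in this thesis; a minimal R sketch of this step is given below, assuming (hypothetically) that the Praat script exports one row per annotated vowel to a file formants.csv with columns speaker, group, vowel, F1 and F2. These file and column names are illustrative, not the actual ones used.

# Hypothetical export from the Praat formant script: one row per annotated vowel
formants <- read.csv("formants.csv")   # assumed columns: speaker, group, vowel, F1, F2
# Mean F1 and F2 per speaker and corner vowel, the starting point for the measures below
means <- aggregate(cbind(F1, F2) ~ speaker + group + vowel, data = formants, FUN = mean)
head(means)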

2.4.1. Vowel Space Area

First, the mean vowel space area (VSA) was calculated, which covers the area between the three corner vowels. The corresponding equation is the following, where F1i stands for first formant of /i:/, F2a for second formant of /a:/ and so on (Liu et al., 2005):

VSA = 0.5 · | F1i · (F2a - F2u) + F1a · (F2u - F2i) + F1u · (F2i - F2a) |

The higher the outcome of the formula, the more widespread the corner vowels are. Consequently, a smaller value indicates a reduced vowel space area and more centralized corner vowels. VSA was calculated for each participant with the mean formants of each vowel. Afterwards, the three groups were compared using non-parametric Kruskal-Wallis rank sum tests, with Mann-Whitney U tests as post hoc tests.
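A minimal R sketch of this calculation, continuing from the hypothetical formants.csv layout assumed in Section 2.4 (column names after reshaping are also assumptions):

# Per-speaker mean formants in wide format: columns F1.i, F2.i, F1.a, F2.a, F1.u, F2.u
formants <- read.csv("formants.csv")
means <- aggregate(cbind(F1, F2) ~ speaker + group + vowel, data = formants, FUN = mean)
wide <- reshape(means, idvar = c("speaker", "group"), timevar = "vowel", direction = "wide")
# Triangular vowel space area per speaker (Liu et al., 2005)
wide$VSA <- 0.5 * abs(wide$F1.i * (wide$F2.a - wide$F2.u) +
                      wide$F1.a * (wide$F2.u - wide$F2.i) +
                      wide$F1.u * (wide$F2.i - wide$F2.a))
kruskal.test(VSA ~ group, data = wide)                                 # overall group comparison
pairwise.wilcox.test(wide$VSA, wide$group, p.adjust.method = "none")   # Mann-Whitney U post hocs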

2.4.2. Vowel Articulation Index

The vowel articulation index (VAI) is a ratio introduced by Roy et al. (2009) and is expressed as follows:

VAI = (F1a + F2i) / ((F1i + F1u) + (F2a + F2u))

This measure is also based on the vowel space, but it is believed to be more stable across different groups and, in addition, able to detect more subtle differences than the VSA equation (Roy et al., 2009; Skodda et al., 2011). VAI also captures vowel centralization, where a smaller ratio indicates greater centralization and vice versa. As with VSA, the mean VAI was first calculated for each individual, and subsequently the three groups were compared using Kruskal-Wallis rank sum tests with Mann-Whitney U tests for post hoc testing.
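Under the same assumptions, and continuing from the hypothetical wide per-speaker data frame in the VSA sketch above, the VAI amounts to one extra column:

# VAI per speaker: (F1a + F2i) / (F1i + F1u + F2a + F2u), hypothetical column names as above
wide$VAI <- (wide$F1.a + wide$F2.i) / (wide$F1.i + wide$F1.u + wide$F2.a + wide$F2.u)
kruskal.test(VAI ~ group, data = wide)
pairwise.wilcox.test(wide$VAI, wide$group, p.adjust.method = "none")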

2.4.3. F2i/F2u Ratio

The ratio between the second formants of /i:/ and /u:/ was also calculated to measure anterior and posterior movements of the tongue. This measure is expressed as F2i / F2u, and the lower the ratio, the more reduced the tongue movement (Sapir et al., 2010). For each individual, the ratios of all /i/ and /u/ combinations were calculated, and distributions were created for each participant. These distributions were then taken together, and the groups were compared with Kruskal-Wallis rank sum tests and Mann-Whitney U post hoc analyses.
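A sketch of this step under the same hypothetical data layout, taking every /i/ token against every /u/ token per speaker:

# All F2(/i/) / F2(/u/) combinations per speaker, from the hypothetical long-format data frame
formants <- read.csv("formants.csv")
ratio_list <- lapply(split(formants, formants$speaker), function(d)
  as.vector(outer(d$F2[d$vowel == "i"], d$F2[d$vowel == "u"], "/")))
ratios <- data.frame(speaker = rep(names(ratio_list), lengths(ratio_list)),
                     ratio   = unlist(ratio_list))
ratios$group <- formants$group[match(ratios$speaker, formants$speaker)]
kruskal.test(ratio ~ group, data = ratios)
pairwise.wilcox.test(ratios$ratio, ratios$group, p.adjust.method = "none")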

2.4.4. F1/F2 Contrasts

Fourth, the contrasts of the first two formants between the three corner vowels were analyzed. Here, the vowel served as the independent variable, with the F1 and F2 frequencies as dependent variables. As /a:/ and /u:/ are back vowels and /i:/ a front vowel, F2 contrasts were expected between /i:/ and /a:, u:/. F1 contrasts were expected between the low vowel /a:/ and the high vowels /i:/ and /u:/. For each speaker, nonparametric Kruskal-Wallis rank sum tests were executed with post hoc tests to examine differences in F1 and F2 between all three vowels. Differences, or the lack thereof, were noted for each speaker, after which differences between the three groups could be assessed.
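A sketch of the per-speaker contrast test for F1 (the F2 test is analogous), again under the assumed formants.csv layout:

# Per-speaker test of whether F1 differs between the three corner vowels
formants <- read.csv("formants.csv")
p_f1 <- sapply(split(formants, formants$speaker),
               function(d) kruskal.test(F1 ~ vowel, data = d)$p.value)
# A significant p-value means this speaker keeps the corner vowels acoustically distinct on F1
round(p_f1, 3)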

2.4.5. F1/F2 Variability

Lastly, the variability of F1 and F2 was analyzed for every vowel, using the mean standard deviation of each speaker. The outcome indicates the stability of the pronounced vowel, as explained by Kim et al. (2011). Higher variability means that the speaker is less capable of maintaining a stable vowel. For this measure, the mean variability for each individual was extracted through R, and the groups were compared using nonparametric Kruskal-Wallis rank sum tests and Mann-Whitney U tests for post hoc analyses.
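A sketch of this computation under the same hypothetical data layout:

# Formant variability: SD per speaker and vowel, then averaged over the three corner vowels
formants <- read.csv("formants.csv")
sds <- aggregate(cbind(F1, F2) ~ speaker + group + vowel, data = formants, FUN = sd)
variability <- aggregate(cbind(F1, F2) ~ speaker + group, data = sds, FUN = mean)
kruskal.test(F1 ~ group, data = variability)   # F1 variability; repeat with F2 for F2 variability
pairwise.wilcox.test(variability$F1, variability$group, p.adjust.method = "none")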

2.5. Perceptual Task

To assess the intelligibility of the corner vowels, a perceptual task was administered to naïve listeners in the second part of the study. Words were selected from the words with annotated corner vowels, structured as either CVC or CVCC/CCVC. Two words per vowel were selected for each speaker, adding up to a total of six target words per speaker. Unfortunately, not every speaker produced enough suitable words, so for some speakers only four or five target words were used in the task. Six fillers per speaker were selected as well, of which four were words with a non-corner vowel and two were nonwords with a non-corner vowel. This way, the risk of participants detecting a pattern in the stimuli was reduced. In total, 251 words were included in the perceptual task; they can be found in Appendix B.

Following the method described by Van Wijngaarden (2001), listeners heard a word, and after each word they chose, from six word options, the one that best matched the word they had heard. These word options could be either real words or nonwords, and contained either a rounded (/o:, ɔ, u, ø, y, ʏ/) or an unrounded vowel (/i, ɪ, e:, ɛ, a:, α/), depending on whether the pronounced vowel was rounded or unrounded. This way, the word options were minimal pairs of the uttered word, as only the vowel deviated from the intended word. For example, ‘toen’ (then) would have the word options – besides the intended word – ‘toon’, ‘ton’, ‘teun’, ‘tuun’, and ‘tun’. The script of the experiment was written in JavaScript and executed in the online experiment tool JATOS (Lange, Kühn & Filevich, 2015).

Eighteen listeners, all native speakers of Dutch, participated in the experiment, of whom 12 were female; the mean age was 24;9 (SD = 1;10). Two more people enrolled for the experiment, but both ran into technical issues and thus could unfortunately not participate. The participants were all following or had finished higher education, and none were professionals in the linguistic field, though two had experience working with people with neurodegenerative diseases. They all grew up and/or were living in the northern provinces of the Netherlands, which was convenient since the dysarthric speakers lived in the same area, so possible dialect differences would have less influence. Detailed demographics of these participants can be found in Appendix C. The Dutch listeners first digitally signed an informed consent form and filled in a questionnaire including questions about, for instance, language use, education level, and current profession. After finishing the questionnaire, they were automatically directed to the online experiment in JATOS (Lange et al., 2015). They were instructed to use headphones during the entire experiment to ensure that the items could be heard clearly. First, instructions were shown, and a trial round with three words – a target word, a real-word filler, and a nonword filler – was completed. After this, each item was played twice in an automatically generated randomized order, and listeners selected one of the shown words as described above. Every time a word was selected, a screen with a slide bar followed. On a scale of 1 to 5, listeners could indicate how intelligible they thought the word was. The entire experiment took between 20 and 25 minutes. The listeners did not receive any compensation for their participation.

2.6. Analysis Perceptual Task

Listeners could choose between six answer options for each item, but only one word option was correct. Thus, the outcomes of the experiment were scored as either correct (1) or incorrect (0), with which statistical analysis could be conducted. The subjective intelligibility scores remained on their original scale of 1 to 5. For comparisons between objective and subjective intelligibility, both measures were converted into percentages. These percentages were also used for correlations between the vowel measures and vowel intelligibility. Binary logistic regressions were performed with the accuracy scores as dependent variable and with diagnosis and type of vowel as categorical predictors.
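A minimal R sketch of such a regression, assuming a hypothetical response-level file responses.csv with one row per listener response and illustrative column names:

# Hypothetical response-level data: accuracy (0/1), diagnosis (HC/PD/SCA), vowel (a/i/u)
responses <- read.csv("responses.csv")
responses$diagnosis <- factor(responses$diagnosis)
responses$vowel <- factor(responses$vowel)
# Binary logistic regression of accuracy on diagnosis and vowel type
model <- glm(accuracy ~ diagnosis + vowel, data = responses, family = binomial)
summary(model)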

3. Results

3.1. Acoustic Measurements

Values for all measures were calculated, and their means and standard deviations can be found in Table 3. For all measures, nonparametric Kruskal-Wallis rank sum tests were used because the groups were small and normality could not be assumed. The vowel space areas were first visualized via vowel plots for both genders together and for each gender separately (see Figures 2 to 4). In the vowel plot for all speakers, the PD group seemed to display the smallest vowel space area, followed by the SCA and HC groups, which were close together. This pattern was confirmed by the values for VSA, but Kruskal-Wallis rank sum tests showed that the differences in VSA between groups were not significant (χ²(2) = 0.29, p = .87). The logarithmic version of the VSA was also used to study differences between groups, following Sapir et al. (2011), who found larger differences between dysarthric and non-dysarthric speakers with lnVSA than with VSA. However, no differences were found for log VSA in the current study either (χ²(2) = 0.29, p = .87). Both the HC and PD groups had a mean VAI value of 0.81, and overall the groups did not differ (χ²(2) = 0.62, p = .73). No group differences were found for the F2 ratios between /i/ and /u/ either (χ²(2) = 0.31, p = .86). For the F1 and F2 contrasts, significant individual differences were found for all but a few speakers between the formant frequencies of the intended vowels (see Appendix D). This means that all groups were able to produce distinct vowels. No further comparisons were made between the groups because all groups (almost) reached the maximum score of 1. F1 and F2 stability was analyzed by calculating the mean standard deviation of F1 and F2 of each speaker separately and comparing the three groups using non-parametric Kruskal-Wallis rank sum tests. These deviations did not differ significantly (resp. χ²(2) = 0.22, p = .90; χ²(2) = 0.82, p = .67).

Table 3
Scores for VSA, VAI, F2-ratio, F1/F2 Contrasts and F1/F2 Variability (Mean and SD)

Group       VSA         VAI    F2-ratio  F1 contrasts  F2 contrasts  F1 variability  F2 variability
HC   M      126729.70   0.81   1.67      1             0.94          169.47          433.91
HC   SD     83654.53    0.10   0.36      0             0.24          32.55           107.23
PD   M      108040.06   0.81   1.72      1             1             168.00          464.78
PD   SD     76591.34    0.07   0.33      0             0             38.79           72.71
SCA  M      124195.13   0.78   1.62      1             1             175.55          424.70
SCA  SD     67609.53    0.06   0.24      0             0             44.00           81.17

Note. VSA = Vowel Space Area, VAI = Vowel Articulation Index, HC = Healthy Controls, PD = Parkinson’s Disease, SCA = Spinocerebellar Ataxia, M = Mean, SD = Standard Deviation. The F2-ratio is between /i/ and /u/, F1 contrasts are between /a:/ and /i, u/, F2 contrasts are between /i/ and /a:, u/, and F1 and F2 variability is the averaged standard deviation of each speaker’s formant frequencies.


Figure 2. Vowel Space Areas all Speakers.


Figure 4. Vowel Space Areas Males.

This lack of differences in the vowel measures between groups was not expected, so we examined the data further to identify possible causes of these outcomes. First of all, the standard deviations of the measures were large for each group, indicating great variance within all three groups. However, there were no clear outliers that could be removed from the dataset.

Differences in formant frequencies and formant frequency variability were also analyzed for each vowel separately, to see how the individual vowels influenced the vowel measures. For the formant frequency variability, the standard deviation was calculated per vowel for each speaker. These data can be found in Table E2 of Appendix E; after performing Kruskal-Wallis rank sum tests, no group differences were found for any of the vowels (F1 of /a:/: χ²(2) = 2.84, p = .24; F2 of /a:/: χ²(2) = 2.21, p = .33; F1 of /i/: χ²(2) = 3.82, p = .15; F2 of /i/: χ²(2) = 4.75, p = .09; F1 of /u/: χ²(2) = 5.99, p = .05; F2 of /u/: χ²(2) = 0.82, p = .67). Thus, all groups seemed equally able to maintain each vowel frequency.
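A sketch of how this per-vowel variability comparison could be run is shown below; the token-level file and its column names are assumptions.

```python
# Per-speaker, per-vowel formant variability (standard deviation across tokens),
# compared between the three groups with Kruskal-Wallis tests per vowel and formant.
import pandas as pd
from scipy.stats import kruskal

tokens = pd.read_csv("vowel_tokens.csv")  # assumed columns: group, speaker, vowel, F1, F2

sd_per_speaker = (tokens.groupby(["group", "speaker", "vowel"])[["F1", "F2"]]
                        .std()
                        .reset_index())

for vowel in ["a:", "i", "u"]:
    for formant in ["F1", "F2"]:
        sub = sd_per_speaker[sd_per_speaker["vowel"] == vowel]
        samples = [grp[formant].dropna() for _, grp in sub.groupby("group")]
        h, p = kruskal(*samples)
        print(f"{formant} of /{vowel}/: chi2(2) = {h:.2f}, p = {p:.2f}")
```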

However, differences between groups were found when the distributions of the formant frequencies were analyzed for each vowel separately (see Table E1). This was also done with Kruskal-Wallis rank sum tests, because most datasets were not normally distributed. For the F1 and F2 of the vowel /a:/, large group effects were found (resp. χ²(2) = 38.55, p < .05; χ²(2) = 71.77, p < .05), and Mann-Whitney U tests revealed that the SCA group displayed the largest F1 for the vowel /a:/, followed by the HC and PD group (p < .05). For the F2 of /a:/, the SCA and PD group did not differ significantly (p = .11), but for both the SCA group and the PD group the F2 was significantly higher than that of the HC group (p < .05). The F1 of /i/ also differed significantly between the three groups (χ²(2) = 24.68, p < .05), with the SCA group exhibiting the largest F1 (p < .05), followed by the PD and HC group, which did not differ significantly from each other (p = .55). A significant group difference was also found for the F2 of /i/ (χ²(2) = 17.33, p < .05), with post hoc tests showing a significantly larger F2 for the PD group than for the HC group (p < .05), but a lower F2 for the PD group compared to the SCA group (p < .05). The SCA group did not differ from the HC group (p = .21). For the F1 of /u/, a significant group effect was found (χ²(2) = 11.80, p < .05), with significant differences between the PD and HC group (p < .05) and between the PD and SCA group (p < .05), but no significant difference between the SCA and HC group (p = .15). The Kruskal-Wallis rank sum test revealed no group differences for the F2 of the vowel /u/ (χ²(2) = 0.93, p = .63).
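The post hoc comparisons could be carried out as in the sketch below (here for the F1 of /a:/); again, the token-level table and its column names are assumptions.

```python
# Pairwise Mann-Whitney U tests between groups, following a significant
# Kruskal-Wallis result for a given vowel and formant.
from itertools import combinations

import pandas as pd
from scipy.stats import mannwhitneyu

tokens = pd.read_csv("vowel_tokens.csv")   # assumed columns: group, vowel, F1, F2
f1_a = tokens[tokens["vowel"] == "a:"]     # all /a:/ tokens

for g1, g2 in combinations(["HC", "PD", "SCA"], 2):
    x = f1_a.loc[f1_a["group"] == g1, "F1"]
    y = f1_a.loc[f1_a["group"] == g2, "F1"]
    u, p = mannwhitneyu(x, y, alternative="two-sided")
    print(f"F1 of /a:/, {g1} vs {g2}: U = {u:.1f}, p = {p:.3f}")
```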

3.2. Perceptual Experiment

3.2.1. Objective vs. Subjective Intelligibility

In Table 4, the mean accuracy scores and mean subjective intelligibility scores (ranging from 1 to 5) can be found per group of speakers. To explore whether the objective and subjective intelligibility corresponded, both measures were converted to percentages and compared. The scores per speaker can be found in Appendix F. The results for both measures were not normally distributed, so a nonparametric Wilcoxon signed-rank test was performed. With a mean of 84.38 (SD = 14.46) compared to a mean of 71.74 (SD = 11.82), the percentage of accurate answers was significantly higher than the subjective intelligibility (Z = -3.69, p < .05). Thus, listeners rated the speakers as less intelligible than they actually were.
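A minimal sketch of this paired comparison is given below, assuming a per-speaker table with the two percentage scores (hypothetical file and column names).

```python
# Paired comparison of objective (accuracy %) and subjective (rating %) intelligibility.
import pandas as pd
from scipy.stats import wilcoxon

per_speaker = pd.read_csv("per_speaker_intelligibility.csv")  # assumed columns: accuracy_pct, rating_pct

w, p = wilcoxon(per_speaker["accuracy_pct"], per_speaker["rating_pct"])
print(f"Wilcoxon signed-rank: W = {w:.1f}, p = {p:.4f}")
```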

3.2.2. Diagnosis vs. Accuracy

A chi-square test was performed to analyze accuracy, and it revealed significant differences between the groups (χ²(2) = 38.24, p < .05). The HC group received the highest percentage of correct answers, followed by the SCA group and the PD group (see Table 4). Subsequently, a binary logistic regression analysis was executed to see whether diagnosis actually predicted correct or incorrect answers. In this regression, group was the categorical predictor (with three categories) and accuracy was the categorical dependent variable (with two categories). The 'enter method' was used, with the 'diagnosis' HC as the indicator category for the predictor variable (R² = .00). The groups HC and PD were both significant predictors of accuracy (p < .05). Results indicated that in the perceptual experiment, listeners were more likely to choose a wrong word option when the corresponding speaker had the diagnosis PD (B = -0.96, SE = 0.16). The B estimate was also negative for SCA (B = -0.64, SE = 0.16), but it was not a significant predictor (p = .38).
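For illustration, the sketch below shows one way the chi-square test and the logistic regression could be set up, with HC as the reference category; the item-level table and its column names are assumptions, and the implementation (here statsmodels) need not match the software used in the study.

```python
# Group differences in accuracy: chi-square on the cross tabulation, followed by
# a binary logistic regression with diagnosis as categorical predictor (HC = reference).
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import chi2_contingency

responses = pd.read_csv("perceptual_responses.csv")  # assumed columns: group, correct (0/1)

chi2, p, dof, expected = chi2_contingency(pd.crosstab(responses["group"], responses["correct"]))
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.3f}")

model = smf.logit("correct ~ C(group, Treatment(reference='HC'))", data=responses).fit()
print(model.summary())  # B estimates and standard errors per diagnosis
```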

Table 4

Objective and Subjective Intelligibility (Cross Tabulation)

Group    Incorrect answers   Correct answers   Total   Percentage correct (SD)   Subjective intelligibility (SD)
HC              69                 687           756        0.91 (0.29)               4.00 (1.05)
PD             135                 513           648        0.79 (0.41)               3.46 (1.23)
SCA            115                 605           720        0.84 (0.37)               3.33 (1.20)
Total          319                1805          2124

Note. SD = Standard Deviation, HC = Healthy Controls, PD = Parkinson’s Disease, SCA = Spinocerebellar Ataxia.

3.3. Correlations Vowel Measures and Accuracy

The percentage of correct answers per speaker was used to analyze correlations between these percentages and the values of the vowel measures per speaker. This was done with nonparametric Spearman's correlations, because the percentage accuracy was not normally distributed (p < .05). Correlations were computed between VSA, VAI, the F2-ratio, F1 variability and F2 variability on the one hand and the percentage of accurate answers on the other, but no significant correlations were found (see Table 5). Only VSA and VAI correlated nearly significantly with the percentage of correct answers per speaker. Correlations were also computed between the measures and the subjective intelligibility scale, but here no significant or nearly significant results were found either.
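A sketch of these correlations is given below, assuming a per-speaker table that combines the vowel measures with the percentage of correct answers (hypothetical file and column names).

```python
# Spearman correlations between the per-speaker vowel measures and accuracy.
import pandas as pd
from scipy.stats import spearmanr

speakers = pd.read_csv("per_speaker_measures.csv")  # assumed columns: VSA, VAI, F2_ratio, F1_var, F2_var, accuracy_pct

for measure in ["VSA", "VAI", "F2_ratio", "F1_var", "F2_var"]:
    rho, p = spearmanr(speakers[measure], speakers["accuracy_pct"])
    print(f"{measure} vs accuracy: rho = {rho:.2f}, p = {p:.3f}")
```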


Table 5

Correlations Acoustic and Perceptual Outcomes

                 F1 var         F2 var        VSA           VAI           F2 ratio      Accuracy      Intelligibility
F1 var           1
F2 var           .31 (.15)      1
VSA              .11 (.61)      .69** (.00)   1
VAI              .56** (.005)   .87** (.00)   .57** (.004)  1
F2 ratio         .49* (.015)    .76** (.00)   .36 (.08)     .93** (.00)   1
Accuracy         .31 (.14)      .33 (.12)     .34 (.10)     .39 (.059)    .33 (.12)     1
Intelligibility  -.33 (.88)     .37 (.075)    .36 (.085)    .29 (.16)     .19 (.37)     .80** (.00)   1

Note. F1 var = F1 variability, F2 var = F2 variability, VSA = Vowel Space Area, VAI = Vowel Articulation Index. Cells show the correlation coefficient with the significance (2-tailed) in parentheses; N = 24 for all correlations.

**Correlation is significant at the .01 level (2-tailed).
*Correlation is significant at the .05 level (2-tailed).

4. Discussion

In this study, vowel formant measures were used as an aid to differentiate between a group of people with PD and a group with SCA, and to compare both groups to a group of HC participants. Subsequently, a perceptual experiment was carried out among naïve listeners to investigate the intelligibility of the vowels produced by the dysarthric and non-dysarthric speakers, and to see how these outcomes relate to the vowel measures. No differences between the values of the vowel measures of the three groups were found. There was a difference in intelligibility between the groups, with the PD group being the least intelligible, followed by the SCA and HC group. However, no correlations were found between these intelligibility results and the vowel measures of each speaker. In the following sections, we will look deeper into the results and interpret them in light of previous research.

4.1. Acoustic Measures

The investigated vowel measures based on the first two formant frequency bands were VSA, VAI, the F2-ratio between /i/ and /u/, F1 and F2 contrasts between vowels, and lastly F1 and F2 variability. Vowel plots indicated that the HC group exhibited the largest vowel triangle, followed by the SCA and the PD group. Although this pattern was confirmed by the values of VSA, differences between groups were not significant. These data thus seemed to confirm that general patterns can be found when using VSA (Skodda et al., 2013), although the differences did not reach significance. In the past, researchers claimed that small but present differences in vowels could not be detected through VSA, and argued that VAI should be better at distinguishing between groups (Roy et al., 2009; Skodda et al., 2011; Strinzel et al., 2017). Yet in the current study, differences in VAI between groups were even less pronounced than in VSA and also lacked significance. Several aspects may have influenced these outcomes. First of all, gender could have been of influence, since the vowel triangles were shaped quite differently for each gender. Previous studies have also shown that VSA is susceptible to gender (Roy et al., 2009; Skodda et al., 2011), and some authors claim this is the case for other vowel measures as well, such as VAI (Kim et al., 2011; Skodda et al., 2013). In the vowel charts of the male participants, the HC group clearly displayed a larger VSA than the other two groups, contrary to the plots of the females. That is why we believe that gender could indeed be (part of) the underlying cause of the absence of differences between groups. Unfortunately, the groups were too small to investigate this further. Secondly, there was a lot of variability within the three groups, ranging from extremely small to large values for VSA and VAI. Since there were various types of SCA within the same group, this might at least partly explain the variability within this group. Other authors have also claimed that differences in speech derive from differences between SCA subtypes (Spencer & Slocomb, 2007; Schalling et al., 2007; Schalling & Hartelius, 2013). However, this cannot explain the variation within the HC and PD groups, for which no such reports were found.

We also compared the formant frequencies of the vowels separately, since the corner vowels are expected to be more centralized in cases of dysarthria (Roy et al., 2009). Differences were found between the formants of the vowels /a:/ and /i/ uttered by the SCA and PD group compared to the HC group, though these changes in formant frequencies were not always directed towards the central point of the vowel space area. Significant differences were also not detected between the groups for the F2-ratios. Analyses of the F2 frequencies of /i/ and /u/ revealed that the F2 of /u/ was equal for each group, while for /i/ it was higher in the SCA and PD group than in the HC group, whereas it was expected to be lower in the case of vowel centralization. Both findings caused the ratios not to deviate for the dysarthria groups, even though smaller ratios would be expected when there is articulatory undershoot in dysarthria (Rusz et al., 2013; Sapir et al., 2007; Sapir et al., 2010).

After calculating the F2 contrasts between the front vowel /i/ and the back vowels /a:/ and /u/, it could be concluded that, on average, all speakers were able to distinguish between front and back vowels. The same applies to the F1 contrasts between the low vowel /a:/ and the high vowels /i/ and /u/, which were distinct for all three groups. Previous studies were able to capture centralization with F2-ratios and F1 and F2 contrasts (Kim et al., 2011; Sapir et al., 2010), but unfortunately we were not able to replicate this.

Finally, F1 and F2 variability, operationalized as the standard deviation of the formant frequencies, was used as a measure of vowel stability, since larger variability has previously been found in both hypokinetic and ataxic dysarthric speech due to an inability to maintain a stable vowel frequency (Beverly et al., 2008; Skodda et al., 2013; Strinzel et al., 2017). However, in the current study no differences between groups were found, neither for the variability of all vowels combined nor for each vowel separately. This means that, overall, the speakers were equally able to maintain a stable vowel frequency.

In sum, we were not able to differentiate between the groups with any of the vowel measures. This could be due to the causes described above, such as gender and variability within the groups, but also to more general reasons. For instance, the speakers with PD and SCA were not officially diagnosed with dysarthria; its presence was assumed based upon the article by Darley et al. (1969a). While annotating the vowels, it was harder to identify corner vowels for some PD and SCA speakers than for others, which might indicate that they were in a more advanced stage of dysarthria. Furthermore, Ho et al. (1998) stated that articulation is the last aspect of speech to be affected in people with hypokinetic dysarthria due to PD. Thus, even if dysarthria was present, deviations might not yet show in vowel articulation. Perhaps there were deviations in vowel articulation, but they were still too subtle to be detected by the vowel measures. Not all previous studies agreed unanimously on the usefulness of each vowel measure, because there was a lot of variation in the results of different studies. Moreover, the participant groups of the
