
The role of age and gender in acoustic measurements for people with schizophrenia and major depression




Lisanne de Boer

S3273717

Faculty of Arts

Neurolinguistics

Supervisors: S. Popov and S.G. Brederoo

July 1, 2020



Abstract

Previous literature has investigated whether acoustic measurements can be used to distinguish people with major depression or schizophrenia from healthy people. However, the role of age and gender in such measurements has never been investigated. The main goal of this study is to examine the role of age and gender when distinguishing healthy participants from people with major depression or schizophrenia. An additional goal is to examine whether acoustic analysis allows a differential diagnosis between mental illnesses such as depression and schizophrenia.

In total, 279 native Dutch speakers (18–59 years) were included in this study: the speech of 141 people with schizophrenia, 18 people with major depression and 120 healthy controls was examined. The participants took part in a structured interview, which was recorded. The following acoustic measures were analyzed: pitch, loudness, Jitter, Shimmer, Harmonic to Noise Ratio, speech rate and pause time. For the comparisons between the groups, it was examined whether the differences were caused by age- or gender-related differences.

The results showed gender-related differences between the controls for almost all of the acoustic measures. Age-related differences, however, played only a small role in men.

Furthermore, men with schizophrenia differed from controls on different acoustic measures than women with schizophrenia. The variability of loudness, the speech rate and the pause time were the only acoustic variables that showed differences that were not influenced by age or gender. The results showed few differences for people with major depression in comparison with controls or people with schizophrenia.

These findings suggest that it is hard to say whether the differences found in acoustic measurements between healthy and unhealthy people are caused by the disease. This study gives an overview of the role of gender and age in speech, which brings us one step further in using acoustic measurements for diagnosing mental illnesses. However, many other parameters could influence speech. Further research is needed to be sure that the differences in speech are caused by the illness and not by something else.


Contents

Introduction
Background
Voice measurement
Acoustic voice measures in schizophrenia
Acoustic voice measures in depression
Differential diagnosis
Influences of therapy
Demographic parameters
The current study
Research questions
Expectations
Method
Participants
Materials
Procedure
Statistical Analyses
Results
Difference between healthy men and women
Differences between healthy people and people with schizophrenia
Differences between healthy people and people with Major depression
Differences between schizophrenia and people with major depression
Discussion
The differences between healthy controls
The differences between people with major depression and healthy controls
The differences between people with major depression and people with schizophrenia
Limitations and recommendations for future research
Conclusion
References
Appendix A – interview worksheet
V1 - interview
Part A
Part B
V1 – questionnaires
V2 – interview
Part B
Part C
V2 – questionnaire
V3 - interview
Part A
Part C
V3 – questionnaire


Introduction

In 2018, 57,600 people with schizophrenia were registered in the medical system in the Netherlands: 37,000 men and 20,600 women (RIVM, 2020a). Furthermore, there were 554,600 people with depression: 192,300 men and 362,300 women (RIVM, 2020b). These mental illnesses have a big impact on people's well-being. Early diagnosis of mental illnesses is important to give people the help that they need as soon as possible.

There is a growing trend of using voice measures as biomarkers for diagnosing mental illnesses. With fast-moving technology, it may become possible to diagnose a disease at home. However, it could be questioned whether voice measures are reliable enough or whether we had better stick to the current diagnostic techniques. Nowadays the DSM-5 is the gold standard of psychiatry and there are standard diagnostic instruments for mental illnesses, but many studies make critical comments about the DSM-5 (Koukopoulos, Sani & Ghaemi, 2013). According to Koukopoulos et al. (2013), the DSM-5 proposes diagnostic criteria for depression with mixed features that will lead to more misdiagnosis. For now, mental illnesses like schizophrenia or major depression are often diagnosed using questionnaires and the expert judgment of a psychologist or psychiatrist. For example, major depression is diagnosed based on a clinical interview and mental status examination, a method with relatively low reliability (Afshan et al., 2018). Specifically, screening instruments are hampered by poor specificity and sensitivity, and no reliable biomarkers exist (Afshan et al., 2018). Moreover, it is estimated that comorbid depression occurs in 50% of the people with schizophrenia (Buckley, Miller, Lehrer & Castle, 2009). Perhaps speech characteristics could be a biomarker that could distinguish depression from schizophrenia and improve the reliability of diagnosing mental illnesses.

In several studies voice characteristics are mentioned as measures to distinguish healthy and unhealthy people (Compton et al., 2018; Kraepelin, 1921). For now, it is unknown whether people with major depression and schizophrenia can be distinguished from healthy participants based on voice characteristics. If voice characteristics are able to distinguish mental illnesses, they could help to make diagnosis more reliable. Moreover, it would be interesting to know if

However, voice characteristics are also used to distinguish within the group of healthy individuals. For example, age and gender can be derived from voice measurements (Rojas, Kefalianos & Vogel, 2020; Singh, 2019). This leads to the question whether it is possible to distinguish healthy from unhealthy people if individual differences such as age and gender are taken out of the equation. Moreover, it could be questioned, for example, whether women with schizophrenia differ from a control group on other voice characteristics than men with schizophrenia. Before acoustic measurements can be used for diagnosing mental illnesses, it has to be clear how age and gender influence the voice.

As mentioned above, there are two main problems. First, it is unknown whether patient groups can be distinguished from each other and from the controls with the use of voice characteristics. Second, the influence of age and gender on the voice characteristics is unknown. The aim of the current study is to examine the role of age and gender when distinguishing healthy controls from people with major depression or schizophrenia based on acoustic analyses. These findings can be used for developing a screening instrument based on voice measurements that can be used for the early diagnosis of mental illnesses.


Background

Voice measurement

For this study, the people with major depression, schizophrenia and controls will be compared with regard to seven voice characteristics. Vocal pitch and loudness are indicative of prosody. Jitter, Shimmer and Harmonic to noise ratio (HNR) will be used to describe the voice quality. Articulation rate and pause time will be examined to measure the speech velocity.

Vocal pitch

Pitch, in speech, is the relative highness or lowness of a tone as perceived by the ear, which depends on the number of vibrations per second produced by the vocal cords. Pitch is the main acoustic correlate of tone and intonation (The Editors of Encyclopaedia, 1998). Fundamental frequency (F0) describes the actual physical phenomenon, whereas pitch describes how our ears and brains interpret the signal in terms of periodicity. F0 is expressed in Hertz (Hz). In the current literature the terms F0 and pitch are used interchangeably. The mean, variability and range of F0 will be examined in the current study.

Loudness

The loudness of the voice is expressed in decibels (dB). For the current study the mean, variability and range of loudness will be examined. Pitch and loudness reflect the basic aspects of the voice source. It has been noticed that speakers who raise their loudness of phonation also raise their mean voice fundamental frequency (Gramming, Sundberg, Ternström, Leanderson & Perkins, 1988).
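The mean, variability and range named above for F0 and loudness can be illustrated with a minimal Python sketch. It assumes that variability is taken as the sample standard deviation and that unvoiced frames are marked as None; the function name summarize_contour and the example values are hypothetical and do not come from the analysis pipeline of this study.

```python
import statistics


def summarize_contour(values):
    """Return mean, SD and range of a contour (e.g. F0 in Hz or loudness in dB).

    Assumption: frames without a measurement are given as None and are skipped.
    """
    measured = [v for v in values if v is not None]
    if len(measured) < 2:
        raise ValueError("need at least two measured frames")
    return {
        "mean": statistics.mean(measured),
        "sd": statistics.stdev(measured),  # sample standard deviation
        "range": max(measured) - min(measured),
    }


# Hypothetical F0 contour in Hz with one unvoiced frame.
f0_hz = [110.0, 120.0, None, 130.0, 120.0]
print(summarize_contour(f0_hz))
```

The same function could be applied to a loudness contour in dB, since only the unit of the input changes.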

Jitter

Jitter is the cycle-to-cycle variation in frequency; it measures the irregularity of the vocal fold vibrations. Jitter is one of the features that represent voice quality: the higher the Jitter value, the lower the voice quality. Two variants are distinguished here.

Jitter (local, absolute): Represents the average absolute difference between two consecutive periods and is known as jitta.

Jitter (local): Represents the average absolute difference between two consecutive periods, divided by the average period. It is known as jitt and has 1.04% as the threshold limit for detecting pathologies (Teixeira, Oliveira, & Lopes, 2013). For the current study Jitter (local) will be used.
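The two Jitter definitions above can be sketched as follows. This is a minimal illustration, assuming the glottal periods have already been extracted (in seconds); the function names and the example periods are hypothetical, and the extraction step itself is not shown.

```python
def jitter_absolute(periods):
    """Jitter (local, absolute), 'jitta': mean absolute difference
    between consecutive periods, in the unit of the input."""
    if len(periods) < 2:
        raise ValueError("need at least two periods")
    diffs = [abs(b - a) for a, b in zip(periods[:-1], periods[1:])]
    return sum(diffs) / len(diffs)


def jitter_local(periods):
    """Jitter (local), 'jitt': jitta divided by the mean period, in percent."""
    mean_period = sum(periods) / len(periods)
    return 100.0 * jitter_absolute(periods) / mean_period


# Hypothetical periods around 10 ms (a voice near 100 Hz).
periods_s = [0.0100, 0.0101, 0.0099, 0.0100]
print(f"jitta = {jitter_absolute(periods_s):.6f} s")
print(f"jitt  = {jitter_local(periods_s):.2f} %")
```

For these example periods the local Jitter comes out at roughly 1.33%, which would lie above the 1.04% threshold cited from Teixeira et al. (2013).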

Shimmer

The methods used to determine Shimmer are identical to those for Jitter, the main difference being that Jitter considers periods whereas Shimmer takes the maximum peak amplitude of the signal into account. Shimmer represents the average absolute difference between the amplitudes of two consecutive periods, divided by the average amplitude. As with Jitter, the higher the Shimmer value, the lower the voice quality. Jitter and Shimmer represent roughness, breathiness and instability in the voice. Shimmer (local, dB) represents the average absolute base-10 logarithm of the ratio between the amplitudes of two consecutive periods, multiplied by 20, and is called ShdB. The limit to detect pathologies is 0.350 dB (Boersma, 1993; Teixeira et al., 2013).
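By analogy with Jitter, the Shimmer measures can be sketched in Python. The sketch assumes that the peak amplitude of each period is already available and that ShdB follows the usual form given by Teixeira et al. (2013), that is, the mean of |20 · log10(A[i+1] / A[i])|. The function names and example amplitudes are hypothetical.

```python
import math


def shimmer_local(amplitudes):
    """Shimmer (local): mean absolute difference between consecutive
    peak amplitudes, divided by the mean amplitude, in percent."""
    if len(amplitudes) < 2:
        raise ValueError("need at least two amplitudes")
    diffs = [abs(b - a) for a, b in zip(amplitudes[:-1], amplitudes[1:])]
    mean_amp = sum(amplitudes) / len(amplitudes)
    return 100.0 * (sum(diffs) / len(diffs)) / mean_amp


def shimmer_db(amplitudes):
    """Shimmer (local, dB), 'ShdB': mean absolute 20*log10 ratio
    of consecutive peak amplitudes (amplitudes must be positive)."""
    if len(amplitudes) < 2:
        raise ValueError("need at least two amplitudes")
    ratios = [abs(20.0 * math.log10(b / a))
              for a, b in zip(amplitudes[:-1], amplitudes[1:])]
    return sum(ratios) / len(ratios)


# Hypothetical peak amplitudes (arbitrary linear units).
amps = [1.00, 0.90, 1.00, 0.95]
print(f"Shimmer (local) = {shimmer_local(amps):.2f} %")
print(f"ShdB            = {shimmer_db(amps):.3f} dB")
```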

Harmonic to Noise Ratio (HNR)

The HNR is a combination of Shimmer, Jitter and aspiration noise (Severin, Bozkurt & Dutoit, 2005). It quantifies the relative amount of additive noise in the voice signal (Awan & Frenkel, 1994). Additive noise arises from turbulent airflow generated at the glottis during phonation (Hillenbrand, 1987). Inadequate closure of the vocal folds allows excessive airflow through the glottis, giving rise to turbulence. The resulting friction noise is reflected in a higher noise level in the spectrum (Krom, 1993). Noise in the signal may also result from aperiodic vocal fold vibration. The ratio thus reflects the dominance of harmonic (periodic) over noise (aperiodic) levels in the voice and is expressed in dB (Boersma, 1993).
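As a hedged illustration: in the autocorrelation-based approach of Boersma (1993), the HNR can be written as 10 · log10(r / (1 − r)), where r is the height of the normalized autocorrelation peak at the pitch period, that is, the fraction of the signal energy that is periodic. The sketch below assumes r has already been estimated; the function name hnr_db is hypothetical.

```python
import math


def hnr_db(r_max):
    """Harmonics-to-noise ratio in dB from the normalized
    autocorrelation peak r_max, with 0 < r_max < 1."""
    if not 0.0 < r_max < 1.0:
        raise ValueError("r_max must lie strictly between 0 and 1")
    return 10.0 * math.log10(r_max / (1.0 - r_max))


# Hypothetical autocorrelation peaks.
for r in (0.5, 0.9, 0.99):
    print(f"r = {r:.2f} -> HNR = {hnr_db(r):.2f} dB")
```

Under this definition, r = 0.5 corresponds to 0 dB (equal energy in the harmonics and in the noise) and r = 0.99 to roughly 20 dB.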


Articulation rate

Articulation rate is a prosodic feature, defined as a measure of rate of speaking in which all pauses are excluded from the calculation (Goldman-Eisler, 1968). For this study the number of voiced segments per second and the mean and standard deviation of the length of the voiced segments will be examined.

Pause time

There are different kinds of pauses, such as filled pauses and unvoiced pauses. Filled pauses are repetitions of syllables and words, reformulations or false starts. However, for this study only the unvoiced pause times will be examined.
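The timing measures named in the two sections above can be sketched as follows, assuming the recording has already been segmented into voiced and unvoiced intervals. The tuple layout (start, end, is_voiced), the function name speech_timing and the example intervals are assumptions made for illustration only.

```python
import statistics


def speech_timing(segments):
    """Compute simple speech-velocity measures from labelled intervals.

    segments: list of (start_s, end_s, is_voiced) tuples in temporal order.
    """
    voiced = [end - start for start, end, is_voiced in segments if is_voiced]
    pauses = [end - start for start, end, is_voiced in segments if not is_voiced]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced segments")
    total = segments[-1][1] - segments[0][0]  # total recording time in seconds
    return {
        "voiced_segments_per_sec": len(voiced) / total,
        "mean_voiced_len": statistics.mean(voiced),
        "sd_voiced_len": statistics.stdev(voiced),
        "total_pause_time": sum(pauses),  # unvoiced pauses only
    }


# Hypothetical four-second recording.
example = [
    (0.0, 1.0, True),
    (1.0, 1.5, False),
    (1.5, 3.5, True),
    (3.5, 4.0, False),
]
print(speech_timing(example))
```

Filled pauses would count as voiced material in such a segmentation, which matches the choice to examine only unvoiced pause time.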

Acoustic voice measures in schizophrenia

Several studies have reported on vocal pitch in schizophrenia, but the literature offers contradictory findings for the comparison of vocal pitch between participants with schizophrenia and healthy participants. Cohen, Alpert, Nienow, Dinzeo and Docherty (2008) found a lower pitch variability for the schizophrenia group. This was confirmed by the meta-analysis of Parola, Simonsen, Bliksted and Fusaroli (2019). Nevertheless, other studies reported a higher pitch perturbation or average pitch for people with schizophrenia (Graux, Courtine, Bruneau, Camus & El-Hage, 2015; Meaux, Mitchell & Cohen, 2018). Pitch perturbation is the absolute value of the change in fundamental frequency between successive frames. However, many previous studies reported no effect on average vocal pitch (Compton et al., 2018; Parola et al., 2019; Rapcan et al., 2010).

All of the studies used a gender-matched control group and, except for the study of Meaux et al. (2018), the control group was age-matched as well. Previous studies did not focus on age- or gender-related differences between the schizophrenia group and the control group. Only Cohen et al. (2008) mentioned that acoustic analyses show gender-related differences while clinical ratings of vocal inflection do not.

However, the studies used different methods to measure vocal pitch. Graux et al. (2015) used just one vowel, while Cohen et al. (2008) used spontaneous speech. Compton et al. (2018) showed that the method influenced the results. On the one hand, they found a difference for vocal pitch in reading aloud, using an emotionally stimulating text passage. On the other hand, for the spontaneous speech task and the reading aloud task with a more neutral passage, no effect was found for vocal pitch (Compton et al., 2018).

Just a few studies reported on Jitter and Shimmer as valuable acoustic measures for people with schizophrenia (Kliper, Vaizman, Weinshall & Portuguese, 2010; Kliper, Portuguese & Weinshall, 2015). Kliper et al. (2015) reported that people with schizophrenia had a significantly higher mean Jitter and a higher mean Shimmer. In contrast, in a previous study by Kliper and colleagues (2010), Jitter differed from a reference group only for people with depression.

Both studies used gender-matched controls and an interview task for the acoustic measurements. The studies did not report on gender- or age-related differences. No previous studies have examined the loudness or the HNR of the voices of participants with schizophrenia.

Only a few studies reported on speech velocity. Cohen et al. (2008) examined the speech rate of patients with schizophrenia and a clinically rated flat affect, patients without flat affect and a reference group. A flat affect is a severe reduction in emotional expressiveness. Only the speech rate of the patients with clinically rated flat affect was lower than that of the reference group. Moreover, Cohen, Kim and Najolia (2013) examined the pause time of people with schizophrenia. Schizophrenia was associated with increased pause time. Additionally, they reported that the schizophrenia group had a reduced prosody.

For people with schizophrenia, the current literature does not show many strong significant differences for vocal pitch, Jitter or Shimmer, in contrast to people with depression. According to the study of Kliper et al. (2010), high ratings of negative symptoms in people with schizophrenia correlated with longer gaps, shorter utterances and lower emphasis measures in comparison with people with depression and healthy controls. Perhaps vocal pitch, Jitter or Shimmer is a good variable to distinguish people with schizophrenia from people with depression. However, psychiatric comorbidities are common among patients with schizophrenia. It is estimated that comorbid depression occurs in 50% of the people with schizophrenia (Buckley, Miller, Lehrer & Castle, 2009). No studies have examined voice characteristics in comorbid depression.

Acoustic voice measures in depression

Pitch features have been investigated extensively for people with depression. As early as 1921, Kraepelin stated that depressed patients' voices tend to have lower pitch. Moreover, according to previous studies people with depression show a lower pitch (F0) than healthy controls (Horwitz et al., 2013; Kiss & Vicsi, 2017; Kliper et al., 2010; Mundt, Snyder, Cannizzaro, Chappie & Geralts, 2007; Nilsonne, 1988; Quatieri & Malyska, 2012; Stassen, 1993; Ellgring & Scherer, 1996; Wang et al., 2019). Nevertheless, a number of studies report no significant correlation between pitch variables and depression (Alpert et al., 2001; Cannizzaro et al., 2004; Mundt et al., 2012; Quatieri & Malyska, 2012; Teasdale et al., 1980; Yang et al., 2012). According to Cummins et al. (2015) it is possible that these contradictory results are due to the heterogeneous nature of depression symptoms. A few studies examined gender-related differences for pitch in people with depression. Mundt et al. (2007) reported a lower pitch only in the speech of men, not in the speech of women. Vicsi, Sztahó and Tamás (2013) found lower average pitch values for both depressed women and men.

There is evidence that sadness and depression are associated with a decrease in loudness (Scherer, 1987). Moreover, Wang et al. (2019) also reported a significantly lower loudness level for people with depression. In this study, 42 people with major depression were compared with 57 age- and gender-matched healthy controls. They examined the speech loudness in 12 different speech tasks, including an interview task. For all the tasks, people with depression had a lower loudness than healthy controls.

Jitter and Shimmer have also been analyzed for depression. Higher Jitter has been found in depression, which is attributed to the irregularity of the vocal fold vibrations (Scherer, 1987). Shimmer, in contrast, has been reported to be lower for people with depression (Nunes, Coimbra & Teixeira, 2010). However, the study of Nunes et al. (2010) was a small corpus study and therefore not the strongest evidence. Other studies show that both Jitter and Shimmer were higher for people with depression (Vicsi et al., 2013; Vicsi, Sztahó & Kiss, 2012).

Previous studies showed that features such as Jitter, Shimmer, and pitch variability tend to increase with depression severity and psychomotor retardation (i.e., slowing of thought, physical movement, and reaction times), which affects motor control precision and laryngeal muscle tension (Horwitz et al., 2013; Kiss & Vicsi, 2017; Quatieri & Malyska, 2012). For HNR, no significant differences have been reported for people with depression in previous literature.

For speech velocity, Vicsi et al. (2013) found that speech rate and length of pauses in the speech of depressed people show significant changes compared to a healthy reference group. The speech rate was lower for both men and women with depression than for the healthy controls. The length of the pauses showed a large difference between the reference group and people with depression. During the reading of a tale, people with depression made more pauses, which can be caused by psychomotor disorders. The longer pauses resulted in longer recordings.

Moreover, Liu, Kang, Feng and Zhang (2017) investigated the pause time of people with depression. 92 depressed patients and 92 controls matched for age, gender and education were examined on recording time, phonation time and speech pause time. One of the tasks was an interview that had 18 questions on conversational topics. For the interview task all of the variables showed a significant difference for both men and women with depression. The recording time, phonation time and speech pause time were longer for people with depression.

Differential diagnosis

Alpert, Rosen, Welkowitz, Sobin and Borod (1989) performed a discriminant function analysis on 20 people with schizophrenia, 17 people with right-brain damage, 20 people with Parkinson’s Disease and 10 people with unipolar depression to see whether acoustic voice measures could be used to differentiate these groups. Alpert et al. (1989) used variables such as the average and range of voice level, pitch for all of the syllables spoken, the average duration of utterances and the average duration of pauses. The classification results for the acoustic measures showed an accuracy of 69%. Some of the people with schizophrenia had a so-called flat affect, which makes them more similar to people with depression and hard to distinguish with the current methods. However, the results of the discriminant function support the impression that these methods may help to discriminate flat affect from depression (Alpert et al., 1989).

Kliper et al. (2015) compared people with schizophrenia, people with depression and healthy controls with each other. Their study showed that these three groups differ from each other on the basis of mean Shimmer. Healthy people had the lowest level of Shimmer, followed by people with schizophrenia, and people with depression had the highest Shimmer.

Influences of therapy

In a study by Tolkmitt, Helfrich, Standke and Scherer (1982), voice changes as a result of clinical treatment were investigated for 17 people with depression and 15 people with schizophrenia. Both groups showed a decrease of pitch after therapy (p = 0.06 for the people with schizophrenia). Assuming that a high pitch may be indicative of higher tension in the vocal cords, which in turn may be due to higher muscle tone, one can argue that after therapy patients were more relaxed as an effect of treatment (Tolkmitt et al., 1982).

Demographic parameters

Research with healthy participants has shown that many demographic parameters can influence speech. Gender and age in particular affect the prosody, quality and velocity of people’s voices. These parameters need to be considered when looking for differences in acoustic voice measurements between healthy people and people suffering from a psychiatric disorder.

Gender

It is obvious that men and women differ in their vocal pitch. The vocal pitch is lower for men than for women, which can be explained by physical differences between men and women. As can be seen in Figure 1, men's vocal folds are approximately 60% longer than women’s, which contributes to an average rate of vocal fold vibration during phonation (fundamental frequency) that is about 5 standard deviations below women's (Fant, 1960; Titze, 2000; Puts, Apicella & Cárdenas, 2012). The primary growth is located in the anterior two-thirds of the larynx, between the vocal processes and the anterior commissure. This is the membranous length, where vocal fold vibration takes place. Membranous length grows at a rate of 0.4 mm per year for females and 0.7 mm per year for males, up to approximately the age of 20. It is interesting that the growth appears to be nearly linear in both genders through the developing years. The relationship between the pitch and the membranous length can be calculated with pitch = 1700 / membranous length, where pitch is in Hz and membranous length is in millimeters (Titze, 1989).
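As a worked illustration of this relationship, the sketch below evaluates pitch = 1700 / membranous length for two hypothetical lengths. The values of 10 mm and 17 mm are chosen only to make the arithmetic easy and are not measurements reported by Titze (1989); the function name is likewise hypothetical.

```python
def predicted_pitch_hz(membranous_length_mm):
    """First-order 'string model' estimate: pitch (Hz) = 1700 / length (mm)."""
    if membranous_length_mm <= 0:
        raise ValueError("membranous length must be positive")
    return 1700.0 / membranous_length_mm


# Hypothetical membranous lengths in millimeters.
print(predicted_pitch_hz(10.0))  # 170.0 Hz
print(predicted_pitch_hz(17.0))  # 100.0 Hz
```

A longer membranous length thus yields a lower predicted pitch, in line with the lower vocal pitch described for men.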

Figure 1 Male-female comparisons of dimensions of the larynx. (a) Sagittal view of thyroid cartilage and (b) horizontal section showing difference in membranous length (after Kahane, 1978).

Figure 2 illustrates the relationship between the mean speaking pitch and the membranous vocal fold length. The mean pitch in Hz is plotted on the vertical axis, and membranous length in mm is plotted on the horizontal axis. Data for females are represented with a solid line and open circles, whereas data for males are represented by the dashed line and closed circles. Age is indicated as a parameter along the curves, ranging from about 1 year to about 20 years. Also shown in Figure 2 is a hyperbola that shows an inverse relationship between fundamental frequency and membranous length, as would be predicted by a vibrating string model with fixed tension and fixed mass per unit length (Titze, 1989).

Rojas, Kefalianos and Vogel (2020) describe in a systematic review and meta-analysis the acoustic changes in voice data from healthy adults over 50 years of age. Their results show that, acoustically, men have higher scores on measures of absolute Jitter. Furthermore, in some previous studies women displayed minimally less Shimmer and more Jitter than men (Deem, Manning, Knack & Matesich, 1989; Sorensen & Horii, 1983), whereas other studies reported smaller Jitter values for women than for men (Jafari, Till, Truesdell & Law-Till, 1993; Ludlow et al., 1987).

The differences between men and women for Jitter and Shimmer could be related to habitual voice loudness levels. In the study of Brockmann, Storck, Carding and Drinnan (2008), 29 men and 28 women (aged 20–40 years) were asked to phonate an /a/ at three different loudness levels. Brockmann et al. found a correlation between loudness and both Jitter and Shimmer. Jitter and Shimmer increased with decreasing voice loudness, especially in phonations below 75 dB and 80 dB. In soft and medium phonation, men were generally louder and showed less Shimmer. However, men had higher Jitter measures when phonating softly. Gender differences in Jitter and Shimmer at medium loudness may be mainly linked to different habitual voice loudness levels. In the current literature, voice loudness has not been studied in spontaneous speech, only in setups such as that of Brockmann et al. (2008).

Several studies found effects of gender on the articulation rate. Previous studies reported that men spoke faster than women (Byrd, 1994; Jacewicz, Fox, O'Neill & Salmons, 2009; Lutz & Mallard, 1986; Quené, 2008; Verhoeven, De Pauw & Kloots, 2004; Yuan, Liberman & Cieri, 2006). These studies used both reading tasks and conversational speech samples for the acoustic analyses. However, Van Borsel and De Maesschalck (2008) found no difference between men and women. They used a text reading task to measure speech rate.

Figure 2 Mean speaking fundamental frequency F0 as a function of membranous length Lm. Solid line without data points is a hyperbola that serves as a first-order "string" model (after Kahane, 1978).


Dabbs and Ruback (1984) examined the pause time of 50 men and 50 women. The participants had a conversation with people of the same gender. In general, women vocalized more than men and paused less during the conversation. Dabbs and Ruback (1984) noticed that women enjoyed the task more and that men had more awkward pauses in their conversations. To my knowledge, no other studies have investigated gender-related differences in pauses in healthy men and women.

So in general, previous studies showed that men had a lower pitch, probably a lower voice quality and a higher articulation rate. The evidence for gender-related differences in loudness, HNR and pause time seems to be scarce.

Age

As mentioned above, during the normal development of the vocal folds their growth influences the vocal pitch. From early adulthood until the age of about 55 years, voice pitch remains relatively constant, and then changes begin as the tissue structure within the vocal tract starts to deteriorate (Linville, 1996). Hormonal changes affect the vocal pitch; the pitch of men’s voices has been observed to increase with age, while in women it decreases with age (Singh, 2019). However, previous literature is inconsistent about the relationship between age and vocal pitch. Harnsberger, Brown, Rothman and Hollien (2008) showed an increase in pitch for men, while others showed a decrease (Harrington, Palethorpe & Watson, 2007; Nishio & Niimi, 2008). In addition, Ramig and Ringel (1983) reported no differences in the vocal pitch of men. Previous literature is clearer about the relationship between age and pitch in women, with studies reporting a decrease (Ferrand, 2002; Harrington et al., 2007; Nishio & Niimi, 2008; Singh, 2019). However, one study showed that participants aged 80–89 years produce a higher fundamental frequency than participants aged 60–69 years (Rojas et al., 2020). According to the study of Harrington et al. (2007), the pitch change associated with aging was much larger for women than for men. For men, only a trend was found for pitch changes with aging: just a slight increase was observed in participants aged 70 years or older. For women, however, the results showed that those in their 30s and 40s had lower frequencies than those in their 20s.

It is complicated to compare the previous studies because they all used different age groups. On the one hand, Harnsberger et al. (2008) compared the vocal pitch of 16 older men (74–88 years) and 14 younger men (21–29 years). On the other hand, Nishio and Niimi (2008) compared the vocal pitch of 374 healthy speakers (187 males and 187 females) separated into 3 age groups: young (19–34), middle-aged (35–59) and elderly (over 60). In addition, the study of Harrington, Palethorpe and Watson (2007) was a longitudinal study of four people.

Regarding speech loudness, Huber and Spruill (2008) reported that older adults tended to use more abdominal movement in loud speech than younger adults, especially when talking in a noisy environment. Age-related differences seemed to be larger when participants produced a longer utterance as compared to a shorter one. The study compared a group of 25 older adults (15 women with a mean age of 71 and 10 men with a mean age of 73) with a group of 30 young adults (mean age of 22). However, the authors did not report on the habitual speech loudness of the participants. Furthermore, previous studies did not report on the relation between age and loudness.

Ramig and Ringel (1983) reported on the relation between voice quality and age in men who were divided into three age groups (25–35, 45–55 and 65–75). They reported no difference for Jitter, but Shimmer was higher for the older group. However, when they added the variable physical condition, Jitter and Shimmer were significantly higher for the older group. Rojas et al. (2020) suggest that participants aged 80–89 years produce higher Jitter percent, Shimmer percent, and Shimmer in decibels compared to participants aged 60–69 years, and show a significant increase in relative average perturbation, Jitter percent, and Shimmer in decibels compared to participants aged 70–79 years.

In a study by Ferrand (2002), a lower HNR was found for elderly women, whereas no differences were found for Jitter. For this study the participants were separated into 3 age groups, young (21–34), middle-aged (40–63) and elderly (70–90), with 14 women in each group. According to Ferrand (2002), the significant lowering of HNR evident for elderly speakers may be attributable in part to medications taken by the majority of these elderly subjects.

Jacewicz et al. (2009) examined the speech rate of healthy people by comparing a group of Northern speakers (from Wisconsin) with a group of Southern speakers (from North Carolina). In free speech, only Northern young adults spoke faster than older adults. Quené (2008) found a similar effect. High school teachers were interviewed, and speech rate was faster for younger speakers than for older speakers. Additionally, Verhoeven et al. (2004) reported the same results: the average speech tempo was faster for younger speakers (aged 21–40) than for older speakers (aged 45–59). No studies reported on pause time in relation to age.


Other parameters

There are other demographic parameters that seem to influence these acoustic measurements, such as level of education, nativity and geographical origin, culture, ethnicity and race, and smoking (Ayoub, Larrouy-Maestri & Morsomme, 2019; van Bezooijen, 1995; Damborenea et al., 1999; Gonzalez & Carpi, 2004; Hillenbrand, Getty, Clark & Wheeler, 1995; Hudson & Holbrook, 1982; Jacewicz & Fox, 2015; Jacewicz, Fox & Salmons, 2007; Pinto, Crespo, & Mourão, 2014; Singh, 2019; Sorensen & Horii, 1982; Tafiadis, Toki, Miller & Ziavra, 2017). For the current study, these parameters were not included because there was not enough variability with regard to these factors in the participant sample.

However, the evidence for the parameters geographical origin, culture, and ethnicity and race makes it clear that the results from the current study cannot be generalized to other countries. For example, Dutch women seem to have a lower pitch than Japanese women because men in Japan find a higher pitch more attractive, while Dutch men prefer a lower pitch (Van Bezooijen, 1995). Moreover, it seems that skull form and size also influence the voice parameters (Hudson and Holbrook, 1982).

The current study

The aim of the current study is to examine the role of age and gender in acoustic analysis of people with major depression or schizophrenia. A second aim is to examine whether acoustic analysis can be used for differential diagnosis between mental illnesses. The parameters age and gender are central to this study because, according to the literature, these variables influence prosody, voice quality and speech velocity. The current literature shows that there is a difference between healthy controls and people with mental illnesses. However, there is still disagreement in the literature on how the variables used to measure voice are influenced by age and gender. For prosody, voice quality and speech velocity, there are many studies that examined demographic parameters in healthy controls, but not in patient groups. In most of the previous studies the healthy controls were age and gender matched, but the groups were not separated in the analyses. In this study men and women will be divided into different age groups and the healthy controls will be compared with people with schizophrenia and major depression. The goal of this study is to examine whether voice variables are sufficiently reliable to be used for diagnosing mental illnesses. If the role of age and gender is larger than the effect caused by the mental illness, then voice measurements are not reliable for diagnosing mental illnesses. If acoustic analyses are going to be used as a


differences. There are already studies that tested this by using a smartphone for the speech analysis (Fukazawa et al., 2020). They used acoustic analysis to estimate the degree of depression.

Research questions

The main question of this thesis is: how do age and gender affect the voice of participants with major depression or schizophrenia?

To answer the research question, the following sub-questions have been formulated:

1. Is there a difference in the prosody, voice quality and speech velocity between healthy young and middle-aged men and women?

2. Do people with schizophrenia have a different prosody, voice quality and speech velocity than age- and gender-matched healthy controls?

3. Do people with major depression have a different prosody, voice quality and speech velocity than age- and gender-matched healthy controls?

4. Do people with major depression have a different prosody, voice quality and speech velocity than age- and gender-matched people with schizophrenia?

Expectations

According to previous literature, gender- and age-related differences are expected for most of the variables that measure prosody, voice quality and speech velocity. It is likely that vocal pitch is lower and less variable for men than for women. However, age-related differences are not expected for pitch because age-related changes in tissue structure would be expected only after the age of 55 years (Linville, 1996). No predictions can be made for gender- or age-related differences in loudness, as the literature is limited with regard to this subject. It is expected that men have a higher level of Jitter and Shimmer than women (Rojas et al., 2020). For age-related differences in Jitter and Shimmer no hypothesis can be made based on previous literature. For HNR no earlier studies compared genders, but because of its relation with Jitter and Shimmer it is expected that women have a higher HNR. Additionally, few age-related differences are expected for prosody or voice quality because these would be expected at an older age than that of the participants in this study (Ferrand, 2002). For speech rate it is expected that men speak faster than women and that speech rate decreases with age. For the gender and age effects on pause time no hypothesis can be made based on previous literature.

The current literature is inconsistent about speech differences between people with schizophrenia and healthy controls. There are studies that did not find a difference for vocal pitch (Rapcan et al., 2010), but there are also studies that report that people with schizophrenia have a higher pitch than healthy controls (Graux, Courtine, Bruneau, Camus & El-Hage, 2015; Meaux, Mitchell & Cohen, 2018) and a study that reports a lower pitch for people with schizophrenia (Cohen, Alpert, Nienow, Dinzeo & Docherty, 2008). Previous literature reports a higher Jitter and Shimmer for people with schizophrenia than for controls (Kliper, Vaizman, Weinshall & Portuguese, 2010; Kliper, Portuguese & Weinshall, 2015). Based on previous literature, it is expected that people with schizophrenia have a longer pause time (Cohen, Kim & Najolia, 2013). For speech time no differences were found between people with schizophrenia and controls. Loudness and HNR were not examined in previous studies of people with schizophrenia. Moreover, most of the studies had gender- and age-matched controls, but gender- and age-related differences were not examined.

Based on previous literature, a lower pitch and a higher Jitter and Shimmer are expected for people with depression in comparison with controls (Horwitz et al., 2013; Kiss & Vicsi, 2017; Kliper et al., 2010; Mundt, Snyder, Cannizzaro, Chappie & Geralts, 2007; Nilsonne, 1988; Quatieri & Malyska, 2012; Stassen, 1993; Ellgring & Scherer, 1996). However, the literature still debates whether these findings are gender-related (Mundt et al., 2007; Vicsi et al., 2013). Furthermore, previous literature reported a decrease in loudness levels for .. compared to .. (Scherer, 1987; Wang et al., 2019). Besides that, it is expected that people with depression have a lower speech rate and longer pauses (Liu, Kang, Feng & Zhang, 2017; et al., 2013).

For the comparison between major depression and schizophrenia no expectation can be made because of the limited previous literature.


Method

Participants

In total, 279 native Dutch speakers were included in this study (see Tables 1 and 2). Of these participants, 141 individuals were diagnosed with schizophrenia or another psychotic disorder, 18 individuals were diagnosed with major depression, and 120 were healthy controls without a psychiatric disorder. All patients were recruited from several medical projects at the University Medical Center Utrecht (UMCU), and the healthy controls were recruited through flyers in the UMCU. The healthy controls were age and gender matched with the schizophrenia group. For the age-related analyses, the groups were separated into two age levels: a young group (18-34) and a middle-aged group (35-59).
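The age split used for the age-related analyses can be written down as a small helper. This is only an illustrative sketch: the function name `age_group` is hypothetical, and the boundaries are taken from the age ranges listed in Table 1 (young 18-34, middle aged 35-59). Ages outside the included range are rejected rather than silently assigned to a group.

```python
def age_group(age: int) -> str:
    """Return the age group label for an included participant.

    Boundaries follow the ranges reported in Table 1; the function
    name and the error handling are illustrative assumptions.
    """
    if 18 <= age <= 34:
        return "young"
    if 35 <= age <= 59:
        return "middle-aged"
    raise ValueError(f"age {age} is outside the included range of 18-59")


if __name__ == "__main__":
    for example_age in (18, 34, 35, 59):
        print(example_age, age_group(example_age))
```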

For the comparison between people with major depression, people with schizophrenia and healthy controls, participants were matched on age and gender to make the groups more equal, because the major depression group was much smaller and had a higher average age (see Table 2).

People with a speech disorder (e.g. stuttering) were excluded from this study.

Moreover, two people who were transgender were excluded because it is unknown what kind of effect the transition has on the acoustic measurements. Furthermore, the first 40 participants were recorded on one track instead of two tracks, so for the reliability of this study those participants were excluded. The one-track recordings had been split manually, so the voices could not be distinguished when the interviewer and the participant talked at the same time.

Moreover, in the original data set there was an elderly group. This group was excluded from the study since there were only 13 people (3 men) in it and the range was too to deliver reliable results.

The schizophrenia group included 29 people with a comorbid depression. The Beck Depression Inventory (BDI; Beck, Ward, Mendelson, Mock & Erbaugh, 1961) or the Calgary Depression Scale for Schizophrenia (CDSS; Addington, Addington & Maticka-Tyndale, 1993) was used to assess the comorbid depression in the schizophrenia group. Which questionnaire was used depended on the study in which the participant was involved. For this study no distinction was made between people with or without a comorbid depression. Schizophrenia was the main diagnosis, and it was expected that the participants with a comorbid depression would still differ from people with major depression and from healthy controls.


Table 1

The demographic features of the participants for the comparison between people with schizophrenia and healthy controls

| | Controls | People with schizophrenia |
|---|---|---|
| Total | 120 | 141 |
| Male | 71 (59%) | 104 (74%) |
| Female | 49 (41%) | 37 (26%) |
| Age | M = 32.87 ± 11.38 | M = 31.57 ± 11.69 |
| Young (18-34) | 82 (54 men) | 92 (75 men) |
| Middle aged (35-59) | 38 (17 men) | 49 (29 men) |

Table 2

The demographic features of the participants for the comparison between people with major depression, schizophrenia and healthy controls

| | Controls | People with schizophrenia | People with major depression |
|---|---|---|---|
| Total | 18 | 18 | 18 |
| Male | 9 | 9 | 9 |
| Female | 9 | 9 | 9 |
| Age | M = 42.89 ± 14.57 | M = 43.56 ± 15.87 | M = 43.33 ± 15.18 |
| Young (18-34) | 5 (3 men) | 5 (3 men) | 5 (3 men) |
| Middle aged (35-59) | 12 (6 men) | 12 (6 men) | 12 (6 men) |

Materials

The speech samples were recorded with a Tascam recorder, which recorded the interviewer and the participant independently. All speech samples were analyzed in PRAAT (Boersma & Weenink, 2018), which is open source software for analyzing speech samples. OpenSMILE (Eyben, Wöllmer & Schuller, 2010) was used for processing the voice data. This is open source software for automatic extraction of features from audio signals and for classification of speech and music signals. For this study, the following measures were extracted with OpenSMILE: pitch and loudness, for which we extracted the mean, the coefficient of variation and the range of the 20th to 80th percentile; HNR; Jitter (local) and Shimmer (local), for which we extracted the mean and standard deviation; and, for articulation rate and pause time, the following variables: Voiced Segments Per Sec, Mean Voiced Segment Length Sec, Sd Voiced Segment Length Sec, Mean Unvoiced Segment Length and Sd Unvoiced Segment Length. The statistical analyses were done with SPSS Statistics 26 (IBM Corporation, 2019).
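To make the three summary statistics for pitch and loudness concrete, the sketch below computes the mean, the coefficient of variation and the 20th to 80th percentile range from a list of frame-level values. It is a plain Python illustration, not the OpenSMILE implementation: the function names are hypothetical, the sample standard deviation is assumed for the coefficient of variation, and the percentile is computed by linear interpolation, which may differ in detail from what OpenSMILE does.

```python
from statistics import mean, stdev


def percentile(values, p):
    """Linearly interpolated percentile (p between 0 and 100)."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    rank = (len(ordered) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def summarize_contour(values):
    """Summarize a pitch or loudness contour (at least two values).

    Assumption: the coefficient of variation is the sample standard
    deviation divided by the mean.
    """
    average = mean(values)
    return {
        "mean": average,
        "coefficient_of_variation": stdev(values) / average,
        "range_20_80": percentile(values, 80) - percentile(values, 20),
    }


if __name__ == "__main__":
    # Made-up frame values, for illustration only.
    print(summarize_contour([1.0, 2.0, 3.0, 4.0, 5.0]))
```

The percentile range is less sensitive to isolated extreme frames than the full minimum-to-maximum range, which is one reason to prefer it for noisy contours.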


Procedure

A qualified interviewer used a structured interview to interview all participants (see Appendix A). The participants were told that the interview questions were meant to get to know the person behind the disease. The questions were designed to avoid negativity, and the interviewers were allowed to encourage the participant to talk. The participants were informed about the true aim of the research after the interview was completed and were given the choice to withdraw their participation. Both the interviewer and the participant wore a headset with a microphone, so the audio signals of the interviewer and the participant could be recorded independently. The audio signals were split automatically, and the recorder automatically made a second recording that was 6 dB softer. PRAAT (Boersma & Weenink, 2018) was used to determine manually which recording had the least clipping. After that the recordings were analyzed in OpenSMILE and the outcomes were collected in an Excel database. The Ethical Review Board of the University Medical Center Utrecht approved the study, and signed informed consent was obtained from all participants.
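In this study the choice between the normal recording and the 6 dB softer copy was made manually in PRAAT. Purely as an illustration of the idea, the sketch below shows how the amount of clipping could be quantified as the fraction of samples at full scale, and why a copy that is 6 dB softer (amplitudes scaled by roughly 0.5) is less likely to clip. All function names, the full-scale value of 1.0 and the sample values are assumptions made for the example.

```python
def attenuate_db(samples, decibels):
    """Scale amplitudes down by the given number of decibels."""
    gain = 10 ** (-decibels / 20)  # 6 dB corresponds to a gain of about 0.5
    return [s * gain for s in samples]


def hard_clip(samples, full_scale=1.0):
    """Simulate a recorder that cannot represent values beyond full scale."""
    return [max(-full_scale, min(full_scale, s)) for s in samples]


def clipping_fraction(samples, full_scale=1.0, tolerance=1e-4):
    """Fraction of samples that sit at (or within tolerance of) full scale."""
    if not samples:
        raise ValueError("no samples given")
    threshold = full_scale - tolerance
    clipped = sum(1 for s in samples if abs(s) >= threshold)
    return clipped / len(samples)


def pick_least_clipped(recordings):
    """Return the name of the recording with the lowest clipping fraction."""
    return min(recordings, key=lambda name: clipping_fraction(recordings[name]))


if __name__ == "__main__":
    # Made-up signal that is too loud for the normal recording level.
    true_signal = [0.2, 1.4, -1.6, 0.5]
    recordings = {
        "normal": hard_clip(true_signal),
        "minus_6_db": hard_clip(attenuate_db(true_signal, 6)),
    }
    print(pick_least_clipped(recordings))
```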

Statistical Analyses

First, the prosody, voice quality and speech velocity of 71 men and 49 women from the healthy control group were compared with independent-samples t-tests. Not all variables were normally distributed, but because the sample size was larger than 30, parametric tests were used. After that, a MANCOVA was performed to examine whether age influenced the voice differences that were found based on gender. Then, the voice measurements of the young and middle-aged groups were compared for men and women separately. Independent-samples t-tests were used because the sample size was larger than 30.
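As an illustration of the independent-samples t-test used for these group comparisons, the sketch below computes the pooled-variance (Student) t statistic and its degrees of freedom. It is a minimal teaching example with a hypothetical function name, not the SPSS procedure that was used for the analyses, and it does not compute the p value, which requires the t distribution.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Return (t, df) for the pooled-variance independent-samples t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / standard_error
    return t, df


if __name__ == "__main__":
    # Made-up values, for illustration only.
    t, df = pooled_t([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 3.0, 4.0, 5.0, 6.0])
    print(f"t({df}) = {t:.3f}")
```

With 71 men and 49 women, the degrees of freedom of this test would be 71 + 49 - 2 = 118.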

The schizophrenia group was compared to the gender- and age-matched control group to examine possible differences between them in prosody, voice quality and speech velocity. To analyze the role of gender and age, a MANCOVA was used, followed by an ANCOVA on the variables on which age and gender seemed to have an influence. To examine the voice differences between men and women with schizophrenia, independent-samples t-tests were used. The 104 men with schizophrenia were compared with 71 healthy men and the 37 women with schizophrenia were compared with 49 healthy women. After that, these analyses were repeated for each of the age groups: the young and middle-aged schizophrenia groups were compared with the young and middle-aged healthy controls.


For the comparison with the major depression group, 18 healthy controls were selected from the database. They were matched with the depression group based on gender and age. To analyze the role of gender and age, a MANCOVA was used, followed by an ANCOVA on the variables on which age and gender seemed to have an influence. After that, follow-up analyses were done with independent-samples t-tests or Mann-Whitney U tests, depending on the normality of the data.

For the comparison between the major depression group and the schizophrenia group, 18 people from the schizophrenia group were selected from the database and matched with the major depression group based on their gender and age. A MANCOVA was used to analyze the role of gender and age, followed by an ANCOVA. Follow-up analyses were done with independent-samples t-tests or Mann-Whitney U tests, depending on the normality of the data.
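For the small matched groups, the choice between the t-test and the Mann-Whitney U test depended on normality. The sketch below illustrates that decision and how the U statistic itself is obtained by counting pairwise comparisons. The function names are hypothetical, and reporting the smaller of the two U values is an assumption about the convention used; the normality check and the p value are left out.

```python
def mann_whitney_u(group_a, group_b):
    """Return the Mann-Whitney U statistic (smaller of the two U values).

    Each pair (a, b) counts 1 when a > b and 0.5 when the values tie.
    """
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return min(u_a, u_b)


def choose_test(normally_distributed):
    """Name the follow-up test for a variable, given its normality."""
    if normally_distributed:
        return "independent-samples t-test"
    return "Mann-Whitney U test"


if __name__ == "__main__":
    # Made-up values, for illustration only.
    print(choose_test(False), mann_whitney_u([1, 3, 5], [2, 4, 6]))
```

For two groups of 18 participants, U can range from 0 (complete separation of the groups) to 162 (half of the 18 × 18 = 324 pairs).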


Results

Difference between healthy men and women

Normality was checked and most of the variables were not normally distributed. However, because the sample size was larger than 30, parametric tests were used to examine the difference between men and women. As can be seen in Table 3, almost all variables showed a difference between men and women, except for range of loudness, mean Jitter, and number of voiced segments per second.

Table 3

The prosody, voice quality and speech velocity differences in healthy men and women.

| Variable | Men (mean) | Women (mean) | P value |
|---|---|---|---|
| Mean F0 | 22.96 | 29.40 | < .001** |
| SD F0 | .18 | .25 | < .001** |
| Range F0 | 5.19 | 11.92 | < .001** |
| Mean loudness | 42.32 | 48.71 | .008* |
| SD loudness | 99.21 | 92.36 | .002* |
| Range loudness | 64.11 | 70.55 | .075 |
| Mean Jitter | 7.17 | 7.43 | .293 |
| Sd Jitter | 176.15 | 186.10 | .001* |
| Mean Shimmer | 164.24 | 139.71 | < .001** |
| Sd Shimmer | 94.80 | 105.42 | < .001** |
| Mean HNR | 308.72 | 601.59 | < .001** |
| Sd HNR | 214.63 | 121.72 | < .001** |
| Voiced segments per second | 123.61 | 125.29 | .442 |
| Mean voiced segment length per second | 26.19 | 23.91 | .004* |
| Sd voiced segment length per second | 26.83 | 23.77 | .001* |
| Mean unvoiced segment length per second | 22.36 | 17.37 | |
| Sd unvoiced segment length per second | 44.89 | 34.26 | < .001** |

The P value has been calculated with the independent-samples t-test.
* P < .05
** P < .001

The MANCOVA showed that gender was still a predictor for the speech variables when age was included as a covariate (Λ = 0.146, F (17, 101) = 34.884, p < .001). Age was a predictor for some of the variables (Λ = 0.766, F (17, 101) = 1.815, p < .05). To examine whether there were age-related differences in the variables, the group was separated into two age categories: young (18-34) and middle aged (35-59). The analyses were done separately for men and women. For women the data were normally distributed except for pitch range, Sd HNR, all variables for voiced segments, and Sd unvoiced segments. For those variables the Mann-Whitney U test was used; for the other variables an independent-samples t-test was used. As can be seen in Table 4, there were no differences between young and middle-aged women.
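The relation between the reported Wilks's Λ and the F value can be checked by hand. When the effect has a single degree of freedom, as for gender (two groups) or a single covariate such as age, the F statistic follows exactly from Λ and the two reported degrees of freedom. The sketch below is an illustrative check with a hypothetical function name; it is not part of the original SPSS analysis and it does not apply to effects with more than one degree of freedom.

```python
def wilks_lambda_to_f(wilks_lambda, df_numerator, df_denominator):
    """Convert Wilks's lambda to F for an effect with one degree of freedom.

    F = ((1 - lambda) / lambda) * (df_denominator / df_numerator)
    """
    if not 0.0 < wilks_lambda <= 1.0:
        raise ValueError("Wilks's lambda must lie in (0, 1]")
    return (1.0 - wilks_lambda) / wilks_lambda * df_denominator / df_numerator


if __name__ == "__main__":
    # Values as reported in the text, with F (17, 101).
    print(round(wilks_lambda_to_f(0.766, 17, 101), 3))
    print(round(wilks_lambda_to_f(0.146, 17, 101), 3))
```

For age, Λ = 0.766 gives F ≈ 1.815, which matches the reported value. For gender, Λ = 0.146 gives a value close to the reported 34.884; the small gap is consistent with Λ having been rounded to three decimals.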

Table 4

The prosody, voice quality and speech velocity differences in young and middle-aged healthy women.

| Variable | Statistic | Young (18-34) | Middle aged (35-59) | P value |
|---|---|---|---|---|
| Mean F0 | Mean | 29.61 | 28.90 | .250 |
| SD F0 | Mean | 0.25 | 0.24 | .221 |
| Range F0 | Median | 13.49 | 11.69 | .132 |
| Mean loudness | Mean | 46.44 | 51.89 | .144 |
| SD loudness | Mean | 94.84 | 89.79 | .120 |
| Range loudness | Mean | 68.26 | 73.81 | .271 |
| Mean Jitter | Mean | 7.44 | 7.36 | .880 |
| Sd Jitter | Mean | 186.04 | 187.14 | .821 |
| Mean Shimmer | Mean | 140.96 | 138.27 | .563 |
| Sd Shimmer | Mean | 105.06 | 105.58 | .767 |
| Mean HNR | Mean | 605.68 | 591.45 | .729 |
| Sd HNR | Median | 117.50 | 111.40 | .470 |
| Voiced segments per second | Median | 123.50 | 125.50 | .807 |
| Mean voiced segment length per second | Median | 22.85 | 25.30 | .069 |
| Sd voiced segment length per second | Median | 22.15 | 24.30 | .089 |
| Mean unvoiced segment length per second | Mean | 17.82 | 17.13 | .600 |
| Sd unvoiced segment length per second | Median | 35.10 | 31.30 | .674 |

Where the mean is reported, the P value has been calculated with the independent-samples t-test. Where the median is reported, the P value has been calculated with the Mann-Whitney U test.

For men, the data were normally distributed except for Sd and range of pitch, mean Jitter, Sd HNR, and the mean and Sd of voiced and unvoiced segment length. For those variables the Mann-Whitney U test was used and for the others an independent-samples t-test. Table 5 shows that young men have a higher Shimmer (M = 166.81) than middle-aged men (M = 156.06) (t (69) = 2.160, p = .034). Furthermore, young men show a lower HNR (M = 291.57) than middle-aged men (M = 363.17) (t (69) = -2.071, p = .042). Moreover, the mean voiced segment length was lower for young men (Mdn = 24.15) than for middle-aged men (Mdn = 28.00) (U = 217.5, z = -3.254, p = .001). The same pattern can be seen for the variability in voiced segment length: the Sd of voiced segment length was lower for young men (Mdn = 24.45) than for middle-aged men (Mdn = 30.10) (U = 230.00, z = -3.086, p = .002).

Table 5

The prosody, voice quality and speech velocity differences in young and middle-aged healthy men.

| Variable | Statistic | Young (18-34) | Middle aged (35-59) | P value |
|---|---|---|---|---|
| Mean F0 | Mean | 22.81 | 23.43 | .262 |
| SD F0 | Median | 0.18 | 0.19 | .381 |
| Range F0 | Median | 4.43 | 4.98 | .109 |
| Mean loudness | Mean | 42.25 | 42.54 | .934 |
| SD loudness | Mean | 99.81 | 97.32 | .461 |
| Range loudness | Mean | 64.18 | 63.89 | .960 |
| Mean Jitter | Median | 6.96 | 7.10 | .427 |
| Sd Jitter | Mean | 176.06 | 176.47 | .925 |
| Mean Shimmer | Mean | 166.81 | 156.06 | .034* |
| Sd Shimmer | Mean | 94.11 | 96.97 | .098 |
| Mean HNR | Mean | 291.57 | 363.17 | .042* |
| Sd HNR | Median | 171.50 | 160.00 | .085 |
| Voiced segments per second | Mean | 122.98 | 125.59 | .426 |
| Mean voiced segment length per second | Median | 24.15 | 28.00 | .001* |
| Sd voiced segment length per second | Median | 24.45 | 30.10 | .002* |
| Mean unvoiced segment length per second | Median | 20.25 | 22.40 | .549 |
| Sd unvoiced segment length per second | Median | 39.80 | 48.40 | .146 |

Where the mean is reported, the P value has been calculated with the independent-samples t-test. Where the median is reported, the P value has been calculated with the Mann-Whitney U test.

* P < .05

Differences between healthy people and people with schizophrenia

The data of the healthy controls and the schizophrenia group were not normally distributed for all variables. Each group contained at least 120 participants, so the comparison was done with an independent-samples t-test. Table 6 shows that people with schizophrenia have a lower pitch than the control group (t (259) = 4.035, p < .001). Moreover, people with schizophrenia have a lower variability of pitch (t (259) = 2.180, p = .030) and a lower pitch range (t (259, 227.309) = 2.098, p = .037). Furthermore, the variability of loudness was higher for people with schizophrenia than for healthy people (t (259, 253.023) = -6.434, p < .001). For Jitter and Shimmer no differences were found, but the mean level of HNR was lower for people with schizophrenia than for healthy controls (t (259) = 2.895, p = .004). Additionally, the speech rate of people with schizophrenia was higher (t (259, 153.347) = -10.376, p < .001). Besides that, the mean unvoiced segment length was higher for people with schizophrenia (t (259, 224.581) = -6.525, p < .001). Additionally, they had a higher variability in unvoiced segment length (t (259, 232.081) = 5.540, p < .001).
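Several of the t values above are reported with two numbers in parentheses, for example t (259, 227.309). The text does not explain this notation. One plausible reading, stated here as an assumption, is that the first number is the pooled degrees of freedom (120 + 141 - 2 = 259) and the second is the adjusted degrees of freedom used when the group variances are unequal (the Welch-Satterthwaite approximation). The sketch below, with a hypothetical function name, shows how such adjusted degrees of freedom are computed.

```python
from statistics import variance


def welch_df(group_a, group_b):
    """Welch-Satterthwaite degrees of freedom for two independent samples."""
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard error contributed by each group.
    se2_a = variance(group_a) / n_a
    se2_b = variance(group_b) / n_b
    return (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )


if __name__ == "__main__":
    # Made-up values: equal spread versus very unequal spread.
    equal = welch_df([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 3.0, 4.0, 5.0, 6.0])
    unequal = welch_df([1.0, 2.0, 3.0, 4.0, 5.0], [10.0, 20.0, 30.0, 40.0, 50.0])
    print(equal, unequal)
```

When both groups have the same size and variance, the adjusted value equals the pooled degrees of freedom; the more the variances differ, the further it drops below that value.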

Table 6

The prosody, voice quality and speech velocity differences between people with schizophrenia and healthy controls

| Variable | Controls (mean) | Schizophrenia (mean) | P value |
|---|---|---|---|
| Mean F0 | 25.59 | 23.71 | < .001** |
| SD F0 | 0.21 | 0.20 | .030* |
| Range F0 | 7.94 | 6.86 | .037* |
| Mean loudness | 44.93 | 42.41 | .345 |
| SD loudness | 96.42 | 107.99 | < .001** |
| Range loudness | 66.74 | 68.01 | .781 |
| Mean Jitter | 7.28 | 7.12 | .419 |
| Sd Jitter | 180.22 | 182.07 | .499 |
| Mean Shimmer | 154.23 | 155.89 | .591 |
| Sd Shimmer | 99.13 | 99.05 | .948 |
| Mean HNR | 428.31 | 347.78 | .004* |
| Sd HNR | 176.69 | 171.22 | .922 |
| Voiced segments per second | 124.29 | 176.19 | < .001** |
| Mean voiced segment length per second | 25.26 | 25.46 | .767 |
| Sd voiced segment length per second | 25.58 | 27.02 | .051 |
| Mean unvoiced segment length per second | 20.32 | 27.22 | < .001** |
| Sd unvoiced segment length per second | 40.55 | 53.71 | < .001** |

The P value has been calculated with the independent-samples t-test.
* P < .05
** P < .001

The ANCOVA showed that variability of pitch, pitch range and mean HNR no longer showed a difference when age and gender were included as covariates. For variability of pitch, both gender (F (1,257) = 118.377, p < .001) and age (F (1,257) = 5.098, p < .05) were predictors, but schizophrenia itself was not (F (1,257) = .625, p = .430). For pitch range the ANCOVA showed the same effect: both gender (F (1,257) = 152.253, p < .001) and age (F (1,257) = 3.917, p < .05) were predictors, but schizophrenia itself was not (F (1,257) = .396, p = .530). The same effect was found for mean HNR: gender (F (1,257) = 242.975, p < .001) and age (F (1,257) = 4.395, p < .05) were predictors, and schizophrenia itself was not (F (1,257) = 2.860, p = .092). Mean pitch, SD loudness, speech rate and unvoiced segment length still showed significant differences between groups when gender and age were included as covariates.

Wilks's Λ showed that gender was a predictor for the combined set of variables (Λ = 0.210, F (17, 241) = 53.490, p < .001). Furthermore, the ANCOVA showed that gender was a predictor for mean pitch, variability of pitch, pitch range, Sd Jitter, mean and Sd Shimmer, mean HNR, and mean and Sd (un)voiced segment length. The differences between men and women with schizophrenia in comparison with healthy controls can be seen in Table 7. The data were not normally distributed for all variables; however, the groups were larger than 30, so the comparison was done with independent-samples t-tests. As can be seen in Table 7, the difference in mean vocal pitch was significant only for men with schizophrenia (t (173) = 3.523, p = .001) and not for women (t (84) = 1.117, p = .267). Women showed a significant difference for variability of pitch (t (84) = 2.151, p = .034) and pitch range (t (84) = 2.512, p = .014), while men did not. Only women with schizophrenia showed a significant difference for Jitter (t (84) = 2.874, p = .005) and Shimmer (t (84) = 2.267, p = .026). Moreover, only women with schizophrenia showed a significant difference for the mean (t (84) = -2.285, p = .025) and the variability (t (84) = -3.209, p = .002) of voiced segment length. However, a significant difference for HNR was found only for men (t (173, 172.234) = 2.840, p = .005). The variability of loudness, speech rate and pause time differed for both men and women.


Table 7

The prosody, voice quality and speech velocity differences between men and women with schizophrenia and healthy controls

| Variable | Gender | Controls (mean) | Schizophrenia (mean) | P value |
|---|---|---|---|---|
| Mean F0 | Men | 22.96 | 21.89 | .001* |
| Mean F0 | Women | 29.40 | 28.83 | .286 |
| SD F0 | Men | 0.18 | 0.19 | .551 |
| SD F0 | Women | 0.25 | 0.23 | .034* |
| Range F0 | Men | 5.19 | 5.89 | .063 |
| Range F0 | Women | 11.93 | 9.60 | .014* |
| Mean loudness | Men | 42.32 | 41.50 | .798 |
| Mean loudness | Women | 48.71 | 44.97 | .269 |
| SD loudness | Men | 99.21 | 108.30 | < .001** |
| SD loudness | Women | 92.36 | 107.14 | < .001** |
| Range loudness | Men | 64.11 | 66.81 | .675 |
| Range loudness | Women | 70.55 | 71.36 | .870 |
| Mean Jitter | Men | 7.17 | 7.37 | .350 |
| Mean Jitter | Women | 7.44 | 6.40 | .005* |
| Sd Jitter | Men | 176.15 | 174.81 | .663 |
| Sd Jitter | Women | 186.10 | 202.49 | < .001** |
| Mean Shimmer | Men | 164.24 | 164.68 | .896 |
| Mean Shimmer | Women | 139.71 | 131.19 | .026* |
| Sd Shimmer | Men | 94.80 | 95.18 | .732 |
| Sd Shimmer | Women | 105.42 | 109.93 | .023* |
| Mean HNR | Men | 308.72 | 243.89 | .005* |
| Mean HNR | Women | 601.59 | 639.81 | .254 |
| Sd HNR | Men | 214.63 | 193.71 | .819 |
| Sd HNR | Women | 121.72 | 108.01 | .124 |
| Voiced segments per second | Men | 123.61 | 174.64 | < .001** |
| Voiced segments per second | Women | 125.29 | 180.55 | < .001** |
| Mean voiced segment length per second | Men | 26.19 | 25.20 | .285 |
| Mean voiced segment length per second | Women | 23.91 | 26.16 | .025* |
| Sd voiced segment length per second | Men | 26.83 | 27.05 | .818 |
| Sd voiced segment length per second | Women | 23.77 | 26.94 | .002* |
| Mean unvoiced segment length per second | Men | 22.36 | 27.42 | < .001** |
| Mean unvoiced segment length per second | Women | 17.37 | 26.68 | < .001** |
| Sd unvoiced segment length per second | Men | 44.89 | 53.84 | < .001** |
| Sd unvoiced segment length per second | Women | 34.26 | 53.34 | < .001** |

The P value has been calculated with the independent-samples t-test.
* P < .05
** P < .001

Wilks's Λ showed that age was a predictor for the combined set of variables (Λ = 0.721, F (17, 241) = 5.498, p < .001). Furthermore, the ANCOVA showed that age was a predictor for pitch variability, pitch range, mean Jitter, mean Shimmer, mean HNR, mean loudness, voiced segments per second, and mean and Sd (un)voiced segment length. For mean pitch, Sd loudness, speech rate and pause time, both age groups showed a difference in the same direction. As can be seen in Table 8, only the middle-aged group showed a difference for pitch variability (t (85) = 2.480, p = .015), pitch range (t (85, 67.639) = 2.145, p = .036) and mean Jitter (t (85) = 2.909, p = .005). Furthermore, only the young group showed a difference for mean HNR (t (172, 167.185) = 2.877, p = .005).


Table 8

The prosody, voice quality and speech velocity differences between young and middle-aged people with schizophrenia and healthy controls

| Variable | Age group | Mean, controls | Mean, schizophrenia | P value |
|---|---|---|---|---|
| Mean F0 | Young (18-34) | 25.14 | 23.27 | .001* |
| Mean F0 | Middle aged (35-59) | 26.57 | 24.52 | .012* |
| SD F0 | Young (18-34) | 0.21 | 0.20 | .363 |
| SD F0 | Middle aged (35-59) | 0.22 | 0.19 | .015* |
| Range F0 | Young (18-34) | 7.61 | 6.95 | .290 |
| Range F0 | Middle aged (35-59) | 8.65 | 6.70 | .036* |
| Mean loudness | Young (18-34) | 43.68 | 42.85 | .814 |
| Mean loudness | Middle aged (35-59) | 47.62 | 41.58 | .111 |
| SD loudness | Young (18-34) | 98.11 | 108.67 | .000** |
| SD loudness | Middle aged (35-59) | 92.76 | 106.72 | .000** |
| Range loudness | Young (18-34) | 65.58 | 69.52 | .527 |
| Range loudness | Middle aged (35-59) | 69.26 | 65.16 | .467 |
| Mean Jitter | Young (18-34) | 7.24 | 7.48 | .340 |
| Mean Jitter | Middle aged (35-59) | 7.35 | 6.44 | .005* |
| SD Jitter | Young (18-34) | 179.46 | 179.85 | .912 |
| SD Jitter | Middle aged (35-59) | 181.84 | 186.24 | |
| Mean Shimmer | Young (18-34) | 157.99 | 162.25 | .277 |
| Mean Shimmer | Middle aged (35-59) | 146.11 | 143.96 | .632 |
| SD Shimmer | Young (18-34) | 97.85 | 97.72 | .930 |
| SD Shimmer | Middle aged (35-59) | 101.90 | 101.57 | .864 |
| Mean HNR | Young (18-34) | 398.83 | 300.57 | .005* |
| Mean HNR | Middle aged (35-59) | 491.92 | 436.42 | .189 |
| SD HNR | Young (18-34) | 193.71 | 179.79 | .877 |
| SD HNR | Middle aged (35-59) | 139.95 | 155.13 | .399 |
| Voiced segments per second | Young (18-34) | 124.15 | 194.85 | .000 |
| Voiced segments per second | Middle aged (35-59) | 124.61 | 141.16 | .040 |
| Mean voiced segment length per second | Young (18-34) | 24.53 | 23.66 | .237 |
| Mean voiced segment length per second | Middle aged (35-59) | 26.82 | 28.83 | .126 |
| SD voiced segment length per second | Young (18-34) | 24.73 | 25.59 | .296 |
| SD voiced segment length per second | Middle aged (35-59) | 27.41 | 29.72 | .094 |
| Mean unvoiced segment length per second | Young (18-34) | 20.53 | 25.94 | .000** |
| Mean unvoiced segment length per second | Middle aged (35-59) | 19.87 | 29.63 | .000** |
| SD unvoiced segment length per second | Young (18-34) | 40.83 | 50.25 | .001* |
| SD unvoiced segment length per second | Middle aged (35-59) | 39.96 | 60.20 | |

The P value was calculated with the independent samples t-test.

* P < .05

** P < .001
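The age-stratified comparison behind this table can be sketched as follows. This is a minimal illustration, not the analysis script used in the study: the record layout `(age, group, value)`, the group labels and the function names are assumptions, and the sketch assumes the equal-variance (Student) form of the independent samples t-test. It returns the t statistic and degrees of freedom only. Converting these to a P value requires the t distribution, which a statistics package would provide.

```python
from statistics import mean, variance


def age_group(age):
    """Map an age in years to the two strata used in the table."""
    if 18 <= age <= 34:
        return "Young (18-34)"
    if 35 <= age <= 59:
        return "Middle aged (35-59)"
    raise ValueError(f"age {age} is outside the 18-59 range")


def independent_t(sample_a, sample_b):
    """Return (t, df) for Student's t-test, assuming equal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    standard_error = (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5
    t = (mean(sample_a) - mean(sample_b)) / standard_error
    return t, df


def t_by_age_group(records):
    """Run the test separately within each age stratum.

    records: iterable of (age, group, value), where group is the
    hypothetical label 'control' or 'schizophrenia'.
    """
    strata = {}
    for age, group, value in records:
        stratum = strata.setdefault(
            age_group(age), {"control": [], "schizophrenia": []}
        )
        stratum[group].append(value)
    return {
        label: independent_t(s["control"], s["schizophrenia"])
        for label, s in strata.items()
    }


# Made-up values for illustration only; these are not study data.
example_records = [
    (20, "control", 1.0), (25, "control", 2.0), (30, "control", 3.0),
    (22, "schizophrenia", 2.0), (28, "schizophrenia", 3.0),
    (33, "schizophrenia", 4.0),
]
example_result = t_by_age_group(example_records)
```

Splitting the participants by age group before testing is what allows a group difference to appear in one stratum and not in the other, as several rows of the table do.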

Differences between healthy people and people with major depression

The data of the healthy controls and the major depression group were not normally distributed for all variables. The Mann-Whitney U test was used for the variables that were not normally distributed and the independent samples t-test for the other variables. Table 9 shows that none of the variables differed between the groups except the variability of loudness. SD loudness shows that people with major depression have more variation in their voice loudness (Mdn = 107.50) than the control group (Mdn = 90.50), U = 76.00, z = -2.721, p = .007.
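The relation between the reported U, z and p values can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function name is hypothetical, the p value comes from the normal approximation, and no tie or continuity correction is applied, so the output can differ slightly from that of a statistics package.

```python
from statistics import NormalDist


def mann_whitney_u(sample_a, sample_b):
    """Return (U, z, two-sided p) using the normal approximation.

    Assumption: no tie correction and no continuity correction.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    pooled = sorted(
        [(value, 0) for value in sample_a] + [(value, 1) for value in sample_b]
    )

    # Assign average ranks so that tied values share the same rank.
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    rank_sum_a = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    u = min(u_a, u_b)  # report the smaller of the two U values

    mean_u = n_a * n_b / 2
    sd_u = (n_a * n_b * (n_a + n_b + 1) / 12) ** 0.5
    z = (u - mean_u) / sd_u
    p = 2 * NormalDist().cdf(z)  # z <= 0 because u is the smaller U
    return u, z, p


# Made-up values for illustration only; these are not study data.
u, z, p = mann_whitney_u([1, 2, 3], [4, 5, 6])

# The reported z = -2.721 gives a two-sided p of about 0.0065,
# in line with the reported p = .007.
p_from_reported_z = 2 * NormalDist().cdf(-2.721)
```

Because the test works on ranks rather than raw values, the medians are the natural summary to report next to it, which is why Table 9 lists medians for the variables that were not normally distributed.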

Table 9

The prosody, voice quality and speech velocity differences between people with major depression and healthy controls

| Variable | Statistic | Controls | Depression | P value |
|---|---|---|---|---|
| Mean F0 | Mean | 25.88 | 24.68 | .348 |
| SD F0 | Mean | 0.21 | 0.23 | .407 |
| Range F0 | Median | 5.65 | 10.27 | .376 |
| Mean loudness | Median | 40.95 | 37.25 | .343 |
| SD loudness | Median | 90.50 | 107.50 | .007* |
| Range loudness | Median | 59.45 | 61.89 | .681 |
| Mean Jitter | Median | 7.67 | 7.57 | .728 |
| SD Jitter | Mean | 175.61 | 171.56 | .542 |
| Mean Shimmer | Mean | 152.89 | 155.50 | .714 |
| SD Shimmer | Mean | 99.65 | 98.85 | .784 |
| Mean HNR | Mean | 436.82 | 374.72 | .400 |
| SD HNR | Median | 233.03 | 334.31 | .296 |
| Voiced segments per second | Mean | 125.26 | 135.83 | .131 |
