
4. Pilot Experiment Design and Results

4.4. Conclusions

From the obtained data it can be concluded that the main problem when dealing with the perceptual attribute of fluctuation strength is its ambiguity and its confusion with the perceptual attribute of roughness. The proposed training phase was effective in clarifying the concept to participants, both by adding stimuli with a clear rough sensation and by explicitly instructing them on what the sensation is about. It should be noted that this confusion arises only with regard to modulation frequency; the other parameters do not show it, so it was not necessary to adapt the experimental procedure for them.

Figure 4.3.: Relative fluctuation strength as a function of sound pressure level for AM tones with modulation frequency of 4 Hz, center frequency of 1 kHz and modulation depth of 40 dB. The two standards had sound pressure levels of 70 and 50 dB. The data points show the median and interquartile ranges per standard. The black line represents the mean values of the medians of each standard. Panel (a): data adapted from [9, p. 249]. Panel (b): own results.

Figure 4.4.: Relative fluctuation strength as a function of modulation depth for AM tones with modulation frequency of 4 Hz, center frequency of 1 kHz and sound pressure level of 70 dB. The two standards had modulation depths of 40 and 4 dB. The data points show the median and interquartile ranges per standard. The black line represents the mean values of the medians of each standard. Panel (a): data adapted from [9, p. 249]. Panel (b): own results.

Figure 4.5.: Relative fluctuation strength as a function of modulation frequency for FM tones with center frequency of 1.5 kHz, sound pressure level of 70 dB and frequency deviation of 700 Hz. The two standards had modulation frequencies of 4 and 0.5 Hz. The data points show the median and interquartile ranges per standard. The black line represents the mean values of the medians of each standard. Panel (a): data adapted from [9, p. 248]. Panel (b): own results.

Figure 4.6.: Relative fluctuation strength as a function of center frequency for FM tones with modulation frequency of 4 Hz, sound pressure level of 70 dB and frequency deviation of 200 Hz. The two standards had center frequencies of 6 and 0.5 kHz. The data points show the median and interquartile ranges per standard. The black line represents the mean values of the medians of each standard. Panel (a): data adapted from [9, p. 250]. Panel (b): own results.

Figure 4.7.: Relative fluctuation strength as a function of sound pressure level for FM tones with modulation frequency of 4 Hz, center frequency of 1.5 kHz and frequency deviation of 700 Hz. The two standards had sound pressure levels of 60 and 40 dB. The data points show the median and interquartile ranges per standard. The black line represents the mean values of the medians of each standard. Panel (a): data adapted from [9, p. 249]. Panel (b): own results.

Figure 4.8.: Relative fluctuation strength as a function of frequency deviation for FM tones with modulation frequency of 4 Hz, center frequency of 1.5 kHz and sound pressure level of 70 dB. The two standards had frequency deviations of 700 and 32 Hz. The data points show the median and interquartile ranges per standard. The black line represents the mean values of the medians of each standard. Panel (a): data adapted from [9, p. 251]. Panel (b): own results.

In this chapter the final experimental design and results for the evaluation of the fluctuation strength attribute are presented.

5.1. Design

5.1.1. Subjects

Twenty-four participants were recruited from the JF Schouten database of the Eindhoven University of Technology. Participants were between 19 and 31 years old; six were female and eighteen were male. All of them reported having normal hearing, although this was not verified. Subjects were paid for their participation.

5.1.2. Stimuli

The stimuli used in the final experiment have the same characteristics as the ones used in the pilot experiments (Section 4.1.2). Table 5.1 shows the stimuli used in the final experiments.

5.1.3. Procedure

Participants were assigned either to AM tones or to FM tones, with 12 participants in each condition. Furthermore, the presentation order of the experimental sections was varied using a Latin square design. The order of the parameters used is presented in Table 5.2.

The whole experiment had an approximate duration of 60 minutes. The experimental protocol followed during the experiment can be found in the appendix of this document (Appendix A).

The experimental phase remained the same as in the last pilot experiment. The only notable change compared to the procedure described in Chapter 3 is the split of the modulation frequency section into two separate sections. The training phase underwent some changes, described below.

5.1.3.1. Training Phase

Incorporating all the improvements proposed in the pilot experiment section, the training phase was expanded into a three-part phase, described below.

Section Parameters

Table 5.1.: Description of stimuli used per experiment section

Order   Parameters
1       fm, fc, {md or df}, fm, SPL
2       fc, {md or df}, fm, SPL, fm
3       {md or df}, fm, SPL, fm, fc
4       SPL, fm, fc, {md or df}, fm

Table 5.2.: Presentation order of parameters
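The four orders listed in Table 5.2 can be read as cyclic rotations of a single base sequence. The following Python sketch reproduces them under that reading. The names `rotate`, `OFFSETS` and `order_for_participant` are illustrative, and both the rotation offsets (inferred from the table) and the round-robin assignment of participants are assumptions, not part of the described procedure.

```python
# Base sequence of experimental sections (order 1 in Table 5.2).
# "md|df" stands for modulation depth (AM group) or frequency deviation (FM group).
BASE = ["fm", "fc", "md|df", "fm", "SPL"]


def rotate(seq, k):
    """Return seq cyclically shifted to the left by k positions."""
    k %= len(seq)
    return seq[k:] + seq[:k]


# Rotation offsets inferred from Table 5.2 (an assumption, not stated in the text).
OFFSETS = {1: 0, 2: 1, 3: 2, 4: 4}
ORDERS = {order: rotate(BASE, k) for order, k in OFFSETS.items()}


def order_for_participant(index):
    """Assign orders 1..4 in turn to participants 0, 1, 2, ... (hypothetical scheme)."""
    return (index % 4) + 1
```

With 12 participants per condition, such a round-robin assignment would give each order to three participants.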

Stimulus Comparison As in the pilot experiments, the initial part of the training phase consisted of comparison between stimuli. An additional stimulus was added (fm = 128 Hz) and a subset of FM stimuli was also added to complement the AM tones.

The stimuli presented in Tables 5.3 and 5.4 were reproduced in groups, according to their ID values. First, stimuli with ID values of 1 and 2 were reproduced. This pair presented the

values of fluctuation strength. Finally, stimuli with ID values of 3, 4 and 5 were reproduced.

This last group presented the difference between fluctuating and rough tones. After each group presentation, participants were asked whether the specific difference of sensation of the group was acknowledged. In case of a negative answer, the stimuli of the group were once again reproduced.

ID   fm [Hz]   fc [kHz]   SPL [dB]   md [dB]
1    0         1          70         40
2    0.5       1          70         40
3    4         1          70         40
4    32        1          70         40
5    128       1          70         40

Table 5.3.: Subset of AM stimuli for training phase

ID   fm [Hz]   fc [kHz]   SPL [dB]   df [Hz]
1    0         1          70         700
2    0.5       1          70         700
3    4         1          70         700
4    32        1          70         700
5    128       1          70         700

Table 5.4.: Subset of FM stimuli for training phase

Long Interval This part of the training phase presented participants with long intervals, which consisted of stimuli separated by 800 ms of silence. Two long intervals were reproduced, composed of AM and FM tones, respectively. These two intervals are described in Tables 5.5 and 5.6.

Presentation order   fm [Hz]   fc [kHz]   SPL [dB]   md [dB]
1                    0.5       1          70         40
2                    32        1          70         40
3                    2         1          70         40
4                    16        1          70         40
5                    4         1          70         40
6                    1         1          70         40
7                    0         1          70         40
8                    64        1          70         40
9                    0.25      1          70         40
10                   128       1          70         40
11                   8         1          70         40

Table 5.5.: Long interval composed of AM stimuli for training phase
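As an illustration of how such a long interval could be assembled, the sketch below concatenates AM tones in the presentation order of Table 5.5 with 800 ms of silence between them. Only the 800 ms gap and the modulation frequencies come from the text. The sampling rate, the 2 s stimulus duration, the helper names `am_tone` and `long_interval`, and the linear modulation index `m = 0.98` (a stand-in for md = 40 dB) are assumptions made for this example.

```python
import numpy as np

FS = 44100     # sampling frequency in Hz (assumed)
GAP_S = 0.8    # silence between stimuli in seconds, as stated in the text


def am_tone(fm, fc=1000.0, m=0.98, dur=2.0, fs=FS):
    """AM tone; m is a linear modulation index and dur a hypothetical duration."""
    t = np.arange(int(round(dur * fs))) / fs
    return (1.0 + m * np.sin(2 * np.pi * fm * t)) * np.sin(2 * np.pi * fc * t)


def long_interval(stimuli, gap_s=GAP_S, fs=FS):
    """Concatenate stimuli, inserting gap_s seconds of silence between them."""
    gap = np.zeros(int(round(gap_s * fs)))
    parts = []
    for k, stimulus in enumerate(stimuli):
        if k > 0:
            parts.append(gap)
        parts.append(stimulus)
    return np.concatenate(parts)


# Presentation order of Table 5.5 (modulation frequencies in Hz).
FM_ORDER = [0.5, 32, 2, 16, 4, 1, 0, 64, 0.25, 128, 8]
interval = long_interval([am_tone(fm) for fm in FM_ORDER])
```

The silence is placed only between stimuli, so eleven stimuli yield ten gaps.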

Presentation order   fm [Hz]   fc [kHz]   SPL [dB]   df [Hz]
1                    0.25      1          70         700
2                    64        1          70         700
3                    32        1          70         700
4                    2         1          70         700
5                    1         1          70         700
6                    0         1          70         700
7                    8         1          70         700
8                    4         1          70         700
9                    0.5       1          70         700
10                   128       1          70         700
11                   16        1          70         700

Table 5.6.: Long interval composed of FM stimuli for training phase

Test Section In order to familiarize participants with the interface used during the experiment and with the magnitude estimation procedure, a small test section was added at the end of the training phase. This section consisted of four pairs (Table 5.7), which were presented to the participants in randomized order.

Pair   fm [Hz]   fc [kHz]   SPL [dB]   md [dB]   df [Hz]
1      4         1          70         40        -
       32        1          70         40        -
2      4         6          70         -         200
       4         6          70         -         200
3      4         1          70         40        -
       0         1          70         40        -
4      4         1.5        60         -         700
       4         1.5        80         -         700

Table 5.7.: Pairs used in training phase test section

5.2. Results

The following section presents the results of the experiments, compared to the data published by Fastl and Zwicker [9]. Overall, the obtained data are qualitatively similar to the data by Fastl and Zwicker, although some differences do exist, most notably in the modulation depth response curve for AM tones and in the center frequency and frequency deviation curves for FM tones.

In the following paragraphs the obtained curves will be described, focusing on similarities and discrepancies with the literature data. Possible causes of these discrepancies will be discussed in Chapter 7.

modulation frequency below 4 Hz on average higher values of fluctuation were obtained. As a result, the response from the obtained data has a wider bandwidth than the data from the literature. Also, more variability seems to exist with the use of the first standard, as evidenced by the larger interquartile ranges (IQRs) of the first standard compared with the smaller IQRs of the second standard.

Figure 5.2 shows the dependency of fluctuation strength on center frequency for AM tones. In this case both data sets present a similar flat response with large IQRs. Figure 5.6 shows the dependency of fluctuation strength on center frequency for FM tones. Here the difference is more dramatic: the data from the literature decrease monotonically with increasing center frequency, whereas the data from this study remain mostly flat.

Figures 5.3 and 5.7 show the dependency of fluctuation strength on sound pressure level for AM and FM tones. Although the curves from the obtained data are not as linear as the data from the literature, in both cases fluctuation strength increases with sound pressure level.

Figure 5.4 shows the dependency of fluctuation strength on modulation depth for AM tones. Here a difference exists between the obtained data and the literature data. Both response curves show an increase of fluctuation strength with increasing modulation depth. However, the obtained data increase more quickly with modulation depth than the data from the literature.

Figure 5.8 shows the dependency of fluctuation strength on frequency deviation for FM tones. In this case a clear difference exists between the obtained data and the literature data. In the obtained data, a significant fluctuation strength (around 50%) is already present at small values of frequency deviation. As a consequence, the curve has a less steep slope than the literature data. Both curves show an increase in fluctuation strength with an increase in frequency deviation.

Finally, to summarize, Figures 5.9 and 5.10 compare the mean of the values of the two standards for each experimental condition. The discrepancies between obtained data and literature data can be further observed in these figures.
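The statistics reported in the figures (median and interquartile range per standard, and the mean of the two medians) can be computed along the following lines. This is a minimal sketch: the function names are illustrative, the quartile method (`"inclusive"`) is an assumption since the text does not specify one, and the numbers are made-up placeholders, not measured data.

```python
from statistics import median, quantiles


def summarize(responses):
    """Median and quartiles (Q1, Q3) per stimulus value.

    responses maps a stimulus value to the list of participants' estimates.
    """
    out = {}
    for x, values in responses.items():
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        out[x] = (median(values), q1, q3)
    return out


def mean_of_medians(summary_a, summary_b):
    """Average the medians obtained with the two standards, per stimulus value."""
    return {x: 0.5 * (summary_a[x][0] + summary_b[x][0]) for x in summary_a}


# Hypothetical relative fluctuation strength estimates [%], not measured data.
standard_1 = {4: [90, 100, 100, 110], 32: [10, 20, 20, 30]}
standard_2 = {4: [80, 100, 100, 120], 32: [0, 10, 10, 20]}
curve = mean_of_medians(summarize(standard_1), summarize(standard_2))
```

Here `curve` plays the role of the black line in the per-condition figures and of the dashed line in the summary figures.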

Figure 5.1.: Relative fluctuation strength as a function of modulation frequency for AM tones with center frequency of 1 kHz, sound pressure level of 70 dB and modulation depth of 40 dB. The two standards had modulation frequencies of 4 and 0.5 Hz. The data points show the median and interquartile ranges per standard. The black line represents the mean values of the medians of each standard. Panel (a): data adapted from [9, p. 248]. Panel (b): own results.

Figure 5.2.: Relative fluctuation strength as a function of center frequency for AM tones with modulation frequency of 4 Hz, sound pressure level of 70 dB and modulation depth of 40 dB. The two standards had center frequencies of 1 and 0.25 kHz. The data points show the median and interquartile ranges per standard. The black line represents the mean values of the medians of each standard. Panel (a): data adapted from [9, p. 250]. Panel (b): own results.

Figure 5.3.: Relative fluctuation strength as a function of sound pressure level for AM tones with modulation frequency of 4 Hz, center frequency of 1 kHz and modulation depth of 40 dB. The two standards had sound pressure levels of 70 and 50 dB. The data points show the median and interquartile ranges per standard. The black line represents the mean values of the medians of each standard. Panel (a): data adapted from [9, p. 249]. Panel (b): own results.

Figure 5.4.: Relative fluctuation strength as a function of modulation depth for AM tones with modulation frequency of 4 Hz, center frequency of 1 kHz and sound pressure level of 70 dB. The two standards had modulation depths of 40 and 4 dB. The data points show the median and interquartile ranges per standard. The black line represents the mean values of the medians of each standard. Panel (a): data adapted from [9, p. 249]. Panel (b): own results.

Figure 5.5.: Relative fluctuation strength as a function of modulation frequency for FM tones with center frequency of 1.5 kHz, sound pressure level of 70 dB and frequency deviation of 700 Hz. The two standards had modulation frequencies of 4 and 0.5 Hz. The data points show the median and interquartile ranges per standard. The black line represents the mean values of the medians of each standard. Panel (a): data adapted from [9, p. 248]. Panel (b): own results.

Figure 5.6.: Relative fluctuation strength as a function of center frequency for FM tones with modulation frequency of 4 Hz, sound pressure level of 70 dB and frequency deviation of 200 Hz. The two standards had center frequencies of 6 and 0.5 kHz. The data points show the median and interquartile ranges per standard. The black line represents the mean values of the medians of each standard. Panel (a): data adapted from [9, p. 250]. Panel (b): own results.

Figure 5.7.: Relative fluctuation strength as a function of sound pressure level for FM tones with modulation frequency of 4 Hz, center frequency of 1.5 kHz and frequency deviation of 700 Hz. The two standards had sound pressure levels of 60 and 40 dB. The data points show the median and interquartile ranges per standard. The black line represents the mean values of the medians of each standard. Panel (a): data adapted from [9, p. 249]. Panel (b): own results.

Figure 5.8.: Relative fluctuation strength as a function of frequency deviation for FM tones with modulation frequency of 4 Hz, center frequency of 1.5 kHz and sound pressure level of 70 dB. The two standards had frequency deviations of 700 and 32 Hz. The data points show the median and interquartile ranges per standard. The black line represents the mean values of the medians of each standard. Panel (a): data adapted from [9, p. 251]. Panel (b): own results.

Figure 5.9.: Relative fluctuation strength for AM tones as a function of: (a) modulation frequency, (b) center frequency, (c) sound pressure level, (d) modulation depth. The black solid line corresponds to the data from Fastl and Zwicker [9]. The blue dashed line corresponds to the data from this study. Both curves represent the mean value of the two standards used.

Figure 5.10.: Relative fluctuation strength for FM tones as a function of: (a) modulation frequency, (b) center frequency, (c) sound pressure level, (d) frequency deviation. The black solid line corresponds to the data from Fastl and Zwicker [9]. The blue dashed line corresponds to the data from this study. Both curves represent the mean value of the two standards used.

The following chapter details the process to develop a model for the sensation of fluctuation strength, based on Daniel and Weber’s roughness model [3]. The model will be analyzed in stages, where the changes needed to adapt it to this particular case will be detailed. After that, a procedure to adjust the parameters of the model is presented. Finally, the results of the fluctuation strength model will be presented, and the limitations of the data fitting will be addressed.

6.1. Roughness Model

Daniel and Weber’s roughness model [3] will be used as a basis for a fluctuation strength model adapted to the data obtained in the experimental stage of this study. It has already been mentioned in Chapter 2 that roughness and fluctuation strength sensations are similar from a physical point of view; both attributes arise from modulated sounds. As such, the methodology used to model the roughness sensation might be adequate to also model the fluctuation strength sensation.

The structure of the roughness model is presented in Figure 6.1. The model can be separated into three stages:

1. Peripheral stage

2. Modulation depth extraction stage

3. Specific roughness stage

6.1.1. Peripheral Stage

First, the input signal is divided into frames, which in the original roughness model have a 200 ms duration. For the fluctuation strength case this must be increased in order to achieve a frequency resolution high enough to process stimuli whose frequency components lie close together (e.g., fm = 4 Hz). This can be achieved by manipulating two variables, namely the sampling frequency and the number of samples. For this study it has been decided to keep the sampling frequency at 44.1 kHz, the most common frequency used in audio recordings. The number of samples is chosen such that the frame contains at least three periods of the slowest modulation envelope, following the criterion presented by Boersma [2, p. 97]. The slowest stimulus has a modulation frequency of 0.25 Hz, thus three periods correspond to a 12 s frame duration. Rounding this length up to the next power of 2 results in 1048576 (2^20) samples and a frame duration of approximately 23.78 s.

Figure 6.1.: Structure of roughness model [3, p. 116]
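The frame-length arithmetic described above can be checked with a few lines of Python. This is only a sketch of the calculation; the function and constant names are illustrative.

```python
import math

FS = 44100          # sampling frequency [Hz], as chosen in the text
F_MOD_MIN = 0.25    # slowest modulation frequency in the stimulus set [Hz]
N_PERIODS = 3       # minimum number of envelope periods per frame


def frame_length(fs=FS, f_mod_min=F_MOD_MIN, n_periods=N_PERIODS):
    """Smallest power-of-two sample count covering n_periods of the slowest envelope."""
    min_duration = n_periods / f_mod_min          # 12 s for the values above
    min_samples = math.ceil(min_duration * fs)    # 529200 samples
    n = 1 << (min_samples - 1).bit_length()       # round up to a power of two
    return n, n / fs


n_samples, duration = frame_length()
# n_samples is 1048576 (2**20); duration is about 23.78 s
```

The resulting frequency resolution is `FS / n_samples`, roughly 0.042 Hz, which is well below the 0.25 Hz of the slowest modulation.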

After the frame separation, a Blackman window is applied to the frame to reduce spectral leakage. Following this, the outer and middle ear transmission effects are taken into account by transforming the frame into the frequency domain, and multiplying its components by the parameter a0, a transfer function that is shown in Figure 6.2.
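The windowing and weighting step can be sketched as follows. The function name `weighted_spectrum` is illustrative, the use of a one-sided transform (`rfft`) is a choice made for this example, and the flat `a0` used in the demonstration is a placeholder, not the transfer function of Figure 6.2.

```python
import numpy as np


def weighted_spectrum(frame, a0):
    """Blackman-window a frame, transform it, and apply the a0 weighting.

    a0 must hold one linear gain per rfft bin, i.e. len(frame) // 2 + 1 values.
    """
    window = np.blackman(len(frame))
    spectrum = np.fft.rfft(frame * window)
    return spectrum * a0


# Demonstration with a flat (all-ones) a0, which is only a placeholder.
fs = 44100
t = np.arange(4096) / fs
frame = np.sin(2 * np.pi * 1000 * t)
a0_flat = np.ones(len(frame) // 2 + 1)
spec = weighted_spectrum(frame, a0_flat)
```

In a full implementation `a0` would be sampled from the outer and middle ear transfer function at the FFT bin frequencies.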

After this, the input frame spectrum is transformed into excitation patterns using Terhardt’s approach [18]. This method is a non-linear one, where the lower and upper slopes of the excitation patterns differ. The lower slopes are independent of the stimulus center frequency, and are defined by:

$$S_1 = -27\ \frac{\mathrm{dB}}{\mathrm{Bark}}. \tag{6.1}$$

Figure 6.2.: Outer and middle ear transmission effects parameter a0

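The upper slopes depend on the frequency (f, expressed in kHz) and the level of the stimulus (L, expressed in dB), and are defined by the following equation:

$$S_2 = \left[-24 - \frac{0.23}{f} + 0.2 \cdot L\right] \frac{\mathrm{dB}}{\mathrm{Bark}}. \tag{6.2}$$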
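The two slopes of Eqs. (6.1) and (6.2) translate directly into code. In this sketch the function name is illustrative and the frequency is assumed to be given in kHz.

```python
S1_DB_PER_BARK = -27.0   # lower slope, Eq. (6.1)


def upper_slope(f_khz, level_db):
    """Upper excitation slope S2 in dB/Bark, Eq. (6.2); f_khz is assumed to be in kHz."""
    return -24.0 - 0.23 / f_khz + 0.2 * level_db


s2_loud = upper_slope(1.0, 70.0)   # 1 kHz tone at 70 dB
s2_soft = upper_slope(1.0, 50.0)   # same tone at 50 dB
```

Because of the `0.2 * level_db` term, the upper slope becomes shallower as the level increases, so louder components spread further towards higher critical bands.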

After generating the excitation patterns for all the frequency components, the contribution of all of these patterns is analyzed per critical-band. The excitation patterns are processed using a critical-band filterbank with 47 channels, each of them separated by 0.5 Bark and having a bandwidth of 1 Bark.
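The channel layout of the filterbank can be written down compactly. The sketch below assumes channel centers at z_i = 0.5 i Bark for i = 1, ..., 47, as labelled in the model structure diagram; the function name is illustrative.

```python
N_CHANNELS = 47
STEP_BARK = 0.5
BANDWIDTH_BARK = 1.0


def channel_edges(i):
    """Lower and upper edge (in Bark) of channel i, with i running from 1 to 47."""
    if not 1 <= i <= N_CHANNELS:
        raise ValueError("channel index out of range")
    center = STEP_BARK * i   # centers at z_i = 0.5 * i Bark (assumed)
    return center - BANDWIDTH_BARK / 2, center + BANDWIDTH_BARK / 2


edges = [channel_edges(i) for i in range(1, N_CHANNELS + 1)]
```

Since the spacing is half the bandwidth, each channel overlaps its neighbours by 0.5 Bark.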

For each critical-band, the excitation levels of the frequency components inside the critical-band are considered. An example of this process is shown in Figure 6.3, for a critical-band centered at 12 Bark. If the frequency component causing the excitation levels falls within the given critical-band, then its level is left unchanged. This is the case of component (2) in Figure 6.3. If, on the contrary, the frequency component lies outside the critical-band, its amplitude level is adjusted to the excitation level found at the nearest frequency limit of the critical-band. This is the case for components (1) and (3) in Figure 6.3. For component (1), the excitation level at 11.5 Bark was taken to adjust the amplitude level of the frequency component. For component (3), the excitation level at 12.5 Bark was taken instead. The result of adjusting the relevant frequency components using their excitation levels is shown in Figure 6.3, panel (b). It is important to note that only the amplitude levels of the frequency components are affected by this procedure; their phase values are left untouched.
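The level adjustment described above can be sketched as a small function. This is an interpretation under stated assumptions: excitation patterns are taken to be piecewise linear on the Bark scale with the slopes of Eqs. (6.1) and (6.2), the frequency in Eq. (6.2) is assumed to be in kHz, and the function names are illustrative.

```python
S1 = -27.0  # lower slope [dB/Bark], Eq. (6.1)


def upper_slope(f_khz, level_db):
    """Upper slope S2 [dB/Bark], Eq. (6.2); f_khz is assumed to be in kHz."""
    return -24.0 - 0.23 / f_khz + 0.2 * level_db


def level_in_band(z_comp, f_khz, level_db, z_center, bandwidth=1.0):
    """Level [dB] that a component located at z_comp Bark contributes to one channel."""
    lo, hi = z_center - bandwidth / 2, z_center + bandwidth / 2
    if lo <= z_comp <= hi:
        return level_db                                   # inside: unchanged
    if z_comp < lo:
        # Channel lies above the component: follow the upper slope to the lower edge.
        return level_db + upper_slope(f_khz, level_db) * (lo - z_comp)
    # Channel lies below the component: follow the lower slope to the upper edge.
    return level_db + S1 * (z_comp - hi)


# Component (3) of the example: 70 dB tone at 14 Bark (about 2.32 kHz), channel at 12 Bark.
level_3 = level_in_band(14.0, 2.32, 70.0, 12.0)
```

For component (3) the level is read at 12.5 Bark, which gives 70 - 27 * 1.5 = 29.5 dB in this sketch.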

After these calculations have been carried out for all the frequency components, the real part of the inverse fast Fourier transform (IFFT) of the resulting array of modified frequency components is taken, yielding the time-varying excitation per critical band.
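A minimal sketch of this last step is given below: per-bin gains scale the magnitudes of the frame spectrum while leaving the phases untouched, and the real part of the IFFT gives the channel signal. The function name `band_excitation` and the unit gains used in the demonstration are illustrative; in the model the gains would follow from the excitation level adjustment of each channel.

```python
import numpy as np


def band_excitation(spectrum, gains):
    """Scale each FFT bin by a real gain and return the real part of the IFFT.

    Real, non-negative gains change only the magnitudes; the phases are preserved.
    """
    modified = spectrum * gains
    return np.real(np.fft.ifft(modified))


# Round trip with unit gains reproduces the (real) input frame.
rng = np.random.default_rng(0)
frame = rng.standard_normal(1024)
spectrum = np.fft.fft(frame)
e_i = band_excitation(spectrum, np.ones(len(spectrum)))
```

Taking the real part discards the small imaginary residue that appears when the applied gains are not exactly symmetric between positive and negative frequencies.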

Figure 6.3.: Excitation patterns and specific excitation levels for a critical-band channel centered at 12 Bark with 1 Bark bandwidth. Three pure tones are presented with sound pressure levels of 70 dB and center frequencies of 10 Bark (1), 12 Bark (2) and 14 Bark (3). Panel (a) shows the original excitation patterns and specific excitation levels. The dashed lines show the bandwidth of the critical-band at 12 Bark. The dotted lines show the projection of the values of the tones outside the critical-band. Panel (b) shows the modified specific excitation levels after adjustment using the projected values shown in panel (a).

6.1.2. Modulation Depth Extraction Stage

The objective of this stage is to come up with an approximation of the modulation depth present in the incoming frame on a channel basis. For each channel signal ei(t) its absolute value is
