
2.2. Fluctuation Strength

2.2.3. Related Studies

Although the work by Fastl and Zwicker is the most extensive reference for the fluctuation strength sensation, other studies have been carried out to further investigate the phenomenon.

Accolti and Miyara [1] conducted a study using stimuli composed of two mixed AM sources. They implemented a fluctuation strength model based on the extraction of temporal masking patterns from the stimuli, and then compared the model output with the fluctuation strength perceived by 5 participants. They used a modified magnitude estimation procedure in which 10 references with known, equidistant values of fluctuation strength were available to participants for comparison before they gave an answer. The minimum and maximum of this set of references corresponded to stimuli with modulation indexes of 0.1 and 1, respectively. They found that their model could not properly characterize this type of stimulus. However, the most remarkable part of their study is the inclusion of a training phase before the actual experiment. They decided to include such a phase after a preliminary test showed that results varied strongly among subjects. One possible explanation given [1, p. 17] is that individuals are not familiar with the concept of fluctuation strength and often confuse it with roughness.

Building upon this last point, Wickelmaier and Ellermeier [19] carried out a three-part experiment that yielded similar results in that respect. First, a magnitude estimation test was run using a full-factorial design with 54 pairs of stimuli, combining nine modulation frequencies and six modulation depths. The results differ from those reported by Fastl and Zwicker, since they do not show the characteristic band-pass response as a function of modulation rate. The second experiment tried to assess the contribution of the two factors used in the first experiment (modulation frequency and modulation depth) separately, by varying one while keeping the other constant. It was found that the effect of varying the modulation depth was similar to the data by Fastl and Zwicker, while the effect of varying the modulation frequency was not.

In the last experiment they tried to assess whether the perceived fluctuation can be represented by an additive combination of modulation frequency and modulation depth. To test this they used the Thomsen condition [6], which for their study was stated as follows:

Let $a, b, c$ be three values of $f_m$ ($\Delta f$) and $x, y, z$ three values of $\Delta f$ ($f_m$). The Thomsen condition holds iff

$$ay \sim bz,\quad bx \sim cy \;\Longrightarrow\; ax \sim c'z \tag{2.9}$$

An adaptive forced-choice procedure (1-up/1-down) with 16 repetitions was used to determine the matches (∼) stated in Equation (2.9). Their results show that the Thomsen condition held for only one participant out of seven, suggesting that listeners do not integrate modulation frequency and modulation depth into a unidimensional percept when it comes to the sensation of fluctuation strength. Wickelmaier and Ellermeier conclude that their data do not agree with the data by Fastl and Zwicker, and that the status of fluctuation strength as a basic auditory perceptual attribute could be debated. The discrepancy between the two data sets could also be attributed to participants' lack of understanding of the concept, as Accolti and Miyara had suggested.

Expanding upon possible applications of the sensation of fluctuation strength, Schlittmeier et al. [14] investigated fluctuation strength as a predictor of the irrelevant sound effect (ISE). ISE refers to the phenomenon in which background sounds with distinctive temporal-spectral variations significantly reduce short-term memory performance. This negatively affects individuals' cognitive performance. The changing-state features that sounds must have to cause this mental interference are the following:

1. Segmentability from a temporal-spectral perspective

2. Different successive auditory-perceptive tokens

The first feature resembles what happens when the sensation of fluctuation strength arises, and as such Schlittmeier et al. devised an algorithm that predicts ISE from fluctuation strength values. Using the algorithm, they were able to accurately predict performance changes in individuals for 63 out of 70 types of sounds. Although ISE cannot be completely characterized by fluctuation strength alone, this result paves the way for future work on understanding ISE.

This chapter describes the general methods used in the pilot and the main experiments. First, the equipment and the stimuli used during the experiments are described. Then the magnitude estimation procedure used to assess the various dependencies of fluctuation strength is detailed. Finally, the manner in which the results were treated is reported.

3.1. Equipment

A personal computer with an M-Audio Transit USB audio interface [12] and a set of Sennheiser HD 265 Linear headphones [15] were used to conduct the experiment. The audio interface had a specified output dynamic range of 104 dB and a signal-to-noise ratio (SNR) of 104 dB. The interface itself was configured to use a 16-bit linear pulse-code modulation (PCM) input/output (I/O) audio data format. The headphones had a specified frequency response of −3 dB over the 10 Hz to 30,000 Hz frequency range. Furthermore, the headphones provided a diffuse-field loudness equalization to the reproduced sounds. The experiments were conducted inside a sound isolation booth. The experiment itself was programmed using the APEX software platform [10] and Windows batch files.

3.2. Stimuli

All the sounds presented during the experiment consisted of diotic stimuli, where a monaural stimulus is reproduced to both ears simultaneously. Two types of stimuli were considered: AM tones defined in Equation (3.1), and FM tones defined in Equation (3.4),

$$x_{am} = [1 - h \cdot \cos(2\pi f_m t)] \cdot A_c \cdot \sin(2\pi f_c t), \tag{3.1}$$

where

$$h = \frac{m'_d - 1}{m'_d + 1}, \tag{3.2}$$

$$m'_d = 10^{m_d/20}. \tag{3.3}$$

$$x_{fm} = A_c \cdot \sin\{2\pi[f_c - d_f \cdot \cos(2\pi f_m t)]\,t\}. \tag{3.4}$$

In both equations the variable $A_c$ corresponds to the amplitude of the carrier signal. Moreover, a phase shift of $-\pi/2$ was introduced to the modulating signal for both tones in order to start the corresponding modulation at its lowest point. This phase shift is needed to avoid pops and clicks due to an abrupt onset when presenting the sounds to the participants. Additionally, a cosine ramp with attack and release times of 25 ms was applied to the stimuli to further prevent this phenomenon.
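As a minimal sketch of Equations (3.1) to (3.4) and of the onset/offset ramp, the Python below generates both tone types sample by sample. The function names, the default carrier amplitude of 1.0, and the raised-cosine shape of the ramp are assumptions made for illustration; the text only states that a 25 ms cosine ramp was used. Equation (3.4) is implemented literally as printed.

```python
import math

def am_tone(fc, fm, md_db, dur, fs=44100, ac=1.0):
    """AM tone following Eqs. (3.1)-(3.3); md_db is the modulation depth in dB."""
    md_lin = 10 ** (md_db / 20)        # Eq. (3.3): m'_d
    h = (md_lin - 1) / (md_lin + 1)    # Eq. (3.2): modulation index
    n = int(round(dur * fs))
    return [
        (1 - h * math.cos(2 * math.pi * fm * i / fs))
        * ac * math.sin(2 * math.pi * fc * i / fs)
        for i in range(n)
    ]

def fm_tone(fc, fm, df, dur, fs=44100, ac=1.0):
    """FM tone implementing Eq. (3.4) literally as printed in the text."""
    n = int(round(dur * fs))
    out = []
    for i in range(n):
        t = i / fs
        freq_term = fc - df * math.cos(2 * math.pi * fm * t)
        out.append(ac * math.sin(2 * math.pi * freq_term * t))
    return out

def apply_cosine_ramp(x, fs=44100, ramp_s=0.025):
    """Raised-cosine fade-in and fade-out (assumed ramp shape)."""
    n_ramp = int(round(ramp_s * fs))
    y = list(x)
    for i in range(min(n_ramp, len(y) // 2)):
        gain = 0.5 * (1 - math.cos(math.pi * i / n_ramp))  # rises from 0 towards 1
        y[i] *= gain
        y[-1 - i] *= gain
    return y
```

For instance, `apply_cosine_ramp(am_tone(1000, 4, 40, 2.0))` would produce a 2 s AM tone with a 1 kHz carrier, 4 Hz modulation and 40 dB modulation depth. With the minus sign in front of the cosine, the envelope starts at its minimum value of 1 − h.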

The duration of the stimuli was specified such that each stimulus presented at least three periods of the modulating signal, within a range of 2 to 4 seconds. Therefore, stimuli with long periods (e.g., fm = 0.25 Hz) were truncated to 4 seconds and stimuli with short periods (e.g., fm = 32 Hz) were extended to a duration of 2 seconds. This was done to maintain a similar duration among stimuli while keeping the duration of the whole experiment acceptable.
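The duration rule can be written as a clamp. The sketch below assumes the rule is exactly "three modulation periods, limited to the range from 2 s to 4 s"; the function and parameter names are illustrative.

```python
def stimulus_duration(fm, n_periods=3, min_s=2.0, max_s=4.0):
    """Duration in seconds covering n_periods of the modulator, clamped to [min_s, max_s]."""
    return min(max(n_periods / fm, min_s), max_s)
```

Under this rule a 1 Hz modulator gives a 3 s stimulus. A 0.25 Hz modulator, whose three periods would last 12 s, is cut to 4 s, which is a single modulation period.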

Furthermore, the stimuli were generated such that a level of 100 dB SPL corresponds to 0 dBFS. The sampling rate was set to 44.1 kHz.
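The level mapping can be illustrated with a short conversion. This sketch assumes that 0 dBFS corresponds to a linear amplitude of 1.0; the text does not state whether the level refers to the peak or the RMS value of the signal, so the helper below only converts a level difference into a linear scale factor.

```python
def spl_to_amplitude(level_db_spl, full_scale_db_spl=100.0):
    """Linear scale factor relative to digital full scale (1.0) for a target level."""
    return 10 ** ((level_db_spl - full_scale_db_spl) / 20)
```

A 70 dB SPL stimulus therefore sits 30 dB below full scale, that is, at a scale factor of about 0.0316.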

3.3. Procedure

The experiment was divided into two phases:

1. Training phase

2. Experimental phase

The training phase is discussed in Chapters 4 and 5, as it varied significantly from the pilot experiments to the final experiment. The experimental phase is described below.

Experimental Phase The objective of the experiment was to evaluate the dependence of fluctuation strength on the parameters of the AM and FM tones, namely:

• Modulation frequency (fm)

• Center frequency (fc)

• Sound pressure level (SPL)

• Modulation depth (md) for AM tones; frequency deviation (df) for FM tones

To obtain subjective data regarding these dependencies, a magnitude estimation procedure was used. In a magnitude estimation procedure [9, p. 9], individuals are asked to estimate a value for an attribute of a stimulus, taking an anchor value as a reference. The anchor is usually called the standard. The standard is assigned a specific value, and the participant must state, relative to this reference, the value of the presented stimulus. For example, if a value of 10 is assigned to the standard, an answer of 60 would indicate that the stimulus value is 6 times as large as that of the reference.

the dependency of fluctuation strength on the parameter of the section. For each experimental section, a stimulus set was defined by varying the parameter of the section, thus specifying the stimuli to use during the magnitude estimation procedure.

Every magnitude estimation procedure presented pairs of sounds, each composed of one of two possible standards (Table 3.1) and a stimulus from the stimulus set of the section. The standard and the stimulus were separated by 800 ms of silence. There were four repetitions per pair, and hence eight per stimulus (four per standard). The selection of the standard and the stimulus was randomized. After each pair presentation, the participant had to indicate how much the second sound fluctuated with respect to the first one. A screenshot of the computer interface used in the final experiment design is shown in Figure 3.1.
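The pair structure described above can be sketched as follows. The full shuffle of all pairs, the function name, and the example stimulus values are assumptions for illustration; the text only states that the selection of standard and stimulus was randomized.

```python
import random

def build_trials(stimuli, standards, repetitions=4, seed=None):
    """Return a shuffled list of (standard, stimulus) pairs for one experimental section."""
    rng = random.Random(seed)
    trials = [
        (std, stim)
        for std in standards
        for stim in stimuli
        for _ in range(repetitions)
    ]
    rng.shuffle(trials)  # assumed randomization scheme: one shuffle over all pairs
    return trials
```

With two standards and four repetitions per pair, every stimulus appears eight times, so a hypothetical set of four stimuli yields 32 trials.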

After each experimental section, participants were given the opportunity to take a break away from the computer for a short while, or to continue directly with the rest of the experimental sections.

Section¹   Standard   fm [Hz]   fc [kHz]   SPL [dB]   md [dB]   df [Hz]

AM-fm      1          4         1          70         40        —
           2          0.25      1          70         40        —

AM-fc      1          4         1          70         40        —
           2          4         0.25       70         40        —

AM-SPL     1          4         1          70         40        —
           2          4         1          50         40        —

AM-md      1          4         1          70         40        —
           2          4         1          70         4         —

FM-fm      1          4         1.5        70         —         700
           2          0.5       1.5        70         —         700

FM-fc      1          4         6          70         —         200
           2          4         0.5        70         —         200

FM-SPL     1          4         1.5        60         —         700
           2          4         1.5        40         —         700

FM-df      1          4         1.5        70         —         700
           2          4         1.5        70         —         32

Table 3.1.: Description of the standards used per experiment section

¹ Each experimental section is designated with the type of stimuli followed by the parameter varied. Therefore, AM-fm represents the experiment on fluctuation strength as a function of modulation frequency for AM tones.

Figure 3.1.: Screenshot of the computer interface used in the final experiment design

3.4. Results

From the data obtained from the magnitude estimation procedures, it was possible to plot the relative fluctuation strength as a function of each varying parameter. An example of these plots is shown in Figure 3.2. The actual plots pertaining to the experiments are shown in Chapters 4 and 5.

In order to combine the data points from the two standards, a correction factor had to be applied to the data that used the second standard as a reference. The correction factor was obtained taking into account the data where the second stimulus in the pairs corresponded to the value of the first standard. The correction factor was calculated such that the median of the values that used the second standard was the same as the median of the values that used the first standard. An example of this can be seen in Figure 3.2 panel (b), where the value of the first standard corresponds to a modulation frequency of 4 Hz.

Finally, another correction factor was applied to all the data points, in order to normalize the maximum mean value of the medians of the standards to 100%. The curve of the mean values corresponds to the black line shown in Figure 3.2.
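The two correction steps can be sketched as below. This assumes that both corrections are multiplicative and that responses are stored per standard as a mapping from stimulus value to a list of estimates; the function name, the data layout, and the numbers in the usage note are hypothetical.

```python
from statistics import median

def combine_standards(resp_std1, resp_std2, ref_value):
    """Align the data of standard 2 to standard 1, then normalize the mean curve to 100.

    resp_std1, resp_std2: dict mapping stimulus value -> list of magnitude estimates.
    ref_value: stimulus value equal to the value of the first standard.
    """
    # First correction: match the medians at the reference stimulus.
    factor = median(resp_std1[ref_value]) / median(resp_std2[ref_value])
    med1 = {x: median(v) for x, v in resp_std1.items()}
    med2 = {x: factor * median(v) for x, v in resp_std2.items()}
    # Mean of the two medians per stimulus (the black line in the figures).
    mean_curve = {x: (med1[x] + med2[x]) / 2 for x in med1}
    # Second correction: scale so that the maximum of the mean curve is 100%.
    scale = 100.0 / max(mean_curve.values())
    return (
        {x: scale * m for x, m in med1.items()},
        {x: scale * m for x, m in med2.items()},
        {x: scale * m for x, m in mean_curve.items()},
    )
```

As a hypothetical example, if the medians at the reference stimulus are 10 for the first standard and 40 for the second, every estimate given against the second standard is multiplied by 0.25 before the two sets are averaged.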

Figure 3.2.: Relative fluctuation strength as a function of modulation frequency (adapted from [9, p. 248]). The two standards had modulation frequencies of 4 and 0.5 Hz. The data points show the median and interquartile ranges per standard. The black line represents the mean values of the medians of each standard. Panel (a): data for AM tones with center frequency of 1 kHz, sound pressure level of 70 dB and modulation depth of 40 dB. Panel (b): data for FM tones with center frequency of 1.5 kHz, sound pressure level of 70 dB and frequency deviation of 700 Hz.

This chapter describes the experimental design for the evaluation of the fluctuation strength attribute. First, the initial design (as in [7]) is presented. Then, the iterative process of running the pilots is detailed, stating the progressive changes made to the initial design. Finally, a series of conclusions that led to the final experimental design are presented.

4.1. Initial Design

4.1.1. Subjects

In total 9 subjects participated in the pilot experiments. Participants were between 20 and 30 years old. All of them had self-reported normal hearing.

4.1.2. Stimuli

Table 4.1 presents all the stimuli used for the different experimental sections, stating which parameters were fixed and which were varied.

4.1.3. Procedure

The pilot experiment was divided into two phases, the training phase and the experimental phase, as stated earlier in Chapter 3. The experimental phase remained the same as explained before, using the stimulus set defined in Table 4.1. The training phase for the pilot experiment is described below.

4.1.3.1. Training Phase

The objective of this phase was to make the concept of fluctuation strength clear to the participants and to familiarize them with the range of stimuli. Past studies [1] have pointed out the need for such a phase to familiarize subjects with the sensation. However, this must be approached with caution, as the intention of the training phase is to show participants what the sensation is, not to teach them how to answer specific questions regarding the stimuli.

Stimulus Comparison First, a subset of AM tones (Table 4.2) was presented to the participants sequentially in pairs according to their ID (i.e., first stimuli 1 and 2, then 2 and 3, etc.). After each pair presentation, participants were asked whether they detected a difference in fluctuation strength between the stimuli. In the case of a negative answer, the pair was repeated until a positive answer was obtained.

Table 4.1.: Description of initial set of stimuli used per experiment section

Table 4.2.: Initial subset of AM stimuli for training phase

Long Interval Afterwards, a long interval was presented to the participants. The long interval

Table 4.3.: Initial long interval composed of AM stimuli for training phase

4.2. Iterative Improvements

Not all participants were subjected to the same experimental conditions, and not all of them used the same version of the experiment. Table 4.4 presents the conditions and the version in which each subject participated.

Table 4.4.: Participants' experimental conditions and versions

The experimental procedure was varied during the pilot experiments to address problems noticed while running them. The first version of the experiment yielded unsatisfactory results with regard to the relation between fluctuation strength and modulation frequency (participants 2, 3 and 4). The procedure was then modified by adding two more AM tones with modulation frequencies of 64 and 128 Hz. The idea behind the addition of these two tones was that, if the participants heard stimuli that produce a distinct roughness sensation, it would be easier for them to distinguish between a fluctuating and a rough tone. Additionally, FM tones were included in the training, since up to this point only AM tones had been used in the training phase. This constitutes the second version of the experiment.

Participant 5 was the only participant subjected to version 2 of the experiment. The results did not show any significant improvement with regard to the confusion between fluctuation strength and roughness. However, by talking to the participants it was discovered that the training phase had not made the concept of fluctuation strength clear. Participants were only asked to tell whether there was a difference in fluctuation among the presented stimuli, without an explanation of what fluctuation strength actually was. As such, several participants associated the rate of change of the stimuli (modulation frequency in this case) with a bigger fluctuation in the presented sounds. Hence, they tended to deem sounds with a high modulation frequency as highly fluctuating. Participants 4 and 5 explicitly stated that they were counting the number of cycles in the stimuli, because they were confused about what to answer.

Taking all these comments as feedback, version 3 of the experiment was developed. In this version, explicit instructions regarding the rough tones were given. It was indicated that the sensation of fluctuation was unrelated to the apparent ‘speed’ of the stimuli, and that the answers should be intuitive, based on the arising sensation rather than on a rationalized judgment about it (for instance, by counting cycles). Using this approach, participants were able to understand the fluctuation strength concept better, and some of them even came up with analogies to the sensation itself (the sound of an ambulance alarm, the sound of a washing machine). The actual instructions used in the final experiment can be found in the prompts of the experimental protocol (Appendix A).

The final version, number 4, of the experiment added a small test experiment before starting the actual experimental sections. This was added at the suggestion of participant 8, who indicated that although the training phase was effective in making the fluctuation strength concept clear, it did not show the participant how to make the expected judgments using the magnitude estimation procedure. Moreover, a Latin square randomization approach was used, rotating the order of the experimental sections for each participant. The purpose of this was to distribute possible learning effects of participants among the experimental conditions. Finally, the modulation frequency sections (AM-fm and FM-fm) were each split into two separate sections, each one with 2 repetitions per pair instead of 4. Therefore, two smaller sections each for AM-fm and FM-fm were used in the final experiment. This was done because the modulation frequency sections lasted longer than the other sections. By keeping all the sections relatively short (around 6 minutes each), it was expected that participants' attention and focus would be retained from one section to another.
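One way to rotate the section order per participant is a cyclic Latin square, sketched below. The cyclic construction and the function names are assumptions; the text does not specify which Latin square was used. The section labels in the usage note are those of Table 3.1 and do not include the split modulation frequency sections.

```python
def latin_square_orders(sections):
    """Cyclic Latin square: row i is the section list rotated left by i positions."""
    n = len(sections)
    return [[sections[(i + j) % n] for j in range(n)] for i in range(n)]

def order_for_participant(sections, participant_index):
    """Pick the row for a participant, wrapping around after len(sections) participants."""
    return latin_square_orders(sections)[participant_index % len(sections)]
```

For example, with the sections listed as AM-fm, AM-fc, AM-SPL, AM-md, FM-fm, FM-fc, FM-SPL and FM-df, the participant with index 1 would start with AM-fc and finish with AM-fm. Each section then appears exactly once in every position across eight consecutive participants.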

4.3. Results

The following figures show the results of the pilot experiments, together with the results reported by Fastl and Zwicker [9]. Comparing the two plots per experimental condition, it can be concluded that both graphs are qualitatively similar. Therefore, the pilot experiment was deemed successful in obtaining the relevant subjective data from participants.

A more detailed description of the particularities of each experimental condition curve will be given in Chapter 5.

Figure 4.1.: Relative fluctuation strength as a function of modulation frequency for AM tones with center frequency of 1 kHz, sound pressure level of 70 dB and modulation depth of 40 dB. The two standards had modulation frequencies of 4 and 0.5 Hz. The data points show the median and interquartile ranges per standard. The black line represents the mean values of the medians of each standard. Panel (a): data adapted from [9, p. 248]. Panel (b): own results.

Figure 4.2.: Relative fluctuation strength as a function of center frequency for AM tones with modulation frequency of 4 Hz, sound pressure level of 70 dB and modulation depth of 40 dB. The two standards had center frequencies of 1 and 0.25 kHz. The data points show the median and interquartile ranges per standard. The black line represents the mean values of the medians of each standard. Panel (a): data adapted from [9, p. 250]. Panel (b): own results.

4.4. Conclusions

From the obtained data it can be concluded that the main problem when dealing with the perceptual attribute of fluctuation strength is its ambiguity and its confusion with the perceptual attribute of roughness. The proposed training phase was effective in clarifying the concept to participants, by adding stimuli with a clear rough sensation and by clearly instructing them on what the sensation is about. It should be noted that this confusion arises only with regard to modulation frequency. The other parameters do not present this particularity, and as such it was not necessary to adapt the experimental procedure for them.

Figure 4.3.: Relative fluctuation strength as a function of sound pressure level for AM tones with modulation frequency of 4 Hz, center frequency of 1 kHz and modulation depth of 40 dB. The two standards had sound pressure levels of 70 and 50 dB. The data points show the median and interquartile ranges per standard. The black line represents the mean values of the medians of each standard. Panel (a): data adapted from [9, p. 249]. Panel (b): own results.

Figure 4.4.: Relative fluctuation strength as a function of modulation depth for AM tones with modulation frequency of 4 Hz, center frequency of 1 kHz and sound pressure level of 70 dB. The two standards had modulation depths of 40 and 4 dB. The data points show the median and interquartile ranges per standard. The black line represents the mean values of the medians of each standard. Panel (a): data adapted from [9,