6. Model Development

6.1.3. Specific Roughness Stage

The last stage of the model deals with the specific roughness calculation. The model is based on the dependence of roughness on modulation depth, portrayed by Equation (6.7).

$$R \sim m^{p} \tag{6.7}$$

For each channel, a specific roughness value is obtained using Equation (6.8).

$$r_i = \left[ g(z_i) \cdot m_i^{*} \cdot k_{i-2} \cdot k_i \right]^{2} \tag{6.8}$$

The parameter g(z_i) is a weight associated with each channel, and it models the dependence of roughness on the center frequency of the stimulus. Additionally, the cross-correlation coefficients between neighboring channels (k_{i-2}, k_i) are included to account for phase shifts between the channel signals. The total roughness is then obtained by summing the specific values of all channels, as shown in Equation (6.9),

$$R = c \cdot \sum_{i=1}^{N} r_i \tag{6.9}$$

where c is a calibration constant and N is the number of channels.
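As a minimal sketch of how Equations (6.8) and (6.9) might be evaluated numerically, the snippet below computes per-channel specific roughness and sums it. The function names, the input values and the handling of the first two channels (which have no channel i−2 neighbour) are assumptions made for illustration and are not taken from the original model implementation.

```python
import numpy as np


def specific_roughness(g, m_star, k):
    """Per-channel specific roughness in the form of Equation (6.8).

    g      : channel weights g(z_i)
    m_star : generalized modulation depths m_i^*
    k      : cross-correlation coefficients, where k[i - 2] plays the role of k_{i-2}
    """
    g = np.asarray(g, dtype=float)
    m_star = np.asarray(m_star, dtype=float)
    k = np.asarray(k, dtype=float)
    # Assumption: channels without an (i - 2) neighbour use a coefficient of 1.
    k_prev = np.ones_like(k)
    k_prev[2:] = k[:-2]
    return (g * m_star * k_prev * k) ** 2


def total_roughness(r, c):
    """Calibrated sum of the specific values, Equation (6.9)."""
    return c * np.sum(r)


# Illustrative values only (four channels, hypothetical calibration constant).
r = specific_roughness(g=[1, 1, 1, 1], m_star=[0.5, 0.5, 0.5, 0.5], k=[1.0, 1.0, 0.5, 0.5])
R = total_roughness(r, c=0.25)
```

Because the product is squared, the sign of the cross-correlation product does not survive into r_i in this form.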

To accommodate the fluctuation strength scenario, a different equation for the specific fluctuation values was proposed, one that provides more flexibility to adapt the model response to the obtained data. First, each of the three factors used in the specific value calculation was assigned a separate power coefficient, as shown in Equation (6.10). In this way, the contribution of each component can be weighted and adjusted individually.

$$f_i = g(z_i)^{p_g} \cdot (m_i^{*})^{p_m} \cdot (k_{i-2} \cdot k_i)^{p_k} \tag{6.10}$$

After the adjustments, the parameter g(z_i) was removed, since the gathered data did not show a dependence on the center frequency. This is evidenced by the flat response obtained in the experimental results when the center frequency was the varied parameter (Figures 5.2 and 5.6). Furthermore, to avoid imaginary values when using power coefficients smaller than 1, the absolute value of the product of the cross-correlation coefficients was used. Although this prevents a negative coefficient from subtracting specific fluctuation, it yields better fitting results. The final form of the specific fluctuation strength equation is shown in Equation (6.11).

$$f_i = (m_i^{*\prime})^{0.25} \cdot |k_{i-2} \cdot k_i|^{0.375} \tag{6.11}$$

where

$$m_i^{*\prime} = H(m_i^{*\prime\prime}) \cdot m_i^{*\prime\prime} \tag{6.12}$$

$$m_i^{*\prime\prime} = m_i^{*} - 0.1 \tag{6.13}$$

and H is the Heaviside step function.

The total fluctuation strength is then obtained using Equation (6.14):

$$F = 0.15 \cdot \sum_{i} f_i \tag{6.14}$$
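The chain from Equation (6.11) to Equation (6.14) can be sketched as follows. This is an illustrative reading of the equations, not the original implementation: the function names, the example inputs and the choice of a coefficient of 1 for channels lacking an i−2 neighbour are assumptions.

```python
import numpy as np


def specific_fluctuation(m_star, k, p_m=0.25, p_k=0.375, offset=0.1):
    """Per-channel specific fluctuation strength, Equations (6.11) to (6.13)."""
    m_star = np.asarray(m_star, dtype=float)
    k = np.asarray(k, dtype=float)
    # Equation (6.13): shift the generalized modulation depth down by the offset.
    shifted = m_star - offset
    # Equation (6.12): H(x) * x keeps positive values and sets the rest to zero.
    rectified = np.maximum(shifted, 0.0)
    # Assumption: channels without an (i - 2) neighbour use a coefficient of 1.
    k_prev = np.ones_like(k)
    k_prev[2:] = k[:-2]
    # Equation (6.11): the absolute value keeps the fractional power real-valued.
    return rectified ** p_m * np.abs(k_prev * k) ** p_k


def total_fluctuation(f, c=0.15):
    """Calibrated sum of the specific values, Equation (6.14)."""
    return c * np.sum(f)


# Illustrative values only: one channel has a negative cross-correlation.
f = specific_fluctuation(m_star=[0.05, 0.1, 1.1], k=[1.0, -1.0, 1.0])
F = total_fluctuation(f)
```

In this example the first two channels contribute nothing because their shifted modulation depth is not positive, and the negative coefficient produces a finite value rather than a complex one.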

6.2. Procedure

The parameters were adjusted in the following order:

1. p_m,
2. p_k,
3. c,
4. H(f_mod).

The order is such that the first parameter is the one that affects the model response curves most strongly. Furthermore, the goodness of fit of the model was judged both qualitatively, by comparing the model and experiment curves, and quantitatively, by using the square root of the sum of the squared differences between them ($\sqrt{\sum r_i^2}$).
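The quantitative criterion described above can be computed in a few lines. The sketch below assumes that the model and experiment curves are sampled at the same stimulus values; the function name and the example numbers are hypothetical.

```python
import numpy as np


def fit_error(model, experiment):
    """Square root of the sum of squared differences between two curves."""
    model = np.asarray(model, dtype=float)
    experiment = np.asarray(experiment, dtype=float)
    residuals = model - experiment  # r_i in the notation of the text
    return np.sqrt(np.sum(residuals ** 2))


# Hypothetical curves sampled at the same three stimulus conditions.
error = fit_error(model=[0.3, 0.4, 1.0], experiment=[0.0, 0.0, 1.0])
```

Since the differences are summed rather than averaged, this measure grows with the number of points in a curve, which is worth keeping in mind when comparing values between curves of different lengths.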

Parameter p_m directly modifies the effect of modulation depth on fluctuation strength. Since the model is based on this relation, p_m affects all the model response curves, and as such it was the first parameter to be adapted. The power coefficient had to be drastically reduced, from the value of 2 present in the original roughness model to a value of 0.25, to account for the fast saturation of fluctuation strength with increasing modulation depth that is present in the dataset. Furthermore, a transformation, given by Equations (6.12) and (6.13), was needed to correct for the higher values that the new power coefficient produces at low modulation depths.
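A short worked example may help to see why the transformation is needed. The three modulation depth values below are chosen purely for illustration. Each row lists the original exponent of 2, the reduced exponent of 0.25, and the reduced exponent applied after the offset of Equations (6.12) and (6.13).

```latex
\begin{align*}
m_i^{*} = 0.05:&\quad (0.05)^{2} = 0.0025, \quad (0.05)^{0.25} \approx 0.47, \quad \left(\max(0.05 - 0.1,\, 0)\right)^{0.25} = 0 \\
m_i^{*} = 0.5:&\quad (0.5)^{2} = 0.25, \quad (0.5)^{0.25} \approx 0.84, \quad (0.5 - 0.1)^{0.25} \approx 0.80 \\
m_i^{*} = 1:&\quad 1^{2} = 1, \quad 1^{0.25} = 1, \quad (1 - 0.1)^{0.25} \approx 0.97
\end{align*}
```

With the exponent of 0.25 alone, a very small modulation depth already yields almost half of the maximum value, whereas the offset removes that contribution while leaving large modulation depths nearly unchanged.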

Parameter p_k has a predominant effect on FM tones, especially on the dependency on frequency deviation, center frequency and modulation frequency. After adjusting this parameter, the calibration constant c was adjusted to attain a value of 1 vacil for the reference condition, which is present in the curve of fluctuation strength as a function of SPL (Figure 6.6).
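The calibration step can be illustrated with a small sketch. It assumes that the constant simply scales the sum of the specific values so that the reference stimulus yields 1 vacil; the function name and the specific values used here are hypothetical and do not come from the model.

```python
def calibration_constant(specific_values_ref, target=1.0):
    """Constant that maps the summed specific values of the reference to `target`."""
    total = sum(specific_values_ref)
    if total <= 0:
        raise ValueError("the reference stimulus must produce a positive sum")
    return target / total


# Hypothetical specific fluctuation values for the reference stimulus.
reference_values = [2.0, 3.0, 1.0]
c = calibration_constant(reference_values)
calibrated_total = c * sum(reference_values)
```

Because the constant depends on the specific values, it has to be recomputed whenever p_m or p_k changes, which is consistent with calibrating after those two parameters.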

Finally, the bandpass filter H(f_mod) was adjusted, which mostly affects the modulation frequency dependency for both types of tones.

6.3. Results

The following are the results of the model compared with the obtained experimental data.

The model adapts better to the AM tones than to the FM tones, yielding lower values of $\sqrt{\sum r_i^2}$ in the former case than in the latter. However, from a qualitative point of view, the responses obtained for FM tones tend to follow those present in the experimental data, albeit with systematic level differences, especially in the case of center frequency and frequency deviation.

[Plot: fluctuation strength [vacil] versus modulation frequency [Hz], experiment and model curves; $\sqrt{\sum r_i^2}$ = 1.54 vacil]

Figure 6.4.: Fluctuation strength as a function of modulation frequency for AM tones. The blue dashed line corresponds to the mean of the two standards of the experimental data. The black solid line corresponds to the output of the model. In the top left corner, the square root of the sum of the squared differences between these two curves ($\sqrt{\sum r_i^2}$) is shown.

[Plot: fluctuation strength [vacil] versus center frequency [Hz], experiment and model curves; $\sqrt{\sum r_i^2}$ = 0.76 vacil]

Figure 6.5.: Fluctuation strength as a function of center frequency for AM tones. The blue dashed line corresponds to the mean of the two standards of the experimental data. The black solid line corresponds to the output of the model. In the top left corner, the square root of the sum of the squared differences between these two curves ($\sqrt{\sum r_i^2}$) is shown.

[Plot: fluctuation strength [vacil] versus sound pressure level [dB], experiment and model curves; $\sqrt{\sum r_i^2}$ = 1.04 vacil]

Figure 6.6.: Fluctuation strength as a function of sound pressure level for AM tones. The blue dashed line corresponds to the mean of the two standards of the experimental data. The black solid line corresponds to the output of the model. In the top left corner, the square root of the sum of the squared differences between these two curves ($\sqrt{\sum r_i^2}$) is shown.

[Plot: fluctuation strength [vacil] versus modulation depth [dB], experiment and model curves; $\sqrt{\sum r_i^2}$ = 0.37 vacil]

Figure 6.7.: Fluctuation strength as a function of modulation depth for AM tones. The blue dashed line corresponds to the mean of the two standards of the experimental data. The black solid line corresponds to the output of the model. In the top left corner, the square root of the sum of the squared differences between these two curves ($\sqrt{\sum r_i^2}$) is shown.

[Plot: fluctuation strength [vacil] versus modulation frequency [Hz], experiment and model curves; $\sqrt{\sum r_i^2}$ = 2.25 vacil]

Figure 6.8.: Fluctuation strength as a function of modulation frequency for FM tones. The blue dashed line corresponds to the mean of the two standards of the experimental data. The black solid line corresponds to the output of the model. In the top left corner, the square root of the sum of the squared differences between these two curves ($\sqrt{\sum r_i^2}$) is shown.

[Plot: fluctuation strength [vacil] versus center frequency [Hz], experiment and model curves; $\sqrt{\sum r_i^2}$ = 1.90 vacil]

Figure 6.9.: Fluctuation strength as a function of center frequency for FM tones. The blue dashed line corresponds to the mean of the two standards of the experimental data. The black solid line corresponds to the output of the model. In the top left corner, the square root of the sum of the squared differences between these two curves ($\sqrt{\sum r_i^2}$) is shown.

[Plot: fluctuation strength [vacil] versus sound pressure level [dB], experiment and model curves; $\sqrt{\sum r_i^2}$ = 1.21 vacil]

Figure 6.10.: Fluctuation strength as a function of sound pressure level for FM tones. The blue dashed line corresponds to the mean of the two standards of the experimental data. The black solid line corresponds to the output of the model. In the top left corner, the square root of the sum of the squared differences between these two curves ($\sqrt{\sum r_i^2}$) is shown.

[Plot: fluctuation strength [vacil] versus frequency deviation [Hz], experiment and model curves; $\sqrt{\sum r_i^2}$ = 1.12 vacil]

Figure 6.11.: Fluctuation strength as a function of frequency deviation for FM tones. The blue dashed line corresponds to the mean of the two standards of the experimental data. The black solid line corresponds to the output of the model. In the top left corner, the square root of the sum of the squared differences between these two curves ($\sqrt{\sum r_i^2}$) is shown.

This chapter discusses the main points of the experimental and modeling stages of the sensation of fluctuation strength, presented in Chapters 5 and 6. First the remarks concerning the experiments will be addressed, and later the details regarding the model will be explored.

Regarding the objectives of this research, it was possible to formulate a revised experimental procedure that reduced the unfamiliarity of fluctuation strength and the confusion between fluctuation strength and roughness. Moreover, the obtained perceptual data were deemed qualitatively similar to those reported by Fastl and Zwicker.

Additionally, it was possible to propose a model for the sensation of fluctuation strength by adjusting the roughness model specified by Daniel and Weber. The model produced values that were deemed qualitatively similar to those gathered with the revised experimental procedure.

7.1. Experiment

7.1.1. Methods

The training phase, which was developed based on the pilot experiments carried out before the actual experiment, proved to be beneficial. Participants were able to better understand the concept of fluctuation strength as a result of the presentation of key examples in the stimulus comparison section of the training phase. This was done without forcing participants toward specific responses, thus allowing them to learn the differences between the sensations by themselves.

Nonetheless, the confusion between roughness and fluctuation strength was not completely eliminated. In the modulation frequency sections of the experiment, some participants reported increasing values of fluctuation strength as the modulation frequency increased. This reveals a lack of understanding of the sensation. However, the concept of fluctuation strength is difficult to grasp, and it depends largely on individuals’ past experiences and ability to focus on the stimuli correctly. As such, some confusion will always exist, and this phenomenon can only be reduced to a certain extent.

Furthermore, the instructions that the participants received, and how they understood them, can greatly influence the results. In this case, participants were told to focus on the actual sensation of fluctuation and not to associate it with any physical parameter of the sounds, namely modulation frequency, which leads to the roughness confusion. Some participants interpreted this as meaning that they should disregard the effect of any physical parameter that does not affect the fluctuation directly. For example, sound pressure level would sometimes be considered to have a flat response since it does not modify the modulation itself, only the loudness of the sounds. Explicitly addressing these facts could lead to more accurate fluctuation estimates from the participants.

During the training phase, the stimulus comparison and test sections proved to be helpful for participants. The long interval section was not as helpful as the other sections, as participants always stated that they could distinguish the stimuli. This section could be removed from future experiments to make them shorter and easier for the participants.

With regard to the experimental sections themselves, some participants noted that the slider did not reset to its original position after each trial. This was a technical limitation of the chosen platform. One may wonder whether the slider’s initial position affected participants’ responses by visually shifting the starting 100% reference point, but nothing beyond speculation can be offered at this point.

The experimental session had an approximate duration of one hour. During the experiment, participants sat looking at the computer screen while they listened to the stimuli.

Although breaks between the experimental sections were suggested to participants, almost none of them took any, preferring to finish the experiment as fast as possible. Moreover, some participants reported feeling tired or dizzy, or having a slight headache, after the conclusion of the experiment. The Latin square design was intended to balance these fatigue effects by distributing them among the experimental conditions. Another possible solution would be to reduce the number of repetitions per pair presentation. The chosen value of 4 repetitions was taken from Fastl’s experiment [7], as it provided a deviation of ±10% between answers.
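To illustrate how a Latin square spreads order effects such as fatigue over conditions, the sketch below builds a simple cyclic Latin square in which every condition appears exactly once in every presentation position. The text does not state how the square used in the experiment was constructed, so the cyclic construction and the function name are assumptions.

```python
def cyclic_latin_square(n):
    """Return an n x n cyclic Latin square.

    Row r gives the order in which participant group r receives the
    n experimental sections (numbered 0 to n - 1).
    """
    return [[(row + col) % n for col in range(n)] for row in range(n)]


# Example with four hypothetical experimental sections.
square = cyclic_latin_square(4)
```

A cyclic square balances position but not the immediate predecessor of each section, so carry-over effects between specific pairs of sections are not controlled by this construction.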

Related to the experimental duration, the sections corresponding to the modulation frequency dependency were split into two, to keep the duration of every section at around 6 minutes. Whether having two shorter sections instead of one longer section changes participants’ responses is unknown.

7.1.2. Results

The curves of fluctuation strength as a function of modulation frequency, both for AM and FM tones, showed the expected bandpass responses, although the bandwidth of these responses is wider than that reported by Fastl and Zwicker. One possible reason for this is the influence of the additional tones (f_m = {0, 64, 128} Hz), added to counteract the confusion between roughness and fluctuation strength. These tones change the lowest and highest values present for modulation frequency. These points act as anchors, providing participants with stimuli that implicitly have an almost zero value of fluctuation strength. As such, the response is “stretched”, similar to the enlargement of the bandwidth observed in the characteristic bandpass responses. Furthermore, during the course of the experiment participants were exposed slowly to the whole range of

to the literature data. Internal-state models for magnitude estimation procedures have been proposed in the past [11], supporting this view of the cognitive processes that govern participants’ answers. Furthermore, past studies [17] have shown that the range of stimuli affects the outcome of a magnitude estimation process.

Regarding the fluctuation strength as a function of center frequency, a mostly flat response was found for AM tones. Although it is somewhat different from that presented by Fastl and Zwicker, these data also present large IQR values. Therefore, the obtained curve is deemed to be qualitatively similar. With respect to FM tones, the obtained data deviate significantly from those of Fastl and Zwicker: the former have a flat trend, while the latter decrease monotonically with increasing center frequency. This difference will be addressed later, when the frequency deviation curve is discussed.

Some participants stated that they were not sure whether the variation of the sound pressure level should be considered to influence the sensation of fluctuation. Nevertheless, the sound pressure level curves showed only minor differences compared to the literature and are overall consistent with the expected behavior.

The last two parameters, modulation depth and frequency deviation, also present systematic variations with respect to those reported by Fastl and Zwicker. Since these two parameters can be considered analogous indicators of the amount of modulation for each type of tone, they are analyzed together. Participants exhibit a lack of sensitivity to changes in modulation, resulting in higher than expected values for both curves. A small value of m_d or d_f results in a higher value of fluctuation than expected, while an increase in modulation results in a less steep increase in fluctuation.

For FM tones, this lack of sensitivity can also explain the differences found in the center frequency and frequency deviation curves. In both cases, the number of auditory filters excited by the incoming signal depends on the parameters. For center frequency, an increase of frequency corresponds to a decrease in the number of filters excited, since for higher frequencies the auditory filters have wider bandwidths. For the frequency deviation the opposite occurs, because an increase in the frequency deviation also increases the stimulus bandwidth, resulting in more auditory filters being covered by it. It seems that, as the number of auditory filters that are excited increases, their contribution to the overall fluctuation strength of the sound decreases.
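The relation between center frequency, frequency deviation and the number of auditory filters covered can be made concrete with a rough estimate. The sketch below uses the ERB formulas of Glasberg and Moore, which is an assumption made for illustration: the model discussed here is not based on the ERB scale, and the stimulus values in the example are hypothetical.

```python
import math


def erb_bandwidth(f_hz):
    """Equivalent rectangular bandwidth in Hz at frequency f_hz (Glasberg and Moore)."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)


def erb_number(f_hz):
    """Position of f_hz on the ERB-number scale (Glasberg and Moore)."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)


def erbs_spanned(fc_hz, deviation_hz):
    """Approximate number of ERBs swept by an FM tone at fc_hz with +/- deviation_hz."""
    return erb_number(fc_hz + deviation_hz) - erb_number(fc_hz - deviation_hz)


# Same deviation at two center frequencies, and two deviations at one center frequency.
low_fc = erbs_spanned(1000.0, 100.0)
high_fc = erbs_spanned(4000.0, 100.0)
narrow = erbs_spanned(1500.0, 32.0)
wide = erbs_spanned(1500.0, 700.0)
```

Under these assumptions a fixed deviation covers fewer ERBs at the higher center frequency, while a larger deviation at a fixed center frequency covers more, which matches the two opposite trends described in the text.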

Finally, the fluctuation strength as a function of center frequency curve used for FM tones is not very suitable for analyzing only the effect of center frequency itself. As the center frequency increases, the number of auditory filters excited decreases, due to an enlargement of their bandwidth for higher center frequency values. As such, two effects are present in this response, center frequency and number of auditory filters. As an improvement to the methodology, stimuli with a lower value of frequency deviation could be used, to eliminate the auditory filter effect.

7.2. Model

7.2.1. Results

Overall, the model provides estimates that are qualitatively similar to the fluctuation strength values of the experimental data. However, it adapts better to the AM tones than to the FM tones. This makes sense, since the model was intended to be used with AM stimuli, hence the fundamental use of modulation depth (a property found in AM only) to estimate the amount of fluctuation. Therefore, a trade-off must be established when adjusting the model parameters with both types of tone in mind. Nevertheless, the model predicts values close to those found in the data for FM tones. One possible improvement in this regard is to substitute the generalized modulation depth formula proposed by Daniel and Weber with a modulation filterbank [4], which could better model the energy distribution of the signal among specific frequency bands.

7.2.2. Limitations

The use of the absolute value of the cross-correlation coefficients may pose problems when using the model with BBN AM tones, the other type of stimulus used by Fastl and Zwicker in their study. Although some FM tones used in this study already present negative cross-correlation coefficients, they do not negatively affect the overall fit obtained with the model.

The increased frame size needed to achieve the frequency resolution required by the model makes it impractical for real applications. Currently a frame size of 2²⁰ samples is needed, which corresponds to around 24 seconds at a sampling frequency of 44.1 kHz.
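The frame duration quoted above follows directly from the frame size and the sampling frequency. The short check below also computes the corresponding spacing of DFT bins, under the assumption that the frame is analyzed with a DFT of the same length.

```python
frame_size = 2 ** 20        # samples per frame
sampling_rate_hz = 44_100   # sampling frequency in Hz

# Duration of one frame in seconds.
frame_duration_s = frame_size / sampling_rate_hz

# Spacing between DFT bins in Hz for a transform of the same length (assumption).
bin_spacing_hz = sampling_rate_hz / frame_size
```

This gives a frame of roughly 23.8 seconds and a bin spacing of about 0.04 Hz.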

Finally, more modern modeling techniques, such as the ERB perceptual scale and a Gammatone filterbank instead of Terhardt’s, could be implemented to improve the model and bring it up to date.

7.3. Conclusions

The use of a training phase before the experimental phase proved to be useful. On average, participants were able to understand better the concept of fluctuation strength. Moreover, the inclusion of the training phase reduced the confusion between fluctuation strength and roughness, compared to the pilot experiments.

The obtained perceptual data were found to be qualitatively similar to the data originally reported by Fastl and Zwicker. Although some differences existed between the two data sets, it was possible to isolate and identify the cause of the discrepancies. Furthermore, the characteristic bandpass response of fluctuation strength to modulation frequency was observed in the obtained data.

[1] E. Accolti and F. Miyara. “Fluctuation Strength of Mixed Fluctuating Sound Sources.” Mecánica Computacional 28.2 (Nov. 2009), pp. 9–22.

[2] P. Boersma. “Accurate Short-Term Analysis of the Fundamental Frequency and the Harmonics-to-Noise Ratio of a Sampled Sound.” In: Proceedings of the Institute of Phonetic Sciences. Vol. 17. 1193. 1993, pp. 97–110.

[3] P. Daniel and R. Weber. “Psychoacoustical Roughness: Implementation of an Optimized Model.” Acta Acustica united with Acustica 83.1 (Jan. 1997), pp. 113–123.

[4] T. Dau, B. Kollmeier, and A. Kohlrausch. “Modeling Auditory Processing of Amplitude Modulation. I. Detection and Masking with Narrow-Band Carriers.” The Journal of the Acoustical Society of America 102.5 (Nov. 1997), pp. 2892–2905.

[5] E. Edwards and E. F. Chang. “Syllabic (∼2–5 Hz) and Fluctuation (∼1–10 Hz) Ranges in Speech and Auditory Processing.” Hearing Research 305 (Nov. 2013), pp. 113–134.

[6] J. Falmagne. “Random Conjoint Measurement and Loudness Summation.” Psychological Review 83.1 (Jan. 1976), pp. 65–79.

[7] H. Fastl. “Fluctuation Strength and Temporal Masking Patterns of Amplitude-Modulated Broadband Noise.” Hearing Research 8.1 (Sept. 1982), pp. 59–69.

[8] H. Fastl. “The Psychoacoustics of Sound-Quality Evaluation.” Acta Acustica united with Acustica 83.5 (1997), pp. 754–764.

[9] H. Fastl and E. Zwicker. Psychoacoustics: Facts and Models. 3rd ed. Vol. 22. Springer-Verlag Berlin Heidelberg, 2007. isbn: 9783540688884.

[10] T. Francart, A. van Wieringen, and J. Wouters. “APEX 3: A Multi-Purpose Test Platform for Auditory Psychophysical Experiments.” Journal of Neuroscience Methods 172.2 (July 2008), pp. 283–293.

[11] A. A. J. Marley. “Internal State Models for Magnitude and Related Experiments.” Journal of Mathematical Psychology 9.3 (Aug. 1972), pp. 306–319.

[12] M-Audio. M-Audio Transit USB User Guide. 2003.

[13] G. Müller and M. Möser. Handbook of Engineering Acoustics. Springer-Verlag Berlin Heidelberg, 2013. isbn: 9783540694601.

[14] S. J. Schlittmeier et al. “Algorithmic Modeling of the Irrelevant Sound Effect (ISE) by the Hearing Sensation Fluctuation Strength.” Attention, Perception & Psychophysics 74.1 (Jan. 2012), pp. 194–203.

[15] Sennheiser Electronic GmbH. Sennheiser HD 265 Instructions for Use. 1994.

[16] A. Sontacchi. “Entwicklung eines Modulkonzeptes für die psychoakustische Geräuschanalyse

In document Eindhoven University of Technology MASTER Modeling the sensation of fluctuation strength García León, R. (pagina 54-0)