Testing Functions with Multiple Frequencies

A function with multiple frequencies is one that has more than one contributing frequency component. In terms of the periodogram, this means more than one dominant peak. In Figure 5.22 we see that adding a sine function with period 6 to a cosine function with period 5 introduces an extra frequency component, represented by the second peak at frequency 2π/6 = π/3. The period of a function with multiple contributing frequencies is the least common multiple of the periods corresponding to these frequencies. In our example, the contributing frequencies are 2π/5 and 2π/6, and the corresponding periods are 5 and 6 by equation 3.8. The least common multiple of 5 and 6 is 30, making the period of the function 30. This can be verified in Figure 5.23, where we see that the function repeats itself every 30 time units.
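The least-common-multiple argument can also be checked numerically. The following is a minimal sketch rather than code from this thesis: the name `f` and the integer time grid are assumptions made for illustration, and the computation presumes integer periods so that `math.gcd` applies.

```python
import math
import numpy as np

def f(t):
    """Cosine with period 5 plus sine with period 6 (the example above)."""
    return np.cos(2 * np.pi * t / 5) + np.sin(2 * np.pi * t / 6)

# Least common multiple of the two integer periods.
period = 5 * 6 // math.gcd(5, 6)

t = np.arange(0, 60)                                  # integer time grid (assumption)
repeats_every_30 = np.allclose(f(t), f(t + period))   # True: 30 is a period
repeats_every_5 = np.allclose(f(t), f(t + 5))         # False: 5 alone is not
repeats_every_6 = np.allclose(f(t), f(t + 6))         # False: 6 alone is not
print(period, repeats_every_30, repeats_every_5, repeats_every_6)
```

Shifting by 5 leaves the cosine term unchanged but moves the sine term, and vice versa for a shift by 6, which is why neither individual period is a period of the sum.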

In Section 5.1 we saw that the periodogram is the basis of all the test statistics used; even the Lomb-Scargle periodogram is a redefinition of the standard periodogram. Thus, the extra peaks alter the way we implement the tests. We found that the test using Fisher’s G-statistic behaves differently from the other tests for functions with multiple frequencies. This behaviour arises because, under the null hypothesis, Fisher’s statistic remains constant, so the additional peaks in the periodogram affect this test differently.
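To illustrate how a second frequency component shows up as a second dominant peak, the sketch below computes a periodogram of the noise-free example function. It is only an illustration: the 1/n normalisation and the variable names are assumptions, and the length 150 is chosen to match the sample length used later in this section.

```python
import numpy as np

n = 150                                   # sample length used later in this section
t = np.arange(n)
x = np.cos(2 * np.pi * t / 5) + np.sin(2 * np.pi * t / 6)   # noise-free signal

# Periodogram ordinates at the Fourier frequencies 2*pi*j/n, j = 0..n//2.
# The 1/n normalisation is an assumption; it does not move the peaks.
ordinates = np.abs(np.fft.rfft(x)) ** 2 / n
freqs = 2 * np.pi * np.arange(len(ordinates)) / n

# Indices of the two largest ordinates, sorted by frequency.
top_two = np.sort(np.argsort(ordinates)[-2:])
peak_freqs = freqs[top_two]
print(top_two, peak_freqs)
```

With n = 150 both 2π/6 and 2π/5 are exact Fourier frequencies (indices 25 and 30), so the two peaks fall exactly on the grid and no leakage occurs.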

5.6.1 Test using Fisher’s G-statistic

While investigating the test with Fisher’s G-statistic for functions with multiple frequencies, we found that this test fails to reject the null hypothesis for periods corresponding to the contributing frequency components and any of their multiples. Thus, rather than testing whether a value τ₀ is the period of the function, we define a set K containing the frequency components that contribute to the period of the function. The period of the function is then the least common multiple of the periods corresponding to the values in K. We would like to test whether a frequency component k ∈ (0, π] contributes to the period of the function. Thus, we test the composite null hypothesis,

H₀: k ∈ K

CHAPTER 5. RESULTS & COMPARISON OF METHODS

Figure 5.22: This figure shows the periodogram for a cosine function with period 5 on the left and the periodogram of the function when adding a sine function with period 6 on the right. We see that adding the sine function introduces an extra frequency component which is shown by the second peak at frequency 2π/6 = π/3.

Figure 5.23: A cosine function with period 5 plus a sine function with period 6. In this figure we see that the period of the function is 30, as the function values repeat every 30 time units.

against the alternative,

H₁: k ∉ K.

Note here that we are leveraging the first limitation, where the period of the function is a multiple of the periods corresponding to the contributing frequencies.

We evaluate this test in the same way as in the previous sections. However, due to the composite null hypothesis, our test statistic becomes the maximum of Fisher’s G-statistic over all k ∈ K. We use the function introduced above: a cosine function with period 5 plus a sine function with period 6. We also multiply this function by √2 so that the SNR is two. For this example, K = {2π/5, 2π/6}.
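The construction of the composite statistic and the SNR scaling can be sketched as follows. This is a hedged illustration, not the implementation used for the results: the helper names `periodogram`, `g_at` and `composite_statistic` are hypothetical, the per-frequency statistic is assumed to be the periodogram ordinate nearest to k divided by the sum of all ordinates (the definition given earlier in the thesis may differ), and the noise is assumed to be Gaussian with unit variance.

```python
import numpy as np

def periodogram(x):
    """Periodogram at the Fourier frequencies 2*pi*j/n, j = 1..n//2 (zero frequency dropped)."""
    n = len(x)
    ordinates = np.abs(np.fft.rfft(x - np.mean(x))) ** 2 / n
    freqs = 2 * np.pi * np.arange(len(ordinates)) / n
    return freqs[1:], ordinates[1:]

def g_at(x, k):
    """ASSUMED per-frequency statistic: ordinate nearest to k over the sum of all ordinates."""
    freqs, ordinates = periodogram(x)
    j = np.argmin(np.abs(freqs - k))
    return ordinates[j] / ordinates.sum()

def composite_statistic(x, K):
    """Maximum of the per-frequency statistic over the candidate set K."""
    return max(g_at(x, k) for k in K)

rng = np.random.default_rng(0)            # seed chosen arbitrarily
n = 150
t = np.arange(n)
signal = np.sqrt(2) * (np.cos(2 * np.pi * t / 5) + np.sin(2 * np.pi * t / 6))
x = signal + rng.standard_normal(n)       # unit-variance Gaussian noise (assumption)

snr = np.mean(signal ** 2) / 1.0          # average signal power over noise variance
K = [2 * np.pi / 5, 2 * np.pi / 6]
stat = composite_statistic(x, K)
off_peak = g_at(x, 2 * np.pi / 7)         # a frequency outside K, for contrast
print(snr, stat, off_peak)
```

The unscaled sum of a unit cosine and a unit sine at different frequencies has average power 1/2 + 1/2 = 1, so multiplying by √2 gives power 2, which under the unit-variance noise assumption corresponds to an SNR of two.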

From this function, 1000 data samples of length 150 were generated under the null hypothesis. The test was run on these samples for each of the permutation methods using 1000 Monte Carlo iterations. At significance level 0.05, the fraction of samples for which the test rejected under the null hypothesis was 0.0000 for all three permuting methods. Figure 5.24 shows the ROC curves for the different permuting methods. We see that the test performs best under permutation method 3. Note that the curve for method 1 is obscured by the curve for method 2. Overall, the test performs similarly to the test using Fisher’s G-statistic for functions with a single frequency.


Figure 5.24: ROC curves for the function cosine with period 5 plus sine with period 6 for the test using Fisher’s G-statistic.

Similar to the case with single frequency functions, there are two main limitations for this test.

For clarity we let K⁰ denote the periods corresponding to the frequencies in K. Then this test accepts frequencies with periods corresponding to values in K⁰. We also see a limitation similar to the second limitation for single frequency functions where the test accepts frequencies which correspond to fractions of multiples of the periods in K⁰. Thus, for this example the test will then also fail to reject H₀ for the frequencies,

2π/(5c) and 2π/(6c),

for all c ∈ Q⁺ such that the frequency lies in the interval (0, π]. These limitations are analogous to the ones addressed in Section 5.5. Therefore, solutions to the limitations in Section 5.5 will also address these.
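To make the range condition concrete: a frequency 2π/(pc) with period p lies in (0, π] exactly when pc ≥ 2. The sketch below enumerates a few rational values of c for the periods 5 and 6. The function name `accepted_frequency` and the chosen values of c are illustrative assumptions.

```python
from fractions import Fraction
import math

def accepted_frequency(period, c):
    """Frequency 2*pi/(period*c) if it lies in (0, pi], else None.

    The frequency is in range exactly when period * c >= 2.
    """
    scaled_period = Fraction(period) * Fraction(c)
    if scaled_period < 2:
        return None
    return 2 * math.pi / float(scaled_period)

# A few rational multipliers c, chosen only for illustration.
examples = {
    (p, c): accepted_frequency(p, c)
    for p in (5, 6)
    for c in (Fraction(1, 3), Fraction(1, 2), Fraction(1), Fraction(2))
}
for (p, c), omega in examples.items():
    print(f"period {p}, c = {c}: {omega}")
```

For instance, c = 1/3 is excluded for period 5 because 5/3 < 2, while for period 6 it gives the boundary frequency π.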

5.6.2 Other Tests

Unlike the test using Fisher’s G-statistic, the other tests did reject the null hypothesis for periods corresponding to the individual contributing frequency components. Thus, the test is the same as before, where we test

H₀: f has period τ₀

against the alternative,

H₁: f does not have period τ₀

for τ₀ ∈ Q such that τ₀ ≥ 2 due to the Nyquist frequency. To evaluate the tests, we compute the fraction of data sets for which each test incorrectly rejects the null hypothesis and plot the ROC curves for the same example used for the test with Fisher’s G-statistic. Furthermore, the same algorithm parameters were used as for Fisher’s G-statistic. The results can be found in Table 5.7 and Figures 5.25, 5.26, and 5.27 for the different permuting methods.

In Table 5.7, Bartlett’s and Welch’s methods have high values. As in Section 5.3, there are many contributing errors that could cause higher values. However, the value of Welch’s method for permuting method 3 is still considered large. This large value stems from the averaging performed by Welch’s method, which already causes large fractions in the case of functions with a single frequency. Also, for this function, the peaks are close together (see Figure 5.22). Thus, with the added noise and the averaging, it is possible that the two peaks merge into one peak which does not correspond to period 5 or 6. This causes the test to incorrectly reject the null hypothesis for period 30.

Test Statistic    Bartlett    Welch    Lomb-Scargle
Method 1          0.064       0.060    0.026
Method 2          0.064       0.063    0.021
Method 3          0.066       0.121    0.024

Table 5.7: Fraction of data sets for which each test rejected under the null hypothesis, for the different permutation methods, over 1000 data sets generated from a cosine function with period 5 plus a sine function with period 6. The length of the data was set to 150 and the Monte Carlo parameter was set to 1000 permutations.
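One way to put the fractions in Table 5.7 into context is to compare them with the Monte Carlo standard error of a rejection fraction. The sketch below is an illustrative calculation and not part of the original analysis: it assumes independent data sets (binomial sampling) and assumes that the nominal level 0.05 from Section 5.6.1 applies here as well, given that the same algorithm parameters were used.

```python
import math

n_datasets = 1000          # number of simulated data sets (from Table 5.7)
alpha = 0.05               # nominal significance level (assumed, see lead-in)

# Standard error of an estimated rejection fraction if the true rate were alpha,
# assuming independent data sets (binomial sampling).
std_err = math.sqrt(alpha * (1 - alpha) / n_datasets)

table = {                  # rejection fractions copied from Table 5.7
    "Method 1": {"Bartlett": 0.064, "Welch": 0.060, "Lomb-Scargle": 0.026},
    "Method 2": {"Bartlett": 0.064, "Welch": 0.063, "Lomb-Scargle": 0.021},
    "Method 3": {"Bartlett": 0.066, "Welch": 0.121, "Lomb-Scargle": 0.024},
}

# Distance of each entry from the nominal level, in standard errors.
z_scores = {
    method: {test: (frac - alpha) / std_err for test, frac in row.items()}
    for method, row in table.items()
}
for method, row in z_scores.items():
    print(method, {test: round(z, 1) for test, z in row.items()})
```

Under these assumptions the standard error is about 0.007, so the Welch entry for method 3 lies roughly ten standard errors above 0.05, whereas the Bartlett entries lie about two standard errors above it.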

Considering the ROC curves in Figures 5.25, 5.26, and 5.27, we see that the test using Bartlett’s method performs better than the other tests, performing best overall for permuting method 3.

Moreover, the test using Welch’s method improves quite drastically for permuting method 3. The reasons behind this are the same as before: permuting method 3 changes the neighbourhood of observations more drastically, creating more variation in the results. Note that these results are also affected by the limitations discussed in Section 5.5.

Lastly, we compare these results to those found in Section 5.6.1 for the test using Fisher’s G-statistic to find the best test overall. We do this by computing the areas under the ROC curves. We find that for permuting method 3, the test using Bartlett’s method and the test using Fisher’s G-statistic have almost the same area, 0.9656 and 0.9650, respectively. Similarly, for permuting method 2, the tests using Fisher’s G-statistic and Bartlett’s method have similar areas of approximately 0.9443. However, for permuting method 1 the test using Fisher’s G-statistic is the best, with an area of 0.9475.
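An area under a ROC curve can be approximated with the trapezoidal rule. The sketch below shows one way to do this and is not necessarily how the areas above were obtained: the helper names `roc_points` and `area_under_curve` are hypothetical, the ROC points are assumed to come from rejecting whenever a p-value falls at or below a threshold, and the p-values are toy numbers invented for illustration.

```python
import numpy as np

def roc_points(p_null, p_alt, thresholds):
    """False and true positive rates when rejecting for p-values <= threshold."""
    p_null = np.asarray(p_null)
    p_alt = np.asarray(p_alt)
    fpr = np.array([np.mean(p_null <= a) for a in thresholds])
    tpr = np.array([np.mean(p_alt <= a) for a in thresholds])
    return fpr, tpr

def area_under_curve(fpr, tpr):
    """Trapezoidal-rule area under a ROC curve given as points sorted by fpr."""
    fpr = np.asarray(fpr, dtype=float)
    tpr = np.asarray(tpr, dtype=float)
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))

# Toy p-values, invented purely for illustration (not results from this thesis).
p_null = [0.10, 0.40, 0.70, 0.90]
p_alt = [0.01, 0.02, 0.05, 0.60]
thresholds = np.linspace(0, 1, 101)
fpr, tpr = roc_points(p_null, p_alt, thresholds)
auc = area_under_curve(fpr, tpr)
print(auc)
```

A test that cannot distinguish the two hypotheses gives the diagonal with area 0.5, and a perfect test gives area 1, so areas near 0.96 indicate strong separation.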

Although it may seem that Fisher’s G-statistic is better, the test using Bartlett’s method is less affected by the limitations described in Section 5.5 than the test using Fisher’s G-statistic. The test using Bartlett’s method is affected by fewer multiples, because only multiples of τ need to be considered, whereas for Fisher’s G-statistic all multiples of the periods corresponding to the contributing frequencies must be considered. For functions with more than two contributing frequencies, this will have a more noticeable effect. Therefore, the test using Bartlett’s method with permuting method 3 is recommended for functions with multiple frequencies.

Note that the functions used to evaluate the tests are quite basic. However, almost all periodic functions can be written as an infinite sum of cosines and sines, and thus this forms a solid basis for more complicated functions. Preliminary testing for this already looks very promising.

For functions such as

∑_{i=1}^{n} (1/i) cos(2πix/τ),

which can be used to approximate functions for n large, the tests using Fisher’s G-statistic and Bartlett’s method perform well. Therefore, the tests should work for more complicated functions as well, especially if these functions are smooth, i.e. have continuous derivatives of high order.
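A truncated cosine sum of this kind is easy to evaluate numerically. The sketch below is illustrative only: it assumes harmonic weights 1/i, and the function name `partial_sum` as well as the values of τ and n are hypothetical choices not taken from the thesis. It checks that the sum has period τ regardless of the number of terms.

```python
import numpy as np

def partial_sum(x, tau, n):
    """Sum over i = 1..n of (1/i) * cos(2*pi*i*x/tau); the 1/i weights are an assumption."""
    i = np.arange(1, n + 1)[:, None]          # harmonic index as a column
    return np.sum(np.cos(2 * np.pi * i * x / tau) / i, axis=0)

tau, n = 5.0, 50                               # illustrative values, not from the thesis
x = np.linspace(0.0, 10.0, 201)
has_period_tau = np.allclose(partial_sum(x, tau, n), partial_sum(x + tau, tau, n))
print(has_period_tau)
```

Every term has a period that divides τ, so the periodogram of such a function shows peaks at the harmonics 2πi/τ rather than at a single frequency.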


Figure 5.25: ROC curves for the cosine plus sine function for permuting method 1 (Permuting Blocks).

Figure 5.26: ROC curves for the cosine plus sine function for permuting method 2 (Permuting Sub-blocks).

Figure 5.27: ROC curves for the cosine plus sine function for permuting method 3 (Permuting Observations).


In document: Eindhoven University of Technology, BACHELOR, Testing for the Period of a Function using Permutation Methods, Freyer, Caroline (pages 47-52)