
3.3 Known Approaches

3.3.7 Testing using Lomb-Scargle Periodogram

The Lomb-Scargle Periodogram is another LSSA method. Scargle (1982) modifies the standard periodogram value using a time delay ℓ, which is chosen such that the pair of sinusoids is mutually orthogonal at the time samples t = 1, . . . , n.

Definition 3.3.1. (Time Delay) The time delay ℓ is defined by

tan(2ωℓ) = ( Σ_{t=1}^{n} sin(2tω) ) / ( Σ_{t=1}^{n} cos(2tω) ).   (3.32)

Then we can define the Lomb-Scargle Periodogram as follows:

Definition 3.3.2. (Lomb-Scargle Periodogram value) The Lomb-Scargle Periodogram value of the time series {Xt: t = 1, . . . , n} at frequency ω is

PX(ω) = (1/2) [ ( Σ_{t=1}^{n} Xt cos(ω(t − ℓ)) )² / Σ_{t=1}^{n} cos²(ω(t − ℓ)) + ( Σ_{t=1}^{n} Xt sin(ω(t − ℓ)) )² / Σ_{t=1}^{n} sin²(ω(t − ℓ)) ].   (3.33)

Scargle (1982) shows that PX(ω) has the same value as that obtained by a least-squares fit of sinusoids at any frequency ω. This implies that least-squares fitting and the Lomb-Scargle Periodogram are exactly equivalent.
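To make the computation concrete, the following is a minimal numpy sketch of eqs. (3.32)-(3.33) for unit-spaced times t = 1, . . . , n; the function name lomb_scargle and the example signal are our own illustration, not part of Scargle (1982).

```python
import numpy as np

def lomb_scargle(x, omega):
    """Lomb-Scargle Periodogram value P_X(omega) for observations x at times t = 1, ..., n.

    A sketch of eqs. (3.32)-(3.33); assumes unit-spaced times and omega != 0.
    """
    t = np.arange(1, len(x) + 1)
    # Time delay ell: tan(2*omega*ell) = sum sin(2*t*omega) / sum cos(2*t*omega)   (3.32)
    ell = np.arctan2(np.sum(np.sin(2 * omega * t)),
                     np.sum(np.cos(2 * omega * t))) / (2 * omega)
    c = np.cos(omega * (t - ell))
    s = np.sin(omega * (t - ell))
    # By the choice of ell, the cosine and sine terms are orthogonal over the sample times,
    # so the two quadratic forms in (3.33) can simply be added.
    return 0.5 * ((x @ c) ** 2 / (c @ c) + (x @ s) ** 2 / (s @ s))

# Illustration: a noisy sinusoid with frequency 2*pi/7 gives a large periodogram value there.
rng = np.random.default_rng(0)
t = np.arange(1, 61)
x = np.sin(2 * np.pi * t / 7) + 0.5 * rng.standard_normal(60)
print(lomb_scargle(x, 2 * np.pi / 7))
```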

Let ω ∈ [−π, π]. Then we use PX(ω) as the test statistic to test

H0: f does not have frequency ω, against
H1: f has frequency ω.

We reject H0 if PX(ω) is large enough. The exact distribution of PX can be found in Scargle (1982). Scargle (1982) also derives a critical value z0 for the test, which is defined by

z0 = −ln( 1 − (1 − α)^{2/n} ),   (3.34)

where α is the significance level of the test. For small α we find that

z0 ≈ 4.6 + ln(n/2).   (3.35)

Thus, if n = 60 and α = 0.01, PX(ω) must be greater than 8 to reject H0 at significance level α. This shows that only large signals can be reliably detected, especially if the data is noisy.
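As a sanity check on these numbers, (3.34) and its approximation (3.35) can be evaluated directly; the snippet below is only an illustration of the two formulas.

```python
import math

n, alpha = 60, 0.01
z0_exact = -math.log(1 - (1 - alpha) ** (2 / n))  # critical value, eq. (3.34)
z0_approx = 4.6 + math.log(n / 2)                 # approximation, eq. (3.35)
print(round(z0_exact, 2), round(z0_approx, 2))    # both are approximately 8.0
```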

Moreover, the Lomb-Scargle Periodogram has a high asymptotic error, in the sense that for smaller values of n the test is more likely to reject H0.

Chapter 4

Testing using Permutation Methods

Permutation methods allow us to find an exact distribution of a test statistic, provided that the observations are exchangeable under the null hypothesis. For the problem given in Section 3.1, this means that the distribution of the sample data must be invariant under the permutations if the period of f is τ0. Thus, there are constraints on the permutations that can be made; we cannot apply permutations that alter this period. In this chapter, three different methods for permuting the observations under these constraints are explained. After that, we check whether the test statistics given in Chapter 3 remain constant under these permutations. This check determines whether a test statistic is useful.

If a test statistic stays constant under the permutations, its distribution cannot be found and no conclusions can be drawn.

4.1 Methods for Permuting

Given that τ0 ∈ Q+ ∪ {0}, we know that

τ0 = p/q   (4.1)

for p, q ∈ N and q ≠ 0. Note that if τ0 ∈ N, then q = 1. For simplicity, we assume that p/q is an irreducible fraction; in other words, p and q have no common divisors. In addition, we define

m = ⌊n/p⌋   (4.2)

and only consider the observations X1, . . . , Xmp to simplify computations. In this case, we disregard the observations Xmp+1, . . . , Xn. For large n, the number of disregarded observations will be small compared to mp.
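The reduction of τ0 to the irreducible fraction p/q in (4.1) and the truncation to mp observations in (4.2) are easy to compute; the helper below is an illustrative sketch (the name block_parameters is ours).

```python
from fractions import Fraction

def block_parameters(tau0, n):
    """Return (p, q, m) for a rational period tau0 = p/q (in lowest terms)
    and n observations, with m = floor(n / p) as in eqs. (4.1)-(4.2)."""
    tau0 = Fraction(tau0)
    p, q = tau0.numerator, tau0.denominator
    m = n // p
    return p, q, m

# Example from Figure 4.1: tau0 = 7/3 and n = 17 give p = 7, q = 3, m = 2,
# so the observations X15, X16, X17 are disregarded.
print(block_parameters(Fraction(7, 3), 17))
```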

Under the null hypothesis, we assume that f has period τ0 and thus, we cannot permute observations in a way that would alter this period. The first method for permuting the observations uses the concept of blocks. A block is an interval of time in (0, mp]. We denote the blocks by Bi for i = 1, . . . , m, where

Bi= ((i − 1) · p, i · p]. (4.3)

Thus, we divide the interval (0, mp] into m equally sized blocks with start and end points at consecutive multiples of q · τ0 = p ∈ N. In this case, every block corresponds to q periods. The second method uses sub-blocks. A sub-block is also a time interval denoted by Bij for i = 1, . . . , m and j = 1, . . . , q where

Bij= ((i − 1) · p + (j − 1) · τ0, (i − 1) · p + j · τ0]. (4.4)


In this case, we divide each block into q sub-blocks of length τ0. Lastly, we have the individual observations at times t = 1, . . . , mp, which are used in the third method. Figure 4.1 illustrates these concepts. In Figure 4.1, the function has period 7/3. Thus, under H0, p = 7 and q = 3. In this example, we have observations X1, . . . , X17. However, m = ⌊17/7⌋ = 2, which implies that we disregard X15, X16, X17, as we only consider observations that can be part of a complete block, to simplify computations. The blocks for this example are B1 = (0, q · 7/3] = (0, 7] and B2 = (7, 14]. We see that each of these blocks encloses q = 3 periods. Thus, we can divide each block into three sub-blocks. For B1 we get the sub-blocks B11 = (0, 7/3], B12 = (7/3, 14/3], B13 = (14/3, 7]. Lastly, the red dots at each time unit are the observations.

Figure 4.1: Illustration of the difference between blocks, sub-blocks and observations. The blocks are denoted by Bi and are time intervals defined by consecutive multiples of p. Sub-blocks subdivide the blocks into q intervals of length τ0 = p/q. The observations are illustrated by the red dots at times t = 1, . . . , 17. The colours indicate which sub-blocks can be swapped with each other.

Now that blocks, sub-blocks, and observations are defined, we describe three different methods for permuting under the null hypothesis which ensure that the data sample’s distribution remains invariant under these permutations.

Method 1: Permuting Blocks Each block starts and ends at a multiple of τ0. Thus, under the null hypothesis the blocks have the same structure by definition of periodicity. Therefore, we can permute the different blocks under H0. Although we state that we are permuting the blocks, we are actually swapping the observations that are contained in the blocks. If we swap Bi and Bi′, the times of the observations in Bi are incremented by i′p − ip and the times of the observations in Bi′ are incremented by ip − i′p. Thus, the times of the observations are increased or decreased by a multiple of p, yet their relative positions within their blocks remain unchanged.

For this method, we divide the time interval (0, mp] into m different blocks. These blocks can be permuted in m! different ways. For each permutation, the value of the test statistic can be computed and we can build a histogram describing its probability mass function.
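A minimal sketch of Method 1, assuming the observations are stored in a numpy array of length mp and that the observation at index k corresponds to time t = k + 1; the function name is illustrative.

```python
import numpy as np

def permute_blocks(x, p, rng):
    """Method 1: randomly permute the m blocks of length p of x (length m*p).
    Whole blocks are moved as units, so each observation keeps its relative
    position inside its block and its time changes by a multiple of p."""
    m = len(x) // p
    blocks = x[: m * p].reshape(m, p)          # row i holds the observations of block B_{i+1}
    return blocks[rng.permutation(m)].ravel()  # shuffle the rows, then flatten back

rng = np.random.default_rng(0)
x = np.arange(1, 15)                # mp = 14 observations with p = 7, so m = 2
print(permute_blocks(x, 7, rng))    # either the identity or the two blocks swapped
```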

Method 2: Permuting Sub-blocks Under the null hypothesis we can permute the sub-blocks that are in the same relative position in their blocks. As seen in Figure 4.1, the label of a sub-block depends on its block. For example, sub-block B12 is in block B1. If two sub-blocks have the same second index, they are in the same relative position within their blocks. Therefore, we can permute the sub-blocks that have the same second index. For example, in Figure 4.1 we can permute the sub-blocks that have the same background colour.


If we define the sets Dj = {B1j, . . . , Bmj} for j = 1, . . . , q, we see that we can permute the sub-blocks that are in the same set. As in Method 1, note that permuting sub-blocks means swapping the observations in the sub-blocks. Therefore, the times of the observations are translated by a multiple of p. However, this does not change the relative positions of the observations in the blocks.

For this method we have (m!)^q different permutations, as we have q sets of permutable sub-blocks, each of size m. For each of these permutations we can then compute the value of the test statistic and derive its probability mass function. Note that for q = 1, Method 2 is equivalent to Method 1. Therefore, we only have two different methods for permuting if τ0 ∈ N.
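A sketch of Method 2 under the same storage convention (index k holds the observation at time t = k + 1); within each set Dj the sub-blocks are shuffled independently. The helper name and the use of exact fractions for the sub-block boundaries are our own choices.

```python
import numpy as np
from fractions import Fraction

def permute_subblocks(x, p, q, rng):
    """Method 2: for each j = 1, ..., q, randomly permute the sub-blocks
    B_1j, ..., B_mj (i.e. the sub-blocks in the same relative position)."""
    m = len(x) // p
    tau0 = Fraction(p, q)
    out = x[: m * p].copy()
    for j in range(1, q + 1):
        # 0-based indices of the observations falling in sub-block B_ij, for each block i;
        # these index sets all have the same size because block boundaries are integers.
        idx = [np.array([t - 1 for t in range(1, m * p + 1)
                         if (i - 1) * p + (j - 1) * tau0 < t <= (i - 1) * p + j * tau0],
                        dtype=int)
               for i in range(1, m + 1)]
        perm = rng.permutation(m)
        for i in range(m):
            out[idx[i]] = x[idx[perm[i]]]
    return out

rng = np.random.default_rng(0)
x = np.arange(1, 15)                    # p = 7, q = 3, m = 2 (the Figure 4.1 setting)
print(permute_subblocks(x, 7, 3, rng))
```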

Method 3: Permuting Observations Similar to Method 2, we can permute individual observations that are in the same relative position in their blocks. Therefore, we can permute the observations whose times are in the same equivalence class of the relation modulo p. For example, we can swap the observations at t = 1 and t = 8 in Figure 4.1, as p = 7. Thus, there are p sets defining which observations are exchangeable under the null:

S0 = {Xp, X2p, . . . , Xmp}
S1 = {X1, Xp+1, . . . , Xmp−p+1}
. . .
Sp−1 = {Xp−1, X2p−1, . . . , Xmp−1}

Any observation in Si can be switched with another observation in Si for i = 0, . . . , p − 1. Each of the p sets has size m and for each set we have |Si|! permutations, making the total number of permutations (m!)^p. If we compute the value of the test statistic for each of these permutations, we can find its probability mass function.
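A sketch of Method 3 under the same convention: reshaping the mp observations into an m × p array puts the members of each set Si in the same column, so each column can be shuffled independently. The function name is illustrative.

```python
import numpy as np

def permute_observations(x, p, rng):
    """Method 3: independently permute the observations within each residue
    class of the times modulo p (the sets S_0, ..., S_{p-1})."""
    m = len(x) // p
    cols = x[: m * p].reshape(m, p).copy()   # column k holds the times k+1, p+k+1, ..., (m-1)p+k+1
    for k in range(p):
        cols[:, k] = cols[rng.permutation(m), k]
    return cols.ravel()

rng = np.random.default_rng(0)
x = np.arange(1, 15)                      # p = 7, m = 2
print(permute_observations(x, 7, rng))
```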

Regardless of the method used, we see that the number of permutations can be very large, especially if n is large and p is small. Therefore, in many cases it is not feasible to compute all permutations, and Monte Carlo methods are used. We can use Monte Carlo methods as we only need to evaluate probabilities under the null hypothesis and we can easily generate samples from the permutation distribution (Ernst, 2004). Thus, we generate a number of samples from the permutation distribution and approximate the p-value.
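A minimal Monte Carlo sketch of this p-value approximation, assuming a test statistic for which large values are evidence against H0; the add-one correction and the function names are our own choices, not prescribed by the text.

```python
import numpy as np

def monte_carlo_pvalue(x, statistic, permute, n_samples=999, seed=1):
    """Approximate the permutation p-value of `statistic` at the observed data x.

    `permute(x, rng)` should return one random rearrangement of x drawn from
    one of the schemes of Methods 1-3."""
    rng = np.random.default_rng(seed)
    observed = statistic(x)
    exceed = sum(statistic(permute(x, rng)) >= observed for _ in range(n_samples))
    # Include the observed sample itself so the estimate is never exactly zero.
    return (exceed + 1) / (n_samples + 1)
```

Any test statistic from Chapter 3 can be plugged in as `statistic`, and any of the permutation sketches above as `permute`.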

Lastly, it is important to note that in order for the permutations to be made without altering the period of the function, it is essential to assume that the times are equally spaced. Otherwise, observations will no longer be in the same relative positions when swapped according to the methods mentioned above.