
Efficiency of change-point tests

Citation for published version (APA):

Praagman, J. (1986). Efficiency of change-point tests. Technische Hogeschool Eindhoven. https://doi.org/10.6100/IR246345

DOI: 10.6100/IR246345

Document status and date: Published: 01/01/1986
Document version: Publisher's PDF, also known as Version of Record (includes final page, issue and volume numbers)



(3)

efficiency of change-point tests

PROEFSCHRIFT

Thesis submitted for the degree of doctor in the technical sciences at the Technische Hogeschool Eindhoven, by authority of the rector magnificus, Prof. dr. F.N. Hooge, to be defended in public before a committee appointed by the board of deans on Friday 27 June 1986 at 16.00 hours

by

JAKOB PRAAGMAN

This thesis has been approved by the promotors:

prof.dr. P.C. Sander and

CONTENTS

1. INTRODUCTION
   1. Objective
   2. The change-point problem
   3. Test statistics
   4. Bahadur efficiency
   5. Bahadur optimality
   6. Preview of the subsequent chapters
2. TESTS FOR A CHANGE IN THE MEAN OF NORMAL VARIATES
   1. Introduction
   2. Preliminaries
   3. Variance known
      1. Sum-type statistics
      2. Max-type statistics
   4. Variance unknown
      1. Sum-type statistics
      2. Max-type statistics
   5. Some comparisons and examples
3. TESTS BASED ON SIMPLE LINEAR RANK STATISTICS
   1. Introduction
   2. Slopes of S_N and M_N
      1. S_N and M_N as functions of e.d.f.'s
      2. Large deviations
         1. Sum-type statistics
         2. Max-type statistics
      3. Almost sure limits
         1. Sum-type statistics
         2. Max-type statistics
   3. Efficiency of S_N with respect to M_N
   4. Bahadur efficiencies at local alternatives
   5. Median scores
4. PITMAN EFFICIENCY
   1. Introduction
   2. Sum-type statistics
   3. Max-type statistics
   4. The likelihood ratio test
5. POWER FOR SMALL SAMPLE SIZES
   1. Introduction
   2. The simulations
   3. Results
REFERENCES
SAMENVATTING (summary in Dutch)
CURRICULUM VITAE

1. INTRODUCTION

1.1. Objective

Let X_1, X_2, ..., X_N be a sequence of independent random variables. This sequence is said to have a change-point at n, 1 ≤ n < N, if X_i, i = 1, ..., n, has distribution function F(x) and X_i, i = n+1, ..., N, has distribution function G(x), G ≠ F. We consider the problem of testing the null hypothesis of no change against the alternative of a one-sided change, G < F, at an unknown change-point n. Thus, defining a class of distributions F, consider the testing of

   H_0: X_i ~ F, i = 1, ..., N;  F ∈ F

against the alternative

(1.1.1)   H_a: X_i ~ F, i = 1, ..., n;  F ∈ F;
               X_i ~ G, i = n+1, ..., N;  G ∈ F,  G < F,

with n, F and G unknown.

Henceforth, this will be called the change-point problem, abbreviated as c.p.p.

Various test statistics have been proposed for (1.1.1), and thus mutual comparisons are of interest. For most of these statistics, however, no manageable expressions exist for the distribution under H_0 or under the alternative hypothesis. As a result, simple power comparisons cannot be made, unless we resort to Monte Carlo experiments. In fact, the power comparisons reported in the literature are almost all based on such experiments. In addition, for the c.p.p., no uniformly most powerful test exists. Therefore, we will rely on asymptotics and use asymptotic efficiency as our standard.
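Such a Monte Carlo power comparison is easy to sketch. The following Python fragment is our own illustration (function names and parameter choices are not from the text): it estimates the power of a sum-type test (a Chernoff-Zacks-style weighted sum) and a max-type test (the maximum of the standardized two-sample mean differences) for a normal location shift, with critical values themselves estimated by simulation under the null hypothesis.

```python
import math
import random

def cusum_stats(xs):
    """Return (sum_type, max_type) change-point statistics for a sample.

    sum_type is the Chernoff-Zacks-style weighted sum sum_i (i-1)(X_i - mean);
    max_type is the maximum over k of the standardized two-sample mean
    difference sqrt(k(N-k)/N) * (tail mean - head mean).
    """
    n = len(xs)
    mean = sum(xs) / n
    sum_type = sum(i * (x - mean) for i, x in enumerate(xs))
    max_type = -math.inf
    head = 0.0
    total = sum(xs)
    for k in range(1, n):
        head += xs[k - 1]
        t = math.sqrt(k * (n - k) / n) * ((total - head) / (n - k) - head / k)
        max_type = max(max_type, t)
    return sum_type, max_type

def power(delta, lam, n=50, reps=2000, alpha=0.05, seed=1):
    """Monte Carlo power of both tests; critical values estimated under H0."""
    rng = random.Random(seed)
    null = [cusum_stats([rng.gauss(0, 1) for _ in range(n)]) for _ in range(reps)]
    crit_sum = sorted(s for s, _ in null)[int((1 - alpha) * reps)]
    crit_max = sorted(m for _, m in null)[int((1 - alpha) * reps)]
    change = int(lam * n)
    hits_sum = hits_max = 0
    for _ in range(reps):
        xs = [rng.gauss(0, 1) for _ in range(change)] + \
             [rng.gauss(delta, 1) for _ in range(n - change)]
        s, m = cusum_stats(xs)
        hits_sum += s > crit_sum
        hits_max += m > crit_max
    return hits_sum / reps, hits_max / reps
```

For example, `power(1.0, 0.5)` estimates both powers at a unit shift halfway through the sample; both should lie well above the nominal level.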


The main purpose of this study is to derive asymptotic efficiencies for two rather general classes of test statistics and to compare these asymptotic results with power calculations, mainly based on Monte Carlo experiments, for samples of small and moderate sizes.

From the different asymptotic efficiency concepts available, we chose those of Bahadur and Pitman. Since the latter is well known, we will give a short description of the Bahadur approach only (sections 1.4 and 1.5), but first we take a closer look at the c.p.p. (section 1.2) and introduce the classes of statistics that will be considered (section 1.3).

Remark 1.1.1.

Note that the initial distribution, i.e. that of X_1, is also supposed to be unknown. Starting with a known distribution leads to a slightly different problem, for which results similar to those presented in the following chapters can be shown to hold. □

The usefulness of asymptotic measures in the change-point situation is somewhat questionable, because the assumption of only one change may make little sense when the sample size is increased by taking observations over an extended period of time. However, this objection can be overcome in those cases where it is possible to increase the sample size by sampling more frequently during a fixed time period. Thus, we will assume that for N → ∞, n/N tends to some λ ∈ (0,1).

1.2. The change-point problem

Some specific statistical problems adequately fit a change-point description like (1.1.1). In some applications, a possible change in distribution can be related to an apparent causal event at a time that can be precisely specified. However, in many other cases, potential causal events cannot be identified with reasonable confidence, and the timing of the possible shift point is uncertain.

- An obvious application of change-point tests is the testing for stability of some critical product characteristic within a statistical quality control setting.

- Time series of the annual rainfall at a certain spot, or of any other meteorological variable, may be used in a retrospective analysis to signal a possible change in climate.

- Breast-cancer patients, treated with radical mastectomy, are postoperatively screened with different tests for metastases. This screening takes place at regular intervals. For some biochemical tests it is known that abnormal values arise in patients with advanced metastatic disease. But then, detection by clinical means is possible too. However, the question is whether these tests can indicate metastases at an earlier stage than the conventional techniques; more specifically: can any evidence for metastases be seen in the sequence of test results? Therefore, the sequence of measurements for a patient, after the primary treatment but before the definite detection of metastases by clinical means, is inspected for a possible change-point. When normal values are available, the initial level can be set at them, but in case of big individual differences between patients, the initial level should be taken as unknown.

- According to a widely accepted model of the Stock Exchange, the share price P_t at time t is P_t = P_{t-1} + X_t, where X_t ~ N(μ_t, σ²) and the X_t are independent. A bull market is one in which μ_t > 0, a bear market one in which μ_t < 0. If we want to know whether the market has changed its position during an interval, we may set up the hypothesis H_0: μ_t = μ for all t, against H_a: μ_t = μ_1 for t < τ and μ_t = μ_2 for t > τ, for some τ. And a series of week-to-week differences in the Dow Jones industrial average can be used to test H_0.
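The stock-market setup can be simulated directly. A minimal sketch (ours; the values of τ, μ_1 and μ_2 are illustrative only) generates a random-walk price series with a drift change and recovers the difference series that a change-point test would inspect:

```python
import random

def simulate_prices(n_weeks, tau, mu1, mu2, sigma=1.0, p0=100.0, seed=7):
    """Random-walk share price P_t = P_{t-1} + X_t with drift mu1 for t < tau
    and drift mu2 afterwards (a bull market turning into a bear market)."""
    rng = random.Random(seed)
    prices = [p0]
    for t in range(1, n_weeks + 1):
        mu = mu1 if t < tau else mu2
        prices.append(prices[-1] + rng.gauss(mu, sigma))
    return prices

# The week-to-week differences recover the X_t sequence to be tested
# for a change in mean.
prices = simulate_prices(104, tau=61, mu1=0.5, mu2=-0.5)
diffs = [b - a for a, b in zip(prices, prices[1:])]
```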

Many authors have studied testing problems like (1.1.1) or related inferential questions, such as the estimation of the change-point n, or the estimation of the kind or magnitude of a possible change, mostly under some further assumptions concerning the class of distributions F, the kind of change expected and possible prior knowledge about the location of the change. These studies lead to estimators and test statistics in parametric and non-parametric, classical and Bayesian frameworks. A recent survey was given by Zacks (1983). E.g. for inference concerning the location of the change-point n, the general Bayesian framework was given by Broemeling (1974), Smith (1975) and others, whilst Hinkley (1970) used the classical approach and studied the maximum-likelihood estimator. The same estimation problem, but under less stringent conditions, was investigated by Pettitt (1980), who proposed an estimator based on Mann-Whitney type statistics for a general location change model.

In most of the literature on the testing problem, the investigators have concentrated on models that deal with possible changes in only one of the parameters of the distribution function F. See, amongst others, Hsu (1977), Talwar and Gentle (1981) and Hsieh (1984) for changes in the scale parameter; Brown, Durbin and Evans (1975), McCabe and Harrison (1980) for the problem of switching regressions; and Matthews and Farewell (1982) and Matthews, Farewell and Pyke (1985), who considered the problem of testing for constant hazard against a change-point alternative. However, the location change model is by far the one most studied.


Remark 1.2.1.

There are numerous papers on closely related dynamic control problems, all of which deal in one way or another with the problem of a shift at unknown time points. In these problems the X_i become available one after another and the series is tested for a possible change after each new arrival. (See e.g. Lorden (1971), Bhattacharya and Frierson (1981) and Pollak (1985).) □

1.3. Test statistics

As previously, denote the null hypothesis by H_0 and let H_k, for k = 1, ..., N-1, denote the hypothesis X_i ~ F, i = 1, ..., k, and X_i ~ G, i = k+1, ..., N; F > G. The c.p.p. requires a test of H_0 against the union of H_1, ..., H_{N-1}, whilst the test of H_0 against H_k is a two-sample test for every k, with sample sizes k and N-k. Thus the alternative for the c.p.p. can be conceived of as the union of the alternatives for the N-1 two-sample problems with k = 1, 2, ..., N-1 respectively. This relationship is reflected in the statistics used for the c.p.p. Let T_{N,k} denote a two-sample statistic for the samples X_1, ..., X_k and X_{k+1}, ..., X_N; then there are two obvious ways to define a c.p.p. statistic:

- the sum-type statistic

(1.3.1)   S_N = Σ_{k=1}^{N-1} c_{N,k} T_{N,k}

- the max-type statistic

(1.3.2)   M_N = max_{1≤k<N} c_{N,k} T_{N,k}

where the c_{N,k} are nonnegative weight coefficients. Note that, from a Bayesian point of view, c_{N,k} / Σ_{j=1}^{N-1} c_{N,j} can be interpreted as the prior probability that X_{k+1} is the initial shifted variable. Most of the statistics proposed for the c.p.p. can be written in one of the forms (1.3.1) and (1.3.2), or as a minor modification.
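The two general forms can be sketched computationally. In the fragment below (ours), the two-sample statistic T_{N,k} is taken to be the standardized difference of means for normal observations with known unit variance; any other two-sample statistic could be substituted:

```python
import math

def two_sample_T(xs, k):
    """Standardized difference of means for the split after position k
    (normal observations, known unit variance assumed)."""
    n = len(xs)
    head = sum(xs[:k]) / k
    tail = sum(xs[k:]) / (n - k)
    return math.sqrt(k * (n - k) / n) * (tail - head)

def sum_type(xs, weights):
    """S_N = sum over k of c_{N,k} * T_{N,k}  (form 1.3.1)."""
    return sum(c * two_sample_T(xs, k)
               for k, c in zip(range(1, len(xs)), weights))

def max_type(xs, weights):
    """M_N = max over k of c_{N,k} * T_{N,k}  (form 1.3.2)."""
    return max(c * two_sample_T(xs, k)
               for k, c in zip(range(1, len(xs)), weights))
```

With uniform weights c_{N,k} = 1 and an obvious upward shift halfway through the sample, the maximum in `max_type` is attained at the true change-point.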

An exception worth mentioning is Page's test. Page (1955, 1957) was one of the first to consider the c.p.p. in a retrospective way as we do, but he drew his inspiration from tests that already existed for the sequential version of this problem (see remark 1.2.1). For the case when the X_i take the values 1 and -1 with P(X_i = 1) = P(X_i = -1) = 1/2, he proposed (take X_0 = 0)

   T_N^P = max_{1≤n≤N} ( Σ_{i=1}^n X_i - min_{0≤k≤n} Σ_{i=0}^k X_i )

for testing against a one-sided change: P(X_i = 1) > 1/2 for i > n. Within the brackets is seen the well-known cusum statistic, which is used as a sequential test for, a.o., quality control purposes. (In his 1957 paper, Page proposed a similar statistic for normally distributed X_i.) This T_N^P can be written as

   T_N^P = max_{0≤k<n≤N} Σ_{i=k+1}^n X_i,

from which it is immediately clear that T_N^P(X_1, ..., X_N) = T_N^P(X_N, ..., X_1), a curious property for a one-sided test! In fact, the retrospective and the sequential test situation differ too much for the statistics to be interchangeable. In the sequel these statistics will not be considered and the general forms (1.3.1) and (1.3.2) will be taken as the starting point in the following chapters.
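By the second representation, Page's statistic is the largest partial-sum climb, i.e. the maximum sum over any nonempty contiguous segment, which can be computed in one pass. A small sketch (ours) that also exhibits the reversal symmetry noted above:

```python
def page_statistic(xs):
    """Page's T_N^P: max over 0 <= k < n <= N of sum(xs[k:n]), computed as the
    largest gap between a running partial sum and the running minimum of the
    earlier partial sums (S_0 = 0)."""
    best = float("-inf")
    running_min = 0.0   # min of partial sums S_0, ..., S_{n-1}
    s = 0.0
    for x in xs:
        s += x
        best = max(best, s - running_min)
        running_min = min(running_min, s)
    return best
```

Because a segment sum is unchanged when the whole sequence is reversed, `page_statistic(xs)` equals `page_statistic(xs[::-1])`, which is exactly the curious symmetry property.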

1.4. Bahadur efficiency

A detailed description of the concepts of Bahadur slope and efficiency is given in the monograph by Bahadur (1971), or in the survey paper by Groeneboom and Oosterhoff (1977). We will content ourselves with a short introduction and the definitions of slope and efficiency in this section, and some notions about optimality in the next one.

Bahadur efficiency is a so-called non-local or fixed alternative method, in contrast to local methods such as the Pitman efficiency. The latter compares the properties of tests at a sequence of alternatives getting arbitrarily close to the null hypothesis. This sequence is chosen so that the probabilities of type I and type II error remain bounded away from zero.

In non-local methods the rate of convergence of the type I or type II error is considered for a fixed alternative. In fact, asymptotic efficiency in the sense of Bahadur of a sequence of test statistics {T_N} measures the rate at which the level attained by T_N tends to zero when a given fixed alternative obtains.

Consider the testing problem (1.1.1). For each N, a simple alternative is characterized by a triple (F, G, n/N), with F and G ∈ F, and for N → ∞ we will consider a fixed alternative to be a triple θ = (F, G, λ) with λ ∈ (0,1), such that n/N → λ and F ≠ G.

Introduce Θ = F × F × [0,1] and Θ_0 = {θ ∈ Θ | F = G and/or λ ∈ {0,1}}; then (1.1.1) can be seen as the test of the hypothesis θ ∈ Θ_0 against the alternative θ ∈ Θ_a = Θ \ Θ_0. Furthermore, for each N, T_N is considered to be a test statistic based on X_1, X_2, ..., X_N, large values of T_N being significant, and for any t ∈ ℝ, let

   F*_N(t) = sup{ P_θ(T_N ≥ t) : θ ∈ Θ_0 };

then the level attained by T_N is defined by

   L_N = F*_N(T_N).

Clearly, L_N is a random variable and depends only on X_1, ..., X_N.


Definition 1.4.1.

Suppose some θ ∈ Θ_a obtains. Then the sequence of test statistics {T_N} is said to have exact Bahadur slope c if

   lim_{N→∞} - N^{-1} log L_N = c/2, a.s.

Definition 1.4.2.

Consider two sequences of tests, {T_N} and {T*_N}, with exact Bahadur slopes c and c* respectively. Then the Bahadur efficiency of {T_N} with respect to {T*_N} is defined as the ratio of their slopes, c/c*.

Remark 1.4.1.

If convergence in probability only is required in definition 1.4.1, c is called the weak slope of {T_N}. □

In general the determination of the exact slope of a sequence {T_N} is nontrivial. A method that will be used frequently in the subsequent chapters is based on the following theorem by Bahadur (1971, th. 7.2).

Theorem 1.4.1.

Let {T_N} be a sequence of test statistics for testing θ ∈ Θ_0 against θ ∈ Θ_a ⊂ Θ \ Θ_0. Suppose that

   lim_{N→∞} N^{-1} T_N = b(θ), a.s.,

where -∞ < b(θ) < ∞, and that for each t in an open interval I,

   lim_{N→∞} - N^{-1} log[ sup{ P(T_N ≥ Nt) : θ ∈ Θ_0 } ] = a(t),

where a is a nonnegative function on I, continuous at t = b(θ); then the Bahadur slope of {T_N} at θ equals 2a(b(θ)).


According to this theorem, the Bahadur slope of a

sequence of statistics at a specified alternative can be found if both the almost sure limit under that particular alternative and a large deviation result under the null-hypothesis are known. In the next two chapters we will be concerned mainly with these aspects of the various statistics.
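In simple cases this recipe can be checked numerically. The sketch below (ours) uses the one-sided Gauss test, T_N = X_1 + ... + X_N for X_i ~ N(θ,1): under H_0 the statistic is N(0,N), the attained level is available in closed form, and -N^{-1} log L_N should approach half the slope, i.e. θ²/2.

```python
import math
import random

def attained_level_rate(theta, n, seed=3):
    """-(1/N) log L_N for the one-sided Gauss test T_N = X_1 + ... + X_N.

    Under H0 the X_i are N(0,1), so T_N ~ N(0, N) and the attained level is
    L_N = P(Z >= T_N / sqrt(N)) for a standard normal Z.  Under the
    alternative X_i ~ N(theta, 1) this quantity should approach theta**2 / 2,
    half the Bahadur slope of the test.
    """
    rng = random.Random(seed)
    t = sum(rng.gauss(theta, 1) for _ in range(n))
    z = t / math.sqrt(n)
    level = 0.5 * math.erfc(z / math.sqrt(2))   # exact standard normal tail
    return -math.log(level) / n
```

With theta = 0.8 and n = 1000 the returned rate is close to 0.8²/2 = 0.32, illustrating the almost sure limit in definition 1.4.1.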

1.5. Bahadur optimality

A sequence of test statistics {T_N} is said to be asymptotically optimal in the sense of Bahadur efficiency for testing θ ∈ Θ_0 against θ ∈ Θ_a if the exact Bahadur slope of {T_N} is maximal amongst all sequences of test statistics, for all θ ∈ Θ_a. A frequently used method to show the asymptotic optimality of a sequence of test statistics is to find an upperbound for the Bahadur slope. A sequence {T_N} is then clearly asymptotically optimal if its Bahadur slope is equal to this upperbound for every θ ∈ Θ_a. The next theorem provides that upperbound and was proved by Raghavachari (1970).

For any F_1 and F_2 ∈ F, let the Kullback-Leibler information number be defined by

   K(F_1, F_2) = ∫ log( dF_1/dF_2 (x) ) dF_1(x).

Then 0 ≤ K(F_1, F_2) ≤ ∞, K = 0 if and only if F_1 = F_2, and K = ∞ if F_1 is not absolutely continuous with respect to F_2. For each θ = (F, G, λ) ∈ Θ, let

   K_λ(F, G) = inf_{H ∈ F} { λ K(F, H) + (1-λ) K(G, H) }.

Theorem 1.5.1.

If c(θ) is the exact Bahadur slope of a sequence of test statistics {T_N}, then, for every θ = (F, G, λ) ∈ Θ_a,

   c(θ) ≤ 2K_λ(F, G).

On the other hand, under certain regularity conditions, the (exact) Bahadur slope of the likelihood ratio test statistics attains this upperbound 2K_λ(F, G) for all θ ∈ Θ_a (Bahadur, 1967, 1971). Hence under these conditions the likelihood ratio tests are asymptotically optimal in the sense of Bahadur. However, in general these conditions cannot be verified easily.

Recently, Haccou et al. (1985) derived sufficient conditions for the (weak) Bahadur optimality of the likelihood ratio test for the c.p.p. and pointed out that these conditions are satisfied for F a one-parameter exponential family, and for most k-parameter exponential families (for instance the normal distribution).

It should be observed that the upperbound in theorem 1.5.1 depends only upon Θ_0 and the alternative θ under consideration. This illustrates the fixed alternative character of Bahadur efficiency: for the slope of a test at some alternative θ, in addition to Θ_0 only that particular θ is of interest, and it makes no difference whether it is an element of Θ_a or whether it belongs to some other alternative hypothesis. (Of course, the complete alternative is taken into account when the test statistics are chosen.) Now, any element θ = (F, G, λ) of the c.p.p. alternative belongs to the alternative hypothesis of the two-sample problem with sample sizes [λN] and N-[λN] too; and indeed the upperbound is the same in both cases. As long as only one alternative (F, G, λ) is considered, obviously, a change-point test can do no better than the best two-sample test appropriate to that particular alternative. (See remark 3b in Bahadur, 1967.)

Hence, we conclude that at all alternatives (F, G, λ*) with λ* fixed, the c.p.p. likelihood ratio test (which is optimal for all (F, G, λ)) has Bahadur efficiency 1 with respect to the likelihood ratio test for the corresponding two-sample problem (with sample sizes n and N-n, where n/N → λ*).
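For normal location families the Kullback-Leibler numbers are explicit, so the upperbound can be checked numerically. The sketch below (ours) assumes the definition of K_λ(F,G) as the infimum over a common distribution H of λK(F,H) + (1-λ)K(G,H); for F = N(μ,1) and G = N(μ+δ,1) the resulting bound 2K_λ should equal δ²λ(1-λ), the value quoted for this problem in example 2.3.2.

```python
def kl_normal(mu1, mu2, sigma=1.0):
    """K(N(mu1, s^2), N(mu2, s^2)) = (mu1 - mu2)^2 / (2 s^2)."""
    return (mu1 - mu2) ** 2 / (2 * sigma ** 2)

def k_lambda(mu, delta, lam, grid=10_000):
    """Crude grid minimization of lam*K(F,H) + (1-lam)*K(G,H) over a common
    normal mean nu, with F = N(mu,1) and G = N(mu+delta,1)."""
    best = float("inf")
    for i in range(grid + 1):
        nu = mu - delta + 3 * delta * i / grid   # grid covering the minimizer
        best = min(best,
                   lam * kl_normal(mu, nu) + (1 - lam) * kl_normal(mu + delta, nu))
    return best
```

For example, with μ = 0, δ = 2, λ = 0.3 the grid minimum is (numerically) λ(1-λ)δ²/2 = 0.42, so Raghavachari's upperbound 2K_λ equals δ²λ(1-λ) = 0.84.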

1.6. Preview of the subsequent chapters

Chapters 2 and 3 are devoted to the derivation of the Bahadur efficiency of the sum- and max-type statistics defined by (1.3.1) and (1.3.2) for the c.p.p. In chapter 2 we will consider the case when all X_i are normally distributed, with the same (possibly unknown) variance, and the change is a location shift: G(x) = F(x-δ), δ > 0. For the T_{N,k} in (1.3.1) and (1.3.2), we use the two-sample likelihood ratio statistics.

The third chapter deals with the more general case when F consists of all continuous distribution functions; then the Bahadur efficiency of sum- and max-type statistics is found for the statistics that result when the T_{N,k} are the two-sample linear rank tests.

In chapter 4 we turn to the Pitman efficiency and consider the same statistics as in chapters 2 and 3. A comparison is made between the Bahadur and Pitman efficiencies. Finally, in chapter 5, we will present the results of power calculations for moderate sample sizes, to answer the question whether the asymptotic efficiencies are a good measure for the powers of the tests for sample sizes of practical interest.


2. TESTS FOR A CHANGE IN THE MEAN OF NORMAL VARIATES

2.1. Introduction

This chapter covers the change-point problem where the distributions before and after the change-point are normal with equal but possibly unknown variance σ². The only change occurring is a positive location shift. Thus F := {F | F(x) = Φ((x-μ)/σ), μ ∈ ℝ}, where Φ is the standard normal distribution function. The hypotheses can be written as

   H*_0: X_i ~ N(μ, σ²), i = 1, ..., N;

   H*_a: X_i ~ N(μ, σ²), i = 1, ..., n;  X_i ~ N(μ+δσ, σ²), i = n+1, ..., N,

for some n, 1 ≤ n < N, and δ > 0. Both μ and σ are unknown.

To simplify the notation, in this chapter a particular alternative will be denoted by (δ, λ), where λ ∈ (0,1) is such that n/N → λ.

With regard to the classification introduced in section 1.3, two approaches to this problem can be distinguished. The first commenced with Chernoff and Zacks (1964), who showed that a Bayes test of H*_0 versus H*_a, for δ values close to zero and if σ² is known, is given by the sum-type statistic Σ_{i=1}^N (i-1)(X_i - X̄_N), where X̄_N = N^{-1} Σ_{i=1}^N X_i. Since this statistic is a linear function of normal random variables, it is easy to obtain the critical value for a size α test and the power function. These are given in their paper with some numerical illustrations. Furthermore, they derive similar results for the case when μ is known. Gardner (1968) used the same approach to tackle the two-sided problem.

When σ² is unknown, a corresponding statistic is obtained if the Chernoff-Zacks statistic is divided by a suitable variance estimator. Sen and Srivastava (1975a) used both Σ_{i=1}^N (X_i - X̄_N)²/(N-1) and Σ_{i=1}^{N-1} (X_{i+1} - X_i)²/(2(N-1)) to estimate the variance, and compared the powers of the two resulting statistics by Monte Carlo methods, showing only small differences for the sample sizes involved (N ≤ 50) (see chapter 5). For the first statistic, the H*_0-distribution was found.

Max-type statistics, more particularly the likelihood ratio statistic for testing H*_0 against H*_a, also were investigated by Sen and Srivastava. In the same paper (1975a) the power of the likelihood ratio test when σ² is unknown was estimated, whilst in a second paper (1975b) power comparisons, again based on Monte Carlo experiments, of the likelihood ratio test when σ² is known with the Chernoff-Zacks statistic have been reported. These comparisons indicate that the Chernoff-Zacks statistic is generally more powerful when n is close to N/2. On the other hand, when n is close to 1 or to N, the likelihood ratio test is more powerful. Hawkins (1977) considered the likelihood ratio test statistic for the two-sided case, δ ≠ 0, and found its distribution under the null hypothesis for the case of known σ². The null distribution for unknown σ² has been given by Worsley (1979), who pointed out that the earlier result of Hawkins for this case is incorrect.

As far as we know, no results on asymptotic efficiency have been reported, apart from the Pitman efficiency of one of the studentized forms of the Chernoff-Zacks statistic (Bhattacharyya and Johnson (1968), see also chapter 4), and an investigation of the likelihood ratio test using Chernoff's approach by Deshayes and Picard (1982).

In this chapter, we will derive the Bahadur slopes for the classes of sum- and max-type statistics, taking for T_{N,k} the two-sample likelihood ratio statistics. After some preliminaries in section 2, we consider in section 3 the case when σ² is known and, in section 4, when σ² is unknown. Some examples are presented in the fifth section.

2.2. Preliminaries

In this (and the following) chapter we will follow the strategy suggested by theorem 1.4.1 in order to derive the desired Bahadur slopes. Therefore, the almost sure limits under a specified alternative and large deviation results under the null hypothesis are needed. The first theorems in this section all bear on these aspects.

Theorem 2.2.1.

Let X_1, X_2, ... be i.i.d. random variables such that EX_1 = 0. Then, for 1 ≤ α ≤ 2, E|X_1|^α < ∞ if and only if

   lim_{N→∞} N^{-1/α} Σ_{k=1}^N a_{N,k} X_k = 0, a.s.,

for each array a_{N,k} of real numbers such that

   limsup_{N→∞} Σ_{k=1}^N a²_{N,k} < ∞.

Our first large deviation result is a theorem by Killeen, Hettmansperger and Sievers (1972). Let {T_N} be a sequence of random variables and {ε_N}, {t_N} and {u_N} nonnegative sequences of real numbers with N^{-1} log ε_N = o(1) and N^{-1} log u_N = o(1) as N → ∞.

Theorem 2.2.2.

Suppose, for each N = 1, 2, ..., T_N has an absolutely continuous distribution with density f_N(t). If there exists an N_1 ∈ ℕ such that for N ≥ N_1, f_N(t) is non-increasing for t ∈ [t_N, ∞), and if

   N^{-1} log( f_N(t_N) / f_N(t_N + ε_N) ) = o(1)  (N → ∞)

and P(T_N ≥ t_N) ≤ u_N f_N(t_N) for N ≥ N_1, then

   lim_{N→∞} - N^{-1} log P(T_N ≥ t_N) = lim_{N→∞} - N^{-1} log f_N(t_N),

whenever the latter limit exists.

Proof: See Killeen, Hettmansperger and Sievers (1972), theorem 2.1.1. □

For large deviations of max-type statistics, the following generalization of theorem 3.1 in Killeen and Hettmansperger (1972) will turn out to be very useful.

Theorem 2.2.3.

For each N, let T_{N,1}, ..., T_{N,m_N} be identically distributed random variables; c_{N,1}, ..., c_{N,m_N} nonnegative real numbers; and M_N = max_{1≤k≤m_N} c_{N,k} T_{N,k}.

Suppose an interval (t⁻, t⁺) ⊂ ℝ exists such that lim_{N→∞} N^{-1} log P(T_{N,1} ≥ t_N) = h(t) for each sequence {t_N} with t_N → t ∈ (t⁻, t⁺), and there exists an integer ℓ such that for sufficiently large N, 1 ≤ m_N ≤ N^ℓ. Then, for each sequence {t_N} with min_{1≤k≤m_N} t_N c_{N,k}^{-1} → t_0 ∈ (t⁻, t⁺),

   lim_{N→∞} N^{-1} log P(M_N ≥ t_N) = h(t_0).

Proof: Since {c_{N,k} T_{N,k} ≥ t_N} ⊂ {M_N ≥ t_N} = ∪_{k=1}^{m_N} {c_{N,k} T_{N,k} ≥ t_N}, it follows that

   max_{1≤k≤m_N} P(c_{N,k} T_{N,k} ≥ t_N) ≤ P(M_N ≥ t_N) ≤ Σ_{k=1}^{m_N} P(c_{N,k} T_{N,k} ≥ t_N).

Hence,

   P(T_{N,1} ≥ min_k t_N c_{N,k}^{-1}) ≤ P(M_N ≥ t_N) ≤ N^ℓ P(T_{N,1} ≥ min_k t_N c_{N,k}^{-1}),

and thus, since min_k t_N c_{N,k}^{-1} → t_0 ∈ (t⁻, t⁺),

   lim_{N→∞} N^{-1} log P(M_N ≥ t_N) = lim_{N→∞} N^{-1} log P(T_{N,1} ≥ min_k t_N c_{N,k}^{-1}) = h(t_0). □

Theorem 2.2.4.

Let f and g be measurable and square integrable functions on [0,1], with ∫f(x)dx = ∫g(x)dx = 0 and ∫f²(x)dx = ∫g²(x)dx. Suppose f is nonincreasing and

   ∫_0^y f(x)dx ≤ ∫_0^y g(x)dx, for all y ∈ [0,1].

Then ∫(f(x) - g(x))² dx = 0.

Proof: (Unless otherwise stated, integration is over [0,1].)

(2.2.1)   ∫(f-g)² dx = ∫(g² - f² - 2(g-f)f) dx = -2∫(g-f)f dx.

Define h: [0,1] → ℝ by h(y) = ∫_0^y (g-f) dx; then h is nonnegative and h(0) = h(1) = 0. Thus, by substitution of h and partial integration,

   ∫(g-f)f dx = ∫f dh = -∫h df ≥ 0,

because f is nonincreasing and h is nonnegative. Together with (2.2.1) this completes the proof. □

2.3. Variance known

When σ² is known, the two-sample likelihood ratio statistic for the samples X_1, ..., X_k and X_{k+1}, ..., X_N is

(2.3.1)   T_{N,k} = ( k(N-k)/N )^{1/2} ( X̄'_{N-k} - X̄_k ),

with X̄_k = k^{-1} Σ_{i=1}^k X_i and X̄'_{N-k} = (N-k)^{-1} Σ_{i=k+1}^N X_i. Without losing generality, put σ² equal to one.

2.3.1. Sum-type statistics

With T_{N,k} as in (2.3.1), the sum-type statistics defined by Σ_{k=1}^{N-1} c_{N,k} T_{N,k} can be written as

(2.3.2)   S_N^{(1)} = Σ_{i=1}^N d_{N,i} X_i,

where d_{N,i} = Σ_{k=1}^{i-1} (k(N-k)^{-1}N^{-1})^{1/2} c_{N,k} - Σ_{k=i}^{N-1} ((N-k)k^{-1}N^{-1})^{1/2} c_{N,k}, i = 1, ..., N. Note that

(2.3.3)   Σ_{i=1}^N d_{N,i} = 0,

and that, from c_{N,k} ≥ 0, k = 1, ..., N-1, it follows that

(2.3.4)   d_{N,1} ≤ d_{N,2} ≤ ... ≤ d_{N,N}.

Furthermore, we assume, without loss of generality, that

(2.3.5)   Σ_{i=1}^N d²_{N,i} = N.

Just like the c_{N,k}, the d_{N,i} will be called weight coefficients.
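The identity behind (2.3.2), i.e. that the weighted sum of two-sample statistics collapses into a single linear combination of the observations, can be verified numerically. A sketch (ours):

```python
import math
import random

def d_weights(c, n):
    """Weights d_{N,i} such that sum_k c_{N,k} T_{N,k} = sum_i d_{N,i} X_i,
    as in (2.3.2); c has length n-1."""
    d = []
    for i in range(1, n + 1):
        pos = sum(math.sqrt(k / ((n - k) * n)) * c[k - 1] for k in range(1, i))
        neg = sum(math.sqrt((n - k) / (k * n)) * c[k - 1] for k in range(i, n))
        d.append(pos - neg)
    return d

def two_sample_T(xs, k):
    """T_{N,k} of (2.3.1) with sigma = 1."""
    n = len(xs)
    return math.sqrt(k * (n - k) / n) * (sum(xs[k:]) / (n - k) - sum(xs[:k]) / k)

rng = random.Random(0)
n = 12
c = [rng.random() for _ in range(n - 1)]      # arbitrary nonnegative weights
xs = [rng.gauss(0, 1) for _ in range(n)]

lhs = sum(c[k - 1] * two_sample_T(xs, k) for k in range(1, n))
rhs = sum(di * x for di, x in zip(d_weights(c, n), xs))
```

Up to rounding, `lhs` and `rhs` agree, and the d_{N,i} sum to zero as claimed in (2.3.3).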

Now we come to the formulation of theorems on large deviations (theorem 2.3.1) and the almost sure limit under an alternative (δ, λ) (theorem 2.3.2).

Theorem 2.3.1.

Let X_i ~ N(μ,1), i = 1, ..., N, all X_i independent, μ ∈ ℝ, and let S_N^{(1)} be defined by (2.3.2) with d_{N,i} satisfying (2.3.3) and (2.3.5); then

(2.3.6)   lim_{N→∞} - N^{-1} log[ sup{ P(S_N^{(1)} ≥ Nt) : μ ∈ ℝ } ] = ½t².

Proof: For all μ ∈ ℝ, S_N^{(1)} ~ N(0, N); thus P(S_N^{(1)} ≥ Nt) = P(Z ≥ tN^{1/2}), with Z ~ N(0,1). Thus, the use of P(Z > z) = (z√(2π))^{-1} exp(-½z²)(1+o(1)), (z → ∞) (see e.g. Feller, 1957, p. 166), yields, for all μ ∈ ℝ,

   P(S_N^{(1)} ≥ Nt) = t^{-1}(2πN)^{-1/2} exp(-½t²N)(1+o(1)), (N → ∞),

and (2.3.6) follows. □

Note that only the standardization of the d_{N,i} according to (2.3.3) and (2.3.5) is assumed here and that no conditions whatsoever are imposed on the relation between the d_{N,i} for different values of N. Such a condition is needed for the next theorem.

Theorem 2.3.2.

Let X_i ~ N(μ,1), i = 1, ..., n; X_i ~ N(μ+δ,1), i = n+1, ..., N; all X_i independent, δ > 0, μ ∈ ℝ and n/N → λ ∈ (0,1). Take S_N^{(1)} as in theorem 2.3.1 and suppose

(2.3.7)   lim_{N→∞} N^{-1} Σ_{i=n+1}^N d_{N,i} = ξ_λ < ∞;

then

   lim_{N→∞} N^{-1} S_N^{(1)} = δξ_λ, a.s.

Proof: Introduce Y_i = X_i - EX_i and a_{N,i} = N^{-1/2} d_{N,i}, i = 1, ..., N. Then Y_1, ..., Y_N are i.i.d., EY_i = 0, EY_i² < ∞ and, according to (2.3.5), Σ_{i=1}^N a²_{N,i} = 1; thus, by theorem 2.2.1,

   lim_{N→∞} N^{-1/2} Σ_{i=1}^N a_{N,i} Y_i = 0, a.s.,

i.e.

   lim_{N→∞} ( N^{-1} Σ_{i=1}^N d_{N,i} X_i - N^{-1} Σ_{i=n+1}^N d_{N,i} δ ) = 0, a.s.

And, since N^{-1} Σ_{i=n+1}^N d_{N,i} → ξ_λ, the theorem follows. □

Remark 2.3.1.

In the proof, no use is made of the normality of the X_i. Indeed, the theorem holds for all X_i with a finite second moment. □

Together with theorem 1.4.1, the foregoing theorems immediately lead to the Bahadur slope of {S_N^{(1)}}:

Theorem 2.3.3.

Let X_i ~ N(μ,σ²), i = 1, ..., n; X_i ~ N(μ+δσ,σ²), i = n+1, ..., N; all X_i independent, δ > 0, μ ∈ ℝ and n/N → λ ∈ (0,1). Then, with S_N^{(1)} and ξ_λ as in theorem 2.3.2, the Bahadur slope of {S_N^{(1)}} at the alternative (δ,λ) is given by δ²ξ_λ². □

Remark 2.3.2.

The mutual Bahadur efficiency at (δ,λ) of two sequences of S_N^{(1)}-type tests depends, therefore, on the location of the change, λ, and the weights, through ξ_λ, but not on the magnitude of the change, δ. □

Example 2.3.1.

The Chernoff-Zacks statistic S_N^{CZ} = Σ_{i=1}^N (i-1)(X_i - X̄_N) = ½Σ_{i=1}^N (2i-N-1)X_i. Scaling the weights to meet (2.3.5) ((2.3.3) is met already), we get ξ_λ² = 3λ²(1-λ)²; thus, the Bahadur slope of S_N^{CZ} at the alternative (δ,λ) equals 3δ²λ²(1-λ)². □
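The value ξ_λ = √3 λ(1-λ) can be checked numerically from the discrete Chernoff-Zacks weights; a sketch (ours):

```python
import math

def cz_xi(lam, n=4000):
    """xi_lambda for the Chernoff-Zacks weights d_i proportional to 2i - N - 1,
    rescaled so that sum d_i**2 = N (condition 2.3.5), evaluated as
    N^{-1} * sum of d_i over i > lam*N."""
    raw = [2 * i - n - 1 for i in range(1, n + 1)]
    scale = math.sqrt(n / sum(r * r for r in raw))
    d = [scale * r for r in raw]
    cut = int(lam * n)            # change-point index floor(lam * N)
    return sum(d[cut:]) / n
```

For any λ the result is close to √3 λ(1-λ), in agreement with the slope 3δ²λ²(1-λ)² of example 2.3.1.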

Example 2.3.2.

The two-sample likelihood ratio statistic T_{N,k}, for alternatives (δ,λ_1) with a particular λ_1, can be seen as a special degenerate case of S_N^{(1)}:

   T_{N,k} = ( k(N-k)/N )^{1/2} ( X̄'_{N-k} - X̄_k ) = -((N-k)/(kN))^{1/2} Σ_{i=1}^k X_i + (k/((N-k)N))^{1/2} Σ_{i=k+1}^N X_i,

with k/N → λ_1. It turns out that, writing a ∧ b for min(a,b),

   ξ_λ = ( λ_1(1-λ_1) )^{1/2} ( (1-λ)/(1-λ_1) ∧ λ/λ_1 ),

and thus the Bahadur slope at the alternative (δ,λ_1) is δ²λ_1(1-λ_1). By a straightforward calculation, we see that Raghavachari's upperbound (see section 1.5) equals δ²λ(1-λ) for the problem under consideration in this section. So the slope of the T_{N,k} test does attain this upperbound for λ = λ_1, i.e. {T_{N,k}} is Bahadur-optimal at λ = λ_1. □
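The displayed expression for ξ_λ can be checked against the discrete weights of the two-sample statistic (scaled by √N, the weights are -((N-k)/k)^{1/2} before the split and (k/(N-k))^{1/2} after it); a sketch (ours):

```python
import math

def two_sample_xi(lam, lam1, n=4000):
    """xi_lambda for the degenerate weights of the two-sample statistic with
    split at k = lam1 * N, evaluated as N^{-1} * sum of d_i over i > lam*N."""
    k = int(lam1 * n)
    d = [-math.sqrt((n - k) / k)] * k + [math.sqrt(k / (n - k))] * (n - k)
    cut = int(lam * n)
    return sum(d[cut:]) / n

def xi_formula(lam, lam1):
    """Closed form: sqrt(lam1*(1-lam1)) * min((1-lam)/(1-lam1), lam/lam1)."""
    return math.sqrt(lam1 * (1 - lam1)) * min((1 - lam) / (1 - lam1), lam / lam1)
```

The two agree (up to discretization) both for λ on either side of λ_1 and at λ = λ_1, where ξ_λ reduces to √(λ_1(1-λ_1)).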


Theorems 2.3.1 and 2.3.2 (and thus 2.3.3) do not use the monotonicity of the d_{N,i} as stated in (2.3.4). The statistics considered in this section, however, do have this monotonicity as an immediate implication of the obvious restriction to non-negative c_{N,k} coefficients. The following theorem makes use of the monotonicity; it proves that the class of statistics S_N^{(1)} of type (2.3.2) with weights d_{N,i} satisfying (2.3.3), (2.3.4) and (2.3.5) contains no statistic that is Bahadur optimal for all the alternatives (δ,λ).

For the convergence of the weights, we introduce a condition that differs a little from (2.3.7). Define the weight functions ψ_N: (0,1] → ℝ by ψ_N(v) = d_{N,i} for v ∈ ((i-1)/N, i/N], and suppose a measurable and square integrable function ψ: [0,1] → ℝ exists, such that ∫ψ²(v)dv = 1 and

(2.3.7a)   ∫ |ψ_N(v) - ψ(v)|² dv → 0.

Note that ξ_λ = ∫_λ^1 ψ(v)dv.

Theorem 2.3.4.

Within the class of statistics

s~

1

)

as defined by (2.3.2) with weights dN. satisfying (2.3.3), (2.3.4),

,1.

(2.3.5) and (2.3.7a), no pair of statistics exists, in which one is uniformly better in the sense of Bahadur than the other.

Proof: Consider two sequences of $S_N^{(1)}$ statistics with limit weight functions $\psi_1$ and $\psi_2$, respectively. Firstly, note that due to the conditions on the $d_{N,i}$, $\int\psi_1^2=\int\psi_2^2=1$, $\int\psi_1=\int\psi_2=0$, and both $\psi_1$ and $\psi_2$ are nondecreasing on $[0,1]$. And since

$\int_\lambda^1\psi_1(v)\,dv \ge \int_\lambda^1\psi_2(v)\,dv$ for all $\lambda\in(0,1)$

when the first sequence of statistics is uniformly better than the second, the result directly follows from theorem 2.2.4. □

Now, let us focus on one particular $\lambda_1$ and look for those statistics that have maximal slope for the alternatives $(\delta,\lambda)$ with $\lambda=\lambda_1$. Thus, we want to find the statistics $S_N^{(1)}$ such that $N^{-1}\sum_{i=k+1}^N d_{N,i}$ is maximal under the restrictions that $\sum_{i=1}^N d_{N,i}=0$ and $\sum_{i=1}^N d_{N,i}^2=N$, where $k/N\to\lambda_1$. Using the generalized Lagrange multiplier theorem, we find that this maximum is attained for $d_{N,i}=-(N-k)^{1/2}k^{-1/2}$ if $i\le k$, and $d_{N,i}=(N-k)^{-1/2}k^{1/2}$ if $i>k$. Indeed, this is the optimal two-sample likelihood ratio test (see example 2.3.2), for samples $X_1,\dots,X_k$ and $X_{k+1},\dots,X_N$.
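The Lagrange argument can be illustrated numerically: the two-sample weights satisfy both constraints exactly, and no randomly generated feasible weight vector beats them. A sketch (our own code, with $k/N=0.3$):

```python
import math, random

N, k = 100, 30

def objective(d):
    # N^{-1} sum_{i>k} d_i, the quantity to be maximized
    return sum(d[k:]) / N

# claimed optimum: the two-sample weights of example 2.3.2
d_opt = [-math.sqrt((N - k) / k)] * k + [math.sqrt(k / (N - k))] * (N - k)
assert abs(sum(d_opt)) < 1e-9                   # constraint sum d_i = 0
assert abs(sum(v * v for v in d_opt) - N) < 1e-9  # constraint sum d_i^2 = N
best = objective(d_opt)

rng = random.Random(0)
for _ in range(200):
    d = [rng.gauss(0.0, 1.0) for _ in range(N)]
    m = sum(d) / N
    d = [v - m for v in d]                      # enforce sum d_i = 0
    s = math.sqrt(sum(v * v for v in d) / N)
    d = [v / s for v in d]                      # enforce sum d_i^2 = N
    assert objective(d) <= best + 1e-9
```

By Cauchy-Schwarz the maximum equals $\sqrt{k(N-k)}/N$, which is the value `best` takes.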

Therefore, we have the next corollary.

Corollary 2.3.5.
Apart from the two-sample statistics, the class of statistics $S_N^{(1)}$ defined by (2.3.2), with weights $d_{N,i}$ satisfying (2.3.3) and (2.3.5), does not contain a statistic that is optimal in the sense of Bahadur for any alternative $(\delta,\lambda)$. □

2.3.2. Max-type statistics

In this section, we will study the statistics

(2.3.8) $M_N^{(1)} := \max_{1\le k<N} c_{N,k}T_{N,k}$, with $c_{N,k}\ge 0$.

If $c_{N,k}=1$, $k=1,\dots,N$, this becomes the likelihood ratio statistic for the c.p.p.


Theorem 2.3.6.
Let $X_i\sim N(\mu,1)$, $i=1,\dots,N$, all $X_i$ independent, $\mu\in\mathbb{R}$, and let $M_N^{(1)}$ be defined by (2.3.8). Suppose

(2.3.9) $\lim_{N\to\infty}\max_{1\le k<N}c_{N,k}=\gamma_{\max}<\infty$.

Then,

$\lim_{N\to\infty}-N^{-1}\log\bigl[\sup\{P(M_N^{(1)}\ge N^{1/2}t);\mu\in\mathbb{R}\}\bigr] = \frac{t^2}{2\gamma_{\max}^2}.$

Proof: For each $N$ and $1\le k<N$, $T_{N,k}\sim N(0,1)$; hence, using the same relationship as in theorem 2.3.1, we get, for each $\mu\in\mathbb{R}$,

$\lim_{N\to\infty}-N^{-1}\log P(T_{N,k}\ge N^{1/2}t) = \tfrac{t^2}{2}.$

Thus, application of theorem 2.2.3 together with (2.3.9) results in

$\lim_{N\to\infty}-N^{-1}\log P(M_N^{(1)}\ge N^{1/2}t) = \frac{t^2}{2\gamma_{\max}^2},$

and, since this holds for each $\mu\in\mathbb{R}$, the proof is complete. □
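The Gaussian large deviation rate used in this proof can be checked directly from the normal tail; the sketch below (our own code) computes $-N^{-1}\log P(Z\ge N^{1/2}t)$ via the complementary error function and watches it approach $t^2/2$:

```python
import math

t = 1.0
for N in [100, 1000]:
    # P(Z >= sqrt(N) t) for Z ~ N(0,1), via the complementary error function
    p = 0.5 * math.erfc(math.sqrt(N) * t / math.sqrt(2.0))
    rate = -math.log(p) / N
# the rate approaches t^2/2 = 0.5 as N grows
assert abs(rate - t * t / 2) < 0.01
```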

We proceed to the almost sure limit under an alternative $(\delta,\lambda)$. Let $D=D[0,1]$ be the space of functions on $[0,1]$ that are right-continuous and have left-hand limits. Endow $D$ with the Skorokhod topology (see e.g. Billingsley, 1968, Chapter 3). Introduce the function $\gamma_N\in D$ by (take $c_{N,0}=c_{N,N}=0$)

(2.3.10) $\gamma_N(u) := c_{N,[Nu]}$, $0\le u\le 1$,

and define, for each $\lambda\in(0,1)$,

(2.3.11) $b_1(u;\lambda) := (u(1-u))^{1/2}\Bigl(\frac{1-\lambda}{1-u}\wedge\frac{\lambda}{u}\Bigr).$
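A small numerical sketch of $b_1(u;\lambda)$ (names are ours), confirming that it attains its maximum $\sqrt{\lambda(1-\lambda)}$ at $u=\lambda$, the fact exploited in example 2.3.3 below:

```python
import math

def b1(u, lam):
    # b1(u; lambda) = (u(1-u))^{1/2} min((1-lam)/(1-u), lam/u)
    return math.sqrt(u * (1 - u)) * min((1 - lam) / (1 - u), lam / u)

lam = 0.4
grid = [i / 1000 for i in range(1, 1000)]
vals = [b1(u, lam) for u in grid]
u_star = grid[vals.index(max(vals))]
assert abs(u_star - lam) < 1e-9                 # maximum is taken at u = lambda
assert abs(b1(lam, lam) - math.sqrt(lam * (1 - lam))) < 1e-12
```

The function is increasing for $u<\lambda$ and decreasing for $u>\lambda$, so the grid maximum sits exactly at $u=\lambda$.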

Theorem 2.3.7.
Let $X_i\sim N(\mu,1)$, $i=1,\dots,n$; $X_i\sim N(\mu+\delta,1)$, $i=n+1,\dots,N$; all $X_i$ independent, $\delta>0$, $\mu\in\mathbb{R}$, $n/N\to\lambda\in(0,1)$, and let $M_N^{(1)}$ be defined by (2.3.8). Suppose there exists a function $\gamma:[0,1]\to\mathbb{R}^+$, such that $\gamma\in D$ and $\gamma_N\to\gamma$ in the Skorokhod topology. Then,

$\lim_{N\to\infty}N^{-1/2}M_N^{(1)} = \delta\,\max_{0\le u\le1}\gamma(u)\,b_1(u;\lambda)$ a.s.

Proof: Introduce, for $u\in[0,1]$ (take $T_{N,0}=T_{N,N}=0$), $T_N(u):=N^{-1/2}T_{N,[Nu]}$; then $N^{-1/2}c_{N,k}T_{N,k}=\gamma_N(k/N)\,T_N(k/N)$. Using the strong law of large numbers for independent random variables, it is easy to show that for each $u\in[0,1]$, $T_N(u)\to\delta b_1(u;\lambda)$ a.s. However, to justify the interchange of max and lim, in addition to the given Skorokhod convergence of the $\gamma_N$ functions, the uniform convergence of $T_N(u)$ for $u\in[0,1]$ is needed. That is to say, we need to show that for almost every realisation $x_1,x_2,\dots$ of $X_1,X_2,\dots$ the corresponding functions $t_N(u)$ converge uniformly to $\delta b_1(u;\lambda)$.

To that end, firstly, we will show that for each $u\in[0,1]$ and for each $\varepsilon>0$, a neighborhood $B_u$ of $u$ exists, such that for almost every realisation

(2.3.12) $\exists N_u\in\mathbb{N}\ \forall N>N_u\ \forall v\in B_u\ \bigl[|t_N(u)-t_N(v)|<\varepsilon\bigr].$

For notational convenience, introduce $z_N(u)=(N-[Nu])^{1/2}[Nu]^{-1/2}$. Fix $\varepsilon>0$ and $u\in(0,1)$; suppose $w>u$ (for $w<u$ the proof is similar); then, for all $v\in[u,w]$,

$T_N(u)-T_N(v) = \bigl(z_N(u)-z_N(v)\bigr)N^{-1}\sum_{i=1}^{[Nu]}(-X_i) + \bigl(z_N^{-1}(u)+z_N(v)\bigr)N^{-1}\sum_{i=[Nu]+1}^{[Nv]}X_i + \bigl(z_N^{-1}(u)-z_N^{-1}(v)\bigr)N^{-1}\sum_{i=[Nv]+1}^{N}X_i.$

Successively, we will consider $X_1$ to $X_{[Nu]}$, $X_{[Nw]+1}$ to $X_N$, and $X_{[Nu]+1}$ to $X_{[Nw]}$.

By the strong law of large numbers,

$\bigl(z_N(u)-z_N(w)\bigr)N^{-1}\sum_{i=1}^{[Nu]}(-X_i) \to \Bigl(-\bigl(\tfrac{1-u}{u}\bigr)^{1/2}+\bigl(\tfrac{1-w}{w}\bigr)^{1/2}\Bigr)\bigl(u\mu+(u-\lambda)1_{[\lambda,1]}(u)\,\delta\bigr)$ a.s.;

so, for $w-u$ small enough, say $w-u<\Delta_1$, this limit will be smaller than $\varepsilon/6$ in absolute value and, hence, for almost every realisation $x_1,x_2,\dots$,

$\bigl|\bigl(z_N(u)-z_N(w)\bigr)N^{-1}\sum_{i=1}^{[Nu]}X_i\bigr| < \varepsilon/3$

for $N$ sufficiently large. But then the same inequality, with $w$ replaced by $v$, also holds true for every $v\in[u,w]$.

In a similar way, it follows that for $w-u$ smaller than some $\Delta_2$, for almost every realisation and $N$ sufficiently large,

$\bigl|\bigl(z_N^{-1}(u)-z_N^{-1}(v)\bigr)N^{-1}\sum_{i=[Nw]+1}^{N}X_i\bigr| < \varepsilon/3$

for all $v\in[u,w]$.

For the third and final part, the Cauchy-Schwarz inequality will be used. Define $Y_i:=X_i-EX_i$; then $Y_i\sim N(0,1)$ and, hence, $N^{-1}\sum_{i=[Nu]+1}^{[Nw]}Y_i^2\to(w-u)$ a.s. Again, this limit is small for $(w-u)$ small, and thus for almost every realisation $x_1,x_2,\dots$,

(2.3.13) $N^{-1}\sum_{i=[Nu]+1}^{[Nw]}y_i^2 < \varepsilon/6$ for $N$ sufficiently large.

On the other hand, for all $v\in[u,w]$,

$\sum_{i=[Nu]+1}^{[Nv]}N^{-1}\bigl(z_N^{-1}(u)+z_N(v)\bigr)^2 + \sum_{i=[Nv]+1}^{[Nw]}N^{-1}\bigl(z_N^{-1}(u)-z_N^{-1}(v)\bigr)^2 \le ([Nv]-[Nu])N^{-1}\bigl(z_N^{-1}(u)+z_N(u)\bigr)^2 + ([Nw]-[Nv])N^{-1}\bigl(z_N^{-1}(u)-z_N^{-1}(w)\bigr)^2.$

Letting $N\to\infty$, the right-hand side tends to

$(v-u)\Bigl(\bigl(\tfrac{u}{1-u}\bigr)^{1/2}+\bigl(\tfrac{1-u}{u}\bigr)^{1/2}\Bigr)^2 + (w-v)\Bigl(\bigl(\tfrac{u}{1-u}\bigr)^{1/2}-\bigl(\tfrac{w}{1-w}\bigr)^{1/2}\Bigr)^2,$

which can be made arbitrarily small by taking $w-u$ small enough. Hence, for $N$ large enough the right-hand side of the inequality will be smaller than $\varepsilon/6$, for every $v\in[u,w]$. And thus, together with (2.3.13) and Cauchy-Schwarz, we get that for $w-u$ smaller than $\Delta_3$, for almost every realisation and for every $v\in[u,w]$,

$\Bigl|\bigl(z_N^{-1}(u)+z_N(v)\bigr)N^{-1}\sum_{i=[Nu]+1}^{[Nv]}Y_i + \bigl(z_N^{-1}(u)-z_N^{-1}(v)\bigr)N^{-1}\sum_{i=[Nv]+1}^{[Nw]}Y_i\Bigr| \le \varepsilon/6$

if $N$ is sufficiently large. And it easily follows that the same inequality holds with the $Y_i$ replaced by $X_i$ and the right-hand side by $\varepsilon/3$, which completes the proof of (2.3.12) for $u\in(0,1)$. Using the same kind of argument, (2.3.12) can be proved for $u=0$ or $1$.

Furthermore, the continuity of $b_1(u;\lambda)$ implies that for each $u\in[0,1]$ a neighborhood exists such that $|\delta b_1(u;\lambda)-\delta b_1(v;\lambda)|<\varepsilon$ for each $v$ in that neighborhood. Let $B_u'$ be the intersection of $B_u$ with that neighborhood. In addition, $T_N(u)$ converges a.s. to $\delta b_1(u;\lambda)$; thus, for almost every realisation, there is an $N_u^*$ such that for $N>N_u^*$, $|t_N(u)-\delta b_1(u;\lambda)|<\varepsilon$. Consequently, since

$|t_N(v)-\delta b_1(v;\lambda)| \le |t_N(v)-t_N(u)| + |t_N(u)-\delta b_1(u;\lambda)| + |\delta b_1(u;\lambda)-\delta b_1(v;\lambda)|,$

we get that, for almost every realisation $x_1,x_2,\dots$,

$|t_N(v)-\delta b_1(v;\lambda)| < 3\varepsilon$ for all $v\in B_u'$, provided $N$ is sufficiently large.

Due to the compactness of $[0,1]$, the open cover $\bigcup_{u\in[0,1]}B_u'$ has a finite subcover $\{B'_{u(i)}\}_{i=1,\dots,t}$; thus, for $N>\max_{1\le i\le t}N^*_{u(i)}$, we have that for all $v\in[0,1]$, $|t_N(v)-\delta b_1(v;\lambda)|<3\varepsilon$, i.e. the a.s. convergence of $T_N(u)$ is uniform. □

Remark 2.3.3.
For the max-type statistics in this and in the next chapter, Skorokhod convergence will be required for the weight functions $\gamma_N$; $L_2$- (or $L_1$-) convergence, as used in (2.3.7a) for the sum-type statistics, is not strong enough here. □

Remark 2.3.4.
The weight functions $\gamma_N$ and $\gamma$ are bounded (as are all the elements of $D$). Hence, without loss of generality, it may be assumed that $\max_{0\le u\le1}\gamma_N(u)=1$ for all $N$, and $\max_{0\le u\le1}\gamma(u)=1$. □

Theorem 2.3.8.
Let $X_i\sim N(\mu,1)$, $i=1,\dots,n$; $X_i\sim N(\mu+\delta,1)$, $i=n+1,\dots,N$; all $X_i$ independent, $\delta>0$, $\mu\in\mathbb{R}$, $n/N\to\lambda\in(0,1)$. Then, with $M_N^{(1)}$ and $\gamma$ as in theorem 2.3.7, and $\max_{0\le u\le1}\gamma(u)=1$, the Bahadur slope of $\{M_N^{(1)}\}$ at the alternative $(\delta,\lambda)$ equals

$\delta^2\Bigl(\max_{0\le u\le1}\gamma(u)\,b_1(u;\lambda)\Bigr)^2.$

Proof: Directly from theorems 1.4.1, 2.3.6 and 2.3.7. □

Remark 2.3.5.

As with the $S_N^{(1)}$ statistics (see remark 2.3.2), the mutual efficiency of two sequences of $M_N^{(1)}$-type tests does not depend on the magnitude of the change $\delta$. And the same holds true for the efficiency of an $S_N^{(1)}$ test with respect to an $M_N^{(1)}$ test. □

Example 2.3.3.
The likelihood ratio statistic for the c.p.p. with known $\sigma^2$ is the special case of $M_N^{(1)}$ with $c_{N,k}=1$, for all $k$ and all $N$. Thus, $\gamma(u)=1$ for $u\in[0,1]$ and, from theorem 2.3.8, the Bahadur slope of the likelihood ratio test at the alternative $(\delta,\lambda)$ is $\delta^2\lambda(1-\lambda)$. This slope equals Raghavachari's upper bound for all alternatives, which complies with the Bahadur optimality of the likelihood ratio test proved by Haccou et al. (1985) (see section 1.5), with the understanding that we consider the strong slope whilst they use the weak version. □

Remark 2.3.6.
This example shows that the class of statistics $M_N^{(1)}$ contains at least one statistic that is optimal for all alternatives $(\delta,\lambda)$. Moreover, from theorem 2.3.8 it is clear that a sequence of statistics $\{M_N^{(1)}\}$ with weight functions $\gamma_N$ is optimal for an alternative $(\delta,\lambda)$ if and only if $\gamma_N$ converges to $\gamma$ and $\gamma(\lambda)=\max_{0\le u\le1}\gamma(u)$. Thus, we may conclude that:
- for every sequence of statistics $\{M_N^{(1)}\}$, at least one $\lambda'\in(0,1)$ exists such that $\{M_N^{(1)}\}$ is Bahadur optimal at $(\delta,\lambda')$ for all $\delta>0$;
- ignoring statistics with other $\gamma_N$ but converging to the same limit function $\gamma$, the likelihood-ratio statistic treated in example 2.3.3 is the only one that is optimal for all alternatives with $\lambda\in(0,1)$. □

2.4. Variance unknown

When $\sigma^2$ is unknown, the two-sample likelihood ratio statistic for samples $X_1,\dots,X_k$ and $X_{k+1},\dots,X_N$ is

(2.4.1) $T_{N,k} = \Bigl(\frac{k(N-k)}{N}\Bigr)^{1/2}\frac{\bar X'_{N-k}-\bar X_k}{s_{N,k}}$, with $s_{N,k}^2 = \frac{(k-1)s_k^2+(N-k-1)s_{N-k}'^2}{N-2}$

the pooled variance estimator, $s_k^2$ and $s_{N-k}'^2$ being the sample variances of the two subsamples, and

(2.4.2) $T'_{N,k} = \Bigl(\frac{k(N-k)}{N}\Bigr)^{1/2}\frac{\bar X'_{N-k}-\bar X_k}{s_N}$, where $s_N^2 = \frac{1}{N-1}\sum_{i=1}^N(X_i-\bar X_N)^2.$

Sum-type statistics using this modified likelihood ratio were suggested by Sen and Srivastava (1975a) (see section 2.1).

Remark 2.4.1.
Since $(N-1)s_N^2 = (k-1)s_k^2 + (N-k-1)s_{N-k}'^2 + \frac{k(N-k)}{N}(\bar X_k-\bar X'_{N-k})^2$, or $SS_{\mathrm{total}}=SS_{\mathrm{within}}+SS_{\mathrm{between}}$, the statistics (2.4.1) and (2.4.2) have a functional relationship:

$(T_{N,k})^2 = (N-2)\,\frac{SS_{\mathrm{between}}}{SS_{\mathrm{within}}}, \qquad (T'_{N,k})^2 = (N-1)\,\frac{SS_{\mathrm{between}}}{SS_{\mathrm{total}}}.$

And thus, the monotonicity of $x^2(1-x^2)^{-1}$ on $[0,1]$ implies that tests for the two-sample problem based on $T_{N,k}$ and on $T'_{N,k}$ are in fact the same. □
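The decomposition and the monotone relationship are easy to verify on a small data set. The sketch below (our own code) assumes the pooled-variance reading of (2.4.1), so that $(T_{N,k})^2=(N-2)\,SS_{between}/SS_{within}$:

```python
# a fixed sample with a visible level shift after k = 4
x = [0.2, -0.5, 1.1, 0.3, 2.9, 3.4, 2.2, 3.0]
N, k = len(x), 4
xbar = sum(x) / N
xbar1, xbar2 = sum(x[:k]) / k, sum(x[k:]) / (N - k)

ss_total = sum((v - xbar) ** 2 for v in x)
ss_within = (sum((v - xbar1) ** 2 for v in x[:k])
             + sum((v - xbar2) ** 2 for v in x[k:]))
ss_between = k * (N - k) / N * (xbar1 - xbar2) ** 2
assert abs(ss_total - (ss_within + ss_between)) < 1e-9   # SS decomposition

# (T_{N,k})^2 and (T'_{N,k})^2 are linked by the increasing map x^2/(1-x^2)
t_sq = (N - 2) * ss_between / ss_within
x_sq = ss_between / ss_total            # = (T'_{N,k})^2 / (N-1)
assert abs(t_sq - (N - 2) * x_sq / (1 - x_sq)) < 1e-9
```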

2.4.1. Sum-type statistics

The sum-type statistics based on (2.4.1) are rather intractable. This is the main reason for the introduction of the statistics $T'_{N,k}$; only sum-type statistics based on the latter will be treated here. Hence, consider $S_N^{(2)}=\sum_{k=1}^{N-1}c_{N,k}T'_{N,k}$, which, like (2.3.2), can be written as

(2.4.3) $S_N^{(2)} = \Bigl(\sum_{i=1}^N(X_i-\bar X_N)^2\Bigr)^{-1/2}\sum_{i=1}^N d_{N,i}X_i,$

with

(2.4.4) $\sum_{i=1}^N d_{N,i} = 0$

and

(2.4.5) $d_{N,1}\le d_{N,2}\le\dots\le d_{N,N}.$

Again, assume without loss of generality that

(2.4.6) $N^{-1}\sum_{i=1}^N d_{N,i}^2 = 1.$

The standardization of $S_N^{(2)}$ differs from that of $S_N^{(1)}$ in section 2.3.1, because the denominator of (2.4.3) equals the square root of the sum of squares and is not a variance estimator. A large deviation result for $S_N^{(2)}$ is formulated in theorem 2.4.3. The proof of this theorem refers to theorem 2.2.2; therefore, the distribution of $S_N^{(2)}$ has to be known; this is stated in theorem 2.4.1 and corollary 2.4.2. (The null distribution of $(N^{-1/2}S_N^{(2)})^2$ is given also in Sen and Srivastava (1975a).)

Theorem 2.4.1.
Let $X_i\sim N(\mu,\sigma^2)$, $i=1,\dots,N$, all $X_i$ independent, $\mu\in\mathbb{R}$, $\sigma^2>0$, and let $S_N^{(2)}$ be defined by (2.4.3). Suppose the $d_{N,i}$ satisfy (2.4.4) and (2.4.6). Then $(N^{-1/2}S_N^{(2)})^2$ has a Beta distribution with parameters $\tfrac12$ and $\tfrac12N-1$.

Proof: Introduce the vectors $X_N'=(X_1,X_2,\dots,X_N)$ and $\tilde d_N'=(N^{-1/2}d_{N,1},\dots,N^{-1/2}d_{N,N})$, and define $U_N:=\tilde d_N'X_N$ and $V_N^2:=\sum_{i=1}^N(X_i-\bar X_N)^2$. Then $(N^{-1/2}S_N^{(2)})^2=U_N^2V_N^{-2}$. Since $\tilde d_N'\tilde d_N=1$ and, by (2.4.4), $EU_N=0$, we have $U_N\sim N(0,\sigma^2)$; hence $U_N^2/\sigma^2\sim\chi_1^2$. Define the $N\times N$ matrix $A_N:=I_N-N^{-1}E_N$, with $I_N$ the identity and $E_N$ the matrix of ones; then $V_N^2=X_N'A_NX_N$. The matrix $A_N$ is symmetric and idempotent with rank $N-1$; thus (Rao, 1972, p. 186) $V_N^2/\sigma^2\sim\chi_{N-1}^2$.

Furthermore, $A_N\tilde d_N=\tilde d_N-N^{-1}E_N\tilde d_N=\tilde d_N$ and $\tilde d_N'\tilde d_N=1$; thus $\tilde d_N$ is a normed eigenvector of $A_N$. But then both $A_N-\tilde d_N\tilde d_N'$ and $\tilde d_N\tilde d_N'$ are idempotent. And thus, since $(A_N-\tilde d_N\tilde d_N')\,\tilde d_N\tilde d_N'=0$, the quadratic forms $X_N'(A_N-\tilde d_N\tilde d_N')X_N$ and $X_N'\tilde d_N\tilde d_N'X_N$ are both $\sigma^2\chi^2$-distributed (with $N-2$ and $1$ df) and independent (Rao, 1972, p. 186-187). So we have

$(N^{-1/2}S_N^{(2)})^2 = U_N^2V_N^{-2} = \frac{(X_N'\tilde d_N\tilde d_N'X_N)/\sigma^2}{\bigl(X_N'(A_N-\tilde d_N\tilde d_N')X_N+X_N'\tilde d_N\tilde d_N'X_N\bigr)/\sigma^2} \sim \mathrm{Beta}\bigl(\tfrac12,\tfrac12(N-2)\bigr)$

(see e.g. Johnson and Kotz, 1970, Ch. 24). □

Remark 2.4.2.
Note that the distribution of $(N^{-1/2}S_N^{(2)})^2$ is independent of the weights $d_{N,i}$, as long as the $d_{N,i}$ meet the conditions (2.4.4) and (2.4.6). □

Using this theorem and the symmetry of the distribution of $S_N^{(2)}$, it is easy to prove

Corollary 2.4.2.
Under $H_0$, the density function $f_{S_N}$ of $N^{-1/2}S_N^{(2)}$ is

(2.4.7) $f_{S_N}(s) = \frac{1}{B(\tfrac12,\tfrac12N-1)}\,(1-s^2)^{(N-4)/2}\,1_{\{|s|<1\}}.$
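As a numerical sanity check, the density (2.4.7) integrates to one over $(-1,1)$; the sketch below (our own code, evaluating the Beta constant through the gamma function) verifies this with the midpoint rule:

```python
import math

def f_SN(s, N):
    # density (2.4.7): (1 - s^2)^{(N-4)/2} / B(1/2, N/2 - 1) on (-1, 1)
    B = math.gamma(0.5) * math.gamma(N / 2 - 1) / math.gamma(N / 2 - 0.5)
    return (1 - s * s) ** ((N - 4) / 2) / B

N, n = 12, 20000
h = 2.0 / n
total = sum(f_SN(-1 + (i + 0.5) * h, N) * h for i in range(n))
assert abs(total - 1) < 1e-3        # midpoint rule over (-1, 1)
```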

We are ready now to prove the large deviation theorem.


Theorem 2.4.3.
Let $X_i\sim N(\mu,\sigma^2)$, $i=1,\dots,N$, all $X_i$ independent, $\mu\in\mathbb{R}$, $\sigma^2>0$, and let $S_N^{(2)}$ be defined by (2.4.3). Suppose the $d_{N,i}$ satisfy (2.4.4) and (2.4.6). Then, for $0<t<1$,

(2.4.8) $\lim_{N\to\infty}-N^{-1}\log\bigl[\sup\{P(S_N^{(2)}\ge N^{1/2}t);\mu\in\mathbb{R}\}\bigr] = -\tfrac12\log(1-t^2).$

Proof: Introduce, for all $N$, $t_N:=u_N:=t$; then $N^{-1}\log\varepsilon_N=o(1)$ and $N^{-1}\log u_N=o(1)$. From (2.4.7) it is evident that for $N\ge5$, $f_{S_N}$ is decreasing on $[0,1]$. Furthermore,

$N^{-1}\log\frac{f_{S_N}(t+N^{-1})}{f_{S_N}(t)} = o(1)\quad(N\to\infty).$

So, all the conditions for theorem 2.2.2 are fulfilled, and thus

$\lim_{N\to\infty}-N^{-1}\log f_{S_N}(t) = \lim_{N\to\infty}\Bigl(N^{-1}\log B(\tfrac12,\tfrac12N-1) - \frac{N-4}{2N}\log(1-t^2)\Bigr).$

Now, by definition, $B(\tfrac12,\tfrac12N-1)=\Gamma(\tfrac12)\Gamma(\tfrac12N-1)/\Gamma(\tfrac12N-\tfrac12)$, and $\Gamma(\tfrac12N-1)<\Gamma(\tfrac12N-\tfrac12)<\Gamma(\tfrac12N)=(\tfrac12N-1)\Gamma(\tfrac12N-1)$, so for all $N>2$,

$(\tfrac12N-1)^{-1} < \frac{\Gamma(\tfrac12N-1)}{\Gamma(\tfrac12N-\tfrac12)} < 1,$

and thus $\lim_{N\to\infty}N^{-1}\log\frac{\Gamma(\frac12N-1)}{\Gamma(\frac12N-\frac12)}=0$. Consequently,

$\lim_{N\to\infty}-N^{-1}\log P(S_N^{(2)}\ge N^{1/2}t) = -\tfrac12\log(1-t^2),$

which holds for all $\mu\in\mathbb{R}$, and thus the proof is complete. □

Our next concern is the almost sure limit of $S_N^{(2)}$ under an alternative $(\delta,\lambda)$ (theorem 2.4.5). First we formulate a lemma.

Lemma 2.4.4.
Let $X_i$, $i=1,\dots,N$, be independent, $EX_i=\mu$, $i=1,\dots,n$; $EX_i=\mu+\delta\sigma$, $i=n+1,\dots,N$; $\mathrm{var}(X_i)=\sigma^2$, $i=1,\dots,N$, and $n/N\to\lambda\in(0,1)$ if $N\to\infty$. Then,

$\lim_{N\to\infty}N^{-1}\sum_{i=1}^N(X_i-\bar X_N)^2 = \sigma^2\bigl(1+\lambda(1-\lambda)\delta^2\bigr)$ a.s.

The proof is straightforward and will be omitted.

Theorem 2.4.5.

Let $X_i\sim N(\mu,\sigma^2)$, $i=1,\dots,n$; $X_i\sim N(\mu+\delta\sigma,\sigma^2)$, $i=n+1,\dots,N$; all $X_i$ independent, $\mu\in\mathbb{R}$, $\sigma>0$, $\delta>0$ and $n/N\to\lambda\in(0,1)$. Take $S_N^{(2)}$ as in theorem 2.4.3 and suppose

$\lim_{N\to\infty}N^{-1}\sum_{i=n+1}^N d_{N,i} = \xi_\lambda < \infty.$

Then,

$\lim_{N\to\infty}N^{-1/2}S_N^{(2)} = \frac{\delta\xi_\lambda}{\bigl(1+\lambda(1-\lambda)\delta^2\bigr)^{1/2}}$ a.s.

Proof: Directly from theorem 2.3.2 and lemma 2.4.4. □

The next theorem combines the results of theorems 2.4.3 and 2.4.5 according to theorem 1.4.1.

Theorem 2.4.6.
Let $X_i$, $S_N^{(2)}$ and $\xi_\lambda$ be as in theorem 2.4.5; then the Bahadur slope of $S_N^{(2)}$ at the alternative $(\delta,\lambda)$ equals

$-\log\Bigl(1-\frac{\delta^2\xi_\lambda^2}{1+\lambda(1-\lambda)\delta^2}\Bigr).$

Just as in the known-variance case, the monotonicity of the $d_{N,i}$ has not been used in the foregoing theorems. Results similar to those stated in theorem 2.3.4 and corollary 2.3.5 can easily be shown to hold here, by application of the next corollary.

Corollary 2.4.7.
Let $S_N^{(1)}$ be defined by (2.3.2) and $S_N^{(2)}$ by (2.4.3), with the same $d_{N,i}$ coefficients, and suppose $N^{-1}\sum_{i=n+1}^N d_{N,i}\to\xi_\lambda$. Then,

$\log\bigl[1+\delta^2\lambda(1-\lambda)\bigr] - BS(S_N^{(2)};\delta,\lambda) = \log\bigl[1+\delta^2\lambda(1-\lambda) - BS(S_N^{(1)};\delta,\lambda)\bigr],$

where $BS(S_N^{(i)};\delta,\lambda)$ denotes the Bahadur slope of $\{S_N^{(i)}\}$, $i=1,2$, at $(\delta,\lambda)$.

Proof: Directly from theorems 2.3.3 and 2.4.6. □
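The identity of corollary 2.4.7 is a purely algebraic relation between the two slope formulas, and can be checked numerically. A sketch (our own code; the value of $\xi_\lambda$ is arbitrary subject to $\xi_\lambda^2\le\lambda(1-\lambda)$):

```python
import math

def bs1(delta, xi):
    # slope of S_N^(1): delta^2 xi_lambda^2
    return delta ** 2 * xi ** 2

def bs2(delta, lam, xi):
    # slope of S_N^(2) (theorem 2.4.6)
    return -math.log(1 - delta ** 2 * xi ** 2 / (1 + lam * (1 - lam) * delta ** 2))

for delta in [0.5, 1.0, 2.0]:
    for lam in [0.2, 0.5, 0.7]:
        xi = 0.8 * math.sqrt(lam * (1 - lam))   # any xi with xi^2 <= lam(1-lam)
        lhs = math.log(1 + delta ** 2 * lam * (1 - lam)) - bs2(delta, lam, xi)
        rhs = math.log(1 + delta ** 2 * lam * (1 - lam) - bs1(delta, xi))
        assert abs(lhs - rhs) < 1e-12
```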


2.4.2. Max-type statistics

In this section, we deal with max-type statistics based on (2.4.1) and (2.4.2). Contrary to the sum-type statistics, max-type statistics based on (2.4.1) can be handled. For the sake of completeness, those based on (2.4.2) will be taken into consideration too. Thus, with $T_{N,k}$ and $T'_{N,k}$ according to (2.4.1) and (2.4.2),

(2.4.9) $M_N^{(2)} := \max_{1\le k<N}c_{N,k}T_{N,k}$

and

(2.4.10) $\bar M_N^{(2)} := \max_{1\le k<N}c_{N,k}T'_{N,k}.$

Again, with $c_{N,k}=1$, $M_N^{(2)}$ equals the likelihood-ratio statistic for our problem. And, due to the equivalence of $T_{N,k}$ and $T'_{N,k}$ (see remark 2.4.1), if $c_{N,k}=1$, $M_N^{(2)}$ and $\bar M_N^{(2)}$ are equivalent too (Worsley, 1979).

Once again, we start with a large deviation theorem.

Theorem 2.4.8.
Let $X_i\sim N(\mu,\sigma^2)$, $i=1,\dots,N$, all $X_i$ independent, $\mu\in\mathbb{R}$, $\sigma^2>0$. Let $M_N^{(2)}$ and $\bar M_N^{(2)}$ be defined by (2.4.9) and (2.4.10). Suppose $\lim_{N\to\infty}\max_{1\le k<N}c_{N,k}=\gamma_{\max}<\infty$; then,

(2.4.11) $\lim_{N\to\infty}-N^{-1}\log\bigl[\sup\{P(M_N^{(2)}\ge N^{1/2}t);\mu\in\mathbb{R}\}\bigr] = \tfrac12\log\Bigl(1+\frac{t^2}{\gamma_{\max}^2}\Bigr)$

and

(2.4.12) $\lim_{N\to\infty}-N^{-1}\log\bigl[\sup\{P(\bar M_N^{(2)}\ge N^{1/2}t);\mu\in\mathbb{R}\}\bigr] = -\tfrac12\log\Bigl(1-\frac{t^2}{\gamma_{\max}^2}\Bigr).$

Proof: For all $k$, $1\le k<N$, $T_{N,k}$ has a t-distribution with $N-2$ degrees of freedom; thus,

$\lim_{N\to\infty}-N^{-1}\log P(T_{N,k}\ge N^{1/2}t) = \tfrac12\log(1+t^2)$

(see Bahadur, 1971, example 5.1), which, along with theorem 2.2.3, leads to (2.4.11).

To prove (2.4.12), first note that $(N-1)^{-1/2}T'_{N,k}$ is a special case of $N^{-1/2}S_N^{(2)}$ (compare with example 2.3.2), one consequence (corollary 2.4.2) being that the distribution of $T'_{N,k}$ is independent of $k$; thus, theorem 2.2.3 applies again. Secondly, by theorem 2.4.3,

$\lim_{N\to\infty}-N^{-1}\log P\bigl((N-1)^{-1/2}T'_{N,k}\ge t\bigr) = -\tfrac12\log(1-t^2),$

and applying theorem 2.2.3 completes the proof. □

Recall the definition of the $\gamma_N$ functions, $\gamma_N(u):=c_{N,[Nu]}$ with $c_{N,0}=c_{N,N}=0$, and introduce, analogous to $b_1(u;\lambda)$,

(2.4.13) $b_2(u;\delta,\lambda) := \frac{\delta\,b_1(u;\lambda)}{\bigl(1+\delta^2\lambda(1-\lambda)-\delta^2b_1^2(u;\lambda)\bigr)^{1/2}}.$

Theorem 2.4.9.
Let $X_i\sim N(\mu,\sigma^2)$, $i=1,\dots,n$; $X_i\sim N(\mu+\delta\sigma,\sigma^2)$, $i=n+1,\dots,N$; all $X_i$ independent, $\delta>0$, $\sigma^2>0$, $\mu\in\mathbb{R}$ and $n/N\to\lambda\in(0,1)$. Let $M_N^{(2)}$ and $\bar M_N^{(2)}$ be defined by (2.4.9) and (2.4.10). Suppose a function $\gamma:[0,1]\to\mathbb{R}^+$ exists such that $\gamma\in D$ and $\gamma_N\to\gamma$ in the Skorokhod topology. Then,

$\lim_{N\to\infty}N^{-1/2}M_N^{(2)} = \max_{0\le u\le1}\gamma(u)\,b_2(u;\delta,\lambda)$ a.s.

and

$\lim_{N\to\infty}N^{-1/2}\bar M_N^{(2)} = \max_{0\le u\le1}\frac{\delta\gamma(u)\,b_1(u;\lambda)}{\bigl(1+\lambda(1-\lambda)\delta^2\bigr)^{1/2}}$ a.s.

The proof follows the same line of argument as used to prove theorem 2.3.7 and hence will be omitted. □

Again (see remark 2.3.4), the weight functions are bounded; therefore, assume that $\max_{0\le u\le1}\gamma(u)=1$.

Let xi""N(v, ) , i=1, •.. ,n; x."'N(JJ+6cr,cr ) I i=n+1, ••• ,N;

2 ~

all X. independent, 6>0, a >0, 1JEJR and n/N-+A.E(0,1).

Then,~the

Bahadur slope of

{~

2

)}

at the alternative

(6,).) equals

and, that of

{~

2

)}

equals

(47)

Proof: Directly from theorems 1.4.1, 2.4.8 and 2.4.9. □

Example 2.4.1.

As said previously, $M_N^{(2)}$ with $c_{N,k}=1$ for all $k$ is the likelihood ratio statistic for the c.p.p. According to theorem 2.4.10, we can find the Bahadur slope at $(\delta,\lambda)$:

$\log\Bigl(1+\bigl(\max_{0\le u\le1}b_2(u;\delta,\lambda)\bigr)^2\Bigr) = \log\bigl(1+b_2^2(\lambda;\delta,\lambda)\bigr) = \log\bigl(1+\delta^2\lambda(1-\lambda)\bigr).$

Again, as in example 2.3.3, this slope equals Raghavachari's upper bound, showing the optimality of the likelihood ratio test. Furthermore, note that since $M_N^{(2)}$ and $\bar M_N^{(2)}$ are equivalent when $c_{N,k}=1$ for all $k$, the latter is Bahadur optimal too. □

Remark 2.4.3.

Again, these two statistics are the only ones within the classes $M_N^{(2)}$ and $\bar M_N^{(2)}$ that are optimal for all $(\delta,\lambda)$; and, just as in the known-variance case, $\{M_N^{(2)}\}$ (or $\{\bar M_N^{(2)}\}$) is optimal for some alternative $(\delta,\lambda)$ if and only if $\gamma(\lambda)=\max_{0\le u\le1}\gamma(u)$. □

From theorems 2.3.8 and 2.4.10, it is clear that the relation between the Bahadur slopes of $S_N^{(1)}$ and $S_N^{(2)}$, as stated in corollary 2.4.7, holds for the statistics $M_N^{(1)}$ and $\bar M_N^{(2)}$ too.

Corollary 2.4.11.
Let $M_N^{(1)}$ be defined by (2.3.8) and $\bar M_N^{(2)}$ by (2.4.10). Suppose both have the same weight coefficients $c_{N,k}$, and the necessary convergence conditions are met. Then,

$\log\bigl[1+\delta^2\lambda(1-\lambda)\bigr] - BS(\bar M_N^{(2)};\delta,\lambda) = \log\bigl[1+\delta^2\lambda(1-\lambda) - BS(M_N^{(1)};\delta,\lambda)\bigr].$ □

Of course, it is interesting to consider the efficiency of an $M_N^{(2)}$ statistic with respect to the $\bar M_N^{(2)}$ statistic with the same weights $c_{N,k}$.

Theorem 2.4.12.
Let $M_N^{(2)}$ be defined by (2.4.9) and $\bar M_N^{(2)}$ by (2.4.10). Suppose that both have the same weights $c_{N,k}$, and the necessary convergence conditions are met; then, for all $(\delta,\lambda)$,

$BS(M_N^{(2)};\delta,\lambda) \ge BS(\bar M_N^{(2)};\delta,\lambda),$

with equality if and only if $\gamma(u)b_1(u;\lambda)$ and $\gamma(u)b_2(u;\delta,\lambda)$ attain their maxima for the same $u=u^*$ and $\gamma(u^*)=1$.

Proof: First note that

$b_2(u;\delta,\lambda) = \frac{\delta\,b_1(u;\lambda)}{\bigl(1+\delta^2\lambda(1-\lambda)-\delta^2b_1^2(u;\lambda)\bigr)^{1/2}};$

thus, since $\gamma(u)\le1$ and $x\mapsto x\bigl(1+\delta^2\lambda(1-\lambda)-x^2\bigr)^{-1/2}$ is increasing,

$\max_u\gamma(u)b_2(u;\delta,\lambda) \ge \max_u\frac{\delta\gamma(u)b_1(u;\lambda)}{\bigl(1+\delta^2\lambda(1-\lambda)-\delta^2\gamma^2(u)b_1^2(u;\lambda)\bigr)^{1/2}} = \frac{\delta\max_u\gamma(u)b_1(u;\lambda)}{\Bigl(1+\delta^2\lambda(1-\lambda)-\delta^2\bigl(\max_u\gamma(u)b_1(u;\lambda)\bigr)^2\Bigr)^{1/2}},$

and since $BS(M_N^{(2)};\delta,\lambda)=\log\bigl(1+(\max_u\gamma(u)b_2(u;\delta,\lambda))^2\bigr)$, while $BS(\bar M_N^{(2)};\delta,\lambda)=\log\bigl(1+\delta^2\lambda(1-\lambda)\bigr)-\log\bigl(1+\delta^2\lambda(1-\lambda)-\delta^2(\max_u\gamma(u)b_1(u;\lambda))^2\bigr)$, this proves the theorem. □
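The inequality of theorem 2.4.12 can be illustrated numerically for an arbitrary weight function with maximum one (here $\gamma(u)=\sin\pi u$, our own choice):

```python
import math

def b1(u, lam):
    return math.sqrt(u * (1 - u)) * min((1 - lam) / (1 - u), lam / u)

def b2(u, delta, lam):
    c = 1 + delta ** 2 * lam * (1 - lam)
    x = delta * b1(u, lam)
    return x / math.sqrt(c - x * x)

delta, lam = 1.5, 0.35
gamma = lambda u: math.sin(math.pi * u)         # an arbitrary weight, max = 1
grid = [i / 2000 for i in range(1, 2000)]
m = max(gamma(u) * b2(u, delta, lam) for u in grid)
mbar = max(gamma(u) * delta * b1(u, lam) for u in grid) \
       / math.sqrt(1 + delta ** 2 * lam * (1 - lam))
bs_m = math.log(1 + m * m)                      # slope of M_N^(2)
bs_mbar = -math.log(1 - mbar * mbar)            # slope of bar M_N^(2)
assert bs_m >= bs_mbar - 1e-9
```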


2.5. Some comparisons and examples

In this final section, the results of the foregoing sections are applied to derive the slopes and efficiencies of some specific statistics from the general classes under consideration. Figure 2.5.1 presents the efficiencies of four $S_N^{(1)}$-statistics (see table 2.5.1) with respect to Raghavachari's upper bound, as a function of $\lambda$. It should be recalled that these efficiencies do not depend on $\delta$, the size of the change. This figure clearly complies with theorem 2.3.4 and corollary 2.3.5.

Figure 2.5.1. Bahadur efficiency of $S_N^{(1)}$-statistics: ○ Chernoff-Zacks; △ $c_{N,k}$ constant; + $T_{N,k_1}$, $k_1/N\to0.3$; × $T_{N,k_1}+T_{N,N-k_1}$, $k_1/N\to0.3$.


Table 2.5.1. Bahadur slopes of $S_N^{(1)}$-statistics (weights; slope at $(\delta,\lambda)$):
- Chernoff-Zacks (Ex. 2.3.1): $3\delta^2\lambda^2(1-\lambda)^2$;
- $c_{N,k}=c$, $k=1,\dots,N-1$: $4\delta^2(\pi^2-8)^{-1}\bigl\{-\sqrt{\lambda(1-\lambda)}+(1-\lambda)\arccos\sqrt{1-\lambda}+\lambda\arccos\sqrt{\lambda}\bigr\}^2$;
- $c_{N,k}=c$ for $k=k_1$, $c_{N,k}=0$ for $k\ne k_1$; $k_1/N\to\lambda_1$: $\delta^2(1-\lambda_1)\lambda_1^{-1}\lambda^2$ for $\lambda\le\lambda_1$, and $\delta^2\lambda_1(1-\lambda_1)^{-1}(1-\lambda)^2$ for $\lambda>\lambda_1$;
- $c_{N,k}=c$ for $k=k_1$ or $N-k_1$, $c_{N,k}=0$ elsewhere; $k_1/N\to\lambda_1\le\tfrac12$: $\tfrac12\delta^2\lambda_1^{-1}\lambda^2$ for $\lambda\le\lambda_1$; $\tfrac12\delta^2\lambda_1$ for $\lambda_1\le\lambda\le1-\lambda_1$; $\tfrac12\delta^2\lambda_1^{-1}(1-\lambda)^2$ for $\lambda\ge1-\lambda_1$.
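The tabulated slopes can be compared with Raghavachari's upper bound $\delta^2\lambda(1-\lambda)$ directly; the efficiencies plotted in figure 2.5.1 are the corresponding ratios. A sketch (our own code):

```python
import math

def slope_cz(delta, lam):                       # Chernoff-Zacks weights
    return 3 * delta ** 2 * lam ** 2 * (1 - lam) ** 2

def slope_const(delta, lam):                    # constant weights c_{N,k} = c
    g = (-math.sqrt(lam * (1 - lam))
         + (1 - lam) * math.acos(math.sqrt(1 - lam))
         + lam * math.acos(math.sqrt(lam)))
    return 4 * delta ** 2 * g * g / (math.pi ** 2 - 8)

def slope_two_sample(delta, lam, lam1):         # T_{N,k_1}, k_1/N -> lam1
    return delta ** 2 * min(lam * (1 - lam1), lam1 * (1 - lam)) ** 2 \
           / (lam1 * (1 - lam1))

delta = 1.0
for lam in [0.1, 0.3, 0.5, 0.7, 0.9]:
    bound = delta ** 2 * lam * (1 - lam)        # Raghavachari's upper bound
    for s in (slope_cz(delta, lam), slope_const(delta, lam),
              slope_two_sample(delta, lam, 0.3)):
        assert 0 < s <= bound + 1e-12
```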

According to corollary 2.4.7, the mutual relations between the corresponding $S_N^{(2)}$-statistics (i.e. with the same weight functions) are similar; see figure 2.5.2. However, they now depend on $\delta^2$. For $\delta^2\to0$, the efficiency of an $S_N^{(2)}$-test (w.r.t. Raghavachari's upper bound) tends to the efficiency of the $S_N^{(1)}$-test with the corresponding weights. And, if $\delta^2\to\infty$, this efficiency goes to zero for all $\lambda$ for which $S_N^{(2)}$ is not optimal, i.e. for all $\lambda$ where $\xi_\lambda^2<\lambda(1-\lambda)$. This effect is brought about by the bad behaviour, for large $\delta$-values, of the variance estimator $s_N^2$.

In figure 2.5.3 the efficiency of the $S_N^{(2)}$-statistic with the Chernoff-Zacks weights is presented for several $\delta^2$-values. It appears that the difference from the known-variance case remains small for small and moderate $\delta^2$.


Figure 2.5.2. Bahadur efficiency of $S_N^{(2)}$-statistics, $\delta^2=1$; same weights as in fig. 2.5.1.

Figure 2.5.3. Bahadur efficiency of the $S_N^{(2)}$-statistic with Chernoff-Zacks weights, for several values of $\delta^2$.


Figure 2.5.4 shows the loss in efficiency because $\sigma^2$ is unknown. Again with the Chernoff-Zacks weights, the efficiency of the $S_N^{(2)}$-statistic with respect to the $S_N^{(1)}$-statistic is given for some values of $\delta^2$. Although this efficiency tends to 1 when $\delta^2\to0$ (or $\lambda\to0$, or $\lambda\to1$), even for small values of $\delta^2$ (or $\lambda$-values close to 0 or 1) the loss in efficiency is considerable.

For the $M_N^{(1)}$-statistics, we easily get from theorem 2.3.8 that if $\gamma(\lambda_1)=1$ $(=\max_u\gamma(u))$, then:
- for all alternatives $(\delta,\lambda)$ with $\lambda<\lambda_1$, the Bahadur slope is independent of $\gamma(u)$ for $u>\lambda_1$;
- provided that $\gamma(u)\le\bigl(\frac{(1-\lambda_1)u}{\lambda_1(1-u)}\bigr)^{1/2}$ for $u\in(0,\lambda_1]$, it does not depend on $\gamma(u)$ for $u<\lambda_1$ either.

Of course, a similar statement holds for alternatives with $\lambda>\lambda_1$. This property is illustrated in figure 2.5.5 (see also table 2.5.2). For $\lambda<\lambda_1=0.3$, the included statistics all have the same slope. (Moreover, the two-sample statistic $T_{N,k}$, with $k/N\to\lambda_1$, also has this slope for $\lambda<\lambda_1$; see table 2.5.1.)

Table 2.5.2. Bahadur slopes of $M_N^{(1)}$-statistics (weights; slope at $(\delta,\lambda)$):
- $\gamma_1(u)=\bigl(\frac{(1-\lambda_1)u}{\lambda_1(1-u)}\bigr)^{1/2}$ for $0<u\le\lambda_1$, $\gamma_1(u)=\frac{1-u}{1-\lambda_1}$ for $\lambda_1<u<1$: slope $\delta^2\lambda^2(1-\lambda_1)\lambda_1^{-1}$ for $0<\lambda\le\lambda_1$; $\delta^2\lambda(1-\lambda)^3(1-\lambda_1)^{-2}$ for $\lambda_1<\lambda\le\tfrac12$; $\tfrac14\delta^2(1-\lambda)^2(1-\lambda_1)^{-2}$ for $\tfrac12<\lambda<1$;
- $\gamma_2(u)=\bigl(\frac{(1-\lambda_1)u}{\lambda_1(1-u)}\wedge\frac{\lambda_1(1-u)}{(1-\lambda_1)u}\bigr)^{1/2}$: slope $\delta^2(1-\lambda_1)\lambda_1^{-1}\bigl(\lambda\wedge\frac{\lambda_1(1-\lambda)}{1-\lambda_1}\bigr)^2$.
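For a weight function of the first type, the piecewise slope can be recovered by numerically maximizing $\gamma_1(u)b_1(u;\lambda)$ over $u$, as theorem 2.3.8 prescribes. A sketch (our own code, with $\lambda_1=0.3$ and the weight function $\gamma_1$ as written here):

```python
import math

lam1 = 0.3

def gamma1(u):
    if u <= lam1:
        return math.sqrt((1 - lam1) * u / (lam1 * (1 - u)))
    return (1 - u) / (1 - lam1)

def b1(u, lam):
    return math.sqrt(u * (1 - u)) * min((1 - lam) / (1 - u), lam / u)

def slope_num(delta, lam):
    # Bahadur slope via theorem 2.3.8: delta^2 (max_u gamma(u) b1(u; lam))^2
    grid = [i / 5000 for i in range(1, 5000)]
    return (delta * max(gamma1(u) * b1(u, lam) for u in grid)) ** 2

def slope_tab(delta, lam):
    # piecewise slope for gamma_1
    if lam <= lam1:
        return delta ** 2 * lam ** 2 * (1 - lam1) / lam1
    if lam <= 0.5:
        return delta ** 2 * lam * (1 - lam) ** 3 / (1 - lam1) ** 2
    return delta ** 2 * (1 - lam) ** 2 / (4 * (1 - lam1) ** 2)

for lam in [0.1, 0.25, 0.4, 0.5, 0.75]:
    assert abs(slope_num(1.0, lam) - slope_tab(1.0, lam)) < 1e-3
```

For $\lambda<\lambda_1$ the product $\gamma_1(u)b_1(u;\lambda)$ is constant on $[\lambda,\lambda_1]$, which is exactly the plateau behind the "same slope for $\lambda<\lambda_1$" statement above.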


Figure 2.5.4. Bahadur efficiency of the $S_N^{(2)}$- with respect to the $S_N^{(1)}$-statistic with Chernoff-Zacks weights; same $\delta^2$ values as in fig. 2.5.3.

Figure 2.5.5. Bahadur efficiency of $M_N^{(1)}$-statistics, weight functions as in table 2.5.2; ○ $\gamma_1(u)$; + $\gamma_2(u)$.
