
Academic year: 2021


Errors in Fully Parametric Hazards

Constructing a consistent estimator using moment conditions

Bas Rekveldt

April 19, 2007

University of Groningen


ABSTRACT

Measurement errors occur whenever observations are made. Econometric methods have been developed to overcome the problems caused by these measurement errors in many fields of research. For transition data models these methods are all based on a semi-parametric method introduced by Cox (1972, 1975), the partial likelihood method. Fully parametric methods play a less prominent role and are used less often. The aim of this thesis is to develop and examine a fully parametric estimation method, based on the Method of Moments, for transition data models subject to measurement errors.

CONTENTS

1. Introduction
2. Hazard Models
2.1 Hazard Models
2.2 Hazard Functions
2.3 Duration Dependence
2.4 Model Specification
2.5 Incomplete Observations
3. Measurement Errors
3.1 Origin of Measurement Errors
3.2 Consequences of Measurement Errors
3.3 Estimation Methods
4. Constant Hazard Model
4.1 Maximum Likelihood
4.2 Least Squares
4.3 Comparing Estimators
5. Including Measurement Errors
5.1 Least Squares
5.2 Distributional Consequences
5.3 Heterogeneity
5.4 Consistent Estimation
5.5 Properties of the Estimators
5.6 Testing for Measurement Error
5.7 Improving the Estimators
5.8 Maximum Likelihood
6. Including Censoring
6.1 Censoring without Measurement Errors
6.2 Censoring with Measurement Errors
7. Conclusion
Appendix
A. Constants, Distributions and Asymptotics
A.1 Mathematical Constants
A.2 Statistical Distributions
A.3 Slutsky’s Theorem
A.4 Central Limit Theorem
B. Mathematical Derivations
B.1 Derivations for Chapter 4
B.2 Derivations for Chapter 5
B.3 Derivations for Chapter 6
C. Matrices
C.1 Matrix Ψ from section 5.5
C.2 Matrix G from section 6.2.1


1. INTRODUCTION

Measurement errors occur in all branches of econometrics, and transition data are no exception. Often these errors are inherent to the method of data collection used. During a survey, subjects can have difficulty recollecting the precise information of interest, or the interviewer needs to interpret an answer. At other times the measurement errors are deliberately introduced into the dataset to ensure a higher degree of confidentiality (Brand, 2002). Whatever the source of the measurement errors, problems are sure to arise if they are not dealt with appropriately (Cochran, 1968 and Chesher, 1991). Therefore, methods have been developed to handle these problems. Most common among these methods is the use of instrumental variables (Buzas & Stefanski, 1996 and Lewbel, 1997), although others exist as well, such as non-parametric Maximum Likelihood estimation (Aitkin & Rocci, 2002). Schneeweiss and Augustin (2006) give an overview of more recent advances in this field of research.

The usual method of estimation in transition data models is by partial likelihood (Cox, 1972 and Cox, 1975). This is a semi-parametric method that is easy to implement for proportional hazard models and eliminates the baseline hazard function from the estimation procedure. With it, this method also eliminates the need for assumptions on the shape of the error distribution, since it is disregarded in the estimation procedure. After the parameters in the model are estimated by this procedure it is possible to construct a non-parametric estimate for the baseline hazard function and thus for the underlying distribution (Cox & Oakes, 1985). The use of such a semi-parametric method in a model with measurement errors eliminates the need to control for the effect of the measurement errors on the disturbance term and simplifies matters considerably in most cases. The classical papers that looked into these types of models are Prentice (1982), from a structural point of view, and Nakamura (1992), from a functional one. Although both papers provide somewhat negative results, they have formed the basis of continuing research by others, such as Tsiatis and Davidian (2001). This has led to several different estimation methods to account for measurement errors in transition data models by using semi-parametric methods. Two of these methods are Non-Parametric Maximum Likelihood Estimation (P. Hu, Tsiatis, & Davidian, 1998 and Dupuy, 2004) and the Corrected Likelihood Approach (C. Hu & Lin, 2002 and Augustin, 2004). Both of these methods provide consistent estimators for the parameters of interest in transition data models with measurement errors.


method are severely limited in their application to new observations to make predictions. If the covariates of these new observations take on values outside of the range of the covariates used in the estimation of the semi-parametric model, the predictions will be biased and thus less useful.

This thesis aims at developing a fully parametric method of estimation to be used in a transition data model with measurement errors. This estimation method will be based on the Method of Moments (MM), which provides a simple and elegant approach to parameter estimation in regression models. Furthermore, MM estimators are consistent by construction, using Slutsky’s theorem, under general conditions. The presence of measurement errors in a normal regression model often leads to identification problems for the estimation procedure. However, due to the specific form of the linearised transition data model, the distribution of the disturbance term is explicitly specified. This additional information, inherent to the model, can be used during the estimation procedure to overcome some of the usual problems encountered when measurement errors are introduced in the model.

Transition data models form the foundation of this research. Therefore, chapter 2 gives a short introduction to this type of modelling and discusses some of the most common problems encountered in such models. Similarly, chapter 3 gives a short overview of measurement errors. Chapter 4 then starts off with one of the simplest transition data models, the constant hazard model, and examines several estimation procedures that are generally used for such a model. Chapter 5 adds measurement errors to the model introduced in the previous chapter. Furthermore, it discusses the problems encountered when using the estimation methods already discussed in chapter 4 and introduces a new estimation method based on MM that is consistent and easy to use. In addition, the chapter explains how this MM estimator can be improved upon by using more of the information available to the researcher. As a last extension to the constant hazard model to be dealt with in this thesis, chapter 6 introduces censoring to the model. Estimation methods to be used in this case are discussed both in the absence and presence of measurement errors in the model. Finally, chapter 7 concludes this thesis and presents several suggestions for future research in this area.


2. HAZARD MODELS

This chapter will give a general introduction to hazard models. The main questions that will be addressed in this chapter are: “What is a hazard model?”, “When are hazard models used?” and “What kind of problems might be encountered when working with hazard models?”. The theory and methods described in this chapter and the next will form the foundation for the analysis performed in chapters 4 and 5.

2.1 Hazard Models

Hazard models are used in a wide variety of studies, ranging from biostatistics (Baker, 1998) and engineering to economics (Lancaster, 1979 and Czado & Rudolph, 2002). In all these fields of research hazard models are used to describe risks: the risk of a patient dying, the risk of machine failure, the risk of getting a job or the risk of accidents. All of these risks describe some event occurring, and the data used in hazard models usually come in the form of lengths of time. Therefore, it is important to have an understanding of a duration.

Defining a duration exactly requires a starting time of the duration (beginning), a time scale (unit of measurement) and a clear definition of the event that ends the duration (end). Most problems encountered when working with transition data are concerned with the starting and end time of a duration, which are described in more detail in section 2.5. Considering an example from economics, unemployment spells, the starting time of a duration would be defined as the moment a person is registered as unemployed at the appropriate institution. The time scale can be defined in terms of years, months or even days. This choice is highly dependent on the available data. Finally, as a definition for the event that ends a duration, becoming employed seems a very natural choice.

When dealing with durations, normal regression models are often inadequate, so other types of models need to be considered. The models used to analyse transition data are often referred to as hazard models, named after the function used to specify a distribution for the durations, the hazard function. Although there are quite a few ways to specify a hazard model, the two most common specifications are proportional hazard and accelerated lifetime models, which will be described in section 2.4.

2.2 Hazard Functions

... ending a duration at time t, given that the duration lasted until time t. The hazard function in terms of probabilities is given by

$$\lambda(t) = \lim_{dt \to 0} \frac{P\{t \le T < t + dt \mid T \ge t\}}{dt}.$$

Another way to specify the hazard function is in terms of the density, f, and distribution, F, functions of a distribution,

$$\lambda(t) = \frac{f(t)}{1 - F(t)},$$

which is equal to the inverse of Mills’ ratio if the distribution considered is the normal distribution (Mills, 1926).

As its name suggests, the hazard function can be interpreted as the risk that an observation is exposed to at time t. For example, in the setting of a labour market, it is the risk that an unemployed worker gets a job at time t, thus ending his duration of unemployment. For more information on hazard functions, see Kiefer (1988), Lancaster (1990) and Hosmer and Lemeshow (1999).

Besides the hazard function, it is often useful to define another function as well, called the integrated hazard function. This function has no useful interpretation but is often used in calculations. It is defined by

$$\Lambda(t) = \int_0^t \lambda(u)\,du.$$

Although it is not a probability, it has a very nice relationship with the distribution function:

$$F(t) = 1 - \exp(-\Lambda(t)).$$
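As a numerical illustration of these definitions, the following Python sketch checks the relationship F(t) = 1 − exp(−Λ(t)) and the identity λ(t) = f(t)/(1 − F(t)) for a Weibull hazard. The Weibull form, the shape parameter ALPHA and all function names are assumptions made for this example only; they are not taken from the thesis.

```python
import math

ALPHA = 1.5  # hypothetical Weibull shape parameter, chosen only for illustration


def hazard(t: float) -> float:
    """Weibull hazard with unit scale: lambda(t) = alpha * t**(alpha - 1)."""
    return ALPHA * t ** (ALPHA - 1.0)


def integrated_hazard(t: float, steps: int = 10_000) -> float:
    """Approximate Lambda(t), the integral of lambda(u) over [0, t], by the midpoint rule."""
    width = t / steps
    return sum(hazard((i + 0.5) * width) for i in range(steps)) * width


def distribution(t: float) -> float:
    """F(t) = 1 - exp(-Lambda(t))."""
    return 1.0 - math.exp(-integrated_hazard(t))


def density(t: float, h: float = 1e-5) -> float:
    """f(t), approximated by a central finite difference of F."""
    return (distribution(t + h) - distribution(t - h)) / (2.0 * h)


t = 0.8
closed_form_F = 1.0 - math.exp(-t ** ALPHA)        # known Weibull distribution function
ratio = density(t) / (1.0 - distribution(t))       # f(t) / (1 - F(t))

print(distribution(t), closed_form_F)  # the two values should agree closely
print(ratio, hazard(t))                # the ratio should reproduce the hazard
```

The integral and the derivative are approximated numerically on purpose, so that the check does not rely on the closed-form expressions it is meant to confirm.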

2.3 Duration Dependence

An important property of a hazard function is its duration dependence. This describes how a hazard function behaves during a duration. A positive duration dependence would mean that the hazard function increases in time, so the risk to which an observation is exposed increases as the duration becomes longer. Again using an example from the labour market: an unemployed worker gets more education with time during his unemployment. With that he would increase his chances of finding a job and thus increase the risk that he will no longer be unemployed the longer he is unemployed. A hazard function is said to exhibit positive duration dependence at some time $t^*$ if and only if

$$\frac{\partial \lambda}{\partial t} > 0 \quad \text{at } t = t^*.$$

A negative duration dependence would mean exactly the opposite. The longer a worker is unemployed the less effort he spends on finding a new job and the lower the risk of losing unemployment status becomes. Thus the hazard function decreases in time. So a hazard function exhibits negative duration dependence if and only if

$$\frac{\partial \lambda}{\partial t} < 0 \quad \text{at } t = t^*.$$

In the special case where this derivative is exactly equal to zero we are in a case of no duration dependence, which is one of the features of the exponential distribution.
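The three cases can be illustrated with a small Python sketch that classifies the sign of ∂λ/∂t at a given time. The Weibull hazard, its shape parameter alpha and the function names are hypothetical choices for this example; a shape of 1 corresponds to the exponential case with no duration dependence.

```python
def weibull_hazard(t: float, alpha: float) -> float:
    """Weibull hazard with unit scale; alpha is a hypothetical shape parameter."""
    return alpha * t ** (alpha - 1.0)


def duration_dependence(alpha: float, t_star: float, h: float = 1e-6) -> str:
    """Classify the sign of d(lambda)/dt at t_star using a central difference."""
    slope = (weibull_hazard(t_star + h, alpha) - weibull_hazard(t_star - h, alpha)) / (2.0 * h)
    if slope > 1e-9:
        return "positive"
    if slope < -1e-9:
        return "negative"
    return "none"


for alpha in (1.5, 0.5, 1.0):
    print(alpha, duration_dependence(alpha, t_star=2.0))
```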

2.4 Model Specification

Often there are some covariates (ξ) available to model the durations under consideration and these should be included in the model. The most direct approach would be to allow the covariates to affect the hazard function directly, i.e., λ(t, ξ, β) instead of λ(t). However, to be able to use parametric estimation methods, either semi or fully parametric, with such a specification, more information is required. Therefore, a transformation ϕ(ξ, β) is often considered, which influences the hazard function. To avoid the need for restrictions on the values of the parameters to be estimated (β), often an exponential form is chosen for ϕ, i.e.,

$$\varphi(\xi, \beta) = \exp(\xi\beta),$$

although other specifications are certainly possible. Two of the most common approaches to modelling transition data are described in this section.

2.4.1 Proportional Hazard Specification

The name proportional hazard comes directly from the manner in which the model is specified. The covariates adjust the baseline hazard function proportionally and thus influence the probability that a duration ends. This approach leads to a specification of the form

$$\lambda(t; \xi, \beta) = \lambda_0(t)\,\varphi(\xi, \beta).$$

2.4.2 Accelerated Lifetime Specification

Another approach would be an accelerated lifetime specification. In this case the origin of the name is also easy to see. The covariates adjust the time directly, by accelerating time for some observations and decelerating for others. In this case the covariates do not alter the probability of ending a duration, but the length of the duration. This leads to a specification of the form

$$\lambda(t; \xi, \beta) = \lambda_0\big(t\,\varphi(\xi, \beta)\big).$$

2.4.3 Differences in Specifications

Both specifications have their own merits. The most notable among these become apparent after rewriting the model in linear form. In this linear model the proportional hazard specification allows for a very broad range of transformations of the durations t. However, the distribution of the disturbance term is very specific, namely an extreme value distribution, as described in Appendix A.2.1. On the other hand, after linearisation the accelerated lifetime specification allows for quite a general distribution of the disturbance term, which comes at the cost of a very restrictive transformation of the durations. The choice of model specification will depend largely on the phenomenon under consideration and the preferences of the researcher. However, there is one case in which both specifications are exactly the same, namely the case where the distribution of the durations $t_n$ is assumed to be exponential. The


mathematically and it is memoryless, thus having a constant hazard rate, i.e., no duration dependence. It is this special case that will be examined in the rest of this thesis.

2.5 Incomplete Observations

One of the things that makes working with transition data troublesome is incomplete observations. These incomplete observations can lead to bias in the resulting estimates if they are discarded or treated in the same way as complete observations. By discarding the incomplete observations a lot of information that could have been used is thrown away, which will affect the results of the estimators. On the other hand, treating incomplete observations as if they were complete will also lead to undesired inconsistencies in the resulting estimates. Therefore methods, based on maximum likelihood estimation, were developed to address these problems. There are two different forms of incomplete observations, which will be explained in this section.

2.5.1 Censoring

An observation is called censored if either the starting or the ending time is unknown. If the starting time is unknown, the observation is called left censored and for an unknown ending time right censored. This censoring can have different causes, dependent on the type of data available. In the case of data on unemployment durations it might be that an unemployed person moves from one place to another and disappears from a regional study this way. The most common form of censoring is right censoring, which means that it is unknown when the duration ends. The only thing that is known about a right censored duration is that it lasted at least for the observed period until it was censored and most likely even longer, however how much longer the censored duration lasted is unknown.
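The effect of right censoring on estimation can be made concrete with a small simulation. The sketch below assumes exponentially distributed durations with a constant hazard and a single fixed censoring time; the values TRUE_RATE and CENSOR_TIME and all variable names are hypothetical. It compares treating censored spells as complete, discarding them, and the censored-data maximum likelihood estimator for a constant hazard (number of completed spells divided by total observed time), which is a standard result and is not derived in this section.

```python
import random

rng = random.Random(1)
TRUE_RATE = 0.5     # constant hazard of the simulated durations (illustrative value)
CENSOR_TIME = 3.0   # every spell still running at this time is right censored
N = 20_000

observed = []       # observed length of each spell
completed = []      # True if the end of the spell was observed
for _ in range(N):
    duration = rng.expovariate(TRUE_RATE)
    observed.append(min(duration, CENSOR_TIME))
    completed.append(duration <= CENSOR_TIME)

n_events = sum(completed)

# (a) Treat censored spells as if they were complete.
rate_as_complete = N / sum(observed)
# (b) Discard censored spells and use only the completed ones.
rate_discarded = n_events / sum(t for t, done in zip(observed, completed) if done)
# (c) Censored-data ML estimator for a constant hazard: events per unit of exposure.
rate_ml = n_events / sum(observed)

print(rate_as_complete, rate_discarded, rate_ml)
```

In this setup both naive approaches overstate the hazard, because the longest spells are either shortened or removed, while the estimator that accounts for censoring stays close to the true rate.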

2.5.2 Truncation

Another form of incomplete observations comes from truncation. In this case the incompleteness of the observations does not come from the durations observed, but from the durations that are not observed. An observation is called truncated if it comes from a sample that is incomplete. So in other words the observation comes from a truncated distribution. The most common form of truncation is left truncation, which means that not all of the durations that started at the same time in the past are observed. Only those durations that lasted for a longer time are observed in the sample.

2.5.3 Accounting for Incomplete Observations


3. MEASUREMENT ERRORS

This chapter will give a short introduction to general measurement error theory. The main topics discussed will be “What are measurement errors?” and “How do measurement errors affect estimation results?”.

3.1 Origin of Measurement Errors

Measurement errors occur when we try to observe some data ξ, but instead observe these data with some additional disturbance, perhaps due to imprecise instruments or an incomplete definition of the data we wish to observe. So instead of observing ξ we observe some other data x, which is equal to ξ plus some disturbance, often denoted by v, i.e.,

$$x = \xi + v. \tag{3.1}$$

These disturbances v are called measurement errors; they interfere with observing the true values of the data. Moreover, in this terminology ξ is called an unobservable variable or a latent variable, depending on the theoretical and philosophical foundation of the model, while x is called the observable variable. In economics the unobservable variable might represent income or inflation. In a more specific setting of transition data in which the goal is to explain unemployment spells it might represent the productivity of a person, whereas we would observe, for example, education and work experience instead. While these variables might give a good impression of productivity, they are still imprecise and some discrepancies should be expected in the form of measurement errors.

3.2 Consequences of Measurement Errors

Due to the measurement errors the available observations become less reliable and this hampers estimation procedures, especially if the process that generates the errors is unknown. The main consequence is that estimators that are otherwise consistent become inconsistent when measurement errors are part of the observations. To illustrate this, consider a very simple model, consisting of equation (3.1) and

$$y = \xi\beta + \varepsilon, \tag{3.2}$$

where y is the dependent variable, β is the regression coefficient of interest and ε is a disturbance term. If ξ were directly observable, Ordinary Least Squares (OLS) estimation of equation (3.2) would result in a very simple and consistent estimator of the form

$$\hat{\beta} = (\xi'\xi)^{-1}\xi'y.$$

Alternatively, in the case where ξ is not observable, the OLS estimator for the system, equations (3.1) and (3.2) combined, would lead to

$$\hat{\beta} = (x'x)^{-1}x'y.$$

Unfortunately the probability limit of this estimator is not equal to β, but equal to

$$\operatorname*{plim}_{N \to \infty} \hat{\beta} = \beta - \frac{\sigma_v^2}{\sigma_x^2}\,\beta,$$

where $\sigma_x^2$ and $\sigma_v^2$ are the variances of x and v respectively. From this we can see that inconsistency of an estimator poses a serious problem, since we even expect to estimate the parameter of interest incorrectly.
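The size of this inconsistency is easy to reproduce in a simulation. The following Python sketch generates data from equations (3.1) and (3.2) with normally distributed ξ, v and disturbance term, and compares the OLS estimate based on the latent regressor with the one based on the observed regressor. The parameter values, the normality assumption and the helper name ols_slope are all choices made for this illustration only.

```python
import random

rng = random.Random(42)
N = 50_000
BETA = 2.0
SIGMA_XI, SIGMA_V, SIGMA_EPS = 1.0, 0.5 ** 0.5, 1.0  # standard deviations (illustrative)

xi = [rng.gauss(0.0, SIGMA_XI) for _ in range(N)]
x = [value + rng.gauss(0.0, SIGMA_V) for value in xi]            # equation (3.1)
y = [value * BETA + rng.gauss(0.0, SIGMA_EPS) for value in xi]   # equation (3.2)


def ols_slope(regressor, response):
    """OLS without intercept: (x'x)^{-1} x'y for a single regressor."""
    return sum(r * s for r, s in zip(regressor, response)) / sum(r * r for r in regressor)


beta_latent = ols_slope(xi, y)       # feasible only if xi were observable
beta_observed = ols_slope(x, y)      # what can actually be computed

sigma_x2 = SIGMA_XI ** 2 + SIGMA_V ** 2
plim_observed = BETA - SIGMA_V ** 2 / sigma_x2 * BETA   # probability limit given above

print(beta_latent, beta_observed, plim_observed)
```

With these values the estimate based on x settles near 4/3 instead of 2, no matter how large the sample is.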

3.3 Estimation Methods

Fortunately, methods have been developed to account for the measurement errors and to correct for this bias and thus generate consistent estimators. Since we are concerned with a linear model, the methods mentioned in this section are only directly applicable to linear models. To account for measurement errors in nonlinear models stronger assumptions are required (Wolter & Fuller, 1982), which will not be discussed in this thesis. One such approach, which is often employed, is introducing instrumental variables to provide additional information for the estimation procedure. These instruments exhibit a correlation with the regressors. Often the instrumental variables come from additional data that is collected for the research. Sometimes, it is even possible to construct the instrument from the available data directly.

Another approach relies on information about the signal-to-noise ratio, $\sigma_\xi^2 / \sigma_v^2$. If this ratio is

4. CONSTANT HAZARD MODEL

To start off the analysis of transition data models and the effect of measurement errors in such models, consider a very simple setup. Assume that the durations $t_n$ are exponentially distributed without any censoring or truncation. Furthermore, assume that there is only a single random covariate $\xi_n$, with mean zero and variance $\sigma_\xi^2$, which we observe without error. Besides a covariate, an intercept is also included in the model. This means that the matrix of regressors Ξ takes the following shape

$$\Xi \equiv \begin{pmatrix} \iota & \xi \end{pmatrix},$$

where ι is a column vector of ones of appropriate length. For the functional form of ϕ consider the most common case of an exponential form,

$$\varphi(\xi_n, \beta) = \exp(\Xi_n\beta),$$

where $\Xi_n$ is the n-th row of the matrix Ξ. Due to the assumption of an exponential distribution and the included intercept, the baseline hazard function $\lambda_0$ can be normalized to 1 without loss of generality. Under these specifications the hazard function becomes

$$\lambda(t; \xi_n, \beta) = \lambda_0(t_n)\,\varphi(\xi_n, \beta) = \exp(\Xi_n\beta), \tag{4.1}$$

and the integrated hazard function becomes

$$\Lambda(t; \xi_n, \beta) = \int_0^t \lambda(u; \xi_n, \beta)\,du = t\exp(\Xi_n\beta). \tag{4.2}$$

To be able to formulate this specification as a linear model, consider the random variable $\varepsilon_n^*$ defined by

$$\varepsilon_n^* \equiv -\ln \Lambda_0(t_n) - \Xi_n\beta.$$

The distribution of $\varepsilon_n^*$ can be derived directly from its construction by examining

$$P\{\varepsilon_n^* \le E\} = \exp(-\exp(-E)), \tag{4.3}$$

which is the cumulative distribution function of the standard extreme value distribution, which has a mean of γ and a variance of $\pi^2/6$. For more information on this distribution see Appendix A.2.1. Thus we can write this proportional hazard specification as a linear model of the form

$$-\ln \Lambda_0(t_n) = t_n^* = \Xi_n\beta + \varepsilon_n^*,$$

for all n. However, since the distribution of the error term does not have an expectation of zero, it would be better to consider

$$t_n^* = \Xi_n b + \varepsilon_n, \tag{4.4}$$

where $\varepsilon_n$ is defined as $\varepsilon_n \equiv \varepsilon_n^* - \gamma$ for all n and b as $b \equiv \beta + e_1\gamma$, where $e_1$ is the first unit vector. This can be done without loss of generality due to the intercept in the model. As a consequence of this correction $\varepsilon_n$ is distributed according to an extreme value distribution
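The distributional claim in (4.3) can be checked by simulation. The Python sketch below draws durations from the constant hazard model with a standard normal covariate and computes −ln Λ0(t) − Ξβ for each observation, using Λ0(t) = t because the baseline hazard is normalized to 1. The parameter values BETA1 and BETA2, the normal covariate and the variable names are assumptions for this illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
rng = random.Random(7)
N = 50_000
BETA1, BETA2 = 0.5, -1.0  # illustrative parameter values

eps_star = []
for _ in range(N):
    xi = rng.gauss(0.0, 1.0)
    index = BETA1 + BETA2 * xi               # Xi_n beta
    t = rng.expovariate(math.exp(index))     # duration with constant hazard exp(Xi_n beta)
    eps_star.append(-math.log(t) - index)    # Lambda_0(t) = t since lambda_0 = 1

mean = sum(eps_star) / N
variance = sum((e - mean) ** 2 for e in eps_star) / N
share_below_zero = sum(e <= 0.0 for e in eps_star) / N

print(mean, EULER_GAMMA)                   # sample mean against gamma
print(variance, math.pi ** 2 / 6.0)        # sample variance against pi^2 / 6
print(share_below_zero, math.exp(-1.0))    # empirical counterpart of (4.3) at E = 0
```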


4.1 Maximum Likelihood

One way to estimate the parameters in this model without measurement errors is by Maximum Likelihood (ML) estimation. The log-likelihood function can be derived from equations (4.1) and (4.2). The log-likelihood function for all observations together becomes

$$L(\beta) = \iota'\Xi\beta - t'\exp(\Xi\beta), \tag{4.5}$$

where t is a vector of length N containing all durations $t_n$. The first order conditions of this log-likelihood are

$$s(\beta) = \iota'\Xi - t'\Delta\Xi = 0, \tag{4.6}$$

where Δ is a diagonal matrix with the elements of exp(Ξβ) on the diagonal, i.e., $\Delta_{nn} = \exp(\Xi_n\beta)$. The expectation of this score vector is equal to

$$E[s(\beta)] = 0, \tag{4.7}$$

since t has a multivariate exponential distribution with a mean of exp(−Ξβ). Unfortunately it is not possible to derive closed-form expressions for the estimators of β from this equation. A numerical approach to optimizing this log-likelihood function, however, is quite straightforward using the Newton-Raphson method (Verbeke & Cools, 1995) or the method of scoring. For this, the second order derivative is also required. First note that

$$t'\Delta = (\exp(\Xi\beta))'\Delta^*,$$

where $\Delta^*$ is a diagonal matrix with the elements of t on its diagonal, i.e., $\Delta^*_{nn} = t_n$. Then the second order derivative becomes

$$H(\beta) = -\Xi'\Delta\Delta^*\Xi. \tag{4.8}$$

The inverse of the negative expected value of this matrix can be used as an approximation of the variance of the estimators for β; this matrix is given by

$$I(\beta) = (\Xi'\Xi)^{-1}. \tag{4.9}$$
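A minimal Python sketch of the Newton-Raphson iteration for this model is given below, for an intercept and a single covariate so that the 2×2 matrix algebra can be written out by hand. The simulated data, the starting value β = 0, the stopping rule and all function names are assumptions for this illustration and are not taken from the thesis.

```python
import math
import random


def simulate(n, beta1, beta2, seed):
    """Draw a N(0, 1) covariate and exponential durations with hazard exp(beta1 + beta2 * xi)."""
    rng = random.Random(seed)
    xi = [rng.gauss(0.0, 1.0) for _ in range(n)]
    t = [rng.expovariate(math.exp(beta1 + beta2 * value)) for value in xi]
    return xi, t


def newton_raphson(xi, t, tol=1e-10, max_iter=100):
    """Maximise (4.5) for an intercept and one covariate, starting from beta = 0."""
    b1, b2 = 0.0, 0.0
    for _ in range(max_iter):
        s1 = s2 = h11 = h12 = h22 = 0.0
        for x, duration in zip(xi, t):
            w = duration * math.exp(b1 + b2 * x)   # t_n * exp(Xi_n beta)
            s1 += 1.0 - w                          # score (4.6), intercept element
            s2 += x * (1.0 - w)                    # score (4.6), slope element
            h11 -= w                               # Hessian (4.8), element by element
            h12 -= w * x
            h22 -= w * x * x
        det = h11 * h22 - h12 * h12
        step1 = (h22 * s1 - h12 * s2) / det        # H^{-1} s, written out for a 2x2 matrix
        step2 = (h11 * s2 - h12 * s1) / det
        b1, b2 = b1 - step1, b2 - step2            # Newton-Raphson update
        if abs(step1) + abs(step2) < tol:
            break
    return b1, b2


def standard_errors(xi):
    """Square roots of the diagonal of (Xi'Xi)^{-1}, as in (4.9)."""
    n, sx, sxx = len(xi), sum(xi), sum(v * v for v in xi)
    det = n * sxx - sx * sx
    return math.sqrt(sxx / det), math.sqrt(n / det)


xi, t = simulate(5_000, beta1=0.5, beta2=-1.0, seed=3)
beta_hat = newton_raphson(xi, t)
print(beta_hat, standard_errors(xi))
```

Because the log-likelihood (4.5) is concave in β, the iteration needs no step-size control in this simple setting; for less well-behaved starting values a damped step would be a safer choice.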

4.2 Least Squares

Another way of estimating the parameters in this model is by using least squares (LS). The use of this method is allowed because there is no censoring or truncation involved. Since the disturbance term in the linear model (4.4) has an expectation of zero, the standard LS estimator for b is given by

$$\hat{b} = (\Xi'\Xi)^{-1}\Xi't^*,$$

where $t^*$ is a vector of length N containing all transformed durations $t_n^*$. The original parameters β can be obtained from this estimator by subtracting $\gamma e_1$ from it. The probability limit of the LS estimator $\hat{b}$ is equal to

$$\operatorname*{plim}_{N \to \infty} \hat{b} = b. \tag{4.10}$$

So the LS estimator is a consistent estimator and has a variance, according to LS theory, equal to

$$\operatorname{Var}(\hat{b}) = \frac{\pi^2}{6}\,(\Xi'\Xi)^{-1}.$$
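The LS estimator of section 4.2 can be sketched in a few lines of Python for an intercept and one covariate. The simulated data, the parameter values and the variable names are assumptions for this illustration; the standard errors use the LS variance with a disturbance variance of π²/6.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
rng = random.Random(11)
N = 20_000
BETA1, BETA2 = 0.5, -1.0  # illustrative parameter values

xi = [rng.gauss(0.0, 1.0) for _ in range(N)]
# Transformed durations t* = -ln(t), with t exponential with hazard exp(Xi_n beta).
t_star = [-math.log(rng.expovariate(math.exp(BETA1 + BETA2 * v))) for v in xi]

# Elements of Xi'Xi and Xi't* for Xi = (iota, xi).
sx, sxx = sum(xi), sum(v * v for v in xi)
sy, sxy = sum(t_star), sum(v * y for v, y in zip(xi, t_star))
det = N * sxx - sx * sx

b1_hat = (sxx * sy - sx * sxy) / det   # first element of (Xi'Xi)^{-1} Xi't*
b2_hat = (N * sxy - sx * sy) / det     # second element
beta1_hat = b1_hat - EULER_GAMMA       # beta = b - gamma * e_1
beta2_hat = b2_hat

# LS standard errors from (pi^2 / 6) * (Xi'Xi)^{-1}.
sigma2 = math.pi ** 2 / 6.0
se_b1 = math.sqrt(sigma2 * sxx / det)
se_b2 = math.sqrt(sigma2 * N / det)

print(beta1_hat, beta2_hat, se_b1, se_b2)
```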

4.3 Comparing Estimators


5. INCLUDING MEASUREMENT ERRORS

Now consider the case where the true regressor $\xi_n$ is unobserved, but instead an imperfect substitute, $x_n$, for $\xi_n$ is observed and is to be used in the estimation procedure, where $x_n$ is given by

$$x_n = \xi_n + v_n, \tag{5.1}$$

where $v_n$ represents the measurement error in this model, with an expectation of zero and a finite variance $\sigma_v^2$. Furthermore, assume that $v_n$ is independent of $\varepsilon_n$ and $\xi_n$ for all n. In matrix notation equation (5.1) becomes

$$X = \Xi + V, \tag{5.2}$$

where V is defined as $V \equiv \begin{pmatrix} 0 & v \end{pmatrix}$. By combining (4.1) and (5.2) the hazard function, including measurement errors, becomes

$$\lambda(t; x_n, v_n, \beta) = \exp(X_n\beta - V_n\beta),$$

and the integrated hazard function with measurement errors becomes

$$\Lambda(t; x_n, v_n, \beta) = t\exp(X_n\beta - V_n\beta).$$

The linear model in (4.4) can be adjusted as well to include the measurement errors and becomes

$$-\ln \Lambda_0(t_n) = t_n^* = X_n b + u_n, \tag{5.3}$$

where $u_n$ is the new disturbance term defined by $u_n \equiv \varepsilon_n - V_n b$, which has an, as yet, unspecified distribution.

5.1 Least Squares

Adapting the LS estimator from section 4.2 slightly to the new model yields an LS estimator $\hat{b}$ of the form

$$\hat{b} = (X'X)^{-1}X't^*.$$

The probability limit of this estimator is equal to

$$\operatorname*{plim}_{N \to \infty} \hat{b} = b - \Sigma_X^{-1}\Sigma_V b, \tag{5.4}$$

where $\Sigma_X$ and $\Sigma_V$ are the second order moments of $X_n$ and $V_n$ respectively and defined as $\Sigma_X \equiv E[X_n'X_n]$ and $\Sigma_V \equiv E[V_n'V_n]$.

Thus the standard LS estimator is no longer consistent, as can be expected from general measurement error theory. However, before trying to derive a consistent estimator, we will first consider some other consequences of the measurement errors included in the model.

5.2 Distributional Consequences

To determine the distribution of $u_n$, define $u_n^* = u_n + \gamma$ as

$$u_n^* \equiv -\ln \Lambda_0(t_n) - X_n\beta.$$

Since $u_n^* = \varepsilon_n^* - V_n\beta$, it follows from (4.3) that, for a given value of $V_n\beta$,

$$P\{u_n^* \le U\} = \exp\left(-\exp(-V_n\beta - U)\right). \tag{5.5}$$

This expression for the cumulative distribution function of $u_n^*$ means that $u_n^*$ conditional on $V_n\beta$ has an extreme value distribution with location parameter $-V_n\beta$ and scale parameter 1. This extreme value distribution for $u_n^*$ would have an expectation of $\gamma - V_n\beta$ and a variance equal to $\pi^2/6$.

However, it is not possible to observe the measurement errors v and therefore we are unable to condition on them. As a result, the model including measurement errors is no longer a true proportional hazard model, as the disturbance term does not have a standard extreme value distribution, which would require that $V_n\beta$ is equal to zero, i.e., no measurement errors present in the included regressors. Another assumption, albeit unrealistic, that would allow us to stay within the regular proportional hazard framework is independence between $\varepsilon_n^*$ and $u_n^*$, in which case we would be able to conclude from theory that $v_n$ follows a logistic distribution; see Appendix A.2.1 for more information on this. Neither approach is very satisfying and thus we conclude that neglecting the presence of measurement errors and treating (5.3) as a regular proportional hazard model will lead to misspecification of the model.

5.3 Heterogeneity

The misspecification of the model mentioned in the previous section is also expressed in the form of heterogeneity. Besides the inaccuracy of inferences about the parameters as a result of heterogeneity, it will also lead to misleading inferences about the duration dependence. Since measurement errors can be treated as omitted covariates, in that including them as an observed regressor would solve this problem, we can conclude that the estimate of the duration dependence will be biased downward (Kiefer, 1988). This gives us another incentive to model the measurement errors appropriately and prevent model misspecification.

5.4 Consistent Estimation


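information, we will first consider the second order moments of the model, given by

$$E[X_n'X_n] = \begin{pmatrix} 1 & 0 \\ 0 & \sigma_\xi^2 + \sigma_v^2 \end{pmatrix}, \qquad E[t_n^* X_n'] = \begin{pmatrix} b_1 \\ b_2\sigma_\xi^2 \end{pmatrix}, \qquad E[t_n^{*2}] = b_1^2 + b_2^2\sigma_\xi^2 + \sigma_\varepsilon^2.$$

Since $\sigma_\varepsilon^2$ is known to be equal to $\pi^2/6$, this system of four equations in four unknowns can be solved in principle, without resorting to higher moments or additional assumptions on the distributions of $\xi_n$ and $v_n$. Solving these equations for b, $\sigma_\xi^2$ and $\sigma_v^2$ leads to the following expressions

$$\begin{aligned}
b_1 &= E[t_n^*], \\
b_2 &= \frac{E[t_n^{*2}] - E[t_n^*]^2 - \frac{\pi^2}{6}}{E[t_n^* x_n]}, \\
\sigma_\xi^2 &= \frac{E[t_n^* x_n]^2}{E[t_n^{*2}] - E[t_n^*]^2 - \frac{\pi^2}{6}}, \\
\sigma_v^2 &= E[x_n^2] - \frac{E[t_n^* x_n]^2}{E[t_n^{*2}] - E[t_n^*]^2 - \frac{\pi^2}{6}}.
\end{aligned}$$

Replacing the population moments by their sample counterparts leads to estimators of the form

$$\begin{aligned}
\hat{b}_1 &= \frac{\iota't^*}{N}, \\
\hat{b}_2 &= \frac{t^{*\prime}t^* - \frac{(\iota't^*)^2}{N} - N\frac{\pi^2}{6}}{x't^*}, \\
\hat{\sigma}_\xi^2 &= \frac{(x't^*)^2}{N\left(t^{*\prime}t^* - \frac{(\iota't^*)^2}{N} - N\frac{\pi^2}{6}\right)}, \\
\hat{\sigma}_v^2 &= \frac{x'x}{N} - \frac{(x't^*)^2}{N\left(t^{*\prime}t^* - \frac{(\iota't^*)^2}{N} - N\frac{\pi^2}{6}\right)}.
\end{aligned}$$

These estimators are consistent and thus the model is identified. Note that the estimators are obtained through the use of the Method of Moments (MM) and we are thus able to apply standard theory for these estimators.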
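The MM estimators above translate directly into code. The following Python sketch simulates the model with measurement errors, computes the four MM estimators from sample averages (which is algebraically the same as the sums in the expressions above), and also computes the naive LS slope for comparison with (5.4). The normal distributions for ξ and v, the parameter values and all variable names are assumptions for this illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
PI2_6 = math.pi ** 2 / 6.0        # known variance of the disturbance term
rng = random.Random(5)
N = 50_000
BETA1, BETA2 = 0.5, -1.0          # illustrative parameter values
SIGMA_XI2, SIGMA_V2 = 1.0, 0.5    # illustrative variances of xi and v

t_star, x = [], []
for _ in range(N):
    xi = rng.gauss(0.0, math.sqrt(SIGMA_XI2))
    x.append(xi + rng.gauss(0.0, math.sqrt(SIGMA_V2)))                       # (5.1)
    t_star.append(-math.log(rng.expovariate(math.exp(BETA1 + BETA2 * xi))))  # t* = -ln(t)

# Sample moments.
m_t = sum(t_star) / N
m_x = sum(x) / N
m_tx = sum(a * b for a, b in zip(t_star, x)) / N
m_tt = sum(a * a for a in t_star) / N
m_xx = sum(b * b for b in x) / N

# MM estimators of section 5.4, written in terms of sample averages.
signal = m_tt - m_t ** 2 - PI2_6      # estimates b2^2 * sigma_xi^2
b1_mm = m_t
b2_mm = signal / m_tx
sigma_xi2_mm = m_tx ** 2 / signal
sigma_v2_mm = m_xx - m_tx ** 2 / signal

# Naive LS slope of t* on (1, x), which is inconsistent according to (5.4).
b2_ls = (m_tx - m_x * m_t) / (m_xx - m_x ** 2)

print(b1_mm - EULER_GAMMA, b2_mm, sigma_xi2_mm, sigma_v2_mm, b2_ls)
```

In this design the naive LS slope is pulled towards zero by the factor σξ²/(σξ² + σv²) = 2/3, while the MM slope stays close to the true value of −1. Note that nothing in the computation forces `signal` to be positive, so the variance estimates can turn out negative in small samples.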

5.5 Properties of the Estimators


variances which we can use as a measure of efficiency and to construct test statistics to determine whether or not measurement errors are present in our data. However, first we will introduce some notation to be used in this setting, starting with the vector $g_n$, consisting of observations, in our case defined as

$$g_n = \begin{pmatrix} t_n^* & t_n^* x_n & t_n^{*2} & x_n^2 \end{pmatrix}'.$$

The average of these observations, denoted by $\bar{g}$, represents the sample moments to be used in the estimation procedure, i.e.,

$$\bar{g} = \frac{1}{N}\sum_{n=1}^{N} g_n = \begin{pmatrix} \dfrac{\iota't^*}{N} & \dfrac{x't^*}{N} & \dfrac{t^{*\prime}t^*}{N} & \dfrac{x'x}{N} \end{pmatrix}'.$$

This vector has an expectation equal to

$$\gamma(\theta) = E[\bar{g}] = E[g_n] = \begin{pmatrix} b_1 & b_2\sigma_\xi^2 & b_1^2 + b_2^2\sigma_\xi^2 + \dfrac{\pi^2}{6} & \sigma_\xi^2 + \sigma_v^2 \end{pmatrix}',$$

where θ is a vector containing all parameters of interest; in this case θ consists of the elements $b_1$, $b_2$, $\sigma_\xi^2$ and $\sigma_v^2$. This vector γ(θ) denotes the corresponding population moments. The moment conditions can now be rewritten in the form

$$E[g_n - \gamma(\theta_0)] = 0,$$

which forms a system of equations equivalent to the one used in section 5.4 to derive the consistent estimators. To be able to determine the asymptotic variances of the estimators we now define the matrix G containing the first order derivatives of the moment conditions with respect to the parameters, which is given by

$$G(\theta) = -\frac{\partial \gamma(\theta)}{\partial \theta'} = -\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \sigma_\xi^2 & b_2 & 0 \\ 2b_1 & 2b_2\sigma_\xi^2 & b_2^2 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix}.$$

The inverse of this matrix is given by

$$G(\theta)^{-1} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ \dfrac{2b_1}{b_2\sigma_\xi^2} & \dfrac{1}{\sigma_\xi^2} & -\dfrac{1}{b_2\sigma_\xi^2} & 0 \\ -\dfrac{2b_1}{b_2^2} & -\dfrac{2}{b_2} & \dfrac{1}{b_2^2} & 0 \\ \dfrac{2b_1}{b_2^2} & \dfrac{2}{b_2} & -\dfrac{1}{b_2^2} & -1 \end{pmatrix},$$


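for which it holds that

$$\operatorname*{plim}_{N\to\infty} \hat{\Psi} = \Psi.$$

Using these matrices the covariance matrix for the estimators, $\Sigma_\theta$, can be derived to be equal to

$$\Sigma_\theta = G(\theta)^{-1}\,\Psi\,\left(G(\theta)^{-1}\right)',$$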

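from which we can calculate the individual variances to be

\begin{align}
\mathrm{AVar}\left(\sqrt{N}\,\hat{b}_1\right) = \Sigma_{\theta,11} &= b_2^2\sigma^2_\xi + \frac{\pi^2}{6} \tag{5.6a} \\
\mathrm{AVar}\left(\sqrt{N}\,\hat{b}_2\right) = \Sigma_{\theta,22} &= \frac{\mathrm{E}[\epsilon_n^4] - \frac{1}{36}\pi^4}{b_2^2\sigma^4_\xi} + \frac{b_1^2\sigma^2_v + \frac{1}{6}\pi^2\sigma^2_v}{\sigma^4_\xi} + \frac{b_1^2 + b_2^2\sigma^2_v + \frac{1}{6}\pi^2}{\sigma^2_\xi} \tag{5.6b} \\
\mathrm{AVar}\left(\sqrt{N}\,\hat{\sigma}^2_\xi\right) = \Sigma_{\theta,33} &= \frac{\mathrm{E}[\epsilon_n^4] - \frac{1}{36}\pi^4}{b_2^4} + \frac{4b_1^2\sigma^2_\xi + 4b_1^2\sigma^2_v + \frac{2}{3}\pi^2\sigma^2_v}{b_2^2} + \frac{4b_1\mathrm{E}[\xi_n^3]}{b_2} + \mathrm{E}[\xi_n^4] + 4\sigma^2_\xi\sigma^2_v \tag{5.6c} \\
\mathrm{AVar}\left(\sqrt{N}\,\hat{\sigma}^2_v\right) = \Sigma_{\theta,44} &= \frac{\mathrm{E}[\epsilon_n^4] - \frac{1}{36}\pi^4}{b_2^4} + \frac{4b_1^2\sigma^2_\xi + 4b_1^2\sigma^2_v + \frac{2}{3}\pi^2\sigma^2_v}{b_2^2} - \frac{4b_1\mathrm{E}[v_n^3]}{b_2} + \mathrm{E}[v_n^4] - \sigma^4_v + (1 + b_2)\,\mathrm{E}[\xi_n^4]. \tag{5.6d}
\end{align}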

It is possible to derive these asymptotic variances, under the assumption that all moments of fourth order and lower are finite for both $\xi_n$ and $v_n$, by applying a Central Limit Theorem (see also Appendix A.4), which states that

$$\sqrt{N}\left(\hat{\theta} - \theta\right) \overset{A}{\sim} N\left(0, \Sigma_\theta\right).$$
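In practice the covariance matrix has to be estimated. A minimal sketch of one way to do this is given below: it replaces $\Psi$ by the sample covariance matrix of the vectors $g_n$ and evaluates $G(\theta)^{-1}$ at the MM estimates. The function names are hypothetical, the use of the sample covariance of $g_n$ for $\hat{\Psi}$ is an assumption (it is the form the thesis uses in the censored case), and the normal draws serve only to generate example data.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def simulate(n, b1, b2, sigma_xi, sigma_v, seed=0):
    """Return a list of (t_star, x) pairs from the linearised model."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        xi = rng.gauss(0.0, sigma_xi)
        eps = -math.log(rng.expovariate(1.0)) - EULER_GAMMA
        data.append((b1 + b2 * xi + eps, xi + rng.gauss(0.0, sigma_v)))
    return data


def sandwich_covariance(data):
    """Estimate Sigma_theta = G^{-1} Psi (G^{-1})' from the data."""
    n = len(data)
    g = [[t, t * x, t * t, x * x] for t, x in data]
    g_bar = [sum(row[j] for row in g) / n for j in range(4)]
    # MM estimates written in terms of the sample moments
    b1 = g_bar[0]
    b2 = (g_bar[2] - b1 ** 2 - math.pi ** 2 / 6) / g_bar[1]
    s2_xi = g_bar[1] / b2
    # Psi_hat: sample covariance matrix of g_n
    psi = [[sum((row[i] - g_bar[i]) * (row[j] - g_bar[j]) for row in g) / n
            for j in range(4)] for i in range(4)]
    g_inv = [
        [-1.0, 0.0, 0.0, 0.0],
        [2 * b1 / (b2 * s2_xi), 1 / s2_xi, -1 / (b2 * s2_xi), 0.0],
        [-2 * b1 / b2 ** 2, -2 / b2, 1 / b2 ** 2, 0.0],
        [2 * b1 / b2 ** 2, 2 / b2, -1 / b2 ** 2, -1.0],
    ]
    left = [[sum(g_inv[i][k] * psi[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]
    # Multiply by the transpose of g_inv
    return [[sum(left[i][k] * g_inv[j][k] for k in range(4)) for j in range(4)]
            for i in range(4)]
```

Approximate standard errors would then follow as the square roots of the diagonal elements divided by $N$. The first diagonal element is simply the sample variance of $t^*_n$, which can be compared with (5.6a).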

5.6 Testing for Measurement Error

Given the estimator for the variance of the measurement error, as derived in section 5.4, and the knowledge of its asymptotic distribution, as derived in the previous section, it is possible to construct a test statistic to determine whether or not measurement errors are present in the data. The test statistic takes the form of

$$A = \frac{\hat{\sigma}^2_v}{\sqrt{\mathrm{AVar}\left(\hat{\sigma}^2_v\right)}},$$

which has, due to the use of estimates instead of true values, a Student's t-distribution instead of a normal distribution.

A remark should be made about the use of this test statistic. Since no restrictions on the parameters, such as $\sigma^2_\xi \geq 0$ and $\sigma^2_v \geq 0$, are used in the MM estimation procedure it is possible to use this statistic to test for the presence of measurement errors. However, the null hypothesis of $\sigma^2_v = 0$ should be tested against the alternative $\sigma^2_v > 0$, because negative


5.7 Improving the Estimators

Since the estimators as derived in section 5.4 are based on the method of moments, it is possible to improve these estimators by including more moment conditions and applying a Generalized Method of Moments (GMM). The addition of moment conditions in general improves the efficiency of the estimators obtained through GMM. For a more in-depth analysis of this conclusion please see Wansbeek and Meijer (2000). The additional third-order moments are given by

\begin{align}
\mathrm{E}\left[x_n^3\right] &= \mathrm{E}[\xi_n^3] + \mathrm{E}[v_n^3] \tag{5.7a} \\
\mathrm{E}\left[x_n^2 t^*_n\right] &= b_1\sigma^2_\xi + b_1\sigma^2_v + b_2\mathrm{E}[\xi_n^3] \tag{5.7b} \\
\mathrm{E}\left[x_n t^{*2}_n\right] &= 2b_1b_2\sigma^2_\xi + b_2^2\mathrm{E}[\xi_n^3] \tag{5.7c} \\
\mathrm{E}\left[t^{*3}_n\right] &= b_1^3 + 3b_1b_2^2\sigma^2_\xi + \frac{1}{2}b_1\pi^2 + b_2^3\mathrm{E}[\xi_n^3] + \mathrm{E}[\epsilon_n^3] \tag{5.7d}
\end{align}

While the distribution of $\epsilon$ is known and has a third-order moment equal to $2\zeta(3)$, which corresponds to a skewness of $12\sqrt{6}\,\zeta(3)/\pi^3$ (Appendix A.2.1), no such information is available for $\xi$ and $v$ without making additional assumptions about their distributions. However, the only assumptions required to perform the estimation are that the third order moments of both distributions are finite. The introduction of two additional parameters and four equations to the system provides us with more information to work with while still allowing us to derive consistent estimators. Furthermore, by including the fourth order moment conditions, given below, as well, we again expand the available information, by five equations, at a low cost of two additional parameters.

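\begin{align}
\mathrm{E}\left[x_n^4\right] &= \mathrm{E}[\xi_n^4] + 6\sigma^2_\xi\sigma^2_v + \mathrm{E}[v_n^4] \tag{5.8a} \\
\mathrm{E}\left[x_n^3 t^*_n\right] &= b_1\mathrm{E}[\xi_n^3] + b_1\mathrm{E}[v_n^3] + b_2\mathrm{E}[\xi_n^4] + 3b_2\sigma^2_\xi\sigma^2_v \tag{5.8b} \\
\mathrm{E}\left[x_n^2 t^{*2}_n\right] &= b_1^2\sigma^2_\xi + b_1^2\sigma^2_v + 2b_1b_2\mathrm{E}[\xi_n^3] + b_2^2\mathrm{E}[\xi_n^4] + b_2^2\sigma^2_\xi\sigma^2_v + \frac{\pi^2\sigma^2_\xi}{6} + \frac{\pi^2\sigma^2_v}{6} \tag{5.8c} \\
\mathrm{E}\left[x_n t^{*3}_n\right] &= 3b_1^2b_2\sigma^2_\xi + \frac{b_2\pi^2\sigma^2_\xi}{2} + 3b_1b_2^2\mathrm{E}[\xi_n^3] + b_2^3\mathrm{E}[\xi_n^4] \tag{5.8d} \\
\mathrm{E}\left[t^{*4}_n\right] &= b_1^4 + 6b_1^2b_2^2\sigma^2_\xi + b_1^2\pi^2 + b_2^2\pi^2\sigma^2_\xi + 4b_1b_2^3\mathrm{E}[\xi_n^3] + 4b_1\mathrm{E}[\epsilon_n^3] + b_2^4\mathrm{E}[\xi_n^4] + \mathrm{E}[\epsilon_n^4], \tag{5.8e}
\end{align}

where $2\zeta(3)$ and $\frac{3}{20}\pi^4$ can be substituted for $\mathrm{E}[\epsilon_n^3]$ and $\mathrm{E}[\epsilon_n^4]$ respectively, since the exact distribution of $\epsilon_n$ is known. These values follow from combining the skewness $12\sqrt{6}\,\zeta(3)/\pi^3$ and the excess kurtosis $\frac{12}{5}$ of the extreme value distribution with its variance $\frac{\pi^2}{6}$.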
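The moments of $\epsilon_n$ that enter these conditions can be checked by numerical integration of the extreme value density with location $-\gamma$ and scale 1. The sketch below uses a simple trapezoidal rule in plain Python; the function names and the integration limits are choices made for this illustration. It relies on the standard results that this distribution has mean zero, variance $\pi^2/6$, skewness $12\sqrt{6}\,\zeta(3)/\pi^3$ and excess kurtosis $12/5$, so that its third and fourth moments are $2\zeta(3)$ and $3\pi^4/20$.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
APERY = 1.2020569031595942        # zeta(3), Apery's constant


def gumbel_pdf(u):
    """Density f(u) = exp(-u - gamma - exp(-u - gamma))."""
    z = -u - EULER_GAMMA
    return math.exp(z - math.exp(z))


def moment(k, lo=-10.0, hi=60.0, steps=140000):
    """k-th moment of epsilon by the trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (lo ** k * gumbel_pdf(lo) + hi ** k * gumbel_pdf(hi))
    for i in range(1, steps):
        u = lo + i * h
        total += u ** k * gumbel_pdf(u)
    return total * h


variance = math.pi ** 2 / 6
skewness = 12 * math.sqrt(6) * APERY / math.pi ** 3
third_moment = skewness * variance ** 1.5     # equals 2 * zeta(3)
fourth_moment = (3 + 12 / 5) * variance ** 2  # equals 3 * pi^4 / 20
```

Because the mean is zero, raw and central moments coincide, so `moment(3)` and `moment(4)` can be compared directly with `third_moment` and `fourth_moment`.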

5.8 Maximum Likelihood

Now that we know that it is possible to derive a consistent estimator for the model with measurement errors and thus the model is identified, it is useful to examine the corresponding likelihood function and its derivatives to be able to draw more conclusions. To be able to calculate the likelihood function for this model, we will first have to derive the joint density function. By the definition of a conditional density we know that


Once $\xi_n$ is known, $t_n$ and $x_n$ become independent of each other and we can write

$$f_{t,x}(t_n, x_n|\xi_n) = f_{t|\xi}(t_n|\xi_n)\, f_{x|\xi}(x_n|\xi_n).$$

Furthermore, since we assumed a simple exponential model in chapter 4 we know that if $\xi_n$ is given, $t_n$ will follow an exponential distribution with hazard rate $\exp(\Xi_n\beta)$, and hence mean $\exp(-\Xi_n\beta)$. Therefore, the density function for $t_n$ becomes

$$f_t(t_n|\xi_n) = \exp\left(\Xi_n\beta - t_n e^{\Xi_n\beta}\right). \tag{5.9}$$

However, to be able to determine the density function of both $x_n$ and $\xi_n$ it will be necessary to make additional assumptions on the distributions of $\xi_n$ and $v_n$. To this end, assume that both $\xi_n$ and $v_n$ are normally distributed with mean zero and variance $\sigma^2_\xi$ and $\sigma^2_v$ respectively. Then the density function for $\xi_n$ becomes

$$f_\xi(\xi_n) = \frac{1}{\sqrt{2\pi\sigma^2_\xi}} \exp\left(-\frac{\xi_n^2}{2\sigma^2_\xi}\right).$$

Since $x_n$ is defined as the sum of $\xi_n$ and $v_n$ it will be normally distributed as well. In addition, if $\xi_n$ is known the distribution of $x_n$ becomes a normal one with mean $\xi_n$ and variance $\sigma^2_v$. Therefore, the conditional density function for $x_n$ becomes

$$f_x(x_n|\xi_n) = \frac{1}{\sqrt{2\pi\sigma^2_v}} \exp\left(-\frac{(x_n - \xi_n)^2}{2\sigma^2_v}\right).$$

Combining these five equations it is possible to derive the log-likelihood function for the model with measurement errors to be

$$L(\theta) = \iota'\Xi\beta - t'\exp(\Xi\beta) - \frac{(x - \xi)'(x - \xi)}{2\sigma^2_v} - \frac{\xi'\xi}{2\sigma^2_\xi} - N\ln\left(2\pi\sqrt{\sigma^2_\xi\sigma^2_v}\right), \tag{5.10}$$

if $\xi$ were observable. However, since $\xi$ is unobservable, we cannot simply use this log-likelihood function to obtain estimators. Instead we should maximize the log-likelihood function based on the marginal distribution of $t$ and $x$. However, it is not analytically possible to integrate $\Xi$ out of the above distribution to achieve the desired result (Prentice, 1982) even if we were to assume different distributions for $\xi_n$ and $v_n$. Therefore, we are unable to specify an information matrix in a direct analytical form. However, it is possible to give an expression for the information matrix for the vector $\beta$ that can be simulated in the form of

$$I(\beta) = \mathrm{E}\left[\int_{-\infty}^{\infty} \Xi'\Delta\Delta^*\Xi \, d\xi\right]^{-1}. \tag{5.11}$$

While we do know that there does exist a consistent estimator and thus that the information matrix is of full rank, it is difficult to confirm this analytically from equation (5.11).
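To make the structure of the complete-data log-likelihood (5.10) concrete, the sketch below evaluates it for a single regressor, taking $\Xi_n\beta = \beta_1 + \beta_2\xi_n$, and compares it with the sum of the three log densities from which it is built. The function names are hypothetical, and the latent values $\xi_n$ are treated as known, which is exactly what is not possible with real data.

```python
import math


def complete_data_loglik(t, x, xi, beta1, beta2, s2_xi, s2_v):
    """Log-likelihood (5.10) for scalar xi, treating xi as observed."""
    n = len(t)
    lin = [beta1 + beta2 * z for z in xi]                      # Xi_n beta
    term_t = sum(l - tn * math.exp(l) for l, tn in zip(lin, t))
    term_x = sum((xn - z) ** 2 for xn, z in zip(x, xi)) / (2 * s2_v)
    term_xi = sum(z * z for z in xi) / (2 * s2_xi)
    const = n * math.log(2 * math.pi * math.sqrt(s2_xi * s2_v))
    return term_t - term_x - term_xi - const


def normal_logpdf(y, mean, var):
    """Log density of a normal distribution with the given mean and variance."""
    return -0.5 * math.log(2 * math.pi * var) - (y - mean) ** 2 / (2 * var)


def loglik_from_densities(t, x, xi, beta1, beta2, s2_xi, s2_v):
    """Same quantity, summed density by density as a cross-check."""
    total = 0.0
    for tn, xn, z in zip(t, x, xi):
        rate = math.exp(beta1 + beta2 * z)
        total += math.log(rate) - rate * tn   # exponential duration given xi
        total += normal_logpdf(xn, z, s2_v)   # x given xi
        total += normal_logpdf(z, 0.0, s2_xi) # marginal of xi
    return total
```

Both functions return the same value for any admissible input, which confirms that the constant term in (5.10) collects the two normalising constants of the normal densities.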


6. INCLUDING CENSORING

As indicated in section 2.5, the presence of censoring is the feature that distinguishes transition data analysis from most other types of data analyses and can be troublesome at times. When working with transition data at least part of the data will be censored in some way, the most common being right censored observations. These are observations of durations that have not yet ended when the observational period or study ends. The only information available on these durations is that they lasted at least until the time they were censored and possibly longer. However, how much longer they have lasted or will last is unknown and can range from just one period to an infinite number of periods.

In the first section of this chapter we will outline some of the methods that can be used for estimation in transition data models without measurement errors, such as the Tobit model (Tobin, 1958 and Amemiya, 1984). The second section will discuss why these methods are difficult to apply in a transition data model with measurement errors and propose an alternative based on the MM estimator derived in chapter 5.

6.1 Censoring without Measurement Errors

To be able to handle censoring by using a Tobit-like model we start off with a transition data model without measurement errors as introduced in chapter 4. Additionally we will need to define some variables to indicate when and if an observation is censored. Since right censored observations are most common we will restrict our attention to this type of censoring. Furthermore, to keep the model as simple as possible we will not consider censoring as a result of subjects dropping out during the study. This limits the possibility of censoring to only those cases in which the event that ends a duration has not yet occurred for an observation at the end of the study period.

The starting time of the study or observational period will be normalized, without loss of generality, to zero. However, since not all observations are required to begin at the start of the study, let $s_n$ indicate the starting time of an individual observation $n$. Furthermore, let $e$ define the ending time of the study or observational period, which also defines the only point in time at which observations can be censored. Finally, let $e_n$ define the ending time of an individual observation $n$ in the case that the observation is not censored. In the case of no censoring, an observed duration $t_n$ is defined by

$$t_n = e_n - s_n.$$

By introducing the possibility of right censoring, observed durations $t_n$ are instead given by

$$t_n = \min(e_n, e) - s_n,$$


where $e < e_n$ indicates that the corresponding observation is censored and it is not censored if $e \geq e_n$. However, also note that, by linearising the model to a form like that of equation (4.4) we are no longer dealing with right censored, but with left censored observations instead. This is due to the transformation performed on the observed durations, including multiplying by minus one. Using this information, it is possible to develop a complete model of the durations, including censoring, in a form as suggested by Heckman (1990). This model is given by

\begin{align}
t^*_{1n} &= -\ln\Lambda_0(e_n - s_n) = \Xi_n b + \epsilon_n \tag{6.1a} \\
t^*_{2n} &= -\ln\Lambda_0(e - s_n) = c_n \tag{6.1b} \\
d_n &= 1\left(c_n - \Xi_n b - \epsilon_n > 0\right) \tag{6.1c} \\
t^*_n &= (1 - d_n)\, t^*_{1n} + d_n t^*_{2n} \tag{6.1d}
\end{align}

where $1(\cdot)$ is an indicator function that returns the value 1 if the statement is true and 0 otherwise. The variable $d_n$ indicates whether an observed duration is censored ($d_n$ equals 1) or not ($d_n$ equals 0).
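A small sketch may help to see how right censoring of the durations turns into censoring from below of the transformed durations. It assumes the constant hazard model with integrated baseline hazard $\Lambda_0(t) = t$, which is an assumption of this illustration, and the function name `observe` is hypothetical.

```python
import math


def observe(s_n, e_n, e_study):
    """Apply model (6.1) to one spell, assuming Lambda_0(t) = t.

    s_n: start of the spell, e_n: its uncensored end, e_study: end of the study.
    Requires s_n < e_n and s_n < e_study.
    """
    d_n = 1 if e_study < e_n else 0         # censoring indicator
    t_n = min(e_n, e_study) - s_n           # observed duration
    t1_star = -math.log(e_n - s_n)          # latent transformed duration (6.1a)
    c_n = -math.log(e_study - s_n)          # transformed censoring point (6.1b)
    t_star = (1 - d_n) * t1_star + d_n * c_n  # observed transformed duration (6.1d)
    return t_n, d_n, t_star, t1_star, c_n


# A spell that starts at 1, would end at 10, in a study that ends at 5:
# the duration is censored at 4 and t_star equals the lower bound c_n.
example = observe(1.0, 10.0, 5.0)
```

In this sketch `t_star` always equals the larger of `t1_star` and `c_n`, which is why the linearised observations are censored from below.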

6.1.1 Tobit Estimation

One of the most common methods to use for estimation in this type of model is a Tobit style ML estimator. For this particular model it is quite easy to construct and in the absence of measurement errors it is preferred based on its efficiency. The standard Tobit model for equation (6.1) is given by

$$t^*_n = \max\left\{\Xi_n\beta + \epsilon^*_n,\; c_n\right\},$$

which is quite similar to the model used in chapter 4 for ML estimation. The log-likelihood function in the presence of censoring becomes

$$L(\beta) = (\iota - d)'\Xi\beta - t'\exp(\Xi\beta),$$

where $d$ is a vector of length $N$ containing the elements $d_n$ given by equation (6.1c). The first order derivatives are equal to

$$\frac{\partial L(\beta)}{\partial\beta} = (\iota - d)'\Xi - t'\Delta\Xi$$

and the second order derivatives are given by

$$\frac{\partial^2 L(\beta)}{\partial\beta\,\partial\beta'} = -\Xi'\Delta\Delta^*\Xi.$$

Note that all these functions are highly similar to the ones used previously in chapter 4 and can even be used in the absence of censoring to yield the same results as before.
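As a minimal sketch of how this log-likelihood can be maximised, consider the special case in which $\Xi$ contains only a constant, so that $\beta$ is a scalar and the maximiser has the closed form $\ln\left(\sum_n (1 - d_n) / \sum_n t_n\right)$. The Newton-Raphson iteration below uses the score and second derivative of this special case; the function names are hypothetical and the restriction to an intercept is a simplification made for this illustration.

```python
import math
import random


def simulate_censored(n, beta, censor_time, seed=0):
    """Exponential durations with hazard exp(beta), right censored at censor_time."""
    rng = random.Random(seed)
    t, d = [], []
    for _ in range(n):
        duration = rng.expovariate(math.exp(beta))
        t.append(min(duration, censor_time))
        d.append(1 if duration > censor_time else 0)
    return t, d


def fit_intercept(t, d, tol=1e-12, max_iter=100):
    """Newton-Raphson for L(beta) = sum((1 - d_n) * beta - t_n * exp(beta))."""
    n_uncensored = sum(1 - dn for dn in d)
    total_t = sum(t)
    # Starting at ln(N / sum t) puts the start at or above the optimum,
    # from where the iteration decreases monotonically towards it.
    beta = math.log(len(t) / total_t)
    for _ in range(max_iter):
        score = n_uncensored - total_t * math.exp(beta)
        hessian = -total_t * math.exp(beta)
        step = score / hessian
        beta -= step
        if abs(step) < tol:
            break
    return beta
```

With regressors the same iteration applies with the vector score and the matrix of second order derivatives given above, at the cost of solving a linear system in every step.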

6.1.2 Probability Estimation


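is censored or not. This probability, using the model of equation (6.1) as a foundation, is given by

$$P\{d_n = 1|\Xi_n\} = \exp\left(-e^{\Xi_n\beta - c_n}\right). \tag{6.2}$$

Using this information it is possible to derive a likelihood function for this model to be used in the estimation procedure. The log-likelihood function is given by

$$L(\beta) = \sum_{n=1}^{N}\left[(1 - d_n)\ln\left(1 - \exp\left(-e^{\Xi_n\beta - c_n}\right)\right) - d_n e^{\Xi_n\beta - c_n}\right],$$

while the first order derivatives corresponding to this log-likelihood are given by

$$\frac{\partial L(\beta)}{\partial\beta} = \sum_{n=1}^{N}\left[(1 - d_n)\frac{F}{1 - F} - d_n\right]\exp\left(\Xi_n\beta - c_n\right)\Xi_n',$$

where $F$ denotes the value of the distribution function of $\epsilon^*_n$ evaluated in the point $c_n - \Xi_n\beta$. Furthermore, the corresponding second order derivatives are given by

$$\frac{\partial^2 L(\beta)}{\partial\beta\,\partial\beta'} = \sum_{n=1}^{N}\left[(1 - d_n)\frac{F}{1 - F}\left(1 - \frac{\exp\left(\Xi_n\beta - c_n\right)}{1 - F}\right) - d_n\right]\exp\left(\Xi_n\beta - c_n\right)\Xi_n'\Xi_n.$$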

These expressions can be used to calculate estimates for the parameter vector β by using the Newton-Raphson algorithm. Note that this method is quite similar to the Tobit estimation. However, in this case the observed durations are not used directly, only the knowledge on whether or not the durations are censored is used as the dependent variable.
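The log-likelihood and its first order derivative for this binary model can be written down compactly for a single regressor, which also allows the derivative to be checked against a finite difference. The sketch below takes $\Xi_n\beta = \beta x_n$ with a scalar $\beta$, which is a simplification for this illustration, and the function names are hypothetical.

```python
import math


def loglik(beta, x, c, d):
    """Log-likelihood of the censoring indicators, with Xi_n beta = beta * x_n."""
    total = 0.0
    for xn, cn, dn in zip(x, c, d):
        h = math.exp(beta * xn - cn)   # e^{Xi_n beta - c_n}
        if dn:
            total -= h                 # ln P{d_n = 1} = -h
        else:
            total += math.log(-math.expm1(-h))  # ln(1 - exp(-h)), computed stably
    return total


def score(beta, x, c, d):
    """Analytical first order derivative of loglik with respect to beta."""
    total = 0.0
    for xn, cn, dn in zip(x, c, d):
        h = math.exp(beta * xn - cn)
        F = math.exp(-h)               # P{d_n = 1}
        weight = -1.0 if dn else F / (-math.expm1(-h))
        total += weight * h * xn
    return total


def numerical_score(beta, x, c, d, step=1e-6):
    """Central finite difference approximation of the derivative."""
    return (loglik(beta + step, x, c, d) - loglik(beta - step, x, c, d)) / (2 * step)
```

Agreement between `score` and `numerical_score` is a cheap sanity check before the derivatives are passed on to a Newton-Raphson routine.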

6.1.3 Other Estimation Methods

Besides the aforementioned estimation methods for limited dependent variables there are other possibilities as well. Among them are the two step estimator proposed by Heckman (1990), the Least Absolute Deviations estimator (Powell, 1984), the Symmetrically Trimmed Least Squares estimator (Powell, 1986) and the Asymmetrically Trimmed Least Squares Estimator developed by Karlsson (2006). However, all of the estimation methods mentioned in this section have their own drawbacks, especially when measurement errors are introduced to the model. The estimation procedures become difficult to apply in this case, either because the estimator is highly nonlinear or because its expression becomes unwieldy.

6.2 Censoring with Measurement Errors


not change by including measurement errors in the model, the process is no longer directly observable. Instead it is possible to observe the process

$$P\{d_n = 1|X_n\} = F_u\left(c_n - X_n\beta\right). \tag{6.3}$$

Thus if the distribution of $u_n$ is known, it should be possible to use this method to estimate $\beta$ even in the presence of measurement errors. The problem with this approach is that the method now depends on the distribution of the measurement errors, which makes estimation quite difficult unless very specific assumptions for this distribution are made. Moreover, as already discussed in section 5.2 it is not possible to introduce suitable assumptions to ensure that $u_n$ follows an extreme value distribution. As a result, other methods should be considered.

As an alternative to the methods mentioned already, this section will generalize the use of the MM estimator, as derived in section 5.5, to a model with censoring as well as measurement errors. The linearised model, including censoring and measurement errors is given by

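\begin{align}
t^*_{1n} &= -\ln\Lambda_0(e_n - s_n) = b_1 + b_2\xi_n + \epsilon_n \tag{6.4a} \\
t^*_{2n} &= -\ln\Lambda_0(e - s_n) = c_n \tag{6.4b} \\
d_n &= 1\left(c_n - b_1 - b_2\xi_n - \epsilon_n > 0\right) \tag{6.4c} \\
t^*_n &= (1 - d_n)\, t^*_{1n} + d_n t^*_{2n} \tag{6.4d} \\
x_n &= \xi_n + v_n \tag{6.4e}
\end{align}

6.2.1 Moment Estimation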

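The method of moments applied in the previous chapter can be adapted to account for the presence of censoring in the observed durations. Moreover, due to the censoring some additional moments will be available to use in the estimation procedure. The first and second order moments, derived from the model as given in equation (6.4), that can be used are given by

\begin{align}
\mathrm{E}[t^*_n] &= (1 - \mathrm{E}[d_n])\,b_1 - b_2\mathrm{E}[\xi_n d_n] - \mathrm{E}[\epsilon_n d_n] + c_n\mathrm{E}[d_n] \tag{6.5a} \\
\mathrm{E}[d_n] &= \Pi \tag{6.5b} \\
\mathrm{E}\left[t^{*2}_n\right] &= (1 - \mathrm{E}[d_n])\,b_1^2 + b_2^2\left(\sigma^2_\xi - \mathrm{E}[\xi_n^2 d_n]\right) - \mathrm{E}[\epsilon_n^2 d_n] + \frac{\pi^2}{6} + \mathrm{E}[d_n]\,c_n^2 \nonumber \\
&\quad - 2b_1b_2\mathrm{E}[\xi_n d_n] - 2b_1\mathrm{E}[\epsilon_n d_n] - 2b_2\mathrm{E}[\xi_n\epsilon_n d_n] \tag{6.5c} \\
\mathrm{E}[t^*_n x_n] &= b_2\left(\sigma^2_\xi - \mathrm{E}[\xi_n^2 d_n]\right) + (c_n - b_1)\,\mathrm{E}[\xi_n d_n] - b_2\mathrm{E}[\xi_n\epsilon_n d_n] \tag{6.5d} \\
\mathrm{E}\left[x_n^2\right] &= \sigma^2_\xi + \sigma^2_v \tag{6.5e} \\
\mathrm{E}[x_n d_n] &= \mathrm{E}[\xi_n d_n] \tag{6.5f} \\
\mathrm{E}\left[x_n^2 d_n\right] &= \mathrm{E}[\xi_n^2 d_n] + \mathrm{E}[d_n]\,\sigma^2_v \tag{6.5g}
\end{align}

Since it is not possible to observe $\xi_n$ and $\epsilon_n$, solving this system of seven equations in ten parameters, where we consider $\Pi$, $\mathrm{E}[\xi_n d_n]$, $\mathrm{E}[\xi_n^2 d_n]$, $\mathrm{E}[\epsilon_n d_n]$, $\mathrm{E}[\epsilon_n^2 d_n]$ and $\mathrm{E}[\xi_n\epsilon_n d_n]$ as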


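simplification of notation, define $z_n$, $k_n$, $l_n$ and $m_n$ respectively as

$$z_n \equiv \frac{\xi_n}{\sigma_\xi}, \qquad k_n \equiv c_n - b_1 - b_2\sigma_\xi z_n, \qquad l_n \equiv \frac{c_n - b_1 - \epsilon_n}{b_2\sigma_\xi}, \qquad m_n \equiv \int_{e^{-k_n - \gamma}}^{\infty} \frac{e^{-t}}{t}\, dt.$$

Even with additional assumptions on the distribution of the latent regressor $\xi_n$ it is difficult to derive directly calculable functions of these extra parameters in the original parameters. However, by assuming a normal distribution for $\xi_n$, and thus as a result a standard normal distribution for $z_n$, it is possible to apply a simulation-based method to evaluate these functions. In this case we will use the so-called Method of Simulated Moments (MSM) as put forward by McFadden (1989). This leads to the following expressions for the extra parameters that can be simulated

\begin{align}
\mathrm{E}[d_n] &= \mathrm{E}_z\left[F(k_n)\right] \tag{6.6a} \\
\mathrm{E}[\xi_n d_n] &= -\sigma_\xi\,\mathrm{E}_\epsilon\left[\phi(l_n)\right] \tag{6.6b} \\
\mathrm{E}\left[\xi_n^2 d_n\right] &= \sigma^2_\xi\,\mathrm{E}_\epsilon\left[\Phi(l_n) - l_n\phi(l_n)\right] \tag{6.6c} \\
\mathrm{E}[\epsilon_n d_n] &= \mathrm{E}_z\left[k_n F(k_n) - m_n\right] \tag{6.6d} \\
\mathrm{E}\left[\epsilon_n^2 d_n\right] &= \mathrm{E}_z\left[\int_{-\infty}^{k_n} u^2 f(u)\, du\right] \tag{6.6e} \\
\mathrm{E}[\xi_n\epsilon_n d_n] &= -\sigma_\xi\,\mathrm{E}_\epsilon\left[\epsilon_n\phi(l_n)\right], \tag{6.6f}
\end{align}

where $\phi(\cdot)$ and $\Phi(\cdot)$ are the density and distribution function of the standard normal distribution respectively. Furthermore, let $f(\cdot)$ and $F(\cdot)$ denote respectively the density and distribution function of an extreme value distribution, given by

$$F(u) = \exp\left(-e^{-u - \gamma}\right), \qquad f(u) = \exp\left(-u - \gamma - e^{-u - \gamma}\right).$$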

The moments denoted in equation (6.6) can be simulated, as described in Cameron and Trivedi (2005), by drawing a sequence of length $S$ from a standard normal distribution for each observation, denoted by the vector $\tilde{z}_n$, and a sequence of equal length per observation from an extreme value distribution with parameters $-\gamma$ and 1, denoted by the vector $\tilde{\epsilon}_n$. Substituting these vectors, $\tilde{z}_n$ and $\tilde{\epsilon}_n$, for $z_n$ and $\epsilon_n$ respectively, the simulated moments can be calculated and used in the estimation procedure.
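The simulation of these moments can be sketched for a single observation as follows. The code evaluates (6.6a), (6.6b), (6.6c) and (6.6f) from the draws and compares them with a crude frequency simulator that applies the censoring rule (6.4c) directly. The function names are hypothetical, $b_2 > 0$ is assumed so that the inequality defining $l_n$ has the direction used in (6.6b), and (6.6d) and (6.6e) are left out because they require the exponential integral or numerical integration.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def F(u):
    """Extreme value distribution function with location -gamma and scale 1."""
    return math.exp(-math.exp(-u - EULER_GAMMA))


def phi(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)


def Phi(u):
    return 0.5 * (1 + math.erf(u / math.sqrt(2)))


def draw(S, seed=0):
    """Draw the simulation vectors z-tilde and epsilon-tilde once."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(S)]
    eps = [-math.log(rng.expovariate(1.0)) - EULER_GAMMA for _ in range(S)]
    return z, eps


def smooth_moments(c, b1, b2, sigma_xi, z, eps):
    """Simulated versions of (6.6a), (6.6b), (6.6c) and (6.6f); assumes b2 > 0."""
    S = len(z)
    k = [c - b1 - b2 * sigma_xi * zs for zs in z]
    l = [(c - b1 - es) / (b2 * sigma_xi) for es in eps]
    e_d = sum(F(ks) for ks in k) / S
    e_xi_d = -sigma_xi * sum(phi(ls) for ls in l) / S
    e_xi2_d = sigma_xi ** 2 * sum(Phi(ls) - ls * phi(ls) for ls in l) / S
    e_xi_eps_d = -sigma_xi * sum(es * phi(ls) for es, ls in zip(eps, l)) / S
    return e_d, e_xi_d, e_xi2_d, e_xi_eps_d


def frequency_moments(c, b1, b2, sigma_xi, z, eps):
    """Same moments from the raw indicator d = 1(c - b1 - b2*xi - eps > 0)."""
    sums = [0.0, 0.0, 0.0, 0.0]
    for zs, es in zip(z, eps):
        xi = sigma_xi * zs
        if c - b1 - b2 * xi - es > 0:
            sums[0] += 1.0
            sums[1] += xi
            sums[2] += xi * xi
            sums[3] += xi * es
    return tuple(v / len(z) for v in sums)
```

The two sets of values should agree up to simulation noise, while the smooth versions vary continuously with the parameters, which matters for the derivative-based updates used in the estimation procedure.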

For estimation we also require a matrix containing the first order derivatives of the moment conditions. As a generalization of the estimator developed in section 5.4 we will choose the same four moments for our procedure, namely E[t∗n], E[t∗n], E


However, the expectations of the elements of this vector, $\gamma(\theta)$, are now given by the right hand sides of equations (6.5a), (6.5d), (6.5c) and (6.5e) respectively instead. The moment conditions can then again be written in the form

$$\mathrm{E}\left[g_n - \gamma(\theta_0)\right] = 0.$$

Using the above mentioned moments we can now derive the matrix of first order derivatives to be used in the estimation procedure. Because of its size this matrix is presented in appendix C. Additionally, a matrix containing second order derivatives is required as well to be able to apply the Newton-Raphson algorithm correctly. This matrix can be calculated according to

$$H(\theta) = \frac{\partial\,\mathrm{vec}\left(G(\theta)\right)}{\partial\theta'}$$

and is given in appendix C due to its size.
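Since the derivative matrices for the censored model are only given in appendix C, the kind of Newton-type update they feed into is sketched here on the uncensored moment functions of section 5.5 instead, where $\gamma(\theta)$ and $G(\theta)^{-1}$ are available in closed form. The update rule $\theta_{i+1} = \theta_i - G(\theta_i)^{-1}\left(\bar{g} - \gamma(\theta_i)\right)$ used below is an assumption of this illustration: it is the first order Taylor step implied by the definition $G(\theta) = -\partial\gamma(\theta)/\partial\theta'$, not a formula quoted from the thesis. The function names are hypothetical.

```python
import math

PI2_6 = math.pi ** 2 / 6


def gamma(theta):
    """Population moments gamma(theta) of the uncensored model (section 5.5)."""
    b1, b2, s2_xi, s2_v = theta
    return [b1, b2 * s2_xi, b1 ** 2 + b2 ** 2 * s2_xi + PI2_6, s2_xi + s2_v]


def G_inv(theta):
    """Analytical inverse of G(theta) = -d gamma / d theta'."""
    b1, b2, s2_xi, _ = theta
    return [
        [-1.0, 0.0, 0.0, 0.0],
        [2 * b1 / (b2 * s2_xi), 1 / s2_xi, -1 / (b2 * s2_xi), 0.0],
        [-2 * b1 / b2 ** 2, -2 / b2, 1 / b2 ** 2, 0.0],
        [2 * b1 / b2 ** 2, 2 / b2, -1 / b2 ** 2, -1.0],
    ]


def newton_mm(g_bar, theta0, tol=1e-12, max_iter=100):
    """Iterate theta <- theta - G(theta)^{-1} (g_bar - gamma(theta))."""
    theta = list(theta0)
    for _ in range(max_iter):
        resid = [g - m for g, m in zip(g_bar, gamma(theta))]
        ginv = G_inv(theta)
        step = [sum(ginv[i][j] * resid[j] for j in range(4)) for i in range(4)]
        theta = [th - st for th, st in zip(theta, step)]
        if max(abs(st) for st in step) < tol:
            break
    return theta
```

Started reasonably close to the solution, the iteration reproduces the closed-form MM estimates. In the censored model $\gamma(\theta)$ and $G(\theta)$ would be replaced by their simulated counterparts, evaluated with the same draws in every iteration.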

6.2.2 Simulated Moments Estimation Procedure

This section describes the entire estimation procedure based on the MSM outlined in the previous section. Since the simulated moments, $\mathrm{E}[d_n]$, $\mathrm{E}[\xi_n d_n]$, $\mathrm{E}[\xi_n^2 d_n]$, $\mathrm{E}[\epsilon_n d_n]$, $\mathrm{E}[\epsilon_n^2 d_n]$ and $\mathrm{E}[\xi_n\epsilon_n d_n]$, are in fact functions of the parameters $b_1$, $b_2$ and $\sigma^2_\xi$, an iterative procedure is required to achieve suitable estimates. This procedure can be decomposed into the following steps:

1. Before any estimations are made, it is important to generate all random vectors, $\tilde{z}_n$ and $\tilde{\epsilon}_n$ for $n = 1, \ldots, N$, required for the simulations. By doing so before the estimations and not during them we avoid unwanted noise during the estimation process that might be caused by using different drawings for $\tilde{z}_n$ and $\tilde{\epsilon}_n$ during each step of the estimation procedure. The vectors $\tilde{z}_n$ and $\tilde{\epsilon}_n$ are both of length $S$ and contain an identically and independently drawn sample from their respective distributions. Because of the structure of the model, $\tilde{\epsilon}_{n,s}$ is drawn from an extreme value distribution with location parameter $-\gamma$ and scale parameter 1 for all $s = 1, \ldots, S$ and $n = 1, \ldots, N$. Due to the assumptions made, $\tilde{z}_{n,s}$ is drawn from a standard normal distribution for all $s = 1, \ldots, S$ and $n = 1, \ldots, N$.

2. For the estimation procedure we require some initial estimates for the parameters $b_1$, $b_2$, $\sigma^2_\xi$ and $\sigma^2_v$. A very simple way to obtain these is to ignore the censoring in the observations and use the estimator derived in section 5.4. Obviously this estimator is not consistent. However, in contrast to most other estimators this one does take into account the presence of measurement errors, which somewhat makes up for this.

3. During this step we calculate all expectations that require simulation by using the drawings, $\tilde{z}$ and $\tilde{\epsilon}$, based on the functions given in the previous section in which the current estimates, $\hat{\theta}_i$, for the parameters are used. While there are quite a few expectations that have to be simulated during this step, the calculations required are relatively simple and should not take too long.

4. This step of the estimation procedure consists of updating the estimates for the parameters based on a Taylor-series expansion, given by

where $\hat{\theta}_i$ denotes the $i$-th (current) estimates for the parameters, $\tilde{G}(\hat{\theta}_i)$ denotes the matrix of first order derivatives calculated based on the simulated expectations from step 3 using the estimates $\hat{\theta}_i$ and $\tilde{\gamma}(\hat{\theta}_i)$ denotes the expectation of the moments used in the estimation procedure as noted before, calculated by using the simulated expectations from step 3 using the current estimates for the parameters, $\hat{\theta}_i$.

5. Repeat steps 3 and 4 by setting $i$ equal to $i + 1$ until the process has converged.

6.2.3 Properties of the Estimators

Since the estimator is still based on MM estimation the asymptotic distribution is known to be normal, although the variance is larger by a factor $1 + \frac{1}{S}$ due to the use of frequency simulators, as described in Cameron and Trivedi (2005). If $S$ is chosen large enough this should hardly have any impact on the resulting distribution. The covariance matrix can be calculated in a way similar to that used in section 5.5. Unfortunately, however, due to the presence of censoring the analytical expressions for this matrix become very complex without adding information on the estimators that cannot be obtained through simpler means, for example through an approximation of the matrix $\Psi$ for this model, obtained by

$$\hat{\Psi} = \frac{1}{N}\sum_{n=1}^{N}\left(g_n - \bar{g}\right)\left(g_n - \bar{g}\right)',$$

for which it holds that

$$\operatorname*{plim}_{N\to\infty} \hat{\Psi} = \Psi.$$

Using this approximation instead of the true matrix $\Psi$ it is possible to calculate the variances of the estimators according to

$$\hat{\Sigma}_\theta = \left(1 + \frac{1}{S}\right) G(\theta)^{-1}\,\hat{\Psi}\,\left(G(\theta)^{-1}\right)'.$$

6.2.4 Improving the Estimators


7. CONCLUSION

While the constant hazard model studied in this thesis to construct a consistent estimator is rather simple in nature, it can easily be expanded upon to include more covariates with or without measurement errors. The estimation method can be adapted to accommodate the increase of covariates as well without much trouble. The two most important requirements for consistent estimation on which this method is based are the knowledge of the precise form of the transformation performed on the observed durations, which in most cases coincides with the integrated hazard function, and the exact specification of the distribution of the error term $\epsilon_n$. Given these two pieces of information it is straightforward to adapt the model used to accommodate different data generating processes.

The choice of specification, as discussed in section 2.4, already provides at least one part of the required information. If a proportional hazard specification is used, the distribution of $\epsilon_n$ is known to be an extreme value distribution with location parameter $-\gamma$ and scale parameter 1. On the other hand, if an accelerated lifetime specification is used, the transformation of the observed durations is limited to the form $-\ln(t_n)$. Therefore, whichever specification is chosen, only one part of the required information still has to be obtained.

This information on either the transformation of the observed durations or the specification of the error term distribution, depending on the specification chosen, is entirely determined by the distribution assumed for the durations. For the analysis in this thesis we have chosen to use the simple exponential distribution. Other common choices for duration distributions include the Weibull distribution and the log-logistic distribution, but many others are also possible. However, the additional parameters associated with these distributions can make the estimation process more complicated, since they can influence the transformation of the durations or the specification of the error distribution. Without expanding upon the method already developed it is not possible to deal with additional parameters in either the transformation of the durations or the distributional specification of the disturbance term.


REFERENCES

Aitkin, M., & Rocci, R. (2002). A general maximum likelihood analysis of measurement error in generalized linear models. Statistics and Computing, 12, 163-174.

Amemiya, T. (1984). Tobit models: A survey. Journal of Econometrics, 24, 3-61.

Augustin, T. (2004). An exact corrected log-likelihood function for Cox's proportional hazards model under measurement error and some extensions. Scandinavian Journal of Statistics, 31, 43-50.

Baker, S. (1998). Analysis of survival data from a randomized trial with all-or-none compliance: Estimating the cost-effectiveness of a cancer screening program. Journal of the American Statistical Association, 93, 929-934.

Brand, R. (2002). Microdata protection through noise addition. In J. Domingo-Ferrer (Ed.), Inference control in statistical databases, from theory to practice. Lecture notes in computer science (p. 97-116). Berlin: Springer.

Buzas, J., & Stefanski, L. (1996). Instrumental variable estimation in generalized linear measurement error models. Journal of the American Statistical Association, 91, 999-1006.

Cameron, C., & Trivedi, P. (2005). Microeconometrics, methods and applications. New York: Cambridge University Press.

Chesher, A. (1991). The effect of measurement error. Biometrika, 78, 451-462.

Cochran, W. (1968). Errors of measurement in statistics. Technometrics, 10, 637-666.

Cox, D. (1972). Regression models and life-tables. Journal of the Royal Statistical Society. Series B (Methodological), 34, 187-220.

Cox, D. (1975). Partial likelihood. Biometrika, 62, 269-276.

Cox, D., & Oakes, D. (1985). Analysis of survival data. London: Chapman and Hall.

Czado, C., & Rudolph, F. (2002). Application of survival analysis methods to long-term care insurance. Insurance: Mathematics and Economics, 31, 395-413.

Dupuy, J.-F. (2004). The proportional hazards model with covariate measurement error. Journal of Statistical Planning and Inference, 135, 260-275.

Heckman, J. (1990). Varieties of selection bias. American Economic Review, 80, 313-318.

Hosmer, D., & Lemeshow, S. (1999). Applied survival analysis, regression modelling of time to event data. New York: John Wiley & Sons Inc.

Hu, C., & Lin, D. (2002). Cox regression with covariate measurement error. Scandinavian Journal of Statistics, 29, 637-655.

Hu, P., Tsiatis, A., & Davidian, M. (1998). Estimating the parameters in the Cox model when covariate variables are measured with error. Biometrics, 52, 1407-1419.

Johnson, N., Kotz, S., & Balakrishnan, N. (1995). Continuous univariate distributions (2 ed., Vol. 2). New York: John Wiley & Sons.

Karlsson, M. (2006). Estimators of regression parameters for truncated and censored data. Metrika, 63, 329-341.


Klein, J., & Moeschberger, M. (2003). Survival analysis, techniques for censored and truncated data (Vol. 15). New-York: Springer-Verlag Inc.

Lancaster, T. (1979). Econometric methods for the duration of unemployment. Econometrica, 47, 939-956.

Lancaster, T. (1990). The econometric analysis of transition data (Vol. 17). Cambridge: Cambridge University Press.

Lewbel, A. (1997). Constructing instruments for regressions with measurement error when no additional data are available, with an application to patents and R&D. Econometrica, 65, 1201-1213.

McFadden, D. (1989). A method of simulated moments for estimation of discrete response models without numerical integration. Econometrica, 57, 995-1026.

Mills, J. (1926). Table of the ratio: Area to bounding ordinate, for any portion of normal curve. Biometrika, 18, 395-400.

Nakamura, T. (1992). Proportional hazards model with covariates subject to measurement error. Biometrics, 48, 829-838.

Nielsen, S. (2000). On simulated EM algorithms. Journal of Econometrics, 96, 267-292.

Powell, J. (1984). Least absolute deviations estimation for the censored regression model. Journal of Econometrics, 25, 303-325.

Powell, J. (1986). Symmetrically trimmed least squares estimation for Tobit models. Econometrica, 54, 1435-1460.

Prentice, R. (1982). Covariate measurement errors and parameter estimation in a failure time regression model. Biometrika, 69, 331-342.

Schneeweiss, H., & Augustin, T. (2006). Some recent advances in measurement error models and methods. Allgemeines Statistisches Archiv, 90, 183-197.

Slutsky, E. (1925). Ueber stochastische asymptoten und grenzwerte [On stochastic asymptotes and limit values]. Metron, 5, 3-89.

Tobin, J. (1958). Estimation of relationships for limited dependent variables. Econometrica, 26, 24-36.

Tsiatis, A., & Davidian, M. (2001). A semiparametric estimator for the proportional hazards model with longitudinal covariates measured with error. Biometrika, 88, 447-458.

Verbeke, J., & Cools, R. (1995). The Newton-Raphson method. International Journal of Mathematical Education in Science & Technology, 26, 177-194.

Walck, C. (1996). Hand-book on statistical distributions for experimentalists. (Internal Note SUF-PFY/96-01)

Wansbeek, T., & Meijer, E. (2000). Measurement error and latent variables in econometrics (Vol. 37). North-Holland: Elsevier Science.

Weisstein, E. (2003). Logistic distribution. MathWorld–A Wolfram Web Resource. (http://mathworld.wolfram.com/LogisticDistribution.html, last accessed on 2007-02-06)

Weisstein, E. (2005). Euler-mascheroni constant. MathWorld–A Wolfram Web Resource. (http://mathworld.wolfram.com/Euler-MascheroniConstant.html, last accessed on 2007-02-06)

Weisstein, E. (2006a). Apéry's constant. MathWorld–A Wolfram Web Resource. (http://mathworld.wolfram.com/AperysConstant.html, last accessed on 2007-02-06)

Weisstein, E. (2006b). Extreme value distribution. MathWorld–A Wolfram Web
