Academic year: 2021


VU University, Faculty of Sciences

Statistical Data Analysis, part I

27 March 2014

Use of a basic calculator is allowed. Graphical calculators and mobile phones are not allowed. This exam consists of 4 questions (27 points).

Please write all answers in English. Grade = (total + 3)/3.

GOOD LUCK!

Question 1 [7 points]

Are the following statements correct/sensible? Motivate your answer by a short argument or a sketch.

a. [2 points] In the context of bootstrapping: the parametric bootstrap is always better than the empirical bootstrap.

b. [1 point] The influence function of the 10%-trimmed mean is bounded.

c. [2 points] Consider a bivariate sample $(X_1, Y_1), \ldots, (X_n, Y_n)$. The two stem-and-leaf plots of the X-values and the Y-values separately contain the same information as the bivariate scatter plot.

d. [2 points] If the dots in a QQ-plot of a sample (vertical axis) against some distribution $F_0$ (horizontal axis) show an S-shape, the distribution $F_0$ has heavier tails than the distribution of the data.

Question 2 [8 points]

In Figure 1 (see page 3) an empirical (two sample) QQ-plot of two data sets x and y is shown.

a. [2 points] Do you think that the underlying distributions of the two data sets belong to the same location-scale family? Motivate your answer.

b. [1 point] What can you say about the normality of the underlying distribution of data set x?

c. [3 points] Suppose that we would like to test whether or not the underlying distribution of data set x is the normal distribution with expectation 0.5 and variance 1. Evaluate for each of the following tests for goodness of fit how suitable they are for testing this, and motivate your answer:

i) Kolmogorov-Smirnov test;

ii) chi-square test for goodness of fit;

iii) Shapiro-Wilk test.
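As background for test i), the following is a minimal sketch of how a Kolmogorov-Smirnov statistic can be computed against a fully specified null distribution, here the normal distribution with expectation 0.5 and variance 1 from the question. The function names `normal_cdf` and `ks_statistic` and the simulated sample are illustrative assumptions; the exam's data set x is not available.

```python
import math
import random


def normal_cdf(x, mu=0.5, sigma=1.0):
    """Distribution function of N(mu, sigma^2); defaults match the null hypothesis."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic(sample, cdf):
    """Largest distance between the empirical distribution function and cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical distribution function jumps from i/n to (i+1)/n at x,
        # so the distance is checked on both sides of the jump.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Hypothetical sample, simulated only to show the call; not the exam data.
rng = random.Random(1)
simulated = [rng.gauss(0.5, 1.0) for _ in range(50)]
print(ks_statistic(simulated, normal_cdf))
```

The sketch only computes the statistic; a critical value or p-value would still have to come from the null distribution of the statistic.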

1

d. [2 points] Consider the following goodness-of-fit test situation:

$H_0 : F \in \mathcal{F}_0$, where $F$ is the unknown underlying distribution of a given sample and $\mathcal{F}_0$ is some class of distributions. Explain the following statement: “The test statistic $T$ for goodness of fit is nonparametric (distribution free) under the null hypothesis”.

Question 3 [5 points]

Consider the data presented in Figure 2 (see page 3).

a. [2 points] Empirical bootstrap values for the sample mean and sample median of this data set were computed and some quantiles of these bootstrap values of both location estimators are:

quantile      0.025   0.05   0.5    0.95   0.975
estimator 1   0.68    0.75   1.08   1.39   1.50
estimator 2   1.49    1.55   2.07   2.54   2.78

Indicate which of the two estimators is the mean: estimator 1 or estimator 2? Motivate your answer.

b. [2 points] Determine the length of the 95% bootstrap confidence intervals both for the mean and the median of the underlying distribution. (You are not asked to determine the intervals, only their lengths.)
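The arithmetic behind part b can be sketched as follows, assuming the 95% bootstrap confidence interval is built from the 0.025 and 0.975 quantiles of the bootstrap values. Under that assumption the length equals the difference of these two quantiles, whether the interval is formed as a percentile interval or by reflecting the quantiles around the estimate. The variable names are illustrative; the numbers are those given in part a.

```python
# Bootstrap quantiles as given in part a, keyed by quantile level.
bootstrap_quantiles = {
    "estimator 1": {0.025: 0.68, 0.05: 0.75, 0.5: 1.08, 0.95: 1.39, 0.975: 1.50},
    "estimator 2": {0.025: 1.49, 0.05: 1.55, 0.5: 2.07, 0.95: 2.54, 0.975: 2.78},
}

# Length of a 95% interval: upper 0.975 quantile minus lower 0.025 quantile.
lengths = {
    name: q[0.975] - q[0.025]
    for name, q in bootstrap_quantiles.items()
}

for name, length in lengths.items():
    print(f"{name}: length {length:.2f}")
```

The sketch reports a length per estimator only; deciding which estimator is the mean and which is the median remains the subject of part a.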

c. [1 point] Which estimator for location do you prefer for this data set? Motivate your answer.

Question 4 [7 points]

Let $X_1, \ldots, X_n$ be independent and identically distributed random variables with unknown distribution $P$. Suppose that the sample variance $T_n(X_1, \ldots, X_n) = S_X^2$ is used to estimate the variance of $P$. To determine the accuracy of this estimator, its standard deviation is estimated by means of the empirical bootstrap.

a. [3 points] Describe the steps of the empirical bootstrap scheme that you would use to find the bootstrap estimate of the standard deviation of $T_n$.
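One possible empirical bootstrap scheme for this setting is sketched below; it is an illustration under stated assumptions, not a model answer. The function name `bootstrap_sd_of_variance`, the number of bootstrap replications `n_boot`, the seed, and the example data are all hypothetical choices.

```python
import random
import statistics


def bootstrap_sd_of_variance(data, n_boot=1000, seed=0):
    """Empirical bootstrap estimate of the standard deviation of the sample variance."""
    rng = random.Random(seed)
    n = len(data)
    boot_values = []
    for _ in range(n_boot):
        # Draw n values with replacement from the observed data, i.e. sample
        # from the empirical distribution of the original sample.
        resample = rng.choices(data, k=n)
        # Recompute the estimator (the sample variance) on the resample.
        boot_values.append(statistics.variance(resample))
    # The sample standard deviation of the bootstrap values estimates
    # the standard deviation of the estimator.
    return statistics.stdev(boot_values)


# Hypothetical data, only to show the call.
example = [1.0, 2.0, 4.0, 7.0, 3.5, 0.5]
print(bootstrap_sd_of_variance(example, n_boot=500))
```

Fixing the seed makes the result reproducible; a larger `n_boot` reduces the simulation error of the estimate but not the error from replacing $P$ by the empirical distribution.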

b. [2 points] Describe briefly which two errors are (necessarily) made in this bootstrap procedure.

c. [2 points] Now consider a bootstrap test for testing $H_0 : P \in \mathcal{P}_0$ using some sensible test statistic which has an unknown distribution under $H_0$. Indicate for each error that you mentioned in part (b) whether such an error is also present in the context of this bootstrap test. Motivate your answer.

2

[Two sample QQ-plot: horizontal axis x, ticks from 0 to 4; vertical axis y, ticks from 0.0 to 0.4]

Figure 1: Two sample QQ-plot of two data sets x and y.

[Histogram of x: horizontal axis x, ticks from 0 to 14; vertical axis Frequency, ticks from 0 to 30]

Figure 2: Histogram of a data set x.

THE END

3
