
Self-reported household income satisfaction among elderly in Europe, investigated by means of anchoring vignettes

Fred Heijnen

January 30, 2012

Abstract

We investigate the determinants of self-reported income satisfaction. We correct for individual-specific scale biases by means of anchoring vignettes. The model of choice is the CHOPIT model introduced by King et al. (2004) for anchoring vignettes. Furthermore, we investigate whether two assumptions of the CHOPIT model hold. The first assumption is response consistency: individuals use the same scales for the self-report and vignette questions. The second assumption is vignette equivalence: there is no systematic difference in the interpretation of a vignette between respondents with different characteristics. We find that both assumptions are violated, and hence we reject the CHOPIT model. We also give an explanation for the strange behavior of the after-tax household income observations. Finally, we investigate whether the vignettes are well chosen and find that they maximize the information on self-reported income satisfaction.

Keywords: Household income satisfaction; CHOPIT model; Anchoring vignettes; Response consistency; Vignette equivalence


University of Groningen

Master’s Thesis Econometrics, Operations Research and Actuarial Studies Specialization: Econometrics


Contents

1 Introduction

2 Anchoring vignettes

3 The data

3.1 Self-reported household income satisfaction and vignettes

3.2 Household income

3.3 Exchange and PPP rates

3.4 Other explanatory variables

4 The CHOPIT model

5 Estimation results

6 Response consistency and vignette equivalence

6.1 Response consistency

6.2 Using the θl estimates for testing response consistency

6.3 Vignette equivalence

7 Design of the vignettes

7.1 Estimation results including only one vignette

1 Introduction

Subjective surveys on, for example, life satisfaction, health and economic status are widely used, for instance to measure which aspects determine life satisfaction. They are also used to compare the well-being of individuals in different countries. However, individuals from different countries (or from different groups within a country) might rate their satisfaction on different scales. A model in which the benchmark can be adapted per individual would solve this problem. However, with only data on the subjective survey and individual characteristics, we cannot separately identify the benchmark parameters and the parameters that capture the actual influence of individual characteristics.

Thus we need extra data. King et al. (2004) proposed asking extra questions about hypothetical persons: the individuals who fill in the survey are asked to answer the same question for hypothetical persons. This method is called anchoring vignettes and is described in the following section.

The method has been used, for instance, to measure life satisfaction by Angelini, Cavapozzi, Corazzini and Paccagnella (2010) and household income satisfaction by Bonsang and van Soest (2010). In this paper we also investigate household income satisfaction, among 11 European countries. We are interested in whether differences in household income satisfaction between countries can be explained by differences in the benchmarks. The answer to this question has already been given by Bonsang and van Soest (2010), who discuss the impact of fitting a model that takes the differences in the benchmarks into account. The more general model fits significantly better than the restrictive model and, for instance, shifts France from a dissatisfied country to a satisfied country, which corresponds with the level of its income. Household size and the number of chronic diseases have a negative effect on household income satisfaction, while the number of years of education has a positive effect. We obtain the same results.

Therefore we extend their research and focus on the question of whether the CHOPIT model should be rejected in the case of household income satisfaction, and whether the vignette questions provide enough information on the benchmarks.

We add extra explanatory variables, for instance on living area, as well as important variables such as the number of children and marital status. The latter variables are important since they are part of the description of the hypothetical persons. Among our findings are that the number of children and being married have a negative effect on household income satisfaction, while living in the suburbs or a large town has a positive effect compared to living elsewhere.


estimate a more general model for testing vignette equivalence. We also propose and use several other formal tests. All the results of the tests indicate that both assumptions are violated, thus we reject the (CHOPIT) model.

We also give an explanation for the strange behavior of the after-tax household income observations. Bonsang and van Soest (2010) already noted the presence of a large number of observations with extremely large after-tax household incomes. They discarded those observations as outliers, but we explain that these large after-tax household incomes are yearly incomes rather than monthly incomes, as they should be. Furthermore, we investigate the significance of the vignette questions. The most important feature of the vignette questions is that they are able to discriminate between individuals: if all individuals gave the same answer to a vignette question, that question would provide no information on differences in the benchmarks. We find that the two vignette questions have almost the maximum discriminating power attainable with only two vignettes.

Thus we find the same estimation results as Bonsang and van Soest (2010), while handling the after-tax household income in a different way, but we reject the model. We also find that the vignettes are well formulated.

The outline of our research is as follows. In section 3, the data are described and some descriptive statistics are shown. In section 4, we describe the ordered probit model, the CHOPIT model and the generalization of the CHOPIT model. In section 5, we compare both models. We also investigate which individual characteristics determine self-reported income satisfaction and how they affect household income satisfaction.

In section 6, we show some counterfactuals. This means that we predict the household income satisfaction level by means of the outcome of the CHOPIT model, but assuming that all individuals live in a specific country. By then looking at the predicted satisfaction levels by country, we show the impact of different benchmarks between individuals from different countries.

2 Anchoring vignettes

Data on self-reported household income satisfaction show large differences in satisfaction between individuals from different countries (Bonsang and van Soest, 2010). This can be caused by different interpretations of the same question due to personal or cultural characteristics. For instance, individuals from France are less satisfied than German, Belgian or Danish individuals, while on average the purchasing power of their household income is quite similar. The French are as satisfied as individuals from the Czech Republic, although their household income is substantially higher. Thus it seems that, due to cultural differences, the French interpret the same question differently or use different norms for self-description. This difference in interpretation of the same question (or use of different norms) is called Differential Item Functioning (DIF) or item bias, which was first described by Holland and Wainer (1993). Two individuals can thus be identically satisfied with their household income, yet give different answers to the question on household income satisfaction. This is shown in Figure 1, where persons A and B are equally satisfied, but person A reports being dissatisfied and person B reports being neither satisfied nor dissatisfied.

Figure 1: Differential item functioning

true. This means that the self-reported level of satisfaction cannot be compared by means of a generalization of a binary response model, such as an ordered probit model. The ordered probit model is described in section 4. Hence we need a model that is able to predict the different scales of persons A and B, but for this we need extra data.

The extra data we need are called anchoring vignettes and provide a solution for DIF. Anchoring vignettes are extra questions in which respondents are asked about the household income satisfaction of hypothetical persons. This method provides anchors that correct for differences in the interpretation of questions on subjective evaluations. If, for instance, we have two questions on hypothetical persons, and persons A and B answer vignettes 1 and 2 as pictured in Figure 2, we might expect that they use different (unobserved) thresholds.

Figure 2: Answers of person A and B to the vignette questions.

Person A thus indicates that the hypothetical persons are less satisfied than person B does. One of the assumptions of the model, the so-called vignette equivalence assumption, states that different individuals understand the vignette questions in the same way on average, so persons A and B should attribute almost the same household income satisfaction level to a hypothetical person. We therefore assume that the difference is caused by the fact that persons A and B use different benchmarks.

of household income satisfaction can now be compared, since the thresholds of individuals are now adjusted such that they are identical. In the case of persons A and B, we adjust the scale of person B such that it is comparable with that of person A.

Figure 3: Answers of person A and B to the vignette questions.

3 The data

We use data from the Survey of Health, Ageing and Retirement in Europe (SHARE), a cross-national database containing variables on health, socio-economic status, and social and family networks of more than 45,000 individuals, mostly aged 50 or over. We only use a subsample of the database, the COMPARE sample, i.e. a random subsample of the total population who filled in a questionnaire on household income satisfaction and the corresponding vignette surveys. The 11 countries in the sample are Germany, Belgium, Czechia, Denmark, France, Greece, Italy, the Netherlands, Poland, Spain and Sweden. The questionnaires are often answered by more than one member of a household; household-specific questions, however, are answered by only one representative of the household. Household income satisfaction is not a household-specific question and is thus answered by every individual in the COMPARE sample. Most questionnaires were taken in 2007, and the remainder in 2006. Bonsang and van Soest (2010) use the same database in their research, but they use a different approach for missing data; the appendix describes their approach. After handling missing data we are left with a dataset of 5794 observations. For more information on the SHARE data, see the website www.share-project.org.

3.1 Self-reported household income satisfaction and vignettes

Vignettes are used in surveys in which individuals are asked for self-reported evaluations of their personal situation, for example how satisfied the individual is with his or her household income. The question about household income satisfaction is the following:

How satisfied are you with the total income of your household?

Very dissatisfied/ Dissatisfied/ Neither satisfied, nor dissatisfied/ Satisfied/ Very satisfied.

Answer            Germany  Belgium  Czechia  Denmark  France  Greece  Italy  Netherlands  Poland  Spain  Sweden
Very dissatisfied     2.7      3.2      4.5      0.9     5.3    15.7    5.1          1.0    13.9    6.1     1.3
Dissatisfied         12.0     16.1     21.8      3.0    21.6    17.2   16.6          6.2    30.3   23.7     6.5
Neither              23.9     25.6     38.9     15.2    38.5    37.7   32.6         16.4    31.1   28.9    27.9
Satisfied            52.5     46.7     31.4     58.2    30.1    20.9   41.2         60.1    21.6   38.4    47.2
Very satisfied        8.8      8.4      3.4     22.6     4.5     8.4    4.5         16.2     3.1    2.9    17.1
Observations          859      694      679      884     289     267    567          398     478    262     417

Table 1: Response percentages for self-reported household income satisfaction by country.

The problem of DIF can be overcome by anchoring vignettes, which provide benchmarks and scales (anchors) for different individuals. Anchoring vignettes are questions about hypothetical individuals, which provide a measure to estimate the different benchmarks and scales. The questions corresponding to the vignettes are:

Jim is married and has two children, the total after tax household income of his family is 1,500 Euro per month. How satisfied do you think Jim is with the total income of his household?

Very dissatisfied/ dissatisfied/ neither satisfied nor dissatisfied/ satisfied/ very satisfied,

and

Anne is married and has two children, the total after tax household income of her family is 3,000 Euro per month. How satisfied do you think Anne is with the total income of her household?

Very dissatisfied/ dissatisfied/ neither satisfied nor dissatisfied/ satisfied/ very satisfied.

Answer            Germany    Belgium  Czechia  Denmark  France  Greece  Italy  Netherlands  Poland  Spain  Sweden
Very dissatisfied     4.8        8.4      7.0     11.9    12.4    33.3   15.8          2.8     2.0   12.7    13.3
Dissatisfied         41.6       43.6     28.6     47.6    48.6    54.2   37.4         40.5    12.6   39.7    53.9
Neither              34.8       29.5     33.0     24.6    27.0    12.1   27.1         38.1    23.9   17.3    23.4
Satisfied            18.0       17.4     26.2     14.3    11.2     0.4   16.0         18.2    50.1   22.1     8.8
Very satisfied        0.8        1.1      5.3      1.7     0.8     0.0    3.6          0.4    11.3    8.1     0.7
Amount (Euro)        1600  1460/1500      864     1906    1500       -   1450         1500     872   1300    1622

Table 2: Response percentages for the first vignette question by country.

Answer            Germany    Belgium  Czechia  Denmark  France  Greece  Italy  Netherlands  Poland  Spain  Sweden
Very dissatisfied     5.4        2.0      0.3      0.4     0.8     0.8    3.8          0.2     1.5    1.1     0.9
Dissatisfied          1.8        6.0      1.6      4.2     8.1    10.7    3.9          0.6     3.7    4.8     3.6
Neither               9.7       13.2      5.4     17.1    22.5    31.8   15.1          5.4     4.2   15.1    18.7
Satisfied            47.4       43.5     46.3     56.6    50.6    42.5   57.3         50.1    38.2   48.2    51.9
Very satisfied       35.7       35.3     46.3     21.7    18.0    14.2   19.9         43.7    52.5   30.7    24.9
Amount (Euro)        3100  2900/3000     1728     3812    3000       -   2900         3000    1744   2600    3243

Table 3: Response percentages for the second vignette question by country.

Poland is the poorest country in our sample, so incomes around 1500 and 3000 Euro are quite large there; hence most Polish respondents rate both hypothetical persons as satisfied. An income of 1500 Euro is not that large in the Netherlands, so many Dutch respondents indicate that the hypothetical person is dissatisfied with his household income. The amounts are in Euros; the amounts for Greece are unknown, and in Belgium different amounts are used for the Dutch-speaking and French-speaking areas. We use the exchange rates of 2007, which are supplied with the SHARE data. The hypothetical individual of the second vignette has twice the household income of the individual of the first vignette, so we expect most individuals in the sample to report a higher satisfaction level for hypothetical individual 2. Indeed, 84.9 percent of the individuals indicate that the second hypothetical individual is more satisfied than the first. Another 12.5 percent give both the same satisfaction level, and 2.6 percent answer that the hypothetical individual with the lower household income is more satisfied than the richer one. The poorest countries are Poland and Czechia, so the second hypothetical person earns much more than the average person in either of these countries. Hence, as expected, almost all respondents from both countries indicate that Anne is satisfied or very satisfied with her household income.

3.2 Household income


variable.

We have 7272 observations; unfortunately, we do not have data on total after-tax household income per month for every individual. There are 575 individuals who answer that they do not know their household income and 735 individuals who refuse to answer. Finally, we omit household incomes below 10 and above 100000 Euro, 56 observations in total. We plot log income per country, since the distribution of income resembles a log-normal distribution.

Figure 4: Log after-tax household income per month per country

The country with the most observations is Germany, hence the histogram of Germany is the smoothest and resembles that of a normal distribution, except for a small bump to the right of the main part of the histogram. The household incomes in the small bump are extremely large, and there are too many such observations compared to the actual household income distribution of the German elderly. In the other countries we also observe a small bump with extremely large incomes, and the observations of Italy contain a lot of unreliable household incomes.

Thus a large number of individuals did not fill in their correct after-tax household income per month, but a number that is on average at least 10 times as large as expected. We expect that these individuals filled in their after-tax household income per year. Let us assume that a random 10 percent of the German individuals filled in their household income per year. We know that

\[
\begin{aligned}
\log(\text{income per year}) &= \log(12 \times \text{income per month}) \\
&= \log(12) + \log(\text{income per month}) \approx 2.5 + \log(\text{income per month}).
\end{aligned}
\]

and the other 2.5 to the right containing 10 percent of the observations, both resembling a normal distribution. Thus our assumption is consistent with the actual histogram of Germany.
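The log-shift argument can be illustrated with a small simulation. The sketch below is not the procedure used in the thesis: the mean and standard deviation of log monthly income and the sample size are hypothetical values chosen only for illustration. It draws log-normal monthly incomes, lets a random 10 percent of the respondents report a yearly amount, and compares the distance between the two groups with log(12).

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical parameters of log monthly income (not estimated from SHARE).
MU_LOG_MONTHLY = 7.5
SIGMA_LOG = 0.5
SHARE_YEARLY = 0.10  # assumed share of respondents reporting a yearly amount

monthly_logs, yearly_logs = [], []
for _ in range(20000):
    log_monthly = random.gauss(MU_LOG_MONTHLY, SIGMA_LOG)
    if random.random() < SHARE_YEARLY:
        # The reported amount is 12 times the monthly income.
        yearly_logs.append(math.log(12 * math.exp(log_monthly)))
    else:
        monthly_logs.append(log_monthly)

shift = statistics.mean(yearly_logs) - statistics.mean(monthly_logs)
share_in_bump = len(yearly_logs) / (len(monthly_logs) + len(yearly_logs))

print(f"log(12)                          = {math.log(12):.3f}")
print(f"distance between the two groups  = {shift:.3f}")
print(f"share of observations in bump    = {share_in_bump:.3f}")
```

Under these assumptions the histogram of the simulated log incomes shows a main part and a smaller copy of it roughly 2.5 units to the right, which is the pattern described for Germany.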

The remaining question is why so many individuals filled in their household income per year. In the questionnaire, the question about total after-tax household income per month is preceded by questions on income per year. The question about after-tax household income per month is stated as follows:

Concluding, how much was the after-tax total income of your entire household in an average month in “previous year”?

where “previous year” should be replaced by, for instance, 2006 or 2007. Hence, if the question is not read carefully, one might read: ‘Concluding, how much was the after-tax total income of your entire household in 2007?’ Thus the questionnaire does not emphasize the word month enough, or it should have asked for the after-tax household income per year. We also investigated whether there are any differences between the Italian questionnaire and the other questionnaires, since the number of unreliable observations for Italy is large. We translated the question in the Italian questionnaire:

Concluding, what was the total monthly after-tax income that your entire family has on average received in the “previous year”?

Hence the word month is stated early on in the sentence, while in the questionnaires of the other countries it is one of the last words.

The data contain four countries with currencies other than the Euro: the Czech Republic, Denmark, Poland and Sweden. However, the incomes for every country are expressed in Euros. Hence all incomes are expressed in the same nominal terms, but there are differences in purchasing power between countries; e.g. a loaf of bread can cost on average 1 Euro in the Netherlands and 0.80 Euro in Poland. Thus an individual in a country with a low price level can buy more for the same nominal income than an individual who lives in a country with a high price level, and is therefore expected to be more satisfied with his income. This means we have to convert the nominal incomes into real incomes by using a measure of Purchasing Power Parity (PPP).
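The bread example can be turned into a small calculation. The sketch below assumes that the PPP measure is a relative price level (reference country = 1.0) and takes the Netherlands as the reference purely for illustration; the function name and its arguments are hypothetical and do not correspond to variable names in the SHARE data.

```python
def ppp_adjusted_income(nominal_income_eur: float, price_level: float) -> float:
    """Convert a nominal Euro income into reference-country purchasing power.

    price_level is the price of a basket relative to the reference country
    (reference = 1.0). It is an illustrative input, not a SHARE variable.
    """
    if price_level <= 0:
        raise ValueError("price_level must be positive")
    return nominal_income_eur / price_level


# Bread example from the text: 1.00 Euro in the Netherlands, 0.80 Euro in Poland.
price_level_poland = 0.80 / 1.00

# A nominal income of 1000 Euro in Poland expressed in Dutch purchasing power.
real_income = ppp_adjusted_income(1000.0, price_level_poland)
print(f"{real_income:.0f} Euro in reference-country purchasing power")
```

With a price level of 0.80, a nominal income of 1000 Euro buys what 1250 Euro would buy in the reference country, which is why nominal incomes alone understate the position of respondents in low-price countries.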

The quartiles of the income distribution per country in Euros adjusted for PPP are given in the table below.

Quartile      Germany  Belgium  Czechia  Denmark  France  Greece   Italy  Netherlands  Poland   Spain  Sweden
1st quartile     1299     1171    785.9     1373    1267   843.6   961.9         1509   570.1   828.8    1575
Median           1925     1596   1039.0     2035    1949  1289.0  1469.0         2062   765.8  1113.0    2039
3rd quartile     2695     2199   1349.0     2876    2923  2051.0  2212.0         3068  1005.0  1842.0    2780

Table 4: The quartiles of the income distribution per country (Euros, adjusted for PPP).

When comparing Table 4 with Table 1, we see for instance that citizens of Poland are the least satisfied which corresponds to their low after-tax household income. The same holds true for most countries, except the French who have a high household income but are not that satisfied compared to countries with similar household income distributions.

3.3 Exchange and PPP rates

The dataset contains variables on exchange rates between countries with different currencies for several years. It also contains variables on purchasing power for every country, with base country Germany and base year 2005. We use these variables to convert the nominal household income into one in Euros adjusted for purchasing power. The questionnaires were taken in different years, 2006 and 2007, so we take this into account when adjusting for currency and purchasing power. The household income of the hypothetical persons is the same for 2006 and 2007, so it is not adjusted for changes in exchange rates and PPP between years. Fortunately, the rates do not change much between 2006 and 2007, so we assume that this does not affect the results.

3.4 Other explanatory variables

We also have data on personal characteristics, i.e. gender and age; socio-economic status, i.e. labour force status and years of education; health status, i.e. number of chronic diseases; and housing, i.e. type of residence and living area. Descriptive statistics by country are shown in the Appendix.

4 The CHOPIT model

We define $y_i^*$ as household income satisfaction and $x_i$ as a vector of explanatory variables of individual $i$, such as log household income, gender and country. We now have the following model

\[ y_i^* = x_i'\beta + \varepsilon_i, \qquad \varepsilon_i \mid x_i \sim N(0, 1). \tag{1} \]

The standard deviation of the error term is fixed at 1 to make sure the model is identified. However, we do not observe $y_i^*$, but the self-reported household income satisfaction $y_i$. Hence

\[ y_i = j \quad \text{if} \quad \tau_i^{j-1} < y_i^* \le \tau_i^{j}, \qquad j = 1, \ldots, 5, \tag{2} \]

where

\[ \tau_i^0 = -\infty, \tag{3} \]

\[ \tau_i^5 = \infty. \tag{4} \]

The above model is called the ordered probit model. The thresholds are fixed across individuals; however, we want to analyze whether self-reported income satisfaction is heterogeneous across individuals from different European countries. The thresholds should then vary across individuals, thus we now define

\[ \tau_i^1 = x_i'\gamma^1, \tag{5} \]

\[ \tau_i^j = \tau_i^{j-1} + x_i'\gamma^j, \qquad j = 2, 3, 4. \tag{6} \]
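As a minimal sketch of equations (2), (5) and (6), the Python fragment below builds the four thresholds of one respondent from the linear specification and turns them into the five response probabilities. The covariate vector, `beta` and `gammas` are hypothetical values chosen for illustration; they are not estimates from this thesis.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cumulative distribution function


def dot(a, b):
    """Inner product of two equally long sequences."""
    return sum(p * q for p, q in zip(a, b))


def thresholds(x, gammas):
    """Equations (5)-(6): tau^1 = x'gamma^1, tau^j = tau^(j-1) + x'gamma^j."""
    taus, tau = [], 0.0
    for gamma in gammas:
        tau += dot(x, gamma)
        taus.append(tau)
    return taus


def category_probabilities(x, beta, gammas):
    """P(y = j | x) for j = 1..5, following equation (2) with N(0, 1) errors."""
    index = dot(x, beta)
    cdf = [0.0] + [PHI(tau - index) for tau in thresholds(x, gammas)] + [1.0]
    return [upper - lower for lower, upper in zip(cdf, cdf[1:])]


# Hypothetical respondent: constant term and log household income.
x = (1.0, 7.5)
beta = (0.0, 0.9)
gammas = [(4.0, 0.2), (0.5, 0.05), (0.6, 0.05), (0.7, 0.05)]

taus = thresholds(x, gammas)
probs = category_probabilities(x, beta, gammas)
print("thresholds:   ", [round(t, 3) for t in taus])
print("probabilities:", [round(p, 3) for p in probs])
```

Because the thresholds depend on $x_i$, two respondents with the same latent satisfaction can end up in different response categories, which is exactly the scale heterogeneity the model is meant to capture.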

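In equation (6) the linear function is often replaced by $\tau_i^j = \tau_i^{j-1} + \exp(x_i'\gamma^j)$ to guarantee $\tau_i^1 \le \tau_i^2 \le \ldots \le \tau_i^4$. However, we use the linear functions, since all of the estimated thresholds for the individuals in the dataset are in the correct order. In the above model $\beta$ and $\gamma^1$ are not identified, since the likelihood function is given by

\[ L(\beta, \gamma^1, \ldots, \gamma^4) = \prod_{i=1}^{n} \prod_{j=1}^{5} \left[ \Phi(\tau_i^j - x_i'\beta) - \Phi(\tau_i^{j-1} - x_i'\beta) \right]^{I(y_i = j)}, \tag{7} \]

which can be rewritten as

\[ L(\beta, \gamma^1, \ldots, \gamma^4) = \prod_{i=1}^{n} \prod_{j=1}^{5} \left[ \Phi(\hat{\tau}_i^j - x_i'\hat{\beta}) - \Phi(\hat{\tau}_i^{j-1} - x_i'\hat{\beta}) \right]^{I(y_i = j)}, \tag{8} \]

where $\hat{\beta} = \beta - \gamma^1$ and $\hat{\tau}_i^j = \tau_i^j - x_i'\gamma^1$ for $j = 1, 2, 3, 4$. Hence $\hat{\tau}_i^j$ is not a function of $\gamma^1$. By maximizing the likelihood function we only obtain the ML estimate of $\hat{\beta} = \beta - \gamma^1$, that is, one estimate for two unknowns.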
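The identification problem can be checked numerically. The sketch below evaluates the logarithm of likelihood (7) with the linear thresholds (5)-(6) on a handful of made-up observations and shows that adding the same arbitrary vector to both $\beta$ and $\gamma^1$ leaves the value unchanged, so the self-reports alone cannot pin down the two vectors separately. All data and parameter values are hypothetical.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cumulative distribution function


def dot(a, b):
    """Inner product of two equally long sequences."""
    return sum(p * q for p, q in zip(a, b))


def log_likelihood(data, beta, gammas):
    """Log of likelihood (7) with the linear thresholds (5)-(6)."""
    total = 0.0
    for x, y in data:
        index = dot(x, beta)
        cdf, tau = [0.0], 0.0
        for gamma in gammas:
            tau += dot(x, gamma)  # tau^1, then tau^j = tau^(j-1) + x'gamma^j
            cdf.append(PHI(tau - index))
        cdf.append(1.0)
        total += math.log(cdf[y] - cdf[y - 1])
    return total


# Hypothetical toy data: x = (constant, covariate), y in 1..5.
data = [((1.0, 0.2), 3), ((1.0, -0.4), 2), ((1.0, 1.1), 5),
        ((1.0, 0.7), 4), ((1.0, -1.0), 1)]
beta = (0.3, 0.9)
gammas = [(-1.0, 0.2), (0.8, 0.05), (0.9, 0.1), (1.0, 0.1)]

# Add the same arbitrary vector to beta and to gamma^1 only.
shift = (0.5, -0.7)
beta_shifted = tuple(b + s for b, s in zip(beta, shift))
gammas_shifted = [tuple(g + s for g, s in zip(gammas[0], shift))] + gammas[1:]

ll_original = log_likelihood(data, beta, gammas)
ll_shifted = log_likelihood(data, beta_shifted, gammas_shifted)
print(ll_original, ll_shifted)
```

The two printed values coincide up to floating-point error, which is why additional information, such as vignette answers, is needed to separate the effect on satisfaction from the effect on the scale.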

Thus we need extra data: the vignette results. Individual $i$'s perception of the household income satisfaction of fictive person $l$ is denoted by $z_{il}^*$, and the true level of household income satisfaction of fictive person $l$ is given by $\theta_l$. Then we have the following model

\[ z_{il}^* = \theta_l + \nu_{il}, \qquad \nu_{il} \sim N(0, \sigma_{\nu_l}^2), \qquad l = 1, 2. \tag{9} \]

We have $l = 1, 2$, since the data contain two vignettes. However, we do not observe $z_{il}^*$, but only the response of individual $i$ to the vignette surveys, i.e. $z_{il}$. Hence

\[ z_{il} = j \quad \text{if} \quad \tau_i^{j-1} < z_{il}^* \le \tau_i^{j}, \qquad j = 1, \ldots, 5. \tag{10} \]

The thresholds for the self-reported satisfaction are the same as those for the vignettes, hence response consistency is assumed. The above model is called the CHOPIT model, introduced by King et al. (2004) for anchoring vignettes.

The most important assumptions of the model are response consistency and vignette equivalence. Response consistency means that individuals use the same scales for the self-report and vignette questions. Vignette equivalence means that there is no systematic difference in the interpretation of a vignette between respondents with different characteristics. The assumptions are discussed in more detail in one of the following sections. We introduce a generalization of the model by replacing equation (9), for either $l = 1$ or $l = 2$, by

\[ z_{il}^* = \theta_{lk} + \nu_{il}, \qquad \nu_{il} \sim N(0, \sigma_{\nu}^2). \tag{11} \]

The subscript $k$ indexes a division of the dataset into subsamples, such as males and females or a division into countries. We cannot replace equation (9) by equation (11) for both vignettes, because then the country dummy parameters and $\theta_{lk}$ would not be identified. The generalization is meant for testing vignette equivalence.

The model is estimated by maximum likelihood. The likelihood function consists of two parts: the self-reported household income satisfaction part and the vignette surveys part. Hence the likelihood function has the following form

and

\[ L_z(\theta_1, \theta_2, \gamma^1, \ldots, \gamma^4, \sigma_{\nu_1}^2, \sigma_{\nu_2}^2) = \prod_{i=1}^{n} \prod_{j=1}^{5} \prod_{l=1}^{2} \left[ \Phi\!\left( \frac{\tau_i^j - \theta_l}{\sigma_{\nu_l}} \right) - \Phi\!\left( \frac{\tau_i^{j-1} - \theta_l}{\sigma_{\nu_l}} \right) \right]^{I(z_{il} = j)}. \tag{14} \]

In the likelihood functions above, $I(\cdot)$ is the indicator function, $\Phi(\cdot)$ is the standard normal cumulative distribution function and $n$ is the number of individuals in the dataset.
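To make the vignette part of the likelihood concrete, the sketch below computes the logarithm of one respondent's factor in equation (14). The thresholds, vignette levels and standard deviations are hypothetical numbers used only for illustration, and the function name is not taken from any existing package.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cumulative distribution function


def vignette_log_likelihood(taus, answers, thetas, sigmas):
    """Log of one respondent's factor in L_z, equation (14).

    taus    : the respondent's four thresholds, tau^1 < ... < tau^4
    answers : observed categories z_il in 1..5, one per vignette
    thetas  : theta_l, one per vignette
    sigmas  : standard deviations sigma_nu_l, one per vignette
    """
    total = 0.0
    for z, theta, sigma in zip(answers, thetas, sigmas):
        cdf = [0.0] + [PHI((tau - theta) / sigma) for tau in taus] + [1.0]
        total += math.log(cdf[z] - cdf[z - 1])
    return total


# Hypothetical values, for illustration only.
taus = [-1.2, -0.3, 0.6, 1.7]
thetas = (-0.5, 1.0)
sigmas = (0.8, 0.9)

log_lik = vignette_log_likelihood(taus, (2, 4), thetas, sigmas)
print(log_lik)
```

The same thresholds `taus` would enter the self-report part of the likelihood, which is how response consistency ties the two parts together.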

The probability that individual $i$ falls in category $j$ is equal to

\[ P(y_i = j \mid x_i) = \int_{\tau^{j-1}}^{\tau^{j}} \phi(z - x_i'\beta)\, dz, \tag{15} \]

where $\phi(\cdot)$ is the standard normal probability density function. We estimate this probability by replacing $\beta$, $\tau^j$ and $\tau^{j-1}$ by the corresponding estimated coefficients of the CHOPIT or ordered probit model. If we are, for example, interested in the probability that a German citizen falls in category $j$, we simply average the individual probabilities,

\[ P(\text{German citizen falls in category } j \mid x_i, i \in G) = \frac{1}{|G|} \sum_{i \in G} P(y_i = j \mid x_i), \tag{16} \]

where $G$ is the set of indices corresponding to all German citizens in the dataset. The interpretation of the estimated coefficients $\beta$ is complicated. Fortunately, the sign (positive or negative) of the estimated parameter $\beta$ determines whether the explanatory variable has an increasing or decreasing effect on $y_i^*$. The marginal effect of $x_i$ for the (generalized) ordered probit model is given by

\[ \frac{\partial P(y_i = j \mid x_i)}{\partial x_i} = \frac{\partial \int_{\tau^{j-1}}^{\tau^{j}} \phi(z - x_i'\beta)\, dz}{\partial x_i} = \left[ \phi(\tau^{j-1} - x_i'\beta) - \phi(\tau^{j} - x_i'\beta) \right] \beta, \tag{17} \]

where the term in brackets can be negative or positive. The marginal effect for the (generalized) CHOPIT model is more complicated due to the fact that $\tau^j$ is a function of $x_i$ for $j = 1, 2, 3, 4$. The marginal effects for the first two categories are

\[ \frac{\partial P(y_i = 1 \mid x_i)}{\partial x_i} = (\gamma^1 - \beta)\, \phi(\tau^1 - x_i'\beta), \tag{18} \]

and

\[ \frac{\partial P(y_i = 2 \mid x_i)}{\partial x_i} = \left( \gamma^1 + \gamma^2 \exp(x_i'\gamma^2) - \beta \right) \phi(\tau^2 - x_i'\beta) - (\gamma^1 - \beta)\, \phi(\tau^1 - x_i'\beta). \tag{19} \]
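Equations (18) and (19) can be checked against numerical derivatives. The sketch below uses a single scalar covariate and hypothetical coefficients; note that equation (19) as written corresponds to the exponential threshold specification $\tau^2 = \tau^1 + \exp(x\gamma^2)$, so that specification is used here. With the linear specification the factor $\gamma^2 \exp(x\gamma^2)$ would simply be $\gamma^2$.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

# Hypothetical scalar coefficients, for illustration only.
BETA, GAMMA1, GAMMA2 = 0.9, 0.2, -0.3


def tau1(x):
    return x * GAMMA1


def tau2(x):
    # Exponential threshold specification, matching equation (19).
    return tau1(x) + math.exp(x * GAMMA2)


def prob_cat1(x):
    return STD_NORMAL.cdf(tau1(x) - x * BETA)


def prob_cat2(x):
    return STD_NORMAL.cdf(tau2(x) - x * BETA) - STD_NORMAL.cdf(tau1(x) - x * BETA)


def effect_cat1(x):
    """Analytical marginal effect, equation (18)."""
    return (GAMMA1 - BETA) * STD_NORMAL.pdf(tau1(x) - x * BETA)


def effect_cat2(x):
    """Analytical marginal effect, equation (19)."""
    first = (GAMMA1 + GAMMA2 * math.exp(x * GAMMA2) - BETA) * STD_NORMAL.pdf(tau2(x) - x * BETA)
    second = (GAMMA1 - BETA) * STD_NORMAL.pdf(tau1(x) - x * BETA)
    return first - second


def central_difference(f, x, h=1e-5):
    """Numerical derivative used as an independent check."""
    return (f(x + h) - f(x - h)) / (2 * h)


x0 = 0.7
print(effect_cat1(x0), central_difference(prob_cat1, x0))
print(effect_cat2(x0), central_difference(prob_cat2, x0))
```

Each printed pair agrees up to the numerical error of the finite difference, which supports the signs and terms in equations (18) and (19).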

5 Estimation results

The estimated coefficients and corresponding standard errors of the CHOPIT model are presented in Table 15. The log-likelihood is -20935. The main results of the model are not surprising; for instance, a major determinant of household income satisfaction is household income. Household size has a negative effect on satisfaction, as expected, since the amount of income spent per individual decreases when the household size increases. The number of children also has a negative effect, and being married or having a partner has a negative effect compared to being single: if you are single, you do not have to decide together with your partner or spouse on how income is spent. The number of years of education has a positive effect on satisfaction; possible explanations, according to Kapteyn et al. (2008), are an imperfect household income variable or that more highly educated individuals are more certain of a higher income.

Women are more satisfied than men, which is in accordance with the findings of Kalugina et al. (2005), who investigated household income satisfaction within households and found that men describe themselves as poorer than the women in the household. Age has a positive effect on satisfaction, and poor health decreases after-tax household income satisfaction.

Individuals living in a large town or suburbs are more satisfied than individuals who live in a city, small town or village. The type of home is not a determinant of satisfaction. Respondents who live rent-free or own a home are more satisfied than respondents who rent their home. Labour force status is not a significant determinant, except that individuals who are unemployed are significantly less satisfied. The country dummies show that Greek and Danish citizens report the highest income satisfaction levels, followed by Sweden, while Poland and France report the lowest income satisfaction levels. This result does not imply that living in Greece makes you more satisfied with your household income than living in France. We predicted the after-tax household income satisfaction distribution by country. The results are shown in the table below.

Answer            Germany  Belgium  Czechia  Denmark  France  Greece  Italy  Netherlands  Poland  Spain  Sweden
Very dissatisfied     2.6      2.8      6.2      0.3     4.4     9.6    6.3          0.3    15.9    8.2     0.5
Dissatisfied         13.3     16.9     22.7      4.5    23.6    21.4   17.3          7.1    29.8   27.1     8.7
Neither              23.8     26.9     33.4     14.5    35.2    34.0   29.8         19.0    27.3   24.0    24.4
Satisfied            51.0     44.7     33.7     59.8    33.7    29.4   42.2         57.7    25.2   35.1    51.7
Very satisfied        9.3      8.7      4.0     20.9     3.0     5.6    4.4         15.9     1.8    5.6    14.7

dataset, which are given in Table 1. The largest prediction error is in the distribution of Greece, which is much larger than in the distributions of the other countries. This might be evidence that Greek respondents are too different to be pooled with the other countries. We also estimated a model without the Greek data, which resulted in similar estimates of the parameters.
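The size of the prediction error per country can be recomputed from the two tables. The sketch below copies the observed shares from Table 1 and the predicted shares from the table above, and uses the sum of absolute differences over the five categories as error measure; that particular measure is an assumption made here for illustration and is not necessarily the one used in the thesis.

```python
# Shares in percent, in the order: very dissatisfied, dissatisfied,
# neither, satisfied, very satisfied.
observed = {
    "Germany": (2.7, 12.0, 23.9, 52.5, 8.8),
    "Belgium": (3.2, 16.1, 25.6, 46.7, 8.4),
    "Czechia": (4.5, 21.8, 38.9, 31.4, 3.4),
    "Denmark": (0.9, 3.0, 15.2, 58.2, 22.6),
    "France": (5.3, 21.6, 38.5, 30.1, 4.5),
    "Greece": (15.7, 17.2, 37.7, 20.9, 8.4),
    "Italy": (5.1, 16.6, 32.6, 41.2, 4.5),
    "Netherlands": (1.0, 6.2, 16.4, 60.1, 16.2),
    "Poland": (13.9, 30.3, 31.1, 21.6, 3.1),
    "Spain": (6.1, 23.7, 28.9, 38.4, 2.9),
    "Sweden": (1.3, 6.5, 27.9, 47.2, 17.1),
}
predicted = {
    "Germany": (2.6, 13.3, 23.8, 51.0, 9.3),
    "Belgium": (2.8, 16.9, 26.9, 44.7, 8.7),
    "Czechia": (6.2, 22.7, 33.4, 33.7, 4.0),
    "Denmark": (0.3, 4.5, 14.5, 59.8, 20.9),
    "France": (4.4, 23.6, 35.2, 33.7, 3.0),
    "Greece": (9.6, 21.4, 34.0, 29.4, 5.6),
    "Italy": (6.3, 17.3, 29.8, 42.2, 4.4),
    "Netherlands": (0.3, 7.1, 19.0, 57.7, 15.9),
    "Poland": (15.9, 29.8, 27.3, 25.2, 1.8),
    "Spain": (8.2, 27.1, 24.0, 35.1, 5.6),
    "Sweden": (0.5, 8.7, 24.4, 51.7, 14.7),
}

# Sum of absolute differences in percentage points per country.
errors = {
    country: sum(abs(o - p) for o, p in zip(observed[country], predicted[country]))
    for country in observed
}
worst = max(errors, key=errors.get)

for country, err in sorted(errors.items(), key=lambda item: item[1], reverse=True):
    print(f"{country:12s} {err:5.1f}")
print("largest prediction error:", worst)
```

With this measure Greece has a total error of about 25 percentage points, against roughly 3.5 for Germany, in line with the statement that the prediction error for Greece stands out.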

We also estimated the ordered probit model; the estimated coefficients and corresponding standard errors are shown in Table 15. The significant coefficients have similar signs to the estimated β coefficients of the CHOPIT model, except for the country dummies. The most satisfied countries are Denmark and the Netherlands, while French respondents are the most dissatisfied, followed by the Greeks. We are therefore interested in whether household income satisfaction is heterogeneous across the 11 European countries due to both cultural and individual differences, or only due to individual differences. To this end we also estimate a restricted version of the CHOPIT model in which only the γ's of the country dummy variables are set to zero, and perform a likelihood ratio test. The LR test statistic is equal to 1419 on 40 degrees of freedom, hence we reject the restricted CHOPIT model.

Another important difference between the CHOPIT model and the ordered probit model is the clearly lower estimate for the household income variable in the ordered probit model. All the estimated γ coefficients for household income are positive; thus an increase in household income raises all the benchmarks, in such a way that the gaps between the benchmarks also become larger. Respondents with higher incomes therefore do not evaluate an increase in income, for instance an increase of 500 Euro, in the same way as someone with a lower income. For example, someone with a lower income might move from dissatisfied to neither satisfied nor dissatisfied, while for an individual with a higher income the increase is not really substantial and he remains satisfied. Hence not correcting for DIF in the ordered probit model results in too low an estimate for the coefficient of household income.
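The outcome of the likelihood ratio test can be put in perspective by comparing the statistic with a chi-square critical value. The sketch below assumes that the 40 restrictions are the 10 country dummies in each of the four threshold equations, and it approximates the critical value with the Wilson-Hilferty formula so that only the Python standard library is needed; both are assumptions made for this illustration, not steps taken from the thesis.

```python
from statistics import NormalDist


def chi2_critical_value(df: int, level: float = 0.05) -> float:
    """Approximate upper critical value of a chi-square distribution.

    Uses the Wilson-Hilferty cube-root approximation, which is accurate
    enough for moderate degrees of freedom such as 40.
    """
    z = NormalDist().inv_cdf(1.0 - level)
    a = 2.0 / (9.0 * df)
    return df * (1.0 - a + z * a ** 0.5) ** 3


N_COUNTRY_DUMMIES = 10  # 11 countries, one of which serves as reference (assumed)
N_THRESHOLDS = 4        # gamma^1, ..., gamma^4

degrees_of_freedom = N_COUNTRY_DUMMIES * N_THRESHOLDS
lr_statistic = 1419.0   # value reported in the text
critical = chi2_critical_value(degrees_of_freedom)

print("degrees of freedom:", degrees_of_freedom)
print("approximate 5% critical value:", round(critical, 2))
print("reject restricted model:", lr_statistic > critical)
```

The statistic exceeds the approximate 5 percent critical value by a very wide margin, so the rejection does not hinge on the accuracy of the approximation.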

6 Response consistency and vignette equivalence

The requirements of anchoring vignettes are response consistency and vignette equiv-alence according to King et al. (2004). Response consistency implies that individuals use the same scales for the self-report and vignette questions. The second require-ment, vignette equivalence, implies that there is no systematic difference in the interpretation of a vignette between respondents with different characteristics. In the following subsections we are going to test both requirements.

6.1 Response consistency

There are some papers on response consistency; however, an objective measure is needed to test it. King et al. (2004), for instance, use an objective measure of vision in a study on self-assessed vision. There is no actual objective measure of the household income satisfaction of individuals. However, if we assume that household income provides a reasonable measure of household income satisfaction, then we have such an objective measure. We are now able to test response consistency by ranking the countries on income level and on average estimated satisfaction level by means of the CHOPIT results. In one of the following sections, Counterfactuals, we notice that estimated satisfaction in Greece is ranked too high compared to the income level and in the Netherlands too low (or Sweden and Denmark are ranked too high). The two rankings of the other countries are reasonably similar. For more details see the section Counterfactuals. We now investigate if response consistency is violated in the data of Greece.

Of the individuals living in Greece, 27.1 percent indicate that they are as satisfied as Anne (second vignette) and 19.5 percent indicate that they are more satisfied. Hence almost 50 percent of the Greek respondents indicate to be more satisfied than or equally satisfied as someone earning around the third quartile of the income distribution of all eleven countries. This includes countries such as the Netherlands, Sweden and Germany, which have a much higher income level than Greece. Below we have plotted a histogram of household income (PPP-adjusted) of all Greek citizens indicating to be equally or more satisfied than Anne.


Figure 5: Greek citizens who are equally or more satisfied than Anne.

other countries. Thus we continue testing response consistency for the entire dataset. One method to verify the assumption that does not involve an objective measure is proposed by Angelini, Cavapozzi, Corazzini and Paccagnella. The method states that, if the assumption holds, respondents who match the description of one of the hypothetical persons in the vignettes should give a similar answer to the self-reported income satisfaction question as to the corresponding vignette question. Thus married men with two children and a household income close to 1500 Euro per month (the amount should be adjusted for PPP) should give the same answer to the self-reported household income satisfaction question and the first vignette question. The correlation between the self-reported satisfaction and the first vignette question is 0.02. When we select only married men with two children the correlation goes up to 0.08. Selecting only the individuals within a bandwidth of 100 Euro of the household income mentioned in the vignette question results in a correlation of 0.23. The second vignette shows similar results. If we assume the data on household income are correct, we would expect a correlation closer to 1.


histogram around 1500 Euro, etc.

Figure 6: Relative satisfaction levels

Hence the top 3 histograms are in correspondence with our expectations. This is not the case for the bottom 2 histograms: the sample mean of the 4th histogram is much lower than 3000 Euro and in the 5th histogram most household incomes are below 3000 Euro. Possible causes are a violation of response consistency or under-reporting of higher household incomes. The average household income of the Dutch subsample, for instance, is too low compared to the actual household income according to the Dutch Central Bureau of Statistics (CBS). Hence a plausible explanation could be that individuals understate their high household income, which is a common phenomenon. A violation of an assumption and incorrect data both lead to prediction errors, so the reliability of our results is doubtful.

6.2 Using the θl estimates for testing response consistency

We are going to test response consistency by means of the estimates of θl, we now


4 that the same scale is assumed for yi∗ and z∗il due to the restrictions on τij. Hence response consistency is assumed. The only differences between the two hypothetical persons are that the household income of Anne is twice as large and that she is a female. The predicted difference in household income satisfaction on the scale of yi∗ is

$$1.3103 \cdot \log(2) + 0.1784 \cdot 1_{\text{Female}} = 1.086631, \tag{20}$$

where 1.3103 and 0.1784 are the estimated coefficients for log income and gender, and log(income Anne) = log(2 ∗ income Jim) = log(2) + log(income Jim). Hence the difference in log income is log(2). If we assume that the individuals in the dataset only use the information which is given in the vignettes, then the large difference between 1.9234 and 1.086631 cannot be explained by prediction error. We have performed a Wald test and found a p-value of zero, so we reject the hypothesis that the difference is equal to zero. Hence we reject the assumption of response consistency.
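As a quick check on the arithmetic in equation (20), the following sketch recomputes the predicted satisfaction gap between Anne and Jim from the two reported coefficients and sets it against the value 1.9234 quoted in the text. The variable names are illustrative and are not taken from the original estimation code.

```python
import math

# Coefficients reported in the text (CHOPIT estimates).
beta_log_income = 1.3103
beta_female = 0.1784

# Anne earns twice Jim's income and is female; everything else is equal.
predicted_gap = beta_log_income * math.log(2) + beta_female * 1

# Gap between the two vignettes as quoted in the text.
estimated_gap = 1.9234

print(round(predicted_gap, 6))                  # predicted difference on the y* scale
print(round(estimated_gap - predicted_gap, 4))  # part not explained by income and gender
```

The natural logarithm is used here because it reproduces the value reported in equation (20).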

Another method would be to compute estimates for θl by means of the estimated β coefficients. Thus for the first vignette, of Jim, we know he is male, has a household income of approximately 1500 Euro, has 2 children and is married. For the other explanatory variables we take the average of the dataset: a household size of 2.185, an age of 64.28 years, 15 percent citizen of Germany, 10 percent citizen of Italy, etc. A Wald test shows the same result as the previous test; however, the previous method was a bit more precise. For instance, the ratio between the household incomes is exactly 2 and not approximately.

6.3 Vignette equivalence

One way to check if vignette equivalence is violated is to identify the global ordering of the vignettes and to compute, per country, the percentage of individuals who violate the ordering. The global percentage of violations is low, 2.6 percent. The highest percentage of violations is 3.3 percent (Poland), so the responses are consistent across countries. Hence this check gives no evidence of a systematic difference in the interpretation of the vignettes between respondents with different characteristics.


restricted model. We have grouped the countries in 4 categories:

Scandinavian: Sweden and Denmark
East Europe: Poland and Czechia
West Europe: Germany, Belgium, the Netherlands and France
Mediterranean: Greece, Spain and Italy

The LR test statistic is equal to 940.6 on 394 degrees of freedom. Thus we reject the restricted model; however, this condition is not necessary for vignette equivalence.
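To make the rejection decisions concrete, the sketch below compares the two LR statistics reported in this thesis (1419 on 40 degrees of freedom and 940.6 on 394 degrees of freedom) with approximate 5 percent critical values of the chi-square distribution. The critical values use the Wilson-Hilferty normal approximation, which is a substitute chosen here to keep the example dependency-free; it is not the procedure used for the original tests.

```python
from statistics import NormalDist

def chi2_critical(df, alpha=0.05):
    """Approximate upper-alpha critical value of a chi-square(df) distribution
    using the Wilson-Hilferty cube-root normal approximation."""
    z = NormalDist().inv_cdf(1 - alpha)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * c ** 0.5) ** 3

# LR statistics and degrees of freedom reported in the text.
tests = {
    "gamma of country dummies set to zero": (1419.0, 40),
    "pooled model versus country groups": (940.6, 394),
}

for name, (lr, df) in tests.items():
    crit = chi2_critical(df)
    print(f"{name}: LR = {lr}, approx. 5% critical value = {crit:.1f}, reject = {lr > crit}")
```

Both statistics lie far above their critical values, so the conclusion does not hinge on the accuracy of the approximation.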

A necessary condition is that there is no significant difference in the estimates of θl in the CHOPIT models of the groups or countries compared to the global CHOPIT model. The estimates of θl and corresponding standard errors are given in the table below.

               θ1       s.e.    θ2       s.e.
Scandinavian   13.1540  1.2779  15.3186  1.2936
East Europe     9.8741  1.1380  11.8017  1.1525
West Europe    13.5376  0.7201  15.7121  0.7358
Mediterranean   4.1409  0.8238   5.7891  0.8304

Table 6: Testing vignette equivalence

Angelini et al. (2010) only determine if the ordering of the estimates of θl is consistent. The table above shows there are no inconsistencies in the estimates. However we want to determine if, for instance, the relatively low estimates of θl in the Mediterranean countries are evidence of a violation of vignette equivalence. Thus we have to compare the estimates of θl for the grouped subsamples with the estimates of the entire sample. This is only possible for West Europe, because this group has the same benchmark as the global CHOPIT model. The benchmark we speak of is a German male, with no income, no children, etc. If we change the benchmark, for instance to Denmark, and leave everything else the same, then the German male with no income, no children, etc. is pushed in the negative direction of the yi∗ axis, since the estimation results show that Danish respondents report a significantly higher income satisfaction level than German respondents. Hence all scales (τij) are also pushed to the left on the yi∗ axis and as a consequence the values of θl are also smaller. The estimate of β for the Danish dummy variable is 0.8734, so if we change the benchmark country to Denmark we have to subtract 0.8734 from the original estimates of θl. Thus we can simply compute the new estimates of θl. The estimates are shown below.


Benchmark  θ1       s.e.    θ2       s.e.
Denmark     9.1697  0.4392  11.0931  0.4452
Czechia    10.1561  0.4209  12.0796  0.4278
Germany    10.0431  0.4444  11.9665  0.4509
Greece      9.1384  0.4306  11.0618  0.4366

Table 7: Estimates of θl with different benchmark countries
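The change of benchmark country is a simple shift of the θl estimates. The sketch below reproduces the Denmark row of Table 7 from the Germany row and the reported β of the Danish dummy (0.8734). The dictionary layout and names are illustrative only.

```python
# Estimates with Germany as benchmark country, as reported in Table 7.
theta_germany = {"theta1": 10.0431, "theta2": 11.9665}

# Estimated beta of the Danish country dummy, as reported in the text.
beta_denmark = 0.8734

# Changing the benchmark to Denmark shifts every theta by the same amount.
theta_denmark = {k: round(v - beta_denmark, 4) for k, v in theta_germany.items()}

# The distance between the two vignettes is unaffected by the shift.
gap_germany = theta_germany["theta2"] - theta_germany["theta1"]
gap_denmark = theta_denmark["theta2"] - theta_denmark["theta1"]

print(theta_denmark)
print(round(gap_germany, 4), round(gap_denmark, 4))
```

Because every θl moves by the same constant, the choice of benchmark country changes the level of the estimates but not the distance between the vignettes.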

between respondents with different characteristics. The countries seem to be simply too different from each other. We remark that, for instance, it might be possible that respondents in the Mediterranean countries do not interpret the vignettes differently, but that the effect of the explanatory variables is different (which affects the values of θl). As an example, in the Mediterranean CHOPIT model the estimated β for log income is 0.5535 (0.1019). However the conclusion remains the same: the countries are too different.

We explained that the above test does not only test vignette equivalence. Thus we want to keep all parameters fixed for different countries, but allow θlk to vary between countries for only one vignette. Therefore we estimate the generalized CHOPIT model, where θlk for l = 1 or l = 2 is allowed to take on different values


7 Design of the vignettes

Anchoring vignettes were developed by King et al. (2004) to eliminate problems caused by differential item functioning. In the early days of the methodology there were studies with 12 vignettes per survey question, a number which could lead to increased research expenses and is often unnecessary when correcting for DIF, according to King et al. (2006). It is even possible to correct for DIF with only one vignette; further on in this section we investigate if this is possible with our data on household income satisfaction.

One important aspect of the design of vignettes is using the same response categories as the survey question. The idea behind this is that a respondent assumes the hypothetical person is like himself or herself, but with different characteristics, which are described in the vignette.

We do not have 12 vignettes, but only 2. However this number lends itself perfectly to the following. If the usual assumptions hold we can create a new variable which is free of differential item functioning, according to King et al. (2006). The new variable has 5 ordered categories: less satisfied than Jim, equally satisfied as Jim, satisfaction between Jim and Anne, equally satisfied as Anne, and more satisfied than Anne. Hence we have a new household income satisfaction variable with 5 categories and without DIF. There are some ties and inconsistencies: individuals who indicated that Jim is equally satisfied or more satisfied compared to Anne. Inconsistencies might be caused by violations of response consistency; however, there are also other possible reasons.

Tied responses might be caused by respondents who do not perceive the difference between the vignettes. Inconsistencies in the answers might be caused by individuals who differ a lot from the hypothetical persons. We illustrate this with the following example: we are measuring height and an individual is able to correctly specify how tall he is, however he is not able to correctly rank two oak trees of 20 and 21 meter. This is not a major problem, because the individual is probably able to understand that he is shorter than both oak trees. We remark that this kind of inconsistency is a bit strange in our research, since the household income of Anne is twice as large as that of Jim.


$$C_i = \begin{cases} 1 & y_i < z_{i1} \\ 2 & y_i = z_{i1} \\ 3 & z_{i1} < y_i < z_{i2} \\ 4 & y_i = z_{i2} \\ 5 & z_{i2} < y_i, \end{cases}$$

where zi1 is the vignette variable corresponding to the hypothetical person Jim and zi2 corresponds to Anne. Unfortunately the above variable is not able to handle some ties and inconsistencies. In the table below we describe the 13 possible scenarios.

Scenario           Ci      Nr. obs.
yi < zi1 < zi2     1       661
yi = zi1 < zi2     2       1102
zi1 < yi < zi2     3       1062
zi1 < yi = zi2     4       1465
zi1 < zi2 < yi     5       606
yi < zi1 = zi2     1       294
yi = zi1 = zi2     {2, 4}  261
zi1 = zi2 < yi     5       181
yi < zi2 < zi1     1       18
yi = zi2 < zi1     {1, 4}  18
zi2 < yi < zi1     {1, 5}  14
zi2 < yi = zi1     {2, 5}  53
zi2 < zi1 < yi     5       59

Table 8: The 13 possible scenarios
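The coding in Table 8 can be written as a small function. The sketch below is one way to reproduce the Ci column of the table, assuming that the responses are coded numerically with larger values meaning more satisfied; the function name and the representative response values are hypothetical, and the code is not the implementation used for the original analysis.

```python
def c_values(y, z1, z2):
    """Set of categories C_i implied by the self-report y and the
    vignette ratings z1 (Jim) and z2 (Anne), following Table 8."""
    values = set()
    # Position relative to Jim's vignette.
    if y < z1:
        values.add(1)
    elif y == z1:
        values.add(2)
    # Position relative to Anne's vignette.
    if y == z2:
        values.add(4)
    elif y > z2:
        values.add(5)
    # Strictly above Jim and strictly below Anne: the middle category.
    if not values:
        values.add(3)
    return values

# One representative (y, z1, z2) triple per scenario, with the observed counts.
scenarios = [
    (1, 2, 3, 661), (2, 2, 3, 1102), (2, 1, 3, 1062), (3, 1, 3, 1465),
    (3, 1, 2, 606), (1, 2, 2, 294), (2, 2, 2, 261), (3, 2, 2, 181),
    (1, 3, 2, 18), (2, 3, 2, 18), (2, 3, 1, 14), (3, 3, 1, 53), (3, 2, 1, 59),
]

total = sum(n for *_, n in scenarios)
multi_valued = sum(n for y, z1, z2, n in scenarios if len(c_values(y, z1, z2)) > 1)
ties = sum(n for y, z1, z2, n in scenarios if z1 == z2)
inconsistent = sum(n for y, z1, z2, n in scenarios if z2 < z1)

print(total, multi_valued, ties, inconsistent)
print(round(100 * ties / total, 1), round(100 * inconsistent / total, 1))
```

Here a tie means that both vignettes receive the same rating and an inconsistency means that Anne is rated below Jim, which matches the counts discussed in the text.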

There are 4 scenarios in which Ci takes on 2 values, one caused by a tie and the other 3 caused by inconsistencies. The ninth and last scenario described in the table could be caused by individuals as described in the example of measuring height and the two oak trees. Fortunately the number of inconsistencies is not very large, only 18 + 18 + 14 + 53 + 59 = 162 observations or 2.8 percent of the data. The number of ties is a bit larger, 294 + 261 + 181 = 736 observations or 12.7 percent of the data. The low percentage of inconsistencies indicates that the setup of the research and the formulation of the survey and vignette questions are well chosen. However the 12.7 percent of observations which are tied could indicate that the hypothetical persons do not differ enough. Ties could also be caused by the low number of response categories, which is only 5.

If we now cluster all scenarios for which Ci = 1 and do the same for Ci = 5,


another result which indicates a good choice of the vignettes. If there were too many observations in one bin, correcting for DIF would be difficult. In the table below the percentage distribution of the responses is shown per country. For simplicity we have deleted the observations with multiple values for Ci. King et al. (2004) included them by uniformly distributing them over their corresponding range. For example, if we have an observation for which yi = zi1 = zi2, Ci is randomly assigned one of the values 2, 3 or 4 with equal probability. According to King et al. (2006) this is not a plausible approach.

Ci Germany Belgium Czechia Denmark France Greece Italy Netherlands Poland Spain Sweden

1 13.1 10.6 31.9 2.9 16.8 12.6 19.2 5.7 62.5 28.7 5.2

2 20.7 24.6 27.4 10.0 22.7 15.3 21.5 23.5 24.3 25.0 11.2

3 23.4 24.4 21.1 16.5 17.6 25.6 17.7 22.2 2.7 14.8 25.6

4 28.8 27.4 15.7 41.1 27.1 27.1 24.8 34.6 6.1 18.4 34.3

5 14.1 13.0 4.0 29.4 15.8 19.5 16.8 14.0 4.3 13.1 23.6

Table 9: Distribution of Ci per country

The results are in accordance with previous results. Poland has the most dissatisfied citizens followed by Spain and Czechia, and Denmark, Sweden and the Netherlands have the most satisfied citizens.

If there were no scenarios in which Ci takes on 2 values, we simply could estimate an ordered probit model described in section 4. However this is not the case and we also do not want to throw away 261 + 18 + 14 + 53 = 346 observations. Thus we substitute equation (2) of section 4 by the following equation:

$$C_i = j \quad \text{if} \quad \tau_i^{\min(j)-1} < y_i^* \le \tau_i^{\max(j)}, \qquad j = 1, \ldots, 5, \{2, 4\}, \{1, 4\}, \{1, 5\}, \{2, 5\}. \tag{21}$$
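Equation (21) turns a multiple valued Ci into an interval for the latent variable. The sketch below shows the resulting likelihood contribution for one observation, assuming a standard normal error, thresholds τ^0 = −∞ and τ^5 = +∞, and thresholds passed in as plain numbers (in the model they may depend on respondent characteristics). The function names, threshold values and index value are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal CDF, with the infinite endpoints handled explicitly."""
    if x == math.inf:
        return 1.0
    if x == -math.inf:
        return 0.0
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def category_prob(categories, mu, cutpoints):
    """P(tau^{min(j)-1} < y* <= tau^{max(j)}) for y* ~ N(mu, 1).

    `categories` is a set such as {3} or {2, 4}; `cutpoints` holds
    tau^1 < ... < tau^4, with tau^0 = -inf and tau^5 = +inf added here.
    """
    taus = [-math.inf] + list(cutpoints) + [math.inf]
    lower = taus[min(categories) - 1]
    upper = taus[max(categories)]
    return norm_cdf(upper - mu) - norm_cdf(lower - mu)

# Hypothetical thresholds and linear index, chosen only for illustration.
cutpoints = [-1.0, -0.3, 0.4, 1.2]
mu = 0.2

single = [category_prob({j}, mu, cutpoints) for j in range(1, 6)]
tie_2_4 = category_prob({2, 4}, mu, cutpoints)      # covers categories 2 to 4
incons_1_5 = category_prob({1, 5}, mu, cutpoints)   # covers the whole real line

print([round(p, 4) for p in single], round(tie_2_4, 4), incons_1_5)
```

Under this formulation an observation with Ci = {1, 5} has probability one for any parameter value, so it enters the likelihood without adding information.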

We are now able to estimate a (censored) ordered probit model, which allows us to include observations with multiple values for Ci. The estimation results are in Table 16. The estimated coefficients are not significantly different compared to the estimation results of the CHOPIT model, except for the variable log household income. In the table below the predicted levels of the (censored) ordered probit model are shown per country.

Germany Belgium Czechia Denmark France Greece Italy Netherlands Poland Spain Sweden

1 13.2 12.2 29.9 3.6 14.5 10.6 15.1 9.5 54.9 22.5 5.1

2 21.9 22.8 30.5 11.6 22.6 19.7 23.4 19.7 26.5 26.9 14.9

3 22.0 23.2 20.3 17.8 21.9 21.4 22.0 22.0 11.7 21.4 20.5

4 28.6 29.0 15.9 35.8 27.7 30.5 27.0 31.3 6.2 21.6 35.5

5 14.3 12.8 3.5 31.2 13.4 17.7 12.6 17.4 0.8 7.6 24.0

Table 10: Predicted levels of the (censored) ordered probit model

The actual levels of Ci are not too different compared to the predicted distributions


does not add much information for predicting the distributions of income satisfaction.

We are interested in a measure which is able to assess the strength of the chosen vignettes. The measure was first described by King et al. (2006) and is intended to choose the best set of vignettes. The methodology behind this is to start with a small sample of individuals who fill in a survey with many vignette questions. Then an optimal set of vignettes, which conveys the most information about the topic of the survey, is chosen and distributed to all the individuals in the dataset. Thus the amount of data which has to be processed by researchers is decreased, which leads to a decrease in research expenses. Unfortunately we only have two vignettes, so the choices are limited.

If we would have a vignette with a hypothetical person having an after-tax household income of one million per month, most individuals would indicate to be less satisfied than our millionaire. Hence the vignette is not able to discriminate among the individuals in the dataset and the amount of information gained by asking the vignette question is zero. A better choice of vignettes would be such that individuals are evenly divided over the categories of the variable Ci. Hence

$$P(C_i = 1) = \cdots = P(C_i = 2J + 1) = \frac{1}{2J + 1}, \tag{22}$$

where J is the number of vignettes. The worst choice of vignettes is of course when all individuals fall in only one category of Ci. Thus we need a measure which assigns the smallest possible value to the latter choice and the highest possible value to the situation depicted in equation (22). Another criterion for the measure is that all categories of Ci are equally important in the computation, to overcome the difficulty of weighting the importance of the different categories. Another condition is that the measure, which is a function of the number of vignettes, should be monotonically increasing. Thus more categories should lead to more discrimination among the individuals, if there are no ties or inconsistencies of course. Furthermore, the decomposition caused by adding more vignettes should not lead to more information in the union of the categories which formed one category before the new vignettes were added. The information in categories which are not decomposed should stay the same. Thus if we start with only the vignette of Jim, we have three categories 1, 2 and 3∗. We now add the vignette of Anne, so 3∗ is decomposed into three categories. The names of all the categories are 1, 2, 3, 4 and 5. Thus in mathematical terms the condition implies

$$P(C_i^* = k) = P(C_i = k), \qquad k = 1, 2, \tag{23}$$

$$P(C_i^* = 3^*) = P(C_i = 3) + P(C_i = 4) + P(C_i = 5), \tag{24}$$

where Ci∗ is the variable created by the nonparametric approach with only the vignette of Jim. The last condition is that the measure should give identical results whether it is computed over all the categories at once or as a weighted sum of the measure over grouped categories. We clarify this condition by means of our example in which we include the vignette of Anne. Let H(·) be the measure function and define pk = P(Ci = k) and p3∗ = p3 + p4 + p5; then the condition implies

$$H(p_1, p_2, p_3, p_4, p_5) = H(p_1, p_2, p_{3^*}) + p_{3^*}\, H\!\left(\frac{p_3}{p_{3^*}}, \frac{p_4}{p_{3^*}}, \frac{p_5}{p_{3^*}}\right). \tag{25}$$

The first condition of a monotonically increasing function and the last condition imply H(·) ≥ 0 for all possible values. According to Shannon (1948) there is only one function which satisfies all the above conditions, which is known as entropy. The measure is given by

$$H(p_1, \ldots, p_{2J+1}) = -\sum_{k=1}^{2J+1} p_k \ln(p_k), \tag{26}$$

where we define −0 ln(0) = 0. The proof of the above conditions is trivial and therefore omitted. We are only left with one problem: when Ci has multiple values. The easiest way to overcome this problem is by estimating the censored ordered probit model and using the predicted distribution to calculate entropy.
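The properties required of the measure can be checked numerically. The sketch below implements the entropy of equation (26), confirms that a uniform distribution over 2J + 1 = 5 categories attains the maximum ln(5), and verifies the grouping condition for an arbitrary distribution, with the probabilities inside the grouped category 3∗ rescaled so that they sum to one. The probability vector is hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy with natural logarithm; terms with p = 0 contribute 0."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Maximum entropy for J = 2 vignettes: uniform over 2J + 1 categories.
J = 2
uniform = [1 / (2 * J + 1)] * (2 * J + 1)
h_max = entropy(uniform)

# Worst case: all individuals in one category.
h_worst = entropy([1.0, 0.0, 0.0, 0.0, 0.0])

# Grouping condition with categories 3, 4, 5 merged into 3*.
p = [0.10, 0.20, 0.30, 0.25, 0.15]   # hypothetical distribution of C_i
p3_star = sum(p[2:])
lhs = entropy(p)
rhs = entropy([p[0], p[1], p3_star]) + p3_star * entropy([q / p3_star for q in p[2:]])

print(round(h_max, 4), h_worst, round(lhs, 6), round(rhs, 6))
```

The two sides of the grouping condition agree up to floating point error, which is the additivity property that singles out entropy.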

However this method of estimating entropy assumes that our model, the censored ordered probit model, is correctly specified. Thus the estimated entropy also measures if the model is correctly specified. We are of course interested in a measure of how informative the vignettes are and nothing more. Thus we should not make any extra assumptions for calculating entropy.

We know for scalar valued observations of Ci the correct category. In the case of an individual who gives the tied response yi = zi1 = zi2, we do not know for sure if the individual is equally satisfied as Jim and less satisfied than Anne, more satisfied than Jim but less satisfied than Anne, or more satisfied than Jim and equally satisfied as Anne. Hence we do not know if Ci should be equal to 2, 3 or 4, however we know for sure that the values 1 and 5 are not possible. The same holds true for other cases of ties or inconsistencies. Thus in the case of zi2 < zi1 < yi it is even possible that


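$$\begin{aligned}
p_1 &= \tfrac{973}{5794} + q_1 \\
p_2 &= \tfrac{1102}{5794} + q_2 \\
p_3 &= \tfrac{1062}{5794} + q_3 \\
p_4 &= \tfrac{1465}{5794} + q_4 \\
p_5 &= \tfrac{846}{5794} + q_5,
\end{aligned}$$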

Table 11: The known entropy

where 5794 is the total number of observations and the values in the numerators are the numbers of observations for which we know the correct category for sure, due to a corresponding scalar valued Ci. The probabilities q1, ..., q5 correspond to the ties and inconsistencies. Their values depend on the categories in which the tied and inconsistent observations fall. There are 261 observations for which yi = zi1 = zi2 holds true, so those observations can only be allocated to category 2, 3 or 4. The same holds true for other possible ties or inconsistencies. A possible solution would be to spread them over the possible categories; however, this implies making additional assumptions. The only solution that avoids such assumptions is to allocate the tied and inconsistent observations such that entropy is minimized. Thus the minimum entropy is the amount of information we know for sure exists.

The solution of the minimization problem is not that difficult in our case: we have to maximize the unevenness in the allocation of the observations over the categories. There are already 1465 observations in category 4 and for every tied or inconsistent situation category 4 is a possible category. Hence all tied and inconsistent observations are allocated to category 4, so q4 = 346/5794 and q1 = q2 = q3 = q5 = 0.
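The minimum entropy described above can be computed directly from the counts in Table 11. The sketch below allocates the 346 multiple valued observations to category 4 and, as a sanity check, compares this with the four alternatives in which they are all placed in one of the other categories. This is an illustration with hypothetical names; it only compares these five simple allocations and is not a search over every admissible allocation.

```python
import math

def entropy(counts):
    """Entropy (natural log) of the distribution implied by a list of counts."""
    total = sum(counts)
    return -sum(c / total * math.log(c / total) for c in counts if c > 0)

# Observations whose category is known for sure (numerators in Table 11).
known = {1: 973, 2: 1102, 3: 1062, 4: 1465, 5: 846}
# Tied or inconsistent observations with a multiple valued C_i.
ambiguous = 346

def allocate_all_to(category):
    """Counts per category when all ambiguous observations go to one category."""
    counts = dict(known)
    counts[category] += ambiguous
    return list(counts.values())

h_by_category = {k: entropy(allocate_all_to(k)) for k in known}
h_min = h_by_category[4]
h_max = math.log(len(known))

print({k: round(h, 4) for k, h in h_by_category.items()})
print(round(h_min, 4), round(h_max, 4))
```

Placing the ambiguous observations in the category that is already the largest gives the most uneven distribution and therefore the lowest entropy among these alternatives.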

In the figure below we have plotted minimum entropy against estimated entropy for the three possible cases: including only the vignette of Jim, only that of Anne, or both. There is no difference between minimum and estimated entropy when only one vignette is included, since there are then no ties or inconsistencies. The vignette of Anne is more informative than the vignette of Jim. This is as expected, since significantly more individuals indicate to be equally satisfied as Anne compared to the number of individuals who indicate to be equally satisfied as Jim. We also notice that the difference between estimated and minimum entropy is not large when both vignettes are included. This is also as expected, since the number of ties and inconsistencies leading to multiple valued Ci’s is not large.


if only the vignette of Anne is included there are 3137 individuals in the category “less satisfied than Anne” (not including the ties and inconsistencies). When we now include the vignette of Jim it is broken down in categories of 973, 1102 and 1062 individuals. The maximum entropy possible is 1.61, hence the amount of information in the two vignettes is almost at the maximum. Or in other words, the two vignettes provide almost the highest level of discriminatory power between the individuals in the dataset.

Figure 7: Estimated against known entropy

The only way to increase the amount of information is to increase the number of vignettes. However increasing the number of vignettes can also increase the number of ties and inconsistencies. Thus the amount of information may even decrease if the number of vignettes is increased. Another problem, if we want to include an extra vignette, is the current number of individuals per category. The spread of the observations over the 5 categories is already close to uniform, so adding one extra vignette will decrease the uniformity. Hence creating three new vignettes would be a better idea. We remark that the above investigation only considers the amount of information for the entire sample, due to the topic of our research. If one would only consider Poland, one might discard the vignette of Anne. We conclude that both vignettes add a significant amount of information and that both should be included in our research.

7.1 Estimation results including only one vignette


8 Counterfactuals

We have estimated a CHOPIT model to correct for DIF; however, in the section with the estimation results we only showed predicted distributions of household income satisfaction which are not corrected for DIF. Therefore we now introduce the concept of counterfactuals to show predicted distributions which are corrected. A counterfactual distribution replaces the country-specific benchmarks of all observations by the set of benchmarks belonging to one specific country. In other words, we assume, only for the γ parameters, that all individuals live in the same country, while other characteristics do not alter. The counterfactual distribution using the thresholds of the Netherlands is shown below.

Answer Germany Belgium Czechia Denmark France Greece Italy Netherlands Poland Spain Sweden

Very dissatisfied 1.01 0.66 3.19 0.04 1.36 0.34 0.94 0.32 13.84 1.99 0.08

Dissatisfied 11.54 10.16 23.94 1.90 12.91 4.12 9.22 7.07 37.29 13.25 3.20

Neither 23.72 23.13 32.60 8.65 25.52 13.16 21.67 18.99 28.55 25.00 12.71

Satisfied 53.33 55.79 36.94 56.48 51.75 55.30 54.96 57.71 19.39 50.96 60.13

Very satisfied 10.41 10.25 3.33 32.93 8.47 27.08 13.21 15.91 0.93 8.81 23.89

Table 12: Counterfactual distribution with benchmarks of the Netherlands.

The predicted percentages for the Netherlands do not alter, hence they are identical to Table 6. Except for Poland the predicted number of satisfied individuals increases in all countries, and in some countries by a large amount. Hence individuals living in the Netherlands are very likely to report that they are satisfied with their household income. The counterfactual distribution shows that France is out of the top 3 of dissatisfied countries; Polish residents are most dissatisfied, followed by Czech residents. This is in accordance with rankings of countries by gross domestic product at purchasing power parity per capita (for instance the rankings of the International Monetary Fund and the World Bank), in which Poland has the lowest gross domestic product at PPP per capita, followed by Czechia. In the table below we show the ranking of household income satisfaction for the different countries using the counterfactual distribution and the distribution of Ci, and two rankings of gross domestic product at purchasing power parity per capita.

Country      IMF  World Bank  Counterfactual  Ci
Germany      5    4           5               5
Belgium      4    5           6               6
Czechia      10   10          10              10
Denmark      3    2           1               1
France       6    6           7               7
Greece       9    9           3               4
Italy        8    8           8               8
Netherlands  1    1           4               3
Poland       11   11          11              11
Spain        7    7           9               9
Sweden       2    3           2               2
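One way to summarize how close two of these rankings are is a rank correlation. The sketch below computes Spearman's rank correlation between the IMF ranking and the counterfactual ranking in the table above and identifies the country with the largest rank difference. This summary statistic is an illustration added here and is not part of the original analysis.

```python
# Rankings copied from the table above (1 = highest).
imf = {"Germany": 5, "Belgium": 4, "Czechia": 10, "Denmark": 3, "France": 6,
       "Greece": 9, "Italy": 8, "Netherlands": 1, "Poland": 11, "Spain": 7,
       "Sweden": 2}
counterfactual = {"Germany": 5, "Belgium": 6, "Czechia": 10, "Denmark": 1,
                  "France": 7, "Greece": 3, "Italy": 8, "Netherlands": 4,
                  "Poland": 11, "Spain": 9, "Sweden": 2}

def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two rankings without ties."""
    n = len(rank_a)
    d2 = sum((rank_a[c] - rank_b[c]) ** 2 for c in rank_a)
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

rho = spearman_rho(imf, counterfactual)
largest_gap = max(imf, key=lambda c: abs(imf[c] - counterfactual[c]))

print(round(rho, 4), largest_gap)
```

The country with the largest rank difference is the one singled out in the discussion of response consistency.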


The ranking of Ci does not contain ties or inconsistencies which cause multiple values for Ci. We notice that for most countries their ranking for income satisfaction is close


9 Conclusion

We have analyzed household income satisfaction among the elderly in Europe and tried to explain differences among countries by means of differential item functioning. We find evidence for DIF, especially in the case of France, which moves from a very dissatisfied to an averagely satisfied country when we correct for DIF. We also find that the most satisfied citizens live in Denmark and the most dissatisfied citizens live in Poland. Overall, we find enough evidence of DIF to reject the more restricted ordered probit model in favor of the CHOPIT model.

We also investigated which explanatory variables are major determinants of household income satisfaction. Not surprisingly, after-tax household income is the most important variable in explaining after-tax household income satisfaction. There is no significant difference between individuals who are retired and those who are still working. Retired individuals earn less than working individuals, but the difference in the satisfaction level is already captured by the after-tax household income variable. We also added variables on living area, space and whether individuals rented or owned their home. These variables were partly significant in explaining after-tax household income satisfaction.

We reject response consistency and vignette equivalence by means of various new techniques. The more generalized CHOPIT model is a bit complicated to estimate, but provides a method which can be used to test vignette equivalence in every study concerning anchoring vignettes and the CHOPIT model. The test for response consistency based on the difference between the θl estimates may be more problematic to apply in other studies. However we have provided statistical tests for both assumptions, which is more straightforward than intuitively deciding if the assumptions hold. The latter is done by for example Angelini et al. (2010) and in various other papers.

We conclude that both vignettes contribute to showing the differential item functioning among countries. The discriminatory power of the vignettes is high, so the descriptions of the hypothetical persons are well chosen. The measure of discriminatory power, entropy, is also a good tool for assessing the usefulness of vignettes. We also showed that an estimated CHOPIT model with one vignette gives quite good estimates, but the model with both vignettes is preferred.

In the section on counterfactuals, we showed that correcting for DIF brings the subjective and objective measures closer to each other. When correcting for DIF, France moves to a more normal level of satisfaction. However Greece is now one of the most satisfied countries, but we already explained when testing the assumptions that Greece is too different to be pooled with the other countries.


10 Bibliography

1. Allison, P. D., (2001), “Missing Data”, CA: Sage Publications

2. Angelini, V., Cavapozzi, D., Corazzini, L., and Paccagnella, O., (2010), “Do Danes and Italians Rate Life Satisfaction in the Same Way? Using Vignettes to Correct for Individual-specific Scale Biases”, University of Padua

3. Bonsang, E., and Soest, A. van, (2010), “Satisfaction with Job and Income among Older Individuals across European Countries”, Tilburg University

4. Holland, P., and Howard, W., (1993), “Differential Item Functioning”, Hillsdale, NJ:Erlbaum

5. Kalugina, E., Radtchenko, N., Sofer, E., (2005), “Using Self Reported Income in a Collective Model: Within Household Income Comparisons”, University of Paris

6. Kapteyn, A., Smith, J.P., and Soest, A. van, (2007), “Vignettes and Self-Reports of Work Disability in the U.S. and the Netherlands”, American Economic Review, 97: 461-473.

7. Kapteyn, A., Smith, J.P., and Soest, A. van, (2008), “Are Americans really less happy with their incomes?”, RAND Labor and Population working paper WP-591

8. King, G., Christopher J.L.M., Joshua A.S., and Tandon, A., (2004), “Enhancing the Validity and Cross-cultural Comparability of Measurement in Survey Research.”, American Political Science Review 98: 191-207

9. King, G., Murray, C., Salomon, J., and Tandon, A., (2004), “Enhancing the validity and crosscultural comparability of measurement in survey research”, American Political Science review, 98(1), 567-583

10. King, G., Wand, J., (2006), “Comparing Incomparable Survey Responses: Evaluating and Selecting Anchoring Vignettes.”, Political Analysis Advance Access

11. Kristensen, N., and Johansson, E., (2008). “New Evidence on Cross-Country Differences in Job Satisfaction Using Anchoring Vignettes”, Labour Economics, 15: 96-117

12. Little, R.J.A., and Rubin, D.B., (1987), “Statistical analysis with missing data”, New York, Wiley


14. Shannon, C., (1948), “A mathematical theory of communication”, Bell System Tech. J. 27, 379-423

15. Wand, J., King, G., and Lau, O., (2007), “Anchoring Vignettes in R: A (Different kind of) Vignette”, Stanford University


11 Appendix

The Appendix contains summary statistics and the most important model results. It also briefly explains the solution we used for the problem with the household income variable before we understood the nature of the outliers.

11.1 Household income variable

In section 3 we already explained how we constructed the after-tax household income variable. The method we used is unusual because the problem with the after-tax household income variable is unusual. Bonsang and van Soest (2010) probably did not notice the cause of the strange behavior of the income variable, since they treated the large household incomes as outliers.

Their method started with removing the observations in the first and last percentile of the household income distribution per country. They then regressed after-tax household income on country dummies, age, gender, education, log household size, labour-force status dummies, number of chronic diseases and number of symptoms. They used histograms of the residuals of this regression model per country to determine which observations are outliers. Next they estimated another regression model that excluded the outliers and added another measure of household income to the explanatory variables. Finally they replaced the outliers and missing household incomes with the predicted values of the second regression model.

The method cannot predict after-tax household income if, for instance, the other measure of household income is also missing. Thus one must assume that those observations are missing at random; otherwise the model outcomes might be biased. If the explanatory variables, including the other measure, are not missing, the prediction is indeed conditional on the information we have on the observation. However, that information was already in the data, and the added observations are perfectly predictable from the other explanatory variables. Thus extra observations are added without error variance, so the estimated standard errors will be too low.
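The last point can be made concrete in a simplified setting. The derivation below is only an illustration under assumptions that are not part of the thesis: the imputed variable y is treated as the dependent variable of an ordinary least squares regression with n complete cases and k regressors, and m observations are added whose values are set equal to their predictions. The symbols X_m, b and s^2 are introduced here for the illustration.

```latex
% Simplified illustration (assumption: y is the imputed variable and also the
% dependent variable of an OLS regression; X_m holds the regressors of the
% m imputed observations, whose values are set to \hat{y}_j = x_j' b).
\begin{align*}
b &= (X'X)^{-1}X'y, \qquad
s^2 = \frac{\sum_{i=1}^{n}(y_i - x_i'b)^2}{n-k} \\
\tilde{b} &= b
  \quad \text{(the added points lie exactly on the fitted plane)} \\
\tilde{s}^2 &= \frac{\sum_{i=1}^{n}(y_i - x_i'b)^2 + \sum_{j=1}^{m} 0^2}{n+m-k}
  = \frac{n-k}{n+m-k}\, s^2 < s^2 \\
\widehat{\operatorname{Var}}(\tilde{b}) &= \tilde{s}^2\,(X'X + X_m'X_m)^{-1}
  \preceq s^2\,(X'X)^{-1} = \widehat{\operatorname{Var}}(b)
\end{align*}
```

The coefficient estimate is unchanged, but both the estimated residual variance and the inverse moment matrix shrink, so the reported standard errors fall although no new information has been added.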

Therefore the above method, which is called regression imputation, is often replaced by multiple imputation (Rubin et al. 1987). There are several methods for implementing multiple imputation; we mention only one. We start off with regression imputation, but now we add a random component to the predicted values. The random components can be drawn randomly from the residual distribution of the regression model. Then the model is estimated. This procedure is repeated several times and finally an average of the estimates is taken.
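The steps of this procedure can be sketched in a few lines of Python. This is a minimal sketch, not the code used for the thesis: it assumes a single fully observed covariate in the imputation regression and a simple linear model of interest, and the names `fit_line` and `multiply_impute` are hypothetical helpers introduced for the illustration.

```python
import random
from statistics import fmean


def fit_line(x, y):
    """Least-squares intercept and slope of y on a single regressor x."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def multiply_impute(income, covariate, outcome, n_imputations=20, seed=0):
    """Average the slope of outcome on income over several imputations.

    income uses None for missing values; covariate and outcome are complete.
    (Hypothetical toy setup: one covariate, linear model of interest.)
    """
    rng = random.Random(seed)
    seen = [i for i, v in enumerate(income) if v is not None]

    # Imputation regression of income on the covariate, complete cases only.
    a, b = fit_line([covariate[i] for i in seen], [income[i] for i in seen])
    residuals = [income[i] - (a + b * covariate[i]) for i in seen]

    slopes = []
    for _ in range(n_imputations):
        # Predicted value plus a residual drawn with replacement.
        filled = [
            v if v is not None else a + b * covariate[i] + rng.choice(residuals)
            for i, v in enumerate(income)
        ]
        # Model of interest, estimated on the completed data.
        slopes.append(fit_line(filled, outcome)[1])

    # Final estimate: the average over the imputations.
    return fmean(slopes), slopes
```

The spread of `slopes` across imputations reflects the imputation uncertainty that a single deterministic regression imputation hides.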


a normal distribution with a standard deviation equal to the standard error of the regression model, we generated the random component. However, the assumption of a homoskedastic error term might be too strong.


| Variable | Belgium | Czechia | Denmark | France | Germany | Greece | Italy | Netherlands | Poland | Spain | Sweden |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Gender (Woman in %) | 54.9 | 59.7 | 55.5 | 56.7 | 53.5 | 55.6 | 54.3 | 52.5 | 56.7 | 54.8 | 54.8 |
| Age | 64.8 | 64.1 | 63.9 | 64.2 | 64.5 | 63.8 | 64.6 | 61.3 | 62.4 | 63.3 | 66.0 |
| Household size | 2.1 | 2.2 | 2.0 | 2.1 | 2.1 | 2.4 | 2.5 | 2.2 | 2.9 | 2.7 | 2.0 |
| Number of children | 2.1 | 2.0 | 2.3 | 2.2 | 2.0 | 1.7 | 2.0 | 2.3 | 2.5 | 2.4 | 2.3 |
| Years of education | 11.7 | 11.4 | 13.2 | 11.9 | 12.5 | 8.9 | 8.3 | 11.4 | 9.3 | 7.9 | 11.2 |
| Number of chronic diseases | 1.6 | 1.8 | 1.7 | 1.4 | 1.4 | 1.3 | 1.8 | 1.1 | 2.1 | 1.6 | 1.5 |
| Number of symptoms | 1.8 | 2.1 | 1.5 | 1.9 | 1.6 | 1.4 | 1.8 | 1.1 | 2.6 | 1.6 | 1.7 |
| Marital status (%) | | | | | | | | | | | |
| Single | 24.4 | 29.8 | 18.4 | 29.2 | 19.5 | 28.5 | 16.9 | 15.4 | 23.4 | 18.2 | 23.1 |
| Partner | 4.9 | 7.2 | 5.9 | 2.2 | 4.5 | 0.8 | 1.7 | 4.8 | 1.5 | 2.9 | 9.2 |
| Married | 70.7 | 63.0 | 75.6 | 68.5 | 76.0 | 70.7 | 81.4 | 79.8 | 75.1 | 78.9 | 67.6 |
| Labourforce status (%) | | | | | | | | | | | |
| Disabled | 4.2 | 1.9 | 5.5 | 2.8 | 2.7 | 1.3 | 1.7 | 5.6 | 13.7 | 4.2 | 1.6 |
| Inactive | 17.9 | 0.7 | 1.3 | 10.1 | 10.0 | 26.1 | 25.6 | 18.6 | 6.9 | 32.0 | 0.9 |
| Retired | 49.7 | 66.1 | 46.6 | 54.2 | 52.6 | 37.7 | 52.3 | 34.1 | 54.8 | 32.9 | 58.9 |
| Unemployed | 5.4 | 2.9 | 2.4 | 3.1 | 6.8 | 0.8 | 1.8 | 0.4 | 5.3 | 3.3 | 1.8 |
| Working | 22.8 | 28.3 | 44.1 | 29.8 | 27.8 | 34.1 | 18.6 | 41.3 | 19.2 | 27.6 | 36.9 |
| Living area (%) | | | | | | | | | | | |
| Big city | 9.4 | 12.6 | 8.4 | 16.0 | 15.8 | 40.6 | 10.6 | 9.0 | 15.0 | 27.2 | 11.0 |
| Large town | 17.4 | 25.0 | 25.9 | 8.4 | 8.4 | 23.4 | 11.0 | 18.8 | 26.3 | 19.7 | 36.4 |
| Rural area or village | 23.6 | 33.6 | 24.6 | 34.0 | 34.5 | 12.1 | 39.4 | 20.4 | 42.2 | 4.6 | 17.3 |
| Small town | 37.3 | 14.4 | 25.9 | 24.2 | 26.1 | 1.9 | 30.8 | 13.6 | 11.2 | 37.7 | 19.8 |
| Suburbs | 12.2 | 14.4 | 15.2 | 17.4 | 15.2 | 22.0 | 8.3 | 38.1 | 5.3 | 10.7 | 15.5 |
| Type of home (%) | | | | | | | | | | | |
| Flat | 11.4 | 51.4 | 14.0 | 24.4 | 34.1 | 48.9 | 43.4 | 15.8 | 48.4 | 56.4 | 30.3 |
| Free standing house | 57.1 | 39.7 | 74.6 | 53.7 | 47.3 | 46.6 | 43.3 | 20.2 | 49.7 | 28.9 | 60.4 |
| Elderly home | 0.6 | 0.0 | 0.7 | 0.3 | 0.1 | 0.0 | 0.2 | 2.2 | 0.0 | 0.0 | 0.2 |
| Linked house | 30.9 | 8.9 | 10.7 | 21.6 | 18.5 | 4.6 | 13.1 | 61.7 | 1.8 | 14.7 | 9.0 |
| Homeowner (%) | | | | | | | | | | | |
| Tenant | 16.0 | 18.0 | 19.7 | 16.9 | 35.3 | 8.8 | 10.9 | 28.1 | 18.1 | 4.6 | 23.4 |
| Member of cooperative | 0.0 | 9.9 | 3.9 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 7.3 | 0.0 | 12.1 |
| Owner | 81.5 | 64.8 | 76.2 | 78.1 | 59.2 | 85.8 | 83.4 | 71.1 | 64.5 | 88.6 | 64.3 |
| Rent free | 2.5 | 7.3 | 0.2 | 5.1 | 5.6 | 5.4 | 5.7 | 0.8 | 10.1 | 6.8 | 0.2 |


| Variable | β | s.e. | γ1 | s.e. | γ2 | s.e. | γ3 | s.e. | γ4 | s.e. |
|---|---|---|---|---|---|---|---|---|---|---|
| Intercept | 0 | - | 7.3276 | 0.5072 | -0.1893 | 0.4156 | 0.4751 | 0.2895 | 0.4399 | 0.3999 |
| Log income | 1.3103 | 0.0533 | 0.2353 | 0.0528 | 0.1644 | 0.0505 | 0.0214 | 0.0347 | 0.1478 | 0.0472 |
| Household size | -0.1479 | 0.0239 | -0.0425 | 0.0253 | -0.0067 | 0.0231 | -0.0073 | 0.0158 | -0.0018 | 0.0217 |
| Gender | 0.1784 | 0.0407 | 0.0185 | 0.0426 | 0.0906 | 0.0412 | 0.0070 | 0.0277 | -0.0245 | 0.0368 |
| Nr. children | -0.0226 | 0.0147 | -0.0028 | 0.0153 | -0.0124 | 0.0145 | 0.0083 | 0.0099 | -0.0074 | 0.0134 |
| Age | 0.0210 | 0.0029 | -0.0034 | 0.0029 | 0.0001 | 0.0028 | 0.0028 | 0.0019 | 0.0052 | 0.0026 |
| Years of educ. | 0.0265 | 0.0060 | -0.0030 | 0.0061 | 0.0202 | 0.0059 | 0.0029 | 0.0041 | -0.0022 | 0.0053 |
| Nr. chronic dis. | -0.0636 | 0.0154 | -0.0170 | 0.0159 | -0.0022 | 0.0149 | -0.0050 | 0.0102 | 0.0182 | 0.0142 |
| Nr. symptoms | -0.0506 | 0.0123 | 0.0397 | 0.0126 | 0.0060 | 0.0117 | -0.0129 | 0.0082 | -0.0217 | 0.0114 |
| Country | | | | | | | | | | |
| Germany | - | - | - | - | - | - | - | - | - | - |
| Belgium | 0.0920 | 0.0737 | 0.1715 | 0.0784 | 0.0804 | 0.0755 | 0.0039 | 0.0482 | -0.2292 | 0.0671 |
| Czechia | -0.1131 | 0.0814 | -0.1240 | 0.0895 | 0.0037 | 0.0832 | 0.0882 | 0.0542 | -0.1275 | 0.0742 |
| Denmark | 0.8734 | 0.0742 | 0.1871 | 0.0779 | 0.0866 | 0.0774 | 0.0428 | 0.0498 | 0.0015 | 0.0651 |
| France | -0.1854 | 0.0959 | 0.1629 | 0.0995 | 0.2330 | 0.0976 | 0.1786 | 0.0656 | -0.0926 | 0.0992 |
| Greece | 0.9047 | 0.1069 | 1.2625 | 0.0965 | -0.0276 | 0.0907 | 0.1592 | 0.0737 | -0.4035 | 0.1005 |
| Italy | 0.3218 | 0.0833 | 0.4911 | 0.0840 | -0.0343 | 0.0796 | 0.0702 | 0.0550 | 0.0424 | 0.0799 |
| Netherlands | 0.1782 | 0.0933 | -0.4917 | 0.1216 | 0.3107 | 0.1200 | 0.0743 | 0.0652 | 0.0335 | 0.0821 |
| Poland | -0.3170 | 0.1006 | -0.3858 | 0.1103 | 0.0493 | 0.0974 | -0.0111 | 0.0642 | 0.0120 | 0.0906 |
| Spain | 0.2456 | 0.1099 | 0.3086 | 0.1123 | 0.2545 | 0.1059 | -0.1291 | 0.0677 | -0.2549 | 0.0980 |
| Sweden | 0.5442 | 0.0902 | 0.1908 | 0.0929 | 0.1966 | 0.0935 | 0.1327 | 0.0625 | -0.2351 | 0.0790 |
| Labourforce stat. | | | | | | | | | | |
| Working | - | - | - | - | - | - | - | - | - | - |
| Disabled | 0.0474 | 0.0986 | 0.0553 | 0.1000 | 0.0601 | 0.0946 | -0.0570 | 0.0628 | -0.1474 | 0.0889 |
| Inactive | -0.0574 | 0.0771 | -0.0611 | 0.0791 | -0.1118 | 0.0751 | -0.0107 | 0.0509 | -0.1342 | 0.0692 |
| Retired | 0.0038 | 0.0596 | -0.0311 | 0.0610 | -0.0065 | 0.0591 | 0.0234 | 0.0405 | -0.0445 | 0.0542 |
| Unemployed | -0.2095 | 0.1081 | 0.0865 | 0.1063 | -0.0844 | 0.0952 | 0.0483 | 0.0715 | -0.0948 | 0.1009 |
| Marital status | | | | | | | | | | |
| Single | - | - | - | - | - | - | - | - | - | - |
| Partner | -0.3746 | 0.1016 | -0.0158 | 0.1047 | -0.0164 | 0.1014 | -0.0597 | 0.0671 | -0.0095 | 0.0932 |
| Spouse | -0.2426 | 0.0572 | -0.0674 | 0.0597 | -0.0218 | 0.0564 | -0.0371 | 0.0383 | -0.0200 | 0.0523 |
| Living area | | | | | | | | | | |
| City | - | - | - | - | - | - | - | - | - | - |
| Large town | 0.1540 | 0.0666 | -0.0061 | 0.0673 | -0.0264 | 0.0644 | -0.0536 | 0.0441 | 0.0849 | 0.0597 |
| Village | 0.0738 | 0.0705 | -0.0347 | 0.0709 | 0.0122 | 0.0675 | 0.0551 | 0.0476 | -0.0605 | 0.0638 |
| Small town | 0.0587 | 0.0678 | -0.1447 | 0.0702 | 0.0315 | 0.0670 | 0.0035 | 0.0454 | 0.1240 | 0.0616 |
| Suburbs | 0.1414 | 0.0719 | 0.0558 | 0.0714 | -0.0016 | 0.0694 | 0.0706 | 0.0492 | -0.0328 | 0.0647 |
| Type of home | | | | | | | | | | |
| Free standing | - | - | - | - | - | - | - | - | - | - |
| Elderly home | 0.2695 | 0.3004 | 0.4481 | 0.3022 | -0.1818 | 0.2803 | -0.1820 | 0.1916 | -0.3749 | 0.2381 |
| Flat | -0.0099 | 0.0570 | -0.0019 | 0.0573 | -0.0205 | 0.0541 | 0.0271 | 0.0385 | -0.1046 | 0.0520 |
| Linked house | -0.0370 | 0.0567 | -0.0010 | 0.0590 | -0.0064 | 0.0565 | -0.0117 | 0.0383 | -0.0479 | 0.0515 |
| Ownership | | | | | | | | | | |
| Tenant | - | - | - | - | - | - | - | - | - | - |
| Member coop. | 0.0456 | 0.1105 | -0.0514 | 0.1206 | 0.0216 | 0.1146 | 0.1178 | 0.0766 | -0.2015 | 0.0960 |
| Owner | 0.1366 | 0.0560 | -0.0490 | 0.0585 | -0.0941 | 0.0556 | 0.0420 | 0.0369 | 0.0394 | 0.0517 |
| Rent free | 0.1602 | 0.1047 | -0.0062 | 0.1074 | -0.0438 | 0.0996 | 0.0084 | 0.0681 | 0.0232 | 0.0978 |

Table 15: The CHOPIT model, the log-likelihood is equal to -20935, 5794 observations and θ1: 10.0431 (0.4444),
