Investigating the effects of the Great Recession on the financial position of European households.

Master’s thesis Econometrics, Operations Research & Actuarial Studies
Specialization: Econometrics

University of Groningen

S.M. Jongsma s1793101

s.m.jongsma@student.rug.nl

March 18, 2015

This paper investigates the financial effects of the Great Recession on European households using micro data. The global economic downturn of the late 2000s can be considered the most severe recession since the Great Depression of the 1930s. Consumers became increasingly pessimistic, and consumption and investment dropped. Moreover, (youth) unemployment rose sharply, especially in Southern Europe. We expect that the financial effects of the Great Recession on households could be severe, but that the actual impact will depend on the specific situation and composition of the household. For this reason we analyze whether the crisis affected the various groups of the population differently. We do this by estimating probit models for two different indicators of financial hardship: difficulties in making ends meet and the ability to face unexpected expenses. Furthermore, we are interested in whether there are specific changes in the relations between our outcomes of interest and control variables throughout the crisis. The latter is investigated by applying a Blinder-Oaxaca decomposition to a part of our data set, distinguishing a pre-crisis year from a post-crisis year for each country. We perform the analysis for the five largest economies of the Euro area, using data from EU-SILC covering the period 2004-2012. Our chosen indicators of financial hardship show the most severe effects for Spanish households. When focusing on the second phase of the recession, strong effects are also observed for households residing in Italy. In contrast, the results suggest that the crisis had limited impact on the Northern countries. When looking at specific groups, we find stronger adverse effects in particular for households headed by unemployed persons and by divorced or separated people. We cannot find evidence for stronger effects of the recession on young households relative to the elderly.

Contents

1 Introduction
2 Literature review
  2.1 Italian and Dutch households and the Great Recession
  2.2 Age effects in the ability of making ends meet
  2.3 Expected results and hypotheses
3 EU-SILC data
  3.1 Overview of database
  3.2 Data discussion
    3.2.1 Structure of database
    3.2.2 Unit non-response
    3.2.3 Item non-response
  3.3 Variables of interest
    3.3.1 Outcome variables
    3.3.2 Explanatory variables
  3.4 Descriptive evidence
    3.4.1 Summary statistics
    3.4.2 Key variables
4 Empirical models
  4.1 Practical issues in our study
  4.2 Regression methods
    4.2.1 Binary outcome models
    4.2.2 Marginal effects
    4.2.3 Year dummy variables and interactions
    4.2.4 Blinder-Oaxaca decomposition
5 Analysis
  5.1 Difficulties in making ends meet
  5.2 Ability to face unexpected expenses
  5.3 Interaction effects
  5.4 Comparing with Alessie et al. (2014)
  5.5 Blinder-Oaxaca decomposition: before and after the crisis
6 Conclusion and discussion
7 Appendix
  7.1 Figures
    7.1.1 Age class - year interactions
    7.1.2 Marital status - year interactions
    7.1.3 Education - year interactions
    7.1.4 Activity status - year interactions
    7.1.5 Owner - year interactions
    7.1.6 Household composition - year interactions
  7.2 Tables
References

1 Introduction

The global economic downturn of the late 2000s can be considered the most severe recession since the Great Depression of the 1930s. The crisis started at the end of 2007 in the United States with a crash of the housing market. As a consequence, investment banks were threatened by bankruptcy and stock markets collapsed. Due to the worldwide interconnection of financial markets, the problems spread to Europe and Asia and became a global recession. European banks encountered financial problems and became reluctant to provide new loans. Various banks even needed bailouts from national governments. Consumers became increasingly pessimistic, and consumption and investment dropped. Table 1 shows that in terms of GDP the economy of the Euro area worsened considerably in 2009. Moreover, (youth) unemployment rose sharply, especially in Southern Europe. Although real GDP recovered in 2010, in many countries unemployment stayed at the higher levels. While the events described above are usually referred to as the Great Recession, in the Euro zone a second phase of economic downturn can be recognized: the European debt crisis. As a result of the severe financial crisis, many Euro states encountered relatively high budget deficits and increasing public debt. This put pressure on the stability of the Euro system and once again led to turbulence on financial markets. Some (Southern) European states needed financial support and were forced to cut spending drastically. In Table 1 we observe that average GDP growth in the Euro area was again negative in 2012 and 2013. European economies generally showed gradual recovery in 2014, and at the start of 2015 the outlook was still positive, at least for GDP growth. However, the enormous levels of public debt in some countries still threaten the stability of the Euro zone.

We expect that the financial effects of the Great Recession on households could be severe, but that the actual impact will depend on the specific situation and composition of the household. In Europe there are clear differences between states when it comes to welfare benefits and family ties in the population. To illustrate, the Netherlands has a more elaborate welfare system than Spain. Also, as noted by Alessie, Brugiavini, and Weber (2006), the proportion of elderly people co-residing with their child(ren) is generally considerably higher in Southern Europe than in the North. This could be connected with the higher youth unemployment in the South, but other possible explanations are also proposed in the literature. The impact of the recession on households is being investigated extensively. For example, Hurd and Rohwedder (2010) analyzed the experiences and expectations of American households during the period November 2008 to April 2010. They found that unemployment rose heavily, households cut spending and problems with negative house equity increased significantly. Moreover, (long-term) expectations about house prices and stock markets worsened. Effects for European households above 65 years were described by Cavasso and Weber (2013). They found a general increase in financial distress when comparing 2006 with 2011. Furthermore, they obtained a strong positive correlation between financial distress and difficulties in making ends meet. Alessie, Angelini, Brugiavini, Celidoni, Pan, and Weber (2014) examined the financial position of Italian and Dutch households covering both the pre-crisis and crisis period. In both countries they found differing trends for households when these are characterized by socioeconomic variables like age, education and employment status.

Table 1: Yearly percentage change of real GDP for several European countries, 2006-2013. Source: Eurostat. See http://ec.europa.eu/eurostat/web/national-accounts/data/main-tables.

                            2006   2007   2008   2009   2010   2011   2012      2013
Euro area (18 countries)     3.2    3.0    0.5   -4.5    2.0    1.6   -0.7      -0.5
France                       2.4    2.4    0.2   -2.9    2.0    2.1    0.3       0.3
Germany                      3.7    3.3    1.1   -5.6    4.1    3.6    0.4       0.1
Italy                        2.0    1.5   -1.0   -5.5    1.7    0.6   -2.3      -1.9
Netherlands                  3.8    4.2    2.1   -3.3    1.1    1.7   -1.6(*)   -0.7(*)
Spain(*)                     4.2    3.8    1.1   -3.6    0.0   -0.6   -2.1      -1.2

(*): Provisional figures.

We particularly analyze whether the crisis affected the various groups of the population differently. We do this by estimating binary outcome models for two different indicators of financial hardship: difficulties in making ends meet and the ability to face unexpected expenses. The former measures to what extent the household is able to pay its usual expenses, whereas the latter focuses on costs that occur infrequently but require a larger amount of financial means. We extend the analysis of Alessie et al. (2014) to the five largest economies of the Euro area: France, Germany, Italy, the Netherlands and Spain. These states are representative of the whole Euro zone and allow for comparison between Northern and Southern Europe. Furthermore, we are interested in whether there are specific changes in the relations between our outcomes of interest and control variables throughout the crisis. If so, this would indicate a shock effect of the recession on the determinants of financial hardship at the household level. This will be investigated by applying a Blinder-Oaxaca decomposition to a part of our data set, distinguishing a pre-crisis year from a post-crisis year for each country. In our analysis we make use of data from EU-SILC, covering the period 2004-2012. This is a uniform database containing household samples for all countries of interest, which improves the comparability of the results.

Looking again at Table 1, we observe that Italy is the only country where real GDP already decreased in 2008. In terms of GDP growth the countries of interest also differed in the following years. Germany and France faced a drop in GDP only in 2009, whereas the economies of the other states declined again after 2011. In particular, on a yearly basis Spanish real GDP did not grow at all after 2008. Hence, economic development during the crisis clearly differed across the individual states of the Euro area. This raises the question to what extent the same holds for the financial well-being of European households, which is investigated in the remainder of this paper. The main research question is the following:

• What are the effects of the Great Recession on the financial position of European households?

We will answer the following sub-questions, which together contribute to answering our main research question:

1. What are the effects of the crisis on difficulties in making ends meet for European households?

2. What are the effects of the crisis on the ability to face unexpected expenses for European households?

3. Are there differences between specific groups in the population in how they went through the crisis?

4. Are there specific changes in the relation between our outcome variables of interest and their determinants throughout the crisis?

2 Literature review

In this section we discuss a recent study of Alessie et al. (2014) who investigated the effects of the Great Recession on the financial position of Italian and Dutch households. In our analysis we will compare some of our findings with their results. Therefore, it is worthwhile to take a closer look at this report. Next, we briefly review the results of Soede (2012), who analyzed the ability of making ends meet in relation to age. His study has different purposes, but its results provide some interesting insights on the factors that influence the ability to make ends meet. Finally, we mention the results we expect to obtain.

2.1 Italian and Dutch households and the Great Recession

Alessie et al. (2014) analyzed the financial situation of Italian and Dutch households during both the pre-crisis and crisis period. They used the Survey on Household Income and Wealth (SHIW) for Italy and micro data from the DNB Household Survey (DHS) for the Netherlands. The Italian data set covers the period 1989-2012, in which generally once every two years a sample of about 8,000 households filled out a questionnaire regarding income, consumption and wealth. This was done mainly by means of computers; a small part (about 10-15%) of the interviews was conducted using pen and paper. The Dutch data cover the years 2006 to 2013 and were obtained from a sample of 2,000 households, which was refreshed each year with new households to account for panel attrition. The DHS is an Internet panel study and consists of five questionnaires, which had to be filled out by all household members aged 16 and over, except for a housing questionnaire which was filled out by the household head only. Both surveys also provided information about the household composition and personal characteristics of the household head.

First, Alessie et al. (2014) investigated the dynamics of the reported ability to make ends meet. In Italy the ability to make ends meet was measured on an ordered scale from 1 to 6, where 1 means ‘with great difficulty’ and 6 means ‘very easily’. In the Dutch questionnaire respondents had to choose between five categories, where 1 stands for ‘very hard’ and 5 stands for ‘very easy’. In addition, a binary variable for difficulties in making ends meet was constructed, which equals one if the respondent chose either answer 1 or 2. Other outcome variables that were analyzed are the existence of financial distress, house value (net of outstanding debt) and real equivalized household net income, i.e. income corrected for household size. These indicators were modelled by OLS, using robust standard errors clustered at the household level, on the following explanatory variables: age classes based on 10-year bands corresponding to the age of the household head, gender, marital status, household composition, type of education, employment status, home ownership and year dummy variables to account for the yearly dynamics during the crisis. Subsequently, interaction terms between the year dummies and various other covariates were added as covariates in the regressions, one at a time. This way one can see whether the recession had different implications for groups of households with specific socioeconomic characteristics.
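The construction of the binary indicator and of the year interactions described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used by Alessie et al. (2014): the function names, the group labels and the sample years are hypothetical, and only the cut-off (answers 1 or 2) is taken from the text.

```python
# Hypothetical helper names; only the cut-off "answer 1 or 2" comes from the text.

def has_difficulty(answer):
    """Return 1 if the household chose answer 1 or 2 on the ordered scale, else 0."""
    return 1 if answer in (1, 2) else 0


def interaction_dummies(year, group, years, groups):
    """Build year x group interaction dummies for one observation.

    The first year and the first group act as reference categories,
    so no dummy is created for them.
    """
    return {
        f"{g}_x_{y}": int(group == g and year == y)
        for g in groups[1:]
        for y in years[1:]
    }


# The Italian scale runs from 1 to 6 and the Dutch scale from 1 to 5;
# both map to the same binary indicator.
italian_answers = [1, 2, 3, 6]
binary = [has_difficulty(a) for a in italian_answers]

# One observation: an unemployed household head interviewed in 2012.
row = interaction_dummies(
    2012, "unemployed", years=[2006, 2008, 2012], groups=["employee", "unemployed"]
)
```

Adding one block of such interaction dummies at a time, as described in the text, keeps each regression small while still showing whether a given group followed a different path over the years.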

Alessie et al. (2014) showed that the crisis initially had no impact on (difficulties in) making ends meet, but in 2012 the proportion of households reporting difficulties increased in Italy. This was especially the case for households with children and for self-employed and unemployed household heads. Also, in 2012 real income decreased sharply in Italy, although income remained more stable for older Italian households. In the Netherlands, on the other hand, neither the ability to make ends meet nor real income seemed to suffer significantly from the recession. Only after 2010 did unemployed and self-employed people in particular, as well as low-educated household heads, experience a small decrease in real income. Furthermore, the proportion of financially distressed households clearly increased in Italy in 2008, whereas in the Netherlands such an increase started only in 2011. In both countries this was not the case for households aged 65 and over, which did not seem to suffer from the crisis in this respect.

Two country-specific reports analyzed some other economic indicators as well, based on the same survey data sources. First, Alessie, Angelini, and Pan (2014) discussed information on beliefs and expectations that was only available from the Dutch questionnaires. Based on the same methodology they found that Dutch households reported more difficulties in obtaining a loan during the crisis. For household heads with university education the effect was limited; only in 2009 did these persons report a significant increase in difficulties in obtaining a loan. Also, after 2007 Dutch households became more pessimistic about the economic situation. This continued until 2013, when they became more optimistic for the first time in years. Expectations about house prices roughly followed the same pattern: until 2008 households expected increasing house prices, but during the years that followed expectations reversed. The proportion of home owners whose mortgage is underwater (i.e. the value of the house is lower than the outstanding mortgage debt) started to increase in 2012 and almost doubled in 2013. In these years, mortgages of younger household heads aged 36-45 and of employees in particular went underwater. Second, in the country-specific report of Brugiavini, Celidoni, and Weber (2014), total, non-durable and durable expenditure of Italian households were analyzed over time. They found that equivalized total expenditure clearly decreased in 2012; the same trend was noticed for equivalized household income. Durable expenditure decreased even more than income during the crisis, whereas households also reduced their non-durable expenditure, but not as strongly as for durables. Finally, unemployed household heads in particular lowered their expenditure, whereas older households did not significantly cut expenditure at all.

2.2 Age effects in the ability of making ends meet

Soede (2012) discussed how age is related to the ability to make ends meet in the Netherlands. This analysis was done specifically to investigate differences in income satisfaction between pensioners and younger people. Soede made use of the Sociaal Economisch Panel (SEP), which is a Dutch longitudinal household survey covering the years 1981-2002. For each household one respondent gave information about how well the household was able to make ends meet. This was done using an ordered scale of 6 possibilities. Also, socioeconomic characteristics of the respondent were known.

First, the proportions of households that make ends meet either with (great) difficulty or (very) easily were shown for different ages. For this purpose Soede also made use of data from the Permanent Onderzoek LeefSituatie (POLS) and EU-SILC for the years 2003-2010. Note that in his study age denoted the age of the oldest household member. The proportion of pensioners that was able to make ends meet (very) easily clearly increased over time, more than the corresponding proportion of younger households. In particular, from 2006 older households were able to make ends meet more easily than younger households. On the other hand, the proportion of households reporting (great) difficulties was quite stable over time. From 2003 older households clearly reported fewer difficulties than younger households.

but the latter effect was smaller in magnitude. Furthermore, household size and household type had distinct effects on the level of income that is needed, compared to the ability in making ends meet of a one-person household. Finally, Soede made direct and indirect effects of age and cohort explicit by estimating equivalence factors for various household types and other socioeconomic variables. From these results it follows that for a pensioner the ability to make ends meet was not significantly higher than for a working person. On the other hand, an unemployed person needed significantly more income in order to make ends meet with the same efforts.

2.3 Expected results and hypotheses

First, we expect that the financial crisis had more impact on households in Southern European countries than on households in the Northern countries of our study. The Netherlands and Germany are known to have more elaborate welfare systems than countries like Italy and Spain. Hence, it is to be expected that becoming unemployed, for example, has stronger adverse effects for households in the South. Also, Alessie et al. (2014) concluded that real equivalized household income of Italian households decreased during the first phase of the recession and even sharply in 2012, whereas the effects on Dutch incomes were only limited. Moreover, they observed that the proportion of Dutch households having difficulties making ends meet was not affected by the crisis.

Next, we expect that in general the recession had more consequences for younger people than for pensioners. We anticipate that pensioners depend less directly on economic circumstances, as most of them no longer have a job. This would be in line with the findings of Soede (2012), who found fewer difficulties in making ends meet for the elderly since the early 21st century. Furthermore, Alessie et al. (2014) concluded that for both Italy and the Netherlands there was some upward trend in financial distress during the crisis, but only for households younger than 65 years. They also found that in 2012 households below 45 years were more financially distressed than other households. Additionally, the impact of the recession on making ends meet and on income of Italian households was felt especially by self-employed and unemployed household heads and by households with children, in contrast with older households.

Finally, when we estimate separate models for both a pre-crisis and a post-crisis year, we expect that conclusions regarding structural breaks in the parameters will differ over the countries. In line with our other expectations, it would be natural to assume that significant differences are found only for Italy and Spain. However, the specific timing and length of the recession varies for each country, so the results may depend on the choice of the pre-crisis and post-crisis year in this part of the analysis. If we find a structural break, then a Blinder-Oaxaca decomposition will show whether differences in the outcome variable of interest are either due to changes in the values of the regressors or as a result of actual differences in the parameters over the years.
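For reference, the two components mentioned above can be written out for the textbook linear case. This is a generic illustration only: the notation (subscript 0 for the pre-crisis year, subscript 1 for the post-crisis year) is an assumption made here, and the variant adapted to binary outcome models is the subject of Section 4.2.4.

```latex
% Linear two-fold Blinder-Oaxaca decomposition (generic textbook form).
% \bar{y}_t : sample mean of the outcome in year t
% \bar{x}_t : vector of sample means of the regressors in year t
% \hat{\beta}_t : OLS coefficient vector (including intercept) estimated for year t
\begin{align*}
\bar{y}_1 - \bar{y}_0
  &= \bar{x}_1'\hat{\beta}_1 - \bar{x}_0'\hat{\beta}_0 \\
  &= \underbrace{(\bar{x}_1 - \bar{x}_0)'\hat{\beta}_1}_{\text{change in the values of the regressors}}
   + \underbrace{\bar{x}_0'(\hat{\beta}_1 - \hat{\beta}_0)}_{\text{change in the parameters}}
\end{align*}
```

The first line uses the fact that an OLS regression with an intercept passes through the sample means; the second line adds and subtracts $\bar{x}_0'\hat{\beta}_1$. Other reference coefficient vectors are possible and would shift weight between the two terms.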

3 EU-SILC data

This section presents the data that is used in the modelling part. First, we give a short overview of the database. Next, we discuss several issues concerning the data which have to be taken into account when performing the analysis. Subsequently, we describe our variables of interest in more detail. Finally, we provide summary statistics for all variables and take a first look at trends in our key variables during the period of study.

3.1 Overview of database

In this study we use data from EU-SILC: European Union Statistics on Income and Living Conditions. This is a comprehensive micro-level data set in which many European countries participate. To be precise, the countries that are ultimately involved in EU-SILC are the 28 EU countries and Iceland, Norway, Switzerland and Turkey. The five countries that are relevant for our study first participated either in 2004 (France, Italy and Spain) or in 2005 (Germany and the Netherlands). For all these countries we have data available up to and including 2012.

The reference population of EU-SILC is all private households and their current members residing in the territory of the country at the time of data collection. Persons living in collective households and in institutions are generally not included in the selected samples. Also, small parts of the territory of a country that amount to no more than 2% of the national population may be excluded from the selected sample. EU-SILC provides two types of data: cross-sectional data pertaining to a given time or a certain time period, with variables on income, poverty, social exclusion and other living conditions, and longitudinal data pertaining to individual-level changes over time, which are observed periodically over a four-year period. In a following subsection we will elaborate on the fact that for our purposes it is only possible to exploit the former type of data, which implies that we cannot do panel data analysis. This is an important limitation of the analysis. At the same time, the data set contains extensive information about the characteristics of each household, so that we can control for demographic and socioeconomic variables and also look for potential interaction effects. Information was collected at two levels: the household level and the personal level. The household respondent is the person from whom household level information is obtained. Ideally, this is the person responsible for the accommodation. If he or she was not available for interview, the household member aged 16 or over who was best placed to give the information was chosen.

Interviews were used to collect data. Each country is responsible for the set-up of its own questionnaire; for the national questionnaires we refer to the website of Eurostat. Moreover, in the Netherlands data from administrative registers were also used, avoiding the need to interview all members aged 16 and over in each sample household. Much flexibility was allowed regarding the sample design. All countries used a rotational design to interview a sample of households: the sample used for any year consists of 4 replications, which have been in the survey for 1-4 years. Each replication stays in the survey for four years: every year one of the four replications from the previous year is dropped and a new one is added. An exception is France, where a nine-year panel with nine rotational groups is used.
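The rotation scheme described above can be made concrete with a small sketch. It assumes a steady state in which exactly one replication enters and one leaves every year, and it labels each replication by its (notional) entry year; the function names and this labelling are illustrative assumptions, not part of EU-SILC.

```python
def replications_in_sample(year, panel_length=4):
    """Return the entry years of the replications interviewed in `year`.

    Steady-state assumption: one new replication enters each year and
    every replication stays for `panel_length` consecutive years.
    """
    return list(range(year - panel_length + 1, year + 1))


def overlap_share(year, panel_length=4):
    """Share of this year's replications that were also interviewed the year before."""
    current = set(replications_in_sample(year, panel_length))
    previous = set(replications_in_sample(year - 1, panel_length))
    return len(current & previous) / len(current)


# Standard four-year rotation: three of the four replications are carried over.
standard_2008 = replications_in_sample(2008)
standard_overlap = overlap_share(2008)

# French design: nine rotational groups, each followed for nine years.
french_2012 = replications_in_sample(2012, panel_length=9)
```

Under these assumptions three quarters of the households in a given cross-section were also interviewed in the previous year, which illustrates why consecutive yearly files are not fully independent samples.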

To sum up, we have 587,377 observations available for the years 2004-2012. From this large collection of data we lose about 4.7% of the observations as a consequence of missing values in our control variables. However, a large part of these missing values is due to one control variable that is insufficiently recorded in the Netherlands, as will be discussed in the next subsection. Without this variable only 1.4% of all observations would be lost.

3.2 Data discussion

For any empirical study the quality of the available data is crucial for the strength of the analysis. EU-SILC provides a unique data source for comparative analysis of income, poverty and other social domains, which is very useful for studies on a broad range of topics. On the other hand, a number of studies on EU-SILC point out problems and shortcomings in the data. In this subsection we mention a few limitations of EU-SILC that apply to our analysis. Moreover, we discuss the issue of non-response for our sample.

3.2.1 Structure of database

To begin with, EU-SILC provides information in separate files for each year. However, personal and household identifiers are randomized for each data file, so the files cannot be linked across years and households cannot be followed over time. An alternative would be to use the available longitudinal files, which include only one to four observations per household, but for our purposes we cannot base our analysis on such a short time period. Hence, we can only treat the separate files as independent samples over the years. However, as also noted by Iacovou, Kaminska, and Levy (2012), we know that some households will be present in the data for only one year, while for other households there will be repeated observations. In order to calculate appropriate standard errors we would have to take this clustering into account, which is not possible with the current structure of the data. We do compensate for possible model misspecification by using robust standard errors, although this will not solve the problem of time-dependent observations. This implies that our standard errors should be interpreted with care.

Second, there is a reference period mismatch in EU-SILC between income and non-income variables. Income variables relate to the calendar year preceding the year in which the data are collected, which is called the income reference period. Other variables, however, relate to the moment of interview, which is the current reference period. Iacovou et al. (2012) note that this is harmful for any study that analyzes the relationship between income and any other variable. For example, the financial situation of a person who becomes unemployed at the beginning of the year of interview may be quite different from that in the previous year (i.e. the year on which his income figures are based). Since it is not possible to track previous income information of a particular household, we will simply use the household income as provided by EU-SILC. Note that this also implies that we cannot use (positive or negative) changes in household income as an explanatory variable in our regressions, as done by Soede (2012), although this would otherwise be an interesting covariate.

Third, the user database description of EU-SILC, see Eurostat (2008), is not very clear about the weighting of observations. The database includes household cross-sectional weights, which should make it possible to draw inference from the effective sample to the target population. However, as noted by Iacovou et al. (2012), we do not know exactly what these weights stand for in the various countries. Either they only correct for the probability of selection of the household, or they also adjust for non-response. As there are no alternatives, we choose to present our data later on using the household weights provided by EU-SILC.

Next, a general point noted by Iacovou et al. (2012) is that EU-SILC is output-harmonised, rather than input-harmonised. This implies that countries have much freedom in how they collect the required information. For example, in the Netherlands the data comes from two sources, i.e. surveys and registers, whereas the other countries only use surveys. Also, there is flexibility in the wording of questions, which potentially has influence on the answer that is given by the respondent.

Furthermore, for modelling purposes we will delete households that report zero or negative income so that we can use the log of income. A disadvantage is that self-employed persons are overrepresented in this group: almost a quarter of the households that report zero or negative income are self-employed, while they account for only about 8% of the sample. Also, Italy and Spain report clearly more observations of such incomes than the other countries. The upper part of Table 2 gives an overview of this type of observation. However, the total number of occurrences of zero or negative income is very small relative to the total number of observations: less than one percent of the sample. Hence, we expect that this deletion will not have significant consequences for the representativeness of our results.

Table 2: Details of the various observations which are discussed in the text.

Type of observation                                   Total   DE      ES         FR      IT         NL
Total number of households with income ≤ 0            3375    382     1255       93      1430       215
Self-employed with income ≤ 0 in whole sample (%)     22%
Self-employed in whole sample (%)                     8%
Income ≤ 0 in whole sample (%)                        0.6%
Overall unit non-response rate (in 2007)(∗)                   13%     38%        13%     20%        22%
Households with missing personal income figures (%)           0.9%    3.9%(∗∗)   0.3%    0%(∗∗∗)    0%
Item non-response (%):
- Ability to make ends meet                           0.1%
- Ability to face unexpected expenses                 0.3%
- Real household income                               0.1%
- Status in employment of household head                      16.5%   12.4%      4.4%    10.5%      11.1%
- General health of household head                            0.1%    0.1%       0.1%    1.0%       23.3%
Total deletion of observations for
- Ability to make ends meet (%)                       4.7%
- Ability to face unexpected expenses (%)             4.6%

(∗): Source: Verma et al. (2010).
(∗∗): 3397 of the 4410 Spanish households with incomplete income information were observed in 2004, i.e. 77 percent.
(∗∗∗): Italy deletes households with incomplete income information from their sample.

3.2.2 Unit non-response

During any process of data collection by surveys, one has to deal with the problem of non-response, which can have effects on the representativeness of the results. Countries differ in their procedures regarding the treatment of non-response, where we distinguish between unit and item non-response. We will elaborate on both types in the remainder of this subsection.


one or more specific persons in that household. The data description of EU-SILC is not always clear regarding the procedures applied to this phenomenon. Verma et al. (2010) state that non-response is a ‘serious problem’ in EU-SILC. For example, they conclude that in most countries the overall non-response rate for personal interviews is quite high for households newly introduced in the sample. In 2007 this rate varies from 13% for both France and Germany to 38% for Spain, as can be seen in the lower part of Table 2. Also, they state that this type of non-response often varies across subpopulations of a country. Since no information is provided concerning the response status of households and individuals, it is not possible to examine non-response for different groups in a country. However, the database description of EU-SILC states that the provided household weights also ‘corrects for the non-response at the household level’. As we apply these weights to our sample of observations, we will assume that this aspect of unit non-response will not lead to crucial problems for our analysis.

In the case that not all household members could be interviewed, it is necessary to correct for the effect of non-responding individuals on total household income. Countries differ in their methodologies for dealing with this issue. The Netherlands uses administrative sources to collect income figures. Therefore, for this country this type of non-response is not an issue here. In Italy households with one or more missing personal interviews are completely deleted from the sample, but we cannot track for how many households this was the case. In France full-case imputation of missing values is applied from 2006. In Germany and Spain (until 2007) the collected income value is multiplied by a factor that is determined on the basis of characteristics of the household and that of the non-interviewed member(s). The latter is also the case for French households before 2006. The factor has been used only when no other imputation is performed; in all other cases it is equal to one. From Table 2 it can be seen that for our data set this type of non-response is hardly an issue for Germany and France, but for 3.9% of the Spanish households in our sample there are missing income components. Actually, this type of Spanish households is primarily observed in 2004. On the other hand, for Germany the few occurrences of incomplete income figures are equally spread over the years. For this reason we choose to apply the non-response factor on income only for Spain and delete the remaining households containing missing income components. This factor is a real number between 1 and 5 and it only applies to the Spanish sample during the period 2004-2007.
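The treatment of incomplete income information described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the procedure as actually coded for this study: the record keys (`country`, `year`, `income`, `income_incomplete`, `nr_factor`) are hypothetical names and do not correspond to EU-SILC variable codes.

```python
def adjust_income(households):
    """Apply the within-household non-response factor for Spain (2004-2007)
    and delete all other households with incomplete income information.

    Each household is a dict with hypothetical keys:
    country, year, income, income_incomplete (bool), nr_factor (float >= 1).
    """
    kept = []
    for hh in households:
        if not hh["income_incomplete"]:
            kept.append(dict(hh))  # complete income information: keep unchanged
        elif hh["country"] == "ES" and 2004 <= hh["year"] <= 2007:
            adjusted = dict(hh)
            # inflate the collected income by the non-response factor
            adjusted["income"] = hh["income"] * hh["nr_factor"]
            kept.append(adjusted)
        # any other household with incomplete income is dropped
    return kept


# Made-up example records (not real data)
sample = [
    {"country": "ES", "year": 2004, "income": 10000.0,
     "income_incomplete": True, "nr_factor": 1.5},
    {"country": "DE", "year": 2006, "income": 20000.0,
     "income_incomplete": True, "nr_factor": 1.2},
    {"country": "NL", "year": 2010, "income": 30000.0,
     "income_incomplete": False, "nr_factor": 1.0},
]
cleaned = adjust_income(sample)
```

In this made-up example the Spanish household is kept with an inflated income, the German household with incomplete income is deleted, and the Dutch household is left untouched.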

3.2.3 Item non-response

Item non-response refers to missing answers to specific questions in the survey that would otherwise be successfully completed. For example, this can be due to refusal of the respondent to answer, an invalid recorded answer, or the respondent not being able to provide the requested information. We will look at a few important variables (to be presented in the next section) with respect to this issue. In the lower part of Table 2 we notice that item non-response within our sample is low for our key outcome variables. On the other hand, information about general health of the household head is subject to high non-response in the Netherlands (and is more or less equally high over the years), whereas it is quite low in the other countries. We will describe potential consequences for the modelling when we present summary statistics of our variables later on. Apart from the Netherlands, most sample households are lost due to the variable about status in employment: on average this variable is unknown for 11% of the households in our initial sample. Fortunately, we only need this variable until 2008 (as after this year another question in the survey is sufficient to determine the status in employment). In order to model the ability to make ends meet, we lose on average 4.7% of our households due to missing values in our set of explanatory variables. For the modelling of the ability to face unexpected expenses this figure is 4.6%. However, if we dropped the variable about health we would only lose 1.4% or 1.7% of all observations, respectively. Overall, we conclude that item non-response does not significantly damage the representativeness of our study.

3.3 Variables of interest

3.3.1 Outcome variables

The most important variable to be analyzed in this paper is makendsmeet, the ability to make ends meet. This is related to the net monthly income of the household, and the survey question asks with what level of difficulty the household is able to pay its usual expenses. Note that although only one member of the household is asked this question, more than one household member may contribute to the total household income. In each country the ability to make ends meet is measured by means of the following ordered scale: with great difficulty, with difficulty, with some difficulty, fairly easily, easily or very easily. These possible answers range from 1 to 6, respectively, where 1 means with great difficulty and 6 means very easily. Note that this variable refers to the current reference period, which is the moment of interview.

Second, from this ordered outcome variable we construct a binary variable difmakends which defines whether a household has difficulties in making ends meet. This dummy takes value one if a household is only able to make ends meet either with great difficulty or with difficulty (i.e. the respondent chose either answer 1 or 2 as defined above) and zero otherwise.
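The construction of the dummy from the ordered scale can be sketched in a few lines. The function name is chosen here for illustration only; the coding of the answers (1 to 6) follows the description above.

```python
def difmakends(makendsmeet):
    """Return 1 if the household makes ends meet 'with great difficulty' (1)
    or 'with difficulty' (2), and 0 for answers 3 to 6."""
    if makendsmeet not in range(1, 7):
        raise ValueError("makendsmeet must be an integer from 1 to 6")
    return 1 if makendsmeet <= 2 else 0


# Dummy value for each possible answer on the ordered scale
coded = [difmakends(answer) for answer in range(1, 7)]  # [1, 1, 0, 0, 0, 0]
```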

Finally, we will compare the estimation results of our models for the ability to make ends meet with a model concerning a different indicator of financial hardship: unexpexp, the ability to face unexpected expenses. This variable also refers to the time frame of the moment of interview. In contrast with the ability to make ends meet, facing unexpected expenses is about costs for the household that occur infrequently. This variable has a value of either zero or one, depending on the answer of the respondent to the following question: ‘Can your household afford an unexpected required expense and pay through its own resources?’. Here, the values zero and one correspond to ‘no’ and ‘yes’, respectively. The mentioned unexpected expense is defined as 1/12 times the at-risk-of-poverty threshold of the corresponding country. The latter threshold is defined by Eurostat to be 60% of the national median equivalized disposable income after social transfers (i.e. corrected for household size). Hence, the size of the unexpected expense that has to be faced hypothetically depends on the corresponding national income level. For the type of unexpected expense we can think of a major repair of the house or replacement of durables like a washing machine.
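To make the size of the hypothetical expense concrete, the definition above can be written out as a small calculation. This is a sketch under simplifying assumptions: it uses an unweighted median of a made-up list of annual equivalized incomes, whereas the official threshold is computed by Eurostat from the full survey data. The function name is hypothetical.

```python
from statistics import median


def unexpected_expense(equivalized_incomes):
    """One twelfth of the at-risk-of-poverty threshold, where the threshold
    is 60% of the median annual equivalized disposable income."""
    poverty_threshold = 0.60 * median(equivalized_incomes)
    return poverty_threshold / 12


# Made-up annual equivalized incomes in euros (not real data)
incomes = [12000, 18000, 20000, 24000, 40000]
expense = unexpected_expense(incomes)
```

With a median of 20,000 euros the threshold is 12,000 euros per year, so the unexpected expense in this example amounts to roughly 1,000 euros.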

3.3.2 Explanatory variables

First, an important explanatory variable for the ability to make ends meet is income. In our analysis we use the log of realinc, which is real equivalized disposable household income. Here, the reference period for income is the calendar year prior to the interview. Disposable household income comprises several individual components. The data description of EU-SILC mentions that disposable income can be computed in two ways. Graf, Wenger, and Nedyalkova (2011) note that Germany and the Netherlands use a definition for disposable income based on gross income components, whereas France, Italy and Spain base income figures on net income components during 2004-2006. In the latter case the collected values are then converted to gross values, see Verma et al. (2010) for more details. In order to use real income, we make use of the harmonized index of consumer prices (HICP) provided by Eurostat[3], so that income values are expressed in 2005 prices. The HICP shows consumer price inflation for each country in the euro area on the basis of harmonized statistical methods, as opposed to national consumer price indices, which improves comparability of inflation rates between countries. In order to take into account that larger households have more needs, we equivalize our income figures based on household size. We use the square root of household size as equivalence scale for disposable income. Hence, we follow the method applied in Alessie et al. (2014), which enables us to compare some of our results later on. Other common approaches are the OECD equivalence scale, see OECD (2013) for details, or the OECD-modified scale proposed by Hagenaars, de Vos, and Zaidi (1994). The different methods all share the idea that due to economies of scale the needs of a household do not increase proportionally with each additional member. From Table 8.1 of OECD (2013) it follows that the various approaches are not very different from each other; only for relatively large households the OECD equivalence scale gives clearly more weight to household members.
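The construction of the income regressor can be illustrated with a short sketch. Several details are assumptions made for this illustration rather than statements about the actual data processing: nominal income is deflated by dividing by the HICP with base 2005 = 100, the function names are hypothetical, and the example figures are invented. The two OECD scales are included only for comparison, using their standard weights (0.7 and 0.5 for additional adults and children in the original scale, 0.5 and 0.3 in the modified scale).

```python
import math


def sqrt_scale(hhsize):
    """Square-root equivalence scale (the scale used in this study)."""
    return math.sqrt(hhsize)


def oecd_scale(n_adults, n_children):
    """Original OECD scale: 1 for the first adult, 0.7 per additional adult,
    0.5 per child."""
    return 1.0 + 0.7 * (n_adults - 1) + 0.5 * n_children


def oecd_modified_scale(n_adults, n_children):
    """OECD-modified scale: 1 for the first adult, 0.5 per additional adult,
    0.3 per child."""
    return 1.0 + 0.5 * (n_adults - 1) + 0.3 * n_children


def log_real_equivalized_income(nominal_income, hicp, hhsize):
    """Log of real equivalized income; None for zero or negative income.

    Assumes `hicp` is an index with 2005 = 100 (illustrative assumption).
    """
    if nominal_income <= 0:
        return None  # such households are deleted before taking logs
    real_income = nominal_income / (hicp / 100.0)
    return math.log(real_income / sqrt_scale(hhsize))


# Invented example: four-person household, 44,000 euros, HICP of 110
logrealinc = log_real_equivalized_income(44000.0, 110.0, 4)

# Equivalence scales for two adults with two children
scales = (sqrt_scale(4), oecd_scale(2, 2), oecd_modified_scale(2, 2))
```

For two adults with two children the three scales are 2.0, 2.7 and 2.1, which illustrates that the original OECD scale assigns the most weight to additional household members.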

Second, we control for various demographic and socioeconomic variables which are available for the person representing the household, whom we will call the respondent from now on. Hence, we make the assumption that the person interviewed regarding variables on the household level is also the person who is responsible for the household, which is generally the case. female is one if the respondent is female. From the variable age we construct age dummies like age16_35, which is one if the respondent is between 16 and 35 years old, and so on. Marital status is known for the household head: we construct dummy variables single, partner, widowed and separ_divorc. separ_divorc has value one if the respondent is either separated or divorced from his or her partner. Next, the level of education is grouped into four categories, which are constructed from information about the highest ISCED-1997 level attained[4]: educlow has value one if the respondent completed at most primary education. educlowersec and educuppersec are dummies for lower secondary and upper secondary education, respectively. The latter also includes post-secondary education. Finally, educhigh equals one if the respondent completed tertiary education. The current economic status is described by the dummy variables employed, selfemployed, unemployed, retired, disabled and inactive. disabled takes value one if the respondent is permanently disabled or unfit to work, whereas inactive has value one if the respondent is a student or some other economically inactive person. Also, selfemployed has value one if a person is self-employed either with or without employees. We have information about the self-declared health of the respondent: respondents can choose between very good, good, fair, bad or very bad. We construct a dummy badhealth that has value one if either ‘bad’ or ‘very bad’ is chosen. In 2004 and 2005 the scale that was applied is slightly different, so that badhealth is one if the respondent chose either ‘rather bad’ or ‘bad’. A dummy variable homeowner is included which has value one if the residence is owned by the household. Note that households living in an accommodation that is provided rent-free are not considered as owner.
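As an illustration of how such dummies are built, the grouping of education levels can be sketched as below. The numeric coding is an assumption: it takes the ISCED-1997 levels as integers 0 to 6 (0 pre-primary, 1 primary, 2 lower secondary, 3 upper secondary, 4 post-secondary non-tertiary, 5 and 6 tertiary) and is not taken from the EU-SILC codebook.

```python
def education_dummies(isced_level):
    """Map an ISCED-1997 level (assumed coded 0-6) to the four education
    dummies; exactly one of them equals one."""
    if isced_level not in range(0, 7):
        raise ValueError("ISCED-1997 level must be an integer from 0 to 6")
    return {
        "educlow": int(isced_level <= 1),            # at most primary
        "educlowersec": int(isced_level == 2),       # lower secondary
        "educuppersec": int(isced_level in (3, 4)),  # upper and post-secondary
        "educhigh": int(isced_level >= 5),           # tertiary
    }


# Example: a respondent with post-secondary non-tertiary education
example = education_dummies(4)
```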

Furthermore, hhsize denotes the number of household members. youngchild is a dummy variable that is one if underage children are living in the household (one-person households excluded), while adultchild denotes the presence of only adult children in the household. These are people between 18 and 24 years who are economically inactive and live with at least one parent.

[3] See http://appsso.eurostat.ec.europa.eu/nui/show.do?dataset=prc_hicp_aind&lang=en.

[4] ISCED stands for International Standard Classification of Education and has been developed by UNESCO to facilitate comparisons of education statistics across countries on the basis of uniform definitions. For precise definitions of the categories we refer to http://www.uis.unesco.org/Education/Pages/international-standard-classification-of-education.aspx.

Finally, we include time dummies for the period 2004-2012 to analyze how the ability to make ends meet is related to the pre-crisis and crisis years, when controlling for the above-mentioned variables.

3.4 Descriptive evidence

3.4.1 Summary statistics

In Table 3 weighted summary statistics of all variables of interest for our study are presented. We briefly discuss some values that can be observed in the table.

First, makendsmeet_out, difmakends_out and unexpexp_out are values for sample households that contain at least one missing value in our set of regressors. Hence, these observations will be left out of the models. We will only focus on difmakends_out and unexpexp_out for the moment. For Germany, Spain, France and Italy the number of unused observations is acceptably low. On the other hand, for the Netherlands more than 24% of the sample includes at least one missing value. As already mentioned in the previous subsection, this is exclusively due to the high number of unrecorded values for badhealth. Hence, we could choose to leave out this variable from our models. However, we consider this variable to be relevant for our models, and only in the Netherlands this variable is insufficiently collected. From the summary statistics of difmakends_out we observe that the out-of-sample average difficulties in making ends meet is somewhat lower than that of difmakends. Moreover, the mean of unexpexp_out is clearly higher than that of unexpexp. Hence, we prefer to keep badhealth in our models, but for the Netherlands we will estimate an additional model that ignores the health situation of the household head. Compared to the Netherlands, the other countries give an opposite image for the out-of-sample outcome variables: on average difmakends_out is higher than difmakends (except for Italy) while unexpexp_out is lower than unexpexp, especially for France. Hence, it seems that, generally, respondents who do not (want to) report their health status are more often in financial problems, and we expect that these people are in general less healthy. However, as only a very small part of these samples includes such missing recordings in these countries, we will not consider it for these countries in the remainder of our analysis.

Second, the standard deviation of Spanish disposable income is considerably lower than that of the other countries. Spain reports significantly fewer extreme incomes; the 250 highest incomes recorded over all years do not include Spanish households. Also, compared to the sample of Alessie et al. (2014) we observe a relatively high proportion of female respondents; only in Italy it is below 50%. Average age is highest in Italy; this country includes relatively few young households below 36 and a relatively high proportion of households in the highest age class. Regarding marital status we observe more one-person households in the Northern countries than in the South. In addition, in Germany the proportion of separated and divorced respondents is almost three times lower than that of Spain. There are also significant differences in education level between the countries. Respondents in the Southern countries completed only low education clearly more often than those in the Northern countries. In fact, low education in Germany is very rare as it only applies to 0.8% of the respondents, which may be (partially) the result of a difference in applied definitions. When we look at employment status, we observe that Italy has a lower proportion of unemployed respondents than expected. This could be because in Italy it can be more advantageous for unemployed persons to categorize themselves as being self-employed, a category that is indeed relatively large for Italy in our sample. Furthermore, the Netherlands (and to a lesser extent Spain) has relatively few pensioners but also an unexpectedly high proportion of inactive respondents. Hence, there may be some differences between the countries concerning the definitions used when recording these variables. Finally, home ownership can be seen to be more common in Southern Europe, whereas household size also has higher values here. The latter is as expected since in Italy and Spain it is more common to live together with several generations in the same residence.

3.4.2 Key variables

In the remainder of this section we look at our key variables in more detail. First, to provide an overview of the variable makendsmeet over the years, in Figure 1 the average ability to make ends meet over time is displayed for each country. Note that households in the Netherlands report to have on average the least difficulties in making ends meet, whereas Italian households declare to have most difficulties during all years. The French average of 2004 is remarkable as it is considerably higher than in all succeeding years. For all countries average ability decreases (slightly) in 2008, the year in which the financial crisis started.

Figure 1: Average ability of making ends meet for all years and all countries. Ability is measured on a scale from 1 to 6 where 1 means ‘with great difficulty’ and 6 means ‘very easily’.

Difficulties in making ends meet: difmakends


i.e. answer 2, and the proportion of households reporting great difficulties, i.e. answer 1. Hence, the sum of these proportions equals difmakends as defined before. Notice that for most years Germany has the lowest proportion of households with difficulties, whereas it can be seen that Italy has the highest proportion of this type of households. We also observe that, compared to Italy and Spain, a relatively small part of the French households who report difficulties actually has great difficulties. When focusing on each country over the different years, it seems that the crisis did not affect Dutch households very much, as the proportion of households reporting (great) difficulties remains fairly stable; only a slight increase in difficulties can be observed after 2009. This would be in line with Alessie et al. (2014), who did not find significant effects for both Dutch income and making ends meet during the same period. In Germany and France declared difficulties increase in 2009 and remain stable afterwards. On the other hand, in Italy and Spain an increase in difficulties can be observed already in 2008, but the proportion of this type of households fluctuates more after the start of the crisis.

Facing unexpected expenses: unexpexp

Next, in Figure 2 (bottom graph) the proportion of households that is able to face unexpected expenses is displayed for each country. We notice that, except for 2005, households in the Netherlands are most able to face unexpected expenses. Compared to the ability to make ends meet, the other countries do not differ very much when it comes to facing unexpected expenses. The German figures have a remarkable peak in the first year of study, whereas they are very stable during the crisis years. The high value for 2005 is questionable as there are no clear factors which may explain this. Later on we will also leave out this year from our models, in order to examine whether this yields different results. Based on the French figures the Great Recession would not have had any impact on the financial robustness of households in France at all; we only observe a gradual upward movement during the whole period of study. Likewise, for Dutch households unexpexp initially has an upward trend. However, after 2009 unexpexp decreases slightly and then stays at the same level. This is in line with the slight increase of Dutch difficulties in making ends meet, which we observe after 2009 in the upper graph of Figure 2. For Italy and Spain a clear decrease in the ability to face unexpected expenses can be noticed in the years after the start of the crisis. However, the timing is different: in Spain unexpexp already decreases after 2008, whereas in Italy this happens only after 2010. Both countries report for 2012 the lowest proportion for this variable over the whole period. From the figures it seems that regarding financial robustness Italian and Spanish households clearly suffered from the crisis.

Household income: realinc and logrealinc

Finally, in Figure 3 the mean and median of both realinc and logrealinc, here denoted in terms of purchasing power parities (PPPs), are shown. These PPP factors are computed relative to the EU-28 average.[6] For example, it implies that real household income in Spain is adjusted upwards as the purchasing power of the euro is relatively high in this country, compared to the EU-28 average. However, in the modelling part we will neglect PPPs as we will then analyze each country separately, so we will then just use logrealinc as defined earlier. Looking at Figure 3, we notice that real income in France rises significantly in 2008. This is due to a change in methodology for the collection of the amounts of social benefits: since 2008 these values are directly obtained from the three main benefit funds

[6] Source: http://appsso.eurostat.ec.europa.eu/nui/show.do?dataset=prc_ppp_ind&lang=en.

Figure 2: Proportion of households having difficulties or great difficulties in making ends meet (above) and proportion of households being able to face unexpected expenses (below) for all years and all countries.

Figure 3: Real equivalized disposable household income (above) and the log of real equivalized disposable household income (below) in terms of PPPs for all years and all countries (euros). Both the mean and median are shown.

4 Empirical models

This section describes the models that will be applied in the analysis. First, we briefly discuss a few issues that have arisen from our data inspection, which we have to take into account for our study. Next, we elaborate on the formulation of our modelling techniques. Furthermore, we explain how to interpret obtained results when using the presented methods.

4.1 Practical issues in our study

We already mentioned that it is not possible to fully exploit the longitudinal aspect of the EU-SILC data, which is the most important limitation of our study. This implies that we cannot make use of panel data models, such as fixed-effects or random-effects specifications. These models account for the individual effect of the respondent, who generally participated in the survey for several years. In contrast, we have to rely on cross-sectional models including dummy variables for each year. Hence, obtained standard errors should be viewed with care, as a considerable part of the sample of households is repeatedly included in the sample, which implies that observations are not independent. We will compute robust standard errors (White errors) in our estimation to correct for heteroscedasticity, as proposed by White (1980), although this will not solve the problem of clustering in observations. As a result, the sampling variation of the estimates will be underestimated. We will use a significance level of 1% when evaluating the estimation results.
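To make the distinction between the two corrections explicit, the variance estimators can be written out for the linear case; for probit and logit the same sandwich form holds with the scores replacing $x_i \hat{u}_i$. The notation ($X$ for the $N \times K$ regressor matrix, $\hat{u}_i$ for the residuals, $g$ for clusters of repeated observations on the same household) is introduced here for illustration only. The cluster-robust version is shown purely for contrast, as it would require linking households across years.

```latex
% Heteroscedasticity-robust (White) variance estimator
\widehat{V}_{\text{White}}(\hat{\beta})
  = (X'X)^{-1} \Big( \sum_{i=1}^{N} \hat{u}_i^{2}\, x_i x_i' \Big) (X'X)^{-1}

% Cluster-robust variance estimator (not available in this study)
\widehat{V}_{\text{cluster}}(\hat{\beta})
  = (X'X)^{-1} \Big( \sum_{g=1}^{G} X_g' \hat{u}_g \hat{u}_g' X_g \Big) (X'X)^{-1}
```

The White estimator sums over individual observations and thus ignores the cross-products between residuals of the same household in different years, which is why positively correlated repeated observations lead to understated standard errors.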

Next, we encountered a sharp increase in the French figures for household disposable income after 2007. This is the result of a change in methodology for the collection of French income components. It has consequences for comparability over the years and, also, among countries. Since disposable income is a key variable for the modelling of ability of making ends meet, this may lead to unreliable results for France. For this reason we will not consider the French data in the remainder of this paper; from now on we will focus on Germany, Italy, the Netherlands and Spain. Note that our study does not model the relation between household income and the various socioeconomic variables; income is considered to be exogenous in the modelling of our indicators for financial hardship. However, we expect income to be positively related with age and education, for example. Investigation of such relations is outside the scope of this paper; we are particularly interested in the dynamics of subjective measures of financial hardship for the various groups of the population.

Finally, from Figure 1 it can be seen that the dynamics in the average ability of making ends meet are limited. On the other hand, Figure 2 shows some interesting patterns which make it worthwhile to model difficulties in making ends meet and, also, the ability to face unexpected expenses. Hence, we will limit ourselves by only modelling the latter two binary variables.

4.2 Regression methods

4.2.1 Binary outcome models

In our study we consider binary outcome variables, which can be modelled in several ways. We will limit our modelling effort mainly to probit. However, we will use a logit model for the modelling of unexpexp for Germany. Also, we will compare the results with ordinary least squares (OLS) estimation by specifying a linear probability (LP) model, as this simpler technique will be useful for further analysis later on. These models contribute to the investigation of the effects of the recession on both difficulties in making ends meet and the ability to face unexpected expenses for European households, which are the first two sub questions in our study. Also, applying these models to the data will reveal whether our expectations concerning the effects of the crisis on the different countries are confirmed, e.g. whether it impacted the South more severely than the North. To briefly describe the modelling process, we follow the notation of Cameron and Trivedi (2005). We define our binary outcome variable difmakends, for example, as

$$\text{difmakends} = \begin{cases} 1 & \text{with probability } p, \\ 0 & \text{with probability } 1 - p. \end{cases} \qquad (1)$$

To model the probability $p$ in equation (1), we define a $K \times 1$ regression vector $x$ and a $K \times 1$ parameter vector $\beta$. For each country $c$ we have a sample $(\text{difmakends}_i, x_i)$, $i = 1, \ldots, N_c$, and assuming independence over $i$ we then model the corresponding probability $p_i$ as

$$p_i \equiv \Pr\big[\text{difmakends}_i = 1 \mid x_i\big] = F(x_i'\beta),$$

where $F(\cdot)$ is a function defined in such a way that $0 \le p_i \le 1$. In the case of a probit model, $F(\cdot)$ is defined to be the cumulative distribution function (cdf) of the standard normal distribution and in the logit model we define $F(\cdot)$ to be the logistic cdf.
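The two choices for $F(\cdot)$ can be illustrated with a short, self-contained sketch. The regressor values and coefficients below are invented for illustration and are not estimates from this study; the function names are hypothetical.

```python
import math


def probit_cdf(z):
    """Standard normal cdf, used as F in the probit model."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logit_cdf(z):
    """Logistic cdf, used as F in the logit model."""
    return 1.0 / (1.0 + math.exp(-z))


def predicted_probability(x, beta, cdf):
    """p_i = F(x_i' beta) for one observation."""
    index = sum(xk * bk for xk, bk in zip(x, beta))
    return cdf(index)


# Invented example: a constant and log income, with made-up coefficients
x_i = [1.0, 9.9]
beta = [4.0, -0.5]
p_probit = predicted_probability(x_i, beta, probit_cdf)
p_logit = predicted_probability(x_i, beta, logit_cdf)
```

Both functions map any value of the index $x_i'\beta$ into the unit interval, which is exactly the property required of $F(\cdot)$; the same coefficients yield different probabilities under the two models because the cdfs differ in scale and tail behaviour.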

In contrast, an LP model just specifies the identity function for $F(\cdot)$, i.e. $p_i = x_i'\beta$. The latter model can be easily estimated by OLS, and is easier to interpret. This method also has some disadvantages. First, the predicted probabilities may turn out to be greater than one or negative. Moreover, in order to perform hypothesis tests OLS assumes normality of the error terms, but this is not the case for the LP model; given any set of values for the regression vector $x_i$, the error term can only take two values, i.e. either $1 - x_i'\hat{\beta}_{OLS}$ or $-x_i'\hat{\beta}_{OLS}$. Finally, OLS assumes homoscedasticity, which is violated in the LP model; the binary nature of the outcome variable corresponds to a Bernoulli distribution, which implies that it has variance $p_i(1 - p_i) = x_i'\beta(1 - x_i'\beta)$, which is not constant across observations as it depends on the varying regressors $x_i$. This means that computed standard errors will be wrong, as well as hypothesis tests on the parameters. Nevertheless, OLS regression can be helpful as a first guess. As noted by Cameron and Trivedi (2005), in many applications the predicted probabilities will be between zero and one. Estimation of probit and logit models is done by the maximum likelihood (ML) method. Again, we refer to Cameron and Trivedi (2005) for details.
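The variance of the LP error term stated above follows directly from its two-point distribution. Writing $u_i = y_i - x_i'\beta$ for the error (a symbol introduced here for illustration) and using $p_i = x_i'\beta$, a short derivation is:

```latex
u_i =
\begin{cases}
  1 - p_i & \text{with probability } p_i, \\
  -p_i    & \text{with probability } 1 - p_i,
\end{cases}
\qquad
\mathrm{E}[u_i \mid x_i] = p_i(1 - p_i) - (1 - p_i)\,p_i = 0,

\mathrm{Var}[u_i \mid x_i]
  = p_i (1 - p_i)^2 + (1 - p_i)\, p_i^2
  = p_i (1 - p_i)\big[(1 - p_i) + p_i\big]
  = p_i (1 - p_i).
```

Since $p_i$ varies with $x_i$, the variance differs across observations, which is the source of the heteroscedasticity.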

4.2.2 Marginal effects

In our analysis we will look at marginal effects of the explanatory variables to investigate the particular influence of each regressor on the outcome variable. The marginal effect of a regressor in probit and logit models depends on the values of all regressors:

$$\frac{\partial p_i}{\partial x_{ij}} \equiv \frac{\partial F(x_i'\beta)}{\partial x_{ij}} = F'(x_i'\beta)\,\beta_j,$$

whereas in the LP model the marginal effect of the $j$th regressor just equals the constant $\beta_j$. Therefore, corresponding to our estimated probit and logit models we will compute for each regressor the average marginal effect (AME). The AME of the $j$th regressor takes the average of the computed marginal effects of regressor $j$ for all observations $i$:

$$\text{AME} \equiv \frac{1}{N} \sum_{i=1}^{N} \Big\{ \text{Estimated marginal effect of regressor } j \text{ for individual } i \Big\}.$$

The computation of the AME depends on the type of the regressor of interest. Let $y_i$ be our outcome variable for individual $i$, then the estimate of $\Pr[y_i = 1 \mid x_i]$, which equals $F(x_i'\hat{\beta}_{ML})$, is called the predictive probability. First, suppose that the $j$th regressor is a continuous variable. In this case the AME for the $j$th regressor is the instantaneous rate of change in the predictive probability, averaged over all observations $i$. In other words,

$$\text{Continuous regressors:} \quad \text{AME} = \frac{1}{N} \sum_{i=1}^{N} \lim_{\delta \to 0} \frac{\Pr\big[y_i = 1 \mid x_i, (x_{ij} + \delta)\big] - \Pr\big[y_i = 1 \mid x_i\big]}{\delta},$$

where $\Pr[y_i = 1 \mid x_i, (x_{ij} + \delta)]$ denotes the predictive probability for individual $i$ computed at its own values $x_{im}$, $m \neq j$, and for the $j$th regressor computed at $(x_{ij} + \delta)$. Also, $\Pr[y_i = 1 \mid x_i]$ is the predictive probability for individual $i$ computed at its own values $x_i$. Note that the marginal effect of a covariate changes with the value of that covariate; it holds exactly only at the value at which it is evaluated.
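For a probit model the limit in the formula above equals $\phi(x_i'\hat{\beta})\hat{\beta}_j$, where $\phi$ is the standard normal density, so the AME of a continuous regressor is simply the sample mean of this expression. The sketch below shows the computation for a made-up design matrix and made-up coefficients; it is not the estimation code used in this study, and the function names are hypothetical.

```python
import math


def normal_pdf(z):
    """Standard normal density, the derivative F' of the probit cdf."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def ame_continuous(X, beta, j):
    """Average marginal effect of continuous regressor j in a probit model:
    the mean over observations of phi(x_i' beta) * beta_j."""
    effects = []
    for x in X:
        index = sum(xk * bk for xk, bk in zip(x, beta))
        effects.append(normal_pdf(index) * beta[j])
    return sum(effects) / len(effects)


# Invented example: constant and log income for three households
X = [[1.0, 9.5], [1.0, 10.0], [1.0, 10.5]]
beta = [4.0, -0.5]  # made-up coefficients
ame = ame_continuous(X, beta, j=1)
```

In this invented example the AME is negative: a marginal increase in log income lowers the predictive probability, and each household contributes a different marginal effect because the density is evaluated at its own index value.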

The computation of the AME differs for categorical variables. Let the $j$th regressor be a factor variable with base level 0 and some higher levels $k$, $k = 1, \ldots, l$. Then the AME of regressor $j$ shows the average discrete change in the predictive probability if the regressor $x_j$ changes from the base level 0 to level $k$. In other words,

$$\text{Categorical regressors:} \quad \text{AME} = \frac{1}{N} \sum_{i=1}^{N} \Big[ \Pr\big[y_i = 1 \mid x_i, x_{ij} = k\big] - \Pr\big[y_i = 1 \mid x_i, x_{ij} = 0\big] \Big],$$

where $\Pr[y_i = 1 \mid x_i, x_{ij} = k]$ denotes the predictive probability for individual $i$ computed at its own values $x_{im}$, $m \neq j$, and at $x_{ij} = k$, regardless of its true level for regressor $j$. Similarly, $\Pr[y_i = 1 \mid x_i, x_{ij} = 0]$ denotes the predictive probability for individual $i$ computed at its own values $x_{im}$, $m \neq j$, and at $x_{ij} = 0$. The same interpretation holds if the regressor is a dummy variable, which only has two levels, i.e. $k \in \{0, 1\}$. For example, a dummy variable that has an AME of 0.15 implies that on average the predictive probability increases by 15 percentage points if the dummy changes from 0 to 1, ceteris paribus.
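The discrete-change computation for a dummy regressor can be sketched in the same way. Again this is a minimal illustration for a probit model with invented data and coefficients, and the function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cdf (probit link)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ame_dummy(X, beta, j):
    """Average discrete change in the predictive probability when dummy j
    is switched from 0 to 1, holding all other regressors of each
    observation at their own values."""
    total = 0.0
    for x in X:
        x_one = list(x)
        x_one[j] = 1.0   # counterfactual: dummy set to one
        x_zero = list(x)
        x_zero[j] = 0.0  # counterfactual: dummy set to zero
        p_one = normal_cdf(sum(a * b for a, b in zip(x_one, beta)))
        p_zero = normal_cdf(sum(a * b for a, b in zip(x_zero, beta)))
        total += p_one - p_zero
    return total / len(X)


# Invented example: constant and one dummy (e.g. unemployed)
X = [[1.0, 0.0], [1.0, 1.0]]
beta = [0.0, 1.0]  # made-up coefficients
ame = ame_dummy(X, beta, j=1)
```

Note that both counterfactual probabilities are computed for every observation, regardless of the observed value of the dummy, as described above.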

4.2.3 Year dummy variables and interactions


Given significance of year dummy variables, even more relevant for our study could be interaction terms between the year dummy variables and some of our regressors that typically characterize the household. This reveals whether the various socioeconomic groups went through the financial crisis differently, in terms of our measures for financial hardship. Hence, such an investigation will answer the third sub question of our study. Moreover, this part of the analysis will show whether our expectations regarding the different effects of the recession on particular subgroups are confirmed, e.g. whether it had more consequences for younger households than for the elderly. To this end we choose the categorical variables age, marital status, education, activity status, household composition and the dummy variable indicating home ownership. We combine one of these variables at a time with year, and then again perform a Wald test to test the joint significance of the interaction terms. To illustrate, consider marital status. This variable includes the categories single, separ_divorc, widowed and the reference category partner. Recalling that the base level for year is 2007, in this example the following interaction terms will be included in our model:

year04 × single, year04 × separ divorc, year04 × widowed,
year05 × single, year05 × separ divorc, year05 × widowed,
year06 × single, year06 × separ divorc, year06 × widowed,
year08 × single, year08 × separ divorc, year08 × widowed,
year09 × single, year09 × separ divorc, year09 × widowed,
year10 × single, year10 × separ divorc, year10 × widowed,
year11 × single, year11 × separ divorc, year11 × widowed,
year12 × single, year12 × separ divorc, year12 × widowed.

Note that we still include regressors for year and marital status in the model. For example, besides the effect of the year dummy of 2009 and that of widowed, the inclusion of year09 × widowed shows a specific effect for a widowed household head in 2009 with respect to a household head who lived together with a partner in 2007. If the coefficients of the interaction terms for a particular interaction are jointly significant, we will compute predictive probabilities for the outcome variable. To present easily interpretable results, these predictive probabilities will be plotted for each category of the covariate that is interacted with year, including the reference category of each interaction. This provides a clear overview of potential differences between the various household types during the crisis.
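The construction of the interaction terms for marital status can be sketched as follows. This is only an illustration: the label spellings (such as `separ_divorc` and the `_x_` separator) and the helper `interaction_row` are assumptions made for the example, not the variable names used in the estimation.

```python
from itertools import product

years = ["04", "05", "06", "07", "08", "09", "10", "11", "12"]
marital = ["partner", "single", "separ_divorc", "widowed"]  # assumed labels
base_year, base_status = "07", "partner"

# Interaction terms are formed only for non-reference levels of both variables.
interaction_names = [
    f"year{yr}_x_{status}"
    for yr, status in product(years, marital)
    if yr != base_year and status != base_status
]


def interaction_row(year, status):
    """Return the 0/1 values of all interaction dummies for one household head."""
    return [int(name == f"year{year}_x_{status}") for name in interaction_names]


# Number of terms: 8 non-base years times 3 non-base categories.
print(len(interaction_names))
```

A household head observed in the base year or in the reference category has zeros for all interaction dummies, so its outcome is described by the main effects alone.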

4.2.4 Blinder-Oaxaca decomposition

In the last part of the analysis we will look at a specific part of our sample to compare estimation results for a pre-crisis year and a post-crisis year. It is relevant to investigate whether the models estimated for the separate years yield different results; this may suggest a shock effect of the crisis on the determinants of financial hardship at the household level. The methods described below will contribute to answering our fourth sub question. Moreover, the results will show whether our expectations regarding this point are confirmed, e.g. whether we find structural breaks in the parameters only in the Southern countries. In order to choose two particular years, we again refer to Table 1. We consider the crisis still to be present in Europe in 2012, the most recent year for which we have data available. In this year the European debt crisis impacted government budgets and economic progress; real GDP again declined for the Euro area as a whole and unemployment stayed at high levels. For this reason we choose 2010 as the post-crisis year, in which all economies showed recovery from the severe financial crisis of 2008. The pre-crisis year is chosen to be 2007, the year preceding the crash. Of course, the results obtained later on may depend on these choices, but we consider them to be the most appropriate for our purposes. Subsequent studies could prefer to pick 2014 or 2015 as the post-crisis year, for which data are not available yet. The methods described below will be based on the linear probability model, i.e. applying OLS.

First, we estimate separate models for 2007 and 2010 (without interactions) and compute robust standard errors. Subsequently, we test whether there is evidence for a structural break. This can be done by performing a Chow test which amounts to testing whether the corresponding coefficients of both models are equal, excluding the constant term. Note that we restrict the variance of the error terms here to be equal for both samples. We can also include both sub samples into one regression model by writing

$$y = X\beta_7 + Z\gamma + u \equiv \begin{pmatrix} X_7 \\ X_{10} \end{pmatrix}\beta_7 + \begin{pmatrix} O \\ X_{10} \end{pmatrix}(\beta_{10} - \beta_7) + u, \qquad u \sim (0, \Omega), \tag{2}$$

where the outcome vector $y$ contains the observations of either difmakends or unexpexp for both 2007 and 2010. We assume here that both $X$ and $Z$ are predetermined. The matrices $X_7$ and $X_{10}$ contain exactly the same regressors, but they correspond to observations for 2007 and 2010, respectively. Note that for our large data set we do not need the assumption of normality of the error term. The variance-covariance matrix $\Omega$ allows for heteroscedasticity, i.e. $\Omega$ is diagonal but the elements on the diagonal may differ from each other.
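The stacked formulation in equation (2) can be illustrated with a small sketch showing that the coefficient on the block $Z$ equals the difference between the coefficients of the separate regressions. The toy data, the helper functions `solve` and `ols`, and the use of a single dummy regressor are assumptions for illustration only; the sketch does not compute the robust Wald statistic itself.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical data: constant and one dummy regressor, binary outcome.
rows7 = [[1.0, 0.0], [1.0, 0.0], [1.0, 1.0], [1.0, 1.0]]
y7 = [0.0, 1.0, 0.0, 0.0]
rows10 = [[1.0, 0.0], [1.0, 0.0], [1.0, 1.0], [1.0, 1.0]]
y10 = [0.0, 1.0, 1.0, 1.0]

b7 = ols(rows7, y7)
b10 = ols(rows10, y10)

# Stacked design: [X7, O] for the 2007 rows and [X10, X10] for the 2010 rows.
stacked = [r + [0.0, 0.0] for r in rows7] + [r + r for r in rows10]
coef = ols(stacked, y7 + y10)
beta7_stacked, gamma = coef[:2], coef[2:]
print(gamma, [c10 - c7 for c7, c10 in zip(b7, b10)])
```

Testing $\gamma = 0$ in the stacked model is therefore a test of equal coefficients in the two years.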

Now, performing a Wald test on $\gamma$, i.e. testing $\beta_{10} - \beta_7 = 0$, amounts to the Chow


part would indicate potential wage discrimination between the groups of study. However, the latter may also include the effects of unobserved variables which were not included in the regression model. Makepeace, Paci, Joshi, and Dolton (1999) extended the framework to investigate changes in the wage gap over time. In contrast, we will apply it here by comparing our household samples of both years for our indicators of financial hardship, defining the sample of 2007 to be the first group and the sample of 2010 to be the second group. In the following we outline the decomposition method which will be applied in the analysis; for a more thorough discussion of the various decomposition approaches and their corresponding estimation procedures we refer to Jann (2008).

To begin with, we define $y_7$ and $y_{10}$ to be the dependent variable for the household sample of 2007 and 2010, respectively (i.e. either difmakends or unexpexp), and we assume a linear probability model for each period. Suppressing the index for each individual, we can write

$$y_{yr} = x_{yr}'\beta_{yr} + u_{yr}, \qquad yr \in \{7, 10\}, \tag{3}$$

where $x_{yr}$ is the regressor vector (including a constant) which contains the same predictors for both years, as before. Also, the error $u_{yr}$ has zero expectation. We then have for the mean outcome difference $R$,

$$R = E(y_{10}) - E(y_7) = E(x_{10})'\beta_{10} - E(x_7)'\beta_7, \tag{4}$$

where we note that the regressors are assumed to be stochastic, which has consequences for the estimated standard errors to be discussed later on. To examine the contribution of differences in the regressors over the years to the total outcome difference, we apply a two-fold decomposition by rewriting equation (4) as

$$R = \big[E(x_{10}) - E(x_7)\big]'\beta_{10} + E(x_7)'(\beta_{10} - \beta_7), \tag{5}$$

where the first term refers to the contribution of the group differences in the regressors and the second term is the unexplained part. Note that the decomposition in equation (5) weights the unexplained part by the characteristics of 2007, for which we also could use the characteristics of 2010. In that case equation (4) is decomposed as

$$R = \big[E(x_{10}) - E(x_7)\big]'\beta_7 + E(x_{10})'(\beta_{10} - \beta_7). \tag{6}$$

This is referred to as the index number problem; results may differ between the decompositions in equations (5) and (6). We will estimate both decomposition forms in the analysis and examine to what extent this impacts the results. Focusing on the decomposition in (5) for now, estimation of the components of the decomposed model uses the OLS estimates $\hat\beta_7$ and $\hat\beta_{10}$ obtained from the separate models shown in equation (3). Note here that we again allow for heteroscedasticity, which implies that White standard errors are computed for the separate models. Furthermore, we estimate $E(x_7)$ and $E(x_{10})$ by $\bar{x}_7$ and $\bar{x}_{10}$, respectively, which are the vectors of group means of the regressors. Consequently, the differential $R$ can be estimated by

$$\hat{R} = \bar{y}_{10} - \bar{y}_7 = (\bar{x}_{10} - \bar{x}_7)'\hat\beta_{10} + \bar{x}_7'(\hat\beta_{10} - \hat\beta_7). \tag{7}$$
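To make the two-fold decomposition and the index number problem concrete, the following minimal sketch computes the explained and unexplained parts under both weightings. The group means and coefficients are hypothetical numbers chosen for illustration, and the function name `twofold` is an assumption.

```python
def dot(a, b):
    """Inner product of two equally long vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def twofold(xbar7, xbar10, b7, b10, weight_by_2007=True):
    """Return (explained, unexplained) parts of R = xbar10'b10 - xbar7'b7."""
    dx = [x10 - x7 for x7, x10 in zip(xbar7, xbar10)]
    db = [c10 - c7 for c7, c10 in zip(b7, b10)]
    if weight_by_2007:
        # Form of equations (5) and (7): unexplained part weighted by 2007 means.
        return dot(dx, b10), dot(xbar7, db)
    # Form of equation (6): unexplained part weighted by 2010 means.
    return dot(dx, b7), dot(xbar10, db)


# Hypothetical group means: constant, share unemployed, share home owners.
xbar7 = [1.0, 0.08, 0.70]
xbar10 = [1.0, 0.14, 0.68]
# Hypothetical linear probability model coefficients for 2007 and 2010.
b7 = [0.20, 0.25, -0.10]
b10 = [0.22, 0.30, -0.12]

R = dot(xbar10, b10) - dot(xbar7, b7)
expl_a, unexpl_a = twofold(xbar7, xbar10, b7, b10, weight_by_2007=True)
expl_b, unexpl_b = twofold(xbar7, xbar10, b7, b10, weight_by_2007=False)
print(R, (expl_a, unexpl_a), (expl_b, unexpl_b))
```

Both weightings add up to the same total difference, but they split it differently between the explained and the unexplained part.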

In order to investigate the significance of each component of the estimated decomposition in (7), the assumption of stochastic regressors becomes relevant. As also noted by Jann (2008), this implies that both the group means $\bar{x}_{yr}$ and the estimators of the regression coefficients $\hat\beta_{yr}$ have sampling variances, which have to be taken into account to obtain consistent standard errors for the decomposition results. Note that we assume that $\bar{x}_{yr}$ and $\hat\beta_{yr}$ are uncorrelated, which is true if the model is correctly specified. In that case one can prove that the following holds for the variance of the product of two uncorrelated random vectors $\bar{x}$ and $\hat\beta$:

Lemma 1:

$$V(\bar{x}'\hat\beta) = E(\bar{x})'V(\hat\beta)E(\bar{x}) + E(\hat\beta)'V(\bar{x})E(\hat\beta) + \operatorname{trace}\big\{V(\bar{x})V(\hat\beta)\big\}, \tag{8}$$

where $V(\bar{x})$ and $V(\hat\beta)$ are the variance-covariance matrices for $\bar{x}$ and $\hat\beta$. For a proof of this result, see Jann (2005). In our analysis, as an estimate for $V(\hat\beta_{yr})$ we use the estimated variance-covariance matrix obtained from the individual OLS regressions for 2007 and 2010. Also, $V(\bar{x}_{yr})$ can be estimated by $\mathcal{X}'\mathcal{X}/\{n(n-1)\}$, where $\mathcal{X}$ is the centered-data matrix $X_{yr} - 1\bar{x}_{yr}'$. Furthermore, note that the last term in equation (8) vanishes asymptotically, so it will be ignored in the computations. Now, assuming independence of the two groups$^8$ and noting that the variance of the sum of two uncorrelated random variables is the sum of the variances, the variances of the two terms of the decomposition on the right-hand side of equation (7) are estimated by

$$\hat{V}\big\{(\bar{x}_{10} - \bar{x}_7)'\hat\beta_{10}\big\} \approx (\bar{x}_{10} - \bar{x}_7)'\hat{V}(\hat\beta_{10})(\bar{x}_{10} - \bar{x}_7) + \hat\beta_{10}'\big\{\hat{V}(\bar{x}_{10}) + \hat{V}(\bar{x}_7)\big\}\hat\beta_{10},$$

and

$$\hat{V}\big\{\bar{x}_7'(\hat\beta_{10} - \hat\beta_7)\big\} \approx \bar{x}_7'\big\{\hat{V}(\hat\beta_{10}) + \hat{V}(\hat\beta_7)\big\}\bar{x}_7 + (\hat\beta_{10} - \hat\beta_7)'\hat{V}(\bar{x}_7)(\hat\beta_{10} - \hat\beta_7).$$

Note that in the first of the two equations above we used $\hat{V}(\bar{x}_{10} - \bar{x}_7) = \hat{V}(\bar{x}_{10}) + \hat{V}(\bar{x}_7)$, which is a strong assumption because of the structure of our dataset; see footnote 8.
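The two approximate variance expressions for the explained and unexplained components can be evaluated as in the sketch below. All inputs are hypothetical (two regressors: a constant and one covariate), the function names are assumptions, and the trace term of Lemma 1 is ignored, as in the text.

```python
def quad(v, m):
    """Quadratic form v' M v for a vector v and a square matrix M."""
    n = len(v)
    return sum(v[i] * m[i][j] * v[j] for i in range(n) for j in range(n))


def mat_add(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def component_variances(xbar7, xbar10, b7, b10, vb7, vb10, vx7, vx10):
    """Approximate variances of the explained and unexplained parts of (7)."""
    dx = [x10 - x7 for x7, x10 in zip(xbar7, xbar10)]
    db = [c10 - c7 for c7, c10 in zip(b7, b10)]
    var_explained = quad(dx, vb10) + quad(b10, mat_add(vx10, vx7))
    var_unexplained = quad(xbar7, mat_add(vb10, vb7)) + quad(db, vx7)
    return var_explained, var_unexplained


# Hypothetical group means and coefficients (constant, one covariate).
xbar7, xbar10 = [1.0, 0.08], [1.0, 0.14]
b7, b10 = [0.20, 0.25], [0.22, 0.30]
# Hypothetical covariance matrices of the coefficient estimators.
vb7 = [[4e-4, -1e-4], [-1e-4, 9e-4]]
vb10 = [[4e-4, -1e-4], [-1e-4, 1.6e-3]]
# Hypothetical covariance matrices of the group means (the constant has none).
vx7 = [[0.0, 0.0], [0.0, 1e-5]]
vx10 = [[0.0, 0.0], [0.0, 2e-5]]

var_expl, var_unexpl = component_variances(
    xbar7, xbar10, b7, b10, vb7, vb10, vx7, vx10
)
se_expl, se_unexpl = var_expl ** 0.5, var_unexpl ** 0.5
print(se_expl, se_unexpl)
```

The first term of each expression reflects the sampling variance of the coefficients and the second that of the group means; dropping the second term would correspond to treating the regressors as fixed.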

Looking again at equation (7), note that the explained part of the decomposition can also be split into the individual contributions of each regressor, as the total explained differential is just a sum over the individual contributions. For the residual part this is only meaningful for variables that have a natural zero point; contributions to the residual part may depend on arbitrary scale shifts. For an illustration, see Jann (2008). Finally, we have to mention that the results for the individual contribution of categorical variables depend on the choice of the base category. However, as noted by Yun (2005), this can be solved relatively easily by restricting the coefficients of the categories of such a variable to sum to zero. This way, the choice of the base level does not influence the results; in fact, one computes a simple average of the decomposition results in which each category is used in turn as the base level. The latter approach will be applied in the analysis.
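The sum-to-zero restriction can be sketched as a simple transformation of the dummy coefficients: subtract the mean of the category coefficients (with the base category entering as zero) from each coefficient and add it to the intercept. The coefficient values, the category labels and the function name `normalize` below are hypothetical and only illustrate that the transformed coefficients no longer depend on the base category.

```python
def normalize(intercept, coefs):
    """Transform dummy coefficients so that the category effects sum to zero.

    `coefs` maps every category, including the base (coefficient 0.0),
    to its coefficient. Returns the adjusted intercept and coefficients.
    """
    mean = sum(coefs.values()) / len(coefs)
    return intercept + mean, {k: v - mean for k, v in coefs.items()}


# Hypothetical estimates with "partner" as the base category.
icpt_a, norm_a = normalize(
    0.20,
    {"partner": 0.0, "single": 0.05, "separ_divorc": 0.12, "widowed": 0.03},
)

# The same model re-expressed with "single" as the base category.
icpt_b, norm_b = normalize(
    0.25,
    {"partner": -0.05, "single": 0.0, "separ_divorc": 0.07, "widowed": -0.02},
)

print(icpt_a, norm_a)
print(icpt_b, norm_b)
```

Both parameterizations lead to the same intercept and the same normalized category effects, so the detailed decomposition built on them is invariant to the choice of the base level.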

8
