
Target-earners and theory-burners : investigating how transitory wage changes influence the decision to work


Academic year: 2021


Target-earners and theory-burners:

Investigating how transitory wage changes influence the decision

to work.

Master thesis

University of Amsterdam – Faculty of Economics and Business Supervisor: prof. dr. Offerman


Summary

In this thesis, I describe an experiment to investigate the behavioural response to transitory wage variation. The direct inspiration for this was an article by Camerer et al. (1997), who found that cab drivers worked longer on days when their wage was lower. The literature explored is inconclusive on the sign and size of the elasticity of labour time supplied in response to transitory wage variation. The experiment was unfortunately unsuccessful in reliably estimating the behavioural responses. Nevertheless, the subject should be researched more thoroughly and accurately, as it is one of the most basic variables on which our economy and our understanding of economic agents are built.


Index

Summary
Introduction
Literature review
Experimental set-up
Result Analysis
Conclusions
Literature
Appendix 1: Instructions


Introduction

One of the main theorems in economics is that people work for money. Or, in more economic terms, we are all mainly motivated by our wage to supply labour. Increase our wage and our labour supply will follow, is the general assumption in neoclassical economics. We might however behave differently, especially in the short term, as other variables influence our motivation: the weather, our health, the current status and quality of our social life, our intrinsic motivation for our job, the friendliness of our clients and co-workers; the list goes on. Measuring the change in labour supply as a result of a change in wage can thus be a messy job, especially when using data from the everyday, real-world context, which contains a lot of noise. Despite being a messy job, it is a very useful one. First of all, economics should be able to explain one of its most central theorems and back it up with evidence, to strengthen economics as a science and our understanding of it. In addition, labour is a central theme in economics, as many people have or want a job, or an income. Being able to explain what exactly people want, and why, is crucial when we want to base our policies on economic theory. Or as Farber (2008) puts it:

"There are important reasons to understand how these considerations affect estimation of labor supply elasticities. Evaluation of much government policy regarding tax and transfer programs depends on having reliable estimates of the sensitivity of labor supply to wage rates and income levels. To the extent that individuals’ levels of labor supply are the result of optimization with reference-dependent preferences, the usual estimates of wage and income elasticities are likely to be misleading".

Most contracts in western societies hardly allow for transitory wage changes, but lately one can see a trend towards more flexible contracts and self-employment. For example, in the Netherlands some 12% of workers are now self-employed without employees (Dutch: zzp), and another 21% work on a flexible contract, versus 8% and 15% respectively in 2003 (Ramaekers, 2015; the percentages are for 2003 and 2013). We can also spot an increase in flexible contracting via smartphone apps such as Uber. This is a taxi app that allows a driver to sign on and off as (s)he pleases, by simply (de)activating the app, providing absolute flexibility. This example is especially interesting since the seminal article on this topic by Camerer et al. (1997) also deals with taxi drivers.

People who are self-employed or on flexible contracts might very well be represented by the theory we are looking for. There are more applications one can think of. In more informal economies, contracts are hard to enforce, if they exist at all. Workers being hired per day or even per hour are no exception. For these economies, it might prove very fruitful to understand how people respond to these changes, so we can advise governments, businesses and workers on how to get what they want.

Taking all these arguments together, it seems more than reasonable to put serious effort into gathering evidence on labour supply elasticity in a variety of contexts.

A few attempts have been made to establish short-term supply elasticities, which will be discussed in detail in the chapter Literature review. One of the most influential of these studies is that of Camerer et al. (1997), who found an interesting contradiction of the main theorem. They found that New York taxi drivers seem to work less on good days, when they earn more per hour. Camerer et al. explained these data using behavioural economics, stating that taxi drivers make decisions one day at a time and set an income target each day. When they reach the target, they quit, which results in quitting earlier when wages are higher. This contradicts the neoclassical theory of life-cycle models, which assumes that people substitute between points in time to maximize the utility of both their income and their leisure. According to this theory, people should work more on a day with higher pay and stop working early when the pay is low, so that their moments of leisure correspond with the moments they earn least. This model should work especially well for taxi drivers, as they encounter "transitory" wage changes between and within days, yet little to no structural variation. Transitory changes are generally defined as wage changes that last only a short time and do not have a structural impact on income, but definitions do vary throughout the literature. According to the life-cycle model, this transitory variation should nevertheless influence when someone takes his leisure time if he has a choice, as the taxi drivers do. The contradiction of this model sparked more research and field experiments, but the evidence remains inconclusive as the results differ greatly. Farber (2005; 2008), who used mainly the same data as Camerer et al., came to very different conclusions by using a different model specification. New data have not provided definitive insights so far either: Fehr and Goette (2007) conducted a randomized field experiment with bike messengers to test the hypotheses stated by Camerer et al., by temporarily increasing the wage of the messengers. They found some evidence which contradicts Camerer et al., though their set-up was slightly different.

Given the impact of such differences on the results, a lab experiment seems called for. In a lab experiment the wage variations can be controlled with great precision and many noise variables can be excluded. In such an experiment, we vary the wage, both within days and between days, and the job can be made very consistent. A more detailed description can be found in the chapter Experimental Set-up. Using such a lab experiment, we can closely monitor the effect of wage on labour supply. Furthermore, a highly influential variable in the research of Camerer et al. seems to be the investment which cab drivers have to make at the start of the day, week or month, as they rent the taxi. Investment is suspected to have a strong effect on their target-setting behaviour and has quite an influence on the model specifications. Therefore, this should also be an independent variable in the experiment next to the wage.

Analysing the data of this experiment should yield a more reliable estimator of the wage effect on labour supply, though some caution will remain necessary as real life situations might be different and could contain interdependent variables which are not accounted for in the experiment. That being said, within this thesis the data from the experiment will be analysed to conclude on the effect that wages have on labour supply. The role that investment plays within labour supply decisions will also be investigated.


Literature review

In this thesis I will start to measure behavioural responses to variations in wages. To properly measure such responses, it is vital to clearly define both the source and nature of variation, as well as the relevant response. To make sure the measurements are empirically sound as well as theoretically relevant, I will present a theoretical framework in which these measures are generally operationalised. This will provide guidance for both a proper definition, as well as how to shape the experiment around the measurement. The theoretical framework is mostly based on models as described by Friedman and MaCurdy. I will also give a detailed overview of the attempts so far to empirically estimate short-term labour supply elasticity, the most seminal of which is the article by Camerer et al. (1997).

Origins of labour supply models

The model referred to in most estimates of labour supply elasticity is the life-cycle model (LCM). However, the roots of the LCM lie in the Permanent Income Hypothesis (PIH). The PIH is the theoretical backbone of most current income and consumption theories; the term was coined, and the model explained, by Friedman (1957). Friedman developed it as a response to the consumption expenditure function developed by Keynes (1937), around the time that Modigliani and Brumberg (1954) presented their life cycle theory of consumption. Though the PIH is contested with regard to empirical evidence, many economists assume it to be theoretically relevant, and use it as a starting point for their own models. The PIH mainly states that income consists of two parts: permanent income and transitory income. The literature shows a lot of discussion on what exactly is permanent and what is transitory, which I will elaborate on later. The difference between the two kinds of income is used to explain differences in responses to income changes. The household or consumer is expected to have some meaningful notion of permanent income, which would be the expected or average long-term income. Transitory income, on the other hand, consists of short-term deviations from the normal trend in income. The main statement of the PIH is that consumption levels are based on permanent income, rather than transitory income. This means that transitory income variations (both gains and losses) are spread out over the household's lifetime and thus have a limited impact on day-to-day consumption.

This last implication positions the PIH as an extension to Keynes' (1937) consumption models. These assume a constant Marginal Propensity to Consume (MPC) and use only current income, largely disregarding future income. This makes them hard to apply, and seemingly irrelevant, to the analysis of labour decisions, which are generally made with at least one eye towards the future. Friedman's PIH, on the other hand, predicts that windfall income is treated differently from permanent income, i.e. it has a different MPC and is spread out over a longer term, and thus transitory changes have a limited direct effect on consumption and demand.

From the disagreement of two highly esteemed and oft-cited economists, we can already see the importance of these basic assumptions and specifications. The implications for government policies are also evident: whether during hard times the government should spend all they can or cut taxes (Keynes) or whether that might not be so useful (Friedman). This latter debate has resurfaced the past few years as economists, including renowned economists such as Krugman (2014), have urged governments to increase stimuli to combat the global recession, referring back to the original works of Keynes.

To be able to differentiate between transitory and permanent income, it is important to clearly define both concepts. Unfortunately, there is a lot of discussion on this in the literature; even Friedman himself has given different definitions. Chao (2003) gives a clear overview of the development of Friedman's theories in his own work and in the work of others. Starting with Friedman's own work, Chao identifies three definitions of permanent income as Friedman's work evolved over time and gives the following overview:

F1 Permanent income is affected by the permanent components whose effects on income last more than a certain period of time (Friedman and Kuznets, 1945)

F2 Permanent income is the amount that a consumer unit can consume while wealth remains intact (Friedman, 1957).

F3 Permanent income is estimated by a weighted pattern of past income (Friedman, 1957, 1963b).

Chao also gives the definitions shown by Mayer (1972), who in turn extensively reviewed Friedman's work:

M1 Permanent income is whatever the household thinks it is.

M2 Permanent income is equal to the household's wealth times the relevant discount rate.

M3 Permanent income is an exponentially weighted average of past incomes plus a trend.

In his work, Friedman also often differentiated between permanent and transitory income from a more econometric perspective, by describing permanent income as the measured income, and transitory income as the difference between actual and measured income (Friedman, 1957). All in all, quite some discussion has taken place regarding the definition of permanent versus transitory income, also in more recent work. There is no undisputed definite answer, and the definition used is often derived from personal taste or the scientific context. This in turn implies that anyone who wishes to use these terms owes it to his or her readers to state the definition as it is operationalised in the model or theory at hand.
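To make one of these definitions concrete, the sketch below operationalises definition M3 (an exponentially weighted average of past incomes plus a trend) and treats transitory income as the gap between current income and that estimate. The function names, the decay parameter and its default value are illustrative assumptions, not taken from Friedman, Mayer or Chao.

```python
def permanent_income(past_incomes, decay=0.6, trend=0.0):
    """Exponentially weighted average of past incomes plus a trend (definition M3).

    past_incomes is ordered oldest first. The decay value is an assumed,
    purely illustrative weight: income k periods back gets weight decay**k.
    """
    if not past_incomes:
        raise ValueError("need at least one past income observation")
    if not 0.0 < decay < 1.0:
        raise ValueError("decay must lie strictly between 0 and 1")
    recent_first = list(reversed(past_incomes))
    weights = [decay ** k for k in range(len(recent_first))]
    weighted_avg = sum(w * y for w, y in zip(weights, recent_first)) / sum(weights)
    return weighted_avg + trend


def transitory_income(current_income, past_incomes, decay=0.6, trend=0.0):
    """Deviation of current income from the permanent-income estimate."""
    return current_income - permanent_income(past_incomes, decay, trend)


if __name__ == "__main__":
    history = [100.0, 100.0, 100.0]          # hypothetical stable income history
    print(permanent_income(history))          # about 100
    print(transitory_income(130.0, history))  # about 30: a one-off windfall
```

Under a different definition (for example M1 or M2) the same income stream would be split differently, which is why the definition in use needs to be stated explicitly.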

Recent adaptations

Friedman's work has been adapted and expanded by many people. Most relevant for the subject of this thesis are the contributions of MaCurdy (1981) and Lucas & Rapping (1969). Starting with Lucas & Rapping, their main aim at the time of writing was to reconcile two models describing the relation between labour supply and wages. The first model is the neoclassical growth model, stating that labour supply is only moderately influenced by the wage, as labour is an inelastic function of the real wage. The other model, from the short-run Keynesian employment theory, holds that labour responds strongly to wages: labour is infinitely elastic at a mostly rigid wage (Lucas & Rapping, 1969). Lucas and Rapping attempt to support the synthesis they propose with empirical evidence, which might be their strongest point, or at least the most admirable. In fact, after mentioning the lack of empirical evidence for the aforementioned models, they state: "Despite this lack of evidence, economists have found it necessary to proceed on the basis of certain widely accepted assumptions".

Lucas and Rapping conclude with an intertemporal model to reconcile both approaches. The model is mainly designed for the aggregate level, but founded on microeconomic principles. As the complete model is not directly relevant to the subject at hand, I will only describe the parts concerning the labour supply. Nevertheless, I strongly recommend anyone to read their complete article, as it is very thorough and accurate.

The model describes the intertemporal substitution function between current goods and leisure and future goods and leisure. Labour supply (or the choice for goods over leisure) depends mostly on prices and on current and expected real wages. Expectations around the real wage result in a differentiation between permanent and transitory wages. The "permanent wage" is seen as the normal or expected wage. The market sometimes deviates from this wage, resulting in a transitory wage, which is expected to soon return to the normal, permanent level. When the transitory wage is above the permanent wage (unusually high wages), the model predicts, and their estimates confirm, an increase in labour supply, and vice versa. Do note that, again, the terms permanent and transitory wages are used somewhat differently than by Friedman, despite references to his work.

Much like Lucas and Rapping, MaCurdy (1981) positions his article as an extension of Friedman’s permanent income hypothesis. He also describes an intertemporal substitution model with consumption, wages and leisure as its main variables. However, MaCurdy extends this from basically a two-period model, as in Lucas & Rapping, to a multi-period model, resulting in the life-cycle model, and focusses on the labour supply decisions within this model. The model describes the rise and fall of wealth during a consumer's lifetime, and the variation of wages and leisure time during that lifetime. It basically states the well-known pattern of consumption smoothing: consumption levels should vary little on a short time scale. As income starts low, grows to a maximum, and then stops at the time of retirement, consumers would optimise their lifetime utility by borrowing at the start of life, then paying off and saving as their income grows, and dissaving once they retire. This means that one consumes well above income during the first and the final periods of life, and below income during working life. It is interesting to note that MaCurdy emphasizes the potential of a testable labour supply model within a life cycle model, but does not refer to Modigliani and Brumberg (1954), who already had a strong focus on the life cycle setting when they presented their adaptation of Keynes' models.

MaCurdy sets out to make empirical estimates of the elasticities of labour supply in response to wage increases. Though quite some assumptions and transformations are necessary to obtain a testable equation, he ends up showing with annual wage panel data that most elasticities are small but positive. So when the wage profile along the whole lifetime is shifted up (wages increase for someone’s entire life, say due to an increase in education), labour supply increases, and labour supply also responds positively to transitory wage increases. For the year-on-year growth of wages a positive elasticity is estimated as well. Only when the steepness of the whole wage profile increases are the results ambiguous: there is a decline in labour supply early on and an increase in later periods.

Like the permanent/transitory wage article of Lucas and Rapping, this article mostly supports (semi-)rational intertemporal substitution, or simply a positive relation between wages and labour supply. MaCurdy, just like Lucas and Rapping, uses annual panel data to estimate relations between wages and labour supply. This overview of the literature thus provides evidence for a positive relation between wages and labour supply in the medium to long term. Both articles also supply interesting models to test intertemporal preferences. What remains challenging is the estimation of expected wages, and gaining insight into short-term labour supply.

Lucas and Rapping did find evidence for the positive relation between wages and labour supply in their panel data of the US labour market. Similar econometric articles also generally find a positive relation in panel data; see for example Altonji (1986). However, these data are generally at least quarterly and often annual. This implies that these models might be valid for the medium to long term, but offer little evidence for the short to very short term. Despite this lack of evidence, models of this type are used to support policy decisions and are part of the economic theory knowledge base. Given these applications, they require empirical validation for the short term just as well, as Lucas & Rapping pointed out.


Experimental approaches to labour supply modelling: Camerer et al.

There have been a few attempts to establish the relation between wages and labour supply in the short term. Most of these attempts have been (natural) experiments. The seminal article on experiments with transitory wage variations is that of Camerer et al. (1997). Camerer et al. draw mostly on MaCurdy (1981) and Lucas & Rapping (1969) to establish their theoretical model. They mostly accept the life cycle model as the theory they want to test, specifically for transitory wage variations. Though Camerer et al. do not explicitly define their use of transitory, they imply that they use it for inter-day variations. This would mean that the transitory wage is the difference between the current wage and the average wage over some (multi-day) period.

To test this theory, they use data on New York taxi drivers. They collected three samples of trip sheets from two companies to create a database. From there, they establish the difference between “good” days with high wages and “bad” days with relatively low wages. In practice, the latter means taxi drivers spent more time waiting or cruising to find fares, thus making less money per hour. Note that revenues per kilometre, per time unit and per trip start are constant, so wage differences derive only from the time between fares. Since the trip sheets show the number and value of fares, as well as the start and end time of a cab driver's shift, Camerer et al. have all the information they need to test the difference between good and bad days in time worked, i.e. labour supplied.

They also assume and empirically confirm autocorrelation within days, but hardly any between days. This means that an experienced taxi driver hardly knows in advance whether it will be a good day, but can tell soon after starting. Furthermore, the earnings structure comes in a few variants, determined by ownership of the taxi: some drivers own their cab (including the registration medallion, which represents the vast majority of the value), whereas others lease it per month, week or even day. For drivers renting per day, shifts are by law at most 12 hours long, but drivers are free to stop before that time.

By regressing hours worked on wage earned (total revenue divided by hours worked), and using the average daily wage of other drivers as an instrumental variable to control for the division effect (hours appear in the denominator of the wage, so any measurement error in hours biases the estimate downwards), Camerer et al. find rather strong negative elasticities in two samples. When driver fixed effects are excluded, the third sample also shows a significant negative correlation, but this becomes insignificantly different from zero when fixed effects are included. When differentiating by experience, the elasticity for drivers with little experience is even close to -1. Moreover, in two out of the three samples there is a significant and large difference between drivers with little and a lot of experience, where the experienced drivers have much higher elasticities. When differentiating between payment structures an interesting picture emerges: those who rent per day have an elasticity which is not significantly different from 0, which might be caused by the truncation at the maximum of 12 hours. For drivers who rent per week or month, or own the cab, the elasticities are however close to -1.
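The estimation idea described above can be sketched in a few lines. The example below is a simplified illustration, not the specification of Camerer et al.: it assumes a log-log relation between hours and wage, omits fixed effects and control variables, and uses made-up numbers for a driver who always earns exactly 200 per day. The instrument is the average wage of other drivers on the same day.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _cov(xs, ys):
    """Population covariance of two equally long sequences."""
    mx, my = _mean(xs), _mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def ols_elasticity(hours, wages):
    """Slope of log(hours) on log(wage): the wage elasticity of hours."""
    log_h = [math.log(h) for h in hours]
    log_w = [math.log(w) for w in wages]
    return _cov(log_w, log_h) / _cov(log_w, log_w)


def iv_elasticity(hours, wages, instrument):
    """Single-instrument IV estimate: cov(z, log h) / cov(z, log w).

    instrument holds, per day, the average wage of the other drivers.
    """
    log_h = [math.log(h) for h in hours]
    log_w = [math.log(w) for w in wages]
    z = [math.log(v) for v in instrument]
    return _cov(z, log_h) / _cov(z, log_w)


if __name__ == "__main__":
    # Hypothetical data: a pure target earner with a daily target of 200.
    wages = [20.0, 25.0, 40.0]
    hours = [200.0 / w for w in wages]   # 10, 8 and 5 hours
    others = [21.0, 24.0, 38.0]          # hypothetical wages of other drivers
    print(ols_elasticity(hours, wages))
    print(iv_elasticity(hours, wages, others))
```

For a driver who stops exactly at a fixed income target, both estimators return an elasticity of about -1, the value reported for several of the subsamples.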

Though there is quite some variation in elasticities, depending on the specification of the estimated model, the general tendency is negative, often even strongly negative. This leads Camerer et al. to look into the reasons why the elasticity might be negative. Their main conclusion is that drivers use a one-day time horizon to make decisions. This means that they set a (varying) daily income target, and quit once they reach it. When they make more money, they reach the daily target sooner and thus quit earlier in the day. The daily target would be somewhere around their average daily wage, and might be a function of rental fees or a simple round number, according to Camerer et al. This is consistent with the finding that drivers who rent daily have an elasticity closer to zero, as the lease price is a likely reference point, and their lease fees are generally higher. This means they have to keep working longer each day to reach their target, resulting in less sensitivity to changes in wage. According to Camerer et al., negative elasticities are also consistent with their finding that experienced drivers have a higher (less negative) elasticity. As experience increases, drivers learn to use a longer horizon, or a different rule (such as fixed hours), to increase their income and/or reduce their average working hours.
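The mechanics of the daily target, and of the 12-hour cap for daily renters, can be illustrated with a small sketch. It assumes a constant hourly wage within the day and a driver who stops exactly at the target; the target and wage values are hypothetical and only chosen to show the two cases.

```python
import math


def hours_worked(hourly_wage, daily_target, max_hours=12.0):
    """Hours a pure target earner works: until the target or the shift cap."""
    if hourly_wage <= 0:
        raise ValueError("hourly_wage must be positive")
    return min(daily_target / hourly_wage, max_hours)


def arc_elasticity(wage_low, wage_high, daily_target, max_hours=12.0):
    """Log-change in hours divided by log-change in wage between two days."""
    h_low = hours_worked(wage_low, daily_target, max_hours)
    h_high = hours_worked(wage_high, daily_target, max_hours)
    return (math.log(h_high) - math.log(h_low)) / (
        math.log(wage_high) - math.log(wage_low)
    )


if __name__ == "__main__":
    # Target reachable within the shift: hours fall from 10 to 8.
    print(arc_elasticity(20.0, 25.0, daily_target=200.0))
    # Higher target (e.g. a costly daily lease): the 12-hour cap binds on both days.
    print(arc_elasticity(20.0, 25.0, daily_target=320.0))
```

In the first case the elasticity is about -1; in the second the cap binds on both days and the measured elasticity is 0, which matches the truncation explanation for drivers who rent per day.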

The article of Camerer et al. is quite contested. First of all, their own analysis already shows that although the data context seems perfect for estimating elasticities of short-term variations in wages, the quality of the data is suboptimal. They use a number of dummy variables which do not vary for some of the samples, such as rainy weather and weekend days. Furthermore, the results are quite sensitive to specification: the inclusion of driver fixed effects has a strong influence on the significance of the estimated elasticities. Experience and payment structure also have a lot of influence, although the authors use this to support their hypotheses. In the end, they do not directly give empirically supported reasons for their hypothesis of a daily target. They do refer to concepts of behavioural economics such as mental accounting and loss aversion as described by Benartzi and Thaler (1995), but these have not been extensively applied to similar circumstances. Possibly the biggest problem they face is the autocorrelation of intra-day wages, which is strongly contested by Farber (2005), but vital to their theory. This brings us to the reviews of the article.

There have been several reviews of the article, as well as similar exercises in different contexts. Farber (2005; 2008) wrote two reviews, followed by Crawford and Meng (2011).

Farber's (2005) first review is not so much a review as a complete contradiction of both the findings and hypotheses of Camerer et al. He assumes a life cycle model as well, and derives a positive labour supply elasticity from this model. However, in the same and similar data as Camerer et al., he finds no autocorrelation within days, but does find correlation between days. Basically, there are certain parts of the day that are always busier than others, and the variation between busy and quiet moments within a day is larger than that between busy and quiet days. From these fundamentally different assumptions, he formulates a probit model of the likelihood of stopping. Farber does acknowledge some negative elasticities, but since he found low autocorrelation and large intraday variations, he thinks the wage cannot be considered parametric to the labour supply decision. In Farber's probit model, the influence of hours worked on the stopping decision is larger than that of income earned. When both variables are used, the influence of accumulated income is rendered insignificant.
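The general shape of such a probit stopping model can be sketched as follows. This is only an illustration of the functional form: the coefficient values are hypothetical, chosen so that hours worked matter and accumulated income does not, and they are not Farber's estimates. His model also contains further controls that are left out here.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def stop_probability(hours, income, b0=-2.5, b_hours=0.25, b_income=0.0):
    """Probit probability that a driver ends the shift after the current fare.

    b0, b_hours and b_income are hypothetical coefficients; b_income = 0
    mimics the finding that accumulated income adds nothing once hours
    worked are included.
    """
    return normal_cdf(b0 + b_hours * hours + b_income * income)


if __name__ == "__main__":
    for h in (4, 8, 12):
        print(h, round(stop_probability(h, income=200.0), 3))
```

With these illustrative values the stopping probability rises with hours worked and is identical for any level of accumulated income, whereas a target-earning driver would show a strong positive income coefficient instead.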

Farber also recalculates the estimates using only the data of Camerer et al. Again, he finds little autocorrelation, due to his different specification of the wage. Farber specifies the wage basically as the money made in a specific clock hour: he first divides the hour into minutes, then calculates the money made in each minute. He uses 0 during breaks and waiting time, and during fares the fare divided by the number of minutes it took to complete, and then sums the per-minute wages to arrive at the hourly wage. Camerer et al. use a much rougher estimate: total revenue per day divided by total time worked that day. To estimate autocorrelation, however, both use roughly the same method: Farber uses the hourly wage just described, and Camerer et al. calculate an hourly wage by taking the median revenue from trips in each hour over all the drivers working in that particular hour on that particular day. However, though the articles do not explicitly describe the calculation of autocorrelation in full, it seems that Camerer et al. indeed use each separate hourly wage of each day as a separate data point, whereas Farber uses those data only to calculate wage variance, but not autocorrelation. For autocorrelation, Farber uses only the aggregated order of hours. His wage variation analysis does show, however, that just including the date (so not the day of the week, but the specific date) gives an R² of 0.15, which is about the same as including day of the week, hour of the day, driver fixed effects and weather all together. This different approach to calculating the autocorrelation is likely to be the reason for their disagreement, as it has a huge effect on the assumption underlying the rest of their calculations. Farber finds a relationship between hours worked and stopping likelihood for experienced drivers, but not for inexperienced ones, which again points in the direction that experience changes the earning behaviour and wage elasticity of drivers. Interesting within Farber's specification of the life cycle model is that he assumes a low estimate of the time preference indicator (θ), meaning that cab drivers are not very impatient, based on drivers participating in long-term contracts such as apartment leases.
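The two wage definitions can be compared on a small made-up trip sheet. The sketch below assumes trips are recorded as (start minute, end minute, fare) with minutes counted from the start of the shift; this data layout and the numbers are assumptions for illustration, not the format of the actual trip sheets.

```python
def clock_hour_wage(trips, hour_start):
    """Farber-style hourly wage: sum of per-minute earnings in one clock hour.

    Each fare is spread evenly over the minutes it took; waiting time and
    breaks contribute 0. hour_start is the first minute of the hour.
    """
    hour_end = hour_start + 60
    total = 0.0
    for start, end, fare in trips:
        per_minute = fare / (end - start)
        overlap = max(0, min(end, hour_end) - max(start, hour_start))
        total += per_minute * overlap
    return total


def daily_average_wage(trips, shift_minutes):
    """Camerer-style wage: total daily revenue divided by hours worked."""
    revenue = sum(fare for _, _, fare in trips)
    return revenue / (shift_minutes / 60.0)


if __name__ == "__main__":
    # Hypothetical two-hour shift; the second trip straddles the hour boundary.
    trips = [(0, 20, 10.0), (50, 70, 10.0), (90, 110, 12.0)]
    print(clock_hour_wage(trips, 0))       # first clock hour
    print(clock_hour_wage(trips, 60))      # second clock hour
    print(daily_average_wage(trips, 120))  # one number for the whole day
```

The hourly measure keeps the variation within the day (about 15 and 17 in this example), while the daily average collapses it into a single value (16). This is why the choice of wage measure matters for any statement about within-day autocorrelation.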

Farber (2008) revisits the problem three years later. He again uses the model of stopping probabilities, but now allows for a reference-level-dependent decrease in utility. He specifically tests whether there is a target (which would be the reference level) below which the utility of additional earnings is higher, due to loss aversion relative to the target, and above which it decreases with a kink. He bases this framework on the prospect theory of Tversky and Kahneman (1991). Before estimating his model using the same data as for the 2005 article, he formulates four requirements which the model estimates must satisfy:

1. The continuation probability strongly declines after reaching the reference level.

2. The spread of reference levels within drivers (and thus between days) is relatively small.

3. The reference level is often reached, and thus;

4. The reference level is a strong predictor of average daily income.

Testing for a reference level, he finds that the first requirement is usually met, even to the point that Farber states that drivers most likely will stop as soon as the reference level is reached. The second requirement is not fulfilled: the variance of reference levels both between and within drivers is large. Variation between drivers makes sense, but within drivers Farber finds an interquartile range implying that, for a driver with an average reference level of $250, the reference level ranges from $188 to $311 from day to day. This is quite a large variation, even when accounting for daily variation due to alternative activities and varying lack of self-control or motivation to keep going up to the target.

The third and fourth requirements are somewhat ambiguous. The reference level is reached, but not often: an estimated 34% of the time. Though this is inconsistent with the loss aversion frame Farber uses, it might be congruent with assumptions around lack of self-control. The fourth requirement is even more ambiguous, as the coefficient of the reference level in a regression on income is about 0.408. So there is certainly a relation, but it is not extremely strong, especially since that regression only has an R² of 0.07. Farber notes in his estimation process that he was not able to allow for both driver fixed effects and a driver-specific reference level, due to data restrictions. However, he did attempt to estimate his model separately with fixed effects and with driver-specific reference levels, both of which produced results similar to those above.

Interestingly, Farber’s later article with a reference-dependent level is the first attempt to estimate labour supply elasticity with an explicit target or reference level. Farber sees his results as indicative of a smooth utility function, without a kink at some reference level. He sees the large within-driver variation in the reference level as particular evidence of this. However, an important note that Farber mentions in his introduction but not in his conclusion is that reference levels are almost always indirectly derived. Also, his data are collected over a long period, with an average of 35.8 shifts per driver over the whole period of approximately a year. Farber notes that due to the unsystematic data collection, he does not know whether drivers did or did not work on days for which no trip sheets are provided. It could thus very well be that drivers work infrequently, or in a highly variable manner. This could make sense for drivers for whom the taxi might be a way to make some extra money when the money from their other job(s) is tight. This could quickly increase variance in both targets and general driving behaviour.

Crawford and Meng (2011, CM) attempt to reconcile the findings and model of Farber with those of Camerer et al. They do so by using a framework of Köszegi and Rabin (2006, KR). In this framework cab drivers follow both the predictions of standard utility theory on consumption and leisure and the loss-averse target earning behaviour described by Camerer et al. KR do so by formulating targets for both income and hours, and assuming loss aversion around both targets. CM state that the target which is reached last has the most impact on stopping probability. This implies that if the wage is high during a whole day, the income target is reached sooner, and thus the hours target is what actually makes the driver stop. On the other hand, when wages are low, the hours target is easily reached, but the income target is what actually makes a driver stop. This theory could indeed be consistent with Camerer et al. when assuming two targets: income and hours.

CM, using the same data as Farber (2005;2008), found that the second target always has the most influence, which is sometimes income (when wages are low) and sometimes hours (when wages are high). This is in line with their reference-dependent model with homogeneous preferences, but contradicts a neoclassical model in which a specific good or event always must have the same influence, so reaching the income target should always have the same influence. They also use the results of KR to state that an early unexpected high wage will reduce both effort and likelihood to work later on in the day in all cases, since one of the targets is already reached or likely to be reached, but that an expected high wage later on will cause a higher likelihood to work than a low expected wage.

CM conclude that the signs of the econometric estimates are congruent with their expectations derived from the reference-dependent model, which they see as an indication that a better database would yield conclusive evidence for a reference dependent model with targets for both hours and income. They also conclude that their findings are both consistent with the claim from Camerer et al. (1997) that the data contradicts neoclassical models, as well as Farber’s finding that a kink in the utility curve is unlikely, based on his data.

Both articles from Farber (2005;2008) and of Crawford and Meng (2011) critically review the article of Camerer et al. (1997), mostly using the same or similar data. There are however also two other notable articles, which set out to do the same as Camerer, but in completely different contexts.

Experimental approach of labour supply modelling: new experiments

The first experiment is by Oettinger (1999), on stadium vendors. Vendors participate in a sort of pool and can sign up to work at a certain game, i.e. determine their labour supply quite flexibly. They do not get disciplined for not showing up, and are always accepted into the stadium if they show up without announcing. The schedule they sign up to is thus a quite loose schedule.

However, the subcontractor who is in charge of organizing the vendors does attempt to persuade vendors to show up for a game if participation is expected to be (too) low. Their wage also varies, as the size of the crowd to sell to varies with the popularity of the participating teams. They are paid on a pure commission basis. Unlike the drivers in Camerer et al., though, the vendors decide whether to participate but not how much time they spend: they always stay for the whole game. Using this data, Oettinger expects to make a reasonable estimate of participation elasticity based on expected wage. He does note that there is competition: higher attendance (and thus higher demand) raises the expected wage, but also attracts a higher supply of vendors, which in turn reduces the expected wage.

Oettinger shows that vendors can reliably estimate expected earnings. He also shows that this strongly influences their participation decision: he finds a positive elasticity of some 0.55-0.65. Oettinger criticizes the data used by Camerer et al. for not having a clear estimate of demand shifters. He sees that demand shifters, especially if they are somewhat predictable by suppliers, have a strong influence on the sign and significance of the elasticity estimation.

Some points of self-criticism are also noted by Oettinger, for one that he cannot estimate the effort of vendors, which might fluctuate rather strongly. He also notes that the vendor context is quite a specific one, from which it might be hard to generalize. Another issue is that Oettinger's main measured variable seems to be the participation rate, rather than labour supply time. This is because vendors can choose freely whether to show up or stay away, but always stay for the whole game. Although Oettinger does not mention this, it might also be relevant to the study by Camerer et al., where cab drivers also make a participation decision to either rent a cab and drive or not.

Unfortunately, this variable is hardly analysed as such by either Oettinger or Camerer et al. Except for touching upon it during the estimation of (auto)correlation, they both make some assumptions regarding the participation decision. Camerer et al. do mention that cab drivers generally have a good sense of the wages that day, or will soon develop a good sense. However, the information used in this decision is unclear and not incorporated.

The other experiment is an actual field experiment, as opposed to the natural experiments by Oettinger and Camerer et al. Fehr and Goette (2007) conducted a field experiment amongst bike messengers. The bike messengers get paid only a commission, and Fehr and Goette divided the messengers in two groups, who both got a raise of 25% for 4 weeks in their piece-wise rates. Fehr and Goette note that it is important that it is only a one-time rise, and thus impact of lifetime wealth is negligible, which makes them define the increase as transitory. Also, the raise is announced and thus fully anticipated.

The messengers can sign up for a shift of 5 hours, and they work only one shift a day. During this shift, they are offered deliveries via their portable radio. They can respond to these offers or ignore them, which allows for discretion in the effort level during a shift, just as cycling faster or slower does. However, they are expected to be on the radio for the full 5-hour period. In that sense, the labour time supply decision is again more of a participation decision, as in Oettinger (1999), than a labour time decision as in Camerer et al. (1997), in which a worker can stop at any moment. The experiment shows that the intertemporal elasticity of labour supply (time as well as effort) is between 1.12 and 1.25, so strongly positive, contrary to Camerer et al. What is quite unique is that they have a fair estimate of effort: parcels delivered and responded to per hour. They found that effort decreases when the wage increases, so the total increase in supplied labour time is even higher when holding effort constant. Subsequently, they reinterpret the decrease in labour time supplied in the data of Camerer et al. as an effort decrease, which would align the two studies. It only would, however, as they do not address the notion that Camerer et al. expect driving in a high wage context (many fares) to be in fact easier for drivers, and thus to require less effort. This detail makes comparing across contexts significantly less straightforward.

Fehr and Goette hypothesize that bike messengers are generally loss averse (supported by a questionnaire). They continue that the loss aversion might cause messengers to work up to their target level, but not beyond it. During the increased wage period, this would cause them to apply less effort, yet incentivize them more to come to work in the first place, which would explain the increase in supplied labour time.

The field experiment by Fehr and Goette shows clear data due to their experimental set-up, which clearly shows the wage variation. However, the rather large (25%), clearly announced and one-period wage increase makes it quite different from the daily wage fluctuations in the data of Camerer et al. And although the messengers' discretion in supplied effort is large and well-documented, the discretion in time supplied is more similar to a participation decision as in Oettinger. This makes their field experiment very useful, yet hard to directly compare to Camerer et al.


Experimental set-up

As shown in the literature section, a few investigations into the elasticity of labour supply have been done. This thesis aims first and foremost to improve on the research done by Camerer et al. (1997), as this was the most seminal paper, with the most striking result of a strong negative elasticity. For this reason, the set-up of the experiment is aimed at imitating the taxi-driver context of Camerer et al. as much as possible. However, by doing a lab experiment, much of the noise around the data can be filtered out. This allows for investigating the underlying principle sought after by the authors mentioned in the literature section. One quickly realizes that their varying contexts contaminate their comparability. They also use fundamentally different approaches, ranging from field experiments to basically panel data. Camerer et al. assume, and show by means of econometric analysis, an identifiable difference between "good" days and "bad" days, as they find a reasonable amount of auto-correlation within days. However, Farber (2005) opposes this, using mostly the same and similar data. Such a basic premise of a theory must be almost indisputable to make sure the results are trustworthy. Fehr and Goette (2007) already come closer by inducing a 25% increase in wages. However, this increase is fully predictable, anticipated and temporally defined. This is again different from Camerer et al., who assume a variance that is mostly unpredictable ex-ante, that occurs over much shorter time spans (days rather than weeks), and that is frequent rather than one-off. This affects whether one is researching a participation decision or a direct wage variation response. In this respect, it is important to make clear what we wish to research. This leads us to the most fundamental question all these authors and I try to answer:

Do people increase their labour supply if they encounter a transitory increase in wage?

The accompanying hypothesis is based on current economic assumptions, which assume that people work more if they get paid more:

H0: When people encounter a transitory increase in wage, they will increase their labour supply, both in hours and effort.

Labour supply is captured in the literature in roughly three ways: time, effort, and participation. I focus the experiment on time and effort, as these seem to be the main issue across all papers, even though the participation decision is often framed as a labour time supply decision in said papers. To get the best measurement of the effect of wage on both dimensions, subjects in an experiment must be completely free to choose whatever they want within those two dimensions. Since participation is not a part of the experiment, or to be precise not one of the measured variables, subjects must be informed to come to the lab each day, and to free up their agendas for the day. This way opportunity costs vary as little as possible between and within subjects. Ideally we keep subjects in the lab even if they choose to stop working, though their time off must clearly be quality leisure, which might be perceived differently in a lab than at home. To measure the two dimensions, I need to split the research question into two sub-questions:

Do people increase their supply of labour time if they encounter a transitory increase in wage?
Do people increase their supply of labour effort if they encounter a transitory increase in wage?

This distinction is important, as from the literature we find different final conclusions based on the size of the effect in both variables. Based on Camerer et al. (1997), the first alternative hypothesis opposing standard economic theory is a decrease in supplied hours and total supply:

HA1: When people encounter a transitory increase in wage, they will decrease their supply of labour hours, holding their effort constant, decreasing their total labour supply.

However, the research by Fehr and Goette (2007) shows another option, contradicting both standard theory and Camerer et al., as they expect effort to fall, yet the total effect is expected to stay in line with standard economic theory:

HA2: If people encounter a transitory increase in wage, they will increase their supply of labour hours but decrease their labour effort. In total, the labour supply will increase.
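The three hypotheses can be compared in one line if total labour supply is written as hours times effort. This multiplicative form is an assumption made here purely for illustration, and the symbols L (total labour supply), h (hours), e (effort per unit of time), w (wage) and ε (elasticity with respect to the wage) are introduced only for this sketch:

```latex
L = h \cdot e
\quad\Longrightarrow\quad
\varepsilon_L \equiv \frac{\partial \ln L}{\partial \ln w}
= \frac{\partial \ln h}{\partial \ln w} + \frac{\partial \ln e}{\partial \ln w}
= \varepsilon_h + \varepsilon_e

% H0:  \varepsilon_h > 0,\ \varepsilon_e > 0 \ \Rightarrow\ \varepsilon_L > 0
% HA1: \varepsilon_h < 0,\ \varepsilon_e = 0 \ \Rightarrow\ \varepsilon_L < 0
% HA2: \varepsilon_h > 0,\ \varepsilon_e < 0,\ \varepsilon_h > |\varepsilon_e| \ \Rightarrow\ \varepsilon_L > 0
```

Under this reading, HA1 and HA2 differ from H0 in the sign of one component each, and only HA1 implies a negative total elasticity.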

Related to the participation decision is the prerequisite that wage variation cannot be reliably forecast ex-ante, so that subjects must always participate to find out what the wage is. Since the anticipated horizon of Camerer et al. is only one day, the wage must vary between days. If one were to mimic the context of Camerer et al. completely, within-day variation would also be necessary. Adding within-day variation, however, adds noise around the labour supply decision by introducing uncertainty into the decision. Therefore this falls outside the scope of this thesis. Nevertheless, it might make an interesting extension. It is important to note here that I choose to assume a high within-day correlation, even though Farber contests the existence of auto-correlation in the data of Camerer et al.

To establish a purely financial motivation, the task for the subjects should be fairly simple: simple enough not to create enough satisfaction to do the task just for the fun of it, yet demanding enough to require effort to continue the task and thus create an incentive to stop working. Some experimentation needs to be done to find the sweet spot for this. However, as long as the task is not so tedious or so satisfying as to induce extreme situations with little variation within subjects, the data of such an experiment should be useful.

Since labour supply generally takes place within a familiar context, of which the average wage is known, subjects must be informed of this average wage, either directly by telling them the average wage, or by getting them adjusted to the average wage in the first few rounds of the experiment. The wage should always be a piece rate, to make sure it is necessary to apply some effort to get paid. This also increases the quality of the data with respect to the exact moment of stopping. If subjects get paid for doing nothing, they might already mentally switch off before they actually stop the task.

A salient issue in Camerer et al. (1997) seems to be the difference between taxi owners and renters. Unfortunately, there is quite some variation in this aspect in their data, and estimates of this contract specification are not necessarily trustworthy. The extra data that Farber (2005) used is even worse in this respect, as it has only weekly rental drivers. In the other experiments, an investment to be able to work is usually not a relevant issue. However, a fixed investment might make a target much more salient. To measure the effect of investment on the labour supply decision, I will require an investment in some groups. This should reliably estimate the influence of an investment decision on the height and influence of a potential income target and answer the question:

Are people more likely to work up to an income target if they have to invest to be able to work?

Extrapolating from Camerer et al., I do indeed expect investment to increase the tendency to use an income target, leading to the last hypothesis, which also contradicts standard economic theory:

HA4: The treatment group with the investment requirement will show a stronger tendency to set an income target and stop when it is reached than the control group.

For this thesis, I use only limited time per subject: half an hour. I recruit subjects at Wageningen University, by asking random students walking by. Furthermore, there are clear financial and temporal constraints for a Master’s thesis, so the group of subjects will be limited. This means that normality of the data is unlikely. Because I personally ask students to participate, there is a huge risk of a selection effect. It is also quite likely that a lot of students, having experience with having to find subjects for their own experiments, might just be doing me a favour, rather than being motivated by the wage. Finally, there might be a huge unobserved variation in motivation and opportunity costs between subjects. For all these reasons, the data of this experiment is to be treated with the utmost caution. Even an indication of the sign of elasticity is to be taken as an incentive to run a more elaborate lab experiment, rather than taken as proof.

Practical set-up

The actual experiment will be both within-subjects and between-subjects. Variation in wages will occur for all subjects and will thus be within-subjects. The between-subjects test concerns the influence of investment, by assigning subjects to either the treatment (investment) group or the control (no investment) group. To control for order effects, there will also be 2 groups for either high wage (€0.20/100 strokes) first or low wage (€0.10/100 strokes) first, after starting with the adjustment round (€0.15/100 strokes). This gives a total of 4 experimental groups as shown in table 1.

Table 1: Four experimental groups

             Group 1   Group 2   Group 3   Group 4
Investment   no        yes       yes       no
High first   no        no        yes       yes

Random allocation of subjects will be done through the order of entering the experiment, since there is little to no reason to suspect a structural bias in the moment at which students enter the university. Some time-of-day variance might however occur, for example variance in energy levels. I will not control for this.

Upon entering, subjects will be placed behind computers, on which I run the experiment using PsychoPy, by courtesy of Peirce (2007, 2008). They will receive the general instructions on their screen, explaining that they will be asked to imagine the three coming rounds as separate days. It will be made clear to them that wages will vary between days, though it will not be specified by how much. The task will be explained to them: alternately pushing the letters A and L. The instructions will emphasize that they can choose to stop working at any point during a day, by pushing a “stop working” button. They can then do whatever they wish: read a book, browse freely and unmonitored, or use their phone. However, at each round, the working page will pop up again, and they will have to make a new decision to work or not each time. It will be explained to the treatment group that they have to pay a fixed fee of €2 before they can work each round. To increase saliency, the necessary (provided) investment for all rounds will be next to their computer, and I will take the investment away from them. By not giving them an option I can be sure that all participants in the investment group do indeed "invest". Ideally, one would have enough participants to allow them a choice without jeopardizing the minimum amount of data needed. It will be made clear that they can easily earn more than this fee, to ensure that at least some choose to work. This also requires a rather low fee compared to the wage.

After these instructions, round 1 will start. Subjects can start “working” by alternately typing “A” and “L”, which is recorded by the computer. The current wage and cumulative income of that round will show on the screen. By pressing the space bar a subject can choose to stop working and start leisure, so that work versus leisure time is accurately recorded. This option is continuously displayed. After 7 minutes, the day will stop and subjects will briefly be shown their total earnings for the past day. To ensure some consistency in opportunity costs, students will have to sit out the full 7 minutes in any case, also if they reach the maximum before that time. The maximum is set at €2.50 per round in earnings, which, judging from the pilot, should be high enough not to be reached consistently. Also, assuming an expected average speed of 400-500 strokes per minute, reaching the maximum will take subjects less than 4 minutes on average, and at least 5 minutes in the low wage condition, which means some participants might run out of time, and there is not too much margin.
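The time needed to reach the earnings maximum follows directly from the piece rates, as the short calculation below illustrates. It is a sketch only: the wages, the €2.50 maximum and the 7-minute round are taken from the set-up above, while the constant typing speed and the function name `minutes_to_cap` are simplifying assumptions for this illustration.

```python
# Illustrative arithmetic only. Wages, cap and round length come from the
# set-up described above; a constant typing speed is an assumption.
CAP_EUR = 2.50          # maximum earnings per round
ROUND_MINUTES = 7.0     # length of one "day"
WAGES_EUR_PER_100 = {"low": 0.10, "adjustment": 0.15, "high": 0.20}


def minutes_to_cap(wage_per_100, strokes_per_minute):
    """Minutes of continuous typing needed to earn the per-round maximum."""
    strokes_needed = CAP_EUR / wage_per_100 * 100
    return strokes_needed / strokes_per_minute


for name, wage in WAGES_EUR_PER_100.items():
    fast = minutes_to_cap(wage, 500)   # assumed upper typing speed
    slow = minutes_to_cap(wage, 400)   # assumed lower typing speed
    reachable = slow <= ROUND_MINUTES
    print(f"{name}: {fast:.2f} to {slow:.2f} min, reachable in a round: {reachable}")
```

Under these assumptions the maximum takes roughly 2.5 to 3.1 minutes at the high wage and 5 to 6.25 minutes at the low wage, so only a subject typing below about 360 strokes per minute would run out of time in the low wage round.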

When the 7 minutes are completed, either by waiting or working, the next day will start. To make sure subjects start reconsidering working again, the page will pop-up over other opened (leisure) windows.

This will be repeated in total three times representing the three days. At the end subjects will see their total earnings and be paid accordingly. I will also offer them a chocolate bar as a more "personal" token of gratitude. I will also register sex and age of subjects for some background statistics and do a short debriefing with subjects in the form of an unstructured chat.

I ran a pilot with three friends successfully before going to the university to find actual subjects. I ran the pilot to make sure the instructions were clear and the wage appropriate. By also recruiting a friend with a history of RSI/CANS I got some reassurance that the task would not cause an irresponsible strain on subjects, despite its highly repetitive nature. After the first pilot run I adjusted the wage downwards to decrease the likelihood of subjects reaching the maximum as well as to increase the necessary effort to earn a decent wage.


Result Analysis

General descriptive statistics

Twenty participants were recruited, but as mentioned in the experimental set-up, this number is too low to expect normally distributed data. From the descriptive statistics in table 2 we can see that the male/female ratio is balanced, at 55% (11) women. The average age is somewhat high for students at 24.1, and has a somewhat wide, though not exceptional, distribution. What is however highly exceptional is the mean of total earnings at €6.72, and the median at €7.50. The latter is the maximum possible total earnings, as participants could earn at most €2.50 per round. We can immediately see in figure 1 that the total earnings are all concentrated around this maximum. That is still including one extreme outlier, who started the experiment, but after a minute or so said she did not want to work any more, as she thought the task was too simple and the wage too low. As she is a valid data point (the wage was just too low for her to work at all), I did not remove her, but she is very influential on the data. Both this one outlier and the strong concentration around the maximum already indicate that the data on earnings, and thus on worked time (as the experiment stopped in each round when a subject reached the limit of the round), are heavily censored.

Table 2: Descriptive Statistics

                 female   age     total_earnings   Worktime           Worktime    Worktime
                                                   adjustment round   high wage   low wage
Mean             .55      24.10   6.7210           234.15             164.05      292.35
Median           1.00     23.50   7.5000           238.50             168.50      314.50
Std. Deviation   .510     2.732   1.74112          58.525             48.160      110.882
Range            1        10      7.40             225                238         415

Worktime in seconds; total_earnings in euros.

Normally, assuming amongst others normally distributed data, an experimenter could use a two-way ANOVA to test simultaneously for the effects of the two between-subjects variables: investment and order. This would also include a potential interaction effect. One could also employ t-tests to test for effects between the different groups, especially when splitting on different variables, but one generally loses power when doing so. Employing a factorial ANOVA when also including for example gender would probably be optimal. One would primarily run this on labour time and effort. Also running it on earnings is superfluous in the ideal set-up of this experiment, as effort is calculated as keystrokes per minute, which means that effort, time and earnings are all three derived from the same two measured variables: keystrokes and labour time. However, for my data, using all three variables seemed useful, as effort suffered less from censoring than the other two variables. For the analysis of this specific experiment I will almost exclusively rely on non-parametric tests, as the assumptions of parametric tests are generally violated. More specifically, earnings and labour time supply are right-censored.


The solid line shows the normal distribution; it is immediately clear that total earnings are not normally distributed. The rightmost bar captures all maximum earnings of €7.50.

We can see in figures 2a-c that supplied time is relatively normally distributed. Note that the leftmost bar is caused by the aforementioned outlier. Most participants reached the maximum, and only one subject kept working for the maximum time (7 minutes) in one round. This makes the distribution of worked time mostly a distribution of effort or aptitude, rather than of different choices in the amount of time supplied. Since there are only two decision variables for participants, namely effort and time, this is a problem for the validity of this experiment. However, using a Kaplan-Meier survival time model, we can still attempt to estimate the effect of different experimental variables on the time supplied.

The Kaplan-Meier survival model is specifically meant for right-censored, non-normal data, and finds which group is likely to survive longer. More specifically, it estimates when, on average, a certain event happens; in most applications the event is either death or recovery. Though mostly used in medicine and biology, using it to find the group which is most likely to keep on working (instead of living) seems to cause no conflicts with the assumptions or statistical fundamentals.
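To make the estimator concrete, the sketch below computes a Kaplan-Meier curve by hand. It is a minimal illustration, not the procedure of the statistics package used for this thesis: the function name `kaplan_meier` and the five data points are hypothetical, with "censored" standing for a round that ended because the earnings maximum was reached rather than because the subject chose to stop.

```python
def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each time t with at least one observed stop.

    durations: seconds until the subject stopped or was censored
    observed:  True if the subject chose to stop, False if censored
               (for example because the earnings maximum was reached)
    """
    event_times = sorted({t for t, d in zip(durations, observed) if d})
    survival = 1.0
    curve = []
    for t in event_times:
        # Subjects still working just before time t.
        at_risk = sum(1 for u in durations if u >= t)
        # Subjects who deliberately stopped exactly at time t.
        events = sum(1 for u, d in zip(durations, observed) if d and u == t)
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical data, NOT the thesis data.
seconds = [120, 200, 250, 300, 310]
stopped = [True, False, True, False, False]
curve = kaplan_meier(seconds, stopped)
print(curve)
```

Censored subjects never lower the curve themselves; they only leave the group that is still at risk, which is why heavy censoring leaves few steps to estimate from.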

As an example of more economic applications, Meyer (1990) also uses it to estimate differences in unemployment spells. Farber (2005) also uses a sort of survival model on the cab-driver data, where the event is similar to mine: stopping work. To be exact, Farber employs a hazard (of stopping) model, which is a form of probit model, estimating the probabilities of stopping at some point in time, and which variables influence those probabilities. Though Farber does not need the Kaplan-Meier version due to his better data¹, the underlying fundamentals are quite similar.

Order effects

First we will use the Kaplan-Meier survival model to check for order effects in time supplied. To be able to check for order effects, I randomized participants between high wage first and low wage first. Unfortunately, with a survival model it is hardly possible to fully test for interaction effects of order and investment. I will therefore only test for interaction effects in the effort section. I did split the groups on investment and subsequently tested for the differences between high-first and low-first, which helps provide a reasonably clean estimate of the order effect.

The first test for order effects is on labour time supplied in the round with the low wage. The labour time supplied in the round with the high wage is not useful, as all but one subject got the maximum earnings. The adjustment round is equal for all, so not relevant for order effects. I included the processing summary in table 3 as well as the survival graph in figure 3 to show the huge amount of censoring. For the investment and high-first group any estimate is even impossible due to 100% censoring, i.e. everyone reached the maximum, so we cannot make a quantitative comparison between high-first and low-first for the groups with investment.

¹ More specifically, a probit model generally assumes a normal distribution of the error term, which is violated amongst others by the censoring due to the maximum amount. Also, most maximum likelihood estimators are only powerful when using a lot more data points than available in my data.

Table 3: Processing table for estimating the survival model

Investment   High wage first   Total number in group   Number of Events   Percentage censored
0            0                 5                       3                  40.0%
0            1                 5                       1                  80.0%
0            Overall           10                      4                  60.0%
1            0                 5                       2                  60.0%
1            1                 5                       0                  100.0%
1            Overall           10                      2                  80.0%

For each of the four experimental groups, the "events" column shows the number of people deliberately stopping. The lack of people stopping causes the high percentage of censoring, shown in the rightmost column.

The survival graph for the group without investment, comparing the low wage round, also shows the few occasions in which subjects opt out of working.

For the groups without investment, the mean supplied times are 343.8 seconds for the high-first group and 273.4 seconds for the low-first group. The log rank (Mantel-Cox) test, which is most adequate for estimating differences in data where the variance is mostly near the censoring point, is insignificant at p=0.258, so we cannot reject the null hypothesis that the two groups are the same. The data thus give no indication of an order effect.
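The log rank comparison used above can be sketched in a few lines. This is a minimal two-group version under stated assumptions: the function name `logrank_test` and both data sets are hypothetical, and the p-value uses the chi-square distribution with one degree of freedom, which is only an approximation with groups as small as five subjects.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log rank (Mantel-Cox) test; returns (chi2, p_value)."""
    pooled = ([(t, e, 0) for t, e in zip(times_a, events_a)]
              + [(t, e, 1) for t, e in zip(times_b, events_b)])
    event_times = sorted({t for t, e, _ in pooled if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in pooled if u >= t)               # at risk, both groups
        n_a = sum(1 for u, _, g in pooled if u >= t and g == 0)  # at risk, group A
        d = sum(1 for u, e, _ in pooled if e and u == t)         # stops at time t
        d_a = sum(1 for u, e, g in pooled if e and u == t and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("no event time at which both groups are at risk")
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of the chi-square distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical data, NOT the thesis data: seconds worked and stop indicator.
low_first = ([100, 200, 300, 420, 420], [True, True, True, False, False])
high_first = ([350, 420, 420, 420, 420], [True, False, False, False, False])
chi2, p = logrank_test(*low_first, *high_first)
print(round(chi2, 2), round(p, 3))
```

With so few observed stops the test has little power, so an insignificant result says more about the amount of censoring than about the absence of an effect.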

For earnings, again no comparison can be made between the high wage treatments due to all but one subject maxing out the earnings, i.e. hardly any variation exists.

Though effort has a reasonably normal distribution compared to the rest of the variables, I use a Mann-Whitney U (MWU) test to test for differences between effort levels. The MWU test does have somewhat lower power, but the distortions in the distribution of supplied time might also have affected the effort levels, even if this does not directly show in the histograms of effort. First we test again for order effects, but now in effort. The tests for differences in the adjustment round (p=0.315), the high wage round (p=0.853) and the low wage round (p=0.123) all indicate that there is no difference in effort levels due to whether a subject first receives the high wage or the low wage.

From this we can conclude that order effects on both effort and time supplied are unlikely. This also seems to make theoretical sense. In an ideal experiment, which would mimic a real-world situation, many iterations of high and low wages would be done, which would completely rule out order effects, as there is just (random) variation for participants, just like it is for the cab drivers in Camerer et al. (1997).

Effect of investment

A further research goal was to find the effect of investment on labour supply decisions. Within the low wage condition, we find no difference in labour time supplied caused by the presence of investment. The log rank Mantel-Cox test cannot reject the null hypothesis of no difference, at p=0.187. The estimated mean of the investment condition is somewhat higher, at 374.5 seconds versus 308.1 seconds without investment. We also see in figure 4 that the surviving percentage is generally lower for the group without investment. This could be interpreted as an indication that investment is related to higher labour supply times, as could the simple fact that in the investment condition most comparisons were impossible due to the high number of maximums reached, which was not the case in the group without investment. However, these findings and estimates are not significant and thus require more testing.


Survival graph for the two investment groups, compared in the low wage condition. The blue line shows people in the no-investment condition opting out of labour.

Within the high wage condition, I cannot estimate a proper statistic, since only one subject did not reach the maximum. For the adjustment round, I find again an inconclusive p=0.515 on the log rank Mantel-Cox test, though the survival line of the no-investment group in figure 5 is generally under the investment condition.

Figure 5: Survival graph for the first adjustment round, showing the difference between the groups with and without investment. We can see that the investment group is again quite consistently above the no-investment group.

We conclude that there is no significant evidence for differences in supplied labour time due to investment, though there is some indication that investment might have an effect, which would need further research. Turning to the other decision variable, effort, Mann-Whitney U tests also show no significant differences between the investment and no-investment groups in the adjustment round (p=0.579), the round with high wages (p=0.684) or the round with low wages (p=0.631).

Testing for differences in earnings due to investment, once more using the Mann-Whitney U test because the data are not normally distributed, gives a significant p=0.043 for the low wage condition. The MWU test is in principle two-sided, so it does not by itself show the direction of the difference. Further investigation shows a mean rank of 13.20 for investment and 7.80 for the control, so the investment seems to lead to higher earnings, at least in the low wage condition. However, the differences between investment and no investment for the adjustment round (p=0.105) and the high wage round (p=0.393) are not significant. The significant result in the low wage condition should thus be read with considerable caution.
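The link between the reported mean ranks and the U statistic can be sketched as follows. The group sizes are not restated in this section, so the ten subjects per group used here are an assumption (it is what mean ranks of 13.20 and 7.80 imply if both groups are equally large); the function name `u_from_mean_ranks` is likewise only illustrative.

```python
def u_from_mean_ranks(mean_rank_a, n_a, mean_rank_b, n_b):
    """Return (U_a, U_b) for a Mann-Whitney U test from mean ranks and group sizes."""
    rank_sum_a = mean_rank_a * n_a
    rank_sum_b = mean_rank_b * n_b
    # U counts the pairwise comparisons "won" by a group (ties count as half).
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = rank_sum_b - n_b * (n_b + 1) / 2
    return u_a, u_b


# Mean ranks from the text; group sizes of 10 are an assumption.
u_invest, u_control = u_from_mean_ranks(13.20, 10, 7.80, 10)
print(u_invest, u_control, u_invest + u_control)
```

Under that assumption the two U values must add up to the number of pairs (10 times 10), which gives a quick consistency check, and the investment group would have the higher earnings in roughly 77 of the 100 pairwise comparisons.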

Main effect of different wage levels

For the within-subjects question whether subjects worked longer and/or harder when they had a low wage than when they had a high wage, a related-samples Wilcoxon signed rank test on labour time is significant (p<0.001). The related-samples Wilcoxon signed rank test can be used where one would normally use a paired t-test, which I cannot use here due to the violation of normality. The significant result indicates that the median supplied labour time differs between the low wage and the high wage condition. Further inspection shows that in all but one case, the time worked was longer in the low wage setting. However, this is to be expected: in all rounds many participants reached the maximum, which was a set amount of money earned rather than a set number of keystrokes. A higher wage therefore means one reaches the maximum sooner, which is likely to be the greatest driver of the longer labour time found for low wages.
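The mechanical effect of the earnings cap can be illustrated with a short calculation. This is only a sketch: the cap of 15 units, the piece rates and the typing speed are hypothetical values, chosen so that the high wage cap corresponds to the 1500 keystrokes mentioned in this chapter, and `seconds_to_cap` is an illustrative name.

```python
def seconds_to_cap(earnings_cap, wage_per_keystroke, keystrokes_per_second):
    """Time needed to hit the earnings cap at a constant typing speed."""
    keystrokes_needed = earnings_cap / wage_per_keystroke
    return keystrokes_needed / keystrokes_per_second


CAP = 15.0         # hypothetical maximum earnings per round
HIGH_WAGE = 0.01   # hypothetical piece rate: cap reached after 1500 keystrokes
LOW_WAGE = 0.005   # hypothetical piece rate: cap reached after 3000 keystrokes
SPEED = 5.0        # hypothetical effort in keystrokes per second

high = seconds_to_cap(CAP, HIGH_WAGE, SPEED)
low = seconds_to_cap(CAP, LOW_WAGE, SPEED)
print(f"high wage: {high:.0f} s, low wage: {low:.0f} s")
```

With identical effort in both rounds, a subject who always works until the cap records a longer time at the low wage, so a paired test on time would pick up a difference even if behaviour did not change at all.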

Splitting the groups by investment gives similar results: p=0.008 for the investment group and p=0.017 for the control group, both showing that labour supply time is longer for low wages. Unfortunately, it is not useful in this experiment to adjust for the maximum number of keystrokes, as this would ultimately just yield the effort variable multiplied or divided by a scalar. For example, the simplest way to adjust for the difference in the maximum number of keystrokes would be to divide the time worked in the low wage round by the number of keystrokes, and then multiply that number by the maximum number of keystrokes in the high wage setting (1500). However, this is just effort times 1500, which makes no difference to either significance or theoretical relevance.
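The adjustment described above can be written out as a small sketch. The value 1500 comes from the text; the function name and the two example subjects are hypothetical. It shows that the adjusted time depends only on the ratio of time to keystrokes, that is on typing pace, and not on how long a subject actually chose to work.

```python
HIGH_WAGE_MAX_KEYSTROKES = 1500  # maximum in the high wage setting (from the text)


def adjusted_time(seconds_worked, keystrokes):
    """Rescale low wage working time to the high wage keystroke maximum."""
    seconds_per_keystroke = seconds_worked / keystrokes
    return seconds_per_keystroke * HIGH_WAGE_MAX_KEYSTROKES


# Two hypothetical subjects with the same pace but different working times.
early_stopper = adjusted_time(seconds_worked=200, keystrokes=1000)
late_stopper = adjusted_time(seconds_worked=600, keystrokes=3000)
print(early_stopper, late_stopper)
```

Both subjects receive the same adjusted time although one worked three times as long, which is why the adjusted variable carries no information on labour time beyond what the effort measure already contains.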

Turning to effort, we find no difference at all in the medians for effort (p>0.999). Splitting the groups by investment also gives no significant results: for the group with investment I find p=0.767 for the difference in effort between the high wage and the low wage, again using the Wilcoxon signed rank test. For the control group without investment the difference is also not significant, at p=0.878. So the results remain insignificant when the groups are split on investment as well. This means I cannot show that having to invest makes people work harder when encountering a low wage, nor that people change their effort at all in different contexts. Should these results hold in a more representative experiment, this would indicate that people do work longer when encountering a lower wage, but probably not harder. This is however a big if, since especially the results on supplied labour time are highly unreliable due to the censoring.

Conclusions on hypotheses

Before drawing conclusions based on my data, I stress once more that the data are highly censored. In addition, my sample is very small, and the experiment ran over a relatively short time span. Any conclusions based on this experiment are thus indicative at best, and quite unreliable in general.

For the first three hypotheses regarding supplied labour time and effort, I can hardly conclude anything. In the literature results are already quite mixed, though most experiments come up with a slight increase in the overall labour supply. I could not reach conclusive results regarding these hypotheses (neither affirmative nor negative) based on my data. If anything, participants did sustain longer labour supply times during low wages, though these cannot be compared with the (censored) labour supply times during the high wage condition. Also, no differences in effort due to wages can be established.

The investment hypothesis: The treatment group with the investment requirement will show a stronger tendency to set an income target and stop when it is reached than the control group.

For this hypothesis, the data did show some indication that the investment caused participants to work longer, though this effect is not to be trusted without further testing. The main result was that in the investment treatment there were very few "early drop-outs", which makes no sense from a classical theoretical perspective, as the investment was a budget-neutral and irrelevant addition to the labour decision. Therefore, this result is the most promising: if there are effects to be found in the relation between transitory wage variation and labour supply, including an investment perspective might amplify this effect.

Qualitative research notes

After the experiments, I held a short unstructured debrief to answer any questions and to make sure students had not developed any CANS (complaints of the arm, neck and/or shoulder) because of the repetitive nature of the experiment. During these little chats, several participants told me that the task was indeed highly repetitive, but that due to the short time period it was quite fine. In fact, a few participants indicated they even found it somewhat "relaxing" or "meditative", especially those who had just had exams. This might indicate there was actually a reduced incentive to stop for some participants, as they enjoyed the task.

Furthermore, some participants indicated that they did not understand the relevance of me taking the money away for the investment. Though of course the impact might very well be subconscious, it can also mean that participants did not experience it as actually making an investment. This might be highly influential on the salience of the investment effect.

During the experiments, I generally kept half an eye on the participants, so as not to make them feel too monitored while still being able to notice behaviour. I noticed that many participants indeed used the magazines, papers and laptop I brought, or their own smartphones, indicating that most felt free to use their leisure time as actual leisure. I also noticed some participants taking mini-breaks to stretch. To me, this suggested they were definitely making an effort, but also that the task was indeed physically taxing. This was not the intention of the experiment, and might be detrimental to the health of participants. I would therefore strongly advise future experimenters to use a different (less physical) task from mine, especially in longer experiments, as it might induce health risks.

Finally, some of the participants made it clear that they hardly cared about the money, but "know how it is to have to do an experiment". This can be considered an indication of a test effect, probably due to me personally asking people to participate. Participants might consider working more as a personal favour to me, despite the fact that they were and remained strangers to me.
