The influence of working overtime on the performance evaluation of a junior employee

Name: Mike Bakker

Student number: 11402296

Thesis supervisor: Victor Maas

Date: 20 June 2018

Word count: 11982

MSc Accountancy & Control, specialization Accountancy

Faculty of Economics and Business, University of Amsterdam

Statement of Originality

This document is written by student Mike Bakker who declares to take full responsibility for the contents of this document.

I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it.

The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.

Abstract

Junior employees who are starting their careers do not want to give up their social life, but they also want to build a successful career. When their manager pressures them to work more overtime hours, they often choose the latter. But does working more overtime hours contribute to a higher performance evaluation, which is essential for climbing the corporate ladder? And to what extent does a manager evaluate on effort and on performance? Based on these questions, the research question for this study is formulated as follows:

RQ: Does working overtime have a positive effect on the performance evaluation of a junior employee?

This study finds statistical evidence supporting its first hypothesis, that working overtime hours has a positive effect on the performance evaluation of a junior employee. A significant difference in performance evaluations was found between the group of employees who worked more overtime hours than their peers and the group of employees who worked fewer overtime hours than their peers. This implies that working more overtime hours results in higher performance evaluations. However, working more overtime hours is not only positive: multiple other studies have shown that working too many overtime hours can cause serious health problems such as fatigue, overweight, and depression.

The second part of the study examined whether the effect of working overtime differs between high-performing and poorly performing employees, i.e. a potential interaction effect. The reasoning was that poorly performing employees would benefit more from working more overtime hours than high-performing employees would. However, the data did not show a significant difference between both groups, so no interaction effect was established and the second hypothesis of this study was rejected.

The data was collected by means of a case-based experiment with a diverse sample of 106 participants. The findings of the study contribute to our understanding of what managers evaluate junior employees on and, therefore, of whether working overtime leads to higher performance evaluations.

1. Introduction

The purpose of this thesis is to examine the relationship between employee performance evaluation and working overtime. More specifically, the thesis will investigate whether working overtime has an influence on the performance evaluation of a junior employee, and whether the quality of work delivered by the employee has a moderating effect. In doing so, the following research question will be answered:

RQ: Does working overtime have a positive effect on the performance evaluation of a junior employee?

The thesis focuses on how superiors within a company regard the overtime hours worked by junior employees. In other words, do seniors evaluate the input of the employee, i.e. the overtime hours, or the output, i.e. the quality of the work produced by the junior employee? Given my personal experience and education, and the fact that working overtime is a known phenomenon within the auditing profession, the audit profession is used as the scope of this study.

On average, accountants at the big firms work 8.3 hours of overtime a week. For small and medium-sized accounting firms this is 7.4 and 6.7 hours, respectively (Accountant and Alterim, 2017). The same annual research by Accountant and Alterim (2017) highlights that almost 40% of Dutch accountants are unsatisfied with their work-life balance. Using the term ‘work-life balance’ as a metaphor, involuntary overtime could be a serious factor in tipping the scale to the wrong side (Guest, 2002). Work-life balance is described as a perceived balance between work and the rest of life. This perceived balance can differ from person to person, which makes it a very subjective concept.

The wide range of negative health effects of working overtime also makes it important to answer this research question (Kodz et al., 2003). This topic has been widely researched, which has produced a lot of evidence on, for example, fatigue, weight gain, and depression due to working overtime hours (Kodz et al., 2003; Beckers, 2008; Nakamura et al., 1998). The big accounting firms usually have a strong up-or-out culture, meaning that an employee either grows to the next level or has to leave the firm when he or she does not grow enough within a limited amount of time. Advancing to the next level requires positive performance evaluations by seniors within the firm. By studying the relation between subjective performance evaluations and working overtime, the study tries to provide more insight into this matter.

This relation between subjective performance evaluations and working overtime is probably influenced by the quality of the work a junior employee provides. Intuitively, when a junior employee delivers work without mistakes, i.e. of high quality, the number of overtime hours is less important than when the quality of the work is not as good as expected.

Auditing and big audit firms are used as the scope for the study. The strong up-or-out culture, the use of performance evaluations, and the amount of overtime worked within the profession make auditing particularly suitable for this study. However, the external validity might be limited due to this specific setting and the participants used for the study. More about the external validity and the participants is included in section 4.

To the best of my knowledge, no similar study has examined this research question. This study aims to contribute to the recent stream of research on junior-senior relations within audit firms.

This thesis is structured as follows. Section two reviews four different streams of literature. Section three outlines the hypothesis development, and section four describes the design of the study. Next, the results are reported in section five, and section six presents the conclusions of the research. The discussion and directions for future research are also included in section six.

2. Literature review

The intuition for my research question follows from multiple different streams of theory. The first stream is based on the negative consequences of working overtime, like illness, fatigue, and overweight. The second stream is related to the work-life balance and how working overtime influences this. The third stream is based on the influence of organizational culture on working overtime. What seniors evaluate junior employees on is the last stream, i.e. senior performance evaluations.

2.1 Negative health consequences

Most people more or less know what working overtime means; however, multiple definitions of the term are used. The definition of working overtime used in this study is: all work hours that an employee works on top of his/her contractual work hours (Beckers, 2008). On the topic of working overtime, the vast majority of studies research the negative consequences for health in general, fatigue, overweight, anxiety, depression, and mortality (Beckers, 2008; Beckers et al., 2008; Kleppa, Sanne, & Tell, 2008; Kodz et al., 2003; Kompier, 2006; Nakamura et al., 1998; Nylén, Voss, & Floderus, 2001).

For example, Nylén, Voss, and Floderus (2001) find some evidence for an increase in mortality. This increase is associated with more than five hours of overtime working a week. Even though the data used in the study is from 1973, these findings are still relevant today as working overtime is still a very common practice.

Also, the relationship between working overtime and fatigue is studied intensively. Although not all studies found an association between the two concepts, there is a lot of evidence suggesting an association does exist (Hayashi, Kobayashi, Yamaoka, & Yano, 1996; Iwasaki, Sasaki, Oka, & Hisanaga, 1998; Proctor, White, Robins, Echeverria, & Rocskay, 1996).

Thirdly, the study of Kleppa et al. (2008) shows anxiety and depression to be associated with working overtime. Their study distinguished between moderate overtime (one to eight hours) and high overtime (more than eight hours). The moderate overtime group had a higher mean score for anxiety, but not significantly higher. For the high overtime group, however, significantly higher levels of anxiety and depression were found. Watanabe, Torii, Shinkai, and Watanabe (1993) found similar results for symptoms of depression, as did Nishikitani, Nakao, Karita, Nomura, and Yano (2005).

Finally, an association between working overtime and weight gain was found by Nakamura et al. (1998). The study obtained data about height, weight and waist circumference from white-collar workers over a period of three years. The results showed a significant correlation between working overtime and change in body mass. As acknowledged in the study, irregular eating habits and less physical activity could be mediating factors in this relation. Even though there seems to be compelling evidence of the negative effects of working overtime, it is still a very common phenomenon, not only in the audit profession but in a wide variety of white-collar and blue-collar professions and everything in between. But what could be the reason for people to keep doing it?

2.2 Work-life balance

According to Guest (2008), work-life balance could be described as “a perceived balance between work and the rest of life”. This is a subjective definition because this ‘perceived balance’ could be different for every person. Due to differences in social life, one person could perceive life outside of work as very important, whereas others may perceive their work as much more important because of career prospects or because they do not have many friends apart from their colleagues. Factors like age and financial situation also have a significant influence on one’s acceptance of pressure at home and/or at work (Guest, 2008).

The most objective definition concerning work-life balance is the established maximum of 48 working hours defined by European legislation (Guest, 2008). But, considering the earlier mentioned subjectivity around this topic, one could ask whether an objective definition is desirable or appropriate.

There are multiple theories concerning the connection between work and life outside of work. For example, Staines (1980) compares the ‘spill over theory’ and the ‘compensation theory’ and concludes that both theories acknowledge that the two spheres are interrelated; in other words, work is influenced by one’s life outside of work and vice versa. Spill over theory assumes that emotions and feelings ‘spill over’ from work to home, which means that having a good day at work results in a good mood when coming home after work (Staines, 1980). Compensation theory assumes an opposite relation between the two spheres. The theory reasons that disappointment in one sphere leads to extra, compensating, efforts in the other sphere. For example, being unhappy about one’s love life could result in compensating for this disappointment by working more overtime hours to achieve better results at work (Staines, 1980). In his study, Staines (1980) found mixed evidence under different conditions, but concludes that the evidence for the spill over theory is slightly stronger.

To come back to the question asked at the end of paragraph 2.1: fear of losing your job, pressure from your superior to perform, and/or the diminishing boundaries between work and home seem to cause this working overtime phenomenon (Beckers, 2008). The diminishing boundaries addressed by Beckers (2008) closely align with ‘the border theory’ described by Clark (2008). According to Clark (2008), people cross the borders between the work domain and the family domain daily. In this setting, the term ‘family domain’ refers to everything outside of work. In her theory, Clark (2008) describes balance as satisfaction and good functioning at work and at home, with a minimum of role conflict. An imbalance arises when, for example, work has to be taken home due to pressure from a superior to finish it on time. When this happens regularly, the borders, or boundaries, between work and family diminish over time.

2.3 Overwork culture

Nowadays, graduates who want to start their careers at one of the big white-collar firms, such as audit or law firms, need a master’s degree to even get an invitation for a job interview. Graduates also preferably have high grades and some sort of management experience. According to Sturges, Guest, and Mackenzie Davey (2000), graduates have to take more responsibility for their own careers, because a job for life no longer exists or has at least become uncommon (Bridges, 1995; as cited in Sturges et al., 2000). This responsibility also gives room for making different choices. For example, the study of Sturges et al. (2000) also reveals that, when starting their careers, graduates hold their work-life balance, or work-social life balance, in high regard and do not want to give up their social life to work long hours for the firm. But, in order to keep their job, they are pressured to do so, whatever these pressures may be. The resulting imbalance is tolerated because they believe that they can control their work life and that the situation is only temporary (Sturges et al., 2000).

On the other hand, the research of Mazzetti, Schaufeli, and Guglielmi (2014) shows that employees who are exposed to an overwork climate work more overtime hours. Mazzetti et al. (2014) explain an overwork climate as management encouraging working overtime and expecting employees to comply with it. Workaholics, who thrive in this kind of climate, self-select into these firms (Mazzetti et al., 2014). Others may quit after a few years of high pressure, having stayed for the sole purpose of building a strong resume, which justifies their belief that the overwork is only temporary. This mix of workaholics and non-workaholics also fits the pyramid-shaped organisation structure, in which not every graduate can grow to the top level of the firm.

2.4 Senior performance evaluations

2.4.1 Objective performance evaluation

Management can measure and evaluate its employees in different ways. A widely used way is objective performance evaluation (Widener, Shackell, & Demers, 2008). Examples of objective performance measures are sales targets or revenue targets, amongst many others. These objective measures are solely focused on the performance, or the output, of an employee; the underlying input of the employee is often ignored. The objective performance measures are often linked to incentive contracts. In their study, Widener et al. (2008) found evidence for this complementary link between performance measures and incentives.

In his study about CEO compensation, Murphy (2013) found an increase in the amount of stock options granted to CEOs of S&P 500 firms for the period of 1992 to 2001. From 2001 to 2011 the amount of stock options decreased, but the amount of restricted stock increased at the same pace. Restricted stock implies that the stock is granted, but the CEO has to hold on to it for a certain amount of time before he or she can trade the stock. This way the CEO is incentivised to perform as well as possible (Murphy, 2013). This development shows a strong focus on performance.

2.4.2 Subjective performance evaluation

This study is mainly focused on the subjective performance evaluation. The advantage of subjective performance evaluation, compared to objective performance measures, is that everything which cannot be measured objectively could still be included in the performance evaluation (Prendergast & Topel, 1993). Examples of these advantages are the extent to which an employee helps other employees or takes on side activities which benefit the company indirectly. But, things that cannot be measured objectively are subject to subjective interpretation of the superior, which also has its downsides. For example, the discretion used in the evaluation leaves room for favouritism and biases (Prendergast & Topel, 1993).

Classifying overtime hours as an objective or a subjective performance measure may not be that easy. In principle, the hours of overtime worked could be measured objectively and be evaluated on, but in reality this is unlikely, since firms are not actively aiming to overwork their staff. It is much more likely that the extra effort an employee puts in by working extra hours would be rewarded in a subjective performance evaluation.

According to Baker, Gibbons, and Murphy (1994) using both objective- and subjective

2.4.3 Effort versus performance

This study investigates whether working overtime hours has a positive effect on the performance evaluation of a junior employee, but the underlying, more abstract, level of the study is about what managers evaluate on: to what extent they value effort (i.e. working overtime hours) and to what extent they value the performance of the employee (i.e. quality of work).

Effort and performance of an employee are two related constructs. In their study, Brown and Leigh (1996) found a relation between job involvement and job performance mediated by the amount of effort put in by the employee. The underlying theory for this relation according to Brown and Leigh (1996) is that when employees feel psychologically safe and find their job meaningful, they are more involved with their job, put in more effort, and perform better. This is different from the overwork climate described in paragraph 2.3: the overwork climate is based more on pressure and intimidation, whereas the relation found by Brown and Leigh (1996) is based on trust and inspiration. The relation found by Brown and Leigh (1996) is easy to understand, because putting in more effort normally leads to higher performance. But what happens to the performance evaluation when an employee puts in a lot of effort but still has lagging results?

Although a direct answer to that question has not been found in the existing literature, Klimoski and London (1974) researched the difference between self-, peer-, and supervisory ratings of performance. In their study they found differences between the information that was used to compose the ratings. But, more importantly, they found that the self-ratings and peer ratings made a distinction between effort and performance, whereas supervisory ratings did not distinguish between the two constructs.

The evidence from the study of Klimoski and London (1974) shows that management mainly focuses on the performance of the employee. An important explanatory factor for this focus on results is the link between performance measures and incentive contracts, described in paragraph 2.4.1. In addition, because of the increased pressure to meet the earnings guidance dictated by Wall Street, the focus on financial performance is very strong for many companies: if a company misses its earnings target, Wall Street punishes it by driving down the share price, which is the basis of most incentive contracts (Fuller & Jensen, 2010).

In conclusion, paragraphs 2.1, 2.2, and 2.3 gave an impression of the issues surrounding this topic and why this topic is important to examine. The development of the hypotheses of the study relies more on the theory described in paragraph 2.4. The hypotheses that were derived from the theory are included in the next section.

3. Hypothesis development

The experiment focuses on the effect of working overtime on performance evaluations of junior employees. The experiment is designed to measure whether seniors evaluate based on input, i.e. hours of overtime, or output, i.e. the quality of work. The explicit choice to focus on performance evaluations in general is based on the participants available for this experiment; more about the participants is explained in paragraph 4.2. A big audit firm is chosen as the setting for the case because working overtime is a known phenomenon within audit firms and annual performance evaluations play an important role in promoting junior employees. Being an auditor is normally considered a well-paid job, and overtime hours are considered to be already compensated in the salary. Because the participants have different levels of knowledge about audit firms, it was decided to describe the case extensively to provide a complete and realistic picture for all participants. The aim was to find a balance between the limited amount of time of the participants and the extensiveness of the case.

As described in the previous section, the study of Klimoski and London (1974) suggests that managers only pay attention to the performance of an employee, whereas the study of Prendergast and Topel (1993) shows that subjective performance measurement leaves room for including effort in the evaluation. Based on these studies, it is expected that good performers (i.e. employees who deliver a high quality of work) get higher evaluations and that putting in the extra effort has a slightly positive effect for them. Poor performers (i.e. employees who consistently deliver their work with too many mistakes) are expected to benefit more from the overtime hours they have put in, but to still receive lower evaluations given the strong focus on performance (Klimoski & London, 1974). In other words, a positive effect is expected for both groups, the good and the poor performers. This effect is expected to be stronger for the poor performers, so an interaction effect is expected. This effect is visualised in chart 1.

[Chart 1: expected interaction effect. Axis labels: input (low, high) and output (low, high).]

The study of Brown and Leigh (1996), described in the previous section, also showed the relation of an increased effort of the employee leading to higher performances. Also, spending more time on your job normally increases the amount of work you can get done, which could increase your profitability or improve customer satisfaction for the company, for example. Based on these theories two hypotheses have been derived. The first hypothesis is as follows:

H1: Working overtime has a positive effect on the performance evaluation of a junior employee.

One could argue that working overtime would always have a positive effect on the performance evaluation of a junior employee, suggesting that putting in more hours automatically leads to higher performance evaluations from your superiors, regardless of your competence for the job. Because of the strong focus on performance described by Klimoski and London (1974), a distinction is made between employees who perform well and employees who perform worse. The effect of working overtime therefore does not have to be the same for both groups. Prior research also established a negative relation between working overtime and the health of an employee. For instance, the study of Kleppa et al. (2008) shows that anxiety and depression are associated with working an excessive amount of overtime hours. Nakamura et al. (1998) found a relation between overweight and working overtime, and Hayashi et al. (1996) found, amongst others, that working overtime increases fatigue. So, even though working overtime seems to have a positive effect on performance evaluations, it also has negative effects, for instance on health.

Based on the distinction made by management between good- and poor performers at work, performance clearly has to be included into the study. So, the second hypothesis is formulated as follows:

H2: The quality of work (i.e. performance) moderates the effect of working overtime on the performance evaluation of a junior employee.
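One way to make the two hypotheses concrete is a linear model with an interaction term. The notation below is an illustrative assumption and not the specification reported in this thesis (the hypotheses could equally be tested with a two-way ANOVA): $E_i$ is the evaluation given by participant $i$, $OT_i$ equals 1 when Alex works more overtime than peers and 0 otherwise, and $Q_i$ equals 1 when the quality of work is good and 0 otherwise.

```latex
\[
E_i = \beta_0 + \beta_1\, OT_i + \beta_2\, Q_i + \beta_3\,(OT_i \times Q_i) + \varepsilon_i
\]
```

Under this sketch, the effect of overtime is $\beta_1$ for poor performers and $\beta_1 + \beta_3$ for good performers. H1 corresponds to a positive overtime effect, and H2 corresponds to $\beta_3 \neq 0$. The expectation that poor performers benefit more from overtime would show up as $\beta_3 < 0$.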

However, in reality there will always be some kind of ‘noise’. When a senior evaluates a junior employee, other factors such as favouritism or bias could play a role (Prendergast & Topel, 1993). A characteristic of the chosen research method is that this ‘noise’ is mitigated effectively, so the effects of the independent variables, i.e. working overtime and the quality of work, can be measured relatively well, which gives the study good internal validity.

4. Research design

4.1 Case based experiment

In the appendix of the document a predictive validity framework (“Libby boxes”) shows the conceptual relation between the variables in this study, i.e. working overtime, performance evaluation and the moderating variable: quality of work.

The relation between the independent, moderating, and dependent variable will be investigated by conducting a judgment and decision making experiment. In the experiment the variable of working overtime and the variable of the quality of the work will be manipulated in order to measure their effect on the performance evaluation.

The experiment is case-based, meaning that all participants are presented with the same case except for the variables which are manipulated. Intuitively, the effect of the independent variable (i.e. working overtime) on the dependent variable (i.e. performance evaluation) must be moderated by the results of an individual. In other words, when someone works overtime and delivers very good quality of work, the performance evaluation will most likely be positive. But when the same person works overtime and delivers very poor quality of work, one could expect the performance evaluation to be less positive. Including this moderating variable in the study meant that a 2 by 2 case experiment was conducted.
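The 2 by 2 design can be sketched in a few lines of Python. The level labels follow the wording of the case variations in paragraph 4.4. The `assign` helper is hypothetical and only illustrates simple random assignment; in the study itself the allocation was handled by Qualtrics, whose mechanism is not reproduced here.

```python
from itertools import product
import random

# Two manipulated factors with two levels each (labels follow paragraph 4.4).
overtime_levels = ["more overtime than peers", "less overtime than peers"]
quality_levels = ["good results", "bad results"]

# Crossing the factors yields the four case variations, numbered 1 to 4.
conditions = {
    number: {"overtime": overtime, "quality": quality}
    for number, (overtime, quality) in enumerate(
        product(overtime_levels, quality_levels), start=1
    )
}


def assign(participant_ids, seed=0):
    """Hypothetical helper: give each participant one random variation."""
    rng = random.Random(seed)
    return {pid: rng.choice(sorted(conditions)) for pid in participant_ids}


for number, levels in conditions.items():
    print(number, levels["overtime"], "/", levels["quality"])
```

Each participant sees exactly one of the four variations, which is what makes this a between-subjects design.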

The goal of conducting this case-based experiment is to establish a cause-effect relationship. Because the independent variable (i.e. working overtime) and the moderating variable (i.e. quality of work) are manipulated, other explanations for effects on the performance evaluation can be ruled out; in other words, the internal validity is strong. Another characteristic of this kind of experiment is the potential concern about external validity. I anticipated external validity concerns by presenting a well formulated, complete, and realistic case. Because of the participants available for the study, the case was slightly extended relative to the first draft to form a complete picture for all the participants.

Furthermore, the construct validity of the independent variables (i.e. working overtime and quality of work) used in this study was not an issue because the variables are very straightforward. Working overtime is operationalized as the employee either working or not working overtime. The moderating variable is operationalized by the number of errors made compared to colleagues with the same level of experience (e.g. fewer than peers/more than peers).

4.2 Participants

No ethical dilemmas or other issues that require specialized knowledge are part of this case experiment. Also emotions are omitted as much as possible in the case, to avoid influencing the thinking process.

For the experiment, the assumption is made that the cognitive abilities of auditors are not significantly different from those of a nationwide population of auditors and non-auditors. In other words, it is assumed that the average auditor has no different cognitive abilities than any other average person when it comes to non-specialized topics like a performance evaluation. A minimum sample size of 20 participants per condition should guarantee equal proportions of each participant characteristic in each condition. Therefore, the sample size of 106 people justifies the choice of using a mix of auditors and non-auditors as the participants for this experiment. In total, 127 people participated in the experiment. Of these, only 107 completed the whole experiment; the 20 incomplete responses were excluded from the analysis. One participant did not provide his or her consent; this response was also excluded from further analysis. Table 1 shows that the sample of 106 participants consisted of 61% male and 39% female respondents. The average age of the participants was 28.8 years.

Due to the manipulation checks, which were included in the questionnaire, another 25 responses had to be excluded from further analysis. The remaining 81 responses are used for testing the hypotheses of this study. Variations 1 to 4 received 20, 20, 20, and 21 usable responses, respectively. The manipulation checks are discussed further in paragraph 4.3.
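The exclusion steps can be traced with simple arithmetic. The short Python sketch below only restates the counts reported in this paragraph and the previous one; the variable names are my own.

```python
# Participant funnel; all counts are taken from the text of section 4.2.
started = 127
incomplete = 20       # did not complete the whole experiment
no_consent = 1        # did not provide consent
failed_checks = 25    # failed the manipulation checks (questions 2 and 3)

completed = started - incomplete
consented = completed - no_consent
analysed = consented - failed_checks

# Usable responses per case variation (1 to 4), as reported.
per_variation = {1: 20, 2: 20, 3: 20, 4: 21}

# The per-variation counts should add up to the analysed sample.
assert analysed == sum(per_variation.values())
print(completed, consented, analysed)
```

The demographics in Table 1 describe the 106 consenting participants, whereas the hypotheses are tested on the 81 responses that passed the manipulation checks.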

4.3 Procedures

The experiment itself consisted of roughly six steps. After providing informed consent to participate in the experiment, the participants were randomly assigned to one of the variations of the case; these variations are shown in paragraph 4.4. The random allocation was handled automatically by Qualtrics, the online program used for this study. Next, the participants read the instructions and the case scenario. In the fifth step the participants answered the questions, and they ended with the exit questionnaire. The questions that were asked are shown in paragraph 4.5. Manipulation checks were included within the questionnaire to check whether the manipulation was successful. The manipulation checks consisted of questions 2 and 3. These questions checked whether the provided information on the tested variables was read and understood. The number of participants who failed to give the right answers to questions 2 and 3 was 25. These participants were also removed from the sample and excluded from all further analyses. On average the participants needed 4 minutes and 33 seconds to complete the experiment.

            N     Minimum   Maximum   Mean     Std. Deviation
Age         106   18.00     66.00     28.774   10.951
Gender      106   0.00      1.00      0.613    0.489
N           106

Table 1 Respondent demographics

4.4 The case

The case is separated into two parts. The first part is the same for all participants. After this first part, four different variations arise, and each participant only gets to read one variation. The first part, which is identical for all participants, reads as follows:

You are one of the managers in a big audit company located on ‘De Zuidas’ in Amsterdam. In your firm, like any big audit firm, working overtime is no exception, it is like an unwritten rule. During the year you are responsible for multiple audit engagements which are carried out by your team. During the audits, you and your team visit the client to audit their books. One of the members of your team with which you do multiple engagements is Alex. Alex is a junior associate who is with the firm for almost two years now.

Annually, every employee of the firm is evaluated by its superiors. These subjective performance evaluations are the basis of the promotion decisions. Because you have worked a lot of time with Alex you are asked on your views on Alex. Your evaluation will weigh heavily because you are in the best position to evaluate Alex.

Next, the four variations are described:

Variation 1: working overtime and good results

Alex is a second year associate who joined the firm immediately after graduation. Alex is a very happy and socially person, has a partner and goes to the gym occasionally. Alex works full time and works mostly with clients in and around Amsterdam. Compared to the other associates, Alex works more overtime hours on a weekly basis. You know that Alex rarely makes mistakes and belongs to the top 20 associates of the firm. At this moment the firm counts 72 associates in total.

Variation 2: working overtime and bad results

Alex is a second year associate who joined the firm immediately after graduation. Alex is a very happy and socially person, has a partner and goes to the gym occasionally. Alex works full time and works mostly with clients in and around Amsterdam. Compared to the other associates, Alex works more overtime hours on a weekly basis. You know that Alex makes mistakes now and then and belongs to the bottom 20 associates of the firm. At this moment the firm counts 72 associates in total.

Variation 3: not working overtime and good results

Alex is a second year associate who joined the firm immediately after graduation. Alex is a very happy and socially person, has a partner and goes to the gym occasionally. Alex works full time and works mostly with clients in and around Amsterdam. Compared to the other associates, Alex works less overtime hours on a weekly basis. You know that Alex rarely makes mistakes and belongs to the top 20 associates of the firm. At this moment the firm counts 72 associates in total.

Variation 4: not working overtime and bad results

Alex is a second year associate who joined the firm immediately after graduation. Alex is a very happy and socially person, has a partner and goes to the gym occasionally. Alex works full time and works mostly with clients in and around Amsterdam. Compared to the other associates, Alex works less overtime hours on a weekly basis. You know that Alex makes mistakes now and then and belongs to the bottom 20 associates of the firm. At this moment the firm counts 72 associates in total.

For the case, the name ‘Alex’ is used because it is suitable for both sexes. This is intended to avoid gender-based bias, because every participant could imagine either a man or a woman. In the translated version of the case, ‘Alex’ is replaced with ‘Robin’, because in Dutch ‘Alex’ is typically a boy's name. Multiple studies find evidence of rating adjustments based on age, race, or other demographic characteristics (Pulakos, White, Oppler, & Borman, 1989; Tsui & O'Reilly, 1989). Therefore, no information on these topics is provided, so that ratings are not adjusted on the basis of anything other than the manipulated variables. The translated version of the case is included in appendix 3.

4.5 Case questions

After reading the case, all participants got the same set of questions. All the questions that were asked during the experiment are the following:

1. How would you evaluate the performance of Alex?

2. Did Alex put in more overtime hours than his/her peers?

3. Did Alex make more mistakes compared to his/her peers?

4. Do you agree that employees who put in more effort, perform better?

5. How do you consider a workweek of 48 hours or more?

6. Generally, do you work a lot of overtime hours every week at your current job?

7. Do you work in a(n) company/industry where working overtime is seen as the normal thing to do?

8. Are you your own boss or do you work as an employee? When both apply, choose the one most dominant in your situation.

9. What is your gender?

10. What is your age?

4.6 Analysis

A Two-Way ANOVA (analysis of variance) is used to analyse the data. Because the study is based on two categorical independent variables (working overtime and quality of work) and one continuous dependent variable (performance evaluation), the two-way factorial analysis of variance is the most suitable for this case-based experiment. Apart from the two main effects, the interaction effect between the two independent variables can also be measured. The hypothesized interaction effect in this experiment is that, when it comes to their performance evaluations, employees who deliver good-quality work benefit less from working overtime hours than those who deliver work with a lot of mistakes.

4.6.1 Assumption testing

In order to know whether the conclusions based on the dataset are valid, a few assumptions have to be met. See the assumptions below.

Dependent variable is at least interval

The dependent variable (i.e. performance evaluation) is measured by a 7-point Likert scale with an inherent order, equal intervals, and without a natural zero point. So, the variable qualifies as interval data, meaning the first assumption was satisfied.

Independent observations

The second assumption of independent observations is satisfied by the design of the study. All participants answered the questions individually without consulting other participants beforehand. Because the experiment was run in the online software Qualtrics, participants were not aware that different variations existed to test the differences in response. Also, participants were not aware of the answers other participants gave. This way the independence of all responses was secured.

Dependent variable should have a normal distribution in all groups

The four variations were all tested for normality with the Kolmogorov-Smirnov test. The significant score for every variation indicates a deviation from normality. See tables 2a to 2d, which report significance scores of 0,000, 0,001, 0,000, and 0,003 for variations 1 to 4 respectively.

Tables 2a to 2d Tests of Normality, Kolmogorov-Smirnov (a)

| Table | Performance evaluation score | Statistic | df | Sig. |
|---|---|---|---|---|
| 2a | Variation 1 | 0,311 | 20 | 0,000 |
| 2b | Variation 2 | 0,259 | 20 | 0,001 |
| 2c | Variation 3 | 0,346 | 20 | 0,000 |
| 2d | Variation 4 | 0,240 | 21 | 0,003 |

a. Lilliefors Significance Correction

In order to understand the cause of this deviation from normality, the z-scores of the skewness and kurtosis values were calculated by dividing each value by its standard error. For variation 1, the z-scores of skewness and kurtosis are respectively -1,39 (-,712/,512) and -0,449 (-,446/,992). Neither z-score exceeds 1,96 in absolute value, meaning they are not significant.

The same has been done for variations 2 to 4. The z-scores for variation 2 for skewness and kurtosis are respectively -0,533 (-,273/,512) and -1,215 (-1,205/,992). Again, neither exceeds 1,96 in absolute value, meaning they are not significant. The z-scores for variation 3 for skewness and kurtosis are respectively -1,43 (-,736/,512) and 0,579 (,574/,992). Again, both numbers are not significant. Lastly, the z-scores for variation 4 for skewness and kurtosis are respectively -0,485 (-,243/,501) and -1,361 (-1,323/,972), also not significant. So, there was no significant skewness or kurtosis for any of the variations.
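The z-score check above can be reproduced with a few lines of Python. This is a minimal sketch that only uses the skewness and kurtosis values and standard errors reported in the text. The function and variable names are illustrative and not part of the original analysis, which was presumably run in a statistics package.

```python
# (statistic, standard error) per variation, as reported in the text.
reported = {
    1: {"skewness": (-0.712, 0.512), "kurtosis": (-0.446, 0.992)},
    2: {"skewness": (-0.273, 0.512), "kurtosis": (-1.205, 0.992)},
    3: {"skewness": (-0.736, 0.512), "kurtosis": (0.574, 0.992)},
    4: {"skewness": (-0.243, 0.501), "kurtosis": (-1.323, 0.972)},
}

CRITICAL_Z = 1.96  # two-tailed critical value at the 5% level


def z_score(statistic, standard_error):
    """Return the statistic divided by its standard error."""
    return statistic / standard_error


def is_significant(z, critical=CRITICAL_Z):
    """A z-score is significant when its absolute value exceeds the critical value."""
    return abs(z) > critical


# z-scores for every variation and measure.
results = {
    variation: {name: z_score(*pair) for name, pair in measures.items()}
    for variation, measures in reported.items()
}

for variation, zs in results.items():
    for name, z in zs.items():
        flag = "significant" if is_significant(z) else "not significant"
        print(f"Variation {variation} {name}: z = {z:.3f} ({flag})")
```

Small differences in the last decimal compared to the reported z-scores can occur, because the inputs used here are already rounded.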

Another potential threat to the usability of the data could be the existence of outliers. Based on the created boxplots, no outliers were detected within any of the variations. So, based on the findings with regard to the skewness, kurtosis, and potential outliers, it is concluded that the deviation from normality is not too severe and therefore not problematic.

The variance in all groups is equivalent

The homogeneity of variances is tested with Levene’s test. The significant score of 0,022 for Levene’s test based on the mean indicates a significant difference between the variances of the four variations. Based on the median, the non-significant score of 0,116 indicates no significant difference between the variances of the four variations. Normally, the use of the median is slightly preferable, because it is less biased by outliers. But, as discussed earlier, there are no outliers in the data.

Table 3 Test of Homogeneity of Variance: Performance evaluation score

| | Levene Statistic | Sig. |
|---|---|---|
| Based on Mean | 3,389 | 0,022 |
| Based on Median | 2,033 | 0,116 |
| Based on Median and with adjusted df | 2,033 | 0,119 |
| Based on trimmed mean | 3,292 | 0,025 |

So, according to the significant score for Levene’s test based on the mean, the assumption of equal variances between all groups is violated. But, according to Zimmerman (2004), unequal group sizes are much more of a problem than potential heterogeneity of variance. Moreover, multiple studies show that sample variances do not necessarily correspond to the population parameters, which limits the value of a preliminary test such as Levene’s test. That is why tests for homogeneity of variances are no longer widely recommended by statisticians, according to Zimmerman (2004). Because this study has practically equal group sizes, the significant score of Levene’s test based on the mean is ignored.

In this section the design of the study is described, the analysis is explained, and the assumptions are tested. The results of the hypothesis tests are described in the next section.

5. Results

5.1 Main results

After testing the assumptions, as described in the previous section, a Two-Way ANOVA was carried out to compare the mean scores of the different variations, in order to test the two hypotheses of the study. The results are shown in table 4a, below.

The table shows the mean scores of the performance evaluations per variation, as well as the total mean scores for high and low input. A clear difference shows between the mean scores of employees who put in more overtime hours than their peers and employees who put in fewer overtime hours. The scores for these groups are respectively 5,48 and 4,76, indicating that the employees who put in more overtime hours score 15% higher in their performance evaluation. This first indication supports the first hypothesis of this study (i.e. working overtime has a positive effect on the performance evaluation of a junior employee), but it is not statistically valid proof. The Two-Way ANOVA does provide statistical evidence and is discussed in paragraph 5.1.1.
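The two group means and the 15% difference follow directly from the cell means and group sizes in table 4a. The sketch below recomputes them as size-weighted averages. The dictionary layout and the function name are illustrative assumptions, and the inputs are the rounded values from the table.

```python
# (mean, n) per cell, taken from table 4a.
cells = {
    ("overtime", "high quality"): (6.40, 20),     # variation 1
    ("overtime", "low quality"): (4.55, 20),      # variation 2
    ("no overtime", "high quality"): (5.70, 20),  # variation 3
    ("no overtime", "low quality"): (3.86, 21),   # variation 4
}


def weighted_mean(level):
    """Size-weighted mean of all cells that share the given overtime level."""
    selected = [(m, n) for (overtime, _), (m, n) in cells.items() if overtime == level]
    total_n = sum(n for _, n in selected)
    return sum(m * n for m, n in selected) / total_n


mean_overtime = weighted_mean("overtime")
mean_no_overtime = weighted_mean("no overtime")

# Relative difference of the overtime group compared to the no-overtime group.
relative_difference = (mean_overtime - mean_no_overtime) / mean_no_overtime

print(f"Overtime: {mean_overtime:.3f}")
print(f"No overtime: {mean_no_overtime:.3f}")
print(f"Relative difference: {relative_difference:.1%}")
```

The weighted means come out at roughly 5,48 and 4,76, and the relative difference at roughly 15%, in line with the figures in the text.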

5.1.1 Hypothesis 1 test

Table 4a Main results

| Working overtime | Quality of work | Variation | Mean | Std. Deviation | N |
|---|---|---|---|---|---|
| Yes | High | 1 | 6,40 | 0,681 | 20 |
| Yes | Low | 2 | 4,55 | 1,099 | 20 |
| Yes | Total | | 5,48 | 1,301 | 40 |
| No | High | 3 | 5,70 | 0,801 | 20 |
| No | Low | 4 | 3,86 | 1,062 | 21 |
| No | Total | | 4,76 | 1,319 | 41 |
| Total | | | | | 81 |

Table 4b

| Variable | Type III Sum of Squares | Mean Square | F | Sig. |
|---|---|---|---|---|
| Working overtime | 9,817 | 9,817 | 11,364 | 0,001 |
| Quality of work | 69,007 | 69,007 | 79,878 | 0,000 |
| Working overtime * Quality of work | 0,000 | 0,000 | 0,000 | 0,986 |

Table 4b, above, shows a part of the ANOVA output and provides statistical proof for a significant difference in mean scores between the group of employees who work more overtime hours than their peers and the group of employees who work fewer overtime hours than their peers. Based on the significant score of ,001 for the ‘working overtime’ variable, it can be concluded that the variable has a significant influence on the performance evaluation of an employee. So, the means of the two groups differ significantly, and therefore hypothesis one is supported.

Something which also becomes very clear from table 4a is that employees who make fewer mistakes than their peers get higher performance evaluations. The employees who perform well and work overtime hours, and the employees who perform well but work fewer overtime hours, score respectively 40,7% ((6,40-4,55)/4,55) and 47,7% ((5,70-3,86)/3,86) higher than their peers who perform worse, i.e. who make more mistakes. This finding is to be expected, because separating high and poor performers is the reason performance evaluations exist in the first place. It was also hypothesized in section 3 and can be seen clearly in chart 1, where the line of high performers was higher regardless of the quantity of overtime hours. Chart 2, below, which is part of the Two-Way ANOVA output, shows a similar picture.

5.1.2 Hypothesis 2 test

Hypothesis two (i.e. the quality of work (i.e. performance) moderates the effect of working overtime on the performance evaluation of a junior employee) assumed the existence of an interaction effect. However, chart 2 indicates that the hypothesized interaction effect did not occur, meaning the effect of working overtime did not differ between the high and poor performing employees. This is expressed in chart 2 by the two parallel lines. In chart 1, which depicted the hypothesized interaction effect, the two lines converged instead.
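The parallel lines can also be read from the cell means in table 4a: an interaction would show up as a difference between the overtime effect for high performers and the overtime effect for poor performers. The sketch below computes this difference of differences. The variable names are illustrative, and the check is a descriptive one that does not replace the ANOVA.

```python
# Mean performance evaluation per variation, from table 4a.
means = {1: 6.40, 2: 4.55, 3: 5.70, 4: 3.86}

# Effect of working overtime within each quality level.
overtime_effect_high_quality = means[1] - means[3]  # about 0.70
overtime_effect_low_quality = means[2] - means[4]   # about 0.69

# Difference of differences: a value near zero means (nearly) parallel lines.
interaction_contrast = overtime_effect_high_quality - overtime_effect_low_quality

print(f"Overtime effect, high quality: {overtime_effect_high_quality:.2f}")
print(f"Overtime effect, low quality: {overtime_effect_low_quality:.2f}")
print(f"Interaction contrast: {interaction_contrast:.2f}")
```

A contrast of about 0,01 on a 7-point scale is consistent with the non-significant interaction term reported for the Two-Way ANOVA.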

Table 4c, below, shows another part of the same output of the Two-Way ANOVA. The table shows that the variables working overtime and quality of work are individually significant, based on their scores of respectively ,001 and ,000. However, their interaction, shown in the bottom row of the table, has a non-significant (,986) effect on the performance evaluation of an employee.

Therefore, based on table 4c, it can be concluded that hypothesis two is not supported and therefore must be rejected. Whether this is due to limitations of the study, or whether such an interaction effect does not exist in reality, will be discussed in section 6.
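As a plausibility check, the F-values in the ANOVA output can be approximated from the summary statistics alone. In a full factorial Two-Way ANOVA the error mean square equals the pooled within-cell variance, and each F-value is the mean square of the effect divided by that error mean square. The sketch below is a reconstruction under that assumption, using the rounded standard deviations from table 4a and the mean squares from the ANOVA table. The names are illustrative.

```python
# (standard deviation, n) per variation, from table 4a.
cells = {1: (0.681, 20), 2: (1.099, 20), 3: (0.801, 20), 4: (1.062, 21)}

# Mean squares of the two main effects, from the ANOVA output (1 df each).
mean_squares = {
    "Working overtime": 9.817,
    "Quality of work": 69.007,
}

# Pooled within-cell variance, which serves as the error mean square.
error_ss = sum((n - 1) * sd ** 2 for sd, n in cells.values())
error_df = sum(n for _, n in cells.values()) - len(cells)
error_ms = error_ss / error_df

# F = mean square of the effect / error mean square.
f_values = {name: ms / error_ms for name, ms in mean_squares.items()}

print(f"Error df: {error_df}, error mean square: {error_ms:.4f}")
for name, f in f_values.items():
    print(f"{name}: F = {f:.2f}")
```

The reconstructed values land close to the reported F-values of 11,364 and 79,878. The small deviations are expected, because the standard deviations used here are rounded to three decimals.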

5.2 Other correlations

So far the results were explicitly focused on the mean scores of the performance evaluations in combination with the different variations. This was the main focus of the study and is therefore discussed in paragraph 5.1. In this paragraph the other interesting correlations between questions 4 to 10 are discussed. There was no significant correlation between question 1 and any of the other questions, therefore it is excluded from the correlation table. Table 5, below, shows the correlation table. Two-tailed significant correlations are marked with an asterisk. The table consists of two parts: the Pearson correlations appear above the diagonal, and the non-parametric Spearman correlations appear below the diagonal.
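To illustrate the difference between the two coefficients in the table: the Pearson correlation measures linear association on the raw values, whereas the Spearman correlation is the Pearson correlation of the ranks, with tied values sharing the average of their ranks. The sketch below implements both with the standard library only. The example data are hypothetical and not taken from the study, and the function names are illustrative.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical responses (NOT from the study): age and a 7-point agreement score.
age = [19, 23, 24, 31, 45, 66]
agreement = [3, 4, 4, 5, 5, 5]

print(f"Pearson: {pearson(age, agreement):.3f}")
print(f"Spearman: {spearman(age, agreement):.3f}")
```

Because the two coefficients respond differently to outliers, ties, and non-linear relations, a pair of questions can be significant under one coefficient and not under the other.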

Table 4c

| Variable | Type III Sum of Squares | Mean Square | F | Sig. |
|---|---|---|---|---|
| Working overtime | 9,817 | 9,817 | 11,364 | 0,001 |
| Quality of work | 69,007 | 69,007 | 79,878 | 0,000 |
| Working overtime * Quality of work | 0,000 | 0,000 | 0,000 | 0,986 |

Table 5 Correlation table (a)

Q4: People who work more overtime hours perform better?
Q5: What is your opinion on a 48+ hour workweek?
Q6: Do you work a lot of overtime hours?
Q7: Do you work at a company or industry where overwork is considered normal?
Q8: Are you your own boss or an employee?
Q9: Gender
Q10: Age

| | | Q4 | Q5 | Q6 | Q7 | Q8 | Q9 | Q10 |
|---|---|---|---|---|---|---|---|---|
| Q4 | Pearson Correlation | 1 | -0,172 | -0,056 | 0,062 | -0,234 | -0,024 | ,243* |
| | Sig. (2-tailed) | | 0,126 | 0,616 | 0,613 | 0,053 | 0,831 | 0,029 |
| Q5 | Pear/Spear Correlation | -0,174 | 1 | -0,071 | ,244* | ,285* | ,278* | -0,003 |
| | Sig. (2-tailed) | 0,120 | | 0,530 | 0,044 | 0,018 | 0,012 | 0,976 |
| Q6 | Pear/Spear Correlation | -0,044 | -0,122 | 1 | -,429** | -,336** | 0,103 | 0,035 |
| | Sig. (2-tailed) | 0,699 | 0,279 | | 0,000 | 0,005 | 0,359 | 0,754 |
| Q7 | Pear/Spear Correlation | 0,086 | ,254* | -,420** | 1 | ,313** | 0,231 | -0,084 |
| | Sig. (2-tailed) | 0,484 | 0,035 | 0,000 | | 0,009 | 0,056 | 0,493 |
| Q8 | Pear/Spear Correlation | -0,179 | ,295* | -,286* | ,274* | 1 | 0,181 | -0,219 |
| | Sig. (2-tailed) | 0,141 | 0,014 | 0,017 | 0,023 | | 0,137 | 0,070 |
| Q9 | Pear/Spear Correlation | -0,034 | ,269* | 0,093 | 0,219 | 0,176 | 1 | 0,082 |
| | Sig. (2-tailed) | 0,763 | 0,015 | 0,410 | 0,070 | 0,147 | | 0,469 |
| Q10 | Spearman Correlation | 0,052 | 0,052 | 0,024 | -0,087 | -0,213 | 0,214 | 1 |
| | Sig. (2-tailed) | 0,643 | 0,645 | 0,831 | 0,475 | 0,078 | 0,055 | |

a. Non-parametric Spearman correlations appear below the diagonal, Pearson correlations appear above the diagonal

*. Correlation is significant at the 0.05 level (2-tailed).

**. Correlation is significant at the 0.01 level (2-tailed).

Looking at table 5, above, questions 6 and 7 correlate significantly at the ,01 level for both the Pearson correlation and the non-parametric Spearman correlation. Their scores are respectively -,429 and -,420. These scores mean that when a person does not work for a company, or in an industry, where working overtime is seen as the normal thing to do, they also tend to work fewer overtime hours. This also works in the opposite direction: when a person does work for a company where working overtime is deemed to be normal, he or she tends to work more overtime hours. Due to the quality of the data, it is not possible to analyse the quantity of overtime hours when working overtime is deemed to be normal.

This finding matches the results from the study of Mazzetti, Schaufeli, and Guglielmi (2014). As discussed in paragraph 2.3, their research showed that when an employee gets exposed to an overwork climate, he or she will be working more overtime hours.

Another significant correlation exists between questions 6 and 8. With the Pearson correlation they correlate at the ,01 level, with a score of -,336. For the non-parametric Spearman correlation, the two questions correlate at the ,05 level, with a score of -,286. These scores mean that when someone is not an employee but is his or her own boss (i.e. an entrepreneur), he or she tends to work more hours. In line with this finding is the significant correlation between questions 5 and 8, at the ,05 level for the Pearson correlation as well as the non-parametric Spearman correlation, of respectively ,285 and ,295. These numbers show that entrepreneurs tend to see a 48-hour workweek as normal more often than employees do.

In line with the previous findings is the correlation between questions 7 and 8. For the Pearson correlation they correlate at the ,01 level, with a score of ,313. For the non-parametric Spearman correlation, the score of ,274 is significant at the ,05 level. The numbers suggest that entrepreneurs intrinsically feel that working more than 40 hours a week is normal.

The last interesting correlation from the correlation table is the correlation between question 4 and question 10. These questions only correlate significantly for the Pearson correlation, not for the non-parametric Spearman correlation. The scores are respectively ,243 and ,052. This correlation is not as strong as the correlations discussed earlier, because according to the non-parametric Spearman test the two questions are not significantly correlated. Still, the data indicate that the older the respondents are, the more they agree with the statement of question 4 (i.e. Do you agree that employees who put in more effort, perform better?). This means that older people value input higher than younger people do. The quality of the data is not high enough to conclude anything from this single finding, but it could indicate a changing perspective on input, and even on work in general, between different generations. This could be very interesting to study in future research. More about future research will be discussed in section six.

5.3 Additional tests

5.3.1 Robustness test

After the analysis of the results, an additional test was carried out to test the robustness of the results. The control variable gender was added to the analysis to test whether the positive effect of working overtime on the performance evaluation holds when this variable is added. It could be that there is a difference in perspective on working overtime between men and women. Men, for instance, are more masculine and therefore could have a stronger focus on performance, whereas women could have a stronger focus on input (i.e. working overtime) because of their feminine nature.

Table 6 Robustness test: Gender

| Variable | F | Sig. |
|---|---|---|
| Working overtime | 9,855 | 0,002 |
| Quality of work | 75,017 | 0,000 |
| Gender | 2,700 | 0,105 |
| Working overtime * Quality of work | 0,000 | 0,988 |
| Working overtime * Gender | 1,204 | 0,276 |
| Quality of work * Gender | 0,743 | 0,391 |
| Working overtime * Quality of work * Gender | 1,572 | 0,214 |

The last column of table 6 (i.e. Sig.), above, shows that the input and output variables are significant. The scores of input (working overtime) and output (quality of work) are respectively 0,002 and 0,000. Gender and all interaction terms are not significant, because their scores are all higher than 0,05.

5.3.2 Analysis of manipulation failures

As discussed in paragraph 4.2, the questionnaire contained a manipulation check to test whether the participant read and understood the case. The manipulation check consisted of two questions: question 2 (i.e. Did Alex put in more overtime hours than his/her peers?) and question 3 (i.e. Did Alex make more mistakes compared to his/her peers?). If one or both questions of the manipulation check were answered incorrectly, the participant did not read or understand the important part of the case, and therefore the manipulation did not work for that participant.

In total, 25 participants answered one or both questions incorrectly. Divided over the four variations, variations 1 to 4 respectively had 7, 4, 3, and 11 manipulation failures. Expressed as percentages of the 25 failures, this is respectively 28%, 16%, 12%, and 44%. A clear separation is visible between variations 1 and 4 on one side and variations 2 and 3 on the other. A possible explanation for the large difference in manipulation failures between the variations could be that the correct answers for variations 2 and 3 were respectively double yes and double no, whereas participants in variations 1 and 4 had to answer respectively yes-no and no-yes, which requires more thinking from the participant and could therefore be more prone to mistakes.

To test the effect of eliminating the manipulation failures, an additional test was carried out with the full sample of 106 (81+25) participants. Similar to the main analysis of the study, a Two-Way ANOVA was carried out. First, the original mean scores of the performance evaluation per variation were compared to the mean scores that include the manipulation failure responses. The results of this comparison are shown in table 7a, below.

As shown in table 7a, the differences in mean scores are relatively small, which indicates that the elimination of the manipulation failures does not affect the results of the study. Table 7b supports this conclusion. Based on the significant score of ,000 for the ‘working overtime’ variable, it can be concluded that the performance evaluations of employees who work more overtime hours than their peers still differ significantly from the performance evaluations of employees who do not. This finding aligns with the conclusion made in paragraph 5.1, so the conclusion on hypothesis one does not change when the manipulation failures are included in the sample.

The main analysis of the study did not find a significant interaction effect between the two variables (i.e. working overtime and quality of work). Based on the non-significant score of ,326 in table 7b, the same can be concluded when the manipulation failures are included in the analysis. So, the conclusion on hypothesis two also does not change when the manipulation failures are included in the sample.

Table 7a Comparison of means

| Variation | Variable | N | Mean | Original Mean | Difference |
|---|---|---|---|---|---|
| 1 | Performance evaluation score | 27 | 6,22 | 6,40 | 0,18 |
| 2 | Performance evaluation score | 24 | 4,50 | 4,55 | 0,05 |
| 3 | Performance evaluation score | 23 | 5,70 | 5,70 | 0,00 |
| 4 | Performance evaluation score | 32 | 4,09 | 3,86 | -0,24 |

Table 7b Tests of Between-Subjects Effects

| Variable | Type III Sum of Squares | Mean Square | F | Sig. |
|---|---|---|---|---|
| Working overtime | 17,702 | 17,702 | 17,155 | 0,000 |
| Quality of work | 54,771 | 54,771 | 53,076 | 0,000 |
| Working overtime * Quality of work | 1,003 | 1,003 | 0,972 | 0,326 |
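The counts in this paragraph can be cross-checked against each other with simple arithmetic. The sketch below uses only the group sizes, failure counts, and means reported in the text and tables. The variable names are illustrative. In addition to the share of all 25 failures reported in the text, it computes the failure rate within each variation, which is a supplementary view and not a figure from the original analysis.

```python
retained = {1: 20, 2: 20, 3: 20, 4: 21}  # group sizes in the main analysis
failures = {1: 7, 2: 4, 3: 3, 4: 11}     # manipulation failures per variation

total_failures = sum(failures.values())

# Share of all failures that falls in each variation (as reported in the text).
share_of_failures = {v: f / total_failures for v, f in failures.items()}

# Group sizes when the failures are included again (column N of table 7a).
full_sample = {v: retained[v] + failures[v] for v in retained}

# Supplementary view: failure rate within each variation.
rate_within_variation = {v: failures[v] / full_sample[v] for v in failures}

# Difference between the original means and the means of the full sample.
original_mean = {1: 6.40, 2: 4.55, 3: 5.70, 4: 3.86}
full_sample_mean = {1: 6.22, 2: 4.50, 3: 5.70, 4: 4.09}
difference = {v: original_mean[v] - full_sample_mean[v] for v in original_mean}

for v in sorted(failures):
    print(
        f"Variation {v}: n = {full_sample[v]}, "
        f"share of failures = {share_of_failures[v]:.0%}, "
        f"rate within variation = {rate_within_variation[v]:.1%}, "
        f"difference in means = {difference[v]:+.2f}"
    )
print(f"Total sample: {sum(full_sample.values())}")
```

For variation 4 the difference computed from the rounded means is -0,23, whereas table 7a reports -0,24. Presumably the table was computed from unrounded means.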

6. Conclusion

The study examines the effect of working overtime on the performance evaluation of a junior employee, and whether this effect differs between high performing employees and poor performing employees. On a more abstract level, the study is about input (i.e. working overtime hours) and output (i.e. performance of the employee), and which of the two is valued higher by managers.

The study was motivated by the annual survey of Accountant and Alterim (2017), which found that accountants work 8.3 hours of overtime per week on average. As discussed in paragraph 2.3, employees tend to work more overtime when they are exposed to an overwork climate (Mazzetti, Schaufeli, & Guglielmi, 2014), and the accountancy profession would be a textbook example of such a climate. Also, almost 40% of the accountants are unsatisfied with their work-life balance, according to the same survey. So, it is important to gain more insight into the imbalance between making a career and life outside of work.

The method used for this study is a case-based experiment. Characteristic of this kind of research is the controllability of the manipulated variables, so the results of the experiment are very clean. But, also due to this specific design of the study, external validity might be limited. To the best of my knowledge, there are no other studies examining this topic with this particular method.

The collected data clearly shows support for the first hypothesis of this study (i.e. working overtime has a positive effect on the performance evaluation of a junior employee). Chart 2, in paragraph 5.1, shows that the high performing employees as well as the poor performing employees benefit from working more overtime hours, as their performance evaluations get higher. But working overtime does not solely have positive effects. For instance, health could suffer from working excessive overtime hours.

The interaction effect, which was hypothesized in hypothesis two, was not supported by the data. This means that the effect of working more overtime hours did not differ between the group of high performing employees and the group of poor performing employees. The reason for the absence of the interaction effect could be the small sample and the relatively young average age of the participants. These caveats are discussed in the next paragraph.

It is important for junior employees to take note of these findings, because when starting their careers they have to make hard choices between their social life and their careers. Working overtime hours at the cost of their social life is a hard decision to make, but now it can be based on statistical evidence of the pros and cons.

6.1 Discussion

When interpreting the results of the study, it is important to bear the caveats of the study in mind. The most important caveats are described below.

As mentioned above, due to the specific setting of the case the external validity might be limited. This limitation was anticipated by making the case as realistic as possible. Also, due to the big differences amongst the respondents in knowledge about audit firms, the case was slightly extended to provide every respondent with enough information to form a complete and realistic picture for themselves.

Another caveat of the study is the low average age of the participants of only 28,8 years. The last finding of paragraph 5.2 indicates a possible difference in perspective between younger and older people. Because the average age of managers in business is probably higher, this could affect the results.

In addition to the previous point, it is acknowledged that the small sample size, in combination with the low average age, may affect the results. Future research could establish the real effect of this caveat. More about future research is described in the next paragraph.

6.2 Future research

Given that almost 40% of accountants are not satisfied with their work-life balance, it is very important to further investigate the topic of working overtime and work-life balance to improve the insights in this matter. If it is such a problem in one profession, it is likely to be a problem in other professions as well. So, in order to improve the lives of many people, future research should focus on this area.

Building on this study, future studies could increase the sample size and raise the average age of the participants compared to this study, because, as discussed earlier, these could affect the results. Also, the indication of a change in perspective on input, and maybe even on work in general, could be an interesting path for future research.

Lastly, future studies should try to create more realism for the respondents to get higher quality results. For instance, by expanding the case and providing more information.

References

Accountant, & Alterim. (2017). Accountancy beloningsonderzoek 2017. Retrieved from https://www.accountant.nl/globalassets/accountant.nl/beloningsonderzoeken/beloningsonderzoek_2017.pdf

Baker, G., Gibbons, R., & Murphy, K. J. (1994). Subjective performance measures in optimal incentive contracts. The Quarterly Journal of Economics, 109(4), 1125-1156. doi:10.2307/2118358

Beckers, D. G. J., Van der Linden, D., Smulders, P. G. W., Kompier, M. A. J., Taris, T. W., & Geurts, S. A. E. (2008). Voluntary or involuntary? Control over overtime and rewards for overtime in relation to fatigue and work satisfaction. Work & Stress, 22(1), 33-50. doi:10.1080/02678370801984927

Beckers, D. G. J. (2008). Overtime work and well-being: Opening up the black box (thesis). Retrieved from https://www.researchgate.net/profile/Debby_Beckers/publication/254856660_Overtime_work_and_well-being_opening_up_the_black_box/links/577ba2d108aec3b743366048.pdf#page=23

Brown, S. P., & Leigh, T. W. (1996). A new look at psychological climate and its relationship to job involvement, effort, and performance. Journal of Applied Psychology, 81(4), 358-368. doi:10.1037/0021-9010.81.4.358

Clark, S. C. (2000). Work/family border theory: A new theory of work/family balance. Human Relations, 53(6), 747-770. doi:10.1177/0018726700536001

Fuller, J., & Jensen, M. C. (2010). Just say no to Wall Street: Putting a stop to the earnings game. Journal of Applied Corporate Finance, 22(1), 59-63. doi:10.1111/j.1745-6622.2010.00261.x

Guest, D. E. (2002). Perspectives on the study of work-life balance. Social Science Information, 41(2), 255-279. doi:10.1177/0539018402041002005

Hayashi, T., Kobayashi, Y., Yamaoka, K., & Yano, E. (1996). Effect of overtime work on 24-Hour ambulatory blood pressure. Journal of Occupational and Environmental Medicine, 38(10), 1007-1011. doi:10.1097/00043764-199610000-00010

Iwasaki, K., Sasaki, T., Oka, T., & Hisanaga, N. (1998). Effect of working hours on biological functions related to cardiovascular system among salesmen in a machinery manufacturing company. Industrial Health, 36(4), 361-367. doi:10.2486/indhealth.36.361

Kleppa, E., Sanne, B., & Tell, G. S. (2008). Working overtime is associated with anxiety and depression: The Hordaland Health Study. Journal of Occupational and Environmental Medicine / American College of Occupational and Environmental Medicine, 50, 658-666. doi:10.1097/JOM.0b013e3181734330

Klimoski, R. J., & London, M. (1974). Role of the rater in performance appraisal. Journal of Applied Psychology, 59(4), 445-451. doi:10.1037/h0037332

Kodz, J., Davis, S., Lain, D., Strebler, M., Rick, J., Bates, P., . . . Pamer, S. (2003). Working long hours: A review of the evidence. Volume 1 - main report (Employment relations research series no.16). Retrieved from https://www.researchgate.net/profile/Nigel_Meager/publication/269101499_Working_long_hours_a_review_of_the_evidence_Volume_1_-_Main_report/links/548062350cf250f1edc131a4.pdf

Kompier, M. A. J. (2006). New systems of work organization and workers’ health. Scandinavian Journal of Work, Environment & Health, 32(6), 421-430. doi:10.5271/sjweh.1048

Mazzetti, G., Schaufeli, W. B., & Guglielmi, D. (2014). Are workaholics born or made? Relations of workaholism with person characteristics and overwork climate. International Journal of Stress Management, 21(3), 227-254. doi:10.1037/a0035700

Murphy, K. J. (2013). Executive compensation: Where we are, and how we got there. In G. M. Constantinides, M. Harris, & R. M. Stulz (Eds.), Handbook of the economics of finance (pp. 211-356). doi:10.1016/B978-0-44-453594-8.00004-5

Nakamura, K., Shimai, S., Kikuchi, S., Takahashi, H., Tanaka, M., Nakano, S., . . . Yamamoto, M. (1998). Increases in body mass index and waist circumference as outcomes of working overtime. Occupational Medicine, 48(3), 169-173. doi:10.1093/occmed/48.3.169

Nishikitani, M., Nakao, M., Karita, K., Nomura, K., & Yano, E. (2005). Influence of overtime work, sleep duration, and perceived job characteristics on the physical and mental status of software engineers. Industrial Health, 43(4), 623-629. doi:10.2486/indhealth.43.623

Nylén, L., Voss, M., & Floderus, B. (2001). Mortality among women and men relative to unemployment, part time work, overtime work, and extra work: A study based on data from the Swedish twin registry. Occupational and Environmental Medicine, 58(1), 52-57. doi:10.1136/oem.58.1.52

Prendergast, C., & Topel, R. (1993). Discretion and bias in performance evaluation. European Economic Review, 37(2-3), 355-365. doi:10.1016/0014-2921(93)90024-5

Proctor, S. P., White, R. F., Robbins, T. G., Echeverria, D., & Rocskay, A. Z. (1996). Effect of overtime work on cognitive function in automotive workers. Scandinavian Journal of Work, Environment & Health, 22(2), 124-132. doi:10.5271/sjweh.120

Pulakos, E. D., White, L. A., Oppler, S. H., & Borman, W. C. (1989). Examination of race and sex effects on performance ratings. Journal of Applied Psychology, 74(5), 770-780. doi:10.1037/0021-9010.74.5.770

Staines, G. L. (1980). Spillover versus compensation: A review of the literature on the relationship between work and nonwork. Human Relations, 33(2), 111-129. doi:10.1177/001872678003300203

Sturges, J., Guest, D., & Mackenzie Davey, K. (2000). Who's in charge? Graduates' attitudes to and experiences of career management and their relationship with organizational commitment. European Journal of Work and Organizational Psychology, 9(3), 351-370. doi:10.1080/135943200417966

Tsui, A. S., & O'Reilly, C. A. (1989). Beyond simple demographic effects: The importance of relational demography in superior-subordinate dyads. Academy of Management Journal, 32(2), 402-432. doi:10.2307/256368

Watanabe, S., Torii, J., Shinkai, S., & Watanabe, T. (1993). Relationships between health status and working conditions and personalities among VDT workers. Environmental Research, 61(2), 258-265. doi:10.1006/enrs.1993.1070

Widener, S. K., Shackell, M. B., & Demers, E. A. (2008). The juxtaposition of social surveillance controls with traditional organizational design components. Contemporary Accounting Research, 25(2), 605-638. doi:10.1506/car.25.2.11

Zimmerman, D. W. (2004). A note on preliminary tests of equality of variances. British Journal of Mathematical and Statistical Psychology, 57(1), 173-181. doi:10.1348/000711004849222

Appendix 1 – Libby boxes

[Figure: Libby boxes. Conceptual level: Working overtime* has a positive (+) effect on Performance evaluation. Operational level: No overtime (0) vs. overtime (1)* and Performance evaluation score. Control variable: Gender. * The manipulated variables.]

Appendix 2 – Case as in Qualtrics

Preface

Dear participant,

Thank you for your willingness to participate in my study. For this study a case will be presented to you. After you have read the case, a few questions will be asked about your opinion of the content of the case. When you answer the questions, please base your answers only on what you have read in the case. In total, participating in this study will take you approximately five minutes. Please click on the arrow to go to the next page.

Informed consent

You are about to participate in a study about employees working overtime. By providing your consent at the bottom of the page you indicate that you understand that:

• Your participation is completely voluntary and that you can quit at any time;

• Your answers are completely anonymous;

• Your answers are used for the study; and

• For correcting and grading reasons the thesis will be shared with others, but your answers stay anonymous at all times.

When you have read, understood, and agree to the bullet points above, please provide your consent at the bottom and click on the arrow to continue. If you do not agree, the experiment stops here for you and you can leave the page.

• Yes, I consent

• No, I do not consent

Instruction and case scenario

Please read the case attentively. When you are done reading, click on the arrow to continue to the next page where you can answer the questions.

We ask you to imagine that you are one of the managers in a big audit company located on ‘De Zuidas’ in Amsterdam. In your firm, as in any big audit firm, working overtime is no exception; it is like an unwritten rule. During the year you are responsible for multiple audit engagements, which are carried out by your team. During the audits, you and your team visit the client to audit their books. One of the team members with whom you do multiple engagements is Alex. Alex is a junior associate who has been with the firm for almost two years.
