The Relationship between Research & Development and Employment in Germany: An Empirical Analysis of Automation

Academic year: 2021


Universiteit Leiden

Faculty of Governance and Global Affairs

Master of Science in Public Administration

The Relationship between Research & Development and Employment in Germany: An Empirical Analysis of Automation

Master Thesis

By

Name: Christian Maximilian Hunglinger

Student Number: 2391589

Date: 11.06.2019

Thesis Coordinator: Dr. Max van Lent


Table of Contents

1. Introduction
2. Literature Review
2.1 Contribution to the Field
2.2 Research Question and Hypotheses
3. Research Design and Methodology
3.1 Variables
3.1.1 Independent Variables
3.1.2 Dependent Variable
3.2 Data
3.3 Econometric Model
4. Analysis of Germany
4.1 Linear Model
4.2 Lagged Model
4.3 Hypotheses Testing
4.3 Limitations
5. Conclusion
References
Annex


1. Introduction

In recent years the dangers of artificial intelligence and rapid technological advance have been a popular topic in dystopian pop culture. Be it in movies, books or TV shows, technology has frequently been depicted as hostile to humanity. Even though, at least according to Hollywood, losing our jobs to artificial intelligence is the last thing we have to worry about, some recent reports draw a rather grim picture. Automation is a popular topic in the public as well as in the scientific debate, and its popularity may stem largely from its unpredictability. Automation is already happening and will most likely become more important in the future. Utopians are already dreaming of fully automated economies, where nobody has to work any more and people can spend their time pursuing personal goals while living off a universal basic income. Sceptics believe it is just hype and argue that the transformation will not be as radical as assumed. What both sides have in common is that neither of them knows what exactly will happen or whether it has already started. Most research relies on predictions about the future and struggles to operationalise automation. This is exactly where this paper comes into play. I argue that a reliable measure of automation is needed in order to pursue research that can help economies prepare for the changes to come. In order to be able to react to future issues, policy makers need to know which processes are already observable. My thesis uses research and development expenditure to operationalise automation and performs a detailed country analysis at the industry level for Germany.

The thesis is structured as follows: In the next chapter I give a brief overview of relevant research on the matter and explain which research gap my thesis aims to fill. The research question and the hypotheses underlying my research can also be found in chapter 2. The third chapter is dedicated to research design and methodology; here I explain the general design of my research, the variables and the data used. The last part of that chapter introduces the econometric model. In chapter 4 I report the results of my three regression models and discuss what the empirical results mean for the hypotheses. Lastly, chapter 5 summarises the findings.


2. Literature Review

The scientific debate about automation is as old as automation itself, as The Fragment on Machines by Karl Marx suggests. Hidden in one of Marx' most elusive works one can find a shockingly accurate description of modern-day automation and its potential effects on the capitalist society:

But, once adopted into the production process of capital, the means of labour passes through different metamorphoses, whose culmination is the machine, or rather, an automatic system of machinery (system of machinery: the automatic one is merely its most complete, most adequate form, and alone transforms machinery into a system), set in motion by an automaton, a moving power that moves itself; this automaton consisting of numerous mechanical and intellectual organs, so that the workers themselves are cast merely as its conscious linkages.

(Marx, 1973, p.692)

Written between 1857 and 1858 but not published until 1939, and not translated into English until much later, the fragmentary manuscript most likely did not have many readers at the time. His automaton, an automatic system of machinery made up of mechanical and intellectual organs, sounds like dark science fiction, but is in fact nothing other than the description of a modern factory, which is exactly what makes it so remarkable. Some modern sources even consider the excerpt from Grundrisse der Kritik der politischen Ökonomie, to use the original title, a prediction of artificial intelligence and robotics, written many years before the first computer was even invented (McBride, 2017). For Marx the automated economy is the endpoint of a transformative process, which he deliberately describes as a metamorphosis. It is not a mere modification, as it fundamentally changes the understanding of labour, capital and the means of production (McBride, 2017). Marx already points to an issue that may become very pressing in the near future. In his vision an automated future is a future without jobs, and ultimately the question arises whether capitalism can subsist without the very foundations it was built upon: labour, wages, capital. According to Marx “Capital itself is the moving contradiction” (Marx, 1973, p. 706), and thus capitalism essentially works towards its own dissolution. Under capitalism labour time is reduced to a minimum to maximise profits, until eventually human labour disappears and with it the main measure and source of capital (Marx, 1973, p. 706). Without wages there is no capital, without capital no consumption, and without consumption no demand, which subsequently, according to Econ 101, leads to no supply. Only supplying people with the goods they essentially need may be in line with Marxist theory, but it certainly is not in line with the ideas of capitalism.

The disappearance of human labour might sound a little apocalyptic, but it already hints at the modern debate of displacement versus replacement. And there are other scholars who realised that automation might in the long run make human labour redundant. It was none other than John Maynard Keynes himself who sparked the modern scholarly debate back in 1930 with his seminal essay Economic Possibilities for Our Grandchildren (Keynes, 1963). Keynes observes: “The increase of technical efficiency has been taking place faster than we can deal with the problem of labour absorption” (1963, p. 358). Keynes claims that the majority of mankind's most influential inventions, like electricity, automatic machinery or the methods of mass production, have been made in recent history and that the period of fast technological advance is short considering how long humans have existed on this planet. According to Keynes, innovation and capital accumulation are the two main drivers of economic growth and have left us with a world that changes too fast for us to adjust to. He recognises technological unemployment, a term still widely used in the debate today, as the big threat of the future: “This means unemployment due to our discovery of means of economising the use of labour outrunning the pace at which we can find new uses for labour.” (Keynes, 1963, p. 3). Keynes's idea of the future is neither utopian nor dystopian. He acknowledges the great opportunities mankind will have once the economic problem is solved, but at the same time, work has been embedded in our culture for so long that the transition will not be an easy one.

[A] point may soon be reached, much sooner perhaps than we are all of us aware of, when these needs are satisfied in the sense that we prefer to devote our further energies to non-economic purposes

(Keynes, 1963, p. 4)

Marx and Keynes are just two early examples of what is widely discussed among pessimists today. The debate about automation is strongly interlinked with a fundamental debate about the future of work. Keynes (1963) believed that within 100 years, provided there were no major wars, the economic problem would be solved or at least be close to being solved. Keynes wrote this in 1930 and was undoubtedly far ahead of his time; now, almost 90 years later, it seems he could actually have been right, as large-scale automation could mean the end of work as we know it in the near future. Given that the Second World War started only a few years after his essay, we should add a few years to the one hundred he predicted. Still, based on what we know today, it seems likely that around 2050 economies and labour markets could have changed significantly and that the 15-hour work week Keynes also predicted may be reality.

When there is no more work, there will also be no more wages. As outlined before, our current system could not subsist without capital, and therefore scholars and experts are looking for alternatives to labour compensation. That is why the modern debate is strongly linked to the debate about universal basic income and other compensatory policies.

The debate came to life again in the wake of technological advance and the increasing impact of automation. The modern debate is one of replacement versus displacement, or pessimists versus optimists respectively. Will we be replaced by robots? Or will labour simply move to new sectors of the economy? Optimists argue that the Fourth Industrial Revolution will not be much different from its predecessors. Be it manufacturing or services, modern technology will help make the processes and machines involved faster and more efficient. As a result, less manpower will be needed for the same output, but the output will also become cheaper and thus lead to an increase in demand. The higher demand will then offset the negative effect on employment (Brynjolfsson & McAfee, 2016, p. 175). A popular argument among optimists is the bank teller example studied by Bessen (2015). When automated teller machines (ATMs) were first introduced, sceptics expected a major decrease in employment for bank tellers; however, no such thing happened, and in fact the number of bank tellers in America even increased slightly. This was due, on the one hand, to banks opening more regional branches in response to reduced costs, and on the other hand to the fact that tellers started to perform more complex tasks than they traditionally did.

Apart from the debate about displacement or replacement, two different approaches dominate the research on the relationship between technology and employment: the skill-biased technological change (SBTC) approach and the routine-biased technological change (RBTC) approach. SBTC has repeatedly been the subject of research, see for instance Goldin & Katz (2008, 2009) or Katz & Autor (2011). The problem with SBTC is that it mostly benefits one group while it disadvantages the other. Industries which traditionally involve more routine labour make relatively larger investments in computer capital, followed by a decrease in the demand for manual labour and a simultaneous increase in the demand for higher-skilled labour.
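The optimists' compensation argument described above (less labour per unit of output, lower prices, higher demand) can be made concrete with a small numerical sketch. It is an illustration only and rests on assumptions that are not part of the cited works: cost savings are passed on fully to prices, and demand has a constant price elasticity. The function name and parameters are hypothetical.

```python
def employment_ratio(productivity_gain: float, demand_elasticity: float) -> float:
    """Employment after automation relative to employment before.

    Illustrative assumptions (not taken from the thesis):
    - labour needed per unit of output falls by the factor 1 / (1 + productivity_gain),
    - the saving is passed on fully, so the price falls by the same factor,
    - demand has a constant price elasticity (given as a positive number).
    """
    unit_labour = 1.0 / (1.0 + productivity_gain)  # relative labour per unit
    price = unit_labour                            # full pass-through to prices
    output = price ** (-demand_elasticity)         # constant-elasticity demand
    return unit_labour * output                    # employment = labour per unit * units


# A 25 per cent productivity gain under inelastic, unit-elastic and elastic demand.
for elasticity in (0.5, 1.0, 2.0):
    print(elasticity, round(employment_ratio(0.25, elasticity), 3))
```

Under these assumptions employment only rises if demand is elastic (elasticity above one). With inelastic demand the additional output does not make up for the labour saved per unit, which is essentially the pessimists' case.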


Acemoglu and Autor (2011) provide a canonical model differentiating between two distinct skill groups, low and high, each of them providing non-substitutable labour. Technology can only complement one of the groups, meaning that more technology inevitably leads to a decrease in employment for one of them. The overall effect of technological change, however, then depends on the increase in employment for the opposite group. Such simple models have their limitations and have been criticised for lacking accuracy. David Autor and his colleagues (2003) were among the first to investigate the relationship between routine tasks and automation. With their hypothesis they add another variable to the debate, which before was mostly limited to skill levels. Their explanatory approach is known as routine-biased technological change (RBTC) and is believed to be the main driver behind the hollowing out of the middle-skill sector. Their research showed that jobs involving routine tasks faced the highest risk of automation, an assumption which certainly was true in 2003. Later, more detailed research by one of the authors of the simple model (Autor & Dorn, 2013) found that employment changes in the US economy in the period 1980 to 2005 were in fact U-shaped. That means the biggest increase took place in the lowest and the highest quartile, while employment around the median skill level remained constant or decreased. The same U-shaped distribution was found for wages: they increased more at the ends, especially at the upper end, than in the middle. Autor and Dorn find that the increase in the lowest skill quartile largely stems from a 39 per cent increase in employment in service occupations, which are not easily automatable. They find evidence for a shift from middle-skill occupations, for example in manufacturing, toward low-skill service occupations. This is known as job and wage polarisation.

One of the first papers on this matter (Bluestone & Harrison, 1988) was published as early as the late eighties. Showing the same U-shaped distribution, it caused a lot of controversy at the time, as most economists and labour market experts assumed a declining number of low-skill jobs. In the subsequent years, however, more scholars discussed the topic; today there is abundant research on the matter, and most recent papers find evidence for the uneven distribution of jobs known as job polarisation. The divergence of growth and employment has been found to be a recurring pattern in industrialised economies, see for instance Goos et al. (2011). Empirical research on the subject (Charles, Hurst, & Notowidigdo; Jaimovich & Siu) identified the decline in manufacturing and other routine-intensive occupations as responsible for the redistribution of jobs.

After Autor, Levy & Murnane (2003) published their results, scholars applied their approach and the RBTC hypothesis to other countries. A large-scale study (Goos, Manning, & Salomons, 2014) found evidence for the same effect in 16 European countries between 1993 and 2010. In addition to providing evidence for the hypothesis, they added the influence of offshoring to the debate. A study on the UK with the controversial title Lousy and Lovely Jobs (Goos & Manning, 2007) found evidence for job polarisation, similar to that in the US, in the UK employment structure between 1975 and 1999. Goos and Manning show an uneven distribution of routine tasks among skill levels. With the help of Autor, Levy and Murnane's (2003) data, they show that routine tasks are more common in jobs located in the middle of the wage distribution, and similarly find an increase in employment in both lousy (low-paying) and lovely (high-paying) jobs, accompanied by a decrease in jobs with medium-level earnings. Whether it is necessary and appropriate to label jobs lousy simply because they are low-paying is, however, debatable at least.

But let us move on to applications of the RBTC hypothesis to Germany (Spitz-Oener, 2006; Dustmann, Ludsteck, & Schönberg, 2009). Based on survey data and once again the Autor-Levy-Murnane model (2003), Spitz-Oener finds an increase in non-routine cognitive tasks in German jobs since 1979. Simultaneously, the share of routine cognitive and manual tasks diminished over the same period. Most scholars agreed, however, that the wage structure in Germany is stable and that job polarisation is a phenomenon of the US and UK economies only. Interestingly enough, Dustmann et al. (2009) find evidence for an increase in wage inequality throughout the 1980s and 1990s. In contrast to the US, wage inequality at first increased only in the upper half of the distribution before it also increased in the bottom half. The characteristic U-shaped distribution found in the US was therefore first observable in data from the 1990s. Research by Karabarbounis and Neiman (2014) supports the hypothesis of jobs disappearing faster than new ones are created. They show that labour's share in national income has been decreasing globally since the 1980s. Amongst other factors, changes in technology are a key driver behind this. Similar research on the US economy (Elsby, Hobijn, & Şahin, 2013) yielded similar results.


One of the most influential contributions of recent years represents the more pessimistic side of the debate and comes from Frey and Osborne (2013). Based on the task model of Autor et al. (2003), the researchers build a model that is able to assess the susceptibility not only of routine tasks but also of non-routine tasks. Frey and Osborne (2013) assume almost any task to be susceptible to computerisation, but differentiate between three risk groups (low, medium and high) and additionally identify a few tasks which are not automatable. These are perception and manipulation tasks, creative intelligence tasks, and social intelligence tasks, and they are labelled engineering bottlenecks. Robots and machines lack perception, which can be a problem not only when interacting with humans, but also when trying to identify a problem. Failure recovery, e.g. when a robot has dropped an object, is still a big challenge even for the most modern robots. Creative intelligence tasks, on the other hand, are tasks which require robots to think outside the box and look at problems from different angles. Given that even today researchers still do not fully understand human creativity and artistry, it is questionable whether robots, which are per se rational, can ever acquire it. Robots can evaluate thousands, maybe even millions, of solutions to a problem at the same time. But what makes a good piece of art? Is it the one which is most rational, and is the best poem automatically the one which rhymes best? Rather not; instead, good artworks are those which touch us and evoke deeply human emotions. Good art is as beautiful as it is elusive, and I argue that it is just that unpredictability which makes creative intelligence an exclusively human skill. Social intelligence is a related problem. Artificial intelligence has come a long way, and chatbots like Siri, Alexa and the like have seen massive improvements. But communication with them is still distinctly different from communication with another human, and they cannot fool anybody into thinking they are actually human. Interpreting human reactions and reacting to them accordingly will probably remain a big challenge for the next decades, which makes occupations like teachers or guides hard to automate.

Frey and Osborne's (2013) detailed task-based approach is unsurpassed and their work is regarded as seminal by many scholars. Yet it is also highly speculative. Their analysis is based on jobs and technology that existed in 2010, but of course there is no reliable way to predict how fast these technologies will actually evolve. Their analysis therefore captures just one point in time and makes predictions based on those observations. Nor can they go back in time to analyse a longer time series, as the technology they consider the main driver of computerisation was simply not available back then. This could explain why their estimates draw a rather grim picture of the future. Their predictions see 33 per cent of US employment at low risk, 19 per cent at medium risk and as much as 47 per cent at high risk. Maybe more striking than the fact that almost half of the US workforce is at risk of being replaced are Frey and Osborne's findings about the relationship between wages and education on the one hand and the probability of computerisation on the other. They find a strong negative correlation, which breaks with the dominant theory of a U-shaped increase in employment and the hollowing out of the middle class, found for example in Autor & Dorn (2013). The findings of Autor and Dorn, however, refer to the period from 1980 to 2005. Similar works (Goos et al., 2014; Goos & Manning, 2007) that look at comparable periods imply that U-shaped job polarisation is a phenomenon of the 80s, 90s and 2000s. According to Frey and Osborne (2013) the U-shaped model does not hold, as their predictions see labourers with high wages and high educational attainment at an advantage in the long run.

There are a few limitations to their research that have to be discussed. Firstly, it is likely that Frey and Osborne (2013) overestimate the future pace of technological advance and underestimate human resistance. Should technology really make jobs redundant, it is likely that, at least in the short run, policies will be introduced to mitigate the effects of mass unemployment, such as forcing corporations to keep employing a certain percentage of human labourers. Secondly, when interpreting their results it is important to acknowledge that they are estimations of risk. Just because an occupation is at risk of automation, this does not automatically imply that the whole occupation will be automated.
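To make the idea of low, medium and high risk groups more tangible, the following minimal sketch sorts occupations by an estimated probability of computerisation and reports the share of employment in each group. The cut-offs of 0.3 and 0.7, the function names and the three example occupations are assumptions of this sketch and are not taken from the text above; the numbers are made up and do not reproduce the 33, 19 and 47 per cent reported by Frey and Osborne.

```python
def risk_group(probability: float, low_cutoff: float = 0.3, high_cutoff: float = 0.7) -> str:
    """Assign a probability of computerisation to a risk group.

    The default cut-offs are an assumption of this sketch.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if probability < low_cutoff:
        return "low"
    if probability <= high_cutoff:
        return "medium"
    return "high"


def employment_shares(occupations):
    """Share of total employment per risk group.

    `occupations` is a list of (probability, employment) pairs.
    """
    totals = {"low": 0.0, "medium": 0.0, "high": 0.0}
    for probability, employment in occupations:
        totals[risk_group(probability)] += employment
    grand_total = sum(totals.values())
    return {group: total / grand_total for group, total in totals.items()}


# Hypothetical occupations: (estimated probability, number of workers).
example = [(0.1, 30), (0.5, 20), (0.9, 50)]
shares = employment_shares(example)
```

The sketch also illustrates the second limitation mentioned above: the shares describe how much employment sits in occupations classified as risky, not how many of those jobs will actually disappear.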

What troubles scholars and experts alike is the fact that automation no longer affects only manual labour, as it has historically. The current wave of automation increasingly threatens white-collar jobs. The keyword here is Robotic Process Automation (RPA), and in a recent study 53 per cent of respondents claimed to have started implementing RPA in their businesses (Deloitte, 2018). Robots can not only assemble products these days; they can perform cognitive tasks as well, and in many cases they do so faster and more efficiently than humans, which adds another dimension to automation, namely cognitive automation.

The approach used in my paper is distinctly different and there are not many works on the matter. Chennells & Van Reenen (REF) provide a good overview of pre-millennial literature. But especially in recent years research on the relationship between innovation and employment has become unpopular. Using R&D expenditure as a measure to estimate and predict the effect of automation on employment is a unique approach, especially at the industry level. Similar works (Lachenmaier & Rottmann, 2011; Piva & Vivarelli, 2017a, 2017b) consider R&D to be a measure of innovation and not of automation. To avoid confusion, it is necessary to add that Piva and Vivarelli published two papers in 2017, one with an industry-level approach and one with a firm-level approach. Both contributions are closely related to my research, but the industry-level approach (Piva & Vivarelli, 2017a) is the closest with regard to the assumptions it is built on and the data used. Piva and Vivarelli also use OECD data from the STAN and ANBERD datasets and pursue an industry-level approach. Furthermore, industries are also ranked and grouped according to their R&D intensity, but the classification is based on Eurostat instead of OECD data. And in contrast to my paper, which focuses on Germany, they look at panel data from 11 different OECD countries.

Piva and Vivarelli and most comparable works find a positive effect of innovation on employment, see for example Van Reenen (1997) for the UK, or Piva and Vivarelli (2004, 2005) for Italy. Empirical results and theory suggest a positive effect for product innovation and a negative or neutral one for process innovation. Still, a recent study on Germany (Lachenmaier & Rottmann, 2011) found overall positive effects of innovation on employment in the period 1982-2002. Piva and Vivarelli (2017b) solve that puzzle by showing that high- and medium-tech firms are responsible for the increase in employment, and their results are in line with other works too (Bogliacino, Piva, & Vivarelli, 2012; Bogliacino & Vivarelli, 2012; van Roy, Vertesy, & Vivarelli, 2015). Having dedicated this chapter to introducing the most important works in the field, I explain in the next section what role my research fulfils and how it fits into the debate.

2.1 Contribution to the Field

Assuming a connection between Research & Development expenditure and employment is an innovative approach to operationalising automation. Operationalisation has frequently been a challenge in related research, which is why some papers rely solely on predictions. Without a doubt, predictions can play a vital role in establishing policies to counteract the negative effects of automation, but to fully assess the situation policy makers need to know what is already happening. This calls for an operationalised measure of automation and is exactly where my research comes into play. Investigating the relationship between R&D investments and employment at the industry level can provide policy makers with the detailed information they need to establish efficient solutions.

My thesis also stands out for using a one-country approach to avoid biased results. Economies differ substantially in employment, output, productivity and R&D intensity. The case of Germany is particularly interesting, as it is considered a high-tech country at the forefront of automation. I elaborate on this in chapter 3, but the bottom line is that only a detailed single-case analysis can reliably predict how the variables behave at the industry level. It is common sense that the German economy is substantially different from the Chinese or Indian one, and even industrialised economies like Germany and the US cannot easily be compared.

My research is based on the assumption that automation already has an observable effect on modern labour markets and industrialised economies and that this effect is likely to become stronger in the future. As shown in the previous chapter, there is debate about the strength and significance of the effect and the consequences of large-scale automation. As Germany is a high-tech country, theory would suggest a positive effect of R&D on employment. But I assume the positive effect of product innovation to be diminishing in the case of German manufacturing. For the service industry, however, I expect the results to be in line with previous results, so the negative effect of process innovation is most likely weaker than the positive effect of product innovation. More on the underlying assumptions follows in the next section.

2.2 Research Question and Hypotheses

In this short section I outline my research question and briefly explain the hypotheses, which are formulated on the basis of earlier research and my data. The research question essentially consists of two parts. The first part of my analysis is intended to show the immediate effect of R&D expenditure on employment in German industries, and the second part analyses the lagged effect.


Based on this, the underlying research question of my research is as follows:

RQ: What is the relationship between R&D investments and employment on

The research question serves as a guideline for this thesis. Additionally, I use the following hypotheses to further guide the analysis. The hypotheses are largely based on the empirical findings of other papers described in chapter 2.

H1: R&D expenditure is expected to have a positive effect on employment

The first hypothesis is based on what other papers with a similar approach have found (Lachenmaier & Rottmann, 2011; Piva & Vivarelli, 2004, 2005; Van Reenen, 1997). My results are therefore expected to be largely in line with theirs. However, these papers mostly looked at data from the 1980s and 1990s; Lachenmaier and Rottmann's (2011) time series ends in 2002. The effect could have changed throughout the 2000s or early 2010s as a result of innovation and growth in robotics and AI.

H2: Manufacturing represents an exceptional case. In German manufacturing industries R&D expenditure and employment have a negative correlation due to a stronger effect of process innovation.

Based on the fact that my time series covers the whole 2000s and early to mid-2010s, and also on the distinctive characteristics of German industry (I elaborate on these in chapter 3), I assume that the decline in employment in the manufacturing industry can be strongly linked to more R&D expenditure and, subsequently, automation. Should this hypothesis hold, it would confirm Germany's role as a trailblazer of automation.

H3: The higher the R&D intensity of an industry the higher the increase in employment

The third hypothesis is also largely based on other research that found a positive effect of R&D on employment in R&D-intensive industries (Bogliacino et al., 2012; Bogliacino & Vivarelli, 2012; Piva & Vivarelli, 2017b; van Roy et al., 2015). The hypothesis does not contradict H2, as manufacturing industries in particular vary highly in R&D intensity and, therefore, it is likely that the strength of the effect also varies across manufacturing industries. Nonetheless, I still expect to see a negative effect at the aggregate level, due to the impact of low R&D intensity industries.

H4: The effect of R&D expenditure on employment is expected to be significantly stronger in the lagged model

It is common sense to assume that R&D expenditure and consequently research requires a certain time period to show its full effect. This period can undoubtedly vary greatly as research and time required for innovation is not uniform. Some innovations show their effect shortly after being introduced, while others may only impact employment several years later.
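What a lagged predictor means in practice can be shown with a minimal sketch. It is not the econometric model of this thesis, which is introduced in sub-chapter 3.3; the function names, the row layout and the toy numbers are hypothetical and serve only to show how R&D expenditure from an earlier year is matched to current employment within the same industry.

```python
def add_lagged_rd(rows, lag):
    """Attach to each row the R&D value of the same industry `lag` years earlier.

    `rows` is a list of dicts with the hypothetical keys
    "industry", "year", "rd" and "employment". Rows without an
    observation `lag` years earlier are dropped.
    """
    rd_by_key = {(r["industry"], r["year"]): r["rd"] for r in rows}
    lagged = []
    for r in rows:
        earlier = (r["industry"], r["year"] - lag)
        if earlier in rd_by_key:
            lagged.append({**r, "rd_lag": rd_by_key[earlier]})
    return lagged


def ols_slope(x, y):
    """Slope of a least-squares fit of y on a single regressor x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    return s_xy / s_xx


# Made-up toy panel for one industry: employment reacts to R&D two years earlier.
rd = {2000: 1.0, 2001: 3.0, 2002: 2.0, 2003: 5.0, 2004: 4.0, 2005: 6.0}
panel = [
    {"industry": "A", "year": year, "rd": value,
     "employment": 100.0 + 2.0 * rd.get(year - 2, 0.0)}
    for year, value in rd.items()
]

lagged_panel = add_lagged_rd(panel, lag=2)
slope = ols_slope([r["rd_lag"] for r in lagged_panel],
                  [r["employment"] for r in lagged_panel])
```

Note that every additional year of lag removes one observation per industry from the start of the series, which is one reason why a lagged model needs a sufficiently long time series.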

3. Research Design and Methodology

This chapter is structured in three parts to explain the building blocks of my approach step by step. Firstly, sub-chapter 3.1 is dedicated to the variables used, a substantial part of any quantitative study. The respective sub-chapters also include tables showing descriptive statistics for the explanatory variables and the dependent variable. Sub-chapter 3.2 discusses the datasets used and briefly explains why they have been chosen over the alternatives. Lastly, sub-chapter 3.3 introduces the econometric model and the regression equations used for the analysis in chapter 4.

3.1 Variables

In my approach three independent variables are used to predict a single dependent variable. Each of the variables is analysed at the industry level, so essentially they are split up into sub-variables, which can be used individually in the regressions. The research aims to show the effect of automation on employment in Germany across a variety of industries. Following a simple X --> Y pattern, my main independent and dependent variables are automation and employment respectively. Automation, however, is a concept that needs to be operationalised, and there are certain difficulties involved in achieving this. Most importantly, there is no indicator or index available that is commonly accepted as a reliable measure. The International Federation of Robotics (IFR) publishes annual data on robots per worker for most industrialised countries.¹ But data for my approach needs to cover a sufficient period of time, at least 20-30 years, to capture long-term effects. A longer time span is also required to make the model with lagged predictors work. Choosing R&D as a proxy for automation is an approach that is unique to this paper. This should not be done without further explanation, which is why in the next part I outline why it is viable, especially in the case of Germany.

3.1.1 Independent Variables

As discussed in chapter 2 most research on automation relies on either a task-based or skill-based approach to identify those jobs, which are at risk of automation. The research on R&D and employment usually connects the two variables via technological change or innovation. But not all contributions attempt to operationalise and quantify the relationship, and there is certainly still debate about how to do it.

R&D Expenditure

I use private Research & Development (R&D) expenditure as a measure for automation. There are a number of reasons why this is especially viable for an analysis of Germany. First of all, business enterprise R&D has become increasingly important over the past years. The latest OECD (2019) data show that, after a drop resulting from the 2008 economic crisis, private sector R&D has grown faster than government-funded R&D in OECD countries in recent years and is now the main driver behind global R&D growth. It surpassed government R&D in 2013, and in 2017 70% of R&D performed in OECD countries was performed by business enterprises.

Further, Germany is among the leading countries when it comes to automation. The country ranks second on the Automation Readiness Index, recently developed by a research group led by The Economist (The Economist Intelligence Unit, 2018). Publicly funded R&D programs to support automation and AI play an important role in Germany's high rank, as do the generally very high R&D expenditure and R&D intensity. The report ranks Germany in third place in terms of total R&D expenditure. As of 2016, 2.88 per cent of GDP is invested in R&D in Germany. Additionally, there is a strong initiative for HR transformation projects, which are intended to provide the workforce with the skills needed for Industrie 4.0. This highlights the strong linkage between automation and research & development investments, and why the latter can serve as an indicator for the former.

Another important factor is the high importance of robots in the German manufacturing industry. Figure 1 shows the estimated worldwide operational stock of industrial robots from 2009 onwards, including predictions for the years up to 2021, and it shows a steady increase in the number of robots used in industry. Knowing that, according to the IFR report (2018), Germany already has one of the highest numbers of robots per 10,000 workers and that this number has been steadily increasing, I assume that a large share of the future growth will also take place in Germany, further contributing to automation. Germany already constitutes the fifth largest market for robots and is part of the group of five countries which together make up 73 per cent of global robot sales (International Federation of Robotics, 2018, p. 14). This is largely due to the important role of the automotive industry. For several years, this industry was the main driver of growth in the robotics market and has only recently been surpassed by the electrical/electronics industry (International Federation of Robotics, 2018, pp. 16–17). Other important industries for robotics are the rubber and plastics industry, the pharmaceutical and cosmetics industry, the metal and machinery industry, and the food and beverage industry. All of them play an important role in the German economy. According to the IFR, the average growth rate of the robotics market was 18 per cent from 2012 to 2017. This once again shows that Germany is at the forefront of automation, and it can be assumed that a large part of R&D investment is spent on continuous research on the automation of production and service processes. My data does not allow me to differentiate between product and process innovation, but for the reasons outlined here I assume them to have an almost equal effect on employment.

As I demonstrate in the paragraphs above, R&D can serve as a viable indicator for automation. Still, a few caveats have to be acknowledged. First of all, more precise indicators for automation are conceivable, and they exist in the form of robot-per-worker ratios or sales of industrial robots. The Automation Readiness Index may also seem an appropriate indicator, but as the name suggests it says more about an economy's readiness for automation than about the actual degree of automation in the economy. Even though R&D is represented, the index strongly emphasises labour market and education policies (The Economist Intelligence Unit, 2018).

Figure 1: Estimated Operational Stock of Industrial Robots 2009 - 2021

Source: Own depiction based on:

International Federation of Robotics. (2018). Executive Summary World Robotics 2018 Industrial Robots. Retrieved from https://ifr.org/downloads/press2018/Executive_Summary_WR_2018_Industrial_Robots.pdf

For the indicators mentioned above, extensive data is not readily accessible and, more importantly, does not cover a sufficiently long period. Another factor is that robotics is not the exclusive driver of automation; AI, digitalisation, and machine learning drive it as well. As a result, looking only at the increase of robots in the economy does not fully capture all aspects of automation. This is also why information and communications technology (ICT) investment does not qualify as an independent variable. Without a doubt, computers and software contribute significantly to automation, but as with robotics, they capture only one aspect of the process. These technologies, however, require extensive research if an economy wants to profit from them, and this is where R&D investment comes into play. Transforming the German economy to Industrie 4.0 involves large investments in automation, digitalisation and AI across all sectors and businesses. Consequently, R&D expenditure is an indicator that can capture all aspects of automation. The lack of precision is made up for by the extensive data and the long time period it covers. There certainly are limitations to this approach, but considering the lack of viable alternatives, this is as close as research can get with the available data.

In the following, I briefly outline how R&D impacts economies. According to European Commission analysis (2016), R&D has been found to have a strong effect on total factor productivity (TFP), and this effect is even stronger in highly developed countries like Germany. The same report also found a lagged effect of R&D on productivity, and that the productivity increase diminishes over time.

Value Added and Labour Costs

In a modern global economy, employment is influenced by various factors, which makes the inclusion of control variables in the model vital. I use two control variables. Firstly, I use value added to control for employment changes resulting from fluctuations in economic performance. The same approach is used in comparable research on Germany (Lachenmaier & Rottmann, 2011). The underlying assumption is that a decrease in production or sales is followed by a reduction in employment. In a competitive economy firms aim to maximise revenue; thus, a common reaction to a decrease in revenue is to lower production costs, for instance by producing less or cutting jobs. In terms of structure and unit, value added is in line with the main dependent variable and is expressed in thousand € and deflated by using current prices.

Secondly, I include labour costs as a control variable to control for changes in employment due to changing labour costs. Labour costs are the sum of salaries paid per sector, expressed in thousand €. To control for inflation the values are expressed in current prices, and again the data is structured at the industry level.

R&D Intensity

Before moving on to the dependent variable, a few words on an additional variable. At the industry level, R&D intensity is included as an additional explanatory variable. The variable is strictly non-numeric and based on the OECD average. It ranges from high to low intensity and differentiates between five categories; for details please see Annex 3. Due to the use of aggregate industry categories, mixed categories are used to deal with the issue of overlapping main categories. This would, for example, be the case if one wants to assign a value to a category such as machinery and equipment, CEO products. As an aggregate category it represents data from a variety of smaller industries, in that case machinery and equipment, and computer, electronic and optical products (CEO products). CEO products are considered to be high R&D intensive, while machinery and equipment has medium-high R&D intensity. Consequently, the label high - medium-high is assigned to the combined category. This approach only works reliably when exactly two categories are combined. Some aggregate categories, however, represent more than two sub-categories. In that case a specific scoring system is used to assign a numeric value to the category, which can then be translated back into a non-numeric category. The system assigns a distinct value to each category of R&D intensity: 1 for high intensity, 2 for medium-high, 3 for medium, 4 for medium-low, and 5 for low intensity, the lowest category. The final score is then calculated based on the following formula:

$$RDI_{agg} = \frac{1}{n} \times \sum_{k=1}^{n} RDI_k$$

where $RDI_{agg}$ represents the aggregate score of R&D intensity, $n$ is the number of combined industries and $RDI_k$ represents the R&D intensity score of a single category $k$. It has to be noted that R&D intensity is not used for regression analysis, but only to assess whether the coefficients differ between the categories of R&D intensity and whether there is a pattern. The underlying assumptions about the role of R&D intensity in my model can be found in hypothesis H3. Based on the literature I expect a positive impact of R&D on employment in high and medium-high R&D intensive industries. Using R&D intensity as an additional variable in the regression is not possible, as it is assumed to be constant throughout the entire period of analysis. This is due to a lack of data and is explained in more detail in chapter 3.2.1 on the ANBERD and STAN datasets.

Table 3 shows descriptive statistics for all explanatory variables: R&D expenditure, value added and labour costs, each in thousands of Euros, for the selected industries. Some industry labels are shortened to improve clarity and for layout reasons. For further information on the respective industries please see the Annex. Annex 3 provides a detailed list of the primary sector and manufacturing industries used in the analysis, and Annex 4 provides the same information for service industries. The list provided in the Annex should avoid confusion caused by the fact that the independent and dependent variables for my research essentially have the same labels. What differs is in fact only the unit of analysis, i.e. the indicator. Independent and dependent variables are both structured per industry, and each industry indicator is considered a separate variable.
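The scoring system for aggregate categories described above can be illustrated with a minimal Python sketch. The score mapping follows the text. The function name `aggregate_rd_intensity` and the rule for translating the mean score back into a label (rounding to the nearest score, with ties going to the lower-intensity category) are assumptions made for illustration, since the text does not spell out that step.

```python
# Scores as defined in the text: 1 = high intensity ... 5 = low intensity.
SCORES = {"high": 1, "medium-high": 2, "medium": 3, "medium-low": 4, "low": 5}
LABELS = {score: label for label, score in SCORES.items()}


def aggregate_rd_intensity(categories):
    """Return (mean score, label) for an aggregate of several industries."""
    scores = [SCORES[c] for c in categories]
    rdi_agg = sum(scores) / len(scores)  # (1/n) * sum of the single scores
    # Assumption: map the mean back to a label by rounding to the nearest
    # score; a tie (e.g. 2.5) goes to the lower-intensity category.
    nearest = int(rdi_agg + 0.5)
    return rdi_agg, LABELS[nearest]


# Hypothetical aggregate category made up of three sub-industries.
score, label = aggregate_rd_intensity(["high", "medium-high", "medium-high"])
print(round(score, 2), label)
```

The tie rule matters only when the mean falls exactly between two scores; a different convention would shift those cases by one category.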

Table 3: Descriptive Statistics for Explanatory Variables per Sector in T€

Left side: primary and secondary sector

| Ind. Variables on industry level | N (R&D) | N (Value Added) | N (Labour Costs) | R&D: Mean (Std. Dev.) | Value Added: Mean (Std. Dev.) | Labour Costs: Mean (Std. Dev.) |
| --- | --- | --- | --- | --- | --- | --- |
| Total Business Enterprise | 26 | 26 | 26 | 3.99E+07 (1.15E+07) | 2.08e+09 (3.79e+08) | 1.18e+09 (1.92e+08) |
| Agriculture, Forestry and Fishing (A) | 26 | 26 | 26 | 90776 (36718) | 1.86e+07 (2152365) | 6576000 (616696.2) |
| Mining and Quarrying (B) | 26 | 26 | 26 | 53406 (48492) | 5965154 (2261348) | 5568192 (1780796) |
| Manufacturing Aggregate (C) | 26 | 26 | 26 | 3.55E+07 (8891577) | 4.75e+08 (8.15e+07) | 3.08e+08 (3.82e+07) |
| Food Products, Beverages and Tobacco | 26 | 25 | 25 | 258134 (56933) | 3.69e+07 (2791104) | 2.60e+07 (2906432) |
| Textiles... | 26 | 25 | 25 | 121884 (19495) | 8771920 (1817839) | 6543040 (1507373) |
| Wood, Paper, Printing... | 26 | 25 | 25 | 154481 (46095) | 2.63e+07 (1456647) | 1.76e+07 (1644604) |
| Chemical, Rubber, Plastics, Fuel Products... | 26 | 25 | 25 | 7359498 (1293748) | 9.36e+07 (1.24e+07) | 5.14e+07 (4690908) |
| Basic Metals and Fabricated Metal Products... | 26 | 25 | 25 | 1017261 (204719) | 5.93e+07 (9955572) | 4.29e+07 (4296015) |
| Machinery and Equipment, CEO Products... | 26 | 25 | 25 | 2829313 (6949632) | 1.33e+08 (2.28e+07) | 8.72e+07 (1.27e+07) |
| Furniture, Other Manufacturing... | 26 | 25 | 25 | 845231 (432940) | 1.85e+07 (2797087) | 2.37e+07 (2615066) |
| Electricity, Gas and Water Supply... | 26 | 26 | 26 | 125052 (48547) | 6.04e+07 (1.38e+07) | 2.40e+07 (2544244) |
| Construction (F) | 26 | 26 | 26 | 65083 | | |

Right side: tertiary sector

| Ind. Variables on industry level | N (R&D) | N (Value Added) | N (Labour Costs) | R&D: Mean (Std. Dev.) | Value Added: Mean (Std. Dev.) | Labour Costs: Mean (Std. Dev.) |
| --- | --- | --- | --- | --- | --- | --- |
| Total Services | 26 | 26 | 26 | 4040300 (2560507) | 1.42e+09 (2.85e+08) | 7.62e+08 (1.56e+08) |
| Services Business Economy | 9 | 26 | 26 | 7001867 (1229036) | 9.72e+08 (1.95e+08) | 4.27e+08 (9.20e+07) |
| Wholesale and retail trade... | 22 | 26 | 26 | 151948 (84906) | 3.34e+08 (6.18e+07) | 2.14e+08 (3.12e+07) |
| Transportation and storage | 9 | 0 | 25 | 93855 (27808) | 9.25e+07 (2.06e+07) | 4.97e+07 (1.07e+07) |
| Accommodation and food... | 10 | 26 | 26 | 250 (150) | 3.12e+07 (5931510) | 2.02e+07 (4704581) |
| Information and Communication | 10 | 26 | 26 | 2805890 (547315) | 9.34e+07 (2.34e+07) | 4.69e+07 (1.24e+07) |
| Telecommunications | 9 | 25 | 25 | 411400 (185318) | 3.01e+07 (4255351) | 1.05e+07 (1503275) |
| IT and other information services | 21 | 25 | 25 | 1528824 (840398) | 3.68e+07 (1.78e+07) | 2.25e+07 (1.17e+07) |
| Financial and Insurance activities | 22 | 26 | 26 | 159045 (118338) | 9.65e+07 (1.28e+07) | 5.90e+07 (7247375) |
| Real estate Activities, professional, scientific, technical and Support Services Activities | 22 | 26 | 26 | 2235210 (1241269) | 4.48e+08 (9.94e+07) | 1.06e+08 (4.18e+07) |
| Community, Social and Personal Service | 9 | 26 | 26 | 22422 (9834) | 4.48e+08 (9.05e+07) | 3.35e+08 (6.41e+07) |

As a result, the full dataset consists of more than 140 variables, including year and ID variables, which help structure the data. The high number of variables may seem counter-intuitive, but it is necessary to allow a detailed per-industry analysis. Lastly, one more remark on the terms employment and total persons engaged: these are used interchangeably in my thesis. Total persons engaged is the full indicator name used by the OECD, but especially in tables and graphs, where a shorter form is beneficial, I resort to employment as the alternative name.

Looking at the descriptive statistics, there are a few important observations we can make at this point of the analysis. Table 3 shows us, for instance, that R&D expenditure is generally a little lower in the service industry. Especially the food service industry, transportation and storage, social services, but also wholesale and retail trade are characterised by generally lower levels of R&D expenditure. Among those, the food service industry and social services are outliers, with a mean that is considerably lower than in manufacturing and other services. In one year accommodation and food service even reports R&D expenditure of zero. On the other hand, R&D expenditure is generally higher for information and communication, IT, and telecommunications. The mean for real estate activities, professional, scientific, technical and support services activities is also very high, but this is most likely due to the fact that this category includes a large number of smaller industries. It includes ISIC divisions 77 to 82 and as such constitutes one of the broader categories of the ISIC classification.

Furthermore, table 3 reveals missing values for some service industries. The OECD provides no explanation for the missing data in the ANBERD dataset and offers no alternative estimations. Most likely, regulations on reporting and accounting have changed over the last years, which would explain why detailed per-industry data for some industries is only available since the early 2000s. It is important to note, however, that the missing values are concentrated, so the resulting time series are not fragmentary but simply shorter. As a result, the shortest time series has only 9 observations, while the full series is supposed to cover 26 years. There is no way around this problem with the dataset I am using, other than focusing on the categories that are available. A look at the statistics shows that a full time series is available for the total service industry, and other important categories like IT, wholesale and retail trade, or financial and insurance activities only miss a handful of values. So, with limitations, an analysis is still possible.
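Whether the missing values of a series are concentrated rather than scattered can be checked mechanically. The following is a minimal sketch, assuming each series is held as a mapping from year to value with `None` marking a missing observation; the function name `observed_span` and the example values are hypothetical.

```python
def observed_span(series):
    """Return (first_year, last_year, n_obs, is_contiguous) for {year: value}.

    Missing observations are represented by None.
    """
    years = sorted(y for y, v in series.items() if v is not None)
    if not years:
        return None, None, 0, True
    n_obs = len(years)
    # A contiguous block has exactly one observation per year of its span.
    is_contiguous = (years[-1] - years[0] + 1) == n_obs
    return years[0], years[-1], n_obs, is_contiguous


# Hypothetical R&D series: missing before 2008, complete afterwards.
rd = {year: (None if year < 2008 else 100.0 + year) for year in range(1991, 2017)}
print(observed_span(rd))
```

A series that fails the contiguity check would be fragmentary in the sense used above and would need separate treatment.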

The left side of table 3 displays R&D expenditure for the primary and secondary sector of the German economy. The data on these sectors is very detailed and almost every time series consists of 26 observations. A few variables only have 25 observations, as for some industries data on 2016 was not yet available. This results in some time series being slightly shorter, but overall validity is not at risk. R&D expenditure in the manufacturing industry is generally higher than in the service industry. At the same time, R&D is of less importance in the primary sector (agriculture, forestry and fishing) and in mining and quarrying. The data for these sectors also shows high variance, most likely due to a drastic reduction of R&D expenditure as a result of their decreasing importance in Germany. Across all manufacturing sectors the standard deviations indicate a relatively high dispersion, and the regression analysis will show whether these changes correlate with an overall reduction of employment in manufacturing. Labour costs do not seem to follow the pattern found in R&D expenditure: high-cost and low-cost industries can be found in the secondary as well as the tertiary sector. In general, however, the picture does not hold any surprises: the mean for aggregate services is considerably higher than for aggregate manufacturing, and the values for the primary sector are overall very low. On the other hand, value added for agriculture, forestry and fishing is relatively high compared to its low values for labour costs and R&D. Apart from that, value added displays the typical distribution of a service society. Manufacturing is far from being irrelevant, but the mean is generally considerably higher for service industries.

3.1.2 Dependent Variable

My study specifically looks at the effects of automation on employment. Thus, employment is the dependent variable in my research. Employment, however, is a broad concept and can be measured in various ways. For the first part of the analysis I use data on total employment. The OECD refers to this variable in the STAN database as Number of persons engaged (total employment) and uses the variable code EMPN. The value is expressed in thousand persons and includes both employees and the self-employed. I follow an industry-level approach, which allows me to capture within-industry changes even when no effect or only a marginal effect is visible at the aggregate level. OECD data can easily be merged and compared with other datasets at the industry level, as newer STAN and ANBERD datasets are structured according to the NACE Rev. 2 classification. In contrast to automation, my main dependent variable does not need to be operationalised, even though it can be measured in various ways. It is straightforward, and it should be common sense why numbers on persons engaged can be used to capture changes in total employment.

Before moving on to a description of the data, a brief look at the descriptive statistics for the dependent variable, which can be found in table 4. The structure is identical to table 3: the left side shows employment data for the primary and secondary sector, and the right side for the tertiary sector. Employment data is expressed in total persons engaged in thousands. Overall there seems to be less variance in the employment data, and some sectors have a very low standard deviation, indicating low fluctuations in employment. This is especially true for food products, beverages and tobacco; electricity, gas and water supply; and financial and insurance activities. This is not necessarily surprising, as these are industries which provide the goods that satisfy basic human needs such as food, water and energy. Tobacco of course is an exception, and I do not claim it constitutes a basic human need. However, it can be argued that financial and insurance products are basic needs of a modern human being, as they help to protect and increase wealth and also provide security in the form of insurance. As such, these goods generally have a low price elasticity of demand and are not easily substituted. A relatively constant value for employment could be an indicator of simultaneous productivity and demand growth: productivity increases due to more automation, while at the same time demand rises as a result of a growing population and more wealth.

3.2 Data

This sub-chapter explains the datasets I used for my variables in more detail. As the empirical analysis can be divided into three models, I used data from various sources and combined them. The first two models are intended to capture the immediate effect of automation on the economy, i.e. total employment, whereas the third model aims to capture the lagged effect on employment per industry. The models are explained in detail in chapter 3.3. This sub-chapter serves the purpose of explaining the STAN and ANBERD datasets in more detail.

Table 4: Descriptive Statistics for Persons Engaged per Sector in Thousands

Left side: primary and secondary sector

| Dep. Variable (Persons Engaged) | N | Mean (Std. Dev.) | Min | Max |
| --- | --- | --- | --- | --- |
| Total Business Enterprise | 26 | 39971 (1749) | 37786 | 43638 |
| Agriculture, Forestry and Fishing (A) | 26 | 751 (137) | 619 | 1174 |
| Mining and Quarrying (B) | 26 | 117 (64) | 56 | 301 |
| Manufacturing Aggregate (C) | 26 | 7762 (672) | 7138 | 10064 |
| Food Products, Beverages and Tobacco | 25 | 906 (27) | 849 | 936 |
| Textiles... | 25 | 258 (114) | 155 | 590 |
| Wood, Paper, Printing... | 25 | 573 (96) | 446 | 750 |
| Chemical, Rubber, Plastics, Fuel Products... | 25 | 1225 (129) | 1098 | 1595 |
| Basic Metals and Fabricated Metal Products... | 25 | 1181 (88) | 1098 | 1478 |
| Machinery and Equipment, CEO Products... | 25 | 1988 (209) | 1811 | 2753 |
| Furniture, Other Manufacturing... | 25 | 671 (62) | 603 | 860 |
| Electricity, Gas and Water Supply... | 26 | 527 (40) | 484 | 603 |
| Construction (F) | 26 | 2660 | | |

Right side: tertiary sector

| Dep. Variable (Persons Engaged) | N | Mean (Std. Dev.) | Min | Max |
| --- | --- | --- | --- | --- |
| Total Services | 26 | 28151 (2651) | 23760 | 32461 |
| Services Business Economy | 26 | 16155 (1693) | 13540 | 18766 |
| Wholesale and retail trade... | 26 | 5815 (118) | 5576 | 6037 |
| Transportation and storage | 26 | 1988 (107) | 1834 | 2195 |
| Accommodation and food... | 26 | 1470 (253) | 1043 | 1868 |
| Information and Communication | 26 | 1100 (110) | 944 | 1240 |
| Telecommunications | 25 | 213 (63) | 944 | 1240 |
| IT and other information services | 25 | 510 (172) | 253 | 770 |
| Financial and Insurance activities | 26 | 1239 (35) | 1179 | 1296 |
| Real estate Activities, professional, scientific, technical and Support Services Activities | 26 | 4128 (1186) | 2308 | 5903 |
| Community, Social and Personal Services | 26 | 11996 (965) | 10220 | 13695 |

Both datasets cover a period of 26 years, from 1991 to 2016. As noted earlier, it is crucial to capture a sufficiently long period of time, as it is likely that automation has a lagged effect on the economy and employment. Secondly, covering a longer period also reduces the risk of noisy data due to economic shocks and other potentially influential events, e.g. terror attacks or the financial crisis of 2008.

For the regressions I rely entirely on OECD data. For my main independent variable I use OECD data on R&D from their ANBERD² (Analytical Business Enterprise R&D) database. It shows R&D expenditure exclusively for business enterprises and not for the public sector. In the following, I briefly discuss why I chose the ANBERD dataset over its alternatives. To begin with, public sector research contributes to technological advance and general knowledge with the fundamental studies carried out in universities and research facilities. But it is likely that employment is much more affected by innovation originating in the private sector. It is businesses that research how to make their production processes more efficient and how to increase their output. In a competitive economy, companies have an incentive to save money, and therefore an incentive to pursue process innovation. Business enterprise R&D is therefore used to find new and more cost-effective production methods, and history has shown that sometimes this means humans will be replaced by machines or robots.

Furthermore, firms must make profits in order to stay in business, which gives them an incentive to pursue product innovation. These drivers do not exist in the public sector. As a result, I argue that private R&D investment is the main driver behind innovation and consequently the more appropriate proxy for automation. For the reasons outlined, I therefore refrain from using only public R&D expenditures, as for instance provided by the German government. There are two other potential options, which need to be discussed briefly. The first alternative would be to use the OECD Gross Domestic Expenditure on R&D³ (GERD) database. The GERD dataset covers R&D performed across all four sectors (business enterprise, government, higher education, and private non-profit) and thus provides the most accurate data on total R&D expenditure. Being able to capture R&D of all sectors is certainly an advantage and could help the research, but GERD only provides aggregate data and does not differentiate between industries, which makes it unviable for my approach. The second alternative is the standard OECD Business Enterprise Research & Development expenditure database (BERD). While BERD is similar in structure and data, ANBERD is capable of covering a longer time period by estimating missing values based on national accounts. By using the ANBERD data I am able to extend my time series to the 1990s, even if this means some of the data for the 1990s is based on estimates.

² For details please see: http://www.oecd.org/sti/inno/anberdanalyticalbusinessenterpriseresearchanddevelopmentdatabase.htm

The ANBERD dataset is linked to the OECD Structural Analysis (STAN) dataset, and both are structured according to the International Standard Industrial Classification of All Economic Activities Revision 4⁴ (ISIC Rev. 4). This makes merging the two datasets significantly easier, but they are also very complex and detailed. Technically, the data covers around 100 industries, and 138 for labour market data. Comparing the whole dataset would therefore not be beneficial to my research, as each R&D data point needs an equivalent employment data point to test for correlation, so the data must be complete for all independent variables and the dependent variable. Therefore, the dataset is reduced to a handful of broader categories and also excludes some of the smaller sub-categories for reasons of clarity and comprehensibility. The reduced industry selection is still in line with the ISIC Rev. 4 classification. It is important to add that ISIC Rev. 4 is also in concordance with NACE Rev. 2, the classification scheme used by Eurostat. The broad structure of NACE Rev. 2 is shown in table 1.
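The requirement that every R&D data point has an employment counterpart can be sketched as a join on industry and year. This is a minimal illustration, assuming both sources have already been loaded into mappings keyed by `(industry, year)`; the function name `merge_complete_cases`, the field names and all values are hypothetical and do not reproduce the actual ANBERD or STAN file layout.

```python
def merge_complete_cases(anberd, stan):
    """Join two {(industry, year): value} mappings on their shared keys.

    Only industry-year pairs observed in both sources are kept.
    """
    merged = {}
    for key in sorted(anberd.keys() & stan.keys()):
        rd, emp = anberd[key], stan[key]
        if rd is not None and emp is not None:
            merged[key] = {"rd": rd, "emp": emp}
    return merged


# Hypothetical values; "C" stands for the manufacturing aggregate.
anberd = {("C", 2015): 3.5e7, ("C", 2016): None, ("F", 2015): 6.5e4}
stan = {("C", 2015): 7500, ("C", 2016): 7600, ("A", 2015): 640}
print(merge_complete_cases(anberd, stan))
```

Because both datasets share one industry classification, the industry code can serve directly as part of the join key without a concordance table.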

For the independent and control variables (Value Added and Labour Costs) data is imported from the OECD STAN⁵ (Structural Analysis) database. The STAN data uses the same classification system as the ANBERD data, and as with the ANBERD data I reduce the more than 100 available distinct industries to an individual list to make them fit. For reasons of clarity, abbreviations and acronyms are sometimes used for the industries in tables, graphs, and, if necessary, throughout the running text. In case of ambiguity please see Annex 3 and 4. In order to use Research & Development intensity as an additional independent variable I make further use of OECD data (Galindo-Rueda & Verger, 2016; OECD, 2019).

⁴ For details please see: https://unstats.un.org/unsd/publication/seriesm/seriesm_4rev4e.pdf

Table 1: Broad structure of NACE Rev. 2

Source: Eurostat. (2008). NACE Rev. 2: Statistical Classification of Economic Activities in the European Community (Revision 2, English edition). Luxembourg: Office for Official Publications of the European Communities. p. 57

Annex 1 shows the list of industries ranked by their Research & Development intensity based on the ISIC Rev. 2 classification. The table is based on the two-digit classification, as I use a reduced industry list for my research and, thus, all relevant industries are covered by the two-digit classification. The classification is based on the OECD average, because a separate one for Germany could not be retrieved. The levels of R&D intensity for Germany and the OECD average, however, are very close, as can be seen in the latest report on main science and technology indicators (OECD, 2019). R&D intensity is generally defined by the OECD “as the ratio of R&D expenditure to an output measure, usually gross value added (GVA) or gross output (GO)” (Galindo-Rueda & Verger, 2016, p. 6). At the company level R&D intensity is sometimes defined as R&D expenditure divided by sales. However, for the aggregate per-industry indicator as used in this thesis the OECD uses gross value added (GVA) taken from the OECD STAN database. Data on R&D expenditure is taken from the ANBERD database and includes only business enterprise R&D, so the additional variable is based on the same data as the main variables (Galindo-Rueda & Verger, 2016, p. 8). Furthermore, the variable represents only direct R&D expenditure, as the lack of sufficiently detailed input/output data makes it impossible to report on indirect R&D expenditure at the aggregate level. Indirect R&D is supposed to measure the diffusion of the R&D content of output into other industries (Galindo-Rueda & Verger, 2016, p. 7). My approach does not need such detailed data, but it might be interesting for further micro-level research on the same topic. Another limitation is that the data on R&D intensity is not time series data. The value is based on observations from 2011 and is not computed for every year. The OECD acknowledges this fact, but at the same time outlines why a time series is not possible with the limited data (Galindo-Rueda & Verger, 2016, p. 9). The reference also explains the choice of 2011 as the reference year. According to the OECD analysis, 2011 is the year by which most industries had recovered from the financial crisis of 2008, ensuring the data is not biased. The thesis bases its analysis on the 2011 values and assumes R&D intensity to be constant throughout the 26-year period. Combining the 2011 results with older data is difficult, as the values are non-numeric. Combining them would therefore require a reliable method, and no such method is provided by the OECD. Additionally, as outlined, I already combine certain categories to be able to evaluate aggregate categories. Computing these aggregate values on the basis of per-year and per-industry data would likely lead to a loss of precision and to overlapping categories. This would violate the requirement that a good classification system be mutually exclusive and collectively exhaustive.
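The OECD definition of R&D intensity quoted above is a simple ratio, which the following minimal sketch makes explicit. The function name `rd_intensity` is hypothetical, and the input figures are the period means for total business enterprise reported in table 3; a ratio of period means is used here purely as an illustration and is not the 2011-based value underlying the OECD classification.

```python
def rd_intensity(rd_expenditure, gross_value_added):
    """R&D intensity as the ratio of R&D expenditure to gross value added."""
    if gross_value_added <= 0:
        raise ValueError("gross value added must be positive")
    return rd_expenditure / gross_value_added


# Illustration only: period means for total business enterprise
# from table 3, both in thousand Euros.
intensity = rd_intensity(3.99e7, 2.08e9)
print(f"{intensity:.2%}")
```

Both inputs must be in the same unit and price basis for the ratio to be meaningful.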

3.3 Econometric Model

Prior to the regression analysis, the goodness of the model was tested using statistical methods. Firstly, a general test of the model was carried out using scatter-plots. With over 100 variables it is neither beneficial nor possible to display all of them, so only the scatter-plots for the two main aggregate categories are included. Figure 2 shows the relationship between aggregate employment and aggregate R&D expenditure, and figure 3 displays the same for aggregate service sector employment and R&D investments. Both figures show a very good fit of the model, a pattern found across all variables apart from a few exceptions. Based on the results of the scatter-plots, R&D expenditure qualifies as a predictor variable for employment.

Figure 2: Scatter-plot for Aggregate R&D and Employment (x-axis: Aggregate R&D expenditure; plotted: Aggregate Employment and fitted values)

Figure 3: Scatter-plot for Aggregate R&D and Employment Services (x-axis: Aggregate R&D Expenditure Services)

Before running the regressions, additional tests were carried out, which are briefly described here. Pearson's correlation coefficient was used to detect possible multicollinearity in the data. The full regression table with all coefficients for primary and secondary sector industries can be found in Annex 5, and for tertiary sector industries in Annex 6. According to Cohen (1988), a coefficient of 0.1 to 0.3 is considered small, anything bigger than 0.3 and smaller than 0.5 is a medium correlation, and anything bigger than 0.5 is considered a strong correlation. The correlation coefficients were found to be very high overall, indicating a high degree of multicollinearity. The findings are backed by the observation that the coefficients change considerably when labour costs is added as a variable to the model. Pearson's correlation coefficients are especially high between labour costs and value added. This might be a sign that labour costs is in fact not an appropriate proxy for operational costs, but rather for economic performance. It appears logical that a business increases its workforce when it runs well, thereby increasing labour costs.
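The correlation check can be sketched in a few lines of Python. The following is a minimal illustration using only the standard library; the function names, the example values and the handling of the exact boundary values 0.3 and 0.5 are assumptions, as is the "negligible" label for coefficients below 0.1, which the text does not classify.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def cohen_label(r):
    """Classify the strength of a correlation following Cohen (1988)."""
    size = abs(r)
    if size >= 0.5:
        return "strong"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "negligible"  # assumption: not classified in the text


# Hypothetical value added and labour cost series for one industry.
value_added = [100, 110, 125, 130, 150]
labour_costs = [60, 66, 74, 79, 90]
r = pearson_r(value_added, labour_costs)
print(round(r, 3), cohen_label(r))
```

Two series that grow together over time, as in this example, will almost always fall into the strong category, which is the pattern the text reports for labour costs and value added.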

As some of the results for Pearson's correlation appeared problematic, I additionally performed the Shapiro-Wilk test to check whether my variables are normally distributed. Should that not be the case, the results of Pearson's correlation are unreliable. The Shapiro-Wilk test is well suited to the data, as it is relatively precise even for small datasets with N<50, whereas other significance tests lose precision when applied to small-N datasets (Seier, 2002). At an alpha of .05, the results of the Shapiro-Wilk test indeed showed that a number of independent variables are not normally distributed. The results of Pearson's correlation might therefore be unreliable. To react to those findings, I also computed the variance inflation factor (VIF) of my explanatory variables. The results can be found in Annex 7. There is, however, debate about how to interpret the results of VIF analysis: some scholars propose a maximum VIF of 5, while others tolerate values up to 10. As Pearson's correlation already suggests a high level of multicollinearity, I follow a tolerant approach, allowing variance inflation factors of up to 10.
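The variance inflation factor check can be sketched as follows. This is a minimal illustration that assumes NumPy is available and uses synthetic data in which labour costs (LC) are deliberately constructed to track value added (VA). The helper names `r_squared` and `vif` are hypothetical and do not reflect the software used for the thesis.

```python
import numpy as np

def r_squared(y, X):
    """R^2 of a least-squares regression of y on X (X includes a constant column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    centred = y - y.mean()
    return 1.0 - (residuals @ residuals) / (centred @ centred)

def vif(regressors):
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing variable j on all others."""
    names = list(regressors)
    data = np.column_stack([regressors[name] for name in names])
    n = data.shape[0]
    result = {}
    for j, name in enumerate(names):
        others = np.delete(data, j, axis=1)
        X = np.column_stack([np.ones(n), others])
        result[name] = 1.0 / (1.0 - r_squared(data[:, j], X))
    return result

# Synthetic data (an assumption for illustration): LC is built to follow VA closely.
rng = np.random.default_rng(0)
rd = rng.normal(50.0, 10.0, size=25)
va = rng.normal(100.0, 15.0, size=25)
lc = 0.6 * va + rng.normal(0.0, 1.0, size=25)

vifs = vif({"RD": rd, "VA": va, "LC": lc})
for name, value in vifs.items():
    flag = "above 10" if value > 10 else "within tolerance"
    print(f"{name}: VIF = {value:.1f} ({flag})")
```

In this constructed example both VA and LC exceed the tolerant threshold of 10, while RD stays low. Dropping one variable of the collinear pair brings the remaining factors back down.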

Multicollinearity does not necessarily need to be a problem, but it should be kept as low as possible. To account for the problem I use three different regression equations. The results of Pearson's correlation, and especially of the variance inflation factor analysis, identified labour costs as a problematic variable. Therefore, only the long model includes labour costs as an explanatory variable, while it is omitted in the short model. The short regression equation is labelled equation 1 and can be found below:

(1) EMP=α+β RD+γ VA+ε

The second equation describes the long model including both control variables and is shown below:

(2) EMP=α+β RD+γ VA+δ LC+ε

Here EMP represents the dependent variable employment and α stands for the intercept. RD is the main explanatory variable, R&D investments, and its effect is β. VA stands for value added per industry and is the first control variable; its effect is described by γ. The second control variable is LC, the labour costs per industry, and its effect is represented by δ. Lastly, ε represents the error term.
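The difference between the short and the long model can be illustrated with a minimal least-squares sketch. It assumes NumPy and uses synthetic data with known coefficients, not the thesis dataset. The helper `ols` is a hypothetical name and only returns point estimates, without standard errors.

```python
import numpy as np

def ols(y, *regressors):
    """Least-squares coefficients [intercept, slope_1, ...] of y on the regressors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Synthetic data (an assumption for illustration). LC follows VA closely and
# has no effect of its own on EMP, which depends only on RD and VA.
rng = np.random.default_rng(1)
n = 25
rd = rng.normal(50.0, 10.0, size=n)
va = rng.normal(100.0, 15.0, size=n)
lc = 0.6 * va + rng.normal(0.0, 1.0, size=n)
emp = 5.0 + 0.3 * rd + 0.2 * va + rng.normal(0.0, 0.5, size=n)

short_model = ols(emp, rd, va)      # equation (1): alpha, beta, gamma
long_model = ols(emp, rd, va, lc)   # equation (2): alpha, beta, gamma, delta

print("short:", np.round(short_model, 3))
print("long: ", np.round(long_model, 3))
```

Comparing the estimate for VA across the two outputs shows how adding a collinear control can shift a coefficient, which is the behaviour that motivated reporting both models.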

The third model is based on the assumption that my explanatory variables have a lagged effect on employment. The underlying mechanism is intuitive: money spent on R&D is likely to have a delayed effect, as research and the implementation of new products or processes require time. The length of this research and implementation period is debatable. Given the length of my time series, I use two years. The lagged regression equation is represented by equation 3 and can be seen below:

(3) EMP_t = α_0 + β_0 RD_t + β_1 RD_{t−1} + β_2 RD_{t−2} + γ_0 VA_t + γ_1 VA_{t−1} + γ_2 VA_{t−2} + ε
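How a regression with two yearly lags of RD and VA can be set up is sketched below. This is a minimal illustration assuming NumPy and a synthetic, noise-free series. The function `lagged_design` and the coefficient values are hypothetical. Note that each lag costs one observation at the start of the series.

```python
import numpy as np

def lagged_design(rd, va, lags=2):
    """Return the design matrix [1, RD_t, ..., RD_{t-lags}, VA_t, ..., VA_{t-lags}].

    Row i corresponds to year index t = i + lags, so the first `lags`
    years are lost because their lagged values are not observed.
    """
    rd = np.asarray(rd, dtype=float)
    va = np.asarray(va, dtype=float)
    n = len(rd)
    columns = [np.ones(n - lags)]
    for series in (rd, va):
        for q in range(lags + 1):
            # Slice shifted back by q years, aligned with years lags..n-1.
            columns.append(series[lags - q : n - q])
    return np.column_stack(columns)

# Synthetic yearly series for one industry (an assumption for illustration).
rng = np.random.default_rng(2)
n_years = 20
rd = rng.normal(50.0, 10.0, size=n_years)
va = rng.normal(100.0, 15.0, size=n_years)

X = lagged_design(rd, va, lags=2)

# Hypothetical coefficients: intercept, three RD effects, three VA effects.
true_coef = np.array([2.0, 0.5, 0.3, 0.1, 0.4, 0.2, 0.1])
emp = X @ true_coef  # noise-free employment for the years with both lags available

estimated, *_ = np.linalg.lstsq(X, emp, rcond=None)
print(np.round(estimated, 3))
```

With 20 years of data and two lags, 18 usable observations remain for 7 parameters, which shows why the number of lags is constrained by the length of the time series.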

For the lagged model, labour costs have been dropped due to issues with multicollinearity. Apart from that, the variables are the same as in the short model (1). Instead of a single effect, however, the model accounts for a distinct effect for each lag. My dataset features yearly data, so one lag represents one year. In the lagged model, t represents the year of analysis, while t – q indicates by how many years the explanatory variables are lagged. For t – 2 this means that employment of an industry in 1995 is regressed on R&D expenditure of the same industry in 1993. Accordingly, β1 represents the effect of RD lagged by one year, β2 by two years, and so on. In addition to dropping labour costs as a variable, I also excluded from the lagged regressions some industries that displayed problematic variance inflation factors of more than 10 even after labour costs had been dropped. Interestingly enough,
