Executive Program in Management Studies
Strategy track Master Thesis
Big Data and Financial Performance – Does CEO Age, CEO Tenure and CFO Age influence the relation between firms that manage Big Data and their Financial Performance?
Stefan Boom – 10730591 Thesis Supervisor: dr. D.A. Waeger
August 2016 Amsterdam
Statement of Originality
This document is written by Student Stefan Boom who declares full responsibility for the contents of
this document.
I declare that the text and the work presented in this document is original and that no sources other
than those mentioned in the text and its references have been used in creating it.
The faculty of Economics and Business is responsible solely for the supervision of completion of the
Table of Contents
Abstract
1. Introduction
2. Hypotheses
3. Methodology
   1. Sample
   2. Independent Variable
   3. Dependent variable
   4. Moderator variables
   5. Control variables
4. Results
   1. Descriptive analytics
   2. Normalization
   3. Pearson Correlation
   4. Regression
5. Discussion
6. Limitations and future research
7. Conclusion
Bibliography
Appendix
   Appendix 1 – Descriptive Statistics
   Appendix 2 – Collinearity Statistics
   Appendix 3 – Distribution – Independent, Dependent and Moderator Variables
Abstract
Research in 2014 stated that 99.8% of all existing data had been created in the two years prior to 2014.
This trend reflects a world that is digitalizing further, which leads to the creation of
more data. These enormous amounts of data, which are called Big Data, could contain valuable
information for firms and could lead to better firm (Financial) Performance when used properly.
However, to get insights from this information the data needs to be managed. Therefore, firms are
exploring their opportunities regarding Big Data management. To do this, firms need the right
managers and employees who can help them manage Big Data correctly.
While there is a lot of focus on Big Data and the opportunities it provides for a firm, there is
still a lot of research to be done. Therefore, I research whether the use of Big Data and the
experience in using Big Data have a positive influence on the Financial Performance of a firm. The
study uses a hierarchical regression analysis on a sample of 375 firms from the
S&P 500. To explore the importance of the right employees and managers, the moderating effect of
CEO Age, CEO Tenure and CFO Age on the relation between Big Data Usage/Big Data Experience
and Financial Performance is explored. The results of this study show no support for a relation
between Big Data Usage/Experience and Financial Performance. No support was found for the
other hypotheses either. However, contrary to the stated hypothesis, this study finds a
moderating effect of CEO Age on the relation between Big Data Experience and Financial
Performance, indicating that an older CEO has a positive effect on the relation between Big
1. Introduction
To outcompete each other, firms look for competitive advantage (Peteraf, 1993). Porter (1980)
explains that firms can outcompete each other through three different strategies, namely
differentiation, overall cost leadership and focus. With this, Porter (1980) laid the foundation of
strategy research and showed that competitive advantage is the cornerstone for every firm. He stated
that for a firm to choose the right strategy, strong analytical skills are needed to understand the
market. In doing so, Porter (1980) approaches the firm as a black box, that is, as a
unitary agent, without looking at the internal opportunities of the firm. After Porter (1980), a lot of
research has been done to understand how to achieve competitive advantage by focusing on internal
factors (e.g. creating efficiency inside a factory, where the black box is opened up) and external factors (e.g.
entering the right market). To understand which internal and external factors to focus on, insights into
these factors are needed. With these insights available, the firm is able to take the necessary strategic
decisions to guide the firm to competitive advantage. Analytics can be used for this, but when
no insights are available and knowledge is therefore lacking, the decision has to be made on the
basis of instinct (Davenport, 2006). While both approaches have their pros (e.g. analytics is fact based,
instinct is low cost and quick because it needs no additional information) and cons (e.g. analytics is costly,
instinct is potentially less accurate because it uses less information), more and more firms are
exploring the benefits of the analytical approach. Along with fast technological developments, this is
increasingly becoming a way for businesses to achieve competitive advantage (Chen, et al., 2012).
The introduction of first computers and later the internet has had a big influence on the
economy (Choi & Yo, 2009). Technological innovations, such as smartphones, have
changed the landscape of and access to information. The use of all these technologies increases the
buildup of data and facilitates access to the internet. The amount of data that is generated is increasing
exponentially: more business processes are being handled by information technologies and more and
more computers are being used. To be precise, of all data existing in 2014, 99.8% was created in the two
years prior to 2014 (Leung, 2014). This exponential growth of data creates new challenges for firms
With the internal and external factors in mind, the challenge is to get the most out of Big Data and to ensure that Big Data benefits the firm.
Chen et al. (2012) describe Big Data as data sets and analytical techniques in
applications that are so large and complex that they require advanced and unique data storage,
management, analysis, and visualization technologies. This data ranges from terabytes to exabytes
and comes from sources such as sensors or social media. Data from a single source does not
always give meaningful and expected insights. Therefore, the data should be combined with other
sources. In Big Data management, sources can be combined (e.g. a logistics application
connected with a financial application) to add value and give meaningful, up-to-date insights, which are
needed to draw conclusions and take action. To view and analyze these combined sources,
firms need separate applications. This type of application is called Business Intelligence software.
While the term Business Intelligence is nowadays used for these applications, it
was introduced with another meaning. Before the term Big Data was introduced, the terms Business
Intelligence and Business Analytics were used in the 1990s and 2000s (Chen, et al., 2012).
Business Intelligence described the relation between business and IT communications, and Business
Analytics described the key analytical component of Business Intelligence (Davenport, 2006).
Nowadays the term Business Intelligence is used for the applications that provide the insights and
visualization of the data, while Business Analytics still describes the analytical components within the
Business Intelligence tool. Examples of these applications are QlikView/Qlik Sense, Microsoft Power BI
and Tableau. With these applications, users across the organization are able to quickly analyze the firm’s performance without needing strong technical or analytical competences (Qliktech
International AB, 2016).
Big Data has become a trending topic within business but has also been noticed in other
sectors. For example, within the public sector a Big Data initiative of $200 million, started by Obama
in 2012, was carried out for research and development within the National Science Foundation, the
National Institutes of Health, the Department of Defense, the Department of Energy and the United States
sectors can achieve great advantages by actively managing Big Data. Doing so converts them
into what Chen et al. (2012, p. 1168) call “data driven organizations”. A data-driven organization is a
firm that bases its decision making on analysis, measured by 1) the usage of data for the creation of a
new product or service, 2) the usage of data for business decision making in the entire company, and
3) the existence of data for decision making in the entire company (Brynjolfsson, et al., 2011).
Research shows that data-driven firms are able to manage risks better and enhance competitiveness,
which creates value for the world economy (Manyika, et al., 2011). However, the development
towards becoming a data-driven firm and the advantages that the use of Big Data offers also bring
along important challenges.
The challenge is not only to collect and manage vast volumes and different types of data, but
also to extract meaningful value from this data (Bakshi, 2012). To do this, firms need managers and
analysts with knowledge about how Big Data can be managed. A report by the McKinsey Global
Institute (2011) predicted that by 2018 the United States alone would face a shortage of 140,000 to
190,000 people with deep analytical skills, as well as a shortfall of 1.5 million data-savvy managers
with the know-how to analyze big data to make effective decisions (Manyika, et al., 2011). Firms need
to accelerate employment programs, while making significant investments in the education and
training of personnel on all levels to prepare themselves for Big Data management and to keep up with
competitors (Sagiroglu & Sinanc, 2013).
The current literature on Big Data and business analytics describes the importance of
competences and knowledge with regard to Big Data. The decision on whether and how to manage
Big Data impacts the entire organization. These kinds of decisions are part of the strategic decisions of
the Top Management Team (TMT) of a firm. Leadership of a complex organization is a shared
activity, and the collective cognitions, capabilities, and interactions of the entire TMT
influence the firm's strategic behaviors (Hambrick, 2007). This statement by Hambrick (2007)
underlines the importance of the TMT with regard to the strategic decisions of a firm and specifically to
how Big Data is managed within the firm. The decision to actively manage Big Data starts with the
change in mind-set from leadership down to the front lines (Goyal, et al., 2012). Hambrick (2007)
also explains that the experience gained during an executive’s career influences his or her behavior and approach
towards different situations. As Big Data management has different pros and cons for the
different TMT members and the experience of the executives influences their behavior, I will focus on
the role of the CEO and the CFO with regard to Big Data management.
When Big Data is well managed within a firm, it provides valuable insights into
developments both within and outside the firm. Managing Big Data requires a different focus
from several years ago, when Big Data was not as important as it is nowadays.
The CEO of the firm is closely involved in the strategy and the strategic initiatives within the firm
(Hambrick, 2007). The support of the CEO is needed to get the most out of the available data across the
firm. Goyal et al. (2012) confirmed the importance of the role of the CEO in the top-down approach to
investing in Big Data. Davenport (2006) stated that older CEOs will have less experience using data
and were not educated in the use of Big Data during their studies, as this was not yet part of steering a firm.
Sagiroglu & Sinanc (2013) explain that working with Big Data requires differently
trained personnel. Studies like Davenport (2006) and Sagiroglu & Sinanc (2013) suggest that the age of
the CEO affects how a firm manages its Big Data and whether this influences Financial
Performance. Another aspect of CEOs is their tenure. Long-tenured CEOs tend to grow "stale in the
saddle," making it harder to make adaptive changes (Hambrick, 2007). Due to the inflexibility
explained by Hambrick (2007), a longer-tenured CEO tends to avoid risk. To get the
most out of Big Data management, decisions need to be taken on the use of applications and on
investing in capacity and the right employees. This suggests that a long-tenured CEO will
be more likely to avoid risk and will not fully focus on Big Data management, which leads to low
utilization of the data and could negatively impact the Financial Performance of the firm.
Another important role within the TMT regarding Big Data management is that of the CFO, for
several reasons. Firstly, the CFO is involved in large investments and is therefore most likely involved
in deciding how much will be invested in Big Data management, which can influence Financial Performance
(Aier, et al., 2005). To manage the firm well, insights into the performance of all of the firm's
companies are needed, which correct Big Data management can provide. When the
CFO has these insights, this can influence Financial Performance positively because more accurate decisions
are made.
This thesis investigates the relationship between the use of Big Data and Financial
Performance. The research is performed in a quantitative way, since previous
research has mainly been qualitative (e.g. Davenport (2006) and Chen
et al. (2012) undertook qualitative studies to understand the benefits of the use of Big Data). The
quantitative approach is a relatively new way to research the benefits of the use of Big Data.
It therefore makes collecting the necessary sample a challenge, but it also opens up the quantitative
research area on the benefits of Big Data management and its effect on Financial Performance. While
the argument that the use of Big Data contributes to Financial Performance has been formulated by
other researchers (Murphy & Zimmerman, 1993) (Mian, 2001), less research has focused so far on how the
characteristics of the members of the Top Management Team influence the relationship between Big
Data and Financial Performance.
2. Hypotheses
Managing Big Data actively enhances the firm’s competitiveness and provides competitive insights
(Manyika, et al., 2011) (Davenport, 2006). Chen et al. (2012) refer to different examples of the use
of Big Data within fields like health care and market intelligence. Across industries, different
initiatives exist that focus on Big Data management and analytics. Procter & Gamble, for instance,
composed a group of analysts drawn from functions such as operations, supply chain, sales,
consumer research and marketing (Davenport, 2006). This example shows that there is a focus on Big
Data across industries and that firms are taking initiatives to develop their competences in the use of
Big Data (Sagiroglu & Sinanc, 2013). As explained before, Big Data helps to provide insights within
the firm and enhances its decision making. Big Data influences the decision making process positively
information (Davenport, 2006). This is possible due to the careful analysis Big Data provides and the
depth needed to really solve problems, supported by Business Intelligence applications (Sagiroglu &
Sinanc, 2013). In the literature, the benefits of having these insights within the firm have been widely
discussed, and this literature suggests a positive effect of the use of Big Data on Financial Performance.
Therefore, I hypothesize in this thesis that there is a positive relationship between Big Data Usage and
Financial Performance in firms:
Hypothesis # 1a: Firms that use big data techniques have a better financial performance than firms that do not use big data techniques.
The vast majority of research on organizational experience adopts a learning-curve perspective
that predicts positive returns to experience (Haleblian & Finkelstein, 1999). For Big Data this can be
explained by the argument that a firm that has actively managed its Big Data for a longer period of time
should become better at managing Big Data. This should lead the firm to achieve a more effective
utilization of this data (e.g. a firm would be able to get in-depth performance insights for all companies within
the firm thanks to its experience in Big Data management). Also, the learning curve creates entry
barriers and protection from competition. When a firm gets better at something through
experience, it enhances its position within the market, thereby making it harder for other firms to enter
and compete within the market (Spence, 1981). Entry barriers have been
widely discussed by Porter (1980), who shows how the learning curve assists in maintaining competitive
advantage. In this case, a firm managing Big Data for a longer time would become better at it,
thereby achieving competitive advantage and securing its position in the market by protecting against
new entrants. Given the organizational learning curve, where a firm
becomes better at something the longer it does it, I expect that a firm that has managed
Big Data actively for a longer period of time becomes better at exploiting the opportunities of Big
Data. By utilizing Big Data management, a firm will have better insights for strategic decisions, which
will contribute to a positive Financial Performance. This leads to the following hypothesis:
Hypothesis # 1b: The longer the firm is managing Big Data actively the better its financial performance.
Previous experiences during the careers of TMT members influence the strategic decisions being
made (Hambrick, 2007). As a CEO gets older, the CEO is likely to have more experience. An
older CEO has experienced more situations, which makes it likely that in new situations the CEO will
apply solutions that were effective at an earlier stage of his/her career. Due to this
behavior, an older CEO will be less likely to search for new initiatives and innovative ideas to approach
situations. A younger CEO will have to explore new approaches to new situations
as there is less experience to draw on. This makes it more likely for a younger CEO to look for new
approaches like managing Big Data. Besides the open approach of a younger CEO to new situations,
there are also other characteristics that influence the behavior of an older CEO with regard to the
utilization of Big Data. Firstly, the academic programs of older CEOs were not designed for the technological solutions that are used nowadays. Because older CEOs do not have knowledge of
these technologies, they do not know how to implement them in the right way for the firm (Chen, et al.,
2012). Secondly, research such as Serfling (2014) has shown that the age of the CEO influences his or her style
of management and decision making behavior through personal characteristics, for example
personal life experiences or overconfidence. Due to these characteristics an older CEO will focus on
the areas within the firm with which the CEO is familiar. This lack of focus on new initiatives and
innovations makes it more likely that an older CEO will not fully utilize Big Data management in
the firm. Thirdly, a younger CEO shows more risk-taking behavior than an older CEO
(Serfling, 2014). This is relevant to the present argument, because it is often difficult for
individual firms to find a direct link between the use of Big Data and enhanced performance. This
uncertainty means that there is a certain risk inherent in using and relying on Big Data for
decision-making. As older CEOs are more risk-averse, I expect them to let Big Data inform their decisions to a
lesser degree than more risk-taking younger CEOs. As a consequence, the potentially beneficial
impact of Big Data on Financial Performance should be weaker for firms headed by comparatively
older CEOs. These arguments lead to the following hypotheses:
Hypothesis # 2a: CEO Age moderates the relationship between use of big data techniques and financial performance, such that this relationship is weaker for firms with older CEOs.
Hypothesis # 2b: CEO Age moderates the relationship between the length of time big data techniques are used and financial performance, such that this relationship is weaker for firms with older CEOs.
Existing research shows that long-tenured CEOs become more conservative. They tend to “grow stale in the saddle”, causing a longer-tenured CEO to behave differently from more
recently appointed CEOs (Hambrick, 2007). For example, a longer-tenured CEO is expected to act less on
organizational change (Musteen, et al., 2006). Implementing and exploiting Big Data management
affects the whole organization. When a long-tenured CEO is less inclined to act on organizational
change, the CEO will also be less likely to fully exploit the possibilities of Big Data and to shape the
Big Data strategy. Miller (1991) explains how CEO Tenure influences the strategy of a firm and the
decisions being made. Both Miller (1991) and Musteen et al. (2006) show the influence of longer tenure on a CEO’s behavior
and decision making. Not only does CEO Tenure influence the CEO’s
behavior, it also influences the way peers value the CEO. A longer-tenured CEO is more likely to be
less valued by his or her peers than a CEO with a shorter tenure (Antia, et al., 2010). If peers value
a longer-tenured CEO less, the same could be true for the members of the TMT and other
managers within the firm. Even if a longer-tenured CEO were open to organizational change, it
could be hard to get the support needed to implement Big Data effectively throughout the organization, as the
CEO is likely to be valued less. Due to these behavioral characteristics of a longer-tenured CEO and
the way the CEO is valued, it is less likely that a longer-tenured CEO will fully exploit Big Data
management, and the firm will therefore achieve lower Financial Performance than under a shorter-tenured
CEO. This leads to the following hypotheses:
Hypothesis # 3a: CEO Tenure moderates the relationship between use of big data techniques and financial performance, such that this relationship is weaker for firms with longer tenured CEOs.
Hypothesis # 3b: CEO Tenure moderates the relationship between the length of time big data techniques are used and financial performance, such that this relationship is weaker for firms with longer tenured CEOs.
Many of the characteristics associated with higher CEO Age are also seen in other members of the TMT
(Weinzimmer, 1997). While the CEO will influence whether to manage Big Data actively or not from
a more strategic view, the CFO will be more on the user side of Big Data (e.g. using the BI
application). The CFO is responsible for the final financial reporting of the organization (Gillet &
Udin, 2005). To be able to report on the whole organization and to understand the trends, up-to-date
and detailed insights are needed within the firm. Big Data management helps to provide these insights.
Further strategic decisions can be made based on these insights, which should influence the Financial
Performance of the firm. The CFO is also responsible for authorizing budgets and investments (Couto
& Neilson, 2004). Implementing Big Data is associated with high investments, which makes it likely
that the CFO is involved from the start of the implementation of Big Data, thereby making the
engagement of the CFO important (Davenport, 2006). As described earlier, an older CEO is less likely
to take risk and becomes more conservative. These characteristics may also apply to the CFO, with
two consequences. Firstly, they could influence whether a CFO decides to invest in Big Data
management. Secondly, an older CFO will be more likely to report in a more old-fashioned way,
without using the newest technologies that come with Big Data management (e.g. a
Business Intelligence application). Due to this behavioral difference between older and younger
CFOs, a firm with a younger CFO will be more likely to achieve higher Financial Performance, since he or she
can be expected to use Big Data more actively than older CFOs. The younger CFO will be able to
exploit Big Data management further and provide support for the outcomes of these insights to guide
the firm in the direction of competitive advantage. These arguments lead to the following
hypotheses:
Hypothesis # 4a: CFO Age moderates the relationship between use of big data techniques and financial performance, such that this relationship is weaker for firms with older CFOs.
Hypothesis # 4b: CFO Age moderates the relationship between the length of time big data techniques are used and financial performance, such that this relationship is weaker for firms with older CFOs.
The described hypotheses lead to the following model presented in Figure 1. This model
negative moderating effect is shown for the moderator variables CEO Age, CEO Tenure and CFO
Age. This model will be analyzed in this thesis.
Figure 1: Proposed model
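As a reading aid, the proposed model can be sketched as a moderated regression. The notation below is an assumption introduced here for illustration and is not taken from the thesis: $BD_i$ stands for either Big Data Usage or Big Data Experience of firm $i$, $M_i$ for one moderator at a time (CEO Age, CEO Tenure or CFO Age), and $\mathbf{C}_i$ for a vector of control variables.

```latex
\begin{equation}
\text{ROA Growth}_i = \beta_0 + \beta_1\, BD_i + \beta_2\, M_i
  + \beta_3\,(BD_i \times M_i) + \boldsymbol{\gamma}^{\top}\mathbf{C}_i + \varepsilon_i
\end{equation}
```

Under this sketch, Hypotheses 1a and 1b correspond to $\beta_1 > 0$, and Hypotheses 2a to 4b correspond to a negative interaction coefficient, $\beta_3 < 0$, for the respective moderator.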
3. Methodology
To analyze the proposed model, quantitative research is conducted. As described earlier, a
quantitative approach to the use of Big Data in organizations has not been common in
earlier work (one example of such earlier work is Wamba et al. (2016), a quantitative study of the effect of Big
Data analytics on firm performance). For this thesis the data is collected with the help of
databases and enriched with hand-collected data. The use of databases makes a lot of data
accessible in a short amount of time. Databases also provide the advantage of access to unique
and meaningful data that would otherwise be hard to gather. They also provide access to data on some
of the largest firms in the world, which another type of data collection (e.g. a questionnaire) would not
provide in the available amount of time. As Big Data is an innovation which is likely to be adopted
first by bigger firms due to the high cost, this data could provide a better overall view of the effect of
the moderators on the relation between Big Data Usage/Big Data Experience and Financial
Performance (Davenport, 2006). The data will be collected from databases provided within the
for over 40,000 corporate, academic, government and nonprofit firms which provides information for
over 400 institutions in 30+ countries over different disciplines (Pennsylvania, 2016).
1. Sample.
The sample is based on the Standard & Poor’s 500 (S&P 500) in the USA. The S&P 500 is the leading
indicator of US equities and is meant to reflect the risk/return characteristics of the large cap
universe (GlobalXfunds, 2016). The S&P 500 is seen as the definition of the market, meaning that
developments within the sample are representative of the US market. Within different data sources
(e.g. Compustat) a lot of data is available about the firms within the S&P 500. The S&P 500 consists
of some of the largest companies in the world, such as 3M and Google. The size of these firms
makes it possible to provide a good understanding and representation of the research being done in this
thesis. Due to their size, the firms will be financially strong enough to cope with the large
investments needed to manage and use Big Data actively.
The data is collected from the Compustat database within WRDS. Compustat is a market database
published by S&P. This database contains data going back more than 50 years and is used by over
30,000 firms (Investopedia, 2016). The sample is collected for the year 2014, as this is the most
recent year with complete variables.
2. Independent Variable
The two independent variables used are Big Data Usage and Big Data Experience. As explained
earlier, at this point there is limited data available about the use of Big Data within firms, and even where it
exists it is not easily accessible (e.g. data held by Gartner, a commercial technology research organization
specializing in Data & Analytics). All firms have therefore been manually inspected. The research method is
the same for both measures; however, the interpretation of the data is different. To establish
whether a firm is using Big Data, I collected the firms’ publications on
the use of Big Data. These publications are press releases or articles about the use of Big Data within
the organization. As Big Data is a trending topic, firms actively publish press releases about their
developments in Big Data. For the variable Big Data Usage, I registered whether a firm has published
on using Big Data, and for the variable Big Data Experience, I registered the first year of publication to
looked up ‘firm X Big Data’. When I found a publication, I searched for the first publication and
registered in my dataset that the firm is active in Big Data management, together with the first year of
publication. In case multiple publications were found, I used the date of the first publication for my data
set. When no publication could be found, I went to the website of the firm itself and used the search tool to search for ‘Big Data’. If no publications were found, I registered that the firm is not
managing Big Data actively. All the data was obtained in April 2016.
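The coding of the two independent variables can be illustrated with a minimal Python sketch. It is not the procedure actually used for the dataset: the function name `code_big_data`, the firm names and the years are hypothetical, the 2013 cutoff follows the coding described for Big Data Usage in the Results chapter, and counting experience in years back from a 2014 reference year is an assumption made here for illustration.

```python
from typing import Optional, Tuple


def code_big_data(first_publication_year: Optional[int],
                  cutoff_year: int = 2013,
                  reference_year: int = 2014) -> Tuple[int, int]:
    """Return (usage, experience_years) for one firm.

    usage is 1 when a first publication on Big Data exists up to cutoff_year,
    otherwise 0. experience_years counts back from reference_year, which is
    an assumption of this sketch.
    """
    # No publication found, or first publication after the cutoff: non-user.
    if first_publication_year is None or first_publication_year > cutoff_year:
        return 0, 0
    return 1, reference_year - first_publication_year


# Hypothetical firms and years, for illustration only.
firms = {"Firm A": 2010, "Firm B": None, "Firm C": 2015}
coded = {name: code_big_data(year) for name, year in firms.items()}
print(coded)
```

Keeping the usage dummy and the experience measure in one function ensures that both are derived from the same hand-collected first publication year.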
3. Dependent variable
The dependent variable within this research is Return on Assets (ROA) Growth over a 5-year window
(2009 to 2014). ROA is a variable used in many studies due to its stable nature and comparability
across firms (Fairfield, et al., 2003) (Dess & Jr., 1984). ROA Growth is not directly downloadable
from Compustat and is therefore calculated manually. ROA is calculated as follows.
$$\text{Return on Assets} = \frac{\text{net income}}{\text{total assets}}$$
Both variables needed to calculate ROA (net income and total assets) are downloaded from
Compustat. This data is collected and ROA is calculated for both 2009 and 2014. To calculate the difference, I
subtracted the ROA of 2009 from the ROA of 2014. This represents ROA Growth for this specific
measurement period.
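The calculation can be illustrated with a short Python sketch. It assumes that ROA Growth is the 2014 ROA minus the 2009 ROA, so that positive values indicate improvement; the function names and the figures are hypothetical and are not taken from Compustat.

```python
def return_on_assets(net_income: float, total_assets: float) -> float:
    """ROA = net income / total assets."""
    if total_assets == 0:
        raise ValueError("total assets must be non-zero")
    return net_income / total_assets


def roa_growth(net_income_2009: float, total_assets_2009: float,
               net_income_2014: float, total_assets_2014: float) -> float:
    """Difference between the 2014 ROA and the 2009 ROA (assumed direction)."""
    roa_2009 = return_on_assets(net_income_2009, total_assets_2009)
    roa_2014 = return_on_assets(net_income_2014, total_assets_2014)
    return roa_2014 - roa_2009


# Hypothetical figures in millions, for illustration only.
growth = roa_growth(50.0, 1000.0, 90.0, 1200.0)
print(round(growth, 4))
```

In this example ROA rises from 5% in 2009 to 7.5% in 2014, so ROA Growth is 2.5 percentage points.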
4. Moderator variables
Within this research three moderator variables are used. Two of these variables concern the CEO,
namely CEO Age and CEO Tenure. The other variable concerns the CFO, namely CFO Age.
The variables CEO Age and CEO Tenure have been used frequently in other studies to explain the
effects age and tenure have on behavior (Hambrick, 2007) (Serfling, 2014). Information on the CEO
of a firm is widely available in multiple databases within Wharton Research Data Services (WRDS). The
database chosen for the information with regard to the CEO and CFO is Institutional Shareholder
Services Inc. (ISS). ISS is a database with a focus on several key datasets to uncover risk and
understand issues regarding the top management team, the board of directors, audit,
compensation and shareholder rights (WRDS, 2016). As information regarding the top management
also the data about the CFO from this database. For all three variables the data is retrieved for 2014.
The age of the CEO and CFO in 2014 is used. For the variable CEO Tenure, the date on which the CEO
started at the firm is collected. The tenure is then calculated as the difference between
2014 and the year the CEO started at the firm. The resulting number of years is recorded as the tenure of the CEO.
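The tenure calculation described above amounts to a simple year difference. A minimal Python sketch follows, in which the function name `ceo_tenure` and the start years are hypothetical:

```python
def ceo_tenure(start_year: int, reference_year: int = 2014) -> int:
    """Tenure in whole years: reference year minus the year the CEO started."""
    if start_year > reference_year:
        raise ValueError("start year lies after the reference year")
    return reference_year - start_year


# Hypothetical start years, for illustration only.
print(ceo_tenure(2006))
print(ceo_tenure(2014))
```

A CEO who started during the reference year receives a tenure of zero under this definition.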
5. Control variables
Three control variables are included in the research, namely Industry, Firm Size and
Firm Debt. All these variables are collected from Compustat and are based on the year 2014. The first
control variable is type of industry. Industry is added as a control variable due to the nature of some
industries. For instance, within the technology industry the relevant knowledge might be a lot higher than
in other industries, which makes it more likely for a firm in that industry to be active in and fully exploit Big Data
than other firms. The industry data is based on the Global Industry Classification Standard (GICS).
The GICS classification has the advantage of being stable year after year and is the most used among large
firms. The use of the GICS industry classification is also widely accepted and is used in many other
studies (Lee & Oler, 2003). The second control variable is Firm Size. Firm Size is measured by
the number of people employed. While one may expect that every firm within the S&P 500 is large,
there may still be large differences in terms of employees. The size of a firm partly determines the
availability of IT positions, making it more likely for a larger firm to create positions to manage Big Data.
Hence, Firm Size in terms of number of employees needs to be controlled for. The third and final
control variable is Firm Debt. A firm with high debt would possibly have fewer assets to invest in
projects like Big Data management and would therefore utilize Big Data less. This lack of
investment opportunity will negatively influence ROA Growth, whereas with low debt a firm would have the assets to invest in Big Data. To understand whether company’s debt plays a role it will be
4. Results
1. Descriptive analytics
The sample consists of 504 firms based on data from 2014. After excluding the firms with missing
variables, 375 firms with all required information remain. Table 1 shows the descriptive statistics
for the sample used (N = 375). ROA Growth shows the growth or loss in Return on Assets between
2009 and 2014 and ranges between -12.63 and 2.60. Big Data Usage indicates whether a firm has
published on Big Data up to the year 2013. A firm that published up to 2013 is coded 1; a firm that
did not publish up to 2013 is coded 0. The data shows that a little over half of the sample published
on Big Data up to 2013 (mean Big Data Usage = 0.5147). Big Data Experience shows how many years
before 2015 a firm's first publication on Big Data appeared, ranging between ‘0’ and ‘6’. ‘0’
represents no experience and ‘6’ represents six or more years of experience, meaning that the firm
actively managed Big Data in or before 2009. The data shows that, on average, firms had their first
publication 1.2 years before 2015 (mean Big Data Experience = 1.2347). The average CEO is 52.9 years
old (mean CEO Age = 52.9493); the youngest CEO is 27 years of age and the oldest CEO is 74 years of
age. The average CEO has a tenure of 7.3 years (mean CEO Tenure = 7.3413) and the average CFO is
53.3 years old (mean CFO Age = 53.3120). The youngest CFO is 38 years of age and the oldest CFO is
70 years of age.
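The coding of the two independent variables described above can be sketched as follows. This is a hedged illustration, not the coding procedure actually used: the function names are hypothetical, and the treatment of firms whose first publication falls after 2013 (coded 0 on both variables here) is an assumption.

```python
from typing import Optional


def big_data_usage(first_publication_year: Optional[int]) -> int:
    """1 if the firm published on Big Data up to 2013, otherwise 0."""
    return int(first_publication_year is not None and first_publication_year <= 2013)


def big_data_experience(first_publication_year: Optional[int]) -> int:
    """Years between the first publication and 2015, capped at 6; 0 without usage."""
    if not big_data_usage(first_publication_year):
        return 0  # assumption: no publication up to 2013 means no experience
    return min(6, 2015 - first_publication_year)


# Hypothetical first-publication years, for illustration only.
for year in (None, 2013, 2011, 2009, 2007):
    print(year, big_data_usage(year), big_data_experience(year))
```

Under this sketch a first publication in 2009 or earlier maps to the maximum value of 6, in line with the description of ‘6’ as active management of Big Data in or before 2009.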
Table 1 Descriptive Statistics of the used variables
N Min Max Mean St. Dev
ROA Growth 375 -12.63 2.60 -0.1341 0.9915
Big Data Usage 375 0.00 1.00 0.5147 0.5005
Big Data Experience 375 0.00 6.00 1.2347 1.4999
CEO Age 375 27.00 74.00 52.9493 5.9602
CEO Tenure 375 -1.00 52.00 7.3413 6.2129
CFO Age 375 38.00 70.00 53.3120 5.6773
2. Normalization
Based on the descriptive analytics, we determined that normalization of the data was necessary. The
skewness and kurtosis are calculated from the independent variable, dependent variable and
Table 2 Skewness and Kurtosis
skewness kurtosis
Return On Assets Growth -7.858 81.546
Return On Assets Growth log -3.791 22.209
Big Data Usage -.059 -2.007
Big Data Experience 1.411 1.873
CEO Age -.046 1.303
CEO Tenure 2.127 8.585
CFO Age -.130 -0.076
Return On Assets Growth is strongly negatively skewed (skewness = -7.858). For Big Data Usage, CEO
Age and CFO Age the skewness is close to zero, so these variables do not need to be normalized.
Big Data Experience and CEO Tenure are both positively skewed with relatively high values. Although
their skewness and kurtosis are high, these variables are not normalized, as the values are easier to
interpret in the further analysis when they are left untransformed.
The only variable that will be normalized is ROA Growth. To correct the data a LOG function is
applied, as this is a correction for large negative and positive skewness (Field, 2013).
Because the dataset contains some zeros, 1 is added to the ROA Growth value before the LOG is taken,
which gives the following formula.
ROA Growth LOG = LOG(Return on Assets Growth + 1)
The resulting skewness and kurtosis are shown in Table 2 under the variable Return On Assets
Growth log. The skewness is now -3.791 and the kurtosis 22.209, which is an improvement.
While the skewness and kurtosis are still high, these values are accepted for further analysis.
Therefore, Return On Assets Growth log will be used in the remainder of the analysis and will be
referred to as ROA Growth.
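The effect of a LOG(x + 1) transformation on skewness can be illustrated with a short Python sketch. Several points are assumptions: the data is hypothetical, base 10 is assumed because the base of the LOG function is not stated, and the uncorrected (population) moment formulas are used, whereas statistical packages such as SPSS apply a small-sample correction and will therefore report slightly different values.

```python
import math


def skewness(values):
    """Population skewness: third central moment divided by variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def excess_kurtosis(values):
    """Population excess kurtosis: fourth central moment / variance ** 2, minus 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


def log_plus_one(values):
    """Apply LOG(x + 1); only defined for x > -1 (base 10 is an assumption)."""
    return [math.log10(v + 1.0) for v in values]


# Hypothetical skewed values, not thesis data.
raw = [0.0, 0.1, 0.2, 0.3, 0.5, 0.8, 1.5, 4.0, 9.0]
transformed = log_plus_one(raw)
print(skewness(raw), skewness(transformed))
print(excess_kurtosis(raw), excess_kurtosis(transformed))
```

Note that LOG(x + 1) is undefined for values of -1 or lower. Since Table 1 reports a minimum ROA Growth of -12.63, such observations would need separate treatment (for example a larger added constant); how they were handled is not described in the text.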
3. Pearson Correlation
Pearson Correlation has been used to check whether there is a correlation between the various
variables used in this study and to check whether there is a concern for multicollinearity. The
correlation is shown in Table 3. A significance level of 0.05, coded as *, and 0.01, coded as **, will
The dependent variable ROA Growth is correlated with the GICS industries Energy (r = -0.240, n =
375, sig = 0.000), Financials (r = 0.114, n = 375, sig = 0.027) and Industrials (r = 0.119, n = 375, sig =
0.022). The independent variables Big Data Usage and Big Data Experience are correlated (r = 0.781,
n = 375, sig = 0.000). This makes sense, as both are based on the same source and about half of the
sample will have the same outcome: when a firm is not managing Big Data, both Big Data Usage and
Big Data Experience will be 0. Moreover, the two variables are used in separate analyses, as they are
the independent variables within this study, and will therefore not cause any issues. Big Data Usage
is also correlated with the GICS industries Consumer Staples (r = 0.115, n = 375, sig = 0.026) and
Information Technology (r = 0.163, n = 375, sig = 0.002). Big Data Experience is also correlated with
Firm Size (r = 0.192, n = 375, sig = 0.000) and Firm Debt (r = 0.109, n = 375, sig = 0.035). The moderator
variables CEO Age and CEO Tenure are correlated with each other (r = 0.368, n = 375, sig = 0.000).
This makes sense, as an older CEO is likely to have worked at a firm for a longer time and thereby to
have a longer tenure. CEO Age is also correlated with Information Technology (r = -0.168, n = 375,
sig = 0.001). CEO Tenure is also correlated with the GICS industries Consumer Discretionary (r = 0.113,
n = 375, sig = 0.029), Energy (r = -0.106, n = 375, sig = 0.040) and Health Care (r = 0.142, n = 375,
sig = 0.006), as well as with Firm Debt (r = -0.110, n = 375, sig = 0.034). CFO Age is correlated with
the GICS industry Consumer Staples (r = 0.120, n = 375, sig = 0.020).
When a correlation is higher than 0.8 it is likely that there is multicollinearity (Field,
2013). Within this thesis no correlation higher than 0.8 has been found. Field (2013, p. 325) explains
that this method is a good way to identify multicollinearity, but that it misses more subtle forms of
multicollinearity. Therefore, I also checked the variance inflation factor (VIF). The VIF should not be
above 10 to exclude the concern of multicollinearity. Within this thesis none of the VIF outcomes are
Table 3 Pearson correlation of all variables of the dataset
Correlations
1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17. 18.
1. ROA Growth 1
2. Big Data Usage -0.068 1
3. Big Data Experience -0.014 .781** 1
4. CEO Age 0.096 -0.060 -0.001 1
5. CEO Tenure 0.010 -0.029 -0.012 .368** 1
6. CFO Age -0.063 0.067 0.075 -0.024 -0.010 1
7. Gics - Consumer Discretionary 0.049 0.030 0.035 0.006 .113* -0.063 1
8. Gics - Consumer Staples 0.000 .115* 0.075 0.071 -0.082 .120* -.155** 1
9. Gics - Energy -.240** -0.052 -0.083 -0.007 -.106* -0.046 -.168** -.108* 1
10. Gics - Financials .114* 0.066 0.010 -0.060 0.056 0.050 -0.077 -0.050 -0.053 1
11. Gics - Health Care -0.001 -0.090 -0.086 0.046 .142** -0.016 -.184** -.118* -.127* -0.059 1
12. Gics - Industrials .119* -0.070 -0.016 0.004 -0.042 -0.010 -.217** -.139** -.150** -0.069 -.165** 1
13. Gics - Information Technology -0.040 .163** .105* -.168** -0.020 0.011 -.199** -.128* -.138** -0.064 -.152** -.179** 1
14. Gics - Materials 0.041 -0.093 -0.043 0.099 -0.027 -0.078 -.134** -0.086 -0.093 -0.043 -.102* -.120* -.111* 1
15. Gics - Telecommunication Services 0.010 0.011 0.059 0.001 0.042 0.059 -0.057 -0.037 -0.040 -0.018 -0.043 -0.051 -0.047 -0.032 1
16. Gics - Utilities -0.018 -0.069 -0.039 0.019 -0.074 0.056 -.142** -0.091 -0.099 -0.045 -.108* -.128* -.117* -0.079 -0.034 1
17. Firm Size 0.011 0.078 .192** -0.012 -0.044 0.077 .115* .188** -0.094 -0.049 -0.064 0.004 -0.024 -0.060 0.038 -0.093 1
18. Firm Debt -0.045 .112* .109* 0.024 -.110* 0.029 -0.085 0.037 0.071 -0.029 -0.039 -0.067 -0.063 -0.051 .364** .137** .368** 1
**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).
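The two multicollinearity checks used in this section, the pairwise Pearson correlation and the variance inflation factor, can be sketched as follows. This is a simplified illustration under stated assumptions: the toy data is hypothetical, and the VIF function covers only the special case of two predictors, where it reduces to 1 / (1 - r²). The VIF values reported in Appendix 2 are based on all predictors at once and can only be equal to or higher than this two-predictor value.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def vif_two_predictors(r):
    """VIF of a predictor when there is exactly one other predictor: 1 / (1 - r^2)."""
    return 1.0 / (1.0 - r ** 2)


# Hypothetical toy data, not thesis data.
usage = [0, 0, 1, 1, 1, 0, 1, 0]
experience = [0, 0, 2, 3, 6, 0, 2, 0]
print(round(pearson_r(usage, experience), 3))

# Reported correlation between Big Data Usage and Big Data Experience (Table 3).
print(round(vif_two_predictors(0.781), 2))  # 2.56
```

For the strongest correlation in Table 3 (r = 0.781) the two-predictor VIF is about 2.56, well below the threshold of 10 mentioned above.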
4. Regression
To test hypotheses 1a and 1b a hierarchical linear regression analysis was performed. To test the
moderation effects of hypotheses 2 to 4 the PROCESS application written by Hayes (2012) was
used.
For the hierarchical linear regression regarding hypotheses 1a and 1b four models were set up.
In Model 1 only the industry control variables were entered; in Model 2 the other control
variables, Firm Size and Firm Debt, were added as well. In Model 3 the independent variable
Big Data Experience was added and in Model 4 the independent variable Big Data Usage. The
results of the hierarchical linear regression are shown in Table 4.
Model 1 is significant (F = 3.602, p = 0.000). Adding the control variables Firm Size and Firm
Debt did not significantly improve the model (F change = 0.076, p = 0.926). Adding the
independent variables Big Data Experience (F change = 0.303, p = 0.582) and Big Data Usage
(F change = 2.739, p = 0.099) did not significantly improve the model either. Given these
non-significant p-values there is no support for Hypothesis 1a and 1b, that is, for a relation between
Return On Assets growth and either Big Data Usage or Big Data Experience.
Table 4 Hierarchical Linear regression
ROA Growth
                                      Model 1       Model 2       Model 3       Model 4
                                      B      Sig    B      Sig    B      Sig    B      Sig
Control Variables
Industry Consumer Discretionary       0.00   1.00   0.00   0.92   0.01   0.79   0.01   0.68
Industry Consumer Staples             -0.02  0.62   -0.02  0.65   -0.01  0.66   -0.01  0.76
Industry Energy                       -0.13  0.00   -0.13  0.00   -0.13  0.00   -0.14  0.00
Industry Financials                   0.10   0.07   0.10   0.07   0.10   0.07   0.11   0.05
Industry Health Care                  -0.02  0.58   -0.02  0.58   -0.02  0.56   -0.02  0.50
Industry Industrials                  0.03   0.32   0.03   0.32   0.03   0.33   0.02   0.40
Industry Information Technology       -0.03  0.25   -0.03  0.26   -0.03  0.28   -0.03  0.33
Industry Materials                    0.01   0.83   0.01   0.83   0.01   0.85   0.00   0.99
Industry Telecommunication Services   0.00   0.98   0.01   0.91   0.01   0.90   0.00   0.96
Industry Utilities                    -0.03  0.45   -0.02  0.50   -0.03  0.49   -0.03  0.39
Firm Size                                           0.00   0.98   0.00   0.96   0.00   0.79
Firm Debt                                           0.00   0.73   0.00   0.75   0.00   0.94
Independent Variables
Big Data Experience                                               -.003  .582   .009   .099
Big Data Usage                                                                  -.048  .341
R²                                    .082          .082          .083          .090
R² change                             .082          .000          .001          .007
F                                     3.602         .076          .303          2.739
Sig. F change                         .000          .926          .582          .099
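The F statistic for a change in R² between two nested models, as reported in the lower rows of Table 4, follows a standard formula that can be sketched as below. This is an illustrative sketch only: the function name is hypothetical, and the number of predictors in Model 4 (k_full = 14, counted as ten industry rows, two further controls and two independent variables) is an assumption read off the table rather than a documented model specification.

```python
def f_change(r2_reduced, r2_full, n, k_full, k_added):
    """F statistic for the R-squared increase when k_added predictors join a model."""
    df_residual = n - k_full - 1
    return ((r2_full - r2_reduced) / k_added) / ((1.0 - r2_full) / df_residual)


# Rounded R² values from Table 4 (Model 3 -> Model 4), N = 375.
# k_full = 14 is an assumption based on the rows listed in Table 4.
print(round(f_change(0.083, 0.090, n=375, k_full=14, k_added=1), 2))  # 2.77
```

With the rounded R² values the sketch gives about 2.77, close to the reported 2.739. A difference of this size is within what rounding R² to three decimals allows.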
For the other hypotheses the PROCESS application in SPSS was used. Every hypothesis
consists of a version A, where Big Data Usage is the independent variable, and a version B, where
Big Data Experience is the independent variable. In this manner both independent variables were
included in the analysis. The dependent variable of every model is ROA Growth, which describes
Financial Performance. For hypotheses 2a and 2b the moderator CEO Age was inserted, for hypotheses
3a and 3b the moderator CEO Tenure and for the final hypotheses 4a and 4b the moderator CFO Age.
In all PROCESS analyses the control variables Industry, Firm Size and Firm Debt were
included.
Table 5 shows the results for hypothesis 2a, which states that CEO Age moderates the
relationship between the use of Big Data techniques and Financial Performance, such that this
relationship is weaker for firms with older CEOs. The regression coefficient of Big Data Usage x CEO
Age is 0.0022, which is close to 0, and the interaction is not statistically significant (tXM = 0.7074,
pXM = 0.4797). These results show no support that CEO Age moderates the
relationship between the use of Big Data techniques and Financial Performance.
Table 5 Hypothesis 2a
                                        Coefficient   SE       t         p
Intercept                               -0.1236       0.0944   -1.3094   0.1912
Big Data Usage (X)                      -0.1369       0.1670   -0.8196   0.4130
CEO Age (M)                             0.0020        0.0017   1.1460    0.2525
Big Data Usage x CEO Age (XM)           0.0022        0.0032   0.7074    0.4797
Control Variables
Firm Size                               0.0001        0.0001   0.8346    0.4045
Firm Debt                               0.0000        0.0000   -1.1667   0.2441
Gics Sector                             0.0026        0.0032   0.8299    0.4071
R²                                      0.0188
F                                       1.1740
In Table 6 the results for hypothesis 2b are shown. This hypothesis explores whether
CEO Age moderates the relationship between Big Data Experience and Financial Performance, such
that this relationship is weaker for firms with older CEOs. The regression coefficient of the interaction
between Big Data Experience and CEO Age is 0.0023. Although this coefficient is close to 0, the
interaction is statistically significant and positive (tXM = 2.4062, pXM = 0.0166). The details in
Table 7 show that at CEO ages of 45.8 and below the effect of Big Data Experience on ROA Growth is
negative and significant, while at CEO ages of 64.6 and above the effect is positive and significant.
An older CEO thus has a positive influence on the relation between Big Data Experience and ROA
Growth. This is contrary to what was stated in the hypothesis, which indicates that hypothesis 2b
needs to be rejected.
Table 6 Hypothesis 2b
                                        Coefficient   SE       t         p
Intercept                               -0.0297       0.0980   -0.3031   0.7620
Big Data Experience (X)                 -0.1242       0.0513   -2.4231   0.0159
CEO Age (M)                             0.0001        0.0018   0.0417    0.9668
Big Data Experience x CEO Age (XM)      0.0023        0.0010   2.4062    0.0166
Control Variables
Firm Size                               0.0001        0.0001   1.1062    0.2694
Firm Debt                               0.0000        0.0000   -1.3702   0.1715
Gics Sector                             0.0033        0.0032   1.0426    0.2978
R²                                      0.0299
F                                       1.8884
Table 7 Johnson-Neyman test
CEO Age   Effect    p         CEO Age   Effect    p         CEO Age   Effect    p
27.0      -0.0619   0.0165    45.8      -0.0184   0.0429    62.3      0.0195    0.0648
29.4      -0.0564   0.0168    46.5      -0.0169   0.0500    64.0      0.0236    0.0500
31.7      -0.0510   0.0172    48.2      -0.0130   0.0829    64.6      0.0250    0.0465
34.1      -0.0456   0.0178    50.5      -0.0076   0.2300    67.0      0.0304    0.0372
36.4      -0.0401   0.0188    52.9      -0.0022   0.7105    69.3      0.0358    0.0318
38.8      -0.0347   0.0205    55.2      0.0033    0.5960    71.7      0.0413    0.0285
41.1      -0.0293   0.0234    57.6      0.0087    0.2298    74.0      0.0467    0.0262
43.5      -0.0239   0.0292    59.9      0.0141    0.1079
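The pattern in Table 7 can be reproduced approximately from the coefficients in Table 6. In a moderation model of the form Y = b0 + bX·X + bM·M + bXM·X·M, the effect of X at a given moderator value equals bX + bXM·M. The sketch below applies this to the rounded coefficients of hypothesis 2b. The function names are hypothetical, and because the inputs are rounded to four decimals the results differ slightly from the effects in Table 7. The p-values of the Johnson-Neyman test are not reproduced, as they require the covariance matrix of the coefficients, which is not reported.

```python
def conditional_effect(b_x, b_xm, moderator):
    """Effect of X on Y at a given moderator value: b_x + b_xm * moderator."""
    return b_x + b_xm * moderator


def sign_change_point(b_x, b_xm):
    """Moderator value at which the conditional effect of X equals zero."""
    return -b_x / b_xm


# Rounded coefficients from Table 6 (hypothesis 2b).
B_X, B_XM = -0.1242, 0.0023

for age in (27.0, 52.9, 74.0):
    print(age, round(conditional_effect(B_X, B_XM, age), 4))
print(round(sign_change_point(B_X, B_XM), 1))  # 54.0
```

With these inputs the effect of Big Data Experience changes sign at a CEO age of about 54, which lies between the rows for 52.9 and 55.2 in Table 7 where the reported effect turns from negative to positive.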
Table 8 presents the results for hypothesis 3a, which tests whether CEO Tenure moderates the
relationship between the use of Big Data and Financial Performance, such that this relationship is
weaker for firms with longer-tenured CEOs. The regression coefficient of Big Data Usage x CEO Tenure
is -0.0059, which is close to 0 and non-significant at the 0.05 level (tXM = -1.8180, pXM = 0.0699).
Given this non-significance, hypothesis 3a is not supported by this analysis.
Table 8 Hypothesis 3a
                                        Coefficient   SE       t         p
Intercept                               -0.0280       0.0235   -1.1911   0.2344
Big Data Usage (X)                      0.0208        0.0295   0.7045    0.4816
CEO Tenure (M)                          0.0016        0.0016   0.9894    0.3231
Big Data Usage x CEO Tenure (XM)        -0.0059       0.0032   -1.8180   0.0699
Control Variables
Firm Size                               0.0001        0.0001   0.7141    0.4756
Firm Debt                               0.0000        0.0000   -1.0228   0.3071
Gics Sector                             0.0023        0.0032   0.7210    0.4714
R²                                      0.1311
F                                       0.0276
The results for hypothesis 3b are shown in Table 9. This hypothesis tests whether CEO Tenure
moderates the relationship between the length of time Big Data techniques are used and Financial
Performance, such that this relationship is weaker for firms with longer-tenured CEOs.
The regression coefficient XM is -0.0007, which is close to 0 and non-significant (tXM =
-0.6186, pXM = 0.5366). These results show that hypothesis 3b is not supported by the analysis.
Table 9 Hypothesis 3b
                                        Coefficient   SE       t         p
Intercept                               -0.0283       0.0237   -1.1918   0.2341
Big Data Experience (X)                 0.0037        0.0104   0.3523    0.7248
CEO Tenure (M)                          0.0008        0.0017   0.4787    0.6324
Big Data Experience x CEO Tenure (XM)   -0.0007       0.0011   -0.6186   0.5366
Control Variables
Firm Size                               0.0000        0.0001   0.6680    0.5045
Firm Debt                               0.0000        0.0000   -1.0865   0.2780
Gics Sector                             0.0024        0.0032   0.7623    0.4464
R²                                      0.0058
F                                       0.3592
Table 10 shows the results for hypothesis 4a, which tests whether CFO Age moderates the
relationship between the use of Big Data techniques and Financial Performance, such that this
relationship is weaker for firms with older CFOs. The regression coefficient XM is 0.0008 and is
non-significant (tXM = 0.2473, pXM = 0.8048). From these results we can conclude that there is no
support for hypothesis 4a within this analysis.
Table 10 Hypothesis 4a
                                        Coefficient   SE       t         p
Intercept                               0.0953        0.1048   0.9091    0.3639
Big Data Usage (X)                      -0.0615       0.1680   -0.3661   0.7145
CFO Age (M)                             -0.0021       0.0020   -1.0827   0.2797
Big Data Usage x CFO Age (XM)           0.0008        0.0031   0.2473    0.8048
Control Variables
Firm Size                               0.0001        0.0001   0.8553    0.3929
Firm Debt                               0.0000        0.0000   -1.0278   0.3047
Gics Sector                             0.0023        0.0032   0.7378    0.4611
R²                                      0.0123
F                                       0.7624
Finally, Table 11 shows the results for the final hypothesis 4b, which tests whether CFO Age
moderates the relationship between the length of time Big Data techniques are used and Financial
Performance, such that this relationship is weaker for firms with older CFOs. The regression
coefficient XM is -0.0016 and is non-significant (tXM = -1.6044, pXM = 0.1095). This means that
within this analysis there is no support for hypothesis 4b.
Table 11 Hypothesis 4b
                                        Coefficient   SE       t         p
Intercept                               -0.0330       0.1074   -0.3075   0.7587
Big Data Experience (X)                 0.0877        0.0557   1.5738    0.1164
CFO Age (M)                             0.0001        0.0020   0.0717    0.9428
Big Data Experience x CFO Age (XM)      -0.0016       0.0010   -1.6044   0.1095
Control Variables
Firm Size                               0.0001        0.0001   1.0064    0.3149
Firm Debt                               0.0000        0.0000   -1.2030   0.2297
Gics Sector                             0.0029        0.0032   0.9064    0.3653
R²                                      0.0158
F                                       0.9849
5. Discussion
Earlier literature by Davenport (2006) and Chen et al. (2012) already explains the advantages that
Big Data management can provide. While research on the topic of Big Data management is common
and organizations are exploring options regarding their data, there are still many areas to cover. Up
to this point research on Big Data has mainly been qualitative (Manyika, et al., 2011;
Sagiroglu & Sinanc, 2013). While a qualitative approach explains more about the capabilities of Big
Data (e.g. accurate decision making), a quantitative study is better able to explain the effects based
on a larger sample. To explore these effects, I formulated multiple hypotheses. First, the effect of
Big Data Usage and Big Data Experience on Financial Performance was investigated. Then, I
formed further hypotheses on the moderating effects of CEO Age, CEO Tenure and
CFO Age on the relation between Financial Performance and the Usage of and Experience with Big
Data management. After testing the hypotheses, no support was found for a relation between the Usage
or Experience of Big Data and Financial Performance. Furthermore, none of the hypotheses were
supported. A significant moderating effect was found for hypothesis 2b, although in the direction
opposite to the one hypothesized. The outcome indicates that an older CEO
has a positive moderating effect on the relationship between Big Data Experience and Financial
Performance. This is an interesting finding, as the hypothesis, based on earlier work, stated that a
younger CEO would be more likely to have a positive effect, whereas the sample showed the
opposite. Thus, older CEOs appear to be better at achieving positive growth in ROA when a
firm is more experienced with the use of Big Data. While this differs from what the literature
suggests, it could be explained by the fact that older CEOs have gained more experience and therefore
know better what they would like to see in the insights gathered through Big Data
management. With this experience the CEO is likely to involve analytical resources to get the insights
needed. The learning curve could also affect the decision of an older CEO to manage Big Data
actively: owing to this experience, he or she will be able to handle the position
better, thereby being more efficient with time and resources, which allows the
older CEO to focus on new innovations like Big Data, becoming more efficient on the experience of
The other hypotheses were not supported by the analysis. However, while no support was
found within the sample used, there is enough support in the literature to show the importance of the
variables used for future research.
Hypotheses 1a and 1b tested the relation between Big Data Usage/Experience and Financial
Performance. Earlier work states that the use of Big Data has many advantages, which makes
it likely that the Usage of and Experience with Big Data have a positive effect on the Financial
Performance of the firm. As ROA is influenced by many other variables, within and outside
the firm, it is likely that no support was found because the effect is too small to measure. The
sample could also have limited the chance of finding support, as it consists of the biggest firms
in the USA. Big Data could influence Financial Performance differently in smaller firms.
The moderators used within the thesis also have enough theoretical support from earlier
work to show their importance. Firstly, while a significant moderating effect of CEO Age was found
for Big Data Experience, no support was found for the relation between Big Data Usage and Financial
Performance. Thus, the age of a CEO does not influence whether a firm is managing Big Data actively.
However, when a firm is using Big Data actively, the age of the CEO makes a difference. The literature
describes how the behavior of a CEO changes as he or she gets older, which makes it likely that this
behavior is an important factor in Big Data Usage as well.
Secondly, CEO Tenure has been shown to influence the behavior of the CEO as well, making the
CEO more conservative over time (Hambrick, 2007). A conservative CEO is less likely to take risks,
which could still have a large impact on the relation between Big Data and
Financial Performance. To measure the effects of tenure, a CEO has to be at a firm for a longer time.
Moreover, Big Data has only emerged in the past couple of years, which makes it hard to show the
effect of CEO Tenure on the relation between Big Data Usage/Experience and Financial Performance,
as Big Data has been in active use for only a short amount of time.
Therefore, a longer period of measurement could influence the results.
Thirdly, there was no support that the age of the CFO moderates the relation between Big
Data Usage/Experience and Financial Performance. Many of the arguments made with regard to CEO
Age may also apply to the CFO. Also, because the CFO is responsible for the whole financial
Udin, 2005). By handling this data and needing the insights extracted from it, the CFO
should positively influence the usage of Big Data within a firm.
As described, there is a theoretical foundation for why the stated hypotheses did not find
support within this thesis. This could also have been caused by some of the limitations of the
research, which will be discussed in the following chapter.
6. Limitations and future research
As explained earlier, a quantitative approach to research on Big Data is not common. Most
earlier research on Big Data is qualitative. I approached the thesis in a quantitative way.
Even though no support was found, I have been successful in setting up the quantitative approach.
However, the setup and results of the hypotheses show limitations.
First, to be able to do this research in a quantitative way, data had to be gathered manually. As the
quantitative approach is new to the Big Data field, not much data has been collected in other studies
or databases yet. When hand-collecting the data, I used firms' press publications about Big Data to
describe their usage of and experience with Big Data management. While this approach indicates the
usage and experience of a firm regarding Big Data, it is not the most representative way of
measurement (e.g. a more accurate way would be to get the data from the firm itself). Within the time
available this was the best approach possible; however, it is a limitation of this thesis. It could be
argued that the data used is not sufficient to explain whether and for how long an organization has
been using Big Data. Firms that are active in Big Data management, but did not publish about it, were
not coded as being active in Big Data management. Therefore, this first limitation could have an impact
on the results of this thesis. For further research other approaches should be explored and assessed to
obtain better-quality data.
Secondly, the sample used is based on the S&P 500 in the USA. The advantages of using the S&P 500
were described earlier in this thesis. However, using the S&P 500 could also be a limitation because of
the geographical characteristics of the sample. All firms in this sample are based in the USA, which
makes the firms within the sample comparable, but this could also influence the output.
When firms are geographically spread, more markets are included
understanding of Big Data Management in different countries (e.g. in the USA it could be that Big
Data management does not improve Financial Performance while in another market it does). The
sample used should therefore be seen as a limitation.
Thirdly, the timeline of the study was set at a maximum of five years (2009 – 2014). This period
was chosen due to the availability of the data. Big Data is new within firms, which explains why only
a limited number of earlier publications on the use of Big Data were available. Within this thesis, for
a hypothesis to be supported, the effect of Big Data Usage on Financial Performance has to materialize
within this period. While this has not been studied yet, it is likely that this effect takes longer to
become noticeable, which makes this a limitation of this thesis. Based on this limitation two points
should be kept in mind for further research: first, researching how long it takes for a firm to achieve
positive effects from the use of Big Data; and second, using a longer timeline when studying the
relation between Big Data Usage/Experience and the Financial Performance of a firm.
Fourthly, this thesis focused on the age and tenure of the CEO and the age of the CFO. Based on
the literature these positions within the firm are important for strategy (Golden & Zajac,
2011). However, other roles could be of great importance as well, for example the Chief Information
Officer (CIO). The CIO is responsible for all IT-related issues within the firm
and could therefore have a big impact on whether a firm manages its Big Data well (Maes & Vries,
2008). Moreover, the focus should not only be on C-level managers; a focus on middle and lower
management levels could also yield interesting insights. With more data being generated over the past
years, firms and job positions are becoming more data-driven (Davenport, 2006). Due to
this data-driven focus, lower-level managers could have an impact on the utilization of Big Data in a
firm. For further research, this means that studies should look beyond C-level managers and
incorporate other levels of management and positions within the firm to examine how these affect
the relation between Big Data Usage/Experience and the Financial Performance of the firm.
Finally, the dependent variable Financial Performance was measured by ROA Growth. While ROA is a
stable and comparable measure for evaluating a firm, it is influenced by many factors
(Fairfield, et al., 2003). Therefore, it is interesting to research the effects of Big Data Usage and
by other aspects of the organization and to show different behaviors (e.g. market share or quantity
sold).
7. Conclusion
This study investigates the relation between Financial Performance (ROA Growth) and the Big Data
Usage and Big Data Experience of firms within the S&P 500. It also examines the moderating
effect of CEO Age, CEO Tenure and CFO Age on the relations between Big Data Usage/Experience
and Financial Performance.
As technological developments are moving quickly, more data is being stored than ever.
This data contains a lot of valuable information, but extracting this information requires the right
knowledge, strategy and tools. These large collections of data are called Big Data and
create challenges for firms that want to get the most out of this data. More research is being done on
the topic and businesses are exploring their opportunities regarding Big Data management. To
research whether Big Data management positively impacts the Financial Performance of a firm, a
quantitative study was done on the S&P 500. After eliminating cases with missing data, a sample of 375
firms remained.
The hypotheses were tested using a hierarchical regression and the PROCESS application. No
support was found for the stated hypotheses. However, a significant moderating effect was found for
CEO Age on the relationship between Big Data Experience and Financial Performance. While the
hypothesis stated that a younger CEO would have a positive effect on Financial Performance, the
contrary was supported by the sample. Older CEOs showed a positive effect on the relation
between Big Data Experience and Financial Performance, whereas younger CEOs showed a negative effect.
None of the other hypotheses were supported within the sample. However, this does not mean that
the hypotheses are not important or could not be supported. The data on the use of and experience
with Big Data was gathered manually. While this was the best option available given the time
constraint, it could have influenced the results. Also, a fairly small sample has been researched. When
Qualitative studies on the effects of Big Data are common (Davenport, 2006; Chen, et al.,
2012). Within this thesis I have been able to explore the effects of Big Data in a quantitative way.
This thesis thereby adds to the current literature the finding that an older CEO has a positive effect
on the relation between Big Data Experience and Financial Performance, while a younger CEO shows a
negative effect. This result shows that, while the current literature states that older CEOs develop
characteristics that make it less likely to fully utilize Big Data, the older CEO is able to get a