
The perceptions on the institutionalizing of Big Data Analytics in financial audits within a Big 4 audit firm in the Netherlands

Name: Martin Schokker
Student number: 11112190
Date: 26 June 2017

Word count: 26203

Supervisor: Prof. dr. B.G.D. O'Dwyer

MSc Accountancy & Control, specialization: Accountancy Track & Control Track
Faculty of Economics and Business, University of Amsterdam

Statement of Originality

This document is written by student Martin Schokker who declares to take full responsibility for the contents of this document.

I declare that the text and the work presented in this document are original and that no sources other than those mentioned in the text and its references have been used in creating it. The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.

Abstract

The way most companies work with data is changing rapidly because of new technologies, and large amounts of data are available nowadays. Almost everybody has an opinion about Big Data, but what is Big Data, and what can be done with the analysis of all kinds of structured and unstructured data, Big Data Analytics? What does it mean for the accounting profession, and for auditing and controlling in particular? How is it perceived in the upper segment of accounting firms? This paper gives the reader a closer look at the perceptions on the implementation of Big Data Analytics for audits within a Big 4 firm. The theoretical framework of Lawrence and Suddaby (2006) is used to paint an image of the process of implementing Big Data Analytics in the audit.

Through a case study at a Big 4 audit firm, I studied recent and relevant literature and investigated the current situation and the perceptions of several professionals working in this area. After I determined the context of my investigation, I interviewed 12 of these professionals, being auditors, IT-auditors, and (Big) Data Analysts, along the same lines. These participants were asked about the institutionalization of Big Data Analytics in the audit, how they think the different departments should work together for the institutionalization, and what method of Big Data Analytics should be used.

It appeared that there are a lot of similarities in their perceptions, but also quite a few discrepancies on several topics, such as the dynamics and implementation in current practice and expectations for the (near) future, how to deal with compliance issues, responsibility for results, ways of collaboration, and who should take the lead. For the time being, the preliminary conclusion can be drawn that the institutionalization of Big Data Analytics within audits has a (long) way to go, for both external and internal reasons, but attention, acceptance and further implementation of applications are increasing.

Contents

1. Introduction 7

1.1 Background 7

1.1.1 Big Data and audit evidence 7

1.2 Research question 8

1.3 Research outline 8

1.4 Research contribution 9

1.5 Paper Structure 10

2. Literature 11

2.1 Big Data and Big Data Analytics 11

2.2 Complementing/substitute traditional methods of auditing 12

2.3 Formalizing Big Data for audit 13

2.4 Information overload 14

2.5 The use of tools 14

2.5.1 SQL/NoSQL 14

2.6 Knowledge data discovery and data mining 15

2.6.1 Directed/Lean Approach (Top-down) 15

2.6.1.1 Classification 16

2.6.1.2 Estimation 16

2.6.1.3 Prediction 16

2.6.2 Undirected approach (Bottom-up) 16

2.6.2.1 Affinity grouping 17

2.6.2.2 Clustering 17

2.6.2.3 Description and visualization 17

2.7 Shift in thinking 17

2.8 Provenance 18

3. Theory 19

3.1 Directed and Undirected approach 20

3.1.1 Directed (Top-down) 20

3.1.2 Undirected (Bottom-up) 20

3.2 Practices of institutional work theory 21

3.2.1 Creating institutions 22

3.2.1.1 Advocacy 22

3.2.1.2 Constructing identities 22

3.2.1.3 Constructing normative associations 22

3.2.1.4 Constructing normative networks 23

3.2.1.5 Mimicry 23

3.2.1.6 Theorizing 23

3.2.1.7 Education 23

3.2.2 Maintaining institutions 23

3.2.2.1 Policing 24

3.2.2.2 Deterring 24

3.2.2.3 Embedding and routinizing 24

3.3 Summary 24

4. Methodology 24

4.1 Sample 24

4.2 Method 25

4.3 Data collection 25

4.3.1 Selection of interviewees 26

4.3.1.1 Coding of interviewees 26

4.3.2 Semi-structured open interviews 27

4.3.3 Data processing 28

5. Case analysis 28

5.1 Initial thoughts 28

5.1.1 Definitions 29

5.1.2 Experience 31

5.1.2.1 Revenue recalculation 31

5.1.2.2 Text scanning 31

5.1.2.3 Weather 32

5.2 Strategy of implementation 32

5.2.1 Formalizing Big Data 33

5.2.1.1 Formalizing rules 33

5.2.1.2 Data standard 35

5.2.2 Tooling 36

5.2.3 Best practices 37

5.2.4 Training 38

5.2.5 Provenance 40

5.2.6 Tone at the top 40

5.3 Method of using Big Data 42

5.3.1 Top-down 42


5.3.2 Bottom-up 44

5.3.2.1 Virtual assistant 45

5.3.2.2 Fraud Analytics 46

5.3.3 Hybrid model 47

5.4 Drivers and barriers of implementing Big Data 48

5.4.1 Audit quality 48

5.4.1.1 Materiality 49

5.4.2 Efficiency 50

5.4.3 Client 52

5.4.3.1 Size of clients 53

5.4.4 Innovation 54

5.5 Role of actors 54

5.5.1 Auditors 54

5.5.2 IT-auditors 55

5.5.2.1 A new profession 56

5.5.3 Big Data Analysts 57

5.5.4 Collaboration 58

5.5.4.1 Pilot program 59

5.5.4.2 Technical department 59

6. Discussion 60

6.1 Result descriptive analysis 61

6.1.1 Theorizing 62

6.1.2 Deterring, normative networks 62

6.1.2.1 Extractions 63

6.1.3 Policing, embedding and routinizing, proof-of-concept 64

6.1.4 Normative association 65

6.1.5 Educating, constructing identities 66

7. Conclusion 66

7.1. Conclusion 66

7.2 Limitations 68

7.3 Future research 68

References 70

Appendix A: Details of interviewees 76

1. Introduction

1.1 Background

With the emergence of new technologies, companies are able to provide a lot more data. Turner et al. (2014) claim that the amount of data doubles every two years. Part of this large amount of data is unstructured and can be used alongside the traditional data that is already available and used. This large amount of data is considered Big Data. Manyika et al. (2011) discuss how Big Data can affect different industries.

The availability of Big Data will certainly affect the accounting profession (Manyika et al., 2011). It is even sometimes suggested that future audits will be done by computers. Frey and Osborne (2013) even estimate a 94% chance that computers will take over the audit profession because of Big Data Analytics. However, Richins et al. (2017) believe auditors are able to leverage their knowledge and take advantage of Big Data.

1.1.1 Big Data and audit evidence

Big Data can also be used for audit evidence, and it will transform the way of auditing. Analysis of this data, Big Data Analytics, will provide higher-quality audit evidence and more relevant business insights (Ramlukan, 2015). As these systems become larger and more complex, assurance cannot be accomplished manually (Vasarhelyi & Halper, 1991). Techniques like sampling can no longer be used to provide sufficient assurance. Big Data Analytics makes it possible to handle these complex systems: large volumes of data can be processed to provide assurance.

Yoon et al. (2015) describe how Big Data can be used as complementary audit evidence. Audit evidence needs to be sufficient and appropriate. Big Data can support traditional methods when there is not sufficient evidence. The use of Big Data can create new evidence, which makes it easier to satisfy the sufficiency requirement. Yoon et al. (2015) also explain how Big Data can provide more relevant and reliable information (e.g. GPS can give a more precise location of ships than shipping documents).

This paper focusses on how Big Data can be used within the audit. The study shows how different parties think the process of selecting data should work. Tools are used to apply algorithms to the selected data; the auditors must select the data on which they use these tools to gather audit evidence. This paper also looks at how auditors, Data Analysts with an IT-audit background, and the (Big) Data Analysts who work with the tools think the process of selecting data should occur.

1.2 Research question

At the moment, Data Analytics is only used for handling the traditional data that has always been used. In the last few years, a lot of research has been published about Data Analytics. Now that Data Analytics is used more often, the next step is to use "Big Data". Auditors, IT-auditors and Big Data Analysts were interviewed to give their opinion on the process of selecting data. They were asked what kind of techniques they think are most appropriate to determine what data should be used. Afterward, the participants were asked how the departments within the Big 4 firm need to operate to implement the use of Big Data Analytics. The paper looks at how the departments need to cooperate to realize the implementation. The accounting profession still needs to embrace Big Data (Alles, 2015); therefore, I will focus on the perception of the participants on collecting and selecting data.

Research question:

What is the perception on the implementation of using Big Data Analytics for audits within a Big 4 firm?

1.3 Research outline

1. What is Big Data in the context of the audit?

This part of the thesis focusses on the initial thoughts on Big Data. There is much confusion about the concept of Big Data and how it can be used. By putting those different thoughts about Big Data together, the research provides an understanding of the notion of Big Data. When the meaning of Big Data becomes clear, a basic idea of the application of Big Data for audit evidence will be discussed.

2. What kind of methods are going to be used to gather audit evidence using Big Data? This part focusses more on the technical side of Big Data Analytics. There are multiple techniques that can be used to handle Big Data; nevertheless, every method has its pros and cons. Discussing different methods with different parties may provide information on the preferred method of handling Big Data.

3. How is Big Data Analytics going to be realized within the firm?

This part explains the institutionalization of Big Data. Here the institutional theory of Lawrence and Suddaby (2006) is used to give insight into the creation of Big Data Analytics within the audit. It is discussed what strategies are used to promote the use of Big Data Analytics, and what practices have occurred.

4. How do the different departments within the firm need to collaborate to realize the implementation of Big Data? Here the collaboration between the different departments is discussed. First, the current role of the departments is discussed: interviewees were asked what their role is and was in implementing and using Big Data Analytics within the audit. Afterward, they were asked whether and how the different departments should collaborate.

1.4 Research contribution

This research contributes by examining how Big Data can be used within the audit. By giving insight into the preferred methods, this paper can contribute to the implementation of Big Data Analytics in the audit. As different parties share their opinions about the process of using Big Data in the audit, this research provides insights into how they perceive the use of Big Data Analytics in the audit. As the implementation of Big Data Analytics draws near, this paper provides information on the methods used in the selection process of Big Data Analytics. This information can be used to determine how it should be applied in practice.

This research also contributes to the institutional work theory of Lawrence and Suddaby (2006), as it explains how Big Data might be implemented within a Big 4 audit firm. This paper gives insight into how institutions and individuals within the organization try to implement the use of Big Data. Empson (2013) used the institutional work theory to identify how legitimacy was gained through a dyadic collaboration between two parties. In this thesis, this concept is taken one step further, as it describes a triadic collaboration between auditors, IT-auditors and Big Data Analysts. Furthermore, most papers use the institutional work theory to describe what happened to institutions in the past. This paper focusses on the perception, and thus the possible future implementation, of Big Data Analytics in the audit. By using the institutional work theory in a broader context, this paper contributes to this theory by showing possible applications of the theory.

1.5 Paper Structure

2. Literature: the existing literature is used to give insight into the use of Big Data Analytics.

3. Theory: here the institutional work theory is explained and the practices of the institutional work theory that are applicable to this research are described.

4. Methodology: in the methodology, it is described how the information is gathered. In this qualitative study, the information is mostly gathered through interviews. It explains how and why the interviewees were selected and how the analysis is done.

5. Descriptive analysis: in the descriptive analysis, the findings of the interviews are explained.

6. Discussion: here the links between literature, theory, and findings are combined.

7. Conclusion: based upon the information from the theory and the interviews, the findings are discussed. Furthermore, possible future research will be discussed based on the findings of this paper.

2. Literature

In this section, it is shown what prior literature states about the use of Big Data Analytics in the audit. Furthermore, the literature discusses methods of using Big Data Analytics in the audit.

2.1 Big Data and Big Data Analytics

Big Data consists of datasets that are too large and complex to handle with standard methods or tools. There are 4 V's that define Big Data: huge Volume, high Velocity, huge Variety, and uncertain Veracity (Zhang, Yang, and Appelbaum, 2015; IBM, 2012). Volume stands for the large amount of data, velocity for the speed at which the data can be processed (real time), variety for the different kinds of information that can be used, and veracity for the reduction of noise. This means that a large amount of different data can be processed fast. The use and analysis of this large amount of data is called Big Data Analytics. Turner et al. (2014) explain that the amount of data doubles every two years, and Vasarhelyi and Halper (1991) claim that, given the increased volume in combination with more complex data, assurance can only be given through an automated audit that is able to handle this data, rather than sampling.

Nowadays companies have more data available besides the traditional data in ERP systems, such as the GPS location of ships. Some of this data can be used for additional audit evidence, for example by giving the positions of transferred goods. Big Data Analytics makes it possible to add this data to the data of the existing auditing methods to improve the assurance. For traditional audit evidence, "clean and sorted data" needs to be used; Big Data makes it possible to also use unstructured and semi-structured data.

There is no clear definition of Big Data and it can be interpreted in multiple ways. Even though there is no clear definition of Big Data, the ACCA (2013, p.10) explains:

“It refers primarily to the vast amount of data continually collected through devices and technologies such as credit cards and customer loyalty cards, the internet and social media and, increasingly, WiFi sensors and electronic tags. Much of this data is unstructured – data that does not conform to a specific, pre-defined data model”.

Mayer-Schönberger and Cukier (2013, p.69) define Big Data as "N=all", by which they mean that all information that is available can be used. Connolly (2012) notes that most definitions of Big Data describe new forms of unstructured data, but describes Big Data as "Big Data = transactions + interactions + observations". Big Data in auditing is able to use both the structured information in ERP systems, like the sales transactions in the system, and the unstructured information outside ERP systems, such as e-mails or contracts.

2.2 Complementing/substitute traditional methods of auditing

Vasarhelyi, Kogan, and Tuttle (2015) claim that adding Big Data to the traditional data will create a "bridge" which can improve the audit quality. Big Data can be used to complement the traditional way of auditing. They explain how adding Big Data to traditional ways of auditing will enrich the availability of information, by using Big Data Analysis next to the traditional ways of auditing to acquire more audit evidence. As discussed earlier, Yoon et al. (2015) investigated how Big Data can complement traditional ways of auditing. They found that Big Data can bring reliable and sufficient data to the audit, as new sources of data can be added to the traditional data and be used to increase the availability of audit evidence.

Swart (2014) claims that Big Data can only be implemented when it substitutes another form of auditing, because otherwise Big Data Analysis doesn't fit within the budget of the audit. Relying too much on Big Data can lead to Big Data Hubris (Lazer et al., 2014). This is when Big Data is used as a substitute for traditional data collection, rather than a complement to traditional methods. A case where trust in the analysis went wrong is Google Flu (Lazer et al., 2014). Google Flu uses search terms to forecast flu; however, search terms that were unrelated but strongly correlated were taken into account. Google Flu relied too much on Big Data, as it substituted traditional methods of data collection and analysis. In the end, the flu prevalence was overestimated. Overfitting had occurred, which is when the algorithms seek correlations that are unrelated (Fayyad, 1996).

Whereas Vasarhelyi et al. (2015) and Yoon et al. (2015) show that Big Data will improve the quality of audit evidence by complementing the traditional method, and thus increase the availability of audit evidence, other researchers are more skeptical. Crawford (2013) and McKinney et al. (2017) explain that the results of Big Data data sets are not objective. They are numbers which can be interpreted differently; therefore, the conclusions based on Big Data Analytics may be biased. Cao et al. (2016) say it depends on the level of the audit whether Big Data Analysis is useful. The messy data can be used in the early stages of planning and risk assessment, contrary to substantive procedures, as the data can be used to find patterns and trends that alert the auditors to risks that might occur. For substantive testing, it is harder to use messy data as it is more sensitive to noise.

2.3 Formalizing Big Data for audit

Yoon et al. (2015) claimed that a standardized way of using Big Data was lacking. Krahel and Titera (2015) discuss how the accounting standards should formalize Big Data and the shortcomings of the accounting standards for applying Big Data Analysis. They believe that the accounting standards should become "dynamic" instead of "static" to provide auditors with the opportunity to use Big Data. In other words, the rules should give more freedom and be more flexible in order to provide the opportunity of using Big Data.

The AICPA created the Audit Data Standards (ADS) (AICPA, 2015). This is a standard format in which the data needs to be stored. There are two ways to store the data: in a flat file format or in XBRL-GL. These formats can be used to standardize the outputs of ERP systems and thus make use of a SQL database. Chan and Vasarhelyi (2011, p. 157) support the idea of standardizing the input of data. They claim: "For automated audit procedures to be effective, standardization of data collection and formalization of internal control policies is necessary".
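To illustrate the idea of a standardized flat-file extract that can be queried with ordinary SQL, a minimal sketch is given below. The file name and column names are hypothetical placeholders for illustration, not the actual AICPA ADS field definitions.

```python
# Illustrative sketch only: the column names are hypothetical placeholders,
# not the actual AICPA ADS fields.
import csv
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE journal_entries ("
    "entry_id TEXT, posting_date TEXT, account TEXT, amount REAL)"
)

# Load the standardized flat-file extract (one row per journal entry line).
with open("journal_extract.csv", newline="") as f:
    rows = [
        (r["entry_id"], r["posting_date"], r["account"], float(r["amount"]))
        for r in csv.DictReader(f)
    ]
conn.executemany("INSERT INTO journal_entries VALUES (?, ?, ?, ?)", rows)

# Once the extract follows a standard format, simple SQL answers audit questions.
print(conn.execute(
    "SELECT account, SUM(amount) FROM journal_entries GROUP BY account"
).fetchall())
```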

Contrary to formalizing, Alles and Gray (2016, p.31) proposed as a future research question: "Is unguided information collection appropriate and feasible in the auditing context". There are thus questions about the feasibility of formalizing Big Data. If formalizing isn't feasible, it means that the collection and selection of Big Data will be based on the opinion of the auditor.

2.4 Information overload

Yoon et al. (2015) also find that auditors are not familiar with the many sources of Big Data. It is difficult to predict a priori how effective the use of Big Data will be. Auditors that use Big Data Analytics may be confronted with information overload (Brown-Liburd, Issa, & Lombardi, 2015). The auditors are overwhelmed with the amount of information. The auditors don't have the ability to process such a large amount of information, which will lead to a biased outcome. A large amount of data may lead to wrong auditing judgments, as it is hard to cope with the large amounts of data and determine what data is relevant. The large amount of data the auditor wants to use for audit evidence and the unstructured nature of Big Data make it hard to determine what information should be extracted. The irrelevant data has a negative effect on the audit judgment, as it will create noise in the analysis which will affect the results. Information overload shows how difficult it is to select data.

2.5 The use of tools

Even though the analytics tools used to gather audit evidence are able to cope with a large amount of data, Brown-Liburd et al. (2015) show that the results may still be problematic. The output of those tools still produces an overwhelming amount of data. Before using the data tools, the auditors must have a clear understanding of the data they want to use as audit evidence before they can draw conclusions. However, Alles (2015) explains that large investments in Big Data may lead to simple-to-use tools. This will make it easier for auditors to collect and select the data they want to use as audit evidence. Brown-Liburd et al. (2015) suggest that current analytics tools used in marketing and insurance can mitigate the difficulties the auditors experience.

2.5.1 SQL/NoSQL

The tools can be used to search the databases. The databases used for Data Analytics are either SQL (Structured Query Language) or NoSQL (non-SQL) databases (Varian, 2014). Buckler (2015) explains the difference between SQL and NoSQL. The data in a SQL database is stored in a relational database. The data in these databases is stored in tables that provide a strict template of how the data should be stored; within these tables, the same data is collected for every item. This makes the standard restrictive; nevertheless, it provides high data integrity. The data in a NoSQL database is not stored in a relational database; the data is not stored in tables where every item provides the same information. As the volume increases, the use of NoSQL tools becomes more important (Varian, 2014). NoSQL is simpler in data manipulation and can handle larger amounts of data. NoSQL is also more open and excels at the storage of unstructured data (Feinleib, 2012), but it does not validate the input. The data, therefore, has less data integrity, but it enables more information to be used. Dix (2014) discusses the advantages and disadvantages of using SQL and NoSQL for Big Data. Where SQL is more widely used and proven, NoSQL is faster and more flexible.
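As a minimal illustration of this contrast, the sketch below stores the same invented customer record both in a relational table and as a schemaless JSON document; the field names are assumptions for illustration only.

```python
import json
import sqlite3

# Relational (SQL): a fixed table schema; every row has the same columns,
# which enforces structure and gives high data integrity.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, city TEXT)")
conn.execute("INSERT INTO customers VALUES (?, ?, ?)", (1, "Acme BV", "Amsterdam"))

# Document-style (NoSQL): a schemaless JSON document; extra, unstructured
# fields (e-mails, free text) can be added per record without validation.
document = {
    "id": 1,
    "name": "Acme BV",
    "city": "Amsterdam",
    "emails": ["order confirmation ...", "complaint about invoice 123 ..."],
}
print(json.dumps(document, indent=2))
```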

2.6 Knowledge data discovery and data mining

Data can be used to gather knowledge. Within this process, knowledge is extracted from data by identifying patterns within the data (Fayyad, 1996). This method is known as Knowledge Data Discovery (KDD). Within this process, algorithms are used to identify these patterns; this is also known as data mining.

Within the use of Big Data Analytics, there are multiple approaches to data mining. Gray and Debrecheny (2014) and Berry and Linoff (1997) explain the two main approaches and give examples of them. The two approaches are directed (top-down) and undirected (bottom-up). In earlier research, Tukey (1977) used similar terms: exploratory data analysis and confirmatory data analysis. He explains how exploratory data analysis and confirmatory data analysis are used within audit data analysis (ADA). With the directed approach, there are specifically targeted variables of interest; with the undirected approach, there is no specific variable and the whole population is searched for relationships between the variables. The examples Gray and Debrecheny (2014) and Berry and Linoff (1997) give for the directed approach are classification, estimation, and prediction. The examples they give for the undirected approach are affinity grouping, clustering, and description and visualization.

Thuraisingham (2003) shows how both methods can complement each other in a hybrid approach, where both the top-down and the bottom-up approach are used. E.g. correlations can be found using the undirected approach which can later be tested through hypotheses in the directed approach. Byrnes et al. (2014) find the same, stating that there is no bright distinction between exploratory and confirmatory ADA, since exploratory results can be used for confirmatory practices.

2.6.1 Directed/Lean Approach (Top-down)

When using Big Data, it is important to know beforehand what relevant questions need to be asked. Keltanen (2013) and Deloitte (2012) show the importance of a lean approach in gathering audit information. The lean approach is about using only relevant data with the right tools instead of analyzing all the data available. Even though they use a different name, it can be considered the directed approach. The lean approach will cut the costs of analyzing and improve the accuracy of the data: analyzing high volumes of data is costly and adds little accuracy. Shah et al. (2012) discuss how Data Analytics is mostly part of the IT department and how more employees besides IT need to be trained in analytics, since their knowledge is important for interpretation.

2.6.1.1 Classification

With classification, pre-defined subgroups are determined to categorize the data. The knowledge of the field is used to create categories in which the data can be divided (Gray and Debrecheny, 2014). Wu et al. (2005) show how classification can be used for data mining with multiple databases.
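A minimal sketch of what such directed classification can look like in practice is given below. The journal-entry features, labels and thresholds are invented for illustration and are not drawn from the thesis or from Gray and Debrecheny (2014).

```python
# Directed classification: the auditor pre-defines the classes (here
# "routine" vs "unusual" journal entries) and a model learns to assign
# new entries to them. Features and labels are invented for illustration.
from sklearn.tree import DecisionTreeClassifier

# Each row: [amount, posted_on_weekend (0/1), manual_entry (0/1)]
X_train = [[120.0, 0, 0], [95.5, 0, 0], [50000.0, 1, 1], [75000.0, 1, 1]]
y_train = ["routine", "routine", "unusual", "unusual"]

model = DecisionTreeClassifier().fit(X_train, y_train)
print(model.predict([[80000.0, 1, 1]]))  # -> ['unusual']
```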

2.6.1.2 Estimation

Estimation can be considered similar to classification. Nevertheless, whereas with classification the subgroups are divided into "discrete" categories (Gray and Debrecheny, 2014), estimation has more continuous variables. Results are more likely to be shown within a range (e.g. 0 to 10) rather than in discrete categories.

2.6.1.3 Prediction

Prediction is used to test a hypothesis. Whenever something is expected, Big Data Analytics tools are used to determine whether this expectation is true (Gray and Debrecheny, 2014). Nevertheless, some researchers consider hypothesis testing as another approach to data mining next to directed and undirected, instead of part of the directed approach (Flockhart and Radcliffe, 1996).

2.6.2 Undirected approach (Bottom-up)

The undirected approach is an explorative way of using the data. Instead of using pre-existing knowledge to use the data, with the undirected approach the data has to "speak for itself" (Acker, Blockus, & Pötscher, 2013, p.8; Mayer-Schönberger & Cukier, 2013, p.6). This way not only the obvious correlations are found, but also the less obvious ones. However, there is a risk in using the undirected approach, as correlation doesn't have to mean causation.

2.6.2.1 Affinity grouping

Affinity grouping is used to find relationships between variables without any independent or dependent variable (Gray and Debrecheny, 2014). The analyst looks for unknown relationships within the data. The tools are used to find random correlations; it is up to the analyst/auditor to interpret the results and the quality of the outcome.

2.6.2.2 Clustering

Clustering can be compared to classification; however, there is a major difference. Where classification uses the knowledge of the user to divide the data into sub-groups, with clustering the tools look for similar patterns in the data to put it into sub-groups. The auditor needs to interpret the result. Thiprungsri and Vasarhelyi (2011) examine how clustering can be used within auditing. Whenever clustering is used, all the transactions are categorized. The transactions that do not fit a category can be considered outliers. These outliers can be flagged and examined further, since they can be the result of fraud or error.
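The sketch below illustrates this outlier-flagging idea on invented transaction data; the features, cluster count and flagging rule are assumptions for illustration, not the procedure of Thiprungsri and Vasarhelyi (2011).

```python
# Undirected clustering used to flag outlier transactions: transactions
# that end up in very small clusters are treated as potential outliers.
import numpy as np
from sklearn.cluster import KMeans

# Each row: [amount, days_until_payment] (invented data)
transactions = np.array([
    [100, 30], [110, 28], [95, 31], [105, 29],   # ordinary transactions
    [5000, 2],                                    # stands apart
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(transactions)
labels, counts = np.unique(kmeans.labels_, return_counts=True)

# Clusters containing only a single transaction are flagged for follow-up.
small_clusters = labels[counts <= 1]
flagged = transactions[np.isin(kmeans.labels_, small_clusters)]
print(flagged)  # -> [[5000    2]]
```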

2.6.2.3 Description and visualization

Description and visualization are used to put data into a visual context (Gray and Debrecheny, 2014). Putting the data into a visual context will provide insight into the data. The user will have more understanding of the data and can use this to make decisions.

2.7 Shift in thinking

Alles and Gray (2016) explain how there will be a shift in thinking from causation to correlation. With the use of Big Data, correlations will be sought, and conclusions will be drawn based upon those correlations. Anderson (2008) claims that correlation is enough when using petabytes. Byrnes et al. (2014) also explain that when using the whole population, a bit of pollution is permitted. However, for auditors there may be a backlash in using the "correlation to causation approach", as correlation, as often noted, does not mean causation. Alles and Gray (2015, p.440) explain how the audit profession struggles more in handling Big Data. They explain that the audit profession is more constrained by standards and specific tasks.

McKinney et al. (2017) and Shah et al. (2012) discuss how important it is to be more skeptical when picking the data. They explain how there is often more trust in the analysis than in remaining skeptical (art. 7 ISA 200, 2009). For example, a case where there was too much trust was discussed earlier with Google Flu (Lazer et al., 2014). Here the trust in the analysis led to Big Data Hubris and eventually to the algorithms drawing conclusions based on unrelated correlations.

2.8 Provenance

Appelbaum (2016) shows the importance of provenance within Big Data. Provenance shows the lineage and origin of the data. Meta-data, audit trails and log files are used to give insight into how the data was originally recorded and how it has been transformed. Provenance can be used for process mining (van der Aalst et al., 2010). According to Jans and Alles (2010), the meta-data is able to reveal the processes in the systems by clustering the event logs. This can also be done within the audit. Alles et al. (2004) explain how a black box (BB) needs to be used to provide information about the decisions and actions of the auditor during the audit. Based upon this idea, Appelbaum (2016) proposes to use a Big Data provenance black box (BDPBB). Here the events of the analysis will be recorded: every action that is done within the analysis will be stored in the black box. This black box will focus on the provenance of the Big Data used. The Big Data that is analyzed will be traced back to its roots; the lineage and origin of the input data need to be determined. The use of provenance with Big Data will increase the reliability of the data, especially of the external Big Data used. The black box allows an audit trail to be kept of the chosen data. Regulatory bodies can use this audit trail as well.
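As a rough illustration of the black-box idea, the sketch below appends every analysis step to a hash-chained log that records the origin of the data and the action performed. The design is an assumption for illustration, not Appelbaum's (2016) specification.

```python
# Minimal sketch of a provenance log: each analysis event records the data
# source (lineage), the action taken, and a hash chaining it to the
# previous event, so the trail is tamper-evident.
import hashlib
import json
import time

class ProvenanceLog:
    def __init__(self):
        self.entries = []
        self._last_hash = "0" * 64

    def record(self, source, action, detail):
        entry = {
            "timestamp": time.time(),
            "source": source,        # origin / lineage of the data
            "action": action,        # what was done to it
            "detail": detail,
            "prev_hash": self._last_hash,
        }
        self._last_hash = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()
        ).hexdigest()
        entry["hash"] = self._last_hash
        self.entries.append(entry)

log = ProvenanceLog()
log.record("client ERP export 2016-12", "extract", "general ledger, 1.2M rows")
log.record("external GPS feed", "join", "matched shipments to ledger entries")
print(json.dumps(log.entries, indent=2))
```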

3. Theory

As described earlier, the literature explains the importance of selecting Big Data for audit evidence. This paper provides information about the institutionalizing of selecting Big Data for audit evidence. Audit firms try to enter a new field by using Big Data Analytics for audit evidence. Big Data Analysis has mostly been the domain of IT companies and IT departments; however, auditors will step out of the traditional way of auditing and use Big Data to complement the traditional way. Institutional work theory explains how individuals and organizations gain legitimacy in a new field; in this case, the theory shows how either auditors or analysts as individuals may gain legitimacy within the field of using Big Data for auditing.

This paper uses the institutional work theory of Lawrence and Suddaby (2006) to examine the institutionalization of the use of Big Data Analytics in audits. Contrary to other institutional theories, the model of Lawrence and Suddaby (2006) focusses on the thoughts, feelings, and behavior of both individual and collective actors instead of only macro-institutions. Bjerregaard (2011) explains how, in the past, institutional theories slowly focused more on macro institutions instead of actors; in the last couple of years, there seems to be a focus on actors next to the macro institutions. Different individuals from different departments were asked about their role in the implementation of Big Data for audit evidence. Lawrence et al. (2011) introduce the term "institutional biography", which looks at the relationship between the institution and the individual: how the individual creates, maintains or disrupts institutions.

Empson (2013) uses the institutional work theory in a different context. She explains the creation of dyads in law firms: managing partners (CFOs and COOs, mostly with an earlier accounting career) who made the strategic decisions slowly began to work more and more with managing professionals (senior managers) to professionalize the law firms. Instead of looking at dyads, this research focusses on "triads". Where Empson looked at two parties that needed to cooperate, this research looks at the collaboration of three parties: auditors, IT-auditors and Data Analysts. Using the institutional work theory, it will become clearer what kinds of methods are used by these individuals and how they need to cooperate to implement the use of Big Data Analytics in the audit practice.

3.1 Directed and Undirected approach

As discussed earlier in the literature by Gray and Debrecheny (2014), Fayyad (1996) and Acker et al. (2013), there is the top-down/directed approach and the bottom-up/undirected approach to using Big Data Analytics. The preferred method of the institutionalized parties may influence the approach used for Big Data Analytics. Depending on the influence and strategies of the individuals within the firm, the preferred approach will be determined. As the institutional work theory will give insight into the strategies used to institutionalize the use of Big Data Analytics in audits, these strategies linked with the preferred approach of the individuals might give insight into how the selection process of Big Data Analysis will develop.

3.1.1 Directed (Top-down)

It may be the case that the auditors determine what information is important and therefore influence the Big Data selection process. As discussed earlier, the lean approach shows how important it is to have knowledge of the subject on which you perform Data Analytics (Keltanen, 2013; Deloitte, 2012). In this case, the lean approach is used: the knowledge of the auditor is used to narrow the search for correlations. The auditors determine what kind of data will be analyzed. For example, the auditor can use rules like looking at bookings made over the weekend to look for fraud. Since multiple authors discussed the risk of the undirected approach when it comes to auditing, it is expected that auditors will have the directed approach as their preferred approach.

3.1.2 Undirected (Bottom-up)

The analysts may determine what tools should be used and how these tools should be used. Different analytic techniques can be used for Big Data Analytics in the audit (Gray and Debrecheny, 2014). The tools are used to determine correlations within the whole population. As the Data Analysts are less familiar with the information within the field and are more experienced with the use of the tools, it is expected that the bottom-up approach will be the preferred method for the Data Analysts. For example, machine learning can be used to look at patterns within transactions to identify fraud.

Figure 1: Methods of using Big Data Analytics

3.2 Practices of institutional work theory

The institutional work theory of Lawrence and Suddaby (2006) has three categories of institutional work: creating institutions, maintaining institutions and disrupting institutions. Each of these categories uses a set of practices. These practices explain what kinds of actions are used by the actors to achieve their goals. Considering the fact that Big Data still has to emerge within the gathering of audit evidence, this study will focus mainly on the creating of institutions. Nevertheless, whereas Lawrence and Suddaby (2006) present the categories in a linear order, Empson (2013) finds that all forms of institutional work can occur simultaneously. Empson (2013) and Canning and O'Dwyer (2016) show how the practices used in the institutional work theory can also be used in a different context than explained by Lawrence and Suddaby (2006); moreover, this could mean that practices might even belong in different categories. This paper only focusses on creating and maintaining institutions. The practices of creating institutions may explain how the new institution, in which Big Data Analytics will be used, will be established. Whenever the practices of maintaining institutions are more prevalent than the practices of creating institutions, the institutionalizing will occur slowly. If this is the case, the current institution of using sampling will still play an important role in the audit, instead of Big Data Analytics. Only the practices of creating and maintaining institutions relevant to this research will be explained.

3.2.1 Creating institutions

The first category explains the efforts of institutional entrepreneurs, as the actors, to create a new institution. According to Dacin (2002), institutional entrepreneurs seek to create institutions by deploying their resources. The actors deploy their resources in an attempt to change the field.

The first set of practices reflects the political work of the actors used to influence the institutionalization of the new field. The practice from the first set that is relevant to this research is advocacy. The second set focusses on creating a new field by changing the belief system. The second set consists of constructing identities, changing normative associations, and constructing normative networks. The third set explains how the boundaries of existing fields are changed to determine the new fields. The practices of the third set are mimicry, educating and theorizing.

3.2.1.1 Advocacy

Advocacy can be described as “The mobilization of political and regulatory support through direct and deliberate techniques of social suasion” (Lawrence & Suddaby, 2006, p. 221). With advocacy, it can be shown how certain individuals within the organization promote the idea of using Big Data for audit evidence. It shows who is pushing the idea forward to change the way of auditing.

3.2.1.2 Constructing identities

Constructing identities looks at the relation between the actor and the field (Lawrence and Suddaby, 2006). With the implementation of Big Data Analytics, the relation between the auditors and the field will change. As clients become more and more digitalized, Big Data Analytics might be used to handle the changing field, instead of manual sampling.

3.2.1.3 Constructing normative associations

Constructing normative associations describes the relation between the norm and the actor: "Re-making the connection between sets of practices and cultural foundations for these practices" (Lawrence and Suddaby, 2006, p.221). With the use of Big Data Analytics, the focus needs to shift from causation to correlation (Alles & Gray, 2016; Gray & Debrecheny, 2014). Whenever Big Data Analytics is used as a new tool, there needs to be a change of morals for auditors that allows them to use more exploratory methods.

3.2.1.4 Constructing normative networks

Lawrence et al. (2002, p. 283) show how collaboration can lead to proto-institutions. They refer to proto-institutions as "new sets of practices, technology, and rules that have the potential to become widely institutionalized". Lawrence and Suddaby (2006) look at how interorganizational collaboration between peer groups can create a new institution. It describes the relationship between actors. This paper looks at how different departments within the organization are involved and collaborate in implementing Big Data Analytics. Constructing normative networks shows how auditors, IT-auditors and Big Data Analysts might collaborate to use Big Data Analytics for the audit.

3.2.1.5 Mimicry

Mimicry looks at how practices from old institutions are transformed into practices of the new institution (Lawrence and Suddaby, 2006). Actors are familiar with the practices of the old institutions; by editing these practices, the actors become more familiar with the new institution. This may result in more support.

3.2.1.6 Theorizing

Theorizing is described as "the development and specification of abstract categories, and the elaboration of chains of cause and effect" (Greenwood et al., 2002). It is vital for theorizing to name the concept. Naming creates a notion of the concept; it allows people to grasp an understanding of the concept.

3.2.1.7 Education

This part of creating institutions displays how actors gain the skills and knowledge to support the new institution. Auditors will need training in handling Big Data before they are able to use Big Data Analytics. Educating will show how the auditors are trained within the Big 4 firm to cope with the difficulties of using Big Data Analytics.

3.2.2 Maintaining institutions

Here it is explained how institutions are maintained by the actors (Lawrence and Suddaby, 2006). The practices of maintaining institutions explain how the old institution stays in place. A strong sign of maintaining institutions may indicate that sampling, instead of Big Data Analytics, will keep playing a big role in the audit.

3.2.2.1 Policing

Policing occurs because of the need to be compliant, through enforcement, auditing and monitoring (Lawrence and Suddaby, 2006). The auditors need to be compliant with the audit standards. The need to be compliant helps maintain the institution. You need to comply with the rules, and it is the question whether the auditors will be able to do so with the use of Big Data Analytics.

3.2.2.2 Deterring

Lawrence and Suddaby (2006, p.230) describe deterring as "establishing coercive barriers to institutional change". There are barriers in place that make it hard for a new institution to take form. A set of rules might be put in place to make it harder for the new institution to enter the field, which helps maintain the existing institution.

3.2.2.3 Embedding and routinizing

When actors are used to working with a certain method, it is hard to get them to work in a new way. Whenever a way of working is part of the day-to-day routines, it becomes hard to teach them another way of working (Lawrence and Suddaby, 2006). This means that embedded practices have a maintaining effect.

3.3 Summary

This research focusses on the institutionalization of Big Data Analytics in the audit. In order to look at how the institutionalization occurs, the institutional work framework is used. The practices of creating institutions show how Big Data Analytics will be institutionalized, while maintaining institutions will show how the old method of auditing will still be used. Linking the findings to the theoretical framework shows how it will be institutionalized.

4. Methodology

4.1 Sample

This is a case study on the selection of data for Big Data Analytics; multiple sources of data are used for this case study. First, the existing literature is used. The existing literature gives an understanding of what Big Data is, how Big Data Analytics works and what the risks and benefits of using Big Data are. Second, I conducted four interviews with auditors familiar with Data Analytics, three interviews with Data Analysts with an IT-audit background and four interviews with (Big) Data Analysts without an IT-audit background about the selection of Big Data. For this thesis, I did an internship at a Big 4 audit firm in the Netherlands. The distinction in audit background is made because the analysts with an IT-audit background have more experience with and affinity for the IT-audit. All the participants labeled as IT-audit have the Dutch IT-audit title of RE (Register EDP-Auditor), but are active as Data Analysts. Within the Big 4 firm, I had access to these auditors, IT-auditors and analysts to conduct my interviews.

Power and Gendron (2015) explain there are two dimensions in qualitative research. The first one has a "positivist spirit": here it is believed that reality is external to the mind, and a breadth approach is used in which the conclusions are valid in a lot of different situations. The second dimension has a "constructivist spirit": here it is believed that reality is socially constructed. The reality is in the mind of the actors and is how they perceive the truth to be. Here an in-depth approach is needed. The in-depth approach explains human behavior in a specific context.

The sample will show how the auditors, IT-auditors, and analysts differ in the way they think Big Data will impact the audit and how the data should be used. The auditors and analysts have different abilities and may have different ideas about how Big Data can be used within the audit. It would be interesting to see how the analysts, with a lot of knowledge of Data Analytics but less knowledge of auditing, determine what data is important for the audit. Moreover, it would be interesting to show how auditors try to cope with the complexity of Big Data Analytics and how they determine what information is relevant.

4.2 Method

For this research, I conducted semi-structured interviews. This provides more depth within the interviews. The interviews were conducted, recorded and transcribed by myself. An overview of the interviewees is given in Appendix 1 and includes the duration of the interviews. Atlas.ti was used to analyze the information. The information within the transcripts was coded and the data considered important was labeled. The transcripts of the interviews were sent to the interviewees, who were asked for feedback. This reduces the errors and interpretation bias of the interviews.

4.3 Data collection

This section explains how the information was gathered. To get the right information, it is important to interview the right people; thus the selection of the interviewees is explained.

4.3.1 Selection of interviewees

The interviewees were selected based on their knowledge of (Big) Data Analytics. Since Big Data is not yet much applied in practice, the auditors and IT-auditors were selected based upon their knowledge of Data Analytics, while the Big Data Analysts were chosen for their knowledge of Big Data. Twelve interviews were conducted: 4 with auditors, 3 with IT-auditors and 4 with Big Data Analysts. Also, a forensic accountant was interviewed on the possible use of Big Data within the audit. The forensic department is more advanced in using Big Data Analytics than the audit department. The forensic accountant provided information on the use of the bottom-up method, explaining how machine learning can be used and how it can be applied within the audit. Selecting different departments may show how these professions think differently about the implementation of Big Data for the audit. Within these different departments, the interviewees were also selected based upon different levels of command. This gives more insight into the way it is institutionalized.

4.3.1.1 Coding of interviewees

All the respondents are coded. The auditors are coded as Audit1, Audit2, Audit3, and Audit4. The IT-auditors are coded as IT1, IT2, and IT3. The Big Data Analysts are coded as BDA1, BDA2, BDA3, and BDA4. The forensic accountant is coded as FA. In Appendix 1, a more detailed view of the interviews and interviewees is given.

The auditors were chosen because of their knowledge in the field of auditing, but also their lack of knowledge of Data Analytics. It would be interesting to see how the auditors try to cope with their lack of technical knowledge when using Big Data Analytics in practice. Since the auditors have always been involved in the financial audit and the implementation of Big Data Analytics may have far-reaching consequences for the way they work, it would be interesting to see how they think Big Data Analytics will change the way they work and how they perceive the use of Big Data Analytics. The literature revealed that auditors are more careful in their approach to using Big Data. As Keltanen (2013) and Deloitte (2012) suggest, auditors will prefer to choose the lean approach. Audit1 is a Data Analytics enthusiast with knowledge of IT-auditing, since he finished most of his IT-audit education. Audit2 had done prior research on Data Analytics in the audit. Audit3 is the head of Data Analytics within the Big 4 firm in the Netherlands. Audit4 is the head of innovation in the Netherlands.

The IT-auditors were chosen because of both their IT-audit knowledge and their knowledge of Data Analytics. The IT-auditors in this research are often more skilled as Data Analysts and are not IT-auditors in the strict sense. However, all participants labeled as IT-auditors had experience in IT-auditing and possessed the Dutch IT-auditor title of Register EDP-auditor (RE). Moreover, the IT-auditors were involved in audit analytics and thus have experience in the use of Data Analytics for the audit. As they have knowledge of both IT-audit and audit analytics, they can be considered a bridge between the auditors and the Data Analysts.

The (Big) Data Analysts were chosen because of their technical knowledge of Data Analytics. As Katz (2014) suggests: "the point of view of those people, many of whom are skilled in the analysis of Big Data, differs sharply from that of traditional accountants". Contrary to the IT-auditors, the (Big) Data Analysts have little or no experience in the audit and are not involved in audit analytics. Even though 2 of the 4 Big Data Analysts, BDA2 and BDA3, possessed the RE title, they were chosen based on their technical knowledge of Big Data Analytics. As they have the technical knowledge but not the knowledge of the field, interviewing Big Data Analysts can provide a different insight into the use of Big Data in the audit.

4.3.2 Semi-structured open interviews

Keegan and Ward (2003) consider three types of in-depth interviews: the postmodern approach, the feminist research approach, and the biographical approach. The postmodern approach looks at the way reality is constructed. Here the interview is seen as a collaboration between the interviewer and interviewee. This approach focusses on contradictions and action for change. The feminist approach is a non-hierarchical approach that does not objectify the interviewee. Given the unstructured nature of this way of interviewing, it is important for the interviewer to be very responsive. The biographical approach looks at the milieu and the experience of the interviewee throughout a specific period.

The interviews were conducted in a semi-structured manner and use the postmodern approach. Before the interviews were held, a short introduction was given on the purpose of the research. This helped the participants to think about what kind of information was needed, and thus the interviews often turned out to be a collaboration. Moreover, the interviewees were asked about a change in the audit. Within the interviews, a set of questions was used, so every participant was asked the same questions. Asking the same questions made sure the same topics were discussed in every interview; however, these questions were not asked in a hierarchical manner. The semi-structured manner allowed for follow-up questions. Therefore, in-depth questions could be asked and the standard questions served as a guide to make sure all topics were discussed. This made it possible to bring more depth to the interviews.

4.3.3 Data processing

The interviews were conducted in person. Depending on the nationality of the participants, the interviews were either in Dutch or in English. Not all interviewees were Dutch; therefore, the interview with BDA4 had to be conducted in English. All transcripts except for BDA4 are in Dutch. The interviews were recorded and transcribed by myself and sent to the interviewees for validation.

Afterward, the transcripts of the interviews were coded in Atlas.ti. For the coding, the three subprocesses of O'Dwyer (2004) were used: data reduction, data display and conclusion drawing. First, I looked at the transcripts and started open coding. Then the open coding was organized and assembled. In the last process, I looked for patterns and the core codes were developed. Finally, the transcripts were coded into five main themes. These themes are:

1. Initial thoughts

2. Strategy of implementation

3. Method of using Big Data Analytics

4. Drivers and barriers

5. Roles

Each of these categories is divided into different subcategories. The coding used in Atlas.ti can be found in Appendix 2. The interviews were summarized in English and the citations from the transcripts were translated into English.

5. Case analysis

In this chapter, the findings of the conducted interviews are presented. During the interviews, information on the implementation of Big Data in the audit was gathered.

5.1 Initial thoughts

There are many definitions of Big Data (Zhang, Yang, and Appelbaum, 2015; IBM, 2012; ACCA, 2013; Mayer-Schönberger and Cukier, 2013; Connolly, 2012). This creates confusion about the notion of Big Data. Analyzing the interviews, it became clear there were a lot of different understandings of what "Big Data" means. The participants gave different definitions of Big Data and gave different examples of how they perceive Big Data. This chapter discusses the initial thoughts of the participants. Here it is discussed how they would define Big Data, how they perceive the use of Big Data in practice and how they believe it will transform the audit.

This section is divided into two parts. The first part discusses the definitions given by the participants. In the second part, the examples of Big Data given by the participants are discussed. Depending on the definition of Big Data, the participants gave different examples of both how it was used in practice and how it can be used.

5.1.1 Definitions

As mentioned earlier, there is no clear definition of Big Data. The interviewees were asked what Big Data means to them. Where some interviewees referred to the 3 V's (later expanded and in this paper discussed as 4 V's), most interviewees agreed that Big Data is whenever there is a large amount of data and everything can be used. Some interviewees considered it more of a buzzword. Other participants considered Big Data as "all data".

“As the 3 V’s. Volume, Velocity, and Variety. More like Gartner.” (IT2, p.1)

“What is Big Data? I always refer to the 3V’s: Volume, Velocity, and Variety. When one of these three applies, it can be considered as Big Data. So when it is a large amount of data, or a lot of different data, I consider it as Big Data. [….] It is a bit like the question how I perceive Big Data. I think it is a lot of Data. It doesn’t have to be in something like a Hadoop cluster. When you use a SQL database I already consider it as Big Data, especially when you combine it with different sources of data.” (BDA2, p.1)

“Big Data is a nice buzzword. But why do people want something with Big Data? Big Data isn’t a goal by itself, Analytics overall isn’t a goal by itself.” (IT1, p.1).

“First of all, I think it is a buzzword. Especially the term. New technologies are used to analyze the data. In the past, you sometimes saw data analytics being used, but that was structured data, numbers, Excel sheets, etc. The techniques used to analyze text are improved and now applicable. Big Data has become a hype, at least the word “Big Data” has become a hype.” (BDA1, p.1)

"What does Big Data mean? As you mentioned earlier, unstructured data. To me, Big Data in the audit is all data. Not only structured but definitely not only unstructured. Both." (IT3, p.1)

With no clear definition, there is also a gray area around the difference between Data Analytics and Big Data Analytics. Two participants even saw no difference between Data Analytics and Big Data Analytics.

“I think Data Analytics is a subset of Big Data Analytics, but I consider it as one.” (Audit4, p.1)

Other interviewees considered the use of external data to be Big Data. They make a distinction between information internal to the client and information external to the client, e.g. linking land and real estate register information to the real estate on the balance sheet.

“How do I perceive Big Data? I have more experience in Data Analytics. But what do I consider the difference between Data Analytics and Big Data? With Data Analytics you have certain information of the client, you look in this data whether there are any mistakes. What I consider Big Data, for example, is when you want to check the interest income of a bank using external data. The external data can tell something about the development of the market interest rate, interest income or something in general of the market.” (Audit2, p.1)

“And Big Data is something we haven’t really started to use as auditors, at least I have not seen much in this firm. It is the use of external data sources. You got different kinds of external sources of data. There are invoices that come in and will be read through “scan and recognize”. This is also a source of internal data, but it is also possible. But it is mostly what data is available in the domain and what you can use within the audit.” (Audit3, p.2)

The different definitions used by the participants show a weak application of "theorizing" of the practice in the institutional work theory of Lawrence and Suddaby (2006). Theorizing consists of two vital parts: naming and practices. With no clear definition of Big Data, there is no clear concept of the notion "Big Data". A clear notion of the concept allows the field to better understand it. As it is not clear what Big Data is, the field has a poor understanding of what it can do and what it does.

5.1.2 Experience

In this section, the experience of the participants in using Big Data Analytics is discussed. The examples they give of how Big Data is used in practice are presented. Scenarios where Big Data is not currently used but could be of great use are also discussed.

5.1.2.1 Revenue recalculation

The most common example of using Big Data the participants could think of is revenue recalculation. While it depends on the definition used whether this can be considered "Big Data", it is the most common form of Data Analytics already used in practice. For the recalculation, a second (shadow) system is constructed by the audit firm, which analyzes the invoices to determine the revenue. When both systems show the same amount of revenue, the auditor has evidence that the revenue is stated at the right amount.

“We made a recalculation to determine the revenue. We downloaded the relevant data for the revenue from the client. Simply said... we connected the ”P’s”, the prices, to the “Q’s”, all transactions downloaded. Someone will build a nice formula in SQL. The formula will multiply the P’s by the Q’s and a number will appear. We look if the number is the same as the number that was booked.” (Audit1, p.1)

“For an auditor, it is interesting to look at whether the revenue is accurate and complete. At [Big 4 firm] we build a shadow system where a copy of the data that comes in the operating system will be transferred to our system. At the end of the month, we check if we have the same revenue.” (IT1, p.1)
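
To make the "P times Q" recalculation concrete, the sketch below shows the same idea in Python with pandas rather than the SQL the participant mentions. It is a minimal illustration, not the firm's actual implementation: all table names, column names, and amounts, including the booked revenue figure, are hypothetical.

```python
import pandas as pd

# Hypothetical extracts from the client's systems (names and amounts are illustrative).
transactions = pd.DataFrame({
    "invoice_id": [1001, 1002, 1003],
    "product_id": ["A", "B", "A"],
    "quantity":   [10, 5, 2],          # the "Q's"
})
price_list = pd.DataFrame({
    "product_id": ["A", "B"],
    "unit_price": [12.50, 40.00],      # the "P's"
})

# Recalculate revenue independently of the client's own figure: P times Q per invoice line.
recalculated = (
    transactions.merge(price_list, on="product_id", how="left")
                .assign(line_revenue=lambda df: df["quantity"] * df["unit_price"])
)
recalculated_total = recalculated["line_revenue"].sum()

# Compare the recalculated total with the revenue booked in the general ledger
# (the booked figure is illustrative as well).
booked_revenue = 350.00
difference = recalculated_total - booked_revenue
print(f"Recalculated: {recalculated_total:.2f}, booked: {booked_revenue:.2f}, "
      f"difference: {difference:.2f}")
```

In practice the transaction and price data would be extracted from the client's systems, and any remaining difference between the recalculated total and the booked revenue would be followed up by the auditor.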

5.1.2.2 Text scanning

Another example that often came to mind is text scanning. There are text mining tools that are able to analyze contracts or e-mails. Text scanning makes it possible to use Big Data Analytics to test internal controls. Large amounts of contracts can be scanned to find deviations; moreover, the contracts can be scanned for whether they have a signature and whether that signature is from the right person.

"If you look at e-mails, you can do control testing. Control testing is only possible if you are not only able to scan structured databases, but are also able to analyze unstructured data and get information out of it. For example, if I test 10 samples for approval in the e-mails, I scan the name and the word "approval" within the e-mail. If I look at that, I basically checked the control." (IT3, p.4)

“It is from the US and it is a new system. It is called [name]. I have never seen it, but I know it exists. It is pure text mining where all contracts will be analyzed by some kind of artificial intelligence that will index the data.” (IT1, p.1)
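
As an illustration of the kind of control test IT3 describes, the sketch below, assuming a simple list of e-mail records, scans the message text for the word "approval" and checks that the sender is the authorized approver. The sender address, keyword, and data structure are assumptions made for the example; the text mining tools the participants refer to are considerably more sophisticated.

```python
import re

# Illustrative e-mail records; in practice these would come from a mail export.
emails = [
    {"sender": "j.devries@client.example", "body": "Hereby my approval for invoice 1001."},
    {"sender": "intern@client.example",    "body": "Can someone please review invoice 1002?"},
]

# Assumed from the control description: only this person may approve.
AUTHORIZED_APPROVER = "j.devries@client.example"
APPROVAL_PATTERN = re.compile(r"\bapproval\b", re.IGNORECASE)

def email_supports_control(email: dict) -> bool:
    """Treat an e-mail as evidence only if it contains the approval keyword
    and was sent by the authorized approver."""
    return (email["sender"] == AUTHORIZED_APPROVER
            and APPROVAL_PATTERN.search(email["body"]) is not None)

evidence = [e for e in emails if email_supports_control(e)]
print(f"{len(evidence)} of {len(emails)} e-mails support the approval control")
```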

5.1.2.3 Weather

Another example, given twice, is the use of weather information to predict the revenue of certain companies.

“I can also imagine it being used for energy companies. The revenue depends on the temperature. A couple of years ago, I was involved in the audit of heating companies. There was a direct link between the revenue and a number of days of the year the temperature was above 18 degrees Celsius. Of course, there was a standard price of gigajoule used. We used the data of the KNMI in De Bilt on a monthly base.” (Audit3, p.2)

“What I really liked, with [recreational park] they created a regression model based on the weather to determine the revenue. This led to great results.” (IT2, p.4)
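
A minimal sketch of such a weather-based expectation model is shown below. It fits a simple linear regression of revenue on a temperature-based measure and flags months where the booked revenue deviates notably from the expectation. All figures and the 10% threshold are illustrative assumptions and are not taken from the engagements the participants mention.

```python
import numpy as np

# Hypothetical monthly figures: a temperature-based activity measure
# (degree days relative to the 18 degrees Celsius reference) and booked revenue in kEUR.
degree_days  = np.array([310, 280, 190, 90, 40, 20, 10, 15, 70, 160, 250, 300])
revenue_keur = np.array([640, 590, 410, 220, 120, 80, 60, 70, 190, 360, 530, 620])

# Fit a simple linear expectation model: revenue is approximately a * degree_days + b.
a, b = np.polyfit(degree_days, revenue_keur, deg=1)
expected = a * degree_days + b

# Flag months where booked revenue deviates more than a chosen threshold (10%)
# from the weather-based expectation; flagged months would be investigated further.
deviation = np.abs(revenue_keur - expected) / expected
for month, dev in enumerate(deviation, start=1):
    if dev > 0.10:
        print(f"Month {month}: booked revenue deviates {dev:.0%} from expectation")
```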

The many definitions of Big Data translate into the practices. The definition of theorizing given by Greenwood et al. (2002, p.60), "the development and specification of abstract categories, and elaboration of chains of cause and effect", does not fit for Big Data. The confusion about the notion of Big Data means there are no abstract categories. This translates into the practices, and thus into the chains of cause and effect. Where some participants referred to revenue recalculation as a practice of Big Data, other participants would not; they would consider it Data Analytics. Some participants claimed that analyzing large amounts of data is enough to call it "Big Data", while others claimed that the use of external data is needed. It can be questioned whether revenue recalculation is "Big Data". There might even be some question whether text scanning can be considered "Big Data": with the scanning of contracts, internal information of the client is used, although it lies outside the ERP systems. Nevertheless, the use of weather data to predict the revenue does fit all the definitions given.

5.2 Strategy of implementation

In this section, it is explained how the participants think the implementation will occur, and it focuses on the participants' perception of the implementation of Big Data.

5.2.1 Formalizing Big Data

As discussed by Yoon, Hoogduin, and Zhang (2015), there is no standardized way of selecting Big Data, and Krahel and Titera (2015) discuss how Big Data should be formalized. However, the sort and amount of data differ per company. This means that standardizing will be hard, and perhaps the data should be selected based on the auditor's judgment. As Alles and Gray (2016, p.31) proposed as a future research question: "Is unguided information collection appropriate and feasible in the auditing context".

There are two ways in which Big Data can be formalized. The first is through regulation and best practices. Nevertheless, in the IT-audit department it was noted that the main challenge of using Data Analytics is extracting the data. The second way in which Data Analytics can be formalized is through the standardization of data. With the Audit Data Standard, the AICPA (2015) created a standard format in which data can be stored.

5.2.1.1 Formalizing rules

One of the major problems in the implementation of Big Data Analytics is the auditors' fear of not being compliant whenever Big Data Analytics is used within the audit. The use of Big Data Analytics is new, and it is hard for the auditor to determine how the Autoriteit Financiële Markten (AFM), the Dutch financial regulatory authority, looks at Big Data Analytics. Some auditors are afraid that if they apply Big Data Analytics, they will not be compliant with the Nadere voorschriften controle- en overige Standaarden (NV COS), the Dutch auditing standards. Meanwhile, the old method is already widely used and therefore considered "proven".

“The auditors say: “The COS does not provide any room, I do not know if it is compliant with the current standards”. If you look at the work as a data analyst, how do you make sure the COS is adjusted to make it easier? I have spoken to auditors who said: “I do not know what the AFM thinks of it”. The AFM says: “please do not let us stand in the way, show us how you want to use it and how the current standards do not meet the requirements”.”(IT2, p.1)

“Isn’t that the reason why it takes so long before it is applied? I notice auditors have something like: “let’s do it as it prescribed on paper, at least this way we know we are compliant and let sleeping dogs lie”.” (BDA2, p.6)

Even though there are not many examples yet of how Big Data Analytics can be used within the audit, the auditors interviewed believed that the COS provides enough freedom to use it. For some auditors the question is not whether the rules enable auditors to use Data Analytics, but whether the way it is applied is compliant.

"I think the COS provides enough leverage; however, we need more courage to apply it. You are allowed to use Data Analytics as part of your detailed testing and control testing. We only never did, because we feel more comfortable in the way we are used to. I do not have any examples how a regulator looks at a dossier that is fully supported by analytics. It would be nice to hear what their view is. This might explain the fear when you apply Data Analytics because you have no invoices and samples in your dossier. That used to be the evidence you used to support on, now you have to support the knowledge, analysis, and data." (Audit1, p.7)

"I think the rules themselves have enough freedom to use Data Analytics. In fact, it is mentioned two times in the COS, although it is in the appendix. The COS gives the opportunity, but I do think the AFM or NBA should provide examples, guidance, or clearly express that they approve what is happening." (Audit2, p.2)

There were even conversations with the AFM about how Data Analytics can be used. As a regulator, the AFM cannot determine how it should be used, so it asks the auditors how it should be applied.

"We went to the AFM and they were very clear. We told them what we are doing and they asked us to tell them what we need. As a regulator, the AFM cannot determine how we should do it. If we say this is how we are going to apply it, we have to go to the AFM and tell them "this is how we are going to do it, but it is not possible due to the rules. Do something about it". It can never be in the opposite direction." (IT3, p.4)

"They have to tell it themselves. The AFM will never say that because they only test if the right audit evidence is in the dossier. As one of the Big 4, I think we should be more pro-active to the outside world, the AFM and the NBA, like: "This is the way we are going to work"." (Audit4, p.7)

Every participant believes that it is not the regulators that prevent the use of Data Analytics, but the auditors themselves. The fear of the regulators disagreeing with the way Data Analytics is applied creates an incentive for the auditors to stick with the old methods. While everyone believes Data Analytics will improve audit quality, they do not want to use it because of the risk of not being compliant. To address this problem, a different mindset is needed in which auditors are less risk averse and prefer to use Data Analytics in the audit. The auditors should have more courage to use these new techniques.

"Different kind of people. The old school auditors are told to never make mistakes. I get it. If your career depends on it, you make sure the things you do are 100% compliant. Otherwise, you will take the risk. It is something I understand, but we need to do something about the culture within [Big 4 firm]. We need to reward entrepreneurs and push to create new things." (IT3, p.4)

While it may be unintentional on the part of the regulators, having no clear view of how it should be applied creates a barrier to the use of Data Analytics. Both a form of "embedding and routinizing" and a form of "policing" occur, in which the institution of the current method of auditing is maintained (Lawrence and Suddaby, 2006). By not knowing how the AFM will react, policing leads to embedding and routinizing. Even though the rules do provide the opportunity to use Big Data Analytics, the fear of not being compliant means auditors are not eager to use Data Analytics and are stuck in their current way of working.

5.2.1.2 Data standard

A major complaint about using Data Analytics is the data extraction. There are many different systems in use in the Netherlands, and even when companies use the same system, the extraction can be difficult.

“A month ago we had to pull the plug out of a project since we saw it wouldn’t work. There were many systems and it was unachievable. You would think, you ask this from auditing and they thought about it. With data analysis, you have a sample of more than 300 different systems. There are just too many systems.” (IT2, p.3)

“Every system, even when you have an SAP-system, is different and has different configurations. It is hard to get the data. It needs some preparation, especially in the first two years. With every client you want to use it, you have that start-up period. Once constructed, you can let it run.” (Audit3, p.6)

A data standard would make it easier to extract the data, since it would be stored and transferred in the same way across different ERP systems. The AICPA (2015) proposed an audit data standard that prescribes how the data needs to be stored. For this data standard, XBRL GL and a flat file format are used to standardize the data at the transaction level.
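
To illustrate why such a standard helps, the sketch below shows, under assumptions, what loading a standardized transaction-level flat file could look like. The pipe-delimited layout and the column names are hypothetical and are not taken from the actual Audit Data Standard; the point is only that a fixed layout lets the same extraction and analysis code be reused across clients and ERP systems.

```python
import io
import pandas as pd

# Illustrative flat-file extract in a fixed, system-independent layout.
# The column names and pipe-delimited format are hypothetical and only sketch
# the idea of a common transaction-level layout.
flat_file = io.StringIO(
    "journal_id|entry_date|account_id|amount|currency|description\n"
    "JRN-001|2017-01-05|8000|1250.00|EUR|Sales invoice 1001\n"
    "JRN-002|2017-01-06|8000|430.00|EUR|Sales invoice 1002\n"
)

# Because the layout is fixed, the same loading and analysis code can be reused
# across clients, regardless of the ERP system that produced the data.
entries = pd.read_csv(flat_file, sep="|", parse_dates=["entry_date"])
print(entries.groupby("account_id")["amount"].sum())
```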
