
PRIVACY PRACTICES AND STAKEHOLDER TRUST IN BIG DATA:

A FRAMEWORK

MSC THESIS

Amsterdam Business School

Executive Programme in Management Studies – Strategy Track
Supervisor: D.Sc. Arno Kourula, Assistant Professor of Strategy, UvA

University of Amsterdam

Image courtesy of Stuart Miles at FreeDigitalPhotos.net (royalty free)

Final draft submitted on June 28th, 2015, by

Lieke Jetten

Student Number: 10499342
liekejetten1@gmail.com


Statement of Originality

This document is written by Student Lieke Jetten who declares to take full responsibility for the contents of this document.

I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it.

The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.

Signature ___________________________________________

Table of Contents

Abstract ... 4
1. Introduction ... 5
2. Literature review ... 9
2.1 Organizational trust ... 10
2.2 Big Data ... 13

2.3 Big Data ethics ... 15

2.4 Addressing stakeholders' privacy concerns ethically ... 20

2.5 Addressing stakeholders' privacy concerns strategically ... 25

2.6 Privacy practices ... 30

2.7 Proposed effect of privacy practices on organizational trust ... 35

3. Method ... 37
3.1 Research design ... 37
3.2 Sampling ... 41
3.3 Data collection ... 44
3.4 Data analysis ... 48
4. Empirical findings ... 52

4.1 Within case analysis ... 52

4.2 Cross case analysis ... 53

5. Discussion ... 65

5.1 Revising propositions and explanation building ... 65

5.2 Implications for research question ... 81

6. Conclusions ... 90

6.1 Theoretical contributions ... 90


6.3 Limitations and suggestions for future research ... 93

Acknowledgement ... 95

References ... 96

Appendices ... 111

Appendix 1 - Interview Protocol in English and Dutch ... 111

Appendix 2 - Within case analysis - Visualization per case ... 117

Appendix 3 - Additional topics discussed in interviews ... 124

Tables and figures

Table 1.0 Process of Building Theory from Case Study Research (Eisenhardt, 1989: 533) ... 38-39
Table 1.1 From Marshall et al. (2013:17) ... 42

Table 1.2 Characteristics of participants and organizations investigated ... 48

Table 1.3 Main concepts used for coding ... 49-50
Table 1.4 Codes and sub codes ... 50-51
Table 1.5 Cross case analysis on capacity of privacy practices and trust levels ... 54-55
Table 1.6 Cross case analysis on main antecedents and privacy practices ... 59-60
Table 1.7 Process insights and communication methods ... 62-63
Figure 1.0 A three-stage model with propositions ... 9

Figure 1.1 Visualization of the theoretical framework ... 10

Figure 1.2 From Yin (2009) ... 37

Figure 1.3 Abductive reasoning approach. ... 40

Figure 1.4 From Marshall et al. (2013:17) Number of interviews per study ... 42

Figure 1.5 From Marshall et al. (2013:18) Interviews per Study by Research Design ... 43

Figure 1.6 Purposive sampling approach ... 44

Figure 1.7 Heterogeneous sample ... 46

Figure 1.8 Revised sample ... 47

Figure 1.9 Per case analysis ... 53

Figure 2.0 Capacity and trust matrix ... 56


Abstract

This study aims to answer the research question: 'How can organizations manage ethical privacy concerns regarding Big Data?', which is divided into three sub-questions: (1) Antecedents: What determines the organizational decision to address ethical privacy concerns regarding Big Data? (2) Moderators: Which tools or practices are used to address these concerns? (3) Outcomes: What are the effects of the scope and capacity of these tools or practices on perceived organizational trust? The topic is approached from stakeholder theory, business ethics, the resource-based view (RBV), business venturing, and other organization and information systems perspectives. Following an abductive reasoning approach, preliminary propositions are developed as the basis for a qualitative multiple case study in which 15 Big Data experts are interviewed. A rigorous within-case and cross-case analysis is conducted, after which the preliminary propositions are revised. This leads to a final theoretical framework with propositions on Antecedents: (A) smaller organizations will likely have fewer privacy practices than larger corporations; (B) privacy practices prevent or mitigate the risk of reputational (trust) loss; (C) enhanced stakeholder knowledge of Big Data usage will decrease trust if privacy safeguards are missing and increase trust if the organization has proper safeguards in place. Moderators: (D) the greater the scope and capacity of the privacy practices in place, namely (1) complying with legal requirements, (2) having data privacy officers, (3) having internal organizational privacy frameworks, (4) inducing cultural values on the ethical use of data, (5) having privacy as an agenda point throughout the organization, (6) holding regular board meetings on privacy, and (7) organizing data privacy accountability via procedures, systems and organization design, the greater the organizational trust, except for small/micro young companies; (E) when organizations struggle to communicate effectively about Big Data, using metaphors and analogies in communication to stakeholders will benefit organizational trust; (F) having specialized job profiles in-house, such as data scientists and data analysts, will benefit the efficiency of Big Data processing. Outcomes: (G) organizations that have a large scope and capacity of privacy practices and use metaphors and analogies in their


1. Introduction

As information asymmetries exist between the stakeholders who provide data to organizations and the organizations that use this data, it is difficult for stakeholders to grant organizations their trust to manage that data. Stakeholders, especially those lacking power, have to rely on the trustworthiness of organizations to meet the fairness obligations due to them. This is precarious for stakeholders: although stakeholders' contributions create binding ethical obligations on corporations, stakeholders have no guarantee that these obligations will be met (Greenwood and Van Buren III, 2010). In an organizational setting, trust is defined as 'the willingness of a party to be vulnerable to the actions of another party based on the expectation that the other will perform a particular action important to the trustor, irrespective of the ability to monitor or control that other party' (Mayer et al., 1995: 712). As organizational decisions increasingly depend on complex information systems affecting growing numbers of stakeholders (Bose, 2012), and as stakeholders are less willing to share crucial personal information online with organizations when they perceive privacy risk (Myerscough, Lowe and Alpert, 2006; Wu et al., 2012), the question arises how organizations can manage these privacy concerns and which strategies or tools are effective in Big Data management for establishing a trusting relationship between stakeholder and organization.

Big Data is a term generally used to describe information that is collected in large volumes, through a variety of ways, and characterized by high velocity (real-time collection) (McAfee and Brynjolfsson, 2012), with great opportunities for added value. Dobbs et al. (2011) calculated the potential annual value of Big Data to Europe's public sector administration at €250 billion (more than Greece's GDP), and the potential value of Big Data for the Netherlands alone was calculated at €45 billion (Nationale Denktank¹, 2014: 7). Smart algorithms can identify and predict the behavior of target groups and give insight into real-time events, so that, instead of having to rely on intuition, managers can rely on evidence by using Big Data, as Amazon and Google have been doing successfully (McAfee and Brynjolfsson, 2012; Davenport, 2006). However, the Big Data phenomenon comes with great challenges and provokes difficult questions (Boyd et al., 2012), such as the need to secure private information (Chen et al., 2012). These wider implications of the business and economic possibilities of Big Data are important issues that need to be tackled by business leaders and policy makers, and proper policies and management practices need to be put in place to address the privacy concerns (Dobbs et al., 2011; McAfee and Brynjolfsson, 2012).

Firms' mismanagement of stakeholders' privacy concerns regarding the use of Big Data has resulted in major negative publicity, hurting consumer trust, as in the cases of Target (Duhigg, 2012), OfficeMax (Hoffman, 2014) and ING Netherlands (Munsterman, 2014; Laan, 2014). Such incidents have also been a source of sharp debate, as in Facebook's case, in which a mood manipulation test was conducted to examine whether the emotions of 700,000 uninformed Facebook users could be swayed by exposing them to news feeds weighted with either positive or negative posts and images (Bertolucci, 2014; Claburn, 2014). The loss of stakeholder trust due to violations of privacy, breaches of security and negative publicity comes at a heavy price for corporations: a Ponemon (2009) study calculated a cost of over $200 per record for such an incident, covering detection, notification and remediation efforts as well as lost customers. 'Consumers are now not only second-guessing the security of their personal information when they make routine shopping trips, but are also extending this lack of trust to how they perceive the stores and brands they once preferred' (Hoffman, 2014: 1). Privacy concerns are severe: TNS (2011) found that 70% of Europeans are concerned that their personal data held by companies may be used for purposes other than those for which it was collected. This is alarming, as without consumer trust most of the potential social and economic value promised from Big Data will not be realized (Rose et al., 2013). A challenge lies here, since a large proportion of organizations are not sufficiently prepared to address privacy and security issues (Kshetri, 2014). By reviewing the literature, I identified a gap in addressing this problem. Despite Big Data being a hot topic and major concerns being raised about its ethical use, the literature has paid little attention to the application of business ethics to the management of Big Data (Boyd & Crawford, 2012). Much literature is concerned with the possibilities and big questions of Big Data (e.g. Boyd & Crawford, 2012; Kshetri, 2014), but few suggestions are made on how organizations can manage these concerns. The purpose of this study is therefore to inform organizations on this issue. My research question is: 'How can organizations manage ethical privacy concerns regarding Big Data?'

In order to answer this question, I developed the following sub-questions. Antecedents: (1) What determines the organizational decision to address ethical privacy concerns regarding Big Data? Moderators: (2) Which tools or practices are used to address these concerns? Outcomes: (3) What are the effects of the scope and capacity of these tools or practices on perceived organizational trust?

This study assesses the topic from varying perspectives, such as stakeholder theory, business ethics, the resource-based view (RBV), business venturing, and other organization and information systems perspectives. Overall, the arguments are in line with Greenwood and Van Buren III (2010), who use stakeholder theory to argue that trust is a fundamental aspect of the organization-stakeholder relationship. From a strategic (instrumental stakeholder) and RBV perspective, the study assesses whether information management policies or practices that address privacy concerns can contribute to competitive advantage. Given that managing privacy concerns well can be a good business opportunity, resulting in a competitive advantage (Hoffman, 2014; Raymond, 2013; Rose et al., 2013), this is a relevant research topic not only in terms of organizations' ethical behavior but also in terms of firm performance. Information management policies are also currently relevant, as they can help businesses prepare for strict new European legislation in the making, which protects individuals' privacy more strongly and will impact organizations in major ways (Raymond, 2013). The aim is to form theory on this topic and to inform practitioners on privacy practices regarding Big Data and their outcomes on organizational trust, a key dimension for unlocking the value of Big Data.

Smith et al. (2011) discuss how a likely competitive advantage is obtained by organizations that are perceived as safer or more trustworthy concerning privacy dimensions (Bowie and Jamal, 2006). This competitive advantage is due to stakeholders' 'greater willingness to provide their information' to trusted organizations (Schoenbachler and Gordon, 2002). To establish trust, privacy concerns might have to be reduced, which can be done through corporate privacy policies, as these have a positive effect on reducing online users' privacy concerns (Lwin et al., 2007). As trust is domain specific (Zand, 1972), the interest of this study is trust in the domain of data management, or how much stakeholders entrust their (personal) data to the organization. An exploratory, qualitative, model-building multiple case study is conducted by interviewing 15 employees in 14 different organizations who either work with, or have extensive knowledge of, Big Data, and who will subsequently be referred to in this study as 'Big Data experts'. By interviewing Big Data experts, part of the stakeholder information asymmetries are reduced, as this group of internal stakeholders is better informed about how data is used in organizations than other groups of stakeholders, such as external customers.


2. Literature review

The literature review in this study leads to the formation of preliminary propositions, which are subsequently further investigated in the data collection phase. The propositions are part of a theoretical framework on the antecedents for the choice to develop privacy practices, moderators and outcomes on trust in the organization. An overview of the propositions is depicted below.

Figure 1.0 A three-stage model with propositions

Figure 1.0 above serves as an introduction to the preliminary propositions formed in the theoretical framework. The reasoning for these propositions is given in the following subchapters: (2.1) Organizational trust, (2.2) Big Data, (2.3) Big Data ethics, (2.4) Addressing stakeholders' privacy concerns ethically, (2.5) Addressing stakeholders' privacy concerns strategically, (2.6) Privacy practices, and (2.7) Proposed effect of privacy practices on organizational trust. Figure 1.1 below visualizes the theoretical framework.


Figure 1.1 Visualization of the theoretical framework

2.1 Organizational trust

Trust is a well-covered topic in the academic literature, although much of that literature has focused on trust as an antecedent of various outcomes rather than on which antecedents lead to trust. Nevertheless, 'scholars from various time periods and a diversity of disciplines seem to agree that trust is highly beneficial to the functioning of organizations' (Dirks et al., 2001: 450). A trusted reputation benefits organizations, as it encourages cooperative behavior from stakeholders and therefore diminishes various costs, such as transaction costs, agency costs and team production costs (Jones, 1995). Trust has also been found to have positive effects on (1) openness in communication within groups (Zand, 1972), (2) the accuracy of information shared with supervisors (Mellinger, 1959; O'Reilly and Roberts, 1974; O'Reilly, 1978) and (3) openness in communication in interorganizational relationships (Smith and Barclay, 1997). Gallivan (2001) has specifically mentioned trust as a condition needed to ensure the success of virtual organizations. Although this study is not necessarily about virtual organizations, the virtual aspect does relate to the digital world of Big Data. Urban et al. (2000) mention how website trust is becoming a key differentiator that determines the failure or success of various retail Web companies now that consumers are becoming more sophisticated about the Internet. Relating trust to the research topic, it has been suggested that trust in organizations is needed so that stakeholders are willing to share their data (Schoenbachler and Gordon, 2002), enabling organizations to unlock the potential value of Big Data. This is in line with Davenport and Prusak (1998: 35), who state that 'mutual trust is at the heart of knowledge exchange'. Building trust by responding to ethical privacy concerns regarding Big Data through tools or practices is probably better done proactively in 'good times', when trust levels are high, than in times of crisis, when trust levels are low, in line with suggestions made by Dirks et al. (2001).

Organizational trust is defined in this study as trust in an organizational relationship, or the trust that a stakeholder has in the organization. Mayer et al. (1995: 712) define trust in an organizational relationship as 'the willingness of a party to be vulnerable to the actions of another party based on the expectation that the other will perform a particular action important to the trustor, irrespective of the ability to monitor or control that other party.' As information asymmetries exist between the stakeholders who provide data and the organization that uses it, it is difficult for stakeholders to grant the organization their trust. One would expect that privacy practices could reduce information asymmetries, enhance a feeling of stakeholder control and therefore increase organizational trust. In this study, I examine whether establishing a large scope and capacity of privacy practices in response to ethical privacy concerns leads to greater organizational trust. As Mayer et al. (1995) found that trust is created by ability, integrity and benevolence, the expectation is that organizations can take measures to protect stakeholders' privacy and thereby create trust. As trust is domain specific (Zand, 1972), I tie trust to the domain of data management, or how much stakeholders entrust their (personal) data to the organization. My definition of organizational trust is based on the premise that stakeholders pay some awareness or attention to the action of giving data to the organization. I view awareness of this action as an antecedent of stakeholder knowledge, as it can be a starting point for learning about an organization and how it manages Big Data.

In this study I do not focus solely on one specific group of stakeholders in the discussion of concepts. However, by interviewing a specific group of stakeholders, Big Data experts as stakeholders (employees) in various organizations, part of the information asymmetries that other groups of stakeholders suffer from are reduced. I define Big Data experts as people in the organization who either work with or have extensive knowledge of Big Data. This sample of stakeholders has above-average information on how organizations, or the organizations they work for, manage data compared with many other stakeholders. Their levels of trust in the organization and its management of (private) data are therefore extremely valuable, as they have the potential to portray a (more) accurate view of which responses to the ethical privacy concerns regarding Big Data are more or less effective in building trust. However, there could be some bias in asking this group of stakeholders (employees) to evaluate trust in the organization they work for. They could be in charge of the data management policies and therefore inclined to give positive answers on how these policies affect perceived trust in the organization. To minimize this effect I ask questions about a hypothesized organization with characteristics similar to the one they work for. In the case of Big Data experts with consultancy roles serving multiple organizations on data, I ask them to keep in mind one of the organizations they work for when answering the questions. However, this does not completely take away possible bias and could potentially be a limitation of my study.

Mayer et al. (1995) mention how difficult it is to measure trust itself and suggest that doing so requires a method that taps into the person's willingness to be vulnerable to the trustee. 'The question "Do you trust them?" must be qualified: "trust them to do what?" and the issue on which you trust them depends not only on the assessment of integrity and benevolence, but also on the ability to accomplish it' (Mayer et al., 1995: 729). These arguments were taken into account when forming the interview questions. The aim is to measure the willingness of stakeholders (Big Data experts) to be vulnerable, or to give away control over personal data, by granting data to the (hypothesized) organization.

2.2 Big Data

Big Data is a complex research topic, which could be described and investigated elaborately by looking into the technologies used, its potential opportunities and threats, and many other dimensions. However, due to limited time and since the aim is not to explain the complex technological aspects of Big Data in great detail but to focus on strategic-ethical approaches to it, only the explanations considered necessary to form a framework of thought are discussed below.

Chen et al. (2012) state that the recent terms Big Data and Big Data analytics derive originally from the term intelligence, used in the field of artificial intelligence since the 1950s, followed by business intelligence in the 1990s and business analytics in the late 2000s. Although some argue that Big Data is an ambiguous term used for many concepts (Schroek et al., 2012), Dobbs et al. (2012) simply state in the preface of their McKinsey Global Outlook Report that Big Data consists of 'large pools of data that can be captured, communicated, aggregated, stored, and analyzed', which is very similar to Chen et al.'s (2012) definition. This definition stresses the large volume of data. Elsewhere, however, Big Data is explained along three dimensions. Both McAfee and Brynjolfsson (2012) and Schroek et al. (2012) mention, besides (1) Volume (the volumes of data are larger than ever before, around 2.5 exabytes per day and growing exponentially), the dimensions (2) Velocity (the (nearly) real-time character of the data collected allows firms to be agile) and (3) Variety (the forms of the data collected differ: messages and images on social networks, GPS signals from phones, etc.). Schroek et al. (2012) add a fourth dimension, Veracity: data is uncertain, because the unpredictability of some data cannot be removed even by the best cleansing methods. Kshetri (2014) adds two more dimensions to the 3Vs: Variability (data comes in peaks and lows) and Complexity (data coming from different sources needs to be linked, matched, cleansed and transformed across systems). The most commonly used 3Vs (volume, velocity and variety) were introduced as early as 2001 by Doug Laney.

The variety dimension of Big Data is illustrated by the various ways data is collected according to the large-scale survey (26,574 respondents in the EU) TNS published in 2011, commissioned by the European Union. The report states how internet users are increasingly being monitored in various ways, even without giving personal data, through digital 'cookies' or electronic identifiers left on their browsers by websites, or through the Internet Protocol (IP) address of their computer. This monitoring does not only happen on the internet, but also through mobile phones, cameras, payments, store loyalty cards, biometrics, Web 2.0 technologies (which allow users to share pictures, videos, etc.) and social media networks (TNS, 2011). Besides data that is knowingly shared, such as date of birth, name or account number, there is also data that is unknowingly shared, such as data on Google searches or the storing of locations through mobile phones: meta-data.


Meta-data can be both ethically and legally more problematic, as it is often gathered without the knowledge or informed consent of the person providing it. Some of the possible implications are discussed in the next section.

2.3 Big Data ethics

Most ethical concerns ultimately revolve around privacy, as it can be harmful when (personal) data is used for purposes it was not intended for or in ways that hurt the individual. It is often difficult for individuals to assess what will happen to the data they knowingly or unknowingly share. Although the digital age has made the problem more complex, the problem of determining what is private and what is not has existed for centuries, as Acquisti et al. (2015: 513) state: 'The dilemma of what to share and what to keep private is universal across societies and over human history. The task of navigating those boundaries, and the consequences of mismanaging them, have grown increasingly complex and fateful in the information age, to the point that our natural instincts seem not nearly adequate'.

Although laws do provide data protection acts, these often do not suffice to protect consumers' privacy, since firms find backdoor ways to circumvent privacy protection law (Dijkstra, 2014). There is an overlap between ethics and law, but business ethics generally concerns the field where law cannot give sufficient guidance (Crane & Matten, 2010) and is therefore relevant to the privacy issue. Dijkstra (2014) describes long-term effects of the monitoring and storing of people's data that are hardly perceived by the population itself. One is that, based on algorithms, people will be discriminated against without their knowledge (for example, a person is excluded from social housing, or is suddenly no longer eligible for additional health insurance, without knowing why). The other is explained by Marthews and Tucker's (2014) study: the knowledge of being monitored affects behavior, in the sense that unconscious behavior changes when a person is aware of being 'watched'. This is illustrated by the observation that after Edward Snowden exposed the NSA leaks, people used less sensitive search terms in searches relating to the U.S. government and also in searches regarding personal information, for example avoiding the term 'therapy' (Marthews and Tucker, 2014). These two effects might have a huge long-term impact on people, creating divides in social status and making people more risk-averse in their behavior. Besides the mostly unobserved possible (long-)term effects of Big Data collection on individuals, there are also observed effects, which stakeholders express with growing severity. As Smith et al. (2011: 990) state: 'Information privacy is of growing concern to multiple stakeholders including business leaders, privacy activists, scholars, government regulators, and individual consumers.'

Stakeholders are people who have a stake in an organization, implying a relationship in which they can be affected by that organization. Stakeholders can be consumers, the media, interest groups, local communities and stockholders (Aguinis and Glavas, 2012), among others. Employees and consumers are both stakeholders who share their data with organizations, and stockholders may be affected by negative publicity on Big Data use, as in the ING case, caused by another stakeholder: the media. When ING announced plans to use financial customer data for commercial ends, the press picked up on it and it caused a massive frenzy among the media and the general public (Munsterman, 2014; Laan, 2014). Clients of organizations are other stakeholders that in some cases entrust data to, or share data with, organizations. In my study I discuss stakeholders in the broader sense, encompassing all possible stakeholders of organizations.


The overall concerns are severe: as mentioned before, a large majority of Europeans (70%) are concerned that their personal data held by companies may be used for purposes other than those for which it was collected (TNS, 2011). Likewise, BCG's Global Consumer Sentiment Survey of 10,000 consumers found that for 75% the privacy of personal data was a 'top issue', and just 7% across 12 countries were willing to allow their information to be used for purposes other than those for which it was originally collected (Rose, Barton, Souza & Platt, 2013). In an Ovum study of 111,000 people across 11 countries, 68% would use a do-not-track feature on a search engine if it were easily available, and only 14% believed Internet companies were honest about their use of personal data (Coyne, 2013).

However, there are differences in the severity of the concerns across data types. Personal data such as health and financial records are notably viewed as the most sensitive categories. TNS (2011) underpins this, as the information considered personal is above all financial information (75%), medical information (74%), and national identity numbers, identity cards and passports (73%). According to Europeans, some data or data sources should receive extra protection: almost everyone questioned agrees that minors should be protected from (95%) and warned against (96%) the disclosure of personal data, and 88% are in favor of genetic data receiving special protection (TNS, 2011). This creates an interesting conflict of interest, since especially in the health industry a large potential for added value lies in accessing and using bodily health data and genetic data for predicting and preventing diseases, which could benefit consumers, governments and health firms alike (Dobbs et al., 2011; Nationale Denktank¹, 2014). Especially for these very sensitive types of data, such as health records, firms, governments and individuals need to consider carefully the trade-offs between utility and privacy (Dobbs et al., 2011). Other, less severe concerns are the recording of behavior via payment cards (54% vs. 38%), mobile phones (49% vs. 43%) or mobile Internet (40% vs. 35%) (TNS, 2011). Since the concern about data management, and therefore the risk involved, is stronger for some categories than for others, I propose:

P1: Organizations that work with more sensitive types of data than organizations that work with less sensitive types of data will perceive a greater need for data privacy practices. Only a small share of social network users (26%) and even fewer online shoppers (18%) feel in complete control (TNS, 2011). This loss of control can be explained by information asymmetries, where the consumer who shares their data/information has often limited knowledge and insight into the actions of the receiver of that information and the risks associated with it. In fact, Boyd and Crawford (2012) propose that the limited access to Big Data creates new digital divides in which: (1) only few firms/persons have access, creating inequality, (2) this information asymmetry will advance data scientists, since they own specific skills needed, (3) it is unclear how these skills are taught and (4) chilling questions are anticipated. As authorities and institutions are trusted more (55%) than commercial companies: only 32% trusts phone companies, mobile phone companies and Internet service providers and just 22% trusts Internet companies such as search engines, social networking sites and e-mail services (TNS, 2011) a difference is expected between how public institutions manage privacy issues regarding Big Data and how private institutions manage it. As

organizations often assess emerging issues in terms of opportunities and threats, the expectation is that commercial organizations will perceive greater threats in privacy breaches, since according to TNS (2011) their trust levels are generally lower than those of public organizations. Therefore I propose:

P2: Private organizations will perceive a greater need for privacy practices than public sector organizations.


In terms of how these privacy issues should be handled, a large majority of EU citizens (88%) believe that their personal data would be better protected if large companies were obliged to have a Data Protection Officer, and they have strong penalties in mind for companies that use personal data without their knowledge or consent (TNS, 2011). Europeans support sanctions against unethical firm behavior: 51% think fines are in order for such firms, 40% that they should be banned from using such data in the future, and 39% that they should be forced to compensate the victims (TNS, 2011). This could explain the recent significant growth of the business ethics 'industry', consisting of ethics consultants, corporate ethics officers, ethical investment funds, ethical products and services, and activities associated with ethics auditing, reporting and monitoring (Crane and Matten, 2010). However, ethics officers might not do a very good job, according to Boyd and Crawford (2012: 673), who argue that 'many ethics boards do not understand the processes of mining and anonymizing Big Data, let alone the errors that can cause data to become personally identifiable.' Security issues in particular are a major concern for most organizations, one they are willing to invest large sums of money in (Chen et al., 2012). Besides a

willingness to invest, monetary resources must also be available to make such an investment. Larger organizations with more slack resources are likely to have more monetary resources to invest than smaller and more entrepreneurial set-ups. Therefore I propose:

P3: Smaller organizations will likely have fewer privacy practices in place than larger corporations.

It offers little consolation that data scientists themselves underline these concerns. A Revolution Analytics survey of 144 data scientists, conducted to gauge their thoughts on Big Data ethics at the 2014 JSM (Joint Statistical Meetings), an annual gathering of statisticians, found that the vast majority of statisticians and data scientists believe that


consumers should worry about privacy issues related to data being collected on them (Bertolucci, 2014). The importance of ethics in addressing these concerns is explained clearly and simply by David Smith, chief community officer at Revolution Analytics: 'If people feel there isn't an ethical standard in place for data collection and analysis, then naturally they should worry about privacy issues associated with that data' (Bertolucci, 2014).
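The re-identification risk that Boyd and Crawford flag can be made concrete with a small sketch. The Python fragment below uses entirely invented records and hypothetical field names; it is a minimal illustration, not a method drawn from any of the cited studies. It shows how a dataset 'anonymized' by stripping names can be linked back to individuals through the quasi-identifiers it retains:

```python
# Illustrative sketch with invented records: how "anonymized" data can be
# re-identified by linking quasi-identifiers (zip code, birth year, sex)
# to a second data source that still carries names.

# "Anonymized" health records: direct identifiers removed.
health_records = [
    {"zip": "1011AB", "birth_year": 1971, "sex": "F", "diagnosis": "diabetes"},
    {"zip": "1011AB", "birth_year": 1980, "sex": "M", "diagnosis": "asthma"},
]

# A hypothetical public register pairing the same attributes with names.
public_register = [
    {"name": "A. Jansen",   "zip": "1011AB", "birth_year": 1971, "sex": "F"},
    {"name": "B. de Vries", "zip": "1011AB", "birth_year": 1980, "sex": "M"},
    {"name": "C. Bakker",   "zip": "2022CD", "birth_year": 1975, "sex": "F"},
]

def reidentify(records, register):
    """Match each 'anonymous' record against register entries sharing its
    quasi-identifiers; a unique match reveals the person's identity."""
    revealed = []
    for rec in records:
        candidates = [p["name"] for p in register
                      if (p["zip"], p["birth_year"], p["sex"]) ==
                         (rec["zip"], rec["birth_year"], rec["sex"])]
        if len(candidates) == 1:  # combination is unique -> re-identified
            revealed.append((candidates[0], rec["diagnosis"]))
    return revealed

print(reidentify(health_records, public_register))
# prints [('A. Jansen', 'diabetes'), ('B. de Vries', 'asthma')]
```

The point of the sketch is that removing direct identifiers is not sufficient: any sufficiently unique combination of remaining attributes can serve as a key into another dataset, which is exactly the kind of error ethics boards may fail to anticipate.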

2.4 Addressing Stakeholder’s privacy concerns ethically

Having identified the wide scope of ethical worries concerning Big Data, it is helpful to assess how these worries can be addressed ethically.

As the research sample of this study lies in the Netherlands, I relate my assessment mostly to European ethics. The practice and theory of ethics differ between the EU and, for example, the U.S., where a more formal approach to business ethics management (such as compliance programs) is in place. The broad difference in approaches to normative ethics can be categorized as (1) Europe: institutionally oriented, questioning and justifying, versus (2) the U.S.: individually oriented, accepting capitalism, and applying moral norms (Crane and Matten, 2010). Some of the organizations in the sample are, however, American organizations with an office in the Netherlands, which could influence their ethical approaches and perspectives. Additionally, formal approaches stemming from the U.S., such as auditing, accounting and reporting standards and ethical codes, have recently received increasing attention and are expected to become widespread in the future (Crane and Matten, 2010), while globalization increasingly interconnects the EU with the rest of the world. This could indicate that the way ethics are addressed in the Dutch offices is heavily influenced by American standards. Therefore I briefly provide an overview of the ethical standards of the two economies.


When ethics are discussed in relation to an organizational environment, two terms often come up: (1) Business Ethics and (2) Corporate Social Responsibility (CSR). The two are often confused and used interchangeably, although they differ in the sense that CSR is more focused on environmental and social responsibilities, whereas Business Ethics is more concerned with moral dilemmas of right and wrong (Crane and Matten, 2010). Aguinis (2011: 855) defined CSR as "context-specific organizational actions and policies that take into account stakeholders' expectations and the triple bottom line of economic, social, and environmental performance."

In this study I assess the ethical implications of Big Data mostly from a normative stakeholder theory perspective, which in the context of Big Data relates both to Business Ethics and to Corporate Social Responsibility, as stakeholders are impacted both by the moral decisions actors in the organization make and by the level of social responsibility an organization is willing to take. In normative stakeholder theory, the stakeholder interests that organizations should take into account are central. Clarkson (1995) has even argued for evaluating corporate social performance in terms of stakeholders' satisfaction instead of in terms of demonstrating corporate social responsiveness or fulfilling corporate social responsibility (as cited in Wijnberg, 2000: 330). The impact of the Big Data phenomenon increasingly requires organizations to manage it in a socially responsible way, and it involves moral decisions on what data to use and what not, and on how to inform stakeholders about its collection and usage.

An important reason for organizations to engage in Corporate Social Responsibility is institutional pressure, particularly from stakeholders, who pressure organizations by impacting the firm's reputation as well as its potential resources and revenues


(Aguinis and Glavas, 2012). Richards and King (2014: 395) argue that 'Big Data, broadly defined, is producing increased powers of institutional awareness and power that require the development of Big Data Ethics' and that 'if we fail to balance the human values that we care about, like privacy, confidentiality, transparency, identity, and free choice, with the compelling uses of Big Data, our Big Data society risks abandoning these values for the sake of innovation and expediency.'

However, although it is suggested that Big Data has brought about a broad range and scale of ethical issues and questions (Boyd & Crawford, 2012; Lane et al., 2014; Neuhaus and Webmoor, 2012), 'very little is understood about the ethical implications underpinning the Big Data phenomenon' (Boyd & Crawford, 2012: 672). At the same time, Mingers and Walsham (2010) argue that in today's world ethics is indeed important for the practice of Information Systems. At its core, 'ethics rationalizes morality, to produce ethical theory, that can be applied to any situation' (Crane and Matten, 2010: 7). Debates and dilemmas around business ethics have received much attention in the last few years, making it a very prominent topic, and firms themselves increasingly seem to realize that being (or being seen as) ethical may be good for business (Crane and Matten, 2010).

In all likelihood the most influential and popular theory overlapping Business Ethics and Corporate Strategy is the stakeholder theory of the firm (Crane and Matten, 2010; Stark, 1994). The stakeholder theory of the firm rests on the foundations that (1) corporations have relations with several groups of associates, called stakeholders, that both affect and are affected by its decisions (Freeman, 1984), (2) it is concerned with the processes and outcomes of these relationships for the corporation and its stakeholders (Jones, 1999), (3) it values the interests of all stakeholders, with no one set of interests dominating the others


(Clarkson, 1995; Donaldson and Preston, 1995), and (4) there is a focus on managerial decision making (Donaldson and Preston, 1995).

Donaldson and Preston (1995) argue that stakeholder theory can be approached from instrumental, descriptive and normative justifications which, although fairly different, are mutually supportive, with the normative base serving as a critical foundation for the theory in all its forms. Garriga and Melé (2004) make a further distinction among the ethical value theories in stakeholder theory, which focus on the ethical requirements in the relationship between business and society, broadly dividing them into (1) normative stakeholder theory, (2) universal rights, (3) the common good approach and (4) sustainable development. The approach I focus on in this study, as mentioned earlier, is normative stakeholder theory, which attempts to provide reasoning for why organizations should take stakeholder interests into account (Crane and Matten, 2010) and which mainly became an ethical theory when Freeman wrote 'Strategic Management: A Stakeholder Approach' in 1984 (Garriga and Melé, 2004). The stakeholder approach is a knowledge-gathering mechanism to predict environmental opportunities and threats; it uses stakeholders (customers, owners, suppliers, society, the public, etc.) as a generic level of analysis, often using stakeholder interviews and public attitude surveys as analytical techniques (Freeman, 1984). In line with Freeman's suggestion, secondary data in the form of public attitude surveys (TNS, 2011) were used in this study as well. Below, I argue from a normative stakeholder theory perspective why stakeholders should be taken into account in organizations' decision making on the management of data.

As discussed in the previous section, most stakeholders feel they have very little control over what happens with their data, and commercial firms in particular are not much trusted. The


majority of Europeans want to see unjust firm behavior concerning personal data sanctioned. Morally it is questionable whether, under which conditions, and for what purposes stakeholders' (private) information can be gathered, analyzed and used: 'Data may be public (or semi-public) but this does not simplistically equate with full permission being given for all uses' (Boyd & Crawford, 2012: 763). The felt lack of control over what happens with collected data stems from the power imbalance between the people on whom data is collected and the firms or organizations that hold and use this data. Stakeholders that lack power have to rely on the trustworthiness of organizations to meet their fairness obligations to them, with no guarantee that they will be met (Greenwood and Van Buren III, 2010). Boyd & Crawford (2012) agree that the power relationship between stakeholders (as users) and organizations is unbalanced, as many users are unaware of the algorithms and multiplicity of agents that gather and store their data for future use. Kshetri (2014) explains the unequal relationship between consumer stakeholders and organizations in terms of information asymmetries and the possibility of discrimination, resulting, for example, in low-income and minority shoppers being targeted by sellers of inferior products (Talbot, 2013). Information asymmetries and asymmetrical exchange are especially the basis of social media, making social media ethically dubious (Claburn, 2014). The information asymmetries which provoke distrust among stakeholders should be properly addressed by firms if they want to bind stakeholders to them. Dawkins (2013) argues that power asymmetries should be reduced, interactions should be enabled and stakeholders should be able to impact distributive outcomes in an organization-stakeholder relationship. And although it is as yet unclear how 'fair use' of data is defined, it is clear that privacy is fundamental to an organization's trust relationships with its stakeholders, such as customers, employees and business partners (Dobbs et al., 2011). Therefore


organizations need to consider thoughtfully what kind of trust expectations they want to establish with their stakeholders when developing a privacy policy, which will need to be communicated clearly to those stakeholders (Dobbs et al., 2011). On creating trust, Rose et al. (2013: 2) argue that 'the companies that excel at creating trust should be able to increase the amount of consumer data they can access by at least 5 to ten times in most countries'; the resulting torrent of newly available data will meaningfully shift market shares and accelerate innovation, constituting the 'trust advantage' (Rose et al., 2013). Additionally, Jones (1995) states that a trusted reputation can be created by avoiding opportunistic behavior, and that ethical behavior should resonate in the decisions and policies of the organization. This means that organizations should master their codes of conduct, internal principles, trust metrics and compliance mechanisms, holding themselves accountable and communicating transparently with stakeholders about their actions and performance as data stewards (Rose et al., 2013). Considering the previous arguments, I propose:

P4: Addressing data privacy concerns of stakeholders benefits trust in the organization.

2.5 Addressing Stakeholder’s privacy concerns strategically

Instrumental CSR theories (CSR as a means to an end, the end being profits) are based on the assumption that a corporation's sole purpose and social responsibility is wealth creation; only the economic aspects of the relationship between society and business are considered, and social activity is considered only if it involves wealth creation (Garriga and Melé, 2004). Instrumental stakeholder theories attempt to answer the question of whether it is beneficial for the organization to take stakeholder interests into account (Crane and Matten, 2010). According to Garriga and Melé (2004), instrumental theories consist of three main groups, varying in approach: (1) maximizing


shareholder value, (2) strategies for achieving competitive advantages and (3) cause-related marketing. The focus in this section is on instrumental theory as a strategy for competitive advantage and on assessing whether it is indeed beneficial for organizations (with the sole purpose of maximizing profit) to take stakeholders' interests into account. In other words, I examine how adopting a Corporate Social Responsibility approach to the ethical privacy concerns around Big Data usage can be a competitive advantage for organizations, as a means to create greater wealth.

From an instrumental stakeholder perspective, it could be argued that there is 'nothing wrong' with collecting, analyzing and storing Big Data from stakeholders, as it is justified by the added value it brings to the firm: through smart algorithms and predictive modeling, business operations can become smarter and more efficient and costs can decrease. The instrumental theory does not recognize the ethical objections discussed in the previous section. From this perspective, organizations can actually benefit from information asymmetries, as these allow them to run all kinds of analyses that people do not need to know about, giving them free rein to experiment and innovate as long as no concerns are raised. In the Big Data debate organized by the Nationale Denktank on November 26th 2014, one of the speakers argued that ethical privacy restraints often block innovation and are therefore not beneficial for organizations. In line with these arguments, Bridoux and Stoelhorst (2014) argue from an instrumental stakeholder perspective that a fair treatment of stakeholders is not always more profitable than treating stakeholders at arm's length. The suggested advantage of keeping information asymmetries intact is premised on the idea that stakeholders generally have little awareness and knowledge of what happens with their data, which makes them less concerned and therefore more trusting in providing data. The suggested advantage of refraining from informing stakeholders could also partly be due to


cognitive factors such as bounded rationality, meaning that people do not always make rational decisions. Stakeholders might, for example, know that organizations may not treat them fairly or ethically but still choose to engage with them. The same might hold for providing data: although stakeholders might know of the potential dangers of mismanagement of the data they provide to an organization, they might still choose to provide it. Whether this is indeed the case will be observed in the data collection phase.

Maintaining information asymmetries might give organizations greater power if it is assumed that stakeholders generally lack awareness of the potential implications of providing their data and consequently have no concerns. In a 'what you don't know can't hurt you' fashion, organizations are then able to collect large amounts of data almost unnoticed. This argument is stressed by Labrecque et al. (2013), who argue that nowadays organizations gain power through the large amount of data content created by consumers, whereas the earlier trend was that consumers were gaining power through access to user-generated content.

However, the question arises how sustainable it is to rely on these information asymmetries, as awareness of the (privacy) risks of data collection is increasing rapidly and concerns have become quite severe. As I identified before, a large majority of Europeans (70%) are concerned that their personal data held by companies may be used for purposes other than those it was collected for (TNS, 2011). Likewise, BCG's Global Consumer Sentiment Survey of 10,000 consumers found that for 75% the privacy of personal data was a 'top issue', and just 7% across 12 countries were willing to allow their information to be used for purposes other than those it was originally collected for (Rose, Barton, Souza & Platt, 2013). In an Ovum study, 68% of the 111,000 people across 11 countries would use a do-not-track feature on a search engine if it


were easily available. Increasing awareness, knowledge and concerns might then undo the trust created by the former lack of knowledge. As I argued before, stakeholders find it difficult to trust when they feel a lack of control, which will especially be the case if they are concerned about providing their data. Relying on information asymmetries as an advantage can then only work on the assumption that stakeholders do not care about their data privacy, or are so unaware of possible harm that they see no reason to consider whether or not to provide their data. Given the trend of increasing awareness of the potential dangers of Big Data, relying on information asymmetries does not seem a sustainable strategy and carries the danger of potential losses. When awareness, knowledge and concerns increase, stakeholders might become more distrusting and choose to engage less with organizations, especially if the organization fails to inform them and empower them with a sense of control. Therefore I would argue that increasing stakeholders' knowledge by informing them would make it easier for them to engage with organizations and make them more trusting. If stakeholders are more trusting, they are likely to choose long-term engagement with organizations, which adds to the organization's wealth creation. Therefore, based on the premise that stakeholders do care about data usage and have some awareness, I propose:

P5: A higher level of stakeholder knowledge of data usage benefits organizational trust when stakeholders are concerned about Big Data usage.

With increasing examples of organizations being badmouthed in the media for mismanaging Big Data ethics, as mentioned, ignoring the concerns seems risky, especially as consumers nowadays have great power in influencing and spreading perceptions of organizations and influencing the media (Labrecque et al., 2013). If not by being ethical and fair, then at least by seeming so, organizations can add to their wealth, as increasingly


studies have shown that CSR strategies have, in the majority of cases, a positive predictive power on firm financial outcomes and reputation (Aguinis and Glavas, 2012). Therefore I propose that having the right resources in place (in terms of policies, procedures and processes) to show stakeholders that the organization deals with collected data in an ethical way can actually add to the organization's wealth and can potentially become a competitive advantage (Hoffman, 2014; Raymond, 2013; Rose et al., 2013). In terms of the Resource Based View of the firm, a competitive advantage could hereby be created: there are market imperfections (some firms know better how to manage the ethical privacy concerns, or have better resources to do so), firms are heterogeneous, which explains why some firms generally do better than others, and not all firms can freely enter and exit markets, meaning that a firm with access to resources such as a star privacy policy advisor has benefits over organizations that do not have such a person in their network or cannot afford one. A solid privacy data management policy could then become a VRIN resource, if the resources acquired are valuable, rare, inimitable and non-substitutable (Barney, 1991). In order to deploy and, over time, develop such a privacy data management policy, dynamic capabilities might be required, as these are higher-order capabilities on top of the value creation process which reflect the ability to change firms from within and which are needed to sense, seize and reconfigure the existing business (Teece et al., 1997).

In line with both the RBV and the dynamic capabilities approach, it could be argued that an organization's social and ethical resources or capabilities can be a source of competitive advantage: in processes such as moral decision-making (Petrick and Quinn, 2001; Garriga and Melé, 2004), in developing good stakeholder-organization relationships (Harrison and St. John, 1996; Garriga and Melé, 2004; Hillman and Keim, 2001) and in the organization's responsiveness, deliberation, perception or capacity for adaptation (Garriga and Melé, 2004;


Litz, 1996). Because the Big Data phenomenon is relatively young, not all organizations have such resources in place, which suggests that these resources are valuable, rare, perhaps inimitable (if they are developed in a path-dependent way) and possibly non-substitutable. Since organizations are new to the resources needed to respond to ethical concerns regarding Big Data, obtaining those resources can result in a first-mover advantage, strengthening the competitive advantage of the organization. As I identified earlier, lacking the resources to address stakeholders' ethical concerns can result in a competitive disadvantage, as it creates a lack of trust, which locks away some of the potential value of Big Data. Mistrust creates (unconscious) cautious behavior, and a violation of trust or privacy can lead a consumer-stakeholder to choose another organization to relate to; it affects communities and media coverage (and thus the reputation of the organization), which in turn affects shareholder value and ultimately employees. Therefore I arrive at the same proposition as formulated earlier:

P4: Addressing data privacy concerns of stakeholders benefits trust in the organization.

2.6 Privacy practices

In the previous sections I identified that most ethical concerns regarding Big Data revolve around privacy issues. Privacy has been defined in various ways and in various fields of business. A striking definition of privacy as a general term is posed by Swire & Bermann (2007: 3), who define privacy as 'the protection and appropriation at use of the personal information of customers, and the meeting of expectations of customers about its use' (as cited in Massart, 2014). More specifically applicable to this study, Smith et al. (1996) quote from Stone et al. (1983) a definition of information privacy as "the ability of the individual to personally control information about one's self". This relates back to the information asymmetries discussed in the previous sections.


My personal expectation is that privacy practices in firms will mostly revolve around the basic legal rules with which organizations in the EU have to comply. This assumption is based on the fact that these regulations shape the framework of minimal requirements which, if not followed, can cause immediate harm to an organization through expensive fines. These regulations can be found in the European Directive on Data Protection¹ and, specifically for the Netherlands, in the WBP², the Dutch Data Protection Act. Bowie (2006: 330-331) explains the basic rules by stating that 'personal data must be processed fairly and lawfully and collected only for a specified, explicit, and legitimate purpose. Use of data beyond that stated purpose is prohibited. Data cannot be kept any longer than needed to serve its purpose and the data can be collected only if the person has given his or her consent.' However, gathering consent has proven to be an ineffective strategy for protecting privacy, due to bounded rationality, which sustains information asymmetries (people do not actually know what they consent to), and cookie-consent pop-ups that are thoughtlessly clicked away (Borgesius, 2014). Nevertheless, website pop-ups asking for consent to place cookies are still common practice. Another measure organizations use to comply with legal requirements is having codes of conduct, which provide statements on how business is done. Beyond the basic legal requirements, Rayport (2011) suggests four principles that should keep customers from becoming alienated from firms: (1) creating ethical positions such as chief data officer, chief safety officer and chief privacy officer, (2) proposing legal measures for privacy, like Senator Jay Rockefeller, who in 2011 proposed the Do-Not-Track Online Act, (3) creating a firm's own privacy framework to stay ahead of legislation, and (4)

1 This directive can be found in the Official Journal of the European Communities and accessed at

http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2001:008:0001:0022:EN:PDF

2This law and a summary can be found at


realizing that when ads are properly targeted, they cease to be ads and turn into an important piece of information. In line with points two and four above, Raymond (2013) also stresses the need for accurate and up-to-date data and information management policies, as legislative initiatives in both the EU and the U.S. are bound to place sharp demands on how firms manage data. Firms need to look beyond the minimal requirements they currently must meet (consent, and reporting when a data breach occurs) and plan for the reality of an online global environment of data storage, sharing and transmission (Raymond, 2013). Rose et al. (2013) argue that good data stewardship requires that policies about data use rest with the C-suite, instead of being relegated to the public policy or legal department under the guise of lobbying or privacy. Privacy practices can also be organized through organizational practices that create accountability. Boyd & Crawford (2012), like Rose et al. (2013), stress the importance of accountability for acting ethically: accountability both to the field of research and to the research subjects. It is not enough to assume that ethics boards will take care of protection; critical thinking about the

ramifications of Big Data is required, and important questions of control, truth and power in Big Data studies need to be addressed, given that social media users, for example, do not have the tools and access that researchers have (Boyd & Crawford, 2012). Accountability throughout an organization is often created through procedures, incentive systems and organization design. Simons (2013) states that whereas in past eras accountability for results was reserved for top-level managers, a more decentralized accountability structure is now emerging in both governmental and commercial organizations, relying on the accountability of individual workers and groups. Combining the findings stated above, I come to the following proposition:


P6: The greater the number of the following privacy practices in place, the greater the organizational trust: (1) complying with legal requirements, (2) having data privacy officers, (3) having internal organizational privacy frameworks, (4) inducing cultural values on the ethical use of data, (5) having privacy as an agenda point throughout the organization, (6) having regular board meetings on privacy, and (7) organizing data privacy accountability via procedures, systems and organization design.

However, organizations that do not have data privacy practices in place most probably find other ways to create trust, since stakeholders are free to join, stay in, or leave the relationship with the organization (and choose a competitor), and hence have the ability to influence value creation (Hill and Jones, 1992; Bridoux and Stoelhorst, 2014). Milne and Boza's (1999) study even found that managing consumer information in a marketing strategy, for example, is more effectively done by building trust than by using concern-reducing strategies. Stakeholders provide important resources, as they contribute to value creation by enabling access to their data, which enables managers to make informed decisions. Having identified the importance of trust within the organization-stakeholder relationship, it is useful to review the strategic organization literature on how trust is established in such relatively uncertain and unfamiliar situations: when trust is lacking, stakeholders are less willing to share their data, diminishing value appropriation. Regarding Big Data usage, however, it is difficult to create trust, due to the information asymmetries between stakeholder and organization. One way to approach this is to view the organization as an entrepreneur aiming to gain support and legitimacy for a new venture. The situation of organizations addressing stakeholders' concerns around Big Data is similar to the entrepreneurial situation in the sense that, due to the perceived lack of control and the little established stakeholder knowledge about Big Data, uncertainty is prevalent and needs to be overcome to build a successful relationship with stakeholders. It is suggested that entrepreneurs launching a new venture, in order to reduce uncertainty, gain support for their ideas and familiarize themselves and others with new opportunities, use analogical or metaphorical


comparisons with other experiences and cases (Cornelissen and Clarke, 2010; Lounsbury & Glynn, 2001; Sternberg, 2004; Ward, 2004). Since analogies and metaphors convey relationships to concepts already understood, they are useful in this context: they construct meaning for those listening to the ideas presented and allow entrepreneurs to make sense of unfamiliar situations (Gioia, 1986; Cornelissen and Clarke, 2010). 'Magic', a venture based on "customer-centric online shopping", is a good example of these principles: initially the (then new) concept of online shopping was poorly understood, and the founding entrepreneur overcame this hurdle by linking online shopping with an image of offline supermarket shopping, which made the concept relatable (Santos & Eisenhardt, 2009; Cornelissen and Clarke, 2010). It is even suggested that using metaphors as part of a (digital) communication strategy can not only gain the trust of stakeholders but also gain acceptance on a broader level. This is illustrated by the case of a digital security service called 'Secret', studied by both Santos and Eisenhardt (2009) and Cornelissen and Clarke (2010): after an initial, unsuccessful focus on gaining trust in the new venture's identity and its product, Secret successfully adopted metaphors, relating its service to an electronic wallet, to describe the venture to would-be customers, other stakeholders and itself. Therefore I propose:

P7: Using metaphors and analogies in communication with stakeholders will increase organizational trust.

According to Easton-Calabria and Allen (2015), the two questions that scholars and others concerned with developing ethical approaches to data should ask are: how do we "do the best" with the vast amount of information at hand, and what does this "best" look like, for whom and for what purposes? However, doing the best with all the data available, in the sense of spotting valuable information and using it in a value-adding way, is still problematic for many companies, as they (1) limit their capacity to only a few data experts, (2) who are not always the best candidates to critically assess and apply the information, which is why (3) more informed sceptics are needed (Shah, Horne and Capella, 2012). Informed sceptics are described as employees who can read and handle data reasonably well (with some extra training) and use a common-sense approach to explain its meaning, combined with experience in the firm. The suggestion to look further than recruiting data scientists provides a fresh approach, as much of the literature has focused on the lack of capable data scientists and on the need to create educational programs to fill the gap (Nationale Denktank¹, 2014; Chen et al., 2012; Dobbs et al., 2011; McAfee and Brynjolfsson, 2012), which will take time. As not all organizations have the resources to hire (expensive) data scientists, who are small in number as well, the expectation is that data analysis will be performed by others in the organization. Note that informed sceptics are still people who understand data analysis. Having capable people in the organization to process large amounts of data will most probably make data processing easier, and when processes run more smoothly, fewer mistakes are usually made. It therefore seems logical that having such capable employees will benefit trust in the organization. Therefore I propose that:

P8: When either data scientists or informed sceptics are hired to manage data, organizational trust will increase.

2.7 Proposed effect of privacy practices on organizational trust

The proposed research model in this study is in line with Smith et al.'s (2011: 989) suggestion that 'positivist empirical studies will add the greatest value if they focus on antecedents to privacy concerns and on actual outcomes. In that light, we recommend that researchers be alert to an overarching macro model that we term APCO (Antecedents > Privacy Concerns > Outcomes).' Nonetheless, there are some differences between the proposed APCO model and this study: this study does not start with the antecedents to concerns (although these are incorporated in the questions to participants), but starts with the reaction to the concerns itself as antecedent, moderated by the scope and capacity of the privacy practices established in an organization, and ends with outcomes of organizational trust, in terms of entrusting data to the organization. Lwin et al. (2007) found that corporate privacy policies have a positive effect on the reduction of online users' privacy concerns, and Smith et al. (2011) discuss how a likely competitive advantage is obtained by organizations that are perceived as safer or more trustworthy concerning privacy dimensions (Bowie and Jamal, 2006). As identified earlier, organizations that are trusted by consumers benefit from consumers' greater willingness to provide their information (Schoenbachler and Gordon, 2002). In sum, these findings indicate that it is in the organization's best interest to receive stakeholders' trust as an outcome. It would be reasonable to further expect that having privacy policies in place would reduce privacy concerns and therefore enhance trust. However, Milne and Boza (1999) found it more effective to build trust than to attempt to reduce consumer concern. Therefore, I do not exclude that other strategies or practices can enhance consumer trust, and two other dimensions are also taken into account in the model: (1) communication strategies and (2) the level of sophistication of data processing. After identifying the antecedents of choosing (or not choosing) privacy practices, the aim of this study is mainly to examine how the scope and capacity of established privacy practices, communication strategies and established processes regarding Big Data influence perceived organizational trust. In line with the previous propositions I therefore propose:


P9: The higher the scope and capacity of privacy practices established in an organization, and the higher the level of sophistication of communication strategies and data processing established in an organization, the higher the perceived organizational trust.

3. Method

3.1 Research Design

An overall qualitative approach is taken by conducting a multiple case study in which 14 organizations (cases) are examined with respect to how they respond to ethical concerns regarding Big Data and the resulting perceived organizational trust. To gain access to the necessary information, Big Data experts in each organization are identified and interviewed. Each organization is treated as a holistic single unit of analysis, as depicted below in the upper right corner of Yin's (2009) framework. A multiple case study fits the characteristics of this research well, as it is particularly suited to investigating new research areas such as this topic, and the theory developed from it has specific strengths such as empirical validity, testability and novelty (Eisenhardt, 1989).

Figure 1.2: from Yin (2009)
