Academic year: 2021

Do we really care about privacy?

MSC Business Economics/Digital Business

University of Amsterdam

Thesis

Bastiaan Karelse // 11208082

First Mentor: J. Guyt

(2)

PREFACE

Before you lies the thesis ‘Do we really care about privacy?’, based upon a choice-based conjoint analysis regarding which factors drive people’s willingness to provide their personal data. It has been written to meet the requirements of the Business Management Program at the University of Amsterdam. I was engaged in writing this thesis from December 2017 until June 2018.

The research methodology was developed together with my supervisor, Jonne Guyt. The choice-based conjoint analysis was difficult, but conducting the investigation with the help of Jonne Guyt enabled me to execute the analysis and answer the research question. I’m truly satisfied with the help of Jonne Guyt, who was always available, willing to answer my many questions and provide me with sufficient feedback throughout the process of writing this thesis.

Furthermore, I would like to thank Emma Bettany for her time in proofreading my thesis. Also, I would like to thank the survey respondents, without whose cooperation I would not have been able to complete this thesis. And, last but not least, I would like to thank Linda Schmale for keeping me motivated and, as always, for her kind words.

I hope you enjoy your reading,

Bastiaan Karelse

(3)

ABSTRACT

Personal data possess tremendous value for companies, and the emergence of the Internet of Things has transformed everything from refrigerators to cars into repositories of personal data. As a result, the collection and combination of a large amount of data enables organisations to become data-driven and consequently more competitive. This trend reflects the current hyper-connected state of the world. As data have become increasingly accessible, so have possibilities for abusing them. Meanwhile, people are increasingly worried about how companies use and protect their online data, thus becoming more reluctant to provide personal data. This thesis researches how organisations can satisfactorily collect and process personal data.

The current literature describes an existing privacy paradox: a gap between the expressed concern about privacy and the actions users are willing to take to protect their privacy. One could argue that people are apathetic towards their privacy. However, one could also argue that people are concerned but are faced with the dilemma of balancing privacy concerns with a competing desire for convenience (Polgar, 2017). A potential approach for mitigating the reluctance to provide personal data is the provision of benefits that outweigh the disadvantages of disclosing personal data. Researchers delving into this topic have found that people are usually willing to compromise privacy for convenience and benefits (Hoffman, 2014).

This study examines the trade-offs made by people when providing their personal data. A review of existing literature finds that providing a monetary incentive and personalised offers can influence people’s willingness to provide their personal data. In addition, people consider the type of data that is requested when pondering whether or not to provide their personal data. This study is novel because it researches the effect of monetary incentives, personalised offers and type of data on the willingness to provide personal data by applying a choice-based conjoint analysis. By applying this method, the relative importance of each benefit is uncovered. A binary logistic regression is executed to determine whether subjects may be willing to provide personal data if they receive benefits in exchange for privacy disclosures. In addition, attitudes towards trust, anonymity and privacy concerns are considered.

Results demonstrate that offering a set of benefits does not significantly impact willingness to provide personal data; however, significant differences exist among respondents regarding their willingness to provide these data. The lack of significant relationships yields several implications for organisations. First, this research demonstrates that the dynamics of factors of providing data are more complex and subtle than previously thought. Second, according to the result of the control variables, generational differences should be considered, as the ‘Internet Generation’ may be more willing to provide personal data. Third, benefits do not need to be provided to obtain personal data; the offering of benefits may even discourage consumers from providing personal data. The analysis and literature review show that if companies wish to collect data from consumers in a satisfactory manner, they should 1) account for type of data requested, 2) provide consumers with control over their data and 3) anonymise data whenever possible.

(4)

STATEMENT OF ORIGINALITY

This document is written by student Bastiaan Karelse who declares to take full responsibility for the contents of this document.

I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it.

The Faculty of Economics and Organizations is responsible solely for the supervision of completion of the work, not for the contents.

(5)

CONTENTS

Preface ... 2

Statement of Originality ... 4

List of Figures and Tables ... 6

Glossary ... 7

Introduction ... 8

Contribution ... 10

Research Question ... 11

Literature Review ... 12

Conceptual Model ... 13

Research Design ... 21

Research Methodology ... 21

Choice-Based Conjoint Experiment ... 21

Measurement: Moderators ... 24

Measurement: Control Variables ... 24

Measurement: Dependent Variable: Willingness to Provide Personal Data ... 24

Strengths and Limitations ... 25

Handling Missing Data ... 26

Cleaning the Dataset ... 26

Research Results ... 29

Correlation Analysis ... 29

Direct Effects ... 31

Interaction Effects ... 34

Conclusion(s) ... 37

Discussion ... 38

Limitations of Research ... 39

Future Research ... 39

Recommendations ... 39

Bibliography ... 41

Appendix ... 45

Questionnaire ... 45

(6)

LIST OF FIGURES AND TABLES

Figure 1. Business Impact of privacy concerns ... 10

Figure 2. Conceptual Model ... 13

Table 1. Conjoint Factors and levels ... 22

Table 2. Choice sets ... 23

Table 3. Analysis DV: Willingness to provide personal data... 24

Table 4. Distribution Matrix (Skewness, kurtosis and Cronbach's alpha) ... 28

Table 5. Correlation Matrix (Means, Standard Deviations, Correlations) ... 30

Table 6. Direct Effects ... 33

(7)

GLOSSARY

Data Privacy: “the relationship between the collection and dissemination of data, technology, the public expectation of Privacy, and the legal and political issues surrounding Privacy” (Hasty, Nagel, & Subjally, 2013).

Financial Data: information about a person’s or organization’s financial transactions.

Privacy Concerns: “exist wherever personally identifiable information or other sensitive information is collected, stored, used, and finally destroyed or deleted – in digital form or otherwise” (Bergstein, 2004) (Bergstein, 2006).

Personally Identifiable Data: data that can be used on its own or with other information to identify, contact, or locate/identify a single person (Stevens, 2012) (Greene, 2014).

Processing Data: any handling of personal data, automated or otherwise, such as the collecting, retaining, saving, organising, changing, use, making public, transferring or removing of personal data. 1

(8)

INTRODUCTION

The internet is one of the most influential phenomena of the present era. Gradually and almost unnoticed, it has become an integral part of people’s lives. In 2017, 97% of the Dutch population (> 12 years old) were connected to the internet, and 80% were active on social media. The internet is primarily used to obtain information (78.6%), read newspapers (64%), play games and listen to music (58%) (Centraal Bureau voor de Statistiek (CBS), 2017). These activities require the transmission of personal data, and because of the presence of search engines, the rise of the Internet of Things and social media, combined with the use of data mining techniques, collecting these data has become standard procedure for companies. It is predicted that in 2020, 44 zettabytes of data will be present in the world (VPRO Tegenlicht, 2017). Organisations harvest and analyse these data to obtain (real-time) insights into the behaviour of prospective and current customers. Swartz (2006) and Bergstein (2010) argue that while the value for organisations is evident, using these data can become a threat to people’s privacy (Swartz, 2006) (Bergstein, 2010).

The majority of organisations began recognising the potential of data around 1995 (Bergstein, 2004). Between 1995 and 2000, personal data were collected for making strategic decisions, developing new products and providing personalised offers. While only 10% of organisations collected personal data in 1995, since 2010, 82% of organisations have constantly monitored users through cookies, the Internet of Things, wearables, smartphones and identifiers on browsers, and offline through cameras, financials, loyalty cards, biometrics and smart lights, to acquire different data types, including demographic, contact, financial and socio-economic data. These data are then used for marketing, creating data-driven strategies or selling to third parties (TNS Opinion & Social Avenue Herrman Debroux, 2011). By 2017, 81% of all Dutch organisations had become data-driven. Data-driven organisations make all decisions based on data, and research shows that, as a result, they are 23 times more likely to acquire customers, six times more likely to retain customers and 19 times more likely to be profitable (Gaskell, 2017). Despite the clear benefits for companies, 80% of Dutch inhabitants are still unaware of the value of their data for companies and the corresponding risk for their privacy (VPRO Tegenlicht, 2017). An impression of the value of personal data can be obtained by entering the search string ‘what is personal data worth?’ into a search engine. The results comprise articles and sentiment expressing the view that data have value and should not be freely provided. Academics do not agree on the value of data. For example, scientists and mathematicians Lanier and Wolfram (2017) posit that every person can acquire a basic income by selling their data. At the moment, however, the average price on auction websites for the social media data of one person is $0.000856, and it is not clear where and why the data are used. The New York Times (2017) mentions that an individual’s personal data rarely sell for more than $1.

While it is clear that the usage of personal data provides a more personalised service and more customised products, it also has drawbacks. Dijkstra (2014) argues that the long-term effects of data collection, storage and usage can be harmful to individuals because of increased illegal activity on the internet, such as rises in online racism, identity theft and fraud. One type of online racism occurs when organisations discriminate against people based on historic behaviour on social media or online search behaviour. One potential consequence could be a person being excluded from social housing or health insurance without knowing the reason (Dijkstra, 2014).

(9)

Another harmful effect is that people may become aware of being monitored, which in turn can affect their behaviour. A prominent example of this is the PRISM scandal, involving Edward Snowden. Since this incident, many people use fewer sensitive terms related to the U.S. government on search engines and avoid the term ‘therapy’ (Marthews & Tucker, 2014). Phelps, Nowak and Ferrel (2000) have found that 45% of the subjects in their experiment were very concerned about the ways in which companies address their privacy, and most people want more control over the collection and processing of their data.

While most organisations have noticed this shift in privacy concerns, many have failed to mitigate these concerns, which has resulted in significant negative publicity and damaged trust, such as in the cases of Target (Duhigg, 2012), Office Max (Hoffman, 2014) and ING (Munsterman, 2014). A prominent example involves the well-known social media platform Facebook, which conducted a test with 700,000 uninformed users by exposing them to either positive or negative stories in their newsfeed (Bertolucci, 2014) (Claburn, 2014). Another recent example involving Facebook occurred when the social networking website received considerable negative publicity as users became increasingly aware that Facebook had sold their data to Cambridge Analytica, a procedure that users had previously been unaware of (Meredith, 2018). When breaches of data privacy transpire, serious damage may occur to the trust that customers, investors and employees have in organisations. Ponemon (2009) calculated a financial cost of over $200 per privacy violation per damaged customer. These costs consist of detection, notification and remediation efforts (Ponemon, 2009). In the case of Facebook, 8% of users deleted their accounts (Meredith, 2018).

Due to the media attention generated by privacy incidents, customers are questioning their own privacy (Hoffman, 2014). People have become more aware of their privacy and now demand that their data are dealt with in an ethical and transparent manner. Internet users want the power to opt in or out, choose whether data are processed anonymously and receive transparency regarding the types of data processed (Rose, Barton, Souza, & Platt, 2013). Thus, while organisations must cope with growing privacy concerns and the diminishing trust of customers, they are also inadequately aware of the laws and restrictions concerning data privacy. Each year, new regulations concerning data handling arise (e.g., the General Data Protection Regulation), but few organisations view customer privacy concerns as a major issue or are willing to invest large sums of money to reduce these concerns (Chen, Chiang, & Storey, 2012). Moreover, these organisations prefer to seek methods to bypass laws regarding privacy protection (Dijkstra, 2014).

While there have not been many lawsuits regarding the unauthorised use of data, organisations that misuse data risk large fines (under the GDPR, up to €20 million or 4% of annual worldwide turnover, whichever is higher), reputational damage or legal prosecution. The first relatively large fine was received by TalkTalk, which was fined £400,000 after 157,000 personal account details had been stolen the previous year. If the same data breach occurred post-GDPR, the fine would be closer to £70 million. Therefore, from an economic standpoint, it is now crucial to protect personal data and comply with privacy regulations. Beyond the financial consequences, decreasing customer trust can be even more devastating. After the breach, TalkTalk’s share price fell, the company was fined and it began to lose customers. By May 2016, TalkTalk’s profits had been reduced by half, and 98,000 users had left the company (Gaskell, 2017).
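The fine ceiling follows a simple rule. Under Article 83(5) of the GDPR, the upper tier of administrative fines is capped at €20 million or 4% of total worldwide annual turnover of the preceding financial year, whichever is higher. The sketch below works through that arithmetic; the function name and the turnover figures are hypothetical and do not refer to any company discussed here.

```python
def gdpr_upper_fine_cap(annual_turnover_eur,
                        fixed_cap_eur=20_000_000,
                        turnover_rate=0.04):
    """Return the upper-tier GDPR fine ceiling in euros:
    the higher of the fixed cap and the turnover-based cap."""
    return max(fixed_cap_eur, turnover_rate * annual_turnover_eur)


# Hypothetical turnovers, for illustration only.
small_firm_cap = gdpr_upper_fine_cap(100_000_000)    # 4% is EUR 4m, so the fixed cap applies
large_firm_cap = gdpr_upper_fine_cap(2_000_000_000)  # 4% is EUR 80m, so the turnover cap applies
```

For smaller organisations the fixed amount is the binding ceiling, whereas for large organisations the turnover-based amount dominates. This is a ceiling only; the fine actually imposed depends on the circumstances of the case.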

(10)

CONTRIBUTION

According to a 2011 TNS survey of European inhabitants, 95% agree that personal privacy should be protected, and 96% agree that users should be warned of the disclosure of personal data (TNS Opinion & Social Avenue Herrman Debroux, 2011) (Dobbs, Manyika, Roxburgh, & Lund, 2011) (Dijkstra, 2014). However, despite these privacy concerns, decreasing trust in organisations and knowledge of the potential danger, a majority of people continue providing personal data. Although this ‘privacy paradox’ (people continuing to provide personal data despite the risk) persists, customers’ growing privacy concerns and increasing distrust are slightly affecting people’s purchasing behaviour.

According to the National Cyber Security Alliance (2017), people are increasingly aware of the power they possess to increase the protection of their privacy. As seen in Figure 1, consumers have begun to signal that companies are asking for too much information that is not required for the requested services or products. According to Nam et al. (2006), users now share false personal data more often and sometimes abort a product purchase or online transaction because the benefits offered do not outweigh the privacy costs. People with few or no privacy concerns are more willing to provide accurate and current data than are those forced or deceived into disclosing personal data. Data obtained through force or deception are therefore less reliable and can be hazardous to base strategic and marketing decisions upon (Nam, Song, Lee, & Park, 2006).

FIGURE 1. BUSINESS IMPACT OF PRIVACY CONCERNS

As such, a global need exists for organisations to look beyond the regulations and minimal requirements set by law (consent and reporting of data breaches) and instead plan for the realisation of initiatives that deal with data storage, sharing and transmission in a compliant, transparent and, most importantly, satisfactory manner (Raymond, 2013). This would restore trust, decrease privacy concerns and make people less reluctant to provide data.

Consumer data are the figurative lifeblood of many organisations. Companies must earn profits for shareholders but are also required to do so ethically (Crane & Matten, 2010). This can be done by (1) using data only when receiving consent and (2) complying with regulations and constructing a model that is beneficial to all parties. From an instrumental stakeholder perspective, it could be argued that there is ‘nothing wrong’ with utilising data because it adds value to an organisation. Conversely, the opposing side may question whether it is ethical and fair to use data without consent. Therefore, a discussion has arisen about whether organisations should adopt a ‘what is not known does not hurt’ attitude, bearing in mind both that using data without consent could damage organisations by creating mistrust and that customers possess great power in influencing and spreading perceptions of organisations, as well as influencing the media (Labrecque, J., Mathwick, Novak, & Hofacker, 2013). On the other hand, one might argue that organisations should adopt socially responsible strategies that project a positive image and thus positively influence reputations and financial outcomes (Aguinis & Glavas, 2012).

Figure 1 content (NCSA, 2017). Because companies ask for too much information:

51% have not clicked on an ad
44% withheld personal information
32% have not downloaded an app/product
28% stopped an online transaction
36% have stopped using a website
29% have stopped using an app

As a result, 74% of people worldwide have limited their online activity due to privacy concerns.

(11)

This thesis investigates the potential benefits of providing personal data and weighs them against the corresponding disadvantages. If an organisation can provide the proper benefits, comply with legislation and, in doing so, gain customer trust and reduce privacy concerns, it can satisfy consumers and remain data-driven. The key aim of this thesis is to investigate what drives people to want to provide different types of data. This research makes two primary contributions to the field of data management and privacy. First, whereas most previous studies have incorporated drivers of willingness to provide personal data separately, this study proposes and empirically tests a model that combines several antecedents of this willingness. Second, while previous research has focused on individual opinions regarding the provision of personal data, this research investigates the trade-offs users make when asked to provide personal data.

RESEARCH QUESTION

Based on a review of existing literature, a research question is developed focusing on prominent factors which could predict the willingness to provide personal data:

In what way do monetary incentives, data types and personalised offers, moderated by privacy concerns, anonymity and trust, influence the willingness to provide personal data?

(12)

LITERATURE REVIEW

As stated above, a growing proportion of the population is aware that their personal data can be collected, processed, shared and even stolen for financial gain by unknown members of organisations or by criminals. Thus, consumers have begun to protect their privacy by refusing to provide data or even by providing false information. Yet organisations still require customer data to make efficient strategic decisions. In assessing the value of data for companies, Dobbs et al. have calculated that in 2011, the total value of data collected had already reached €250 billion for Europe alone (Dobbs, Manyika, Roxburgh, & Lund, 2011). Previous research has also shown that because some organisations collect data without consent and lack transparency, people have become less willing to share their data when they are concerned about privacy or perceive it to be at risk (Myerscough, Lowe, & Alpert, 2006) (Wu, Huang, Yen, & Popova, 2012). Given the value and profitability of data and organisations’ dependence on them, the increasing reluctance to provide personal data constitutes a huge risk for organisations, which have begun to ponder what (de)motivates people to provide data.

To establish a method through which people willingly provide their data, research has identified factors that may mitigate these privacy concerns. The literature mentions several factors that drive decisions whether to provide personal data; unfortunately, it is not consistent on these factors. Over the years, motivations to provide data have differed. Research has concluded that before 2000, focusing on the benefits of providing data positively influenced the willingness to provide data. For example, Hughes (1994) and Jackson and Wang (1994) have shown that organisations that create promotions and reward programmes, customise advertising and promotion strategies and implement highly targeted direct-mail campaigns receive less opposition when acquiring personal data than organisations that do not do so (Hughes, 1994) (Jackson & Wang, 1994).

As demonstrated by empirical research, before 2000, customers were willing to provide personal data in return for more personalised service and participation in loyalty and reward programmes (Acquisti & Grossklags, 2005). Research conducted between 2005 and 2015 demonstrated that, at that time, providing incentives and building trust significantly impacted willingness to provide personal data (Tsai, Egelman, Cranor, & Acquisti, 2011). During this period, incentives did not need to be monetary but could instead be, for example, a discount or gift. Research conducted by Tsai et al. (2011) has concluded that when sufficient incentives for providing data exist that correspond with the correct target group, people do want to share personal data. Since 2015, people have become more aware of the value of their data and are not as easily persuaded to share them as was previously the case. Benndorf and Normann (2017) have found that people now want (a part of) the value of their personal data, and that they have become more aware that personal data can be traced back to them. They also want the assurance that the data are processed anonymously (Benndorf & Normann, 2017).

Of the factors discussed in recent literature that impact the willingness to provide data, monetary incentives, the sensitivity of the data and personalised offers are the most relevant. This thesis investigates these three most prominent factors to identify their effects on the willingness to provide personal data. In the following chapters, the conceptual model is first explained, which demonstrates the relation between three independent variables, three moderators and one dependent variable. Subsequently, each variable is explained and substantiated with empirical and conceptual theory, followed by corresponding hypotheses.
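The structure just described (three independent variables, three moderators and one binary dependent variable) can be written compactly as a logistic model with interaction terms. The following is only a sketch under assumed notation: the symbols $Y$, $X_i$, $M_j$ and the coefficients are illustrative and do not appear in the thesis, and the full set of interaction terms is shown for generality rather than as the exact set of hypotheses tested.

```latex
\[
\operatorname{logit}\,\Pr(Y = 1)
  = \beta_0
  + \sum_{i=1}^{3} \beta_i X_i
  + \sum_{j=1}^{3} \gamma_j M_j
  + \sum_{i=1}^{3} \sum_{j=1}^{3} \delta_{ij}\, X_i M_j
\]
```

Here $Y$ indicates willingness to provide personal data, $X_1, X_2, X_3$ stand for monetary incentives, data types and personalised offers, and $M_1, M_2, M_3$ for privacy concerns, anonymity and trust. In this notation, a direct-effect hypothesis concerns the sign of a $\beta_i$, and a moderation hypothesis concerns the sign of a $\delta_{ij}$.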

(13)

CONCEPTUAL MODEL

FIGURE 2. CONCEPTUAL MODEL

DEPENDENT VARIABLE: WILLINGNESS TO PROVIDE PERSONAL DATA

The outcome variable represents whether participants are willing to provide their personal data. To receive a certain product or service, it is often required to provide data, so people are forced to choose between providing or not providing data. In this study, the dependent variable examines whether people are willing to provide their data and if so, which factors motivate them to make a certain choice.

INDEPENDENT VARIABLE: MONETARY INCENTIVES

Limited research has been conducted on the willingness to provide personal data when receiving monetary incentives. Nevertheless, it can be assumed that the willingness to provide something will be substantially higher in incentivised settings. In 1997, Nowak and Phelps found that the willingness to share personal data increased when a subject received an incentive (Nowak & Phelps, 1997). Three years later, Phelps et al. (2000) undertook further research under different circumstances and also concluded that incentives could neutralise privacy concerns and increase customer willingness to provide personal data. Research by Sayre and Horne (2000) has shown that when incentives are offered in a retail setting, privacy concerns are not an issue. In this study, participants were asked to divulge gender and age (demographic data) and shopping habits (socio-economic data), and in return, the participants received price discounts on products. The study found that participants were cognisant of sharing their data but made the trade-off due to the potential benefits and shared their personal data (Sayre & Horne, 2000). In another study, students were allowed to buy DVDs at two different stores. One store was slightly cheaper but asked for more personal data. Almost all participants chose the cheaper store and thus were willing to share their data for a discount. Furthermore, a more recent study, conducted by Benndorf and Normann (2017), has concluded that most subjects are willing to sell personal data for a monetary incentive. Accordingly, a direct relationship between monetary incentives and the willingness to provide personal data is hypothesised as follows:

H1: Monetary incentives are positively related to the willingness to provide personal data. In other words, it is expected that higher monetary incentives result in an increase in the willingness to provide personal data.

[Figure 2 shows the independent variables (monetary incentives, data types and personalized offers), the moderators (privacy concerns, anonymity and trust) and the dependent variable (willingness to provide personal data), linked by hypotheses H1 to H14.]

(14)

When providing subjects with monetary incentives, it is important to question what the monetary equivalent of safely protected data is; or, more specifically, what the value is of providing one’s data without knowing their eventual use. One potential angle is to consider both the amount of money at which a third party can sell the data and how people estimate the value of their data. In a study conducted in Scandinavia, only about one out of six subjects overestimated the value of their data. However, Acquisti and Grossklags (2005) have concluded that when people are going to receive a monetary incentive for their personal data, they make demands that exceed the data’s value (Acquisti & Grossklags, 2005). In such a scenario, people who become aware of the value of their data will demand a relatively high price. So, while it can be concluded that customers are willing to share their data in return for incentives, prior research has not clearly defined the specific value of data. Entering a search string into Google displays a maximum value of $1 per data set, and at the time of writing (01-03-2018), according to data auction websites, the writer’s personal data from social media platforms can be sold for approximately $0.10.

Another experiment by Mcnerney (2016) has demonstrated that people are willing to pay more for privacy protection than what they would receive for providing their personal data. In this experiment, people were offered either a product with high privacy protection or a cheaper product with lower privacy protection. Almost all of the subjects chose the expensive product with high privacy protection. This illustrates that people ask more money for allowing their data to be public than they are willing to pay to keep their data private. This is called the endowment effect, a classical bias from behavioural economics (Mcnerney, 2016). This could imply that when participants trust the data collector with handling their privacy or the data are anonymous, they are more willing to provide personal data or demand a lower monetary incentive, compared to when less trust or distrust exists. Also, when relating the anonymity of data to a monetary incentive, prior research has concluded that subjects do not demand higher monetary incentives for providing personally identifiable data (Schudy & Utikal, 2015) but are more reluctant to share the data (Beresford, Kubler, & Preibusch, 2012). Accordingly, this could involve an indirect relationship between monetary incentives and the willingness to provide personal data as moderated by privacy concerns, anonymity and trust. As such, hypotheses regarding these indirect relationships are formulated as follows:

H2: The relationship between monetary incentives and the willingness to provide personal data is positively moderated by privacy concerns. More specifically, monetary incentives are expected to demonstrate stronger positive effects on the willingness to provide personal data when people have more privacy concerns. In other words, when people are more concerned about their privacy, they require higher monetary incentives for providing their personal data.

H3: The relationship between monetary incentives and the willingness to provide personal data is positively moderated by anonymity. More specifically, monetary incentives are expected to demonstrate weaker positive effects on the willingness to provide personal data in the case of anonymous data, compared to personally identifiable data. In other words, when their processed data are anonymous, people require lower monetary incentives for providing it.

(15)

H4: The relationship between monetary incentives and the willingness to provide personal data is positively moderated by trust. More specifically, monetary incentives are expected to demonstrate weaker positive effects on the willingness to provide personal data when customers have more trust in the data collector, compared to less trust or distrust. In other words, when there is trust in the data collector, people require lower monetary incentives for providing their personal data.
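Hypotheses H2 to H4 describe moderation: the effect of the monetary incentive on willingness depends on the level of a third variable. In a logistic model this is commonly expressed with an interaction term. The sketch below is illustrative only; the function names, the 0/1 coding of the moderator and all coefficient values are hypothetical and are not estimates from this study.

```python
import math


def willingness_probability(incentive, moderator, b0, b_inc, b_mod, b_int):
    """Predicted probability of providing data under a logit model
    with a single incentive x moderator interaction term."""
    log_odds = (b0 + b_inc * incentive + b_mod * moderator
                + b_int * incentive * moderator)
    return 1.0 / (1.0 + math.exp(-log_odds))


def incentive_effect(moderator, b_inc, b_int):
    """Change in log-odds per unit of incentive at a given moderator level."""
    return b_inc + b_int * moderator


# Hypothetical coefficients, chosen only to illustrate the mechanics.
B0, B_INC, B_MOD, B_INT = -1.0, 0.8, 0.4, -0.5

# Moderator coded 0 = low trust, 1 = high trust (an assumed coding).
effect_low_trust = incentive_effect(0, B_INC, B_INT)
effect_high_trust = incentive_effect(1, B_INC, B_INT)

# Predicted probabilities at one unit of incentive for each trust level.
p_low_trust = willingness_probability(1, 0, B0, B_INC, B_MOD, B_INT)
p_high_trust = willingness_probability(1, 1, B0, B_INC, B_MOD, B_INT)
```

With a negative interaction coefficient, the incentive effect is smaller at the higher moderator level, which is the pattern that the "weaker positive effects" wording in H3 and H4 describes. A positive interaction coefficient would correspond to the "stronger positive effects" wording in H2.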

INDEPENDENT VARIABLE: DATA TYPES

A primary method for becoming data-driven is through combining data types. The four most commonly acquired personal data categories, whether provided voluntarily or involuntarily, are contact data (e.g., name, address, e-mail and telephone number), financial data (e.g., credit card number, account number and financial details), demographic data (e.g., gender and age) and socio-economic data (e.g., marital status, employer, and income and education levels).

According to research conducted by TNS, financial data are viewed as being the most sensitive (TNS Opinion & Social Avenue Herrman Debroux, 2011). TNS (2011) has also found that financial data (75%) are followed in perceived sensitivity by medical data (74%) and national identity numbers (73%). This finding is logical, since a person whose card numbers fall into the hands of criminals could become the victim of fraud. But, in addition to the sensitivity and risk of fraud, this information represents the most valuable data for organisations. It displays financial history, where people have been, what they have purchased and thus used, their activities and habits and the medications they have purchased. Moreover, Ernst and Young’s survey (1999) found that the primary reason that consumers do not purchase on the internet is their concern about sending credit card information. Graeff and Harmon (2002) found that only 25% of the respondents to their survey used their credit cards online without worrying about privacy. Based on a review of the literature on data types, the following statements are hypothesised regarding the direct and indirect relationships between data types and the willingness to provide personal data:

H5: Data types are negatively related to the willingness to provide personal data. In other words, it is expected that the type of the data results in a decrease in the willingness to provide personal data.

H6: The relationship between data types and the willingness to provide personal data is negatively moderated by privacy concerns. More specifically, the types of data are expected to demonstrate stronger negative effects on the willingness to provide personal data when people have more privacy concerns. In other words, when people have more privacy concerns, they are more reluctant to provide specific types of data, compared to people who have fewer privacy concerns.

Research conducted in Australia by the Office of the Federal Privacy Commissioner (Office of The Federal Privacy Commissioner, 2001) has found that 59% of respondents are unwilling to share their financial data. The primary reason for not wanting to share financial data is the high risk of this kind of data being misused. This illustrates the importance of anonymising data. Moreover, research in the USA has supported the finding that customers are more sensitive about sharing financial data than other types of data; however, it also showed that if the data are not retraceable to individuals, the type of data has less effect (Cranor, Reagle, & Ackerman, 1999) (Phelps, D'Souza, & Nowak, 2001). Therefore, a lack of anonymity when collecting data may cause a decrease in willingness to provide financial data. Based on a review of the literature on data types, the following hypothesis is developed regarding the indirect relationship between data types and the willingness to provide personal data:

H7: The relationship between data types and the willingness to provide personal data is positively moderated by anonymity. More specifically, the types of data are expected to demonstrate weaker positive effects on the willingness to provide personal data in case of anonymous data, compared to personally identifiable data. In other words, when their processed data are anonymous, the type of data has less of an effect on people’s willingness to provide personal data.

Furthermore, the literature demonstrates that when people trust the data collector, they are more willing to provide data, regardless of the type of the data (Schoenbachler & Gordon, 2002). This prompts the following hypothesis:

H8: The relationship between data types and the willingness to provide personal data is positively moderated by trust. More specifically, the types of data are expected to demonstrate weaker positive effects on the willingness to provide personal data when customers have more trust in the data collector, compared to less trust or distrust. In other words, when there is trust in the data collector, the type of data has less of an effect on people’s willingness to provide personal data.
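One common way to read the moderation hypotheses is as an interaction term in a linear utility function. The sketch below is an illustration only and is not necessarily the specification estimated in this thesis; the symbols are assumptions introduced here, with $x$ standing for a predictor (such as data type), $m$ for a moderator (such as trust or anonymity) and $U$ for the utility of providing personal data.

```latex
U = \beta_0 + \beta_x x + \beta_m m + \beta_{xm}\,(x \cdot m) + \varepsilon,
\qquad
\frac{\partial U}{\partial x} = \beta_x + \beta_{xm}\, m
```

Under this reading, and assuming $m \geq 0$, a negative main effect ($\beta_x < 0$) combined with a positive interaction ($\beta_{xm} > 0$) means that the effect of $x$ moves towards zero as $m$ increases, which matches the verbal statement that the type of data has less of an effect when trust or anonymity is present.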

INDEPENDENT VARIABLE: PERSONALISED OFFERS

As outlined in the introduction, people currently worry more about privacy than they did in the past. But, paradoxically, this increasing level of privacy concern has not prevented people from providing their data when receiving the right benefits. Also, people do not act in protest against organisations processing large sets of data. Studies have found, for example, that many people are disappointed when they do not receive customised service or product opportunities that organisations can only provide by processing personal data (Goodwin, 1991).

More specifically, Hoffman (2014) has found that people are less willing to provide personal data if there is a lack of a ‘mutually beneficial relationship’. In such a relationship, people receive access to certain products or more personalised content when they provide their data (Hoffman, 2014). Phelps et al. (2000) have confirmed this theory by concluding that people are more willing to provide personal data if benefits such as personalised service, time savings and relevant products are presented (Phelps, Nowak, & Ferrel, 2000). This set of benefits, as formulated by Phelps et al. (2000), is henceforth referred to as personalised offers. Accordingly, a direct relationship between personalised offers and the willingness to provide personal data is hypothesised as follows:

H9: Personalised offers are positively related to the willingness to provide personal data. In other words, it is expected that more personalised offers result in an increase in the willingness to provide personal data.

However, besides simply receiving personalised offers, it is necessary that people also become aware that they have received such offers. Graeff and Harmon (2002) have found that if consumers are made aware of the benefits that they will receive, 25% are willing to provide personal data (Graeff & Harmon, 2002). Another study has concluded that if consumers know that personal data collected is used for building relationships in which they benefit from the transfer of goods and services and more customised service, their privacy concerns are reduced, and they become more willing to provide data. Therefore, a trade-off is present between received benefits and providing data; logically, it follows that if more privacy concerns exist, and the data are more personally identifiable, people require a relatively higher number of personalised offers. Based on these assumptions, hypotheses are developed regarding the indirect relationships between personalised offers and the willingness to provide personal data:

H10: The relationship between personalised offers and the willingness to provide personal data is positively moderated by privacy concerns. More specifically, personalised offers are expected to demonstrate stronger positive effects on the willingness to provide personal data when people have more privacy concerns, compared to people who have fewer privacy concerns. In other words, when privacy concerns exist, people require more personalised offers when providing personal data.

H11: The relationship between personalised offers and the willingness to provide personal data is positively moderated by anonymity. More specifically, personalised offers are expected to demonstrate weaker positive effects on the willingness to provide personal data in case of anonymous data, compared to personally identifiable data. In other words, when the data are anonymous, people require fewer personalised offers when providing personal data.

MODERATOR: PRIVACY CONCERNS

The idea of the ‘right to privacy’ dates back to 1890, when the camera prompted fears of the images being stolen. At that time, privacy was available only to the wealthy; the poor could not afford it (Ismail, 2017). Today, privacy is widely understood and affects everyone. It can be described as the extent to which participants perceive that their data collectors, or agents thereof, have failed to ensure participants’ privacy (Cheskin Research and Studio Archetype/Sapient, 1999). Nearly everyone has become familiar with privacy, often in a negative context. Research conducted by TNS (2010) has found that 70% of Europeans worry that their personal data are used for purposes other than those for which they were originally collected (TNS Opinion & Social Avenue Herrman Debroux, 2011). To illustrate the topicality of privacy, five students in the Netherlands challenged a new law that extended the possibilities of tapping personal data for the Dutch intelligence and security services (AIVD/MIVD). These students proposed a referendum, and a majority of the Dutch population were against the proposed change to the law.

Since the moment that organisations began to collect personal data, security and privacy breaches have been the primary barriers to people providing their personal data (Bush, Bush, & Harris, 1998). A privacy breach is defined as the unauthorised collection, disclosure or other use of personal information (Wang, Lee, & Wang, 1998). In previous years, privacy breaches were uncommon; nonetheless, although they were rare, multiple surveys conducted in the past found that customers were already concerned about their privacy (Phelps, D'Souza, & Nowak, 2001). These customers worried about how organisations collected and used their data and the accuracy of the information used (Katz & Tassone, 1990) (Equifax-Harris, 1996). Today, privacy breaches occur increasingly often, and as a consequence, people report in surveys and interviews that they would not voluntarily disclose personal data for marketing purposes (Acquisti, Brandimarte, & Loewenstein, 2015). Based on a review of the literature, the direct relationship between privacy concerns and willingness to provide personal data is presented in the following hypothesis:

H12: Privacy concerns are negatively related to the willingness to provide personal data. In other words, it is expected that a severe level of privacy concerns results in a decrease in the willingness to provide personal data.

Today, there are even data traders present online (e.g., Focum and Experian) who collect and sell data to third parties. Although these companies are required to comply with the legal framework, they are known for not being transparent. Bits of Freedom has discovered that data collectors amass more data than permitted. It is now nearly impossible to browse the internet anonymously; Rust et al. (2002) had already predicted that the internet would eventually cause individual privacy to disappear, and that a specialised market for privacy protection would emerge. In other words, they predicted that consumers would be forced to pay for a certain degree of privacy protection in the future (Rust, Kannan, & Peng, 2002).

MODERATOR: ANONYMITY OF DATA

Before the introduction of the internet and electronic finances, consumers understood the standard protection of their privacy. The internet has facilitated growth in the collection, processing and exchange of personal data in speed and scope. Now, some researchers argue that it is impossible to conduct business online without revealing personal information (Rust, Kannan, & Peng, 2002). This has led to a call for anonymity, which, in practical terms, means that data are not retraceable to individuals.

At the moment, data on nearly every action of individuals are collected without being made anonymous (organisations register names, addresses and contact details). In this light, the privacy concerns of people will likely be quite high. The phrase ‘will likely be’ is used because most organisations attempt to hide their data collection methods and do not make this process transparent. This decision appears to be logical, since Acquisti and Grossklags (2005) have shown that when openness regarding data collection methods is lacking, the willingness to provide data is reduced.

However, Benndorf and Normann (2017) have demonstrated that with anonymous data, privacy concerns are quite low, data are almost always provided (Benndorf & Normann, 2017) and many benefits of data may also be achieved without using personally identifiable data (Acquisti, Brandimarte, & Loewenstein, 2015). Notably, some scholars claim that it is impossible to provide complete anonymity because, even if information is pseudonymised (i.e., name, etc., is replaced with a randomly generated number), the individual can be traced when the data are pieced back together. Some limitations of providing complete anonymity are potential risks, such as an increase in criminal activities, tax evasion and money laundering, resulting from data being difficult to trace to individuals. Based on the literature, a relationship between anonymity and the willingness to provide personal data is hypothesised as follows:

H13: Anonymity is positively related to the willingness to provide personal data. In other words, when the data are anonymous, this results in an increase in willingness to provide personal data.

MODERATOR: TRUST

Information asymmetry exists between organisations that use data and the people who provide them. Because of this information asymmetry, having trust in the data collector is difficult. This is striking because without trust, most of the potential social and economic benefits of data sharing cannot be realised (Greenwood & Van Buren, 2010). Trust is well covered in the literature, where it is often posited that trust is of vital importance when deciding whether to share personal data. Multiple studies have shown that trust has a positive influence on open communication (Zand, 1972) (Mellinger, 1959) (Smith & Barclay, 1997) and that trust is a driver for success (Gallivan, 2001). The definition of trust is well captured by Mayer et al. (1995): ‘the willingness of a party to be vulnerable to the actions of another party based on the expectation that the other will perform a particular action important to the trustor, irrespective of the ability to monitor or control that other party’ (Mayer, Davis, & Schoorman, 1995).

A study conducted in 2014 by SDL found that 57% of respondents did not want to do business with organisations that collect personal data. However, 79% did not mind giving personal data to organisations with which they had conducted transactions in the past and therefore trusted more. Given that most interactions are with unfamiliar companies, trust becomes even more relevant.

Another study that researched personality traits relating to the shopping, surfing and information seeking behaviours of consumers found that consumers who have relatively less trust in data collectors are less willing to provide their personal data (Das, Echambadi, McCardle, & Luckett, 2003). Trust can also be beneficial in mitigating the perceptions of risk and insecurity (McKnight, Cloudhury, & Kacmar, 2002). A combined study executed by Cheskin Research and Studio Archetype/Sapient (1999) regarding consumer trust has concluded that trustworthiness results from consumer experience accumulated over time. To make this trustworthiness positive, Mayer et al. (1995) have found that positive trust can be established by having integrity and the ability to establish mutually beneficial relationships and demonstrating benevolence. Research has also shown that authorities and institutions are trusted more than commercial organisations (55%) (TNS Opinion & Social Avenue Herrman Debroux, 2011). This means it is even more essential for organisations that possess the most personal data (commercial organisations) to establish relationships based on trust if they wish to become or remain data-driven.

A feeling of distrust is primarily caused by a power imbalance between the people about whom data are collected and the organisations that have and use these data. People must rely on the trustworthiness of organisations to meet fairness obligations (i.e., organisations must handle their data with the utmost care) (Greenwood & Van Buren, 2010). The information asymmetries that provoke, by default, untrusting feelings from customers should be properly addressed by organisations if they want customers to willingly provide data. Dawkins (2013) argues that power asymmetries should be reduced so interactions can be enabled (Dawkins, 2014). Regarding the creation of trust, Rose et al. (2013) argue that ‘the organizations that excel at creating Trust should be able to increase the amount of customer data they can access by at least five to ten times in most countries’. In addition, Smith et al. (2011) have concluded that once a relationship of trust has been established, this trust can be a competitive advantage, because these organisations are perceived as safer or more trustworthy concerning privacy dimensions (Bowie & Jamal, 2006). As identified earlier, organisations that are trusted by customers receive the benefit of customers’ greater willingness to provide their information (Schoenbachler & Gordon, 2002). Therefore, the hypothesis regarding trust is as follows:

H14: Trust is positively related to the willingness to provide personal data. In other words, when people trust their data collector, this results in an increase in willingness to provide personal data.

CONTROL VARIABLES: SOCIO-DEMOGRAPHIC AND SOCIO-ECONOMIC FACTORS

Finally, this thesis conducts research regarding how demographic and socio-economic factors influence the willingness to provide personal data. The demographic factors consist of gender and age and the socio-economic factors consist of education level and net income. These factors are utilised since it is common to distinguish target groups based on these criteria, and so it will be easier to compare the results to existing and future research. Likewise, it follows that these factors impact the willingness of people to provide their personal data. Furthermore, existing literature has demonstrated that in comparison to female consumers, male consumers show lower privacy concerns and tend to be more willing to provide their data (Graeff & Harmon, 2002). In addition, Gervey and Lin (2000) have concluded that younger people tend to have more positive views on sharing data for marketing purposes than do older people (Gervey & Lin, 2000). This research controls for differences in the willingness to provide personal data between genders, age groups, educational groups and income groups.

RESEARCH DESIGN

This study involves sampling and analysing which conditions encourage people to provide their personal data. The literature review demonstrates that monetary incentives, personalised offers and type of data are potential predictors for the willingness to provide personal data. Also, this study includes controls to determine if privacy concerns, trust in the data collector and anonymity of the data moderate these predictors and have a direct effect on the willingness to provide personal data.

The data used to determine the importance of each variable are acquired by distributing a survey. The survey method is chosen due to its convenience in acquiring an acceptable data set (>150) that is appropriate for this cross-sectional study. The survey consists of three parts. The first section is a choice-based conjoint experiment regarding how the three independent variables (monetary incentives, data types and personalised offers) influence willingness to provide personal data. The choice experiment consists of full-profile choice-based questions. The second section of the questionnaire consists of five-point Likert Scale questions anchored by ‘strongly agree’ and ‘strongly disagree’ to assess respondents’ thoughts on the three moderators: privacy concerns, trust and anonymity. The third section requests personal details that serve as control variables. This structure is utilised so that the most relevant and intellectually challenging questions are at the start of the questionnaire and all variables of the conceptual model are tested. The complete questionnaire can be found in Appendix 1.

RESEARCH METHODOLOGY

CHOICE-BASED CONJOINT EXPERIMENT

In this explanatory research, a choice-based conjoint experiment (also referred to as a choice experiment or the choice method) is used to determine the trade-offs a person makes when providing their personal data, with the goal of assigning values to the range of alternatives related to providing such data. Armed with this knowledge, organisations can focus on the most important drivers for choosing whether to provide personal data and compile services that positively affect this choice. A choice-based conjoint analysis is utilised because it represents what people experience in daily life: choosing a preferred option from a group of alternatives. In addition, the task is relatively simple and natural and is easy for respondents to understand.

To conduct a choice-based conjoint analysis, the drivers of providing personal data are divided into attributes and levels. An ‘attribute’ is a characteristic of a factor, and a ‘level’ represents the value of an attribute. In this study, the attributes consist of the benefits (monetary incentives and personalised offers) and disadvantages (use of sensitive data) of providing personal data. These attributes are chosen in part because the literature shows them as among the most relevant (de)motivators and also because these attributes are completely independent of each other. By filling out the choice experiment, participants evaluate the alternatives and choose the most preferred alternative (Aizaki, 2012). Table 1 depicts an overview of the attributes and corresponding levels.

First, a choice experiment design that conforms to the L^MA method is created (see Table 2). Before creating this design, it is crucial to decide on the number of blocks, number of questions per block, number of alternatives per choice set and number of attributes per alternative. Overloading the survey with extraneous elements would make it more difficult for respondents to identify their preferred attributes and levels. Too many attributes in one alternative would make the survey harder to read and may lead to fatigue. To minimise the psychological burden for respondents, it is decided that each choice set consists of two alternatives (‘Alternative 1’ and ‘Alternative 2’) and an ‘opt-out alternative’. ‘Alternative 1’ and ‘Alternative 2’ each consist of three attributes.

The attribute monetary incentive consists of four levels: ‘0.00’, ‘5.00’, ‘10.00’ and ‘15.00’. To determine these four levels, A/B testing is utilised to discover the most appropriate monetary incentive. A comparison is conducted between the level set ‘0.00’, ‘5.00’, ‘10.00’ and ‘15.00’ and the level set ‘0.00’, ‘2.00’, ‘4.00’ and ‘6.00’. This A/B testing is necessary because practical and scientific research is not unambiguous about the value of personal data. The results demonstrate that respondents are more willing to provide different types of data when the monetary incentive is higher. The higher levels for the attribute monetary incentive (‘0.00’, ‘5.00’, ‘10.00’ and ‘15.00’) are utilised to allow for distinctions between the specific values in the ranges of data types and personalised offers and to ensure that respondents do not feel offended if their data generate a low incentive.

The attribute data type consists of four levels: ‘contact data’, ‘financial data’, ‘demographic data’ and ‘socio-economic data’. Four different types are included to test whether people distinguish between different types of data and therefore perceive one data type as more sensitive than another. The option of combining different data types into bundles is purposefully not chosen because this makes comparison more difficult, and the goal of this research is to determine what makes people more willing to provide their data. The data types are chosen so that each type can be understood as more sensitive (financial data) or less sensitive (contact data and demographic data) and anonymous (financial data and demographic data) or personally identifiable (contact data).

In return for sharing a specific data type, the respondent can receive personalised offers, which comprise a set of potential benefits: namely, experiencing more personalised service when contacting an organisation, which saves time during contact by not needing to be identified, and being presented with relevant products. This final attribute, personalised offers, consists of two levels: ‘receive no personalised offers’ and ‘receive personalised offers’.

As previously explained, the third alternative is an ‘opt-out’ (i.e., ‘I wouldn’t choose any of these’). By choosing this alternative, respondents can express their lack of interest. This can identify groups of respondents who are not interested in or willing to provide their personal data, regardless of potential benefits.

TABLE 1. Conjoint Factors and Levels

| Attributes | Levels |
|---|---|
| Monetary Incentive | € 0,-; € 5,-; € 10,-; € 15,- |
| Type of Data | Financial Data (Spending Behaviour); Contact Data (Home address, E-mail address, Telephone number); Socio-Economic Data (Income, Education Level, Occupation); Demographic Data (Gender, Age, Marital Status) |
| Personalized Offers | Receive no personalized offers; Receive personalized offers |
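To illustrate why a reduced design is used rather than showing every possible combination, the levels in Table 1 can be enumerated as a full factorial. The following is a minimal sketch; the variable and function names are hypothetical, and the level labels are shortened versions of those in Table 1.

```python
from itertools import product

# Attribute levels as listed in Table 1 (labels shortened; names are illustrative).
ATTRIBUTES = {
    "monetary_incentive": [0, 5, 10, 15],  # euros
    "data_type": ["financial", "contact", "socio-economic", "demographic"],
    "personalised_offers": [False, True],
}


def full_factorial(attributes):
    """Return every possible profile as a dict of attribute -> level."""
    names = list(attributes)
    return [dict(zip(names, combo)) for combo in product(*attributes.values())]


profiles = full_factorial(ATTRIBUTES)
print(len(profiles))  # 4 * 4 * 2 = 32 possible profiles
```

With 32 possible profiles and two alternatives per choice set, asking respondents about every pairing would be impractical, which is why a smaller, balanced subset of choice sets is selected.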

To ensure an acceptable psychological load and retain respondents’ concentration, it is decided to ask ten choice questions. In addition, the same choice questions are provided to each participant, thus necessitating only one block and facilitating data comparison.

TABLE 2. CHOICE SETS

The choice experiment is introduced via a scenario, which explains that people have the option to sell different types of their personal data, via their bank, to a single third party for one year. It is emphasised that the information will not be sold to other parties and participants can choose to not disclose their personal data. The purpose of beginning the choice experiment with a scenario is to 1) explain how the choice experiment works and 2) make the survey resemble a real-life experience as closely as possible, allowing people to relate to the situation (see the complete questionnaire in Appendix 1).

Normally, a choice experiment design, according to the predefined characteristics, should consist of a total of 16 choice questions (Johnson, Kanninen, Bingham, & Ozdemir, 2007). Sixteen choice questions, combined with the latter parts of the questionnaire, can place a heavy psychological burden on the respondent. Therefore, the choice set is reduced into one block of ten questions, rendering the psychological burden acceptable; completing the survey takes no longer than approximately 15 minutes, so it is unlikely participants will lose interest (Louviere, Hensher, & Swait, 2000).

To reduce the number of questions from 16 to 10, several questions are removed. Questions are removed on the basis of two criteria: 1) one alternative within a choice question is clearly (100%) favoured above the other alternatives, so that the choice set is ‘too easy’, and 2) two choice questions are more than 80% similar. Because the choice design is assembled using the L^MA method, each choice set is selected using a computer algorithm that ensures that the choice sets as a whole are well-balanced and nearly orthogonal. As seen in Tables 1 and 2, the orthogonal main-effect array is used to create each choice set so that it consists of two alternatives of three attributes, with one two-level and two four-level attributes. Each row of the array corresponds to an alternative of the choice question. The Appendix includes an example of a choice question.
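The second removal criterion can be illustrated with a small sketch. How similarity was actually measured is not specified here, so the code below assumes one plausible definition: the share of attribute levels that two choice questions have in common, compared position by position. The function name and the example questions are hypothetical.

```python
def similarity(question_a, question_b):
    """Share of attribute levels that two choice questions have in common.

    Each question is a list of alternatives; each alternative is a tuple
    (monetary_incentive, data_type, personalised_offers). Alternatives are
    compared position by position, which is an assumption of this sketch.
    """
    cells_a = [level for alternative in question_a for level in alternative]
    cells_b = [level for alternative in question_b for level in alternative]
    if len(cells_a) != len(cells_b):
        raise ValueError("questions must have the same shape")
    matches = sum(a == b for a, b in zip(cells_a, cells_b))
    return matches / len(cells_a)


# Two hypothetical choice questions that differ in a single attribute level.
q1 = [(5, "contact", True), (10, "financial", False)]
q2 = [(5, "contact", True), (10, "financial", True)]
print(similarity(q1, q2) > 0.8)  # 5 of 6 levels match, so this pair is flagged
```

With two alternatives of three attributes each, a question pair has six cells, so under this definition only pairs that differ in at most one level exceed the 80% threshold.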

MEASUREMENT: MODERATORS

To determine participants’ opinions regarding privacy concerns, anonymity and trust, each participant is asked to indicate the extent to which they agree with several statements on a five-point Likert Scale, anchored from ‘strongly disagree’ to ‘strongly agree’. Counter-indicative items are used to check whether people are focused, thus allowing for the removal of uncommitted participants. To ensure that participants are not forced to choose, two columns are added to each Likert Scale question: ‘don’t know’ and ‘do not want to answer’. All variables are measured by at least three statements and score > 0.700 on Cronbach’s alpha: privacy concerns (0.722), trust (0.760) and anonymity (0.730). This indicates an acceptable level of internal consistency.
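For readers unfamiliar with the statistic, Cronbach’s alpha can be computed from the item variances and the variance of the summed scale. The sketch below uses the standard formula with sample variances; the function name and the example answers are hypothetical and are not the survey data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item columns.

    `items` holds one list of scores per statement; all lists share the
    same respondent order and the same length.
    """
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    # Sum score per respondent across all statements.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical five-point Likert answers (3 statements, 6 respondents).
example_items = [
    [4, 5, 3, 2, 4, 5],
    [4, 4, 3, 1, 5, 5],
    [5, 5, 2, 2, 4, 4],
]
alpha = cronbach_alpha(example_items)
print(round(alpha, 3))
```

Counter-indicative statements would need to be reverse-coded before being passed to such a function, and ‘don’t know’ or ‘do not want to answer’ responses would need to be excluded or otherwise handled first.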

MEASUREMENT: CONTROL VARIABLES

In the final segment of the questionnaire, personal details are requested to check for 1) willingness to disclose personal data and 2) control variables. Four questions are asked concerning participants’ age, gender and education and income levels.

MEASUREMENT: DEPENDENT VARIABLE: WILLINGNESS TO PROVIDE PERSONAL DATA

To determine whether participants are willing to provide personal data, the responses in the choice-based conjoint analysis and the answers to the control variable questions are analysed. If a participant chose either ‘Alternative 1’ or ‘Alternative 2’ on all ten choice sets and answered all control-variable-related questions, then this participant is labelled as willing to provide personal data. Analysis of the questionnaire results shows that 69 respondents (37.7%) are classified as willing to provide their personal data. This group provided their personal data when asked for demographic data (gender) and socio-economic data (education level/income) and always chose an alternative in the choice experiment rather than the option ‘I wouldn’t choose any of these’. More specifically, 155 respondents (84.7%) provided their personal data upon request, but only 75 respondents (41%) chose an alternative that involved providing data on every choice set rather than the opt-out. In contrast, a group of 19 respondents (10.4% of the participants) chose not to provide any personal data in the choice experiment by choosing ‘I wouldn’t choose any of these’ on every choice set. A more detailed overview of the analysis of the responses is displayed in Table 3.
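The labelling rule described above can be written as a short function. This is a minimal sketch: the encoding of the opt-out option and of unanswered control questions (`None`) is an assumption made for illustration, as are all names.

```python
OPT_OUT = "opt_out"  # hypothetical code for "I wouldn't choose any of these"


def is_willing(choices, control_answers):
    """Apply the labelling rule to one respondent.

    choices: the option picked on each of the ten choice sets.
    control_answers: answers to the control-variable questions, with None
        marking a question the respondent declined to answer (assumed encoding).
    """
    never_opted_out = all(choice != OPT_OUT for choice in choices)
    answered_all_controls = all(answer is not None for answer in control_answers)
    return never_opted_out and answered_all_controls


# Hypothetical respondents.
always_chooses = ["alt_1", "alt_2"] * 5
opts_out_once = ["alt_1"] * 9 + [OPT_OUT]
print(is_willing(always_chooses, [34, "female", "MSc", "30-40k"]))  # True
print(is_willing(opts_out_once, [34, "female", "MSc", "30-40k"]))   # False
print(is_willing(always_chooses, [34, "female", None, None]))       # False
```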

TABLE 3. Analysis DV: Willingness to Provide Personal Data

| Results Choice Experiment | # Respondents | # Respondents who Did not Provide Personal Data | # Respondents who Provided Personal Data | Total Respondents who are Willing to Provide Data |
|---|---|---|---|---|
| Respondents who chose the option "I wouldn't choose any of these" multiple times, but not on every choice set | 63 (34.4%) | 11 (17.5%) | 52 (82.5%) | |
| Respondents who chose the option "I wouldn't choose any of these" only once | 26 (14.2%) | 5 (19.2%) | 21 (80.8%) | |
| Respondents who only chose alternatives from the choice sets | 75 (41.0%) | 6 (8.0%) | 69 (92.0%) | 69 (37.7%) |
| Respondents who only chose the option "I wouldn't choose any of these" | 19 (10.4%) | 6 (31.6%) | 13 (68.4%) | |
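The percentages in Table 3 can be reproduced from the counts alone. The sketch below copies the counts from the table; the dictionary keys and helper name are illustrative labels, not terms from the questionnaire.

```python
# Counts copied from Table 3: (did not provide, provided) personal data
# in the final questionnaire section, per choice-experiment pattern.
TABLE_3 = {
    "opted out several times, not always": (11, 52),
    "opted out once": (5, 21),
    "never opted out": (6, 69),
    "always opted out": (6, 13),
}

total = sum(withheld + provided for withheld, provided in TABLE_3.values())


def pct(part, whole):
    """Percentage rounded to one decimal, as reported in the table."""
    return round(100 * part / whole, 1)


for label, (withheld, provided) in TABLE_3.items():
    group = withheld + provided
    print(f"{label}: n={group} ({pct(group, total)}%), "
          f"provided={provided} ({pct(provided, group)}%)")

# Respondents labelled as willing: never opted out and provided personal data.
willing = TABLE_3["never opted out"][1]
print(total, pct(willing, total))  # 183 37.7
```

The row totals sum to 183 respondents, which is the base for the 37.7% share of respondents classified as willing to provide personal data.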

SAMPLING TECHNIQUE

The population consists of inhabitants of the Netherlands who meet the following criteria: each participant is required to give consent and thus is freely willing to participate, is between 18 and 65 years old, has sufficient understanding of the English language and is fully committed to giving their best effort. These criteria are utilised because participants who meet these requirements potentially possess valuable personal data, provide their personal data (willingly or unwillingly) on a regular basis and have the capacity to complete the survey successfully. Convenience sampling is used by sending e-mail invitations with an anonymous link to the web-based survey.

To achieve a valid and diverse sample, the questionnaire was distributed to a total of 300 participants via an anonymous link, through my personal network and an online audience panel (provided by Qualtrics). Blending the audience panel with my own personal network reduces the chance of bias. Two hundred and forty-two participants replied, a response rate of approximately 80%. Several steps were taken before the start of the choice experiment to achieve this response rate and to ensure that participants did not misunderstand questions: a cover letter assuring respondents of the confidentiality of their answers, an explanation of the topic and an introductory section.

STRENGTHS AND LIMITATIONS

A strength of a choice-based conjoint analysis is the possibility of including ‘alternative-specific’ attributes. This allows the researcher to check the considerations of people when providing personal data.

A limitation of this study is that the questionnaire is shared online. This makes it available only to people who are likely to be familiar with the internet, and thus relatively accustomed to providing their personal data online. More specifically, the people who are present on social media are more knowledgeable about (online) privacy than those who use the internet less often. This may mean that they are more aware and thus have higher privacy concerns or that they are aware of the benefits and thus care less about their privacy. Another limitation is that participants only have to state whether they want to provide their personal data, and do not actually have to provide anything. As such, people can react differently because the consequences are fictional.

Another limitation of this research is that the precise amount for which people are willing to sell their financial data cannot be estimated. The study only presents the conditions under which people are willing to provide their data. Furthermore, the collected data are sparse. They show which drivers are most important to respondents, but not how much more important each driver is than the others. People are not asked to rank products; they instead consider and compare a few choices.

HANDLING MISSING DATA

To establish face validity, the questionnaire was sent to a test panel of 12 participants, six of whom are experts on data management. These participants verified whether the questionnaire is easily accessible and screened it for, among other issues, double-barrelled, confusing and leading questions. This ensures that participants do not fail to complete the questionnaire due to difficulty or lack of clarity. In addition, validation settings are included in the survey to ensure that the maximum possible data are received for analysis. A ‘force response’ option is included, which requires respondents to answer every question. This ensures a response rate of 100% on all questions because it prevents ‘skipping through’ the questionnaire and leaving fields blank. If respondents do not know the answer to a specific question or do not want to answer it, the options ‘don’t know’ and ‘do not want to answer’ are available.

After the feedback from the test panel was implemented, a soft launch was performed using 10% of the sample size for a pre-analysis; this consisted of 20 questionnaires and provided the opportunity to further identify potential discrepancies or issues. The soft launch demonstrated that the screen-out logic worked as planned. The text entry responses contain no nonsensical entries, there are no missing responses and the data are usable to assess all hypotheses. All respondents completed the questionnaire and answered every question.

Because all questions have a forced response validation, no question was left blank, and therefore no ‘missing data strategy’ is needed for skipped questions. This was verified by requesting frequency tables, which show that each participant answered each question and that no technical errors occurred. The scores of 6 (‘don’t know’) and 7 (‘do not want to answer’), however, are treated as missing values. Pairwise deletion is used for these scores because the sample is relatively large and contains several questions per variable. Furthermore, there are relatively few 6 and 7 scores, which indicates that the potential bias is minimal and that there is a low risk of inconsistent results.
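The treatment of the 6 and 7 scores can be illustrated with a minimal sketch. It assumes two hypothetical items answered on a 1 to 5 scale; the item names and answers are invented for illustration and are not taken from the survey data. Pairwise deletion means that each statistic uses every respondent who gave a valid answer to the item or items involved, rather than dropping a respondent from the whole analysis.

```python
from statistics import mean

# Codes the thesis treats as missing: 6 = 'don't know', 7 = 'do not want to answer'.
MISSING_CODES = {6, 7}


def to_missing(score):
    """Return None for a missing code, otherwise the score itself."""
    return None if score in MISSING_CODES else score


def item_mean(xs):
    """Mean of one item, using every valid answer to that item."""
    valid = [x for x in xs if to_missing(x) is not None]
    return mean(valid)


def pairwise_pairs(xs, ys):
    """Keep only respondents who gave a valid answer to BOTH items."""
    pairs = []
    for x, y in zip(xs, ys):
        x, y = to_missing(x), to_missing(y)
        if x is not None and y is not None:
            pairs.append((x, y))
    return pairs


# Hypothetical answers of five respondents to two items (illustration only).
item_a = [1, 4, 6, 3, 5]
item_b = [2, 7, 4, 3, 5]

print(item_mean(item_a))               # based on the 4 valid answers to item_a
print(pairwise_pairs(item_a, item_b))  # based on the 3 respondents valid on both
```

In this sketch the respondent who scored 6 on the first item still contributes to statistics on the second item, which is what distinguishes pairwise from listwise deletion.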

CLEANING THE DATASET

Before the results were analysed, the data set was cleaned. Several responses were filtered out based on five criteria to ensure the quality of the responses. The first criterion is that participants must grade their own level of English higher than 5. At the beginning of the questionnaire, the participant is asked to indicate their level of English on a scale of 1 to 10. The number can be provided based on how well the participant understands the introductory text. If the participant rates their English at 5 or lower and thus has difficulties reading the introductory text, the participant is filtered out. It is important to exclude these respondents because they have a relatively high chance of misinterpreting or not understanding questions due to the language barrier. Of the 242 respondents, 33 participants (14%) rated their English at 5 or lower and were therefore excluded.

The second criterion is that the age of the participant must be between 18 and 65. To screen each participant based on age, a categorical question is included asking participants to indicate their age group by choosing from seven categories. Of the 209 (242-33) remaining participants, six were below 18 (3%) and one above 65 (1%). The third criterion is that participants must provide consent. Each participant is asked to click an ‘I consent’ button, by which the participant agrees to voluntarily participate in the survey and understands that their data will be used for the specified purpose and that the participant will not receive compensation for this study. Of the remaining 202 (209-7) participants, all consented.

The fourth criterion is that participants must be serious and answer the questions to the best of their knowledge and effort. To ensure that participants meet this requirement, a question is included asking respondents whether they are committed to providing thoughtful and honest answers. The answers ‘I will not provide my best answers’ and ‘I can’t promise either way’ are screened out. Only respondents who answer ‘I promise to give my best answers’ are allowed to proceed with the questionnaire. Asking participants to make this promise at the outset makes them feel more obliged to put in sufficient effort and take the survey seriously. Of the 202 participants, 11 (6%) did not choose ‘I will provide my best answers’ and were excluded. In addition, ‘straight-liners’ (i.e., respondents who repeatedly select the same answer) and wrong answers entered on counter-indicative items were removed. The fifth criterion is that the completion time may not fall more than two-thirds below the median completion time of the soft launch (7.6 minutes = 456 seconds). More specifically, respondents who completed the survey in less than 152 seconds (7.6 minutes – (7.6 minutes × 2/3)) were filtered out. Eight (4.2%) of the 191 participants filled out the questionnaire in less than 152 seconds, which results in a final sample of 183.
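The arithmetic of the cleaning steps can be retraced with a short sketch. It uses only the counts reported in the text; the function and variable names are illustrative assumptions. Straight-liners and wrong answers on counter-indicative questions are left out because no separate count is reported for them.

```python
# Counts reported in the text for each screening step.
RESPONSES = 242
EXCLUSIONS = [
    ("English self-rating of 5 or lower", 33),
    ("age outside 18 to 65", 7),
    ("no consent", 0),
    ("no promise of best answers", 11),
    ("completed faster than the speed cut-off", 8),
]


def speed_cutoff_seconds(median_minutes, fraction=2 / 3):
    """Shortest accepted completion time: the median minus `fraction` of it."""
    return round(median_minutes * (1 - fraction) * 60)


def funnel(start, exclusions):
    """Return (criterion, removed, remaining) for each screening step."""
    steps, remaining = [], start
    for label, removed in exclusions:
        remaining -= removed
        steps.append((label, removed, remaining))
    return steps


steps = funnel(RESPONSES, EXCLUSIONS)
for label, removed, remaining in steps:
    print(f"{label}: -{removed} -> {remaining}")
print("cut-off in seconds:", speed_cutoff_seconds(7.6))
```

The remaining counts after each step are 209, 202, 202, 191 and 183, and the cut-off for the soft-launch median of 7.6 minutes is 152 seconds, matching the figures above.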

Of these respondents, 121 (66%) are male and 59 (32%) female. Three participants (2%) did not want to report their gender and chose the option ‘do not want to answer’. Respondents are fairly evenly distributed across the five age groups (M = 2.76, SD = 1.28). The oldest group (55-65) is the least represented (10%). The education level of the respondents ranges from lower school (3%) to a doctoral degree (5%). Most have a high school diploma (30%), a bachelor’s degree (33%) or a master’s degree (26%), and only three people (2%) did not want to report their education level. In total, the education level of the participants can be described as above average (M = 3.13, SD = 1.13). In line with their education level, many participants earn an above-average income. Nearly a quarter of participants earn more than €50,000 per year (24%), while only a minority earn less than €15,000 (12%). The largest group earns between €25,000 and €50,000 (36%). Also, quite a few respondents did not want to answer the question regarding income (14%) (M = 2.83, SD = 1.90), which hints that people are less willing to provide financial data (income) than other types of data (e.g. demographic [gender] or socio-economic [education level]).
