Finders Keepers, Losers Weepers: Uncovering the Hidden Assumptions of the Big Data Movement


Finders Keepers, Losers Weepers
Uncovering the Hidden Assumptions of the Big Data Movement

MA Thesis
Research Master's in Philosophy, University of Amsterdam
Student: Marijn Sax
Student no.: 5973287
Date: August 18, 2015
Supervisor: prof. dr. Beate Roessler
Second reader: dr. Gijs van Donselaar


Table of Contents

Introduction
Chapter 1: What Is Big Data?
    More Data Does Not Equal Big Data
    Emergent Data
    Conclusion
Chapter 2: Big Data Privacy Risks
    Intimate Personal Information
    Profiling
    Social Discrimination
    Filter Bubbles
    Conclusion
Chapter 3: The 'Finders, Keepers' Ethic
    Big Data as Oil and as a Goldmine
    Kirzner on Entrepreneurial Activity
    Big Data Entrepreneurship
    Questions for Big Data Entrepreneurship
        a) Divisibility of Personal Data
        b) The Acquisition Process of Personal Data by Big Data Companies
        c) Problems of the 'Finders, Keepers' Ethic
    Conclusion
Chapter 4: The Private Property Approach and the Dignity Approach to Privacy and Personal Data
    Privacy as Property Right
    Privacy as Dignity Right
    Conclusion
Chapter 5: The Perfect Marriage between Two Theories and Why We Should Try to Break Them Up
    The Perfect Marriage?
    Breaking Up the Happy Couple
    Dignity Instead of Private Property as a Starting Point


Introduction

Somewhere in February 2015 I came across a video on YouTube that made it immediately clear to me that I should write my MA Thesis on big data1. I had been in doubt for some time, juggling different privacy topics and being unable to decide on a thesis topic, but that video took the doubt away. To see why, I invite the reader to read the text quoted directly below this paragraph. It is the literal transcript of an advertisement promoting a data management system made by Lotame, a company Joseph Turow mentions in his book The Daily You (2011), which led me to research it. When I first saw the video I was not sure whether to laugh or cry. The video, which advertises a typical big data product, shows, without any reservations, what a big data approach towards data and people looks like, what attitudes it incorporates, the type of language it uses, and how data subjects are viewed by big data entrepreneurs. The transcript:

Your company generates mountains of data every day, but you need a way to harness this data's power and turn that mountain into a goldmine. Most companies either don't know what to do with their data, or don't know how to turn their insights into action. That's why there is Lotame's fine Data Management Platform (DMP). Lotame's DMP is the most comprehensive, powerful, and easy to use Data Management Platform that allows you to bring in data from everywhere (video shows ones and zeros being extracted from people's smartphones, pc's and laptops and being transferred to a machine with 'Lotame's DMP' written on it) and utilize these data points to improve your overall consumer experience, engagement, and loyalty by personalizing content, advertising, and offers whenever you interact with consumers. Build audiences as broad or specific as you want based on demographics, locations, behaviors, and interests and fill in any gaps with third party data. Lotame's DMP offers unique and often surprising insights about your audience's behavior and interests which can inform your entire marketing strategy (video shows Lotame employees working behind their desks while dollar signs appear above their heads). Publishers and networks can respond to more RFPs [request for proposals, M.S.] and win more business, offering added value to their advertisers. They can also use Lotame's DMP to deliver a fully customized customer experience to their sites' visitors. Marketers and agencies using Lotame's Unifying DMP learn invaluable new information about prospects, behaviors, and interests before even running a campaign or presenting an offer so they can tailor their marketing content to match each target audience (video shows ones and zeros flowing from three different individuals to the computers of Lotame employees). And audiences can be adjusted at any time, so you don't waste money on ads or promotions that aren't performing. The bottom line is that Lotame's Unifying DMP will improve your bottom line! (Dollar signs start raining from the sky.) Go to Lotame.com to learn more today. (Lotame's logo appears with 'Collect anywhere. Use Everywhere' written underneath it.)2

1 The word 'big data' is often written with a capital 'b' and a capital 'd': Big Data. I find this ugly and unnecessary

"How have we done it?" was the question Quine asked himself at the start of Pursuit of Truth (1992), announcing his investigation into the astonishing process that leads from simple neural stimulations to highly complex scientific theory building. I would like to pose a similar question in this thesis, although one with a somewhat different connotation. I would like to know 'how we have done it' that these types of data practices, as exemplified by the Lotame video, appear to be completely normal and acceptable. Why would a company feel comfortable with openly advertising products like these? Why would a company feel comfortable using the slogan "collect anywhere, use everywhere" when referring to personal data? How can a company feel comfortable with portraying individuals – in a public advertisement! – as mere objects from which ones and zeros (i.e. personal data) can and should be extracted for nothing other than the financial gain of corporations dealing with personal data in a big data context?

In order to answer these questions, I will investigate the (often implicit) assumptions that seem to underlie the big data movement. I will focus on the sense of entrepreneurship that surrounds big data, the 'finders, keepers' ethic that is utilized by big data entrepreneurs, and the idea that seems to facilitate the previous two observations: personal data and privacy understood as private property, to be traded freely by individuals in market-like contexts according to their own perceived best interests. I will try to show that these assumptions are far from self-evident and can and should be scrutinized. This analysis serves to explicate the issues surrounding big data, in order to question its values and practices, and ultimately to assess whether this approach to big data must be modified.


Chapter 1: What Is Big Data?

Many authors agree on the importance of studying big data, even though they are unable to agree on a definition. Ward and Barker, for example, write that "[t]he term big data has become ubiquitous. Owing to a shared origin between academia, industry and the media there is no single unified definition, and various stakeholders provide diverse and often contradictory definitions" (Ward and Barker 2013: 1). Boyd and Crawford are more blunt: "Big Data is, in many ways, a poor term" (boyd and Crawford 2012: 663). As a result, many authors do not present a snappy, one-sentence definition of big data but instead try to give a short description of what they take to be the conceptual core of big data.

In this chapter I will try to do the same thing. This will hopefully lead to a description of the concept that does justice to the phenomenon itself, while at the same time allowing me to utilize the description in the later stages of this thesis where I will present my ethical reflections on the phenomenon.

More Data Does Not Equal Big Data

The idea of quantity is a usual suspect when writing about what makes big data big. Many articles and chapters on big data start with the observation that we collect more data than ever and that the amount of data we generate and collect is still growing exponentially. Sagiroglu and Sinanc, for example, write that "5 exabytes (10^18 bytes) of data were created by humans until 2003. Today, this amount of information is created in two days. In 2012, the digital world of data was expanded to 2.27 zettabytes (10^21 bytes)" (Sagiroglu and Sinanc 2013: 42). It is clear that our ability to both collect and store larger volumes of data than ever is a driving force behind the phenomenon of big data. This does not mean, however, that what makes big data big is simply a certain, large enough, volume of data. The transition from data to big data is not just a quantitative shift; it is a qualitative shift as well.

Big data is not so much about amounts of data as it is about thinking about data, dealing with data, and approaching challenges and opportunities through the eyes of data. Mayer-Schönberger and Cukier (2013) identify three major shifts in moving from 'normal' data approaches to big data approaches. The first shift constitutes a focus on creating datasets that approach N=all, instead of the careful creation of samples that should be representative of much larger populations. The second shift constitutes the belief that in order to achieve a (nearly) N=all dataset, we can allow data from many different sources, even if the data is of dubious quality, to be included. In big data contexts, sheer size and volume are supposed to make up for messiness and low quality data; it "permit[s] us to loosen up our desire for exactitude" (Mayer-Schönberger and Cukier 2013: 13). The third shift constitutes the abandonment of the "age-old search for causality" since 'mere' correlations suffice (Mayer-Schönberger and Cukier 2013: 13). Mayer-Schönberger and Cukier's description of big data clearly moves beyond a description that focuses merely on the size of datasets. What is of great importance to them is the idea that in a big data world more data from various sources, even sources of dubious quality, combined in a larger dataset will almost always lead to more powerful and valuable analyses. This insight leads to a certain 'data hungriness'; when doing big data analyses, more data on the input side is (almost) always better. Big data incentivizes data collection, but also data recombination: "so-called big data brings together not only large amounts of data, but also various types that previously never would have been considered together" (Michael and Miller 2013: 22). A concept that is often mentioned in this context is that of 'datafication'. Increasingly many of our daily dealings – think of traveling by public transportation and using your public transportation chip card, or doing groceries and paying with your debit card – are turned into quantitative data, leading to a datafication of those practices (Van Dijck 2014). It is clear that big data and the datafication of our lives are two mutually reinforcing tendencies.

Mayer-Schönberger and Cukier's description also points to an 'if it works, it works' mentality that can be associated with big data. Big data analysis is problem-oriented: if the data generate insights that can be used to solve certain problems or achieve certain ends, then the analysis is successful. If the instrument works to achieve a desired end, we should not torture ourselves with difficult questions concerning causality and truth, since those questions do not add anything to the process of acquiring valuable insights (or so the big data movement argues).

Crawford and Schultz (2014) name three defining features of big data that partly overlap with the features mentioned above, but also differ in some respects. "First, it refers to the technology that maximizes computational power and algorithmic accuracy. Second, it describes types of analyses that draw on a range of tools to clean and compare data. Third, it promotes the belief that large datasets generate results with greater truth, objectivity, and accuracy" (Crawford and Schultz 2014: 96). Again, we can see that a notion of new technological capabilities (computational power and algorithmic accuracy) is combined with a description of what can be called a 'big data mindset', namely the belief that approaching problems and opportunities through the eyes of more data will yield better, more useful results.

When using a big data approach to a problem, the goal is not to amass as much data as possible in order to simply paint an accurate picture. The goal is to come up with interesting and unanticipated insights that do not follow directly from the aggregated data themselves, but that need to be extracted from them. As Rubinstein notes, big data "is best understood as a more powerful version of knowledge discovery in databases or data mining, which has been defined as 'the nontrivial extraction of implicit, previously unknown, and potentially useful information from data'3" (Rubinstein 2013: 76). As a result, storing information – even if one is not sure how useful the data are right now – becomes more and more attractive. "As big data analysis tools became more powerful, it became profitable to save more information" (Schneier 2015: 22). Tene and Polonetsky go one step further when they state that "the big data business model is antithetical to data minimization. It incentivizes collection of more data for longer periods of time. It is aimed precisely at those unanticipated secondary uses, the 'crown jewels' of big data" (Tene and Polonetsky 2013: 259).

Data mining is the technique that is most commonly seen as the big data analysis technique par excellence. As already mentioned, it is a technique aimed at discovering something new in existing datasets, something that cannot simply be observed in them or follows automatically from them, but something that has to be extracted since it does not 'lie at the surface'. In order to achieve this, a combination of complex algorithms and brute computing force is used to work on the data.

The fact that we can discover new knowledge in existing data by using data mining techniques goes a long way in explaining why big data is a phenomenon that attracts so much attention. It surrounds big data with an aura of entrepreneurship. Since "by its very nature, big data analysis seeks surprising correlations and produces results that resist prediction", it always remains an open question what new information will be found and who will find it at what time (Tene and Polonetsky 2013: 261). So entrepreneurs who work with big data hope that they will be the first to awaken the dormant value that lies hidden in big datasets.
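To make this search for unanticipated correlations slightly more concrete, its basic logic can be sketched in a few lines of Python. Everything in the sketch – the attributes, the purchase counts, and the choice of Pearson's correlation coefficient as the measure – is invented for illustration; real data mining systems are, of course, vastly more sophisticated:

```python
# Toy sketch (hypothetical data): "mining" a small dataset for the pair of
# attributes with the strongest correlation, without deciding in advance
# which pair matters -- the analyst lets the data point the way.
from itertools import combinations

def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

def strongest_correlation(table):
    """Return the attribute pair with the largest absolute correlation."""
    best = max(combinations(table, 2),
               key=lambda pair: abs(pearson(table[pair[0]], table[pair[1]])))
    return best, pearson(table[best[0]], table[best[1]])

# Invented example: per-customer counts of weekly purchases.
data = {
    "coffee":    [5, 1, 4, 0, 6, 2],
    "magazines": [1, 3, 0, 4, 1, 3],
    "batteries": [5, 2, 4, 1, 6, 2],
}
pair, r = strongest_correlation(data)
print(pair, round(r, 2))
```

The point of the sketch is that the analyst does not specify in advance which attributes are related; the strongest pattern is whatever the data happen to yield.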

Emergent Data

A key characteristic of big data that warrants special attention is the phenomenon of emergent data. When new, surprising, useful, and unpredictable information is extracted from existing data, new data are created. Predictive data mining is a useful example to show the importance of understanding emergent data. Millar describes predictive data mining as the activity "where the goal is to predict based on inference from the original data" (Millar 2009: 106). The practice of profiling is a prime example of an activity made possible by predictive data mining. By gathering data that record the behavior of individuals, predictive data mining techniques can be used to infer, for instance, what kind of interests, sensitivities, desires, and habits someone has, based on regularities and patterns found in the dataset. And since these are the types of features of an individual that are important for shaping future behavior, predictive data mining can help predict what type of behavior individuals will display in the future and how it can be influenced.

This account demonstrates how psychological profiling differs from descriptive data mining in an important respect: the resultant dataset emerging from this kind of predictive data mining contains new data items that are not mere aggregations or simple characterizations of the original dataset. Those new data items are better characterized as non-trivial representations of an individual's beliefs, intentions, or desires. Hence, the resultant dataset includes emergent data that must be considered in its own right if we are to properly characterize predictive data mining and the privacy implications of it (Millar 2009: 112-3).

As Millar's apt description makes clear, emergent data present us with qualitatively new data that are generated out of existing data. Since these emergent data are in fact new data that only came into existence after analysis, they warrant special attention. They are not a trivial extrapolation of the existing dataset, but a non-trivial discovery. This is exactly what separates big data from non-big data: the analysis techniques applied to the dataset not only provide a better ordered or more detailed description of features that were already known beforehand, they can also highlight entirely new features that were not known before.

The use of "complex algorithms, artificial intelligence, neural networks and even genetic-based modeling" (Zarsky 2003: 4) allows for these new ways of knowledge discovery in databases, and just like Millar, Zarsky is keen to point out that one of the key components of this new way of dealing with personal data is the fact that "they can discover previously unknown facts and phenomena about a database, answering questions users did not know to ask" (Zarsky 2003: 4).

I am keen on stressing the importance of the phenomenon of emergent data because emergent data introduce qualitatively new (privacy) challenges. Since new data can be generated out of existing data, it is no longer sufficient to think only about the original data prior to the data mining. Big data's capacity to mine data and discover unpredictable, surprising new insights points our attention to the pressing need to consider emergent data as well as original data.
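The difference between original and emergent data can be illustrated with a deliberately simple sketch (both the activity records and the inference rule below are invented for this purpose). The label that the analysis produces appears nowhere in the original log; it only comes into existence through the analysis:

```python
# Minimal sketch of "emergent data" (invented records and rule): a new,
# non-trivial data item -- an inferred daily rhythm -- is generated out of
# existing data that only recorded when a user was active online.
def infer_rhythm(activity_hours):
    """Classify a user from raw activity timestamps (hours 0-23).

    The resulting label is nowhere in the original data; it emerges
    from the analysis.
    """
    night = sum(1 for h in activity_hours if h < 6 or h >= 23)
    return "night owl" if night / len(activity_hours) > 0.5 else "day user"

# Original data: nothing but timestamps of online activity.
log = {"alice": [23, 1, 2, 3, 14], "bob": [9, 12, 13, 17, 20]}

# Emergent data: a qualitatively new attribute attached to each person.
profiles = {user: infer_rhythm(hours) for user, hours in log.items()}
print(profiles)
```

However crude the rule, the structure is the one Millar describes: the output is a representation of a person that the person never disclosed.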

Conclusion

As was already mentioned, we cannot expect a one-sentence definition of big data. What is clear, however, is that the concept of big data incorporates different elements that point to new technological capabilities, new types of analyses, and new ways of thinking about and using data. The big data approach has certainly been fueled by the ever decreasing costs of storing and collecting data, and the increasing processing power of computers. Big data is not just about volumes, though; it is also about the belief that, ultimately, using more data from various sources in our analyses leads to more useful insights. Data mining, and more specifically predictive data mining, is the prime type of data analysis utilized by big data approaches to provide us with these useful insights. Although big data approaches promise to provide many new insights that can be used to solve problems or explore new possibilities, the outcomes of these analyses are always partly unpredictable. Big data analyses are mainly about discovering those insights that do not follow directly from the data themselves, but need to be 'awakened' by these analyses. The phenomenon of emergent data – the extraction of non-trivial new information out of existing information – is a result of this 'dormant' potential that is inherent in big data datasets. At the same time it explains the entrepreneurial interest in big data; entrepreneurs hope to discover new insights that they can market.


Chapter 2: Big Data Privacy Risks

In this chapter I will present what I believe to be the most important privacy risks that big data poses. My aim here is not to develop a completely new philosophical approach to privacy and big data but, instead, to catalogue the privacy threats others have already addressed in order to show that the stakes are high; exactly because big data can threaten privacy, it is important to scrutinize the concept of big data itself and its many assumptions. This, in turn, may help privacy advocates to formulate their objections and possible alternatives more clearly and assertively when dealing with big data.

When writing about harms to privacy it is important to note that there are different ways to approach and address these harms. A distinction that is often made, for instance by Moore (2010: 14), is that between normative and non-normative (or: descriptive) theories of privacy. Nissenbaum (2010: 67) makes a similar distinction in Part II of her book. Descriptive accounts describe conditions that need to be satisfied in order to be able to label something as private. One could say that they describe the 'is': something is private/the condition of privacy obtains. Normative accounts utilize moral norms to establish an ought: the moral privacy norms should not be breached, and if they are, blame and/or sanctions are appropriate. Moore gives the following example to illustrate the difference between both conceptions:

"When I was getting dressed at the doctor's office the other day, I was in a room with nice thick walls and a heavy door – I had some measure of privacy." Here it seems that the meaning is non-normative – the person is reporting that a condition obtained. Had someone breached this zone, the person might have said, "You should not be here. Please respect my privacy!" In this case, normative aspects would be stressed (Moore 2010: 14).

Notice that descriptive accounts do not necessarily imply such an ought: one can simply notice that certain conditions are (no longer) met and that therefore something is no longer to be considered private, without jumping to the conclusion that this is (morally) bad and that the thing or practice should have stayed private.

In the following, I will not use this distinction in such a strict manner. In most cases, elements of both conceptions are to be found. Most of the harms discussed are to be seen as privacy harms because big data practices utilize personal data that are no longer private when utilized in these big data practices (i.e. the conditions of privacy are no longer met), which in turn generates harms that can have to do with things other than privacy (e.g. social discrimination). So in those cases, the non-privateness of personal data is what makes it possible for other non-privacy harms to occur at all, and as such those harms are privacy problems because the non-privateness of data is an essential element in the causal chain leading to harm. In other cases, the breach of privacy itself can be seen as the breach of a moral privacy norm, meaning the breach of privacy is itself the harm. I will speak of both types of situation as privacy harms in order not to complicate matters unnecessarily, but the reader should be aware of the different structures of the privacy harms discussed.

Intimate Personal Information

The most obvious privacy risk that big data poses concerns intimate personal information that has not been explicitly released by the subject but that can nonetheless be inferred from datasets by using typical big data analysis techniques such as (predictive) data mining. An example that has been used by multiple scholars and has even received wide attention in (mostly) American news media is the 'pregnancy example'. A female high school student began receiving coupons discounting various baby items from the retail company Target. Her father, considering his daughter's age, saw this as rather inappropriate. After complaining to Target, he discussed the incident with his daughter, who, as it happened, was pregnant, though she had never disclosed this information to Target, at least not explicitly. Implicitly, though, she had, since Target keeps track of customers' purchases in order to create personalized advertisements. Target statistician Andrew Pole explains this process, specifically how Target can predict a pregnancy: "[a]s [his] computers crawled through the data, he was able to identify about 25 products that, when analyzed together, allowed him to assign each shopper a 'pregnancy prediction' score. More importantly, he could also estimate her due date to within a small window, so Target could send coupons timed to very specific stages of her pregnancy" (Duhigg 2012).

The example clearly shows a big data approach being put to work. Large volumes of data, ideally as much data on Target's customers as possible, are collected in order to run analyses that can generate useful (or: profitable) information for Target. The statisticians cannot always predict what they will find; quite to the contrary, their job is precisely to come up with interesting new information concerning their customers' behavior that does not follow directly from the data but has to be retrieved via data mining techniques. The data themselves did not simply tell the statisticians that such and such women were pregnant; the statisticians needed to construct algorithms to find new correlations and, once such correlations were found, test them.
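Since Target's actual model is not public, a toy analogue must suffice to illustrate the structure of such a 'prediction score': a handful of individually innocuous purchases are combined into a single weighted score. All product names, weights, and the idea of a coupon threshold below are invented:

```python
# Toy analogue of a "prediction score" (all products and weights invented;
# Target's real model is not public): several individually innocuous
# purchases are combined into one weighted score per shopper.
WEIGHTS = {  # hypothetical indicator products and their weights
    "unscented lotion": 0.3,
    "supplements": 0.25,
    "large tote bag": 0.15,
    "washcloths": 0.3,
}

def prediction_score(basket):
    """Sum the weights of indicator products present in a shopper's basket."""
    return sum(w for item, w in WEIGHTS.items() if item in basket)

shopper = {"unscented lotion", "supplements", "washcloths", "bread"}
score = prediction_score(shopper)
print(round(score, 2))  # shoppers above some threshold get targeted coupons
```

No single purchase reveals anything intimate; it is only their combination under the learned weights that yields the sensitive inference.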

The privacy threat is clear: many separate pieces of not-so-intimate data4 (e.g. the date and time on which you have bought this or that product, the frequency of buying certain products, deviations from your 'normal' purchasing patterns, etc.) can, when combined and even supplemented with data from different sources, reveal new, very intimate information. The attentive reader might have already understood that the pregnancy prediction can be seen as an instance of emergent data; the non-trivial information regarding the pregnancy is generated out of existing data.

Profiling

The pregnancy example and the comments on intimate personal information can be extended to a related privacy harm: profiling. What the pregnant girl effectively became a victim of was the practice of profiling. Because she manifested a certain kind of behavior, she was profiled in a certain manner. Target had built a profile which allowed them to place her into certain categories (in this case the category 'pregnant'), categories on the basis of which they can adjust and personalize their advertisements. Since profiling becomes more powerful the more accurate it gets, and since big data approaches have introduced techniques that can strengthen profiling practices dramatically (Mayer-Schönberger and Cukier 2013: 160-1), the possible privacy threats that emerge due to profiling practices are big data related privacy threats.

Now, there are different ways to link profiling to privacy and to harm. The very fact that you are profiled on the basis of your personal information can be harmful. To make this claim more convincing we can focus on respect for persons. The fact that people or companies do certain things with your information for certain ends can fail to respect you as a moral agent, for example when your personal data are used in ways that express values that you as an individual do not share or even condemn. Another way in which profiling can fail to respect someone is by understanding profiling as a practice that assesses and evaluates people in a specific way, for instance by placing them in reputation silos (Turow 2011: 126-128), attributing properties such as 'predicted profitability for company X' to them. The fact that others present, assess, and evaluate individuals in a certain manner can have a direct impact on the way they perceive themselves. Knowledge of the profiling practices can cause individuals to – partly – see and understand themselves through the eyes of this profiling process, and this can harm their integrity as moral agents. "A man's view of what he does may be radically altered by having to see it, as it were, through another man's eyes" (Benn 1984: 242). Failing to understand this mechanism and failing to respect the fact that it can have significant effects on people can be seen as a failure to show respect for persons.


Social Discrimination

The previous two sections focused primarily on privacy harms that can occur at a purely individual level. Big data can also inflict harms that are more social than private, with profiling, reputation silos, the personalization of content, and personal targeting contributing to social discrimination.

Turow (2011) gives a clear example of the possibilities of social discrimination on the basis of big data practices. He describes a fictional middle class family, Rhonda and Larry and their children (Turow 2011: 2-3). Based on their everyday dealings, they receive personalized ads, offers, and information. Because they often eat out at fast-food restaurants, they start receiving coupons giving discounts at these fast-food restaurants, but the family is also targeted by the fitness industry, being inundated with weight loss and dieting advertisements. When they search for cars on the internet they are greeted with articles on cheap, used cars. To see the discriminatory elements of this situation, Turow introduces another person, Larry's boss, who has exactly the opposite experience when visiting websites on cars. She sees articles on the newest luxury cars and is offered free test drives. This situation may of course lower the self-worth or self-esteem of Rhonda and Larry: apparently they are deemed 'less worthy' than higher income people. But there is more to it: "In fact, the ads may signal your opportunities actually are narrowed if marketers and publishers decide that the data points – profiles – about you across the internet position you in a segment of the population that is relatively less desirable to marketers because of income, age, past-purchase behavior, geographical location, or other reasons" (Turow 2011: 6). Rhonda and Larry seem to have ended up in a 'waste' category of many companies, meaning marketers are less interested in them since they do not expect a return on the investment spent on reaching Rhonda and Larry (Turow 2011: 89). At the same time, the fact that other companies do see them as promising 'targets' can be seen as both stigmatizing and as aggravating their current situation. Think of being offered short-term loans with high interest rates and coupons for unhealthy food. In this way big data can promote social discrimination. We are dealing with a privacy issue because it is exactly the non-privateness of all kinds of personal data that is a necessary and major component of the big data practices that lead to social discrimination.

Filter Bubbles

Apart from the possibility of discriminatory practices outlined above, something that may be best described as a tendency towards social segregation can be observed as well. In 2011 Eli Pariser published The Filter Bubble, a book that describes how people can come to live in their own bubbles due to many of the big data practices mentioned above. The idea is quite simple: techniques like profiling, building reputation silos, and personal targeting will lead to an increasing number of businesses adopting algorithms that are sensitive to the (online) profiles they have created for individuals. As a result, content will be personalized for you specifically. This may sound very convenient, and sometimes it is, but it can also have adverse side-effects.

If most of the things that you see online are catered to you based on your profile, and if the same goes for others – who are bound to have different profiles since they are, if not entirely, then at least partially different – you will live in a different online world than they do. The online space that is shared diminishes. Newspapers serve as a good example. The paper versions of newspapers are read less and less and their websites and/or apps become increasingly important. If one has subscribed to Dutch newspapers like de Volkskrant and NRC Handelsblad, one usually also gets access to all the content on their interactive website and/or app. While the content in the paper version of the newspaper is the same for everyone, the interactive websites and apps can be personalized. On the subscription page of de Volkskrant the following text can be found:

De Volkskrant Online Plus

[…]

Never miss anything

Customize your own Volkskrant

From now on you decide which news articles, themes or columns you do not want to miss. Indicate which subjects or authors you want to follow. This allows for your own personalized page which will include all the things you find important. You customize your own Volkskrant, it's that simple.5

This means that two persons can be subscribed to the same newspaper online without actually reading the same newspaper. They both live in a different bubble. Reading different articles can cause these people to think about different topics, be preoccupied with different topics, discuss different issues, etc. People can come to live in noticeably different worlds.

At the same time, something like a looping effect can occur. Because people are presented with things that already fit their profile, the chances of them becoming interested in new things that deviate from their ‗average‘ interests decline as well. So the chances that you stumble upon something new, something that can potentially introduce you to new publics, or new strains of thought, also decline in the filter bubble. This is facilitated by the way the personalizing algorithms function: ―The statistical models that make up the filter bubble write off the outliers. But in human life it‘s the outliers who make things interesting and give us inspiration. And it‘s the outliers who are the first signs of change‖ (Pariser 2011: 134). Another way to address this looping mechanism is to look at it from the perspective of ―information determinism‖ (Pariser 2011: 135). Things you have done in the past determine what your profile will look like, and your profile determines what you will be presented with online. The algorithms induce from your past behavior what your future interests will be like. ―But algorithmic induction can lead to a kind of information determinism, in which our past clickstreams entirely decide our future‖ (Pariser 2011: 135).
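Pariser‘s looping mechanism can be made concrete with a small sketch. The code below is a deliberately crude toy model (the topics, scores, and update rule are all invented for illustration; no real recommender system is this simple): the ‗algorithm‘ always serves the highest-scoring topic in a profile, and every view reinforces that topic, so past behavior comes to decide all future content and the outliers are written off.

```python
# Toy model of "information determinism": the recommender always shows the
# best-scoring topic, and viewing it raises its score further, so the loop
# never escapes the user's initial strongest interest.

def simulate_filter_bubble(initial_profile, rounds=20):
    """Return the sequence of topics a profile-driven feed would show."""
    profile = dict(initial_profile)
    shown = []
    for _ in range(rounds):
        top_topic = max(profile, key=profile.get)  # serve the strongest interest
        shown.append(top_topic)
        profile[top_topic] += 1                    # viewing reinforces it
    return shown

# A reader with only a mild preference for politics over sports and culture:
history = simulate_filter_bubble({"politics": 3, "sports": 2, "culture": 2})
print(history)  # 'politics' twenty times; 'sports' and 'culture' never surface
```

Even a mild initial preference locks the simulated reader into a single topic after the first round. This is, in miniature, the feedback structure – though obviously not the scale or sophistication – of the determinism Pariser describes.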

The Volkskrant example is informative, but does not cover filter bubbles exhaustively. In the Volkskrant case, the readers customize the content themselves. They do their own filtering, in a quasi-conscious manner6. On the internet, however, much of the filtering is done automatically, without the subjects of the filtering having an explicit role in the process. To return to Rhonda and Larry, the filtering was done for them, without them explicitly asking for it or explicitly providing enormous amounts of data for that specific purpose. And the same holds for Larry‘s boss. Both parties ‗reside‘ in different filter bubbles, simply by virtue of living their lives, both offline and online, and leaving behind enormous data trails that can be used to create these personalized, and therefore substantially different, filter bubbles. The fact that these filter bubbles function in the background of our daily life, without us being conscious of them most of the time, makes it all the more pressing to address them.

In the case of filter bubbles, it is, again, the lack of privateness of much of our data – which allows for all these filter practices – that leads to a harm that is not a privacy harm in and of itself, namely segregation and the erosion of a common public space7. The situation is somewhat ironic. Exactly because some of our data are taken out of the private sphere by smart companies, the common public space is eroded by increasing amounts of personalized (one could even say private) content.

Conclusion

Big data can have a lot of influence on many issues that can be regarded as privacy issues. Big data practices can lead to direct privacy intrusions, for example by revealing intimate information or by failing to respect persons by intruding in their private space with practices that can make them perceive themselves differently or by introducing values in their private lives that they do not feel comfortable with. At the same time, other harms that are not direct privacy harms themselves can be enabled by a lack of ‗privacy of our personal data‘, making them privacy threats in an indirect manner. Here one can think of harms such as social discrimination and the erosion of a common public space as a result of filter bubbles.

6 I say ‗quasi-conscious‘ and not ‗conscious‘ because, after the initial conscious input of personal preferences, it is easy to forget, months later, that one is indeed reading highly personalized content that can differ from person to person.

7 I have simply assumed here that the erosion of the public space is a harm. There are, however, good reasons to believe that this is the case. Hannah Arendt explains the enormous value of the public space in The Human Condition (1958), where she argues that a proper public space is a necessary precondition for action (in Arendt‘s specific sense of being a fundamental human capacity through which people can ―insert‖ (Arendt 1958: 176) themselves into the world and begin something that can never be wholly determined by what precedes the action).

All of this goes to show that big data is a topic that has real and significant effects on society. Because it is such an influential movement (with many privacy implications), it is very important to get a firm grasp of the way these big data practices are (often implicitly) justified and whether these justifications are well reasoned and acceptable. This is what I will be preoccupied with in the remainder of this thesis.


Chapter 3: The ‘Finders, Keepers’ Ethic

In the previous two chapters I have established what big data is and why it poses a threat to privacy. This has hopefully made clear that big data is a phenomenon that needs to be scrutinized, for it has the potential to transform our society and our lives. In this chapter I will try to critically assess the philosophical assumptions that underlie the big data movement. In order to do so, I will use multiple examples to explicate the ways in which people talk about big data and, consequently, think about big data. On the basis of this analysis I will then present the hidden assumptions that come with these views. The fact that these assumptions have remained largely implicit means that few explicit justifications for the current understanding of big data have been provided. Once made explicit, it becomes clear that many of these assumptions are problematic.

Big Data as Oil and as a Goldmine

The subtitle of Mayer-Schönberger and Cukier's (2013) book on big data is ‗A Revolution That Will Change How We Live, Work, and Think‘. They are not alone in the belief that big data will have a lasting and substantial influence on our everyday lives. As chapter 1 already explained, big data does not just ‗do data bigger‘; the approach allows us to do qualitatively new and different things with data as well. We can create new data and new insights out of existing data, new data that were simply inaccessible before8. It is because of these reasons that they say that

[b]ig data marks the beginning of a major transformation. Like so many new technologies, big data will surely become a victim of Silicon Valley‘s notorious hype cycle: after being feted on the cover of magazines and at industry conferences, the trend will be dismissed and many of the data-smitten startups will flounder. But both the infatuation and the damnation profoundly misunderstand the importance of what is taking place. Just as the telescope enabled us to comprehend the universe and the microscope allowed us to understand germs, the new techniques for collecting and analyzing huge bodies of data will help us make sense of our world in ways we are just starting to appreciate (Mayer-Schönberger & Cukier 2013: 7).

8 This statement can be read as being ontologically ambiguous. Two different interpretations are possible. One can say that these new data did not exist at all before the data mining and were truly brought into existence by the act of data mining. If this is the case, then the new data were inaccessible before the data mining because only the potential for them to be created existed, but the data themselves did not. Alternatively, one can say that the new data have been latently present all the time, but were simply ‗hidden from sight‘, making them inaccessible until discovery. Creation would then mean gaining access to these latently present data. I can remain neutral towards either interpretation because I believe both interpretations of the ‗ontology of newly created data‘ work equally well with the theory I will develop later on in this chapter in order to describe the conduct of big data companies.


Neelie Kroes, former European Commissioner for the Digital Agenda of the European Union, delivered a speech on March 26, 2013 with a similar tone, praising the enormous potential of big data and urging the European Union (EU) not to ―miss out on that kind of growth opportunity‖ (Kroes 2013). Kroes continues by calling data ―the new oil‖ because she thinks that ―it‘s a fuel for innovation, powering and energising our economy. Unlike oil, of course, this well won‘t run dry: we‘ve only just started tapping it‖ (Kroes 2013). Now, in order to actually benefit from big data, we must get our hands on enormous amounts of data; a machine does not run without the oil to fuel it. That is why Kroes urges the EU to accept legislation that will ―unlock this goldmine‖ (Kroes 2013).

With the arrival of big data, data are continuously presented as ‗the new oil‘ and as a new ‗goldmine‘, with both academic and non-academic sources making use of this metaphor (Angwin 2010; Cukier 2010; International Data Corporation 2011: 3; World Economic Forum 2011; Chen, Chiang, and Storey 2012: 1167; Peters 2012; McAfee and Brynjolfsson 2012: 59; Mayer-Schönberger and Cukier 2013: 16; SAS 2013; Steinberg 2013; Peterson 2014; Zicari 2014: 104; CLEAN 2015; Kumar 2015). This way of talking about data and their potential in relation to big data seems to presuppose that (personal) data are a commodity or a good; pieces of property that can be owned, traded, and appropriated (more on this in chapter 4 and chapter 5). Moreover, personal data are made the object of entrepreneurial activity. Exactly because big data analysis is aimed at extracting surprising new insights from existing data, meaning one cannot always predict what one will find, an entrepreneurial aura has come to surround big data. Entrepreneurship goes hand in hand with uncertainty, for an entrepreneur can never be sure beforehand what she will find or whether what she has found will be useful or profitable. This entrepreneurial aura of big data is especially prominent in the American context.

A given level of uncertainty may weigh differently depending on the risk profile of a given culture or society. The United States, for example, established by explorers who pushed the frontier in a lawless atmosphere, continues to highly reward entrepreneurship, innovation, research, and discovery. The quintessential American hero is the lone entrepreneur who against all odds weaves straw into gold (Polonetsky and Tene 2013: 31).

One preliminary way to sketch entrepreneurial activity in this field of big data is the following one: entrepreneurs know that large volumes of data may contain hidden information that, once discovered, can be profitable. Therefore, entrepreneurs want to possess as much data as possible, so they can start to mine9 the data in order to find exciting new non-trivial data: emergent data. If the entrepreneur does indeed stumble upon emergent data, then the entrepreneur can appropriate these emergent data and use them according to her own insights and interests. The profitability of big data entrepreneurship rests on the ability to appropriate the emergent data that are the result of data mining techniques.

Now, it is exactly this practice that I want to scrutinize. In this chapter I will investigate the – what I believe to be – ‗finders, keepers‘ ethic that is presupposed by big data entrepreneurs. Another big assumption that I believe should be scrutinized is the idea that personal data are a piece of private property that can be appropriated, owned, and traded like a commodity. I will deal with this topic extensively in later chapters, but for now I will focus on the ‗finders, keepers‘ ethic. The reader should thus note that the fact that I do not yet problematize the ‗personal data as private property‘ view here does not mean that I believe the challenges it poses to be insignificant. The view is highly significant and intertwined with the ‗finders, keepers‘ ethic, but for the sake of clarity of argument I will deal with the two separately.

Kirzner on Entrepreneurial Activity

In the next sections I will draw upon the work of Israel Kirzner, an author who develops a model of ―the morality of the entrepreneurial role‖ (Kirzner 1978: 9), in order to give an account of the rationales that underlie the big data movement, including the ‗finders, keepers‘ ethic.

Kirzner departs from Nozick‘s theory of entitlements (Nozick 1974: 150-3). He agrees with the spirit of Nozick‘s theory but believes it to be defective when it comes to the entrepreneurial role. In short, Nozick‘s theory of entitlements depends on just acquisition and just transfer of goods10. In the context of a free market, transfers can only be just if they are voluntary. Kirzner now asks how we are to understand error; to what extent does the involvement of error erode the voluntariness of a transaction? This is an important question, from the perspective of the entrepreneur, as Kirzner is keen to stress: ―For the economist‘s model of market equilibrium it is not only possible, but indeed necessary, to imagine a world without error‖ (Kirzner 1978: 11). This picture, however, is too ideal. In reality, disequilibria prevail and it is exactly in states of disequilibrium that entrepreneurial activity flourishes. Moreover, entrepreneurial activity can only exist because errors are being made. And since entrepreneurial discoveries are necessary to eventually bring the market into equilibrium again11, Kirzner says that ―[t]he equilibrative aspects of the market process depend, in an essential way, upon the lure of the profits made possible by the errors of those with whom the entrepreneur deals‖ (Kirzner 1978: 11). The entrepreneur depends on other people‘s errors, using her entrepreneurial insight to purchase goods at a low price which she then resells at a profit. If all others were aware of this insight of the entrepreneur, they would never sell at such a low price, precluding the entrepreneur‘s success. The fact that they do sell at a low price directs our attention towards error. In hindsight, knowing about the insight of the entrepreneur, the original sellers would realize that they had sold their goods too cheaply: they made an error. Once all sellers know about the practices of the entrepreneur, they will only settle for the highest price possible, knowing what price the entrepreneur is willing to accept given the entrepreneurial project the entrepreneur envisages, bringing the market back into equilibrium.

9 It is interesting to see that the term ‗data mining‘ fits the idea of data as a hidden goldmine or oil reservoir perfectly. Both suggest that data are like a natural resource, at times hard to find, hard to get out of the ground, but at the same time immensely profitable.

10 To expand a little on Nozick‘s account of entitlements: Nozick‘s account is a historical one. This means the following: in order to assess whether a given distribution of goods is just, one should investigate the way this distribution developed historically. Nozick‘s proposal is quite simple:

―If the world were wholly just, the following inductive definition would exhaustively cover the subject of justice in holdings.

1. A person who acquires a holding in accordance with the principle of justice in acquisition is entitled to that holding.

2. A person who acquires a holding in accordance with the principle of justice in transfer, from someone else entitled to the holding, is entitled to the holding.

3. No one is entitled to a holding except by (repeated) applications of 1 and 2‖ (Nozick 1974: 151).

This account of entitlements allows Nozick to say that uneven distributions can never be unjust simply because they are uneven: a radically uneven distribution can come into existence without breaching the principle of justice in acquisition and without breaching the principle of justice in transfer. This idea forms the basis of Nozick‘s critique of theories of justice that opt for end-result principles (such as the theory Rawls develops in his A Theory of Justice).

Given error‘s effects on market dynamics, it is necessary, Kirzner affirms, to assess its role within a theory of entitlements. If we accept that errors are equal to involuntary transactions, the role of the entrepreneur becomes precarious, to say the least. Therefore, Kirzner needs a different conception of the entrepreneurial role, one that can be sustained in the face of error‘s necessity. This new conception must be able to explain why entrepreneurial activity occurring during disequilibrium does not breach any principles of justice. It must explain why the entrepreneur is entitled to the profits and goods she acquired through entrepreneurial activity.

Now, Kirzner‘s strategy to tackle this difficult challenge is not to argue that somehow errors should not be seen as such. Quite the contrary, Kirzner embraces the existence of error: ―genuine error is alive and well‖ (Kirzner 1978: 14). What Kirzner does do, however, is to provide a different story as to how the actual process of acquisition of profits by the entrepreneur should be understood. He proposes to resolve the challenges by accepting both ―a particular ethical judgment‖ and ―a particular economic insight‖ (Kirzner 1978: 17). The ethical judgment consists in the acceptance of the ‗finders, keepers‘ ethic and the economic insight

is that which permits us to perceive the discovery of a hitherto unknown market use for an already owned resource or commodity as the discovery of (and consequently the spontaneous establishment of ownership in) a hitherto un-owned element associated with that resource or commodity (Kirzner 1978: 17).

The ‗finders, keepers‘ ethic means precisely what it appears to mean: those who find something that is not held by anybody are, because they found it, the legitimate owners of that which they have discovered. Kirzner admits that few scholars have accepted this ethic in the domain of acquisition from nature. The idea that seems to prevent one from accepting the ‗finders, keepers‘ ethic in this domain is that the simple fact that someone stumbles upon an unheld resource cannot justify suddenly prohibiting all of mankind from accessing it.

Kirzner, however, proposes to reconceptualize what discovering something that was previously unheld means. ―In order to introduce plausibility to the notion of finders-keepers, it appears necessary to adopt the view that, until a resource has been discovered, it has not, in the sense relevant to the rights of access and common use, existed at all‖ (Kirzner 1978: 17). This in effect means that, under Kirzner‘s reconceptualization, the discoverer of an unheld resource brings the resource into existence and must therefore be seen as the creator12 of the resource. The idea here is that creation is a substantially different act than acquisition from nature. The latter occurs ―against the background of a given unheld resources (even if no one is aware of their very existence)‖ (Kirzner 1978: 18), meaning that acquisition constitutes a transfer, namely from nature to the discoverer who becomes the first holder. In the case of creation, no notion of transfer is involved in the establishment of ownership over the created good: ―the finder-creator has spontaneously generated hitherto non-existent resources, and is seen, therefore, as their natural owner‖ (Kirzner 1978: 18). If the finder has generated the goods by finding them, they cannot be transferred from nature to the finder for the simple reason that the goods did not exist, in the relevant sense, in nature before they were found. The reader should note, however, that finders-keepers does not fit every single case; there are legitimate cases of acquisition from nature, namely cases in which everyone is fully aware of the existence of the resource or goods13.

Moreover, entrepreneurs do not – or not exclusively – appropriate unheld resources, but acquire held resources via just transfers, apply an entrepreneurial insight to create more commercial value, and then profit from these improvements. Those are two different situations, although the way they have to be understood according to Kirzner will turn out to be remarkably similar.

12 ―We see the entrepreneur as ―creator‖ not in the sense of the physical producer, but strictly in the sense of his being the discoverer of an available opportunity‖ (Kirzner 1978: 19).

13 Kirzner‘s example to illustrate such cases: ―The first man to land on Mars can hardly claim title to it as its ―creator.‖ In order to establish just ownership in an unheld resource the existence of which everyone is fully aware, it is certainly necessary to follow the criteria considered appropriate to just acquisition from nature‖ (Kirzner 1978: 18).

Entrepreneurs‘ main activity consists in finding and exploiting market opportunities, and this usually happens when they discover a new marketable property or application of a known resource or commodity. For Kirzner, his ‗economic insight‘ applies in this case: ―the discovery of a hitherto unknown market use for an already-owned resource or commodity constitutes the discovery of a hitherto un-owned element associated with that resource or commodity‖ (Kirzner 1978: 18). Put differently, the owner of a resource can only be the owner of those properties and uses of a resource that the owner is explicitly aware of. This is in stark contrast to the view (one that Kirzner calls the conventional view and attributes to Nozick) that ownership means ownership of all a resource‘s or commodity‘s properties and applications, even ones that have yet to be discovered.

It may be instructive to introduce an example at this point (one that I borrowed from Kirzner) to make things more tangible: the example of oranges and orange juice. Imagine an entrepreneur who can buy oranges on the market for €5, who knows she can convert those oranges into orange juice for €4 (the costs of the conversion process of oranges to orange juice), and who also knows that consumers on the market are willing to pay €12 for the orange juice. The entrepreneur who discovers this market opportunity can make a nice profit of (€12 - (€5 + €4)) = €3. The idea here is that the entrepreneur has created – ex nihilo – the new use for oranges and has therefore created the additional value of €3. In other words, the additional value of €3 was not, in any relevant sense, present in the oranges before the entrepreneur‘s intervention. This also means that the newly created value was not transferred from the original holder of the oranges to the entrepreneur, since this created value came into existence after the entrepreneur acquired the oranges and applied her insights to the product. ―He may, then, be held to have ―created‖ this additional value in these oranges. It is as if the entrepreneur found orange juice and marmalade in nature, where no one had perceived their existence; he has ―created‖ the orange-resource that can provide juice and marmalade‖14 (Kirzner 1978: 19).
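The arithmetic of Kirzner‘s example can be summarized in a short sketch (the figures are those of the example above; the function name is my own):

```python
# Kirzner's orange juice example in numbers. On his account, the resulting
# €3 is not transferred from the orange seller but created ex nihilo by the
# entrepreneur's discovery of the juice opportunity.

def entrepreneurial_profit(resource_price, conversion_cost, sale_price):
    """Value the entrepreneur 'creates' by discovering a new use for a resource."""
    return sale_price - (resource_price + conversion_cost)

created_value = entrepreneurial_profit(resource_price=5, conversion_cost=4, sale_price=12)
print(created_value)  # 3 (euros)
```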

Now, returning to the issue of error: the idea that ex nihilo creation is a justified description of the entrepreneur‘s activity gets rid of the challenges the idea of error poses to the justice of entrepreneurial activity. The question of the possible involuntariness of disequilibrium transactions involving entrepreneurs cannot, conceptually, even arise under Kirzner‘s scheme. Remember: the supposed involuntariness is presumed to be based on the fact that the initial sellers unwittingly sold not only the oranges for a certain market price, but also all their potential properties and applications, some of which the entrepreneur later exploits to create additional value. The idea that error is involved in this case stems from this (according to Kirzner mistaken) observation: the initial sellers sold the additional value (or: the ‗orange juice making capabilities‘ of the oranges) that happened to reside in the oranges too cheaply (or even for free), which constituted an error. Now, Kirzner‘s answer to this is simple. Exactly because we are dealing with ex nihilo creation by the entrepreneur after the initial transaction, ―this additional $3 value may well be held never to have been possessed by the seller at all‖ (Kirzner 1978: 20). That part of the transaction which allows the entrepreneur to be an entrepreneur was, properly speaking, never part of the transaction at all, for the word ‗transaction‘ implies that the element of the good that the entrepreneur exploits to make her profitable insight work was first held by the seller and later, after the transaction, by the entrepreneur. But this is not the case, because the entrepreneur created the additional value ex nihilo, meaning that the initial seller never possessed it to begin with! And since the seller never possessed it, no discussion is necessary regarding her claim on the additional value: no such claim is possible if she was never the owner to begin with.

14 Here one can see very clearly how similar acquisition from nature and entrepreneurial activity with previously held resources turn out to be on Kirzner‘s account.

To summarize: when an entrepreneur creates something new by utilizing something already existing, the entrepreneur can be said to have created something ex nihilo, and under a ‗finders, keepers‘ ethic this means that the fruits resulting from this new insight can be appropriated by the entrepreneur15 without inflicting any sort of injustice on the initial owners of the goods subject to entrepreneurial activity.

Big Data Entrepreneurship

This rather technical description of entrepreneurial activity can help us understand the practices of big data entrepreneurs. Those working with big data use, just like the orange juice entrepreneur, specific given goods to create something new out of them. In the case of big data, personal data are used to extract non-trivial new information out of the given data via the technique of (predictive) data mining. The big data entrepreneurs then appropriate the emergent data. They are the ‗gold‘ that is so emphasized by commentators. And just like in the case of the orange juice, we can ask whether the big data entrepreneur can legitimately appropriate these new insights. Kirzner‘s answer can still apply here. As long as the entrepreneur gets a hold of the original personal data in a just way, the entrepreneur is free to apply entrepreneurial insights and appropriate the additional value. Indeed, justice even requires that the entrepreneur is the legitimate owner of these new emergent data that are extracted from the original data by the entrepreneur. Just like the seller of the oranges was never the owner of the property of the oranges that allowed the entrepreneur to make orange juice out of the oranges, the data subjects whose data are the original input of big data analyses were never the owners of the emergent data the big data entrepreneur finds and appropriates, for the data subjects were never explicitly aware of the specific emergent data the big data entrepreneur happens to find through data mining. And because they were never the owners of the data that are mined, they have no legitimate claim to the emergent data.

15 Kirzner says that an even stronger formulation is justified: ―justice requires that the ―creator‖ be recognized as the owner of what he has ―created‖: to deny the ―creator‖ title would be to inflict injustice on him‖ (Kirzner 1978: 24).

The example of Google can be of help here to make things a little more tangible and to illustrate the way Kirzner‘s libertarian ‗finders, keepers‘ ethic seems to apply to big data. Google is a typical big data company whose business model is based on the clever usage of the personal data of its users. The business is premised on a simple ‗deal‘: free services in exchange for personal data. The personal data Google gathers in this way can then be put to work to generate revenue. It is a well-known fact that most of Google‘s revenue comes from advertising16, more specifically from personalized advertisements that can be presented to individuals based on the extensive profiles Google has built – and continues to build – of its users. Google can thus be seen as one of the first, if not the first, to discover the enormous potential of the creative uses of personal data.

Put in Kirzner‘s terms, we may say that Google acquired significant amounts of personal data and then, like a true entrepreneur, discovered (and therefore created) new marketable properties – such as the potential of extensive profile building – of the personal data that the users were not aware of. The latent property of personal data that allows them to be used in the process of building extensive personal profiles was created by Google, and under a ‗finders, keepers‘ ethic justice requires that Google is the legitimate owner of these profiles (and the revenue they manage to generate thanks to these profiles). Seen from the perspective of the data subjects: they, as those from whom the personal data originate, have never been the owners of this marketable property of their personal data. This implies that the data subjects have no claim to this profile building potential of their personal data that Google has managed to extract. The fact that, in hindsight, they might feel like they made an error in transferring their personal data, including all the latent properties, to Google is irrelevant. We have already seen that the entrepreneur thrives on exactly these kinds of errors, and, as we have also seen, Kirzner can accommodate that.

Data mining in general, performed by big data companies, can be understood in a similar fashion. As long as the big data companies acquire the initial personal data justly, any non-trivial emergent data they manage to squeeze out of the original data are theirs to keep. (If one reads the transcript of Lotame‘s advertisement in the introduction again, one will notice that this is exactly the service they are providing.) The data subjects providing the data were, before providing the data, not explicitly aware of the specific emergent data that resided in their data. To see why, remember that these emergent data are in fact new non-trivial data, created out of the original data. The very nature of emergent data is such that they do not follow directly from the original data, meaning that the original data subjects cannot, by definition, be aware of what emergent data can be extracted from their personal data prior to the actual extraction via data mining.

16 In 2014 Google‘s total advertising revenues added up to $59,624,000,000, making it Google‘s biggest source of revenue. In all previous years, advertising was the biggest source of revenue as well. (See: https://investor.google.com/financial/tables.html)

The attentive reader will notice immediately that data mining, understood in a Kirznerean manner, is a practice that can legitimize the conduct of big data companies vis-à-vis data subjects in many instances. Because data subjects can only derive rights from those properties of their data that they are explicitly aware of, and because big data companies specialize precisely in creating new emergent data that could not be simply known before the data mining process, data subjects will have a hard time claiming anything with regards to these companies. These data subjects were never the owners of the ‗gold‘ the big data companies manage to extract out of the ‗mountains of data‘.
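The sense in which emergent data do not follow directly from any individual record can be illustrated with a toy sketch (the records below, and the famous – possibly apocryphal – ‗beer and diapers‘ pairing, are invented here purely for illustration; real data mining is far more sophisticated): each data subject discloses only a list of purchases, yet mining the records together yields a pattern that no single subject ever provided or could have been explicitly aware of.

```python
from itertools import combinations
from collections import Counter

# Each set is what one data subject disclosed: a plain list of purchases.
records = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer", "milk"},
    {"bread", "milk"},
    {"diapers", "beer"},
]

def mine_pairs(records, min_support=3):
    """Count item pairs that co-occur in at least `min_support` records."""
    counts = Counter()
    for record in records:
        for pair in combinations(sorted(record), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

emergent = mine_pairs(records)
print(emergent)  # {('beer', 'diapers'): 3} — a pattern no single record contains
```

The pair count is computationally trivial, but it makes the point tangible: the pattern exists only across the records taken together, which is precisely why, on the Kirznerean view sketched above, no individual data subject can claim prior ownership of it.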

In the above section we have seen that Kirzner‘s views on entrepreneurial activity are able to explain why the sphere of big data entrepreneurs functions the way it does. Big data companies‘ main activity is the creation and extraction of non-trivial new data. This means the big data companies are archetypical Kirznerean entrepreneurs, creating something new out of something already existing, thereby exploiting market opportunities to monetize their newly created insights.

Questions for Big Data Entrepreneurship

The fact that Kirzner‘s theory allows for an apt description of the current practices does not, however, mean that these current practices are unproblematic. Kirzner‘s theory itself, and its application to these modern cases of big data entrepreneurship17, depends on multiple assumptions that warrant special attention. The legitimacy of the conduct of big data companies thus depends on the soundness of these assumptions.

a. Divisibility of Personal Data and Commodification

Kirzner‘s entrepreneur dealt in oranges and orange juice, inanimate objects that do not seem to have an inherently personal relation to the persons possessing them. Here, however, we are dealing with personal data, and it is unclear whether putting personal data on a par with inanimate objects such as oranges and natural resources holds any ground. There are substantial differences that need to be addressed, for the similarity between the two cases is not self-evident. Personal data are exactly that, personal: they contain – by definition – information that reveals something about the person from whom they originate. Oranges do not possess this quality.

As we have seen, Kirzner‘s theory depends on the idea that within one and the same good, some properties can be owned by the original holder, while other properties – namely those allowing for applications the original holder had never thought of – are unowned at the very same time and can thus, after discovery, be appropriated by someone else. This introduces a certain kind of divisibility of goods which is necessary for Kirzner‘s theory to function adequately. In the case of inanimate objects his theory may be plausible – although even there the divisibility of objects may seem highly artificial. But even if we assume, for the sake of argument, that this divisibility is plausible and accepted by everyone in the case of inanimate objects, it still does not follow that it is, by extension, equally plausible to think of personal data in a similar fashion. Granted, we often do speak of personal data as something – a resource, a thing – that can be owned, but does that automatically mean that personal data are to be understood as nothing more than inanimate objects?

I believe that the relationship between a person and her data is not exactly the same as the relationship between a person and a quotidian object (a table, a phone, a coffee cup, an orange, etc.) she owns. Floridi expresses this suspicion very accurately:

[O]ne may still argue that an agent ―owns‖ his or her information, […] in the precise sense in which an agent is her or his information. ―My‖ in ―my information‖ is not the same ―my‖ as in ―my car‖ but rather the same ―my‖ as in ―my body‖ or ―my feelings‖: it expresses a sense of constitutive belonging, not of external ownership, a sense in which my body, my feelings and my information are part of me but are not my (legal) possession (Floridi 2005: 195).

If we understand the relation between an individual and her personal data the way Floridi does, it becomes immediately clear that it is far from unproblematic to conceive of personal data as if they were like oranges and orange juice. Based on Floridi‘s characterization of personal data, we could say that to think about personal data exactly as one thinks about oranges is to make a category mistake.

I believe Floridi‘s understanding of personal data is plausible and that it can explain why the idea of divisibility – something Kirzner‘s theory needs – is much less plausible in relation to personal data than it is in relation to inanimate objects. Floridi notes that the ‗my‘ in ‗my information‘ ―expresses a sense of constitutive belonging‖. This remark expresses the idea that your identity as a person is always necessarily constituted – at least partly – by your information (either information about you, or information that you happen to ‗possess‘), seeing the person ―as an informational entity‖ (Floridi 2005:
