In Alexa We Trust: How Increasingly Humanoid Computers Are Changing Human Behavior

Master’s Thesis

29 June 2018

New Media and Digital Culture Universiteit van Amsterdam

Abstract

This research revolves around the anthropomorphism of (computer) devices and assesses how this practice affects human behavior in general and human-computer interaction in particular. It is nested in the domain of the voice-activated ‘conversational interface’, bringing forward the Amazon Echo as a case study. As the first and foremost ‘smart speaker’ in the US market, the Echo is approached on both a theoretical and empirical level by carefully examining three of the main ‘pillars’ of the current Echo ecosystem: The Echo Facebook page, the Alexa Skills store, and the Amazon webstore. Combining quantitative and qualitative analyses of several datasets from these domains with theory from the field of media studies, drawing mostly from platform and app studies, this research demonstrates how device anthropomorphism affects human-computer interaction in various ways. In conclusion, it is argued that the anthropomorphism of the Echo device is a deliberate, ‘trust-inducing design strategy’ above all, ultimately employed by Amazon to increase profits. Looking beyond the widely popularized conceptualization of the voice-activated conversational interface as merely being a ‘natural’ or ‘intuitive’ medium for human-computer interaction, this research illuminates the economic factors underlying this phenomenon.

Introduction
1. Amazon on Facebook
1.1 Method
1.2 Results
1.3 Discussion
2. Alexa Skills
2.1 Method
2.2 Results
2.3 Discussion
3. Echo Reviews
3.1 Method
3.2 Results
3.3 Discussion
Conclusion
Acknowledgements
References
Appendices

“Alexa. Good morning.”
“Good Morning.”

“Alexa. How are you doing today?”
“I’m AI okay.”

“Alexa. Are you being serious?”
“I like to be useful. But I can have fun too.”

“Alexa. Does that make you human?”
“Hmm. I’m not sure.”

“Alexa. What are you then?”
“I’m Alexa and I’m designed around your voice. I can provide information, music, news, weather, and more.”

For most of my life, I have been interacting with technology, but interacting with Alexa somehow feels very different from any previous encounter. While I am fully aware that I am conversing with a machine, Alexa’s human voice and quirky character prompt me to engage with it in general chat, to address it with human courtesy, and even to develop what feels like a personal relationship. Alexa, however, is nothing more than the proverbial face – or: voice – behind which a variety of complex technologies hide. It is the so-called ‘virtual personal assistant’ (VPA) of Amazon; the personified, voice-activated ‘conversational interface’ through which users can connect with Amazon’s Echo devices in an “intuitive and natural way” (McTear 11-22).

“Echo”, as Amazon first introduced its voice-activated smart speaker to the press in June 2015, “is a new category of device designed around your voice – it’s always on, hands-free, and fast – just ask for information, music, news, weather, and more from across the room and get answers instantly”. “Alexa”, the company continues, is “the brain behind Echo (…) built in the cloud, so it is always getting smarter” (Amazon, Amazon Echo Now Available to All Customers). The Echo is thus a device with a brain (i.e. Alexa) that communicates by voice – a computer with human characteristics. In other words, it is an ‘anthropomorphized’ device. Anthropomorphism, put simply, is the tendency to attribute human characteristics to non-human objects as a way to help rationalize their actions and behavior (Duffy 180). It is this human-computer duality that lies at the heart of this research, which revolves around the question: How does the anthropomorphizing – or: personification – of the Amazon Echo device affect human behavior in general and human-computer interaction in particular?

Before elaborating on this question and delving into a human-computer duality narrative, it is important to first outline the cultural, technological, and economic context in which Alexa and the Echo came into existence, as well as the specific technologies they consist of. To be clear, ‘Alexa’ and ‘Echo’ refer to separate, yet inseparable things: The Echo is the hardware that houses the Alexa software; it is the device through which the underlying VPA technology can be accessed. As these come as a package, this research treats them as such, referring to both when mentioning the device (i.e. ‘Echo’). At the same time, however, the study respects the significance of the underlying technology – which can be accessed through other devices as well – by referring to ‘Alexa’ when only the software is discussed. In cases that deviate from this convention, the subject(s) under scrutiny will be clearly indicated.

The idea of employing VPAs in everyday life is not new: The voice-activated conversational interface has been a long-standing vision of researchers in artificial intelligence (AI) and speech technology (Cassell et al. 520). Until recently, however, the realization of this vision was confined to the imagination of popular culture, with science fiction books and movies depicting ‘sentient computers’, like HAL 9000 in 2001: A Space Odyssey (1968), or personified, ‘intelligent operating systems’, like Samantha in Her (2013) (Pieraccini 263; Wan 166). Such systems have also been on the radar of major technology companies for quite some time, as a 1987 concept video by Apple depicting its Knowledge Navigator indicates (McTear 15). It was also this company that, after acquiring the necessary technology from the American startup Siri Incorporated for an undisclosed amount in 2010, introduced Siri in 2011, now generally recognized as the first voice-activated VPA (Both 108; McTear 16).

To understand the recent rise of the conversational interface, a term that refers both to voice-activated assistants like Siri and Alexa and ‘intelligent’, automated text-based chatbots with which one interacts by typing, such as Facebook’s M, it is important to highlight some of the technological advances that have contributed to this development (Newman; Brownlee). Besides the more obvious factors that appear on the surface, such as the ever-increasing computing power of devices, faster wireless networks, and the fact that major technology companies share great interest in the technology, there are also some more profound reasons behind the rise of the (voice-activated) conversational interface (McTear 16-18).

First, research in the field of artificial intelligence has shifted its focus from so-called ‘knowledge-based’ approaches, which pursue intelligence in computers by training them to solve problems that are difficult for humans but easy for computers (e.g. in the domains of decision-making, chess, etc.), towards ‘subsymbolic’ approaches, which instead revolve around easy ‘problems’ for humans that have proven to be difficult for computers (e.g. in the domains of speech, emotion, etc.). In an official Google video on YouTube, psychologist Alison Gopnik notes how AI has shifted its focus towards tackling new challenges: “The things that we thought were going to be easy for a computer system, like understanding language, those things have turned out to be incredibly hard” (Google, Behind the Mic). Secondly, language technologies have benefitted greatly from recent technological developments within domains such as neural networks, big data, and deep learning, drastically increasing the accuracy of speech recognition technology and spoken language understanding (McTear 16-18; Pieraccini 136). Lastly, as Tim Berners-Lee, the inventor of the World Wide Web, already prophesied in 2001, the Web has been evolving into a ‘Semantic Web’, where search and other functionalities are built around the meaning of input, rather than on literal keywords (Berners-Lee 34; McTear 17).

Having sketched out the cultural and technological context in which the conversational interface could come into existence, I now briefly turn to the question of how the Echo and similar devices actually work, before elaborating on the economic context in which they came to thrive. Shedding light on the technicity of the device, I argue, in fact contributes to a better understanding of this economic context. To be sure, this is not an in-depth, all-encompassing specialist explanation, but rather a concise overview of the device’s main technical components. Typically, there are five sequential layers to any voice-activated conversational interface: speech recognition, spoken or natural language understanding, dialogue management, response generation, and text-to-speech synthesis (McTear 20-21). After the Echo is activated with a ‘wake word’ (‘Alexa’ by default), the integral speech technology first attempts to recognize any spoken language by converting the audio to words (Bohn; Vermeulen et al. 1). Then, it interprets these words and discovers the intended meaning of the speaker (Hwang). If an intended meaning is not recognized, the dialogue management system seeks clarification by engaging in a dialogue with the user (McTear 20). If the meaning is understood, the system proceeds by constructing a response in natural language, converting meaning back to words (Pieraccini 170). Finally, these words are converted to audio, as the device responds to the user in spoken language (Taylor 146; McTear 21) (see Figure 1a).

Figure 1a. The five sequential layers of the voice-activated conversational interface: speech recognition, spoken or natural language understanding, dialogue management, response generation, and text-to-speech synthesis (McTear 21).
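The five layers can be made concrete with a minimal Python sketch. It is purely illustrative and does not reflect Amazon’s actual implementation: the ‘audio’ is simulated as a text string, and every function name, intent name, and canned response below is a hypothetical stand-in chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional

WAKE_WORD = "alexa"  # the default wake word mentioned in the text


@dataclass
class Intent:
    """Hypothetical container for the speaker's intended meaning."""
    name: str


def recognize_speech(audio: str) -> str:
    """Layer 1, speech recognition: audio to words.
    Assumption: the 'audio' is already a text string, so we only normalize it."""
    return audio.lower().strip()


def understand(words: str) -> Optional[Intent]:
    """Layer 2, language understanding: words to intended meaning.
    Returns None when no meaning is recognized (toy keyword matching)."""
    if "good morning" in words:
        return Intent("greet")
    if "weather" in words:
        return Intent("get_weather")
    return None


def manage_dialogue(intent: Optional[Intent]) -> Intent:
    """Layer 3, dialogue management: seek clarification if meaning is missing."""
    return intent if intent is not None else Intent("clarify")


# Hypothetical canned responses, one per toy intent.
RESPONSES = {
    "greet": "Good morning.",
    "get_weather": "Here is the weather forecast.",
    "clarify": "Sorry, could you rephrase that?",
}


def generate_response(intent: Intent) -> str:
    """Layer 4, response generation: meaning back to words."""
    return RESPONSES[intent.name]


def synthesize(text: str) -> str:
    """Layer 5, text-to-speech synthesis: words to (simulated) audio."""
    return f"<audio: {text}>"


def handle(audio: str) -> Optional[str]:
    """Run the five layers in sequence; stay silent without the wake word.
    Simplification: the wake word is checked on the transcribed text."""
    words = recognize_speech(audio)
    if not words.startswith(WAKE_WORD):
        return None
    intent = manage_dialogue(understand(words))
    return synthesize(generate_response(intent))


if __name__ == "__main__":
    print(handle("Alexa. Good morning."))
```

The sketch keeps each layer as a separate function so that the sequential hand-off described by McTear stays visible; a real system would replace each stub with a statistical model or cloud service.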

Text-based chatbots, the more ‘primitive’ conversational interfaces that do not comprise all of the aforementioned technologies, were initially believed to be “the next big platform” by many in the technology industry, but have failed dramatically to live up to these expectations. Alexa (in this case synonymous with the Echo) and most other voice-activated VPAs, by contrast, have experienced a rapid and continuous increase in popularity ever since their introduction (Griffith and Simonite). Recent market research concludes that one in six US adults now owns a voice-activated smart speaker (Ong). Another 2017 study even predicts that by 2020, 75% of US households will own such a device (Gartner). Many of the world’s largest technology companies are devoting vast amounts of knowledge and resources to at least one of the underlying technologies; some have even introduced a voice-activated smart speaker – with all of the inherent technologies – of their own (Coyne et al. 1). As mentioned before, Apple’s Siri is generally recognized as the first VPA. With regard to voice-activated smart speakers that house VPAs, however, Amazon is widely accepted as the market’s ‘first mover’ with its Echo (Weinberger). After the introduction of the Echo, Microsoft and Google were quick to formulate an answer, with their Cortana and Google

(Apple). However, enjoying first mover advantage, Amazon has firmly established itself as the market leader, commanding around 72% market share (Kinsella and Mutchler 10).

Figure 1b. The first generation of smart speakers of four of the largest technology companies worldwide. From left to right: Amazon’s Echo; Apple’s HomePod; Microsoft’s Cortana; Google’s Home.

The varying physical designs of all of these speakers do not hide the fact that they are remarkably similar in function (see Figure 1b). They are all equipped to play music, tell jokes, read the news, translate between languages, provide information, set timers and alarms, and much more. The companies introducing them, however, have very different backgrounds. While Amazon, Apple, Microsoft and Google (or: Alphabet), the four largest technology companies in the world in terms of market value, originate from the separate market spheres of e-commerce, consumer goods, computer software and hardware, and web search respectively, they are now seizing the exact same opportunity (Statista). With the power and influence that these companies wield in today’s global economy, it is of great importance – for academics and society in general – to understand why they all share an interest in building and selling voice-activated smart speakers and the accompanying VPAs.

Part of the reason for this development can be found in the simple fact that these companies are actors in a capitalist system. Shedding light on technology companies from a predominantly economic perspective, Nick Srnicek argues that their decision-making is best apprehended by scrutinizing their quest for profit and their effort to fend off competition. This approach in fact makes the ‘next move’ of such companies more predictable for outside observers. “Capitalism”, Srnicek continues, “demands that firms constantly seek out new avenues for profit, new markets, new commodities, and new means of exploitation” (10). In accordance with this ‘logic of accumulation’, major technology companies – having already established themselves as uncontested market leaders in their respective spheres – unsurprisingly turn to new markets, such as smart speaker hardware and VPA technology, as new possible avenues for profit (Zuboff 76). However, this still does not sufficiently explain why these companies have all turned to the exact same markets, introducing strikingly similar products.

To truly understand this development, it is essential to delve deeper into economic theory – without resorting to too much jargon – and approach these technology companies as a new kind of firm, constructed around a new kind of business model – the ‘platform’ – and situated in a new kind of capitalism, one that has turned to data as a way to maintain economic growth (Srnicek 13). In this ‘platform capitalism’ – the economic system of the ‘information society’ – data is the raw material to be refined and exploited after being extracted from user activity, the natural source (Srnicek 54; Yeung 119). Platforms like Amazon, Apple, and Google revolve around obtaining control over data, with the aim to predict and even modify the behavior of their users as a means to produce revenue and increase market control (Zuboff 75). In their quest to gain access to more data, and enabled by the aforementioned technological advances, these companies are now expanding their data collection into the relatively undiscovered realm of the home, expecting to uncover and control rich new data sources. As one report puts it: “From a data-production perspective, activities are like lands waiting to be discovered. Whoever gets there first and holds them gets their resources – in this case, their data riches” (Srnicek 127).

In platform capitalism, controlling more data means more control over a market. When in control of a market, platforms can ‘set the rules of the game’, eventually becoming non-regulable, hegemonic models that may even “take on a powerful institutional role, solidifying economies and cultures in their image over time” (Srnicek 13; Bratton 41). Only platforms can compete with and thereby possibly regulate other platforms: no other business model thrives so well in the information society (Srnicek 62). Other platforms thus form the only real threat to a platform’s conquest of a market: Only they are able to extract and control the same large amounts of data needed to expand. As the expansion of platforms is driven by the need for more data, we can see the development of a rat race between platforms, which compete vigorously for control over key market positions that are rich in data. This ‘data rush’ ultimately leads to a situation in which platforms become increasingly similar, entering the same markets and launching similar products (Srnicek 67-68; 136). In this light, the introduction of the HomePod (paired with Siri), for example, both challenges and resembles Amazon’s longer-established effort of the Echo (and Alexa) and Google’s more recent Home (and Google Assistant) in their quests to control and exploit the supposedly data-rich markets of smart speaker hardware and VPA technology.

Having explained how the Echo works technically and having sketched out the cultural, technological, and economic context in which such devices could come into existence, I have paved the way to return to the main narrative of this research, which revolves around anthropomorphism. In this regard, it is important to note that the striking similarities between the smart speakers and VPAs that Amazon, Apple, and Google have introduced are not limited to the confines of function, by which I refer to the aforementioned abilities of these products, such as playing music or reading the news. Rather, these similarities spill over into the realm of form – the less tangible domain of terminology and underlying ideology with which these products are introduced to the public by the companies in question. One of the most dominant narratives within this domain, propagated by Amazon as well as Apple and Google, is that the voice-activated conversational interface encompasses an ‘intuitive’ or ‘natural’ way of interacting with computers. At the introduction of a new Home device in 2017, for example, Google notes: “The way you interact with our products has to be so intuitive you never even have to think about it and so simple that the entire household can use it” (Google Event October 4 2017 New Google Home Mini). Propagating similar narratives, Amazon and Apple further attempt to establish natural interaction between humans and computers by personifying their devices – by default addressed as persons, ‘Alexa’ and ‘Siri’ respectively (McTear 15).

The main question of this research revolves around the consequences that this anthropomorphizing of computers has on human behavior, specifically addressing the effects it has on human-computer interaction. As a case study, the Amazon Echo device is introduced. This research elaborates on that case study, bringing multiple sets of empirical data into the equation. It is divided into three main chapters, as it approaches this subject from three different angles. Each chapter is preceded by and constructed around specific sub-questions. Together, these sub-questions form a framework from which the main research question can be approached more comprehensively. Discussing different datasets, these chapters are structured according to a similar setup, a) introducing and contextualizing the dataset, b) explaining the research methodology, c) presenting the research results, and d) discussing the findings.

The Echo, compared to other such devices, makes for an interesting case for several reasons. As mentioned before, Amazon is both first mover and leader in the smart speaker market. Further, Apple’s HomePod is not yet available to the public and Google’s Home is not personified to the same extent as the Echo, as it is addressed and activated as a device rather than a person (‘OK Google’). These cases have thus respectively not sparked much research at all or not enough within domains that are of interest to this research, such as anthropomorphism of devices. The most important reason to study the Echo from the perspective of this data-driven study, however, is the simple fact that it was the first of its kind to hit the market and has thus generated a relatively dense body of data (Weinberger). What also makes the Echo an interesting case study is that Amazon, more so than its competitors, has built a platform around its devices, introducing the ‘Alexa Skills store’ that prompts developers to create custom applications for Alexa users, thereby creating a multisided market that caters to several different groups of stakeholders (Rieder and Sire 199). The availability of extensive data from different groups of stakeholders makes for a more holistic approach to the subject, one that better captures its versatility and complexity.

The first chapter of this research approaches the Echo by examining the official Facebook page on which the device has been promoted from its very introduction. This chapter analyzes the specific terminology that Amazon uses when presenting its product to the public. With access to all publicly available historical data of this substantial marketing channel – which gives a unique insight into how Amazon has thus far framed its device to the public – I propose a quantitative approach to answering the question: How does Amazon employ the notion of anthropomorphism in presenting the Echo to its prospective customers on Facebook? Also having access to the ‘engagement metrics’ (likes, shares, reactions, etc.) of this Facebook page, this research will then, through quantitative and qualitative analyses, proceed to answer the question: How does Amazon’s framing of the Echo subsequently affect the relationship between Echo users and their devices? Elaborating on the last question, this chapter argues that this relationship is not merely a product of top-down imposed marketing, but rather an ever-evolving phenomenon in flux that develops in dialogue between Amazon and its (prospective) customers. With respect to this argument, this chapter is constructed around theories of ‘prosumers’, ‘customer coproduction’, and ‘consumer publics’, among others (Lloyd; Arnould and Thompson; Arvidsson, The Potential of Consumer Publics). Taking a step back, this chapter also discusses the semantics of engagement metrics on social media, building on a dense body of literature concerning the ‘real’ and the ‘virtual’ (Rogers, The End of the Virtual; Rogers, Digital Methods; Gerlitz, What Counts?; Rieder, Studying Facebook via Data Extraction), as well as the role of Facebook with regard to (the limits of) user expression, experience, and sentiment (Thaler and Sunstein; Gillespie; Gerlitz and Helmond).

The second chapter is set against the backdrop of the ‘Alexa Skills store’, a subdomain on the Amazon website that lists over 30,000 ‘skills’: instantly accessible functionalities that can be activated on the Echo by using custom voice commands (e.g. “Alexa, what’s my flash briefing?”). With this market, Amazon invites developers and (commercial) third parties into its ecosystem, further establishing its position as a platform that caters to – and between – multiple stakeholders (Rieder and Sire 199). Approaching the Echo from the perspective of such parties, this chapter evaluates how each understands the device, asking: How do developers and third parties associated with the Echo ecosystem envision people using the device? I approach this question through a quantitative empirical analysis of the aforementioned Amazon subdomain. Subsequently, this chapter employs Skills store data to unveil how Echo users are actually using the device, asking: How is the Echo actually being put to use and how does this differ from the usage envisioned by developers and third parties in the ecosystem? How does this differ from the usage envisioned by the platform itself? To answer these questions, I rely on two extensive datasets that contain (the metadata of) over 11,000 Alexa skills in total, while also borrowing from publicly available market research on the subject (Kinsella and Mutchler; NPR and Edison Research). As the Alexa Skills store concerns a relatively new phenomenon, this chapter will borrow from platform studies, as well as build on a body of literature surrounding a similar marketplace – that of mobile applications (Helmond et al.; Islam; Guzman and Maalej).

The third chapter zooms in on more specific user behavior and sentiment with regard to the Echo. This last chapter describes both a quantitative and qualitative analysis of the interaction between Echo users and their devices. It is, to an extent, a renewal of previous research, carried out by Purington et al. and described in the 2017 article ‘“Alexa is my new BFF”: Social Roles, User Satisfaction, and Personification of the Amazon Echo’. Analyzing customer reviews by Echo buyers on Amazon.com, the researchers aim to illuminate the ways in which “people perceive, interact with, and integrate this device into social life” (Purington et al. 2854). This chapter takes a similar approach, albeit with a different, more up-to-date dataset, and revolves around the following questions: How do Echo users address and interact with their devices? How does this behavior subsequently affect user sentiment with regard to the Echo? The main dataset that is used in this chapter consists of over eighteen thousand customer reviews of the Echo on Amazon.com. Borrowing from methodologies presented in prior research, this chapter applies a structured approach to distill from this dense dataset the information needed to formulate comprehensive answers to the questions above (Purington et al.; Coyne et al.; Mudambi and Schuff).

The current study brings together theory and empirical data to establish whether and how anthropomorphism of computers is changing the ways in which we perceive and use these computers, ultimately touching upon the implications this has for the future of human-computer interaction. By approaching the Amazon Echo from three different perspectives, collecting and analyzing vast amounts of data from three of the core ‘pillars’ of the Echo ecosystem, this research proposes an extensive, yet tangible way of tackling this subject. Importantly, this study takes the opportunity to explore the largely unexplored domain of the novel and immensely popular voice-activated conversational interface. However ‘natural’ and ‘intuitive’ this interface may seem, its rapid rise can only truly be understood by first scrutinizing the companies behind it.

1. Amazon on Facebook

In 2015, research found that manipulating slot machines to expose users to an anthropomorphized description of these machines increased gambling behavior. ‘Priming’ such users with these anthropomorphic machines, the research concludes, makes them gamble – and lose – more, ultimately benefiting the casino and negatively impacting the gambler (Riva et al. 313). Anthropomorphism of devices, other research underscores, increases the user’s trust in and engagement with these devices, thereby affecting user behavior (Schuetzler et al. 12). The choice of Amazon to personify the Echo, then, can be conceptualized as a “trust-inducing design strategy”, aimed at establishing a more positive and thus more durable relationship between users and their devices (Seeger and Heinzl 130).

With regard to commerce, such a relationship may also be a more fruitful one: As research shows, trust is of paramount importance in the decision-making process of customers within the domain of e-commerce (Gefen 734). As all of these studies indicate, there are clear advantages for Amazon in introducing anthropomorphized hardware – none of which, unsurprisingly, are mentioned in the company’s official press release of the Echo (Amazon, Amazon Echo Now Available to All Customers). The current research, however, does emphasize how the anthropomorphizing of the Echo device benefits Amazon’s commercial operations. In this first chapter, I approach such commercial benefits by illuminating the ways in which Amazon actively shapes public perception of the Echo by deliberately integrating and iterating anthropomorphism narratives in its public communication around this product, ultimately exploring if and how this affects the behavior of (prospective) users with regard to their Echo.

To do so in a constructive and tangible manner, this chapter takes a two-fold approach to the concept of ‘anthropomorphism’. On the one hand, it establishes and compares the ‘degree of personification’ that Amazon and (prospective) users ascribe to the Echo device. On the other hand, these parties are scrutinized and compared for the ‘degree of sociability’ they ascribe to the device in their descriptions of varying use cases. Building on the methodology introduced in prior research, this chapter approximates these degrees by analyzing the specific language that Amazon and its (prospective) customers – or: users – use to address the device and to specify how it is (to be) used (Purington et al. 2855). This chapter applies this approach against the backdrop of the ‘Computers Are Social Actors’ (CASA) paradigm, which describes how “people respond to technologies as though they were human, despite knowing that they are interacting with a machine” (Nass et al. 228; Purington et al. 2854).
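To illustrate what a language-based approximation of the ‘degree of personification’ could look like in practice, the following Python sketch scores a text by the share of personifying references among all references to the device. It is an assumption-laden toy, not the coding scheme of this chapter or of Purington et al.: the two word lists and the function name are hypothetical and chosen only for illustration.

```python
import re
from typing import Optional

# Hypothetical word lists: terms that address the device as a person
# versus terms that address it as an object.
PERSON_TERMS = {"alexa", "she", "her", "hers", "herself"}
OBJECT_TERMS = {"echo", "it", "its", "itself", "device", "speaker"}


def personification_degree(text: str) -> Optional[float]:
    """Share of personifying references among all device references.

    Returns a value between 0.0 (object terms only) and 1.0 (person
    terms only), or None when the text never refers to the device.
    """
    # Lowercase and split on anything that is not a letter, so that
    # "it's" or "it’s" still yields the token "it".
    tokens = re.findall(r"[a-z]+", text.lower())
    person = sum(token in PERSON_TERMS for token in tokens)
    obj = sum(token in OBJECT_TERMS for token in tokens)
    total = person + obj
    return person / total if total else None


if __name__ == "__main__":
    print(personification_degree("I asked Alexa and it answered."))
```

A simple ratio like this ignores context (for instance, ‘it’ referring to something other than the device), which is one reason why such counts are best combined with qualitative reading.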

By incorporating the CASA paradigm, I add a certain layer of nuance to the analysis, arguing that in most of the cases where humans anthropomorphize their computers this does not imply that they see or treat them as equals. Following the logic of this paradigm, degrees of personification and sociability are thus to be considered within the confines of human-computer interaction and are not to be mistaken for measurement tools that transcend this domain and can simply be applied to approximate the types and ‘depths’ of interaction between equals. As the CASA paradigm indicates, computers – and other devices – have become social actors that take on various social roles in our lives, albeit still to a limited, non-human extent.

Anthropomorphism can be considered a logical consequence of the social roles that these devices have appropriated (Nass et al. 229). In the case of the Echo, this is no different. However, it is also important to view the anthropomorphizing of the Echo as a ‘response’ to Amazon’s initial introduction and framing of the device: It is a humanoid – thus social – device from the very outset. This chapter thus explores the notion of anthropomorphism and its consequences for human-computer interaction by first examining what precedes it – in this case: the marketing effort of the Echo.

One way or another, before goods are sold to customers, these customers have to be convinced to buy them. In other words: these products have to be marketed to prospective customers (Kotler 46-48). During this marketing process, companies communicate the ways in which (prospective) customers can use their products. Perhaps unavoidably, this in fact ‘nudges’ those customers towards using the product in specific ways (Thaler and Sunstein 6). In this respect, the case of the Echo is not any different: When announcing the Echo in 2014 – and in many marketing efforts since then – Amazon attached clear directions for its usage (Echo announcement Amazon). To get a deeper understanding of how Echo users interact with and make use of their devices, it is therefore of key importance to first understand the ways in which they are being ‘instructed’ to do so – whether that is before or after their purchase. In order to illuminate the ways in which such top-down instructing occurs, this research analyzes one of the Echo’s most substantial marketing channels: its Facebook fan page¹. This particular page ‘went public’ (i.e. with the first public page post) in July 2016 and counted over 514,000 followers at the time of writing.

Arguably Amazon’s largest external marketing channel for the Echo, this Facebook page forms an important object of study – a rich source that gives insight into the company’s sales strategy and broader underlying motivations. It is perhaps one of the most accurate lenses through which the rapid expansion of the smart speaker market can be analyzed, which has taken on an almost unparalleled magnitude: Only three years after the first public introduction, one-sixth of the total US adult population now owns a smart speaker (Ong). Ironically, whereas it took Facebook, a free software service, two years to reach fifty million ‘customers’, this same number was reached in three years by smart speaker hardware with selling prices between $30 and $180 (Kinsella and Mutchler 7). With Amazon commanding 72% of this potent market, this company ought to be the first under scrutiny in order to better comprehend a market that has grown at such an explosive rate. In doing so, this chapter takes the important opportunity to illuminate the rapid, yet in many ways early-stage rise of the voice-activated conversational interface (Dale 815-817).

As a research object in the domain of media studies in general and ‘platform studies’ in particular, Facebook has been approached from a vast range of perspectives. In its capacity as intermediary, Facebook is often brought forward as a multi-sided market that aims to cater to all of its stakeholders (Rieder and Sire 199). Other observers emphasize the technical specificities with which the platform determines and streamlines user behavior and data flows (Gerlitz and Helmond; Mittelstadt et al.), or the role of the platform’s ‘political affordances’ in this context (Gillespie). Whereas these studies emphasize Facebook as an entity whose technical and political affordances guide and restrict the maneuvering space of its different stakeholders, Facebook is also often conceptualized in terms of the infrastructure it offers for third parties to benefit from (Bogost and Montfort). Furthermore, at the intersection of media studies, psychology, and the social and political sciences, Facebook has notably been examined for its capacity to classify, predict, and modify user behavior (Bachrach et al.).

Borrowing from all of these approaches, yet not remaining confined to any one of them, this chapter first seeks to answer the question: How does Amazon employ the notion of anthropomorphism in presenting the Echo to its prospective customers – or: users – on Facebook? As this question indicates, the terms ‘customer’ and ‘user’ can be considered interchangeable throughout this chapter, unless otherwise stated. To approach this question, I first take a step back and briefly disconnect from the underlying theoretical framework of platform studies to emphasize the importance of the specific language that Amazon uses to nudge its users in specific directions. With this narrowed-down approach, I aim to identify patterns in the interaction between Amazon and (prospective) Echo users. Subsequently, I reconnect to the broader framework of platform studies and consult these patterns to answer the question: How does Amazon’s framing of the Echo subsequently affect the relationship between Echo users and their devices? To contribute to the formulation of a more constructive answer to these questions, this chapter makes use of empirical research. By elaborating on these questions and introducing two extensive datasets, I argue, Amazon’s underlying motivations for the anthropomorphism of the Echo can be mapped and better understood. This understanding, ultimately, is necessary for a more conclusive approximation of the main research question: How does the anthropomorphizing of the Amazon Echo device affect human behavior in general and human-computer interaction in particular?

On the one hand, research on the Facebook page of the Echo gives meaningful insight into Amazon’s underlying marketing strategy. On the other hand, as Arvidsson points out, such pages are also sites of collaborative consumer practices: places where consumers are not only told about products top-down, but also contribute to the value creation of those products (Arvidsson, The Potential of Consumer Publics 368). In such places, consumers are in fact becoming producers (Arnould and Thompson 868-870). It is often on the basis of these so-called ‘consumer publics’ that suggestions for the innovative use of products arise, which in the long run may form a “common horizon of values that (…) determine the direction of [the consumers’] passions and engagements” (Arvidsson, The Potential of Consumer Publics 370; 384). With regard to the Echo, or any such device for that matter, there has been little research on consumer publics. This chapter, however, approaches Amazon’s marketing effort on Facebook not only as top-down communication, but also as a two-way interaction between producer and consumer – the latter becoming increasingly difficult to distinguish from the former (Lloyd 42). The Echo, this chapter argues, is a fluid, ‘in flux’ product, the narrative around which is changing continuously and is at least in part determined by the product’s users in a process called ‘customer coproduction’ (Arnould and Thompson 869).

Building on the aforementioned theoretical framework that describes the conjoining of consumers and producers, this chapter supplements theory with empirical data, introducing the ‘engagement metrics’ of the Echo Facebook page. An analysis of these metrics, of ‘natively digital objects’ such as likes, shares, and reactions, gives insight into how producer-consumer interaction shapes the narrative around the Echo device. Importantly, this chapter first takes the necessary step back and approaches the semantics of such natively digital objects. In line with Richard Rogers’ studies that introduced the field of ‘digital methods’ and argued the ‘end of the virtual’ (Rogers, The End of the Virtual; Digital Methods), I argue that societal and cultural claims can be made on the basis of research of digital sources alone. Agreeing on this ‘online groundedness’, this chapter at the same time acknowledges and respects the limits of digital methods when it comes to approaching the ‘real’ through the lens of the ‘virtual’ (Rogers, Digital Methods 29). To illustrate this nuanced approach: In this chapter, a ‘like’ on Facebook is in itself an object of study that may indicate the enjoyment of a user with regard to what she liked, while at the same time presenting a form of user expression that can only be witnessed in a digital environment and thus cannot be said to be representative of any form of user expression witnessed outside of the digital domain – or: outside of Facebook for that matter.

By liking, whatever such user expression may in fact represent, users produce and engage with Facebook’s data, in this case participating in the shaping of a narrative around the Echo on the platform (Gerlitz). Thus, in the process of approaching the subject of data semantics (Rogers, The End of the Virtual; Digital Methods), it is also important to consider the role Facebook plays in the formulation of these semantics. To do so, this chapter returns to platform studies and examines how Facebook both enables and restricts user expression (Bogost and Montfort; Gerlitz and Helmond; Gillespie). It is argued that the very design of Facebook – its political and technical affordances – nudges and restricts users in voicing their true feelings with regard to the Echo device (Gerlitz and Helmond; Gillespie). As this is a commercial platform that is subject to – and benefits from – the ‘law of the network effects’, which holds that increased usage further increases usage, it strongly encourages any form of user participation (Rieder and Sire 200; Bucher 484). In this sense, Facebook is not neutral, but rather driven by technology that is designed and motivations that are commercial (Bucher 480). This research holds that a user’s like, to continue this illustrative example, cannot be regarded as mere user expression, but should also be considered to be co-produced by Facebook, which in fact benefits from increased user participation and thus stimulates such interaction (Gerlitz and Helmond 1361-1362).

However, the main aim of this first chapter is not merely to analyze and explain Facebook data, but rather to examine the behavior of Amazon on Facebook, focusing on how the company uses anthropomorphism in the narrative around the Echo to shape and modify customer behavior with regard to this device. Even though these customers indeed co-produce this narrative, as this chapter also brings forward, I emphasize how Amazon’s ‘instructions’ on Echo interaction and usage precede any such co-production process. This chapter thereby presents both a top-down and bottom-up dialogue between the Echo producer and consumer. Whatever shape or form this dialogue takes, it takes place within the confines of Facebook. As one of the main pillars of the current Echo ecosystem, this platform is thus scrutinized for the role it plays as intermediary between producer and consumer – or: company and customer – as this chapter draws on a range of studies mostly originating from the field of platform studies.

1.1 Method

The first data sample used for the research in this chapter consists of 288 unique page posts that were collected from the official Amazon Echo Facebook page. Although it spans the page’s entire public lifetime – from its first post in July 2016 to the data extraction for this research in March 2018 – this sample does not necessarily represent all contact moments between Amazon and its customers via this medium. Indeed, posts may have been deleted in the meantime. Due to Facebook privacy regulations, there is no way of recovering deleted data and revealing the complete history of the page. The 288 posts do, however, make for a substantial data sample – one that suffices for the purposes of this research.

This dataset contains the textual content of all of these posts as well as the specification of their ‘type’ (e.g. photo, video, link, etc.). The temporal and “post-demographical” properties of the posts are also included: their publication date and time and a wide range of ‘engagement metrics’ (e.g. likes, comments), as well as other information that is not relevant for this particular research (Rieder, Studying Facebook via Data Extraction 346). With the notion of ‘consumer publics’ in mind, this dataset is further supplemented by a second dataset that zooms in on user comments to posts. For the sake of a meaningful, qualitative analysis, this dataset is limited to 721 user comments – covering all twenty posts of the month of December 2017. As is discussed later, this particular period was selected in an effort to pursue academic consistency, echoing prior research that covers the same month in 2016 (Purington et al.).

Both datasets were formed using Bernhard Rieder’s Netvizz application: “A data collection and extraction application that allows researchers to export data in standard file formats from different sections of the Facebook social networking service” (Rieder, Studying Facebook via Data Extraction 346). Netvizz, which has a monthly active user base of over 3000 at the time of writing, is an application that only functions within the confines of Facebook and thus requires a user to have a Facebook account.² To retrieve data, it makes use of the sanctioned Facebook ‘Application Programming Interface’ (API) (Idem 348). Netvizz is written in PHP and runs on a server that is provided by the Amsterdam-based Digital Methods Initiative (Idem 349). The tool allows for the quantitative and qualitative analysis of friendship networks, groups, and pages on Facebook. This research only makes use of the Netvizz tool with regard to Facebook pages – in this case: the Amazon Echo page. While the analysis of both friendship networks and groups faces reliability issues on the basis of privacy settings of individual users, page engagement data – which forms the core of this analysis – can be considered more robust (Idem 349).

Although page data retrieved with Netvizz is reliable in a technological sense (i.e. it does not contain any miscalculations or leave any dubious blank spaces), there are some reliability as well as validity issues when it comes to the semantics of the data. For example, only the textual content of posts and comments is taken into consideration for this research; any accompanying images, videos, or links are left out of the equation. This strong focus on text leaves obvious questions about the impact of such added media unanswered, potentially harming the reliability and validity of the data. Further, considering page engagement data, some validity questions arise: What does a like actually represent? And what about a wow reaction? What does it mean when someone shares a post? Importantly, carrying out such research into user behavior, expression, and interaction on Facebook confines the researcher to the technical and visual affordances of such a platform – what Agre famously deemed its “grammars of action” (Agre 745). Facebook’s very architecture and policy in fact determine and thereby limit the freedom of movement and expression of its users, forming “real and substantive interventions into the contours of public discourse” (Gillespie 359).

Regardless of these semantic complexities, “for researchers from the humanities and social sciences”, as Rieder points out, “the possibility to analyze the expressions and behavioral traces from sometimes very large numbers of individuals or groups using these platforms can provide valuable insights into the arrays of meaning and practice that emerge and manifest themselves online” (Rieder, Studying Facebook via Data Extraction 347). As Rogers argues, Facebook is not merely a “virtual space” that exists in isolation of “real life”. Rather, it can be regarded as “a source of data about society and culture” (Rogers, Digital Methods 29). Compared to traditional empirical methods such as experiments or interviews, using data capturing software such as Netvizz has the added value of producing ‘observational data’ (i.e. data documenting what people do, instead of what they say they do) – besides having more obvious advantages in the domains of cost, speed, and exhaustiveness (Rieder, Studying Facebook via Data Extraction 346-347).

Having outlined what the data sample represents and how it was formed, and having briefly discussed reliability and validity issues, I now go into more detail and discuss the specific procedures that were carried out during this research. First, to retrieve the main dataset with Netvizz, the numerical page id of the Amazon Echo page was recovered using Lookup-id.³ The other parameters were then specified to capture the page’s post history in its entirety (see Figure 2). As a check, the page itself was also analyzed manually and the first post was found to date from 13 July 2016. This manual analysis revealed that the page did not allow for any posts by users to be displayed. Thus, after setting the parameters, I retrieved the posts by page only. As a further check, choosing the ‘post by page and users’ option in fact returned identical results. Netvizz then returned a zip file with two tabular files: one containing the 288 page posts (ranging from 13 July 2016 until 7 March 2018) with metadata and engagement metrics and the other merely describing the engagement statistics per day (see Appendix I). A third tabular file, describing the page’s fans per country, was not included, due to recent changes in Facebook’s API policy (Kmieckowiak).

Figure 2. The exact parameters with which this research used the Netvizz application to request data from the official Amazon Echo page on Facebook. In this case: the last 999 page posts and accompanying metadata.

Secondly, following a similar procedure, the second dataset was retrieved. Zooming in on December 2017, this particular dataset contains all user comments to the posts of this particular month. It was retrieved using Netvizz with the parameters specified in Figure 3. Again, to activate Netvizz and start data retrieval, I selected the ‘post by page only’ option. In this case, three tabular files were returned: the same two as with the aforementioned request and a third containing all user comments. In total, before filtering, this last file contained 950 comments to 20 posts (see Appendix II).

Figure 3. The exact parameters with which this research used the Netvizz application to request data from the official Amazon Echo page on Facebook. In this case: user comments on the December 2017 page posts.

Thirdly, both datasets were filtered. For the first dataset, which is referred to as ‘1A’ from here on, only the tabular file containing the actual posts was used for this research and thus subjected to filtering. For the second dataset, which is referred to as ‘1B’ from here on, this was the case only for the tabular file containing the user comments. Both ‘raw’ datasets – although some would argue that data is always already ‘cooked’ (Gitelman) – were filtered and analyzed using the built-in filter function of Google Sheets. For the filtering of 1A, the irrelevant columns of data were omitted from the file, leaving the type of post (link, status, photo, or video), its text, publish date, and engagement metrics (total engagement; likes, comments, shares, and types of reactions) (see Figure 4). Similar filtering was done for 1B, preserving the following data: the post to which the comment forms a reply, temporal data (of post and comment), whether the comment directly replies to a post or to another comment, the text of the comment, and its number of likes (see Figure 5). To not overcomplicate the analysis of 1B, second-tier and even third-tier replies (replies to replies, etc.) were omitted from the data sample, leaving 721 of 950 comments (76%).
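The omission of second-tier and third-tier replies can be illustrated with a short Python sketch. This is not the procedure used in this research, which relied on the filter function of Google Sheets. The column name `comment_message` appears in Figure 5, whereas `is_reply` is a hypothetical name for the column that records whether a comment replies to another comment, and the sample rows are toy data.

```python
def keep_top_level(comments):
    """Return only the comments that reply directly to a post."""
    return [c for c in comments if not c["is_reply"]]


def share_kept(kept, total):
    """Percentage of comments retained, rounded to an integer."""
    return round(100 * kept / total)


# Toy rows standing in for the exported comment file (hypothetical data).
sample = [
    {"comment_message": "When are we getting this in the uk?", "is_reply": False},
    {"comment_message": "Soon, I hope!", "is_reply": True},
]

top_level = keep_top_level(sample)
print(len(top_level))

# The share reported in the text: 721 of 950 comments remain.
print(share_kept(721, 950))
```

Applied to the counts reported above, the second function reproduces the 76% of comments that remain after filtering.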

Figure 4. An excerpt of the tabular file 1A after step one of filtering: omitting irrelevant columns of raw data. Row one consists of data categories. Row two contains (meta)data of a post.

Figure 5. An excerpt of the tabular file 1B after step one of filtering: omitting irrelevant columns of raw data. Row

Lastly, both the 288 posts (post_message column in Figure 4) and 721 comments (comment_message column in Figure 5) were grouped – albeit in separate files – on the basis of a single textual characteristic: whether they contained the word ‘Alexa’, ‘Echo’, both, or neither. This particular parameter was established as an indicator of the degree of personification with which both Amazon and its followers addressed the Echo device, a method derived from previous research by Purington et al. (2856). In line with the methodology of that study, posts and comments describing the technology as a person (using the name ‘Alexa’) were categorized separately from those describing the technology as an object (using ‘Echo’) and those referring to both or neither (Purington et al. 2855). Then, all posts and comments were reviewed qualitatively to establish the degree of sociability they ascribed to the device. This was done on the basis of functionalities and roles of the Echo that were described in these posts and comments. In line with the methodology of the aforementioned study, which in turn relies on the CASA paradigm to approach anthropomorphism, five separate categories were identified and coded – and recoded by a second coder – to represent varying degrees of sociability, from least sociable (0) to most sociable (4). Deviating from this methodology, this research merged the ‘Companion’ and ‘Friend’ categories, as it found the distinction between the two hard to establish and irrelevant (Purington et al. 2854) (see Figure 6).
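The keyword-based grouping described above can be sketched in Python as follows. This is a minimal illustration and not the tooling used in this research: the function name `mention_category` is hypothetical, the matching is a simple case-insensitive whole-word search, and the manual exclusion of non-personifying uses of ‘Alexa’ (such as ‘Alexa app’) is not reproduced.

```python
import re

# Whole-word, case-insensitive patterns for the two device names.
ALEXA = re.compile(r"\balexa\b", re.IGNORECASE)
ECHO = re.compile(r"\becho\b", re.IGNORECASE)


def mention_category(text):
    """Classify a post or comment as 'alexa', 'echo', 'both', or 'neither'."""
    has_alexa = bool(ALEXA.search(text))
    has_echo = bool(ECHO.search(text))
    if has_alexa and has_echo:
        return "both"
    if has_alexa:
        return "alexa"
    if has_echo:
        return "echo"
    return "neither"


# Example texts taken from Figure 6.
print(mention_category("Add Alexa to any room with Echo Dot for only $49.99."))
print(mention_category("When are we getting this in the uk?"))
```

The subsequent coding of sociability degrees is a qualitative judgment and is deliberately left out of this sketch.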

Degree of sociability (what kind of interaction with the Echo is described?)

| Code | Functionality of device | Example 1A | Example 1B |
|---|---|---|---|
| 0 | None / not specified | Say hello to the all-new Echo Dot. Add Alexa to any room for only $49.99. #JustAsk amzn.to/EchoDot | When are we getting this in the uk? |
| 1 | Information source (providing news, weather, facts) | #JustAsk for weather information and more. The all-new Echo Dot for only $49.99. | Alexa no longer recognizes "WBUR" as a streaming radio station. (after months and months of working correctly) It keeps asking me if I want to add an entry to Pandora. |
| 2 | Entertainment provider (playing music, audio books, games, telling jokes) | It’s summer so why not enjoy a soundtrack of seasonal hits? Ask “Alexa play the Summer Vibes station from Prime.” | I want to play the music I own on my Alexa. I used to be able to upload it to Amazon Music and it'd play. Now, you stopped accepting uploads. Now what do I do? |
| 3 | Personal assistant (managing shopping, timers/alarms, schedules) | Order stuff anytime night or day. Add Alexa to any room with Echo Dot for only $49.99. #JustAsk | How hard can it be to turn lights on at dusk? Echo is the only smart home device that does not have that functionality. Please add this to routines. |
| 4 | Companion / Friend (conversation partner, friend, family member, roommate, etc.) | Busting out memories from the past? It’s easy with the all-new Echo Dot available for only $49.99. #JustAsk | Chad Peery this is your girlfriend! |

Figure 6. Categories with which the degree of sociability of posts (1A) and comments (1B) was established, with examples for both datasets. This methodology was largely derived from a previous study by Purington et al.

1.2 Results

Before delving into the results, it is important to briefly mention the complications that surfaced during data retrieval, filtering, and grouping. First, the Netvizz tool, however robust for retrieving page data, has at least one weak spot in this respect. As displayed in Figures 2 and 3 above, the maximum number of posts that can be retrieved is 999 at a time. If a page has fewer than 999 posts – which is difficult to establish beforehand – while the researcher requests this maximum retrieval of 999, Netvizz returns a tabular file of 999 posts that repeats some of the earliest post(s). In the case of the Amazon Echo page, the tabular file of 1A contained around 700 duplicates of the page’s first three posts. These duplicates had to be removed manually. As was established by comparing data manually, this glitch did not, however, skew any (meta)data of these posts. Secondly, there was at least one major outlier in 1A, distorting the average results and complicating any meaningful observations. In the following section I discuss how this complication was resolved. Lastly, the coding of the types of interaction to establish degrees of sociability remains a manual task and is thus exposed to subjectivity and bias. To tackle such issues a second coder was employed and the inter-coder reliability was established at Cohen’s κ = 0.85 (Cohen 37-40).
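For readers unfamiliar with the inter-coder reliability measure mentioned above, the following Python sketch shows how Cohen’s kappa is computed from two coders’ labels: the observed agreement is corrected for the agreement expected by chance. The function name and the toy labels are illustrative assumptions; they are not the coding data of this research.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two equally long lists of category labels."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty set of items.")
    n = len(coder_a)
    # Proportion of items on which both coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's label frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used one and the same label throughout
    return (observed - expected) / (1 - expected)


# Toy sociability codes (0-4) assigned by two hypothetical coders.
first = [0, 1, 2, 2, 3, 4, 2, 1]
second = [0, 1, 2, 2, 3, 4, 2, 2]
print(round(cohens_kappa(first, second), 2))
```

A value of 1 indicates perfect agreement, while a value of 0 indicates agreement no better than chance.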

The first research results of the 1A dataset revolve around the use of the words ‘Alexa’ and ‘Echo’ in posts, where the former represents a higher level of device personification than the latter (Purington et al. 2855). As described in the methodology, posts mentioning ‘Alexa’ were separated from those mentioning ‘Echo’, those mentioning both and those mentioning neither. Importantly, posts mentioning ‘Alexa’ in a non-personifying manner (e.g. ‘Alexa app’) were excluded from this first category. As shown in Figure 7, more than half of the posts (N=160) addressed or described the technology as ‘Alexa’. The percentages in the right column are rounded to integers, as are all percentages in the remainder of this research.

Figure 7. The mentioning of ‘Alexa’ and/or ‘Echo’ on the Amazon Echo Facebook page (Dataset 1A, retrieved on 7 March 2018, Appendix I).

As this indicates the company’s strong preference to personify the device in its external communication, it is interesting to delve deeper and assess whether and how this may affect engagement metrics. In Figure 8 below, the engagement metrics of ‘Alexa’-posts are displayed, with the second column representing the total engagement per category and the third column the average amount of engagement per post. Highlighted in green are the categories and relative metrics with higher engagement scores compared to those of the ‘Echo’-posts and vice versa (displayed in Figure 9). Overall, ‘Alexa’-posts have an average total engagement of 464 per post, whereas ‘Echo’-posts lag behind with 424. Thus, speaking in marketing terms, the ‘Alexa’-posts have a 9% higher ‘success’ rate when it comes to audience engagement.

Figure 8. The engagement metrics of ‘Alexa’-posts. Green marking indicates a higher rating compared to Figure 9.

Figure 9. The engagement metrics of ‘Echo’-posts. Green marking indicates a higher rating compared to Figure 8 (Dataset 1A, retrieved on 7 March 2018, Appendix I).

As mentioned before, the 1A dataset contains a major outlier that has quite drastic consequences for the data depicted in the third columns of Figures 8 and 9. Figure 10, which displays the amount of engagement with ‘Echo’-posts over time, shows how one particular post at the end of 2017 sparked an unparalleled amount of engagement. This post, containing a playful video of people using the Echo as an assistant to escape from a room, dwarfs all other posts with a total engagement of 16,507. When removing this post from the dataset, the average engagement of ‘Echo’-posts declines from 424 to 181. In contrast, removing the most successful ‘Alexa’-post results in an average engagement of 440 per ‘Alexa’-post (down from 464) – a considerably smaller drop. Omitting these two data points results in an average engagement for ‘Alexa’-posts that is almost two-and-a-half times higher than that of ‘Echo’-posts.
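The effect of a single outlier on an average can be made concrete with a small Python sketch. The engagement counts in the list are hypothetical, apart from the outlier value of 16507 that is reported above; the final line only recomputes the ratio of the two trimmed averages given in the text.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def mean_without_max(values):
    """Mean after dropping the single highest value."""
    if len(values) < 2:
        raise ValueError("Need at least two values to drop one.")
    trimmed = sorted(values)[:-1]
    return mean(trimmed)


# Hypothetical engagement counts; only 16507 is taken from the text.
echo_like = [150, 200, 190, 16507]
print(mean(echo_like))              # inflated by the one viral post
print(mean_without_max(echo_like))  # average of the remaining posts

# Ratio of the trimmed averages reported in the text (440 versus 181).
print(round(440 / 181, 2))
```

The ratio of roughly 2.43 corresponds to the ‘almost two-and-a-half times’ mentioned above.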

Figure 10. Amount of engagement to ‘Echo’-posts over time (Dataset 1A, retrieved on 7 March 2018, Appendix I).

Whereas the figures above revolve around the degree of personification ascribed to the Echo in posts on its official Facebook page, Figure 11 approaches the device’s degree of sociability, extracted from descriptions in posts of its functionalities and roles. In the left column, the categories that were identified from these functionalities are listed (‘none’, ‘info’, ‘entertainment’, ‘assistant’, and ‘companion / friend’); the second column lists the categorization codes; the third the number of mentions per category, where one post on some occasions mentions multiple categories; the fourth the percentage per category (N=309); the fifth and sixth the total engagement per category and the average engagement per post per category respectively. As the figure below indicates, most posts (85%) contain at least one suggestion or reference to using the Echo in a specific manner. When communicating such types of device interaction to (prospective) customers, Amazon most often utilizes an entertainment narrative. As the engagement metrics reveal, suggesting how customers could use the Echo for playing music, games, or other forms of entertainment also generates the most engagement.

Categories of interaction, mentioned in posts (N=288)

| Category | Code | X (mentions) | % | Engagement | Avg. engagement |
|---|---|---|---|---|---|
| None | 0 | 46 | 15 | 11793 | 256 |
| Info | 1 | 49 | 16 | 24644 | 503 |
| Entertainment | 2 | 119 | 39 | 62895 | 529 |
| Assistant | 3 | 83 | 27 | 42691 | 514 |
| Companion / Friend | 4 | 12 | 4 | 3526 | 294 |
| Total | | 309 | | | |

Figure 11. Categories of interaction with the Echo mentioned in Facebook posts. Some posts mention multiple interaction types (Dataset 1A, retrieved on 7 March 2018, Appendix I).
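The derived columns of Figure 11 can be recomputed from the mention counts and engagement totals, as the following Python sketch shows. The variable and function names are illustrative; the numbers are those reported in the figure, and percentages are calculated over the total number of mentions rather than the number of posts.

```python
# (mentions, total engagement) per category, as reported in Figure 11.
rows = {
    "None": (46, 11793),
    "Info": (49, 24644),
    "Entertainment": (119, 62895),
    "Assistant": (83, 42691),
    "Companion / Friend": (12, 3526),
}

total_mentions = sum(mentions for mentions, _ in rows.values())


def derived(mentions, engagement):
    """Return (share of all mentions in %, average engagement per mention)."""
    share = round(100 * mentions / total_mentions)
    average = round(engagement / mentions)
    return share, average


table = {name: derived(*values) for name, values in rows.items()}
for name, (share, average) in table.items():
    print(f"{name}: {share}% of mentions, {average} average engagement")
```

Because some posts mention several interaction types, the total of 309 mentions exceeds the 288 posts in the dataset.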

As introduced earlier, the 1B dataset contains the 721 comments to all twenty posts from December 2017 (see Appendix II). As Figure 12 reveals, the vast majority of these comments (85%) do not mention ‘Alexa’ or ‘Echo’ at all, whereas all posts to which these comments respond mention at least one of the terms. Instead, qualitative analysis reveals, commenters show a clear preference for responding by mentioning a Facebook friend: 68% of all comments mention – and thereby notify – at least one friend. Interestingly, of the fifteen percent of comments that do mention ‘Alexa’, ‘Echo’, or both (N=109), 54% address the device in the same way as the accompanying post does. 39% break with this ‘tradition’, while the remaining seven percent either use both names where the post mentions only one or use only one of the two names mentioned in the post.

Figure 12. The mentioning of ‘Alexa’ and/or ‘Echo’ in comments on the Amazon Echo Facebook page in the first two weeks of December 2017 (Dataset 1B, retrieved on 7 March 2018, Appendix II).

Reorienting the focus from the degree of personification of the Echo to the degree of sociability that surfaces in these comments, in the same fashion as with the 1A dataset, the first thing to note is that only fourteen percent of the comments in 1B mention interaction types at all (N=103). Also, when interaction is mentioned, it is not necessarily the description of a personal experience that has already occurred. Rather, it often concerns descriptions of how a commenter wishes to (but cannot yet) interact with the device in the future, or recommendations to third parties on how the Echo might be used. However, the different motives behind these comments do not cloud the fact that they all share a description of at least one interaction type with the Amazon Echo. As Figure 13 shows, with 63% of these comments mentioning an interaction type that falls under the umbrella of entertainment, this is again the most mentioned category. Further, 75% of these comments mention the same interaction type as described in the accompanying post, indicating a strong correspondence between the content of posts and comments in this regard.

Figure 13. Donut chart of interaction types mentioned in comments to posts on the Amazon Echo Facebook page in December. Taken into consideration are only those comments describing an interaction type at all – fourteen percent (N=103) of the total comments in this research sample (N=721) (Dataset 1B, retrieved on 7 March 2018, Appendix II).

[Donut chart: Categories of interaction, mentioned in comments (N=103). Segment values: 7%, 63%, 23%, 7%.]

1.3 Discussion

As research shows, there are clear commercial advantages for Amazon in anthropomorphizing the Echo device (Riva et al.; Schuetzler et al.; Seeger and Heinzl; Gefen). The results brought forward in this chapter approach such underlying motives by highlighting some of the ways in which this company integrates and iterates anthropomorphism narratives in its public communication around the Echo. These results also imply that in doing so, Amazon effectively shapes public perception of the Echo, subsequently affecting user behavior with regard to the device. To approach the subject of anthropomorphism in a constructive manner, I borrowed from the methodology brought forward in prior research and established the degrees of personification and sociability ascribed to the device by both Amazon and (prospective) users (Purington et al.; Nass et al.). I further concentrated the research focus by introducing two questions through which to approach the subject: How does Amazon employ the notion of anthropomorphism in presenting the Echo to its prospective customers on Facebook? And: How does Amazon’s framing of the Echo subsequently affect the relationship between Echo users and their devices? In the remainder of this section I formulate answers to these questions and discuss how this dual perspective, which examines both a top-down and bottom-up line of communication between producer and consumer, contributes to a more conclusive approximation of the main research question: How does the anthropomorphizing of the Amazon Echo device affect human behavior in general and human-computer interaction in particular?

User behavior with regard to the Echo, this chapter argues, is on the one hand determined by how Amazon ‘packages’ and presents the device to these users. This top-down marketing effort precedes and may thereby guide – or: nudge – actual usage in specific directions (Thaler and Sunstein 6). To better understand how users interact with and make use of their devices, it is therefore of great importance to first illuminate how they are being instructed to do so. Dataset 1A partly caters to these needs and provides answers to the first sub-question of this chapter, as it maps how Amazon frames the Echo on one of its most substantial marketing channels. As this dataset indicates, Amazon shows a strong preference for addressing the Echo as a person (i.e. ‘Alexa’) in Facebook posts. As such anthropomorphism increases the level of trust that users place in their devices, which in turn increases their inclination towards making purchases of or through these devices, this may very well be a deliberate “trust-inducing design strategy” of Amazon before anything else (Gefen 734; Seeger and Heinzl 130). With regard to the degree of sociability that Amazon ascribes to the Echo in Facebook posts, 1A also shows how the company most often brings forward use cases that describe how the Echo can be used as an entertainment device (39%), followed by instructions on how to use it as a personal assistant (27%).

To understand why Amazon prefers to ascribe these particular degrees of sociability to the Echo, it is important to bring forward how the narrative around the Echo, on the other hand, is also determined by the users themselves. In a process called ‘customer co-production’, customers actively – yet often unknowingly – participate in the value creation around the products they use, which in that sense becomes a public affair that is also affected bottom-up (Arvidsson, The Ethical Economy 330; The Potential of Consumer Publics 368). In this regard, Amazon’s choice to present the Echo as an entertainment device or as a personal assistant becomes clearer. As the engagement metrics in 1A indicate, posts in these specific categories spark the most user engagement, which, according to the law of network effects, further increases engagement and thereby exposes the Echo to more (prospective) customers (Bucher 484). A similar pattern can be found with regard to the degree of personification in the 1A dataset, where ‘Alexa’-posts spark much more user engagement than ‘Echo’-posts. This indicates that Amazon is in fact giving up a certain amount of control over the formulation of the Echo narrative, ceding more agency to the expanding Echo ‘community’ that finds pleasure in contributing to the narrative (Arvidsson, The Ethical Economy 335). On a broad level, this implies a blurring of the lines between the consumption and production of the Echo: Even though the marketing effort precedes user reception in many ways, it is also heavily guided and shaped by this very reception (Arvidsson, The Potential of Consumer Publics 367). Dataset 1A thus shows how the narrative around the Echo – instructions on its form (i.e. personification degree: how to address it) and function (i.e. sociability degree: how to use it) – is not created by Amazon alone, but is rather the product of a fluid dialogue between the company and its (prospective) customers, shaped by both top-down and bottom-up influences. To some extent, this makes the second sub-question, on how Amazon’s framing of the Echo affects the relationship between Echo users and their devices, somewhat obsolete, as this question perhaps leans too heavily on a supposed one-way framing process. An in-depth analysis of the 1B dataset, however, brings some noteworthy patterns to the surface, indicating how Amazon does in fact affect user behavior in a more direct, top-down manner. First, with regard to the degree of personification ascribed to the Echo by Amazon in posts, over half of the comments that mentioned ‘Echo’ and/or ‘Alexa’ did so by addressing the device in the exact same manner as the post to which they responded. Second, with regard to the degree of sociability, 75% of the comments that mentioned an interaction with the Echo described an interaction of the same type as was brought forward in the corresponding post. As these strong correlations in the 1B dataset indicate, user behavior with regard to the Echo is, to a large extent, preceded and shaped by Amazon’s framing of the device.

Having formulated answers to the sub-questions of this chapter, I can now explain how these answers contribute to an answer to the main research question. To do so, however, it is important to first take a step back and discuss the semantics of the datasets proposed in this chapter. As this entire research takes place in the online – or: digital – domain, it is essential to deliberate on the ways in which this relatively novel (in academic terms) and rapidly changing environment should be studied with regard to societal and cultural questions. To this end, the research practice of ‘digital methods’ was introduced. Digital methods propose to not only study (natively) digital objects (e.g. likes) within the confines of a particular medium (e.g. Facebook), but also to utilize them as a means to approach social and cultural research questions (Rogers, Digital Methods 6-7). Focusing largely on the societal conditions that web data can approximate, Rogers introduces the quintessential challenge faced by any web research: Are the findings to be ‘grounded’ in the offline or in the online (Idem 10-11)? In the case of this research I argue the latter, following the research practices proposed by Rogers: It is possible to make claims about cultural change and societal conditions on the basis of the online findings presented in this chapter (Digital Methods 29).

However, the scope of claims about the ‘real’ that can be grounded in this research of the ‘virtual’ is limited by the fact that it is confined to the domain of Facebook. As Rogers argues, digital methods “follow the medium”: they think along with and thereby learn from the devices and objects they handle, repurposing the methods of the medium to carry out research on (the effects of) this very medium (Digital Methods 6). In following a certain medium, the research becomes ‘medium-specific’: It considers and/or takes place in the domain of a medium with its own distinctive, ontological features (Idem 35). While ‘media’ in general and ‘medium-specificity’ in particular have been conceptualized in many different ways (McLuhan; Hayles), this chapter follows the medium that is Facebook and argues that it not only refashions existing forms of output known from other media (e.g. text) (Fuller), but also produces entirely new outputs (e.g. likes) that call for new research methods (Manovich; Rogers, Digital Methods 36). Facebook enables companies and users to express themselves in both old and new ways (Bogost and Montfort), while at the same time constraining the forms of such expression through the technical and political affordances of its very design (Gillespie; Gerlitz and Helmond). By implementing various nudging techniques, Facebook also influences the decision-making process of users, changing and even creating forms of user behavior (Bucher 479).