Technology acceptance of the smart speaker exploring factors affecting the use intention of an emerging technology

Robbert Willem de Kruijff 11861542

Technology Acceptance of the Smart Speaker

Exploring factors affecting the Use Intention of an emerging technology

MSc. in Business Administration – Digital Business Track

University of Amsterdam – Amsterdam Business School

Supervisor: Andrea Ganzaroli

June 2018

Statement of originality

This document is written by student Robbert Willem de Kruijff who declares to take full responsibility for the contents of this document.

I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it.

The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.

Preface

This thesis is written as part of the Master's program Business Administration – Digital Business track, at the University of Amsterdam. The Digital Business track is considered to be a boundary spanner between the digital world, the market and other disciplines. This thesis presents empirical research on the acceptance of the Smart Speaker, with the goal of bringing the knowledge acquired during the Master’s program into practice.

I’m really grateful for the valuable comments on earlier drafts of this thesis that were given by my supervisor Andrea Ganzaroli, as well as for the clarifying insights in the field of Technology Acceptance. Furthermore, I would like to thank him for the great support he provided in the process.

Abstract

This research contributes to the rational understanding of the acceptance of the technology known as the Smart Speaker, defined as: “A hands-free speaker powered with digital voice assistant using two-way voice computing technology that is highly connected (based on Koo & Nam, 2017).”

In the literature review, two fields of theory have been selected and applied: first, the technologies embedded in the Smart Speaker (Spoken Language Dialog System, Voice Search as application and Smart Technologies), and second, the Technology Acceptance Model (TAM) and further developments of this concept (TAM2, UTAUT and HMSAM), including a case study using the TAM.

Factors affecting one’s Use Intention, based on the TAM and its later developments and substantiated by the technology background, are explored by means of a survey amongst 182 respondents with the following research question in mind: What are motivations and perceptions that affect people’s intention of adopting the AI-based Smart Speaker?

This resulted in several factors that were found to significantly affect (directly or indirectly) the Use Intention of the Smart Speaker, such as Social Influence, Perceived Entertainment, Perceived Usefulness and Perceived Ease of Use. Interesting observations concern the Interface Familiarity (the new era of human-computer interface with voice control) and Apprehensiveness (trust and privacy issues when using this AI-based technology).

Finally, future research could be conducted on the Virtual Assistant software, not restricted to the Smart Speaker. Furthermore, one could look at specific contexts (such as the elderly or a work environment), at advertising off-screen, at the Smart Speaker with the

Abbreviations

APP Apprehensiveness

DV Dependent Variable

HMSAM Hedonic-Motivation System Adoption Model

IF Interface Familiarity

IV Independent Variable

H3 Hypothesis 3

M Mediator / Mean

PE Perceived Entertainment

PEU Perceived Ease of Use

PU Perceived Usefulness

SD Standard Deviation

SI Social Influence

SLDS Spoken Language Dialog System

TAM Technology Acceptance Model

TAM 2 Extended Technology Acceptance Model

UI Use Intention

UTAUT Unified Theory of Acceptance and Use of Technology

1. Introduction

1.1. A prosperous technology

“‘OK Google’ … ‘What’s playing tonight?’, Google Assistant will show films at your local cinema. And if you add ‘We’re planning on bringing the kids’, Google Assistant will know to serve up show times for kid-friendly films. You could then say ‘Let’s see Jungle Book’, and the assistant will purchase tickets for you” (Dale, 2016). This statement in the research of Dale (2016) is a great example of voice-controlled technologies becoming more and more intelligent in mimicking human interaction.

Within this piece of innovation, a couple of promising technologies come together in the comfort of our homes: the Smart Speaker is voice-controlled, smart (meaning that it can be connected to other smart devices) and connected to the internet (providing a doorway to endless possibilities). A collaborative study of NPR and Edison Research (The Smart Audio Report, 2017) shows that 16% of Americans older than 18 already own a Smart Speaker (about 39 million people).

Not only do these people own one, they also seem highly satisfied. The study shows that, of the people who own a Smart Speaker, 65% could not imagine a life without one. According to NPR and Edison Research, “Smart Speakers are changing behaviours and forming new habits” (The Smart Audio Report, 2017).

1.2. Google, Amazon…

In 2014 Amazon was the first to introduce a commercialized wireless playback device that featured a voice-activated digital assistant. This so-called Smart Speaker is on the rise, as the numbers in the Smart Audio Report confirm. Since 2014, Google has introduced its Google Home with the Google Assistant as digital assistant software (Lerner, 2017), and Apple could not stay behind and recently launched its Apple HomePod (Jaffe, 2018).

Not only the big technology firms, but also several entrepreneurs tap into this technology with more “niche” applications of the Smart Speaker, such as SMARTY. SMARTY is a virtual assistant created by a startup called Siliconic Home, uniquely based on the voice of children; SMARTY positions itself as a kid-friendly Smart Speaker. The patented natural language processing technology can recognize the voices of kids, which have a significantly different pitch compared to the voices of adults (Montgomery, 2016).

Another example is Olly, which is being created by a startup called Emotech in London. Olly differs from the other smart speakers: this virtual assistant has a personality that can develop and evolve as the result of conversations with the consumer. This means Olly understands people's way of communicating, including context and an understanding of whether or not information is appropriate for the user (Montgomery, 2016). Both these entrepreneurs, focusing on specific niches, confirm the great possibilities and the rapid development of such technologies.

1.3. Expectations

The research of Gartner shows that expectations are growing significantly: in their Hype Cycle, the Virtual Digital Assistant (which is the software, and thus the backbone of the Smart Speaker) went from “Innovation Trigger” to “Peak of Inflated Expectations”, as presented in Appendix 1. They predict that in the next 5-10 years the technology is going to

1.4. Barriers

A white paper by Symantec explains that the main benefit of using the Smart Speaker is that the voice-activated assistant can access all the intelligence in the backend, translating every request into an appropriate task. On the other hand, there is also a downside, as the same report mentions privacy and trust risks. This is illustrated with metaphors like “The attack of the curious child” (children ordering stuff online) and “the tale of the mischievous neighbor” (whispering requests through the window), both explaining the issues around trust in these devices (Wueest, 2017). See Appendix 2 for an example of a story of such trust issues in a news article.

The Smart Speaker can create great opportunities, but there are also barriers to the adoption of this technology. For instance, research of Voice Labs states that the Smart Speaker is still seen as a luxury product and that consumers often still see it as another confusing or redundant platform or interface (Marchick, 2017). Moreover, a chart from Statista (based on data from NPR and Edison Research; The Smart Audio Report, 2017) shows reasons for not owning a Smart Speaker such as: “too expensive, not enough information available about the Smart Speaker, not going to use it enough, worried about hackers, bothered that the speaker is always listening, spend more money with it and listening by the government”. Thus, it can be concluded that barriers to the adoption of this technology are related to concerns about, among others, the usefulness of the device and trust and privacy risks (Cakebread, 2017).

1.5. This Research

When it comes to the Smart Speaker, there is a gap: technological feasibility and market expectations are not well matched. People do not know the technology very well yet and might expect something different from, or more than, what is possible in reality. This is why the Smart Speaker calls for a rational understanding. In other words, the objective of this thesis is to assess the extent to which people intend to adopt the Smart Speaker and to explore important factors possibly driving this intention. With this in mind the following research question is formulated:

What are motivations and perceptions that affect people’s intention of adopting the AI-based Smart Speaker?

In order to answer this research question, the Smart Speaker is assessed by performing an adjusted Technology Acceptance Model (TAM) analysis, an assessment not made in the literature so far. Holistically, the model helps the researcher to sum up the perceived “benefits” and “costs” of the technology in order to understand the attitude towards using the Smart Speaker and eventually adopting the technology. This model, as introduced for the first time by Davis, et al. (1989), roughly looks like Figure 1. In this figure, Perceived Usefulness can be seen as a benefit and Perceived Ease of Use as a cost. More detailed explanations of the model and the variables used will be presented in the Literature Review and the Research Model.

1.6. Adoption of Spoken Language Dialog System (SLDS)

The so-called Virtual Digital Assistants, such as Google Assistant and Amazon Alexa, are the software embedded in the Smart Speaker. This software is the backbone of the Smart Speaker technology: it turns your spoken language into a task and gives feedback to you by speech. More about this technology, which in research is more often called a Spoken Language Dialog System (SLDS), will be explained in the literature review.

But for now the question is: what does previous research say about the adoption of such SLDSs? Through the years, a lot of research has been done on such voice-controlled systems, and the conclusion so far has been largely the same: in theory the idea of an SLDS is there and the outlook for application of the technology is very promising (Joshi, 1991; Liddy, 2001; Chowdhury, 2003).

The Technology Acceptance Model has been used before for analysing speech recognition technology or SLDS. A study on the user acceptance of voice recognition technology done by Simon & Paper (2007) suggests that their adapted TAM had great predictive power (incorporating a “subject or social norm factor” next to “Perceived Usefulness” and “Perceived Ease of Use”). They furthermore mention that the rapidly evolving speech recognition technology will become less prone to error, more sophisticated, more powerful and more user-friendly. This can lead to fewer fluctuations in the Perceived Usefulness, the Perceived Ease of Use and, in the case of their research, the social norm. According to Simon & Paper (2007) this also impacts the intention to use and the actual use of the system. This calls for new research testing the most recent technologies with an adapted TAM: is the Smart Speaker the more stable SLDS technology that people are willing to adopt?

Six years later, not much has changed. Dahl (2013) states that, looking at the future, applications for natural language processing would become more and more capable. This development will be based on factors such as the increasing power of devices, the development of new techniques for exploiting the vast amounts of data available on the Internet, and related technologies like speech recognition. In other words, Dahl (2013) says that the synergy of these factors will make future applications of natural language processing very likely to become a part of our lives.

More recently, Dale (2016) points out an important gap. He says that it makes sense that research is ahead of the actual products, because the commercial benefits for companies are often not clear. Dale (2016) mentions the risk, seen in the past, that the newest technologies remain in research. He even states that the next milestone for the Big Four of technology (Apple, Google, Amazon and Facebook) is to move towards truly conversational interactions, taking context into account rather than analyzing merely a sequence of independent conversational pairs.

To summarize, many studies mention the promising technology of SLDS, point out the developments and call for actual products. Did the Big Four of technology create these kinds of products with their Smart Speakers? And could this hardware become something tangible and feasible in this line of thought? The missing link is clear and so is the question concerning this part of the Smart Speaker: is the improvement of the technology of SLDS (like better translating requests into tasks, better understanding conversations including context and being more robust against noise) going to improve the attitude of people towards the adoption of this technology?

Since the Smart Speaker is still quite young and in the early stages of its promising development, the research currently available is limited. However, the research available was a good basis to define the Smart Speaker for this thesis. In the literature review the technologies embedded in the smart speaker are reviewed and combined with the development and various versions of the TAM. From this theoretical background variables are extracted for the research framework (and its hypotheses), which are later researched by means of a questionnaire. The reported results are followed by a discussion (including some striking results and limitations to the research) and finally the conclusion (including

2. Literature Review

In the literature review, this thesis addresses two parts that are relevant for the further process of the research. Firstly, the literature on the technologies that are embedded in the Smart Speaker will be explored. This is done in order to create a better understanding of what research has found on these technologies and which challenges, and thus relevant motivations or barriers, can be formulated as predictors for adopting the Smart Speaker.

Secondly, the TAM and its later adjusted models are reviewed in order to find the variables that are both relevant and applicable for the research model. These variables will be the basis for the research model, the corresponding hypotheses and eventually the questions for the survey.

2.1. Technology review

The Smart Speaker has been given several names (as also seen in the introduction), such as a voice-controlled speaker, an Artificial Intelligence speaker or a Virtual Assistant speaker (Koo & Nam, 2017). To clarify what the Smart Speaker is, the technology review starts by giving a definition of the Smart Speaker, which will also be the foundation for, and support the cohesion of, the reviewed literature.

“A hands-free speaker powered with digital voice assistant using two-way voice computing technology based on cloud computing (Koo & Nam, 2017).”

Based on this definition, three interesting features come together in the Smart Speaker, whose combination makes it different from other technologies. The Smart Speaker is: a Spoken Language Dialog System (digital voice assistant using two-way voice computing), a Smart

2.1.1. Spoken Language Dialog System

As mentioned briefly in the introduction, a Smart Speaker is a new version of a so-called Spoken Language Dialog System (SLDS). Figure 2 creates a general understanding of what such a system implies. In the definition this is described as “digital voice assistant using two-way voice computing technology”.

Figure 2: Spoken Language Dialogue System (Bertrand, et al., 2010)

In other words: the system translates your spoken language into text, the dialogue manager translates that text into meaning and a certain task (this is the work of the digital voice assistant, possibly with help of its connections or applications), and finally the system gives a response or feedback by generating and synthesizing spoken text.
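The stages described above can be sketched as a small pipeline. This is only an illustration under stated assumptions: all function names, the two example intents and the "<speech>" marker are hypothetical, and the speech recognition and speech synthesis steps are replaced by simple stand-ins, since the source does not describe how any particular Smart Speaker implements them.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Meaning extracted from an utterance: an intent plus its argument."""
    intent: str
    argument: str


def recognize_speech(audio: str) -> str:
    # Stand-in for speech recognition (assumption): the "audio" is already
    # a transcript, so we only normalize it.
    return audio.strip().lower()


def interpret(text: str) -> Task:
    # Dialogue manager, part 1: translate the text into meaning and a task.
    # The two intents below are hypothetical examples.
    if text.startswith("play "):
        return Task("play_music", text[len("play "):])
    if "weather" in text:
        return Task("get_weather", "today")
    return Task("unknown", text)


def execute(task: Task) -> str:
    # Dialogue manager, part 2: carry out the task and generate response text.
    # A real assistant would call connected services or applications here.
    if task.intent == "play_music":
        return f"Playing {task.argument}."
    if task.intent == "get_weather":
        return "Here is the weather for today."
    return "Sorry, I did not understand that."


def synthesize(response: str) -> str:
    # Stand-in for speech synthesis (assumption): mark the text as spoken.
    return f"<speech>{response}</speech>"


def handle_utterance(audio: str) -> str:
    # Full loop: speech -> text -> meaning/task -> response text -> speech.
    text = recognize_speech(audio)
    task = interpret(text)
    response = execute(task)
    return synthesize(response)


print(handle_utterance("Play jazz"))
```

Keeping each stage as a separate function mirrors the separation between recognition, dialogue management and synthesis described in the text, so any one stage could be swapped out without touching the others.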

The great potential of the Smart Speaker, and thus the SLDS, can be explained by taking a step back and looking at the general development of the human-computer interface (the interaction between human and computer), which is well illustrated in Figure 3. The research explains: “the desktop, browser and search metaphors of the last decades leads to a new solve metaphor focused on context and tasks” (Bellegarda, 2013).

Bellegarda (2013) concludes this figure by stating that the user will get more used to expressing a more general need and thereafter let a system fulfill (or solve) this need.

When it comes to expressing a more general need, research states that speech is the most essential and primary way of communicating for human beings (Prabhakar & Sahu, 2013). It follows that spoken language has the potential of being an important mode of interaction with computers. Furthermore, the research says that today speech technologies are commercially available for a limited but interesting range of tasks. These technologies enable machines to respond correctly and reliably to human voices and provide useful and valuable services and tasks (Prabhakar & Sahu, 2013).

2.1.2. Voice Search and Smart Technologies

Two other fields of technology research or application that are interesting to look into are Voice Search and Smart Technologies, since these are also unique qualities that come together within the Smart Speaker. The Smart Speaker empowers Voice Search by making it available at home and enables Smart Technologies to be controlled with the Smart Speaker as one central point.

One specific application of voice based human-computer interaction is the voice web search. This is separately mentioned since it illustrates the trend of voice-controlled technology. Schalkwyk, et al. (2010) found that voice (web) search is growing rapidly and many users intend to become frequent users. When it comes to voice mobile searches the following results for this research are interesting, Voice Search:

The question for now is what the influence of Smart Speakers on this growth is, and whether these issues would be different in the privacy of our homes with an intelligent speaker. In answer to this, the research of Moorthy & Vu (2014) found that participants preferred using a Voice Activated Personal Assistant (VAPA) (another term for the Virtual Assistant or SLDS) in private locations (home). On the other hand, even in their homes people are skeptical about using a VAPA when it concerns more private input, compared to more general (less personal) input.

As mentioned in the definition and the name of the device, “smart” refers to the possible cloud-computed connections the Smart Speaker is capable of. This is an interesting and unique function for its context. The Smart Speaker could be connected to other devices in the home that have smart features (think of lights, curtains or the thermostat). In other words, it could be part of a smart home or a so-called “Ambient Intelligence”, or even more, it could be the central device controlling the smart home. That is why this part is more focussed on the possible contribution a Smart Speaker could make in these environments rather than on the technology itself.

Within the research of Chan, et al. (2009) the test case for Ambient Intelligence (AmI) is based on the elderly, who, with the help of AmI, could be assisted and remain independent. The article of Chan, et al. (2009) reviews various technologies available for smart homes: “The devices to monitor health and activity and provide assistance in the home must be non-obtrusive and acceptable to users. The needs of users require more research.” Chan, et al. (2009) furthermore state: “AmI means making the environment sensitive to the user by using technology”.

In similar research, one statement points more in the direction of the Smart Speaker: Cook, et al. (2009) mention that such systems should understand when to interrupt a user, when to suggest something and when not to. This suggests that more control should be possible in AmI systems.

Balta-Ozkan, et al. (2013) state the following about the smart home industry, with energy as the context but with outcomes that are still very relevant for this research:

• Households say they would adopt such technologies in large quantities if these people would not have to change daily routines;

• The usefulness and benefits of the smart technology will have to be clearly stated and demonstrated;

• Increased control of technologies such as these can help to counteract consumer resistance;

• Finally data privacy is an issue in smart home technologies. This could be dealt with by privacy friendly techniques. But on the other hand, experts say too much

2.2. Evolution of the Technology Acceptance Model

In order to measure and validate the separate motivations and issues when adopting the Smart Speaker, a rational research model has to be formulated and interpreted for the Smart Speaker. As mentioned in the introduction, the Technology Acceptance Model, or an adjusted version of the TAM, will be presented as the research model for this purpose. This model examines specific factors that may influence technology adoption, such as those presented in the first version of the TAM: Perceived Usefulness and Perceived Ease of Use. The development of these models and their applicable factors will be discussed and used as the basis for the adapted version of the TAM, the research model. The following research and models are explored:

• Original TAM by Davis, et al. (1989)

• Advanced TAM by Venkatesh & Davis (2000)

• UTAUT model by Venkatesh, et al. (2003)

• HMSAM model by Lowry, et al. (2012)

• A previous Case study by Kwon & Chidambaram (2000)

2.2.1. Technology Acceptance Model

Research in the acceptance of information technology has delivered many different models, all having factors measuring the acceptance of a certain technology. The first and up until now the most widely accepted model is, as mentioned in the introduction, the Technology Acceptance Model (TAM) introduced by Davis, et al. in 1989.

Davis, et al. (1989) based the model on earlier research by Fishbein & Ajzen (1975) that created the Theory of Reasoned Action (TRA). The theory by Fishbein & Ajzen says that the behavioral intention is determined by both the attitude towards that behavior and a subjective norm concerning the behavior in question.

The TAM presented in Figure 4 is then an adaptation of the TRA, specifically created for measuring the acceptance of end-user computing technologies in an organizational context. As mentioned in the introduction, the first TAM made the sum of “cost” and “benefit” factors by looking at respectively the Perceived Ease of Use and the Perceived Usefulness.

Figure 4: Technology Acceptance Model (TAM) (Davis, et al., 1989)

In the TAM Davis, et al. (1989, page 985) mention:

• “Perceived Usefulness is the prospective user’s subjective probability that using a specific application system will increase his or her job performance within an organizational context”

• “Perceived Ease of Use is the degree to which the prospective user expects the target system to be free of effort.”

As mentioned before, this was the basis for many more models and extensions of this idea of measuring technology acceptance. Important for the construct of the research model is the indirect effect of Perceived Usefulness on Behavioral Intention through Perceived Ease of Use; this is the construct used in the research model, and it is also confirmed by the case study described in paragraph 2.2.5.

2.2.2. TAM 2

The further exploration of the TAM in the context of a smart speaker brings us to the TAM 2, as shown in Figure 5. This is an extended version of what Davis, et al. introduced in 1989, created by Venkatesh & Davis (2000). They found several different factors significantly influencing user acceptance. The findings of this research improved the understanding of user adoption.

Figure 5: TAM 2 (Venkatesh & Davis, 2000)

One relevant factor extracted for this thesis is the Experience an individual has with comparable technologies. As mentioned in the introduction, voice-controlled technologies are becoming more and more accepted, with the rising usage of voice search as an example. In the case of this research, Experience will be (as also seen in the UTAUT model by Venkatesh, et al. (2003) in Figure 6) divided into two different factors, which will be further explained in the next chapter.

2.2.3. UTAUT

In 2003 Venkatesh, et al. analysed a wide range of frameworks concerning the acceptance of information technologies, which resulted in the Unified Theory of Acceptance and Use of Technology (UTAUT), shown in Figure 6. From this model, the most important factor extracted for this thesis is Social Influence. Again, the direct effect of Social Influence on use intention proved to be of importance, confirming the earlier statement of the TAM 2 research of Venkatesh & Davis (2000).

Figure 6: UTAUT (Venkatesh, et al., 2003)

2.2.4. HMSAM

Up to this point, these models have an organizational context. But the smart speaker is also meant for personal (private, in-home) use, which means there is also a hedonic factor to be considered. In other words, how entertaining does a user think the smart speaker is? This Enjoyment is also explained in the research of Mun & Hwang (2003, page 435). They state that prior research already proposed Enjoyment is a determinant of

The importance of this hedonic factor was also acknowledged in the research of Lowry, et al. (2012), who further extended the TAM with a hedonic factor in the Hedonic-Motivation System Adoption Model (HMSAM). Figure 7 shows that Joy (or Perceived Entertainment, as used in the case study model in the next paragraph as well as in the research model of this study) is introduced and further explored in the research of Lowry, et al. (2012).

Figure 7: Van der Heijden's Model as the Baseline for the HMSAM (Lowry, et al. 2012)

2.2.5. A previous case study

As stated in the introduction, issues such as trust and privacy also come with a technology that has voice control as its interface. The case study of cellular telephones using the TAM mentions the same factor (Kwon & Chidambaram, 2000): “the anxiety about using a new medium or technology”. This inspires the last factor influencing the behavioral intention towards the use of technology: Apprehensiveness. The same research mentions innate fear and intrusion into personal privacy as part of the Apprehensiveness. As seen in Figure 8, this is one of the few models mentioning Apprehensiveness and Perceived Entertainment. One can see that the construct of this model is similar to the research model of this research.

2.3. Summary of the literature review

Combining the technology literature with the development of the TAM and starting off with the original TAM (Davis, et al. 1989) the following factors are important to take into account when assessing the intention to use the smart speaker:

• Social Influence;

• Experience (later divided into Web Skills and Interface Familiarity);

• Perceived Entertainment;

• Apprehensiveness;

The motivation and explanation for these variables will be presented in chapter 3.

Finally, as already mentioned, the Smart Speaker is currently not being sold in the Netherlands, where this research is conducted. This means the actual behavior of purchasing this type of information technology is difficult to measure. As a consequence, and considering the limited time frame of conducting this research, only the behavioral intention (Use Intention) towards this technology will be measured as

3. Variables and Research Model

In this section all variables are first defined and motivated based on the previous research presented in the literature review. The motivation connects the literature on the technology with the models frequently used to measure technology acceptance, so that each variable is defined, motivated and connected to the literature. With these variables the research model is then given, and with it the hypotheses that will be analyzed in this quantitative research.

3.1. Variables

3.1.1. Perceived Usefulness

For the sake of structure, the definition of Perceived Usefulness is presented again: “degree to which a person believes that using a particular system would enhance his or her job performance” (Davis, et al., 1989). This means, for example, that a person might perceive the Smart Speaker as useful for reading a recipe, controlling the thermostat or asking for their daily schedule. Perceived Usefulness is one of the basic elements of the Original TAM by Davis, et al. (1989). It also recurs in the technology review, with the outcome of Balta-Ozkan, et al. (2013) stating that the acceptance of technology will be enhanced if the usefulness and benefits of the smart technologies are clearly stated and demonstrated. The same research states that increased control of technologies such as these can help to counteract consumer resistance. A more proactive approach, as provided by the Smart Speaker, might change the Perceived Usefulness or at least enhance the Use Intention.

3.1.2. Perceived Ease of Use

Just like Perceived Usefulness, the definition of Perceived Ease of Use is repeated: “degree to which a person believes that using a particular system would be free from effort”. This covers, for example, learning how to communicate with the Smart Speaker or connecting it with the Internet and other devices. This is another basic element of the TAM (Davis, et al., 1989), and it is also seen in practice in the Balta-Ozkan, et al. (2013) research: “a household would adopt smart technologies in large quantities if they wouldn’t have to change their daily routines.”

3.1.3. Social Influence

Social Influence is defined by Cho (2011) as: “a person’s perception that most people who are important to him think he should or should not perform the behavior in question.” Cho (2011) also acknowledges what many TAM studies have shown regarding the direct effect of Social Influence on the behavioral intention (in this research the Use Intention). What your surroundings think about a certain technology is important, as also confirmed in the TAM 2 (Venkatesh & Davis, 2000) and in the research model testing the acceptance of the cellular telephone (Kwon & Chidambaram, 2000). In the Ambient Intelligence literature it is stated that the elderly could have the opportunity to live independently, a development which potentially also influences their social surroundings (Chan, et al., 2009).


3.1.4. Perceived Entertainment

In the research of Lowry, et al. (2012), the HMSAM model describes Joy (or Perceived Entertainment) as: “the extent to which the activity of using the computer is perceived to bring about pleasure and Joy for their own sake, apart from any anticipated performance consequences”. The Smart Speaker is a technology that can also be used for reasons of Joy or fun, not only in a professional context; this hedonic factor is therefore included in the research model. Perceived Entertainment has not been explored in the technology review, which makes sense since the technology research focuses on the feasibility of creating these technologies or on a context of utility, not a context of fun. This makes the variable even more important to explore, since there seems to be a gap in practice.

3.1.5. Apprehensiveness

Kwom & Chidambaram (2000) describe Apprehensiveness in their case study as: “anxiety about using a new medium or technology”. Apprehensiveness in this thesis is focused on the safety of people’s personal data and privacy. This is because, as already explained in the introduction, privacy is a defining concern of the 21st century. Everything that has to do with gathering data is rather sensitive nowadays and thus affects one’s intention to use a technology. Apprehensiveness is also one of the main concerns in the technology review, which makes it an interesting variable to measure and to include in the research model, in order to see if and how it predicts Use Intention.


3.1.6. Interface Familiarity

The development of computer-human interactions shows that in practice the interface has always been changing (Bellegarda, 2013). The challenge for now is knowing what the background of a respondent is, how familiar he or she is with SLDS (Bertrand et al., 2010) and how that influences the Use Intention. This is also why Interface Familiarity in the model has an indirect effect on Use Intention via Perceived Usefulness. A person who is more familiar with this form of human-computer interaction may have a higher appreciation for the usefulness of the Smart Speaker.

This is why the first variable in this research is Interface Familiarity, derived from what Venkatesh, et al. (2003) call Experience. Even though the original definition in the literature (Gefen, 2000) is slightly different, being less focused on the interface, it forms the basis for the definition used in this thesis: “one's understanding of an interface based on prior interactions or experiences.”

The definition of Gefen (2000) concerns familiarity with a person rather than with a certain way of communicating with a computer. Nevertheless, it is very applicable to this research and is therefore used as the starting point for the definition of Interface Familiarity: “familiarity is an understanding, often based on previous interactions, experiences, and learning of what, why, where and when others do what they do.”


3.1.7. Web Skills

Web Skills are defined as “an individual judgment of one’s capability to use a computer” (Koufaris, 2002). This variable is also derived from the Experience measurement of the UTAUT (Venkatesh, et al., 2003). It is important to clarify the difference between the variables Web Skills and Interface Familiarity. Web Skills concerns a person’s self-perception of his or her skills on the web and, more generally, computer skills on the Internet, whereas Interface Familiarity concerns the interaction between human and computer.

The review of the technologies found that people are less likely to search for a website that requires more intensive interaction (Schalkwyk, et al., 2010). Following this line of thought, Web Skills is connected to Perceived Entertainment: the assumption is that the more experienced and skilled a person is on the web, the better he or she knows what a Smart Speaker is capable of, especially when in search of entertainment, which is perceived to be a more intensive way of interacting.

3.1.8. Use Intention

Finally, Use Intention is the dependent variable in the research model and is defined as: “the degree to which a person has formulated conscious plans to perform or not perform some specified future behavior” (Venkatesh, et al., 2003). In this research it means whether or not someone intends to use a Smart Speaker in the future.

As mentioned before, it is practically too difficult to measure the actual use of the Smart Speaker since it has not been officially launched in the Netherlands yet. That is why it is important to


3.2. Research Model

Based on the explored variables with their effect found in theory, the following research model for this thesis is presented in Figure 9.

Figure 9: Research model of adoption of the smart speaker (based on Technology Acceptance Model)

3.3. Hypotheses

The following hypotheses, as presented in Figure 9, are stated and will be tested in this research:

H1. Perceived Usefulness has a direct positive effect on Use Intention.

H2. Perceived Entertainment has a direct positive effect on Use Intention.

H3. Interface Familiarity has an indirect, positive effect on Use Intention via Perceived Usefulness.

H4. Perceived Ease of Use has an indirect, positive effect on Use Intention via Perceived Usefulness.

H5. Perceived Ease of Use has an indirect, positive effect on Use Intention via Perceived Entertainment.

H6. Apprehensiveness has an indirect, positive effect on Use Intention via Perceived Usefulness.

H7. Apprehensiveness has an indirect, positive effect on Use Intention via Perceived Entertainment.

H8. Web Skills has an indirect positive effect on Use Intention via Perceived Entertainment.

H9. Social Influence has a direct effect on Use Intention.


4. Method

The thesis and the collection of the data are based on quantitative research. The data is gathered by conducting a survey, which means the data is cross-sectional. In the introduction of the survey, a short but clear explanation of the technology was provided to the respondent. Not everybody is familiar with the term “Smart Speaker”, which is why this same introduction included two examples of the device, a photo of several Smart Speakers and a clear text with a definition, giving every respondent the same background information (presented in Appendix 3). This information was provided so that a respondent could form a good idea of concepts such as Perceived Usefulness and Apprehensiveness.

4.1. Sampling

The non-probability, convenience sample consists of users and non-users of the Smart Speaker. Since the Smart Speaker has not been introduced in the Netherlands and the survey is distributed via e-mail and social media (Facebook and LinkedIn), starting in the Netherlands, the expectation is that most respondents do not actually use the product. The survey was carried out in the period from the 23rd of April until the 10th of May 2018.

By means of an extra incentive, the researcher tried to maximize the number of respondents. This incentive is the random giveaway of one Smart Speaker (Amazon Alexa Dot). By combining this reward with content that is relevant for the interested respondent (sharing the research on platforms whose audience has a related interest), the number of respondents was aimed to be maximized. The minimum number of cases for such research would be 200, based on the rule of thumb that this is a good sample size. Based on previous


4.2. Measures

The measures of the survey are presented in Table 1. All measurements are intervals using a 7-point Likert scale (completely disagree – completely agree). The Cronbach’s alpha as reported in the cited paper is presented to justify the use of the variables and items by showing their reliability in previous research. Apart from the example items presented, the questions used in the survey are presented in Appendix 4.

Note: the items chosen, and thus the questions asked, for the Apprehensiveness variable are phrased in such a way that the effect is positive, as stated in H6 and H7.

Table 1: Used measurements and items and Cronbach’s α

| Measure | Paper | Items | Example | Cronbach’s α (or R^2) |
|---|---|---|---|---|
| Perceived Usefulness (PU) | Hong & Tam (2006) (cited 635) | 3 | I would find Smart Speaker to be useful in my daily life. | 0.88 |
| Perceived Entertainment (PE) | Lowry, et al. (2012) (cited 116) | 3 | I would have fun using the Smart Speaker | 0.93 – 0.98 |
| Perceived Ease of Use (PEU) | Venkatesh & Davis (2000) (cited 13829) | 4 | I find the Smart Speaker to be easy to use | 0.86 – 0.98 |
| Apprehensiveness (APP) | Kwom & Chidambaram (2000) (cited 337) | 3 (altered) | I would trust my data and information to be secure in a Smart Speaker | R^2 = 0.02 |
| Social Influence (SI) | Cho (2011) (cited 42) | 3 | People who influence me think I should use Smart Speaker | |
| Interface Familiarity (IF) | Gefen (2000) (cited 3257) | 4 (slightly altered) | I am familiar with controlling a device with my voice | 0.89 |
| Use Intention (UI) | Venkatesh, et al. (2003) (cited 20083) | 3 | I plan to use this Smart Speaker in the future. | 0.935 |
| Web Skills (WS) | Koufaris (2002) (cited 3062) | 3 | I am very skilled at using the Web. | 0.918 |

4.3. Control Variables

As already explored in the literature review, not a lot of research has been done specifically on the acceptance of the Smart Speaker. To select control variables, previous similar studies, including those on the TAM, were consulted. Based on that, three control variables have been chosen: Age, Gender and Educational Level.

4.4. Limitations of the design

The design of the study is not longitudinal but cross-sectional (only a snapshot), meaning the issue of reversed causality cannot be ruled out. The technology is not for sale in the Netherlands, meaning the research depends on the respondents’ interpretation of the description of the product. As a consequence, the actual use cannot be properly measured, but the intention towards its use can and will be.


Considering the sample, the frame is not based on the complete population of potential users of the Smart Speaker; the respondents are found through convenience sampling. This means the results are not guaranteed to be representative or generalizable. Furthermore, the response rate in previous research has been fluctuating, meaning there is no guarantee of a high response rate.

Finally, there is always a risk of common method bias and socially desirable answers due to the self-created and self-reported survey. The aim is to keep this risk as low as possible through the relevance of the subject, the extra incentive (lottery of a Smart Speaker) and by mentioning the issues of common method bias and social desirability prior to the questions.

4.5. Tools for Analysis

The following tools have been used in order to make the analysis:

• Qualtrics: shaping, designing the online survey in order to gather the data of the respondents;

• SPSS: the statistical program used to test, clean and analyze the data and give it statistical meaning and;

• PROCESS: a macro created as a plug-in for SPSS in order to analyze the mediated effects in chapter 5.4.


5. Results

5.1. Demographics and response rate

5.1.1. Demographics

In Figures 10, 11 and 12 the demographic data is presented. As can be seen, the educational level and age are rather concentrated: 80% of the respondents have either a Bachelor’s or a Master’s degree and 80% of the respondents are between 18 and 25 years of age.

Figure 10: Educational level of respondents (High school degree: 9%; Some college, no degree: 10%; Bachelor's degree: 57%; Master's degree: 23%; remaining category: 1%)

Figure 11: Gender of the respondents (Male: 36%; Female: 64%)

Figure 12: Age of respondents in years (18-25: 80%; 26-35: 16%; 36-45: 3%; Other)


5.1.2. Response rate

The survey was distributed amongst large networks, so it is hard to measure how many people actually saw it and thus to calculate a real response rate. However, 217 people are registered to have started filling out the questionnaire and 182 of them completed it. This makes a “completion” rate of approximately 84% (182/217).

5.2. Analytical Strategy

Before doing any analysis with the data in order to test hypotheses, a couple of checks and modifications have been done.

5.2.1. Data

First, a frequency check was done, which showed there were no errors in any of the items. Some of the measurements had missing values for certain variables; these cases were deleted listwise. For a better overview, data irrelevant to the analysis, such as IP address and start date, were ignored. Finally, the variable “Sex” was recoded into “SexNew”: Male was recoded from 1 to 0 and Female from 2 to 1, making it a dummy-coded nominal variable.
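The cleaning steps above were carried out in SPSS. As an illustration only, the same two operations (listwise deletion and the recoding of “Sex” into “SexNew”) could be sketched in Python as follows; the rows and the item names `PU_1` and `PU_2` are hypothetical examples, not the thesis data.

```python
# Hypothetical raw survey rows; None marks a missing answer.
rows = [
    {"Sex": 1, "PU_1": 5, "PU_2": 6},
    {"Sex": 2, "PU_1": 4, "PU_2": None},  # incomplete case, dropped listwise
    {"Sex": 2, "PU_1": 7, "PU_2": 6},
]

def listwise_delete(records):
    """Keep only the records in which every field has a value."""
    return [r for r in records if all(v is not None for v in r.values())]

def recode_sex(record):
    """Add SexNew using the recoding described above: 1 (Male) -> 0, 2 (Female) -> 1."""
    recoded = dict(record)
    recoded["SexNew"] = {1: 0, 2: 1}[record["Sex"]]
    return recoded

clean = [recode_sex(r) for r in listwise_delete(rows)]
print(len(clean), [r["SexNew"] for r in clean])
```

Keeping the original “Sex” column next to “SexNew” makes the recoding easy to audit afterwards.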


5.2.2. Normality

Both the Kolmogorov-Smirnov and the Shapiro-Wilk tests are significant for every variable (p < .05), indicating that none of the variables is normally distributed, as presented in Table 2.

Table 2: Normality check using Kolmogorov-Smirnov & Shapiro-Wilk tests

| Variable | Kolmogorov-Smirnov Statistic | Kolmogorov-Smirnov Significance | Shapiro-Wilk Statistic | Shapiro-Wilk Significance |
|---|---|---|---|---|
| Perceived Usefulness | .111 | .000 | .967 | .000 |
| Perceived Entertainment | .218 | .000 | .902 | .000 |
| Interface Familiarity | .123 | .000 | .942 | .000 |
| Perceived Ease of Use | .142 | .000 | .955 | .000 |
| Apprehensiveness | .123 | .000 | .954 | .000 |
| Web Skills | .101 | .000 | .951 | .000 |
| Social Influence | .124 | .000 | .964 | .000 |
| Use Intention | .086 | .003 | .957 | .000 |


5.2.3. Computing means

Before running the analyses, a new variable was created for each construct as the mean of its items. For instance, the items PE_1, PE_2 and PE_3 (all items concerning Perceived Entertainment) were computed into one variable called PE_TOT ((PE_1+PE_2+PE_3)/3 = PE_TOT). These computed means are then used to analyze all direct and indirect effects that are hypothesized in paragraph 3.3.
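The computation of such a composite score can be sketched in a few lines of Python. This is only an illustration of the formula (PE_1+PE_2+PE_3)/3; the respondent values and the helper name `composite` are hypothetical.

```python
from statistics import mean

def composite(record, prefix, n_items):
    """Mean of the items prefix_1 .. prefix_n for one respondent."""
    return mean(record[f"{prefix}_{i}"] for i in range(1, n_items + 1))

# Hypothetical answers of one respondent on the 7-point scale.
respondent = {"PE_1": 6, "PE_2": 5, "PE_3": 7}
respondent["PE_TOT"] = composite(respondent, "PE", 3)
print(respondent["PE_TOT"])
```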

5.2.4. Outliers check

The outliers are checked by standardizing the means of the variables, looking into the frequencies of those standardized variables, identifying possible outliers (cases with |z| > 3) and finally examining the distribution to see whether these possible outliers are isolated cases. Outliers within the variables Perceived Entertainment and Perceived Ease of Use are excluded. This left a final sample size for data analysis of N=171.
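A minimal sketch of the |z| > 3 screening step, assuming the sample standard deviation is used for standardizing. The scores below are made up for illustration and the function names are hypothetical.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize values using the sample mean and sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def flag_outliers(values, threshold=3.0):
    """Return the positions of cases whose absolute z-score exceeds the threshold."""
    return [i for i, z in enumerate(z_scores(values)) if abs(z) > threshold]

# Hypothetical composite scores; the last case lies far below the others.
scores = [5.33, 5.67, 6.0, 6.33, 5.67, 6.0, 5.33, 6.0, 5.67, 6.33,
          6.0, 5.67, 5.33, 6.0, 6.33, 5.67, 6.0, 5.67, 6.0, 1.0]
print(flag_outliers(scores))
```

Flagged cases are only candidates: as described above, the distribution still has to be inspected before a case is excluded.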

5.2.5. Reliability

The results of the reliability analysis are presented on the diagonal of the correlation matrix in Table 3, between brackets. All Cronbach’s α’s are above 0.7, meaning the scales are reliable.
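For reference, Cronbach’s α for a scale of k items is k/(k-1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The sketch below applies this standard formula to hypothetical item scores; it illustrates the statistic and is not the SPSS procedure used in this thesis.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha; items is a list of score lists, one list per item,
    with respondents in the same order in every list."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))

# Hypothetical answers of five respondents on three items.
pe_1 = [6, 5, 7, 4, 6]
pe_2 = [6, 4, 7, 5, 6]
pe_3 = [7, 5, 6, 4, 6]
alpha = cronbach_alpha([pe_1, pe_2, pe_3])
print(round(alpha, 2))
```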


5.2.6. Correlation

Table 3 shows a matrix of the correlation coefficients between the variables used in this report. Strong, positive correlations can be seen between several variables (apart from the control variables age, education level and sex), for instance between Perceived Usefulness and Use Intention. The important correlations for the further analyses of the data are presented bold and underlined.

One finding that stands out in Table 3 is that the intended control variables do not significantly correlate with the dependent variable. This suggests that including them in a control model (in a hierarchical multiple regression) will not have any significant effect.

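Table 3: Correlation matrix including means and Cronbach’s α

Means, Standard Deviation and Correlations (Cronbach’s α on the diagonal between brackets)

| Variables | M | SD | 1. | 2. | 3. | 4. | 5. | 6. | 7. | 8. | 9. | 10. | 11. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1. Age | 1.24 | .59 | - | | | | | | | | | | |
| 2. Sex | .66 | .48 | .05 | - | | | | | | | | | |
| 3. Edu | 4.17 | 1.18 | .02 | -.06 | - | | | | | | | | |
| 4. PU | 4.58 | 1.31 | .01 | .169* | -.06 | (.89) | | | | | | | |
| 5. PEU | 5.67 | .75 | -.16* | -.08 | -.03 | .304** | (.82) | | | | | | |
| 6. PE | 5.65 | .97 | -.10 | -.01 | -.002 | .59** | .40** | (.92) | | | | | |
| 7. APP | 3.56 | 1.59 | -.073 | -.065 | .02 | .42** | .12 | .24** | (.93) | | | | |
| 8. WS | 5.78 | .86 | -.124 | -.08 | .01 | .08 | .44** | .12 | .07 | (.75) | | | |
| 9. IF | 4.68 | 1.64 | -.08 | -.003 | -.08 | .27** | .37** | .25** | .20** | .27** | (.90) | | |
| 10. SI | 3.66 | 1.43 | .04 | .1 | -.02 | .50** | .14 | .31** | .43** | .07 | .27** | (.95) | |
| 11. UI | 4.71 | 1.56 | -.052 | -.003 | -.088 | .62** | .25** | .57** | .38** | .10 | .45** | .53** | (.94) |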


5.3. Data Analysis direct effects

In order to measure and explain the direct effects of all independent variables on Use Intention, a multiple regression was carried out. Since the intended control variables are not correlated with the dependent variable (Use Intention) and the hierarchical multiple regression showed no significance for the control variables, a regular multiple regression was run excluding these control variables. (See Appendix 5 for the outcome of the hierarchical multiple regression, including the intended control variables in step 1 of the model.)

5.3.1. Result Multiple Regression Analysis (direct effects)

Tables 4a & 4b: Multiple Regression

| Dependent variable | Adjusted R^2 |
|---|---|
| Use Intention | .573 *** |

| Independent variables | Beta | T | Significance |
|---|---|---|---|
| Perceived Entertainment | .298 | 4.587 | .000 |
| Web Skills | -.028 | -.496 | .621 |
| Perceived Usefulness | .320 | 4.36 | .000 |
| Perceived Ease of Use | -.085 | -1.368 | .173 |
| Interface Familiarity | .272 | 4.807 | .000 |
| Social Influence | .179 | 2.892 | .004 |


5.3.2. Conclusions from Multiple Regression Analysis

To examine the direct effects of the independent variables on the Use Intention of the Smart Speaker, a multiple regression was carried out. As can be seen in Tables 4a and 4b in paragraph 5.3.1., four variables have a significant direct effect on Use Intention: Perceived Entertainment (β = 0.30, p < .001), Perceived Usefulness (β = 0.32, p < .001), Interface Familiarity (β = 0.27, p < .001) and Social Influence (β = 0.18, p < .005). Since the Beta column reports standardized coefficients, this means for instance that if someone’s Perceived Entertainment increases by one standard deviation, the Use Intention is expected to increase by 0.30 standard deviations, with the other variables held constant.

With these results, three hypotheses are supported. For each of them the H0 can be rejected, and therefore the following hypotheses are accepted:

H1: Perceived Usefulness has a direct positive effect on Use Intention.

H2: Perceived Entertainment has a direct positive effect on Use Intention.

H9: Social Influence has a direct effect on Use Intention.

Other than that, it is striking that Interface Familiarity has also proven to have a direct positive effect on Use Intention. Nevertheless, the theory-based H3 (“Interface Familiarity has an indirect, positive effect on Use Intention via Perceived Usefulness”) is tested in paragraph 5.4.


5.4. Data Analysis Indirect effects

The following indirect (mediated) effects are analyzed with the method created by Hayes (2013), of which the model used for this analysis (Model 4, as seen in Figure 13) is presented in paragraph 5.4.1. An example of the SPSS output testing this indirect effect (or the amount of mediation) can be found in Appendix 6.

The following indirect effects, and thus hypotheses, are tested: H3 through H8 (as stated in paragraph 3.3.).


5.4.1. Results of the mediation analysis

The first indirect hypothesis is explained in more detail in Tables 5a and 5b. The first table presents two steps. First, the mediator is the outcome, measuring the relation between the independent variable and the mediator (A1 in Model 4 in Figure 13). Thereafter the dependent variable is used as outcome: the effect of the mediator (Perceived Usefulness) on the dependent variable (Use Intention) is analyzed (B1), as well as the direct effect of the independent variable (Interface Familiarity) on the dependent variable (C1’). The latter, if significant, suggests a direct effect.


Table 5a: Indirect effect Interface Familiarity on Use Intention through Perceived Usefulness

First part: separate effects A1, B1, C1’ (consequents: M = Perceived Usefulness, Y = Use Intention)

| Antecedent | M: Path | M: Coeff. | M: SE | M: p | Y: Path | Y: Coeff. | Y: SE | Y: p |
|---|---|---|---|---|---|---|---|---|
| IF (X) | A1 | .21 | .059 | <.001 | C1’ | .28 | .06 | <.001 |
| PU (M) | - | - | - | - | B1 | .64 | .07 | <.001 |
| Constant | I1 | 3.6 | .3 | <.001 | I2 | .39 | .37 | .28 |
| | R2 = .072 | | | | R2 = .47 | | | |

Table 5b calculates the total effect (C1) of the model and the indirect effect (A1B1), the effect that is of interest (also known as the mediated effect). If the bootstrapped interval (taking 5,000 samples) is completely above zero, it can be stated that there is a significant indirect effect. The bounds of this interval are presented as Boot LLCI and Boot ULCI in Table 5b.
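To make the logic of this mediation analysis concrete, the sketch below estimates the paths A1, B1 and C1’ with ordinary least squares and bootstraps the indirect effect A1B1. It is an illustration under stated assumptions, not the PROCESS macro: it uses a plain percentile bootstrap instead of the bias-corrected interval reported in this thesis, all function names are hypothetical, and the data are simulated (the coefficients of Table 5a are used only as generating values).

```python
import random
from statistics import fmean

def slope(x, y):
    """OLS slope of y regressed on x (with intercept)."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def slopes_two_predictors(x, m, y):
    """OLS slopes of y regressed on x and m (with intercept), via the normal equations."""
    mx, mm, my = fmean(x), fmean(m), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    smm = sum((b - mm) ** 2 for b in m)
    sxm = sum((a - mx) * (b - mm) for a, b in zip(x, m))
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    smy = sum((b - mm) * (c - my) for b, c in zip(m, y))
    det = sxx * smm - sxm ** 2
    c_prime = (smm * sxy - sxm * smy) / det  # direct effect of x on y
    b = (sxx * smy - sxm * sxy) / det        # effect of the mediator on y
    return c_prime, b

def mediation(x, m, y):
    """Paths of a simple mediation model (one independent variable, one mediator)."""
    a = slope(x, m)
    c_prime, b = slopes_two_predictors(x, m, y)
    return {"a": a, "b": b, "indirect": a * b, "direct": c_prime, "total": slope(x, y)}

def percentile_bootstrap(x, m, y, n_boot=5000, seed=1):
    """95% percentile bootstrap interval for the indirect effect a*b."""
    rng = random.Random(seed)
    n = len(x)
    draws = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        xs = [x[i] for i in idx]
        ms = [m[i] for i in idx]
        ys = [y[i] for i in idx]
        draws.append(mediation(xs, ms, ys)["indirect"])
    draws.sort()
    return draws[int(0.025 * n_boot)], draws[int(0.975 * n_boot) - 1]

# Simulated data, NOT the survey data of this thesis.
rng = random.Random(0)
x = [rng.gauss(4.7, 1.6) for _ in range(150)]
m = [3.6 + 0.21 * xi + rng.gauss(0, 1.2) for xi in x]
y = [0.4 + 0.28 * xi + 0.64 * mi + rng.gauss(0, 1.1) for xi, mi in zip(x, m)]

result = mediation(x, m, y)
low, high = percentile_bootstrap(x, m, y, n_boot=1000)
print(result, (low, high))
```

In an OLS mediation model the total effect equals the direct effect plus the indirect effect (C1 = C1’ + A1B1), which is why Table 5b can report the three effects side by side.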


Table 5b: Indirect effect Interface Familiarity on Use Intention through Perceived Usefulness

Second part: Total, direct and indirect effect

| | Effect (B) | SE | p | LLCI | ULCI |
|---|---|---|---|---|---|
| Total effect C1 | .42 | .07 | <.001 | | |
| Direct effect C1’ | .28 | .06 | <.001 | | |

| | Effect (B) | Boot SE | Boot LLCI | Boot ULCI |
|---|---|---|---|---|
| Indirect effect A1B1 | .14 | | .047 | .23 |

For H3 (Interface Familiarity has an indirect, positive effect on Use Intention via Perceived Usefulness), the following can be concluded based on the results presented in Tables 5a and 5b. The mediation is analyzed using a least squares path analysis, exploring the indirect effect of Interface Familiarity on Use Intention through Perceived Usefulness. As seen in Table 5a, if a person is familiar with the voice-control interface, the Perceived Usefulness is also estimated higher (A1 = .21, p < .001). Likewise, when a respondent has a higher Perceived Usefulness, the Use Intention increases (B1 = .64, p < .001), as already shown in the previous paragraph.

Finally, the mediated effect (A1B1 = .14) was tested with a bias-corrected bootstrap confidence interval based on 5,000 bootstrap samples, which was entirely above zero (Boot [.047 to .23], see Table 6).


The model also suggests a direct effect of Interface Familiarity on Use Intention (C1’ = .28, p < .001), consistent with the direct effect found in the multiple regression analysis in paragraph 5.3. The analyses for H4 through H8 have been carried out in the same way as for H3. An overview is given in Table 6.

Table 6: Overview of all hypotheses with indirect effects and output

| Hypothesis (IV→M→DV) | A1 (IV → M) | B1 (M → DV) | A1B1, indirect (IV→M→DV) | C1’, direct (IV → DV) | C1 (Total effect) |
|---|---|---|---|---|---|
| H3: IF→PU→UI | b=.21, p<.001 | b=.64, p<.001 | b=.14, Boot [.047 to .23] | b=.28, p<.001 | b=.42, p<.001 |
| H4: PEU→PU→UI | b=.52, p<.001 | b=.71, p<.001 | b=.38, Boot [.19 to .60] | b=.15, p=.25 | b=.52, p<.001 |
| H5: PEU→PE→UI | b=.51, p<.001 | b=.89, p<.001 | b=.45, Boot [.26 to .69] | b=.07, p=.61 | b=.53, p<.001 |
| H6: APP→PU→UI | b=.34, p<.001 | b=.69, p<.001 | b=.24, Boot [.15 to .35] | b=.13, p=.04 | b=.37, p<.001 |
| H7: APP→PE→UI | b=.15, p=.0017 | b=.83, p<.001 | b=.12, Boot [.04 to .22] | b=.25, p<.001 | b=.37, p<.001 |
| H8: WS→PE→UI | b=.12, p=.18 | b=.91, p<.001 | b=.11, Boot [-.05 to .29] | b=.076, p=.52 | b=.18, p=.2 |


5.4.2. Conclusions from mediated effects analysis

Table 6, presented in paragraph 5.4.1., shows how the hypotheses should be concluded. For every hypothesized mediated effect, a least squares path analysis is used to analyze the indirect effect. The main focus is the indirect effect with a bias-corrected bootstrap confidence interval, based on 5,000 bootstrap samples. The requirement for the indirect effect to be significant is that this interval is completely above zero. This means that the same conclusion as for H3 can be drawn for the following hypotheses, i.e. the H0’s can be rejected and the hypotheses are accepted: H4, H5, H6 and H7.

From Table 6 it can also be concluded that a significant relationship between the independent variable and the mediator can be seen for each of these hypotheses (A1).

Notably, the strongest indirect effects are those starting from Perceived Ease of Use: H4 (A1B1 = .38, Boot [.19 to .60]) and H5 (A1B1 = .45, Boot [.26 to .69]).


Finally, for H8 (Web Skills has an indirect positive effect on Use Intention via Perceived Entertainment), the H0 cannot be rejected and thus the effect is not significantly proven. The indirect effect (A1B1 = .11, Boot [-.05 to .29]) is not significant since the interval contains zero. Neither the effect of Web Skills on Perceived Entertainment (A1 = .12, p = .18) nor the total effect of the model (C1 = .18, p = .2) is significant.
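The decision rule used in this paragraph (the bootstrap interval of the indirect effect must lie completely above zero) can be applied mechanically to the intervals reported in Table 6. The short sketch below does so; the interval bounds are copied from Table 6, while the function name is a hypothetical helper.

```python
# Bootstrap intervals of the indirect effects (A1B1), copied from Table 6.
intervals = {
    "H3": (0.047, 0.23),
    "H4": (0.19, 0.60),
    "H5": (0.26, 0.69),
    "H6": (0.15, 0.35),
    "H7": (0.04, 0.22),
    "H8": (-0.05, 0.29),
}

def entirely_above_zero(lower, upper):
    """True when the whole interval lies above zero (positive indirect effect)."""
    return 0 < lower <= upper

supported = [h for h, (lo, hi) in intervals.items() if entirely_above_zero(lo, hi)]
not_supported = [h for h in intervals if h not in supported]
print(supported, not_supported)
```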

5.5. Outcome model

In Figure 14 an overview is given of all the relevant results, giving the answers to all the hypotheses. As shown, Web Skills is the only “red” variable, since this is the only hypothesis that is rejected.


6. Discussion

“What are motivations and perceptions that affect people’s intention of adopting the AI-based Smart Speaker?” This research question has been (partly) answered with this research based on the Technology Acceptance Model. Under some conditions, and with a rather concentrated response group, several motivations and perceptions are shown to significantly influence the intention of using the Smart Speaker. First the detailed interpretation of the results is explained; thereafter the limitations of this study are elaborated upon.

6.1. Interpreting the Results

First of all, the Technology Acceptance Model and its extensions formed the basis for isolating the right variables and items, which in general were very good predictors of the Use Intention (the dependent variable). Furthermore, the literature review on the technologies that are an integrated part of the Smart Speaker provided a good motivation to add variables to the research model that are specific to this technology.

The hypotheses including the variables Perceived Usefulness, Perceived Entertainment, Perceived Ease of Use, Apprehensiveness and Social Influence acted as expected; the H0’s were rejected and the hypotheses were accepted (H1, H2, H4, H5, H6, H7 and H9). The research model (based on the literature review and introduction) turned out to be fit for purpose and offered an adequate framework for carrying out this research. This means, when looking back at the research question, several motivations and perceptions of people are


6.1.1. Interface Familiarity

One of the two hypotheses not yet mentioned in the discussion, H3, is significantly supported, as shown by the result in paragraph 5.4.1. Nevertheless, this variable needs mentioning since the multiple regression analysis showed that Interface Familiarity has a significant (p < .001) direct effect on Use Intention. This is interesting since the original variable “Experience”, as seen in the TAM 2 model (Venkatesh & Davis, 2000) as well as the UTAUT model (Venkatesh, et al., 2003), was always used as a moderator, not as a variable with a direct effect on Use Intention.

This can be due to the development of Spoken Language Dialog Systems (the Smart Speaker is such a system), explained by Bertrand et al. (2010) in Figure 2. The technology review mentions the development of the human-computer interface, which is more and more focused on solving problems and tasks. At the basis of the value of this SLDS is a very simple and rational understanding (Prabhakar & Sahu, 2013): speech is the most essential, efficient and primary way of communication for a human being. In this line of thought, typing on a computer can be seen as an inefficient detour. This explains the confirmed third hypothesis: Interface Familiarity has an indirect effect on Use Intention through Perceived Usefulness. It could also explain why Interface Familiarity has a larger impact on Use Intention, because speech as the way of communicating potentially makes the use of any system (not only the Smart Speaker) more efficient, as also stated in the previously mentioned research of Prabhakar & Sahu (2013). An interesting direction for future research would be to investigate just the perception of this new human-computer interface, regardless of the hardware this software is embedded in.


6.1.2. Web Skills

The last hypothesis that needs mentioning is that of the indirect effect of Web Skills on Use Intention through Perceived Entertainment (H8), which was rejected. In the correlation matrix in Table 3, the correlation between Web Skills and Perceived Entertainment was not significant, which already suggests the effect is not there.

Looking further at the correlation matrix, one correlation that stands out is that between Web Skills and Perceived Ease of Use (which in its turn has a proven indirect effect on Use Intention). Even though theory on the TAM doesn’t suggest such an effect, Appendix 7 presents Model 6 from PROCESS (Hayes, 2013) and the output of the analysis of the following multiple mediated effect: the indirect effect of Web Skills on Use Intention through Perceived Ease of Use and Perceived Usefulness. The output is significant (Effect Ind2 = .155, Boot [.07 to .28], interval based on 5,000 bootstrapped samples), which indicates that this effect is present. Rationally this makes sense, since certain Web Skills will influence one’s Perceived Ease of Use. This suggests the outcome model should be closer to the TAM 2 (Venkatesh & Davis, 2000) presented in Figure 5. In this model Experience (or at least Web Skills as part of one’s experience) contributes to the effect on the Perceived Usefulness of a technology (just like Perceived Ease of Use).


6.2. Limitations

Apart from the limitations of the research design mentioned in paragraph 4.4., the following limitations to this research are set out below.

First of all, the intended control variables were not significant, so the analysis was carried out without them. Furthermore, several cases unfortunately had to be removed or ignored in the analysis. As mentioned in the paragraph on the response rate, 217 people started the questionnaire whereas 182 finished it. Out of these 182, some outliers and cases with missing data were deleted listwise, which left 171 data points to analyze. Reasons for the missing data could be, for instance, sensitive questions (especially the Apprehensiveness questions), the survey being too lengthy or questions being irrelevant to respondents.

The final important factor to mention is that the demographic data is rather unbalanced. 80% of the respondents were between 18 and 25 years old and 80% of the respondents had either a Bachelor’s degree or a Master’s degree (57% had a Bachelor’s degree and 23% a Master’s degree). This makes the results and analysis not representative of the whole population, which is the risk of having a non-probability, convenience sample.


7. Conclusions

7.1. Contributions

The purpose of this thesis was to create a rational understanding of why a person would or would not adopt the Smart Speaker. Once more, the following research question was formulated:

What are the main motivations and perceptions of people in their process of adopting the AI-based Smart Speaker?

Since the technology is rather young and very little research has been done on the acceptance of the Smart Speaker, this study contributes to the understanding of the motivations, perceptions and barriers in one’s intention to use a Smart Speaker, as well as to an understanding of what exactly a Smart Speaker is.

Starting from the basic TAM, a research model was created appropriate for the Smart Speaker. Combining managerial reports (in the introduction), the literature review (both technology and the evolution of the TAM) and the basis of the TAM, a number of variables were identified and appropriate effects on the dependent variable (Use Intention) were determined.

By conducting a survey amongst a convenience-sampled group of respondents, the variables were measured with items that proved reliable in previous research. This resulted in a data analysis of which the summary can be seen in Figure 14 in chapter 5.5.

As expected based on the literature, H1 through H7 and H9 were supported by the results (these hypotheses can be found in paragraph 3.3.). Apart from these hypotheses, 2
