
Do You Like a Humorous Chatbot? The Effects of Humour and Tone of Voice on the Evaluations of Chatbots and the Related Brand


Academic year: 2021




Do You Like a Humorous Chatbot? The Effects of Humour and Tone of Voice on the Evaluations of Chatbots and the Related Brand

Tianshu Yue (11580682)

Supervisor: Guda van Noort

Master's thesis, Graduate School of Communication


Abstract

Chatbots, also called disembodied conversational agents, are becoming increasingly popular in companies' online communication with customers. Despite this trend, academic research on customers' evaluations of chatbots is still rather limited. The use of humour has been shown to be a particularly promising communication strategy, yet it has never been examined in the context of chatbots. The study at hand is the first to examine the effect of humour on customers' evaluations of chatbots, while also checking for the moderating effect of tone of voice (human vs. corporate) in this relationship. It is also the first to show that funny cartoon images are a valid operationalization of humour in the context of chatbot conversations. The study found no significant main effect of humour on human likeness or satisfaction with the chatbots. However, there is a significant interaction effect, such that the positive effect of humour is amplified for chatbots with a human voice but reduced for chatbots with a corporate voice. Moreover, three chatbot evaluation dimensions (human likeness, satisfaction and enjoyment) are found to moderately influence brand evaluations, including brand attitude and repurchase intention. The findings extend the research on customers' evaluations of chatbots and may inspire companies to develop favourable chatbots with humorous and human characters. After all, chatbot evaluations may significantly influence brand evaluations. A positive and effective use of chatbots may therefore reshape the future of online brand communication with customers.

Keywords: chatbot, disembodied conversational agents, humour, voice, evaluation, brand


Do You Like a Humorous Chatbot? The Effects of Humour and Tone of Voice on the Evaluations of Chatbots and the Related Brand

Introduction

Disembodied conversational agents, more commonly known as "chatbots", are bots mostly implemented in messaging-based interfaces that aim to communicate with humans by mimicking human-to-human conversations (Araujo, 2018). Nowadays more and more chatbots are used by companies in customer service. Facebook Messenger attracted over 11,000 Business-to-Consumer (B2C) chatbots from different companies within three months of launching its chatbot function in 2017 (Larionov, 2017). Moreover, the global chatbot market is expected to reach USD 1.25 billion by 2025 (Grand View Research, 2017).

Compared to the rapidly increasing implementation of chatbots in the economy, academic research about B2C chatbots is still in its beginnings. How to improve chatbot performance to satisfy customers' needs is one of the most important topics in chatbot research. Improving language ability and language style seems to be the essential approach, and challenge, for improving customers' evaluations of chatbots (Kuligowska, 2015). Some studies try to improve chatbot linguistic ability by enhancing their Natural Language Processing (NLP) (Chakrabarti, & Luger, 2015; Griol, Carbo, & Molina, 2013), while others try to change chatbot language style by manipulating the way of greeting and other linguistic cues (Araujo, 2018; Kuligowska, 2015). However, there is no consensus yet about which language style generates the best customer evaluations (Araujo, 2018). Humour, as one important manifestation of language style (Bazarova, Taft, Choi, & Cosley, 2013; Lynch, 2002), has never been assessed in the context of customers' evaluations of chatbots and related brands.

Humour has been shown to play an important role in human-to-human and human-to-computer communication. The influence of humour can be either positive or negative. On the one hand, humour is beneficial in interpersonal communication settings (Cann, Calhoun, & Banks, 1997; Hampes, 1999; Kurtzberg, Naquin, & Belkin, 2009). Humour has the function of increasing involvement and enjoyment in the conversation. It can help to increase intimacy and trust, which can benefit the overall satisfaction with the customer service (van Dolen, de Ruyter, & Streukens, 2008). On the other hand, humour has been shown to be potentially harmful to advertising and task-oriented human-computer interactions. A humorous communication style can be distracting, and can thus reduce the recall of online advertisements (Strick, Van Baaren, Holland, & Van Knippenberg, 2009). The use of humour might be especially detrimental to customer evaluations when the website cannot process users' requirements well (Van Dolen et al., 2008).

In the case of a chatbot conversation, this study assumes that the effect of humour may be ambiguous. Firstly, humour, as an important indicator of human communication, can significantly increase the human likeness of computers (Nijholt, 2003). People interact with agents as if they were social actors (Reeves & Nass, 1996), and the display of humour is useful for producing human-like interactions between agents and humans (Nijholt, 2003). Research by Nijholt, Niculescu, Alessandro and Banchs (2017) has shown that humour might benefit the interaction between humans and embodied conversational agents. However, it might also damage consumers' evaluations in combination with the service failures that regularly happen in today's chatbot-to-human conversations (Luger, & Sellen, 2016; Van Dolen et al., 2008). Comparing these opposing findings, it is still unknown whether humour will make chatbots unfavourable due to service failure, or whether it predominantly elicits a positive influence due to an increase in human likeness. The first aim of this study therefore is to examine the effect of humour on customers' evaluations of chatbots.

The concept of humour has different forms, such as verbal humour (laughter), facial and bodily expressions (smiling), linguistic humour (jokes) and humorous images (Lynch, 2002). Some studies have tried to make chatbots more humorous by adding jokes in the form of texts (Binsted, 1995; Loehr, 1996; Tinholt, & Nijholt, 2007); however, no studies to date have used humorous images to operationalize humour in chatbot conversations. Due to the complexity of expressing linguistic humour, it is difficult for chatbots to actually behave like human beings, especially when only using linguistic humour in the form of texts (Dybala, Ptaszynski, Rzepka, & Araki, 2009). Humorous images are regarded as a simple way to indicate humour, especially humorous cartoons conveying an innocent humour type (Strick et al., 2009), which appears to receive universally positive evaluations (Nass, & Brave, 2005). Moreover, humorous images appear to be particularly popular on today's social media channels (Grãdinaru, 2016; Highfield, & Leaver, 2016). Using humorous images in chatbot conversations may match people's expectations, since chatbots are mostly implemented in mobile messaging applications. For investigating the impact of humour in chatbot conversations, this study therefore makes use of humorous images in its operationalization of humour.


The effect of humour may be influenced by other essential components of language style, since people perceive language style as an entity rather than as separate linguistic components (Bazarova et al., 2013). It is proposed that tone of voice, comprising human voice and corporate voice (Barcelos, Dantas, & Sénécal, 2018), might be an influential factor moderating the impact of humour on customers' evaluations. Human voice is perceived as a more natural, close, and human style of communication, while corporate voice is perceived as a more distant and formal style (Barcelos, Dantas, & Sénécal, 2018). Humour, as a way of informal communication, may match the profile of human voice rather than the more formal corporate voice (Lynch, 2002). Therefore, the second aim of the study is to assess whether there is an interaction effect of humour and tone of voice on customers' evaluations of chatbots.

Investigating the effects of humour and tone of voice will contribute to creating a favourable language style for chatbots. This is very important for companies, since chatbot evaluations can further influence customers' perceptions of the brand and the company itself (Araujo, 2018). Araujo (2018) has investigated the effect of chatbot evaluations on company-related outcomes, including attitudes, satisfaction and the emotional connection with the company. That study suggests that only the emotional connection with the company is significantly affected by chatbot evaluations. Other important dimensions, such as the behavioural intention of repurchasing the brand, have not been evaluated yet. We will further examine how chatbot evaluations influence brand evaluations, which is the third aim of the study.

This study aims to explore the effect of humour and the interaction effect of tone of voice on the evaluation of chatbots, and to further explore the influence of chatbot evaluations on brand evaluations. It contributes to an understanding of the function of humour in human-chatbot communication, while helping to extend theories on chatbot design and human-computer interaction. Additionally, the study has important societal implications, since the use of chatbots holds large business potential for companies and may reshape online brand communication in the future (Grand View Research, 2017).

Theoretical Background and Hypotheses

Chatbots

Concept of Chatbots

Before diving into the research questions of this study, it is important to first define the concept of chatbots, because chatbots are not yet a well-known concept and the term is commonly misunderstood (Zerfass, Moreno, Tench, Verčič, & Verhoeven, 2017).

The first way to understand the concept of chatbots is to regard them as so-called "social bots", which are automated computer programs for conducting repetitive work on social media (Clément, & Guitton, 2015). Since social media has expanded drastically over the last decades, it has become impossible to organize information or influence people on social media through human work alone (Dunbar, 2012). Thus social bots were created to manage repetitive tasks at a much higher processing speed than human work. For instance, they can fetch and file Wikipedia contents (Clément, & Guitton, 2015), write simple journalistic articles or create millions of tweets to influence political elections (Zerfass et al., 2017). Chatbots belong to the social bots because they are automated computer programs and mainly exist on social media.

Unlike other social bots, chatbots have the conversational ability to interact with users directly. Thus chatbots can also be considered one type of conversational agent. Conversational agents are defined as "software that accepts natural language as input and generates natural language as output, engaging in a conversation with the user" (Griol et al., 2013). They are divided into two groups: embodied and disembodied agents. Embodied conversational agents (ECAs) are animated cartoons resembling human beings. ECAs are equipped with facial expressions or body gestures to communicate with customers in both verbal and physical ways (Nijholt, 2003). An example of an ECA is an online animated shopping assistant who can smile, wave and talk to customers. Disembodied conversational agents (DCAs), however, can only communicate with users in messaging-based interfaces (Araujo, 2018; Krämer, Bente, Eschenburg, & Troitzsch, 2009). Some DCAs are equipped with profile pictures, although they do not have any body language. Chatbots are DCAs because they can only interact with users through messaging applications in the form of texts, images and videos, without any facial or bodily expressions (Kuligowska, 2015).

Therefore, chatbots can be defined as a combination of social bots and conversational agents (the relationship is shown in Figure 1). The aim of a chatbot conversation is to communicate with humans by mimicking human-to-human conversations, often within a messaging-based interface (Araujo, 2018).


Fig. 1. The relationship between social bots, conversational agents and chatbots.

Business-to-Consumer (B2C) chatbots are the focus of this study. B2C chatbots are created by companies for communicating with consumers, in order to offer better customer service and further increase customer preference for their brands (Kuligowska, 2015). B2C chatbots are examined in this study because they form the majority of existing chatbots and hold large potential for companies. The use of B2C chatbots might influence consumers' evaluations of the whole company and brand (Araujo, 2018), which in turn might create considerable business value.

Overview of Current Research about Chatbots

Current research mainly focuses on two aspects of chatbots: the first is establishing criteria for evaluating chatbot performance, and the second is improving customers' evaluations of chatbots by improving the language ability and the language style of chatbots. The most important criterion for evaluating a chatbot is its ability to mimic human-to-human conversations. Increasing human-like cues in chatbots can increase perceived anthropomorphism (Araujo, 2018), which is "the assignment of human traits and characteristics to computers" (Nass, & Moon, 2000). Perceived anthropomorphism includes two parts: mindful anthropomorphism, the conscious perception of the human likeness of computers, and mindless anthropomorphism, the attribution of human characteristics to computers (Kim, & Sundar, 2012). Perceived anthropomorphism is regarded as an important indicator of the human likeness of chatbots (Araujo, 2018).

To make chatbots more human-like, the language ability and the language style of chatbots have been examined in prior studies. Some studies use machine learning techniques to improve the language ability of chatbots, combining pragmatics (grammatical and linguistic abilities) and content semantics (the ability to adhere to the context) (Chakrabarti, & Luger, 2015; Gnewuch, Morana, & Maedche, 2017), while other studies manually manipulate cues to create a favourable language style. It is suggested that if a chatbot has a profile picture resembling a living person, this increases users' involvement and willingness to start the conversation (Van Vugt et al., 2010). Besides, using a person's name and an informal language style increases the human likeness of the chatbot, compared to using a company name and a formal language style (Araujo, 2018).

However, current research has not yet included humour as part of chatbot language style, nor has any research examined the moderating effect of tone of voice (human vs. corporate) in this relationship. Hereafter, both humour and tone of voice are introduced as two important elements of the language style of chatbots.

Humour

Humour is perceived as an essential part of human communication (Lynch, 2002), but no comprehensive theory of humour has emerged (Van Dolen et al., 2008). One classic definition of humour is "any social or non-social event, occurring purposely or inadvertently, that is perceived to be amusing" (Wyer, & Collins, 1992). Humour can be expressed in many ways, including text, sound, images and facial or bodily expressions.

Humour in Interpersonal Communication and Customer Service

The function of humour can be either beneficial or detrimental in interpersonal communication and customer service. Humour has been shown to have a positive effect in human-to-human communication, because humour is a highly valued quality and an important personal characteristic, and showing appreciation for humour is beneficial to interpersonal attraction (Cann et al., 1997). The perception of humour increases the trust between people and further increases their intimacy with each other (Hampes, 1999).

Humour can also help increase satisfaction in customer service, since humour has the function of increasing involvement and enjoyment in the conversation (van Dolen et al., 2008). In online communication with customers, strategically employing humour can help smooth tension during online negotiation (Kurtzberg et al., 2009), because humour is linked with high levels of trust, cohesiveness and stress reduction (Romero, & Cruthirds, 2006). Using humour in e-mail campaigns also has a positive effect on sales, since it can increase the pleasantness of customers' shopping experiences, which leads to a higher willingness to buy (McKeown, 2002).

However, humour is not always beneficial to online communication with customers, especially in task-oriented human-computer interactions (Van Dolen et al., 2008). Since humour increases customers' involvement, both favourable and unfavourable features of the online platform receive heightened attention. A website incorporating humour will receive positive evaluations when it produces favourable outcomes, but negative evaluations when it produces unfavourable outcomes (Van Dolen et al., 2008). Some studies also argue that humour may distract users from their tasks and thus increase the total completion time of an online shopping task, which counts negatively in the evaluation of the interface (Shneiderman, 2010).

In conclusion, the effects of humour depend on the communication situation. In different settings, humour has either positive or negative effects on customers' evaluations of the service (Shneiderman, 2010).

Humour in Conversational Agents

While there is no research yet about the effect of humour in DCAs (chatbots), there is general agreement on the positive effect of humour in ECAs (Nijholt, 2003; Niculescu et al., 2013), because Artificial Intelligence has enabled ECAs to react socially to human users (Araujo, 2018). It is suggested that ECAs are able to use their visual appearance, speech and gestures to mimic human interactions. If the agent tells a light joke combined with the sound of laughter, it can improve users' perception of task enjoyment and the robot's human likeness (Niculescu et al., 2013). Thus using humour in ECAs leads to more positive evaluations of the agent and better interaction quality (Nijholt et al., 2017; Wendt, & Berg, 2009).

The benefits of humour in ECA evaluations are manifested in verbal and physical ways. It is unknown whether humour will still benefit customers' perceptions of DCAs, since they only exist in text-based interfaces. Besides, considering the negative effects of humour on website evaluations when the website cannot offer favourable outcomes (Van Dolen et al., 2008), humour may damage perceptions of chatbots when there is a service failure, which is common at the current stage of chatbot development (Luger, & Sellen, 2016). In this case, humour may make chatbots unfavourable due to service failure. Therefore, this study first aims to explore the effect of humour on chatbot evaluations.

How to indicate humour in chatbot conversations is also a focus of this study. Currently, only texts are used to indicate humour in chatbots (Binsted, 1995; Loehr, 1996; Tinholt, & Nijholt, 2007). Due to the complexity of expressing linguistic humour (jokes), it is difficult for these humorous chatbots to behave truly like human beings (Dybala et al., 2009). Current research does not implement humour in chatbots in other formats, such as humorous images. However, humorous images may be a good way to manifest humour. Firstly, innocent humour is perceived to receive universally positive evaluations, compared to sarcastic or irrelevant humour (Nass, & Brave, 2005), and a funny cartoon image is a good representative of innocent humour (Strick et al., 2009). Secondly, images, including animated Graphics Interchange Format (GIF) images, are very popular in conversations on social media (Grãdinaru, 2016; Highfield, & Leaver, 2016). Considering that B2C commercial chatbots exist mostly on social media platforms such as Facebook Messenger (Larionov, 2017), it is suitable to present funny cartoon images in chatbot conversations, and these images may be perceived as humorous.

In summary, humorous images are a suitable way to operationalize humour in chatbots. As for the effect of humour on chatbot evaluations: even though humour may damage chatbot evaluations when there is a service failure, it can increase the human likeness of chatbots, and users expect chatbots to be human-like since they perceive agents as "social actors" (Reeves & Nass, 1996). Therefore, the first hypothesis is:

H1: Chatbots communicating with humorous images lead to more positive evaluations, compared to chatbots communicating without humorous images.


Human Voice vs. Corporate Voice

Humour is just one element of language style, and people perceive all the elements of language style as a whole rather than separately (Schiffrin, 1985). According to linguistic politeness theory, linguistic politeness is a strategy for reducing interpersonal conflict, and language style should be coherent and suitable for the person's social position (Park, 2008). The aim of using linguistic politeness is to maintain and enhance one's face (i.e., public self-image) during social encounters. People use different language styles according to their gender, age and social status, and keeping a proper and coherent language style is essential to building a favourable public self-image (Schiffrin, 1985). Humour, therefore, has positive effects in chatbot communication only if it is coherent with the overall language style and the language style is favourable to customers (Griol et al., 2013).

Language style is also called tone of voice in brand communication, and tone of voice is often classified into two groups: human voice and corporate voice (Barcelos et al., 2018). Human voice is defined as a more natural, close, and human style of communication (Park, & Cameron, 2014), while corporate voice is a more distant and formal style traditionally used by companies to communicate with their customers (Barcelos et al., 2018). Human voice and corporate voice may moderate the relationship between humour and chatbot evaluations. Since humour is an informal way of communicating and can humanize the tone of voice (Budd, 2004), it is assumed that conversations will be more coherent when combining humour with human voice than when combining humour with corporate voice. It is suggested that being human-like is the most important evaluation of chatbots (Kuligowska, 2015) and that humour can increase the human likeness of chatbots (Niculescu et al., 2013). Besides, whether a chatbot communicates in a coherent language style is also important for customers' evaluations of chatbots (Griol et al., 2013), and humour is more coherent with human voice than with corporate voice. Based on these reasons, the second hypothesis is:

H2: The effect of humorous images on evaluations of chatbots is moderated by the chatbot's tone of voice, such that the effect of humour will be amplified for chatbots with a human voice and reduced for chatbots with a corporate voice.

Evaluations of Chatbots and Related Brands from the Customers' Perspective

Some studies evaluate chatbots regarding their language ability (Shawar, & Atwell, 2007; Yu, Xu, Black, & Rudnicky, 2016), but there is no generally accepted set of chatbot evaluation dimensions from the perspective of customers yet (Araujo, 2018). In this study, three dimensions of customers' evaluations of chatbots are proposed: human likeness, customer satisfaction and customer enjoyment.

Human likeness is the first and most important dimension in chatbot evaluations, because mimicking human conversations is part of the very definition of chatbots. All techniques for developing chatbot language ability aim to make chatbots communicate in a human-like way (Shawar, & Atwell, 2007; Yu et al., 2016), and the human likeness of chatbots determines the perceived quality of chatbot performance (Kuligowska, 2015). Besides, customer satisfaction is also an important dimension in chatbot evaluations, because users' satisfaction is related to their utilization of the computer system (Bailey, & Pearson, 1983), and satisfaction is a key element for measuring agents' service performance (Verhagen et al., 2014). Third, enjoyment is included in the chatbot evaluations in this study, since it is an important parameter of online service performance (Van Dolen et al., 2008), and the factor examined in this study, humour, influences customers' enjoyment of interacting with chatbots (Nijholt et al., 2017). Therefore, human likeness, satisfaction and enjoyment are used to evaluate chatbots from the customers' perspective.

Considering the societal value of chatbots for brand communication, it is valuable to explore whether chatbot evaluations influence customers' perceptions of the brand. Araujo (2018) tested the influence of chatbot performance on customers' attitude, satisfaction and emotional connection with the company, and only the emotional connection with the company was influenced by chatbot human likeness. In this study, attitude towards the brand is kept as a measurement of brand evaluations. Besides, the behavioural intention of reusing the brand is also included in brand evaluations. Choosing these two dimensions of brand evaluations is based on two reasons. First, attitude and behavioural intention are key measures of brand communication in many studies (Grace, & O'cass, 2005; Van Dolen et al., 2008). Second, according to the integrative model (Fishbein, & Cappella, 2006), intention is one of the most important indicators of customers' future behaviour, such as reusing the brand and creating Word of Mouth (WOM), and attitude is one of the indicators of customers' intention. Thus the attitude towards the brand and the behavioural intention of reusing the brand are chosen to evaluate customers' perceptions of the brand.

Since B2C chatbots are highly valued by companies (Grand View Research, 2017), chatbot communication is assumed to influence brand communication, and customers' evaluations of chatbots will influence their evaluations of the related brands. Thus the third hypothesis is:

H3: A more positive chatbot evaluation (including human likeness, customer satisfaction and customer enjoyment) will lead to a more positive brand evaluation (including attitude towards the brand and intention of repurchasing the brand).

In summary, this study aims to explore the relationship between humour and chatbot evaluations, and the moderating effect of tone of voice in this relationship; further, this study tests the influence of chatbot evaluations on brand evaluations (the conceptual framework is shown in Figure 2).

Fig. 2. Conceptual framework

Method

Experimental Design and the Procedure

A 2 (humorous vs. non-humorous chatbot communication) by 2 (human-voice vs. corporate-voice chatbot communication) factorial between-subjects design was implemented in this research. Participants were randomly assigned to one of the four conditions. The factors humour and voice were manipulated to create four conditions: humorous human-voice chatbot, humorous corporate-voice chatbot, non-humorous human-voice chatbot and non-humorous corporate-voice chatbot.

Participants received a task of ordering a pizza through the chatbot of a pizza restaurant. The task included three steps: 1) when the chatbot asks you what kind of pizza you want to order, try “tropical fruit” first; 2) if it does not work, try “margherita” next; 3) if the chatbot asks whether the delivery location is correct, answer “correct”.

The chatbots followed four steps to interact with participants: 1) greet the customer; 2) ask what kind of pizza the customer wants to order; 3) tell the customer there is no “tropical fruit” pizza; 4) confirm there is “margherita” pizza and communicate the delivery location (see the four chatbot conversations in Appendix A). After completing the task, participants filled out a questionnaire regarding their evaluations of the chatbot and the brand, then they answered a set of demographic questions and questions about control variables and the manipulation check.
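The crossing of the two manipulated factors into four conditions, with random assignment of each participant, can be sketched in code. This is a minimal illustration only, not the software actually used in the experiment; the condition labels and the use of Python's `random` module are assumptions for demonstration.

```python
import random

# The two manipulated factors of the between-subjects design
HUMOUR = ("humorous", "non-humorous")
VOICE = ("human-voice", "corporate-voice")

# Crossing the factors yields the four experimental conditions
CONDITIONS = [(h, v) for h in HUMOUR for v in VOICE]

def assign_condition(rng: random.Random) -> tuple:
    """Randomly assign a participant to one of the four conditions."""
    return rng.choice(CONDITIONS)

rng = random.Random(42)  # seeded only to make this sketch reproducible
assignment = assign_condition(rng)
```

Random assignment over the full crossing ensures that each cell of the 2 × 2 design receives participants independently of the other factor.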

Stimuli Materials

Chatbots

A novel pizza brand named "Vivi" was created for this study. A novel brand was used instead of an existing pizza brand, in case people's perceptions of an existing brand interfered with the results. Four chatbots of this pizza brand were built with the InVision website (Higgs, 2016). To keep the experimental conditions well controlled, participants could only interact with the chatbots by clicking the given options instead of typing in their own requests (an example of a chatbot conversation is shown in Figure 3).


Fig. 3. An example of the choices in chatbot conversations

Humour

Cartoon images were used to indicate innocent humour in this study. Two cartoon images were used in the humorous chatbot communication. The first showed three Minions expressing excitement and happiness and was placed right after the chatbot greeted the customer. The second showed SpongeBob saying "sorry" and was placed at the moment of a service failure (when the chatbot could not offer the pizza the customer wanted). In the non-humorous conditions, the chatbots did not send these two pictures, so humour was not present.

Tone of Voice

Inspired by the study of Barcelos et al. (2018), both the profile picture and the wording were manipulated to differentiate the human-voice chatbot from the corporate-voice chatbot. The human-voice chatbot had a human name (Jacob) and a profile picture of a young male adult, and it used an informal language style. The corporate-voice chatbot had the brand name "Vivi" and a profile picture of the brand logo, and it communicated with customers in a formal language style. Table 1 shows examples of the chatbot wording.

Table 1

Wording of human-voice chatbot and corporate-voice chatbot.

Greeting
  Human-voice chatbot: "Hi, I'm Jacob. Nice to meet you! What kind of pizza do you want?"
  Corporate-voice chatbot: "Here is Vivi Pizza Restaurant. Please select the pizza you want to order."

Reply to service failure
  Human-voice chatbot: "Sorry, we don't have this kind of pizza now… Could you order another kind of pizza?"
  Corporate-voice chatbot: "This kind of pizza is not available at the moment. My apologies. Please select another pizza you want to order."

Conduct delivery
  Human-voice chatbot: "Good choice! I will deliver the pizza at this location. Is it correct?"
  Corporate-voice chatbot: "This kind of pizza is available. Vivi will deliver the pizza at this location. Please confirm if the address is correct."

End the conversation
  Human-voice chatbot: "Thank you for ordering the pizza. Tell me if you have questions at any time. Bye!"
  Corporate-voice chatbot: "Thank you very much for ordering the pizza from "Vivi". Please contact me if you have questions."

Pre-test

To guarantee successful manipulations of humour and voice, a pre-test was run among 33 people (9 males, Mage = 24.23, SD = 0.78). Two seven-point questions were asked in the pre-test. The first was "to what extent do you think this chatbot is funny or not funny" (1 = not funny at all, 7 = very funny). Funniness rather than humour was used to test the level of humour in the chatbot, because humour has many different interpretations and funniness is suitable for measuring innocent humour (Strick et al., 2009). The other question was "to what extent do you think the language the chatbot uses is formal" (1 = informal, 7 = formal) and was used to classify the human-voice and corporate-voice chatbots.

Two independent-samples t-tests were conducted as manipulation checks. The language style of the corporate-voice chatbots was perceived as significantly more formal (M = 4.47, SD = 2.33) than that of the human-voice chatbots. Thus, language style was successfully manipulated. However, perceived funniness did not differ significantly between humorous chatbots (M = 4.25, SD = 1.73) and non-humorous chatbots (M = 3.53, SD = 1.28), t(31) = 1.37, p = .197. Therefore, the stimulus materials were adjusted and pre-tested again.

Humour was increased in the humorous chatbot conditions by inserting a third cartoon image, one Minion happily saying "goodbye" at the end of the conversation. The second pre-test, using the same item to measure funniness, was conducted among 52 participants (17 males, Mage = 24.01, SD = 1.31). Humorous chatbots were perceived as significantly funnier (M = 4.33, SD = 1.69) than non-humorous chatbots (M = 3.54, SD = 1.25), t(50) = 2.80, p = .008. Thus, the manipulation of humour was valid after adding one more cartoon picture to the humorous chatbots. The adjusted materials were used in the experimental study.
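The independent-samples t-test used in both pre-tests can be sketched as follows. This is a minimal illustration with hypothetical funniness ratings, not the study's raw data; it assumes SciPy is available.

```python
# Illustrative sketch: an independent-samples t-test on perceived funniness,
# comparing the humorous and non-humorous conditions. Ratings are hypothetical.
from scipy import stats

humorous = [5, 4, 6, 3, 5, 4, 6, 5, 3, 4, 5, 6, 4]      # funniness, humorous condition
non_humorous = [3, 4, 2, 3, 4, 3, 2, 4, 3, 5, 3, 2, 4]  # funniness, non-humorous condition

# Student's t-test assuming equal variances, df = n1 + n2 - 2
t, p = stats.ttest_ind(humorous, non_humorous)
print(f"t({len(humorous) + len(non_humorous) - 2}) = {t:.2f}, p = {p:.3f}")
```

A significant positive t-value would indicate that the humorous condition was rated funnier, as in the second pre-test.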

Participants

Participants were approached via social media and email. In total, 252 people participated in the experiment, of whom 9 were removed from the sample because their data were not reliable (e.g., choosing all "1" or "7" on scale questions with opposite wordings). Thus, 243 participants were included in the analysis. There were 192 females (79.0%), the average age was 24.33 (SD = 2.92), and 224 participants (92.2%) held a bachelor's or master's degree. Most participants were university students, who were generally familiar with social media, especially messaging applications (Highfield & Leaver, 2016).


Chatbot Evaluations

Chatbot evaluations were measured on three dimensions: perceived anthropomorphism as an indicator of human likeness, chatbot satisfaction, and chatbot enjoyment (Araujo, 2018; Van Dolen et al., 2008; Verhagen et al., 2014).

Perceived anthropomorphism consisted of two parts: perceived mindful and mindless anthropomorphism. Perceived mindful anthropomorphism was measured with items adapted from Powers and Kiesler (2006): participants rated to what extent the chatbot was human or machine-like, natural or unnatural, and lifelike or artificial. Perceived mindless anthropomorphism was measured with the scales from Kim and Sundar (2012), evaluating how well four adjectives (likeable, sociable, friendly and personal) described the chatbot. In total, seven items were measured on seven-point bipolar scales. After recoding, high scores represented high perceived anthropomorphism. The average of the recoded items was used to indicate the human likeness of the chatbot (M = 4.44, SD = 1.44), and Cronbach's alpha was .89 (means and standard deviations of all single items are reported in Appendix B).

Satisfaction with the chatbot was measured with three items on a seven-point scale (1 = totally disagree, 7 = totally agree) adapted from Maxham and Netemeyer (2002): "In my opinion, this chatbot provided a satisfactory resolution to my request of ordering the pizza", "I am not satisfied with the way this chatbot handled my request of ordering the pizza" (reversed), and "Regarding the process of ordering the pizza, I am satisfied with this chatbot". After recoding, high average scores represented high satisfaction with the chatbot (M = 4.85, SD = 1.29), and Cronbach's alpha was .81.

Enjoyment of communicating with the chatbot was measured with two seven-point items adapted from Gremler and Gwinner (2000): "I enjoyed communicating with this chatbot" and "I felt comfortable communicating with this chatbot". High average scores represented high chatbot enjoyment (M = 4.61, SD = 1.22), and Cronbach's alpha was .61.
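The Cronbach's alpha values reported for each scale follow the standard formula α = (k / (k − 1)) · (1 − Σs²ᵢ / s²ₜₒₜₐₗ). A minimal sketch, using a hypothetical respondents-by-items score matrix (not the study's data):

```python
# Illustrative sketch of Cronbach's alpha, the internal-consistency statistic
# reported for each scale. The 5-respondent, 2-item matrix is hypothetical.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: respondents x items matrix of scale scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each single item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the sum score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# e.g. two enjoyment items rated by five respondents
scores = np.array([[5, 4], [6, 6], [3, 4], [7, 6], [4, 3]])
print(round(cronbach_alpha(scores), 2))
```

With only two items, as for the enjoyment scale, alpha is typically lower than for longer scales, which is consistent with the .61 reported here.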

Brand Evaluations

Brand evaluations were measured with two dimensions, the attitude and the behavioural intention towards the brand (Fishbein, & Cappella, 2006).

Brand attitude was measured with five seven-point items: good/bad (reversed), favourable/unfavourable (reversed), satisfactory/unsatisfactory (reversed), negative/positive and disliked/liked (Becker-Olsen, 2003). Higher average scores indicated a more positive brand evaluation (M = 4.84, SD = 1.17), and Cronbach's alpha was .90.

Behavioural intention was measured with one seven-point question, "to what extent do you want to use this chatbot to order pizza again?" (1 = very unlikely, 7 = very likely, M = 4.81, SD = 1.41).

Control Variables

Two control variables were included in this study: familiarity with chatbots and attitude towards pizza. Familiarity with chatbots was measured on three levels (30.0% reported being "not familiar", 40.3% "moderately familiar" and 29.6% "familiar with chatbots"), as was attitude towards pizza (4.9% did not like pizza, 45.3% neither liked nor disliked pizza and 49.8% liked pizza).

Results

Manipulation Checks

The same two seven-point questions as in the pre-test were asked as manipulation checks: "to what extent do you think this chatbot is funny" (1 = not funny at all, 7 = very funny) and "to what extent do you think the language the chatbot uses is informal or formal" (1 = informal, 7 = formal). Two independent t-tests were run on these questions. Chatbots with humorous cartoons were perceived as significantly funnier (M = 4.70, SD = 1.60) than chatbots without humorous cartoons (M = 3.94, SD = 1.77), t(241) = 3.48, p = .001. The language of chatbots with a corporate voice was perceived as more formal (M = 4.56, SD = 1.66) than the language of chatbots with a human voice (M = 4.11, SD = 1.79), t(241) = 2.02, p = .044. Therefore, the manipulations were successful.

Randomization Checks

Randomization was checked with Pearson chi-square analyses between the independent variables and the control variables, since the control variables were ordinal. The results showed significant associations between the experimental conditions (humour and voice) and the control variables (familiarity with chatbots and attitude towards pizza). There was a weak association between chatbot humour and familiarity with chatbots, χ²(2) = 10.45, p = .005, between chatbot voice and familiarity with chatbots, χ²(2) = 6.66, p = .036, and between chatbot humour and attitude towards pizza, χ²(2) = 14.90, p = .001. There was no significant association between chatbot voice and attitude towards pizza, χ²(2) = 1.16, p = .561 (crosstabs of these associations are reported in Appendix C). The randomization of this experiment was thus not successful. Therefore, the two (ordinal) control variables were recoded into dummy variables and included as covariates in the analyses.
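The randomization check can be sketched as a Pearson chi-square test on a 2 × 3 contingency table (experimental factor by ordinal control variable). The cell counts below are hypothetical, not the study's data; SciPy is assumed.

```python
# Illustrative sketch of a Pearson chi-square randomization check between an
# experimental factor (2 levels) and an ordinal control variable (3 levels).
# The contingency table is hypothetical.
import numpy as np
from scipy.stats import chi2_contingency

# rows: humour condition (no / yes)
# columns: familiarity with chatbots (not / moderately / familiar)
table = np.array([[50, 40, 30],
                  [25, 58, 40]])

chi2, p, dof, expected = chi2_contingency(table, correction=False)
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.3f}")
```

A significant result, as found here for three of the four pairings, means the conditions differ on the control variable, which is why the controls were entered as covariates.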

Effect of Humour and Voice on Chatbot Evaluations

Chatbot evaluations comprised three dimensions: human likeness of the chatbot, and satisfaction and enjoyment of interacting with the chatbot. Three ANCOVAs were run to test the effects of humour and voice on these evaluations.

Firstly, the effects of the control variables on chatbot evaluations were examined. The analyses showed that familiarity with chatbots had significant effects on chatbot human likeness (dummy 1: F(1, 235) = 7.60, p = .006; dummy 2: F(1, 235) = 11.66, p = .001) and chatbot satisfaction (dummy 1: F(1, 235) = 3.98, p = .047; dummy 2: F(1, 235) = 12.77, p < .001), but not on chatbot enjoyment, whereas attitude towards pizza had no significant influence on any chatbot evaluation (detailed results are reported in Appendix D).

The first hypothesis assumed that chatbots communicating with humorous images would receive more positive chatbot evaluations than chatbots communicating without humorous images. After controlling for the two control variables, there was no significant main effect of humour on any of the three chatbot evaluations: chatbot human likeness, F(1, 235) = 3.60, p = .059, chatbot satisfaction, F(1, 235) = 2.90, p = .090, and chatbot enjoyment, F(1, 235) = 1.48, p = .225. Therefore, the first hypothesis was rejected.
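An ANCOVA of this kind — two experimental factors, their interaction, and dummy-coded covariates — can be sketched with statsmodels. The data frame below is synthetic, with a built-in humour × voice interaction purely for illustration; variable names are hypothetical, not the study's.

```python
# Illustrative sketch of a 2x2 ANCOVA with a covariate, as used for H1 and H2.
# All data are synthetic; the interaction effect is simulated on purpose.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(1)
n = 243
df = pd.DataFrame({
    "humour": rng.integers(0, 2, n),    # 0 = non-humorous, 1 = humorous
    "voice": rng.integers(0, 2, n),     # 0 = corporate voice, 1 = human voice
    "familiar": rng.integers(0, 3, n),  # control variable, 3 levels
})
df["evaluation"] = (4 + 0.5 * df.humour * df.voice   # simulated interaction
                    + 0.3 * df.familiar + rng.normal(0, 1, n))

model = smf.ols("evaluation ~ C(humour) * C(voice) + C(familiar)", data=df).fit()
print(anova_lm(model, typ=3))  # Type III sums of squares, as SPSS reports
```

The `C(humour):C(voice)` row of the ANOVA table corresponds to the interaction effects reported for H2; a follow-up simple effects analysis would then compare humour levels within each voice condition.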


The second hypothesis assumed that the effect of humour would be moderated by the tone of voice of chatbots, such that the effect of humour would be amplified for chatbots with a human voice and reduced for chatbots with a corporate voice. The analysis suggested that, after controlling for the control variables, there was a significant interaction effect of humour and tone of voice on chatbot human likeness, F(1, 235) = 4.32, p = .039. A simple effects analysis showed that, for chatbots with a human voice, the humorous chatbot (M = 4.72, SE = 0.18, 95% CI [4.36, 5.09]) was perceived as significantly more humanlike than the non-humorous chatbot (M = 3.98, SE = 0.19, 95% CI [3.61, 4.35]), F(1, 235) = 7.95, p = .005. For chatbots with a corporate voice, there was no significant difference in human likeness between the humorous chatbot (M = 4.50, SE = 0.19, 95% CI [4.13, 4.87]) and the non-humorous chatbot (M = 4.54, SE = 0.18, 95% CI [4.20, 4.89]), F(1, 235) = 0.03, p = .874 (the results are shown in Figure 4).

Fig. 4. The interaction effect of humour and voice on chatbot human likeness

Similarly, there was a significant interaction effect of humour and tone of voice on chatbot satisfaction, F(1, 235) = 14.33, p < .001. A simple effects analysis showed that, for chatbots with a human voice, the humorous chatbot (M = 5.36, SE = 0.16, 95% CI [5.05, 5.66]) was rated significantly more satisfactory than the non-humorous chatbot (M = 4.48, SE = 0.19, 95% CI [4.17, 4.79]), F(1, 235) = 15.28, p < .001. For chatbots with a corporate voice, there was no significant difference in satisfaction between the non-humorous chatbot (M = 4.93, SE = 0.15, 95% CI [4.65, 5.23]) and the humorous chatbot (M = 4.60, SE = 0.16, 95% CI [4.29, 4.92]), F(1, 235) = 2.28, p = .133 (the results are shown in Figure 5).

Fig. 5. The interaction effect of humour and voice on the chatbot satisfaction

However, there was no significant interaction effect of humour and voice on chatbot enjoyment, F(1, 235) = 0.88, p = .348 (the results are shown in Figure 6). Therefore, the second hypothesis was partially supported in terms of chatbot human likeness and satisfaction, but not chatbot enjoyment.


In conclusion, the first hypothesis was completely rejected because there was no significant main effect of humour on chatbot evaluations: there was no significant difference in human likeness, satisfaction or enjoyment between humorous and non-humorous chatbots. The second hypothesis was partially supported for chatbot human likeness and satisfaction, but not for enjoyment. For chatbots with a human voice, the humorous chatbot was perceived as more humanlike and more satisfactory than the non-humorous one. For chatbots with a corporate voice, the humorous chatbot was no longer perceived as more humanlike or more satisfactory than the non-humorous one. Thus, the positive effect of humour was indeed amplified for chatbots with a human voice and reduced for chatbots with a corporate voice, in terms of human likeness and satisfaction. However, this interaction effect did not appear for chatbot enjoyment.

The Influence of Chatbot Evaluations on Brand Evaluations

The third hypothesis assumed that a more positive chatbot evaluation (including human likeness, customer satisfaction and customer enjoyment) would lead to a more positive brand evaluation (including attitude towards the brand and intention of repurchasing the brand). Six linear regressions were run to test the influence of each chatbot evaluation on each brand evaluation.

Firstly, three linear regressions were run to test the influence of each chatbot evaluation on brand attitude. The model with chatbot human likeness as the independent variable and brand attitude as the dependent variable was significant, F(1, 241) = 156.43, p < .001, with chatbot human likeness explaining 39.4% of the variance in brand attitude. In the same way, chatbot satisfaction explained 56.8% of the variance in brand attitude, F(1, 241) = 316.84, p < .001, and chatbot enjoyment explained 28.8%, F(1, 241) = 97.45, p < .001. The results indicated that chatbot human likeness, satisfaction and enjoyment all influenced brand attitude, and the strength of these relationships was moderate.

Secondly, three linear regressions were run to test the impact of each chatbot evaluation on the intention of repurchasing the brand. The model with chatbot human likeness as the independent variable and repurchase intention as the dependent variable was significant, F(1, 241) = 100.27, p < .001, with chatbot human likeness explaining 29.4% of the variance in repurchase intention. In the same way, chatbot satisfaction explained 48.0% of the variance in repurchase intention, F(1, 241) = 222.32, p < .001, as did chatbot enjoyment, F(1, 241) = 224.25, p < .001. The results indicated that chatbot human likeness, satisfaction and enjoyment all influenced repurchase intention, and the strength of these relationships was moderate.

In conclusion, the analyses suggested that the chatbot evaluations (human likeness, satisfaction and enjoyment) positively influenced the brand evaluations (brand attitude and repurchase intention), with generally moderate relationship strengths. Therefore, the third hypothesis was supported.
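Each of these simple regressions reports the share of variance explained (R²) and a model F-test with F(1, n − 2) degrees of freedom. A minimal sketch on synthetic data (the variables and effect size are hypothetical, not the study's):

```python
# Illustrative sketch of a simple linear regression predicting a brand
# evaluation from a chatbot evaluation, reading off R-squared and the model F.
# The data are synthetic.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
chatbot_eval = rng.uniform(1, 7, 243)                    # e.g. chatbot satisfaction
brand_eval = 0.7 * chatbot_eval + rng.normal(0, 1, 243)  # e.g. brand attitude

res = stats.linregress(chatbot_eval, brand_eval)
r_squared = res.rvalue ** 2                  # share of variance explained
df1, df2 = 1, len(brand_eval) - 2            # with n = 243 this gives F(1, 241)
f_value = r_squared / (1 - r_squared) * df2  # F-test for a one-predictor model
print(f"R2 = {r_squared:.3f}, F({df1}, {df2}) = {f_value:.2f}, p = {res.pvalue:.3g}")
```

For a single predictor, the model F equals the squared t of the slope, so F = R² / (1 − R²) · (n − 2), matching the F(1, 241) values reported above.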

Conclusion and Discussion

The study aims to explore the effects of humour and tone of voice on consumers' evaluations of chatbots and to examine whether these chatbot evaluations influence brand evaluations. For this purpose, an experiment was conducted. Participants were randomly assigned to one of four groups, each interacting with a chatbot of an unknown brand to order a pizza and subsequently evaluating the chatbot and the brand. The study addresses the academic gap concerning the effect of humour on chatbot evaluations, and explores the moderating effect of tone of voice (human vs. corporate) in this relationship. It expands knowledge about the factors influencing chatbot evaluations and develops a theoretical approach to chatbot design and human-computer interaction. Besides, considering the large potential business value of chatbots for companies (Grand View Research, 2017), this research also has important societal implications, especially for lesser-known brands such as the pizza brand used in the experiment.

The first key finding of this study is that humour by itself did not significantly impact chatbot evaluations: it did not make chatbots (DCAs) more humanlike, more satisfactory or more enjoyable. Previous studies, however, have shown that humour can make ECAs more humanlike and likeable, leading to more positive evaluations and better interaction quality (Nijholt et al., 2017; Wendt & Berg, 2009). Unlike ECAs, which express humour through verbal and facial expressions, DCAs (chatbots) can only indicate humour through texts, images and videos, usually within a messaging-based interface (Araujo, 2018). This suggests that humour expressed through verbal and facial expressions evokes positive customer evaluations more easily than humour expressed in messaging-based conversations.

The second key finding is that the effect of humour is moderated by the chatbot's tone of voice. The positive effect of humour is amplified for chatbots with a human voice: within the human-voice condition, the humorous chatbot is perceived as more humanlike and more satisfactory, though not more enjoyable, than the non-humorous chatbot. The positive effect of humour is reduced for chatbots with a corporate voice: in this condition there is no significant difference in human likeness, satisfaction or enjoyment between the humorous and the non-humorous chatbot. This supports the initial notion that humour, as an informal way of communicating, matches the human voice better than the corporate voice. These findings underline the importance of sticking to a coherent language style when designing communication patterns for chatbots (Griol et al., 2013). In contrast to experiences in website interactions, users were found to prefer a chatbot with a human and humorous language style, even when a service failure occurred. Humour in this context is thus not a distracting factor that leads to negative evaluations (Shneiderman, 2010; Van Dolen et al., 2008), but a positive influence that makes the chatbot more humanlike and more satisfactory when combined with a human voice. These findings may encourage companies to develop chatbots that communicate with both humour and a human voice, the combination shown to generate the best evaluations.

Thirdly, it was found that chatbot evaluations moderately influence brand evaluations. This study measured three dimensions of chatbot evaluations (human likeness, satisfaction and enjoyment) and two dimensions of brand evaluations (attitude towards the brand and intention of repurchasing the brand). The results suggest that a humanlike, satisfactory and enjoyable chatbot leads to a positive brand attitude and a high repurchase intention, with moderate relationship strengths. Considering that there are no generally accepted measurements of chatbot evaluations and their related brands yet (Araujo, 2018), human likeness, satisfaction and enjoyment are suggested as valid measurements from the customers' perspective, since all three dimensions moderately influence brand attitude and repurchase intention. The results emphasize the importance for companies of developing a favourable chatbot, since chatbot performance significantly influences brand evaluations.

An additional finding concerning this study's operationalization is that humorous cartoon images effectively convey humour in chatbot-human interactions. This finding adds value to chatbot theory, since only humorous texts were used to represent humour in previous research. Innocent humour, such as cartoon images, might be the only type of humour that receives universally positive evaluations (Nass & Brave, 2005). In line with this, chatbots communicating with humorous images were perceived as funnier than chatbots communicating without them. Considering the limited natural language processing (NLP) abilities of chatbots, it may be difficult for a chatbot to use linguistic humour (jokes) naturally. Therefore, this study suggests humorous images as a new approach to indicating humour in chatbot conversations.

Limitations and Suggestions for Future Study

There are several limitations to this study's findings. Firstly, participants interacted with the chatbots by clicking predetermined answer options instead of typing their own responses. While this ensures high internal consistency, it makes the conversations artificial and therefore might not fully represent real conversations between users and B2C chatbots. Future research should consider developing and applying chatbots that allow participants to interact in their own words, in order to increase the ecological validity of the experiment.

Secondly, the participants might not be good representatives of the whole population. They were relatively familiar with chatbots (70.0%) and mostly female (79.0%), whereas the majority of regular mobile users are not familiar with chatbots (Zerfass et al., 2017) and people who pay attention to chatbots on social media are mostly male (Facebook IQ, 2018). The presented results may therefore be difficult to generalize. In this study, the control variable familiarity with chatbots was in fact shown to influence chatbot evaluations. Future studies of chatbot evaluations could therefore include the influence of familiarity with chatbots and demographic factors. This seems of particular importance both for developing chatbot evaluation theories and for companies targeting the right audiences with their chatbots.

Thirdly, ordering pizza is the only situation examined in the experiment, while other business areas are not covered. The advantages of humour might not hold in more serious and complex business fields, such as insurance and medicine (Zerfass et al., 2017). Furthermore, the chatbot evaluations and brand evaluations in this study are valid but not exhaustive. Future studies could explore other business situations and compare serious business situations with more trivial ones, and could also explore other valid evaluations of chatbots and related brands.

In spite of these limitations, this study is a valuable addition to current research about chatbots. It examines the effects of humour and tone of voice on chatbot evaluations and explores the influence of chatbot evaluations on brand evaluations, thereby extending theories of chatbots and human-agent interaction. Besides this, the study has important implications for companies concerning the development of favourable B2C chatbots. Valid measurements of chatbot evaluations are proposed and can be used by companies to test customers' feedback on chatbot performance. Considering the large business value of chatbots and their potential to reshape online brand communication (Grand View Research, 2017), more exploration is needed in chatbot research. As one of the first preliminary studies on the effects of humour and tone of voice on chatbot evaluations, this study will contribute to future research and business practice in this field.


References

Araujo, T. (2018). Living up to the chatbot hype: The influence of anthropomorphic design cues and communicative agency framing on conversational agent and company perceptions. Computers in Human Behavior, 85, 183-189.

Bailey, J. E., & Pearson, S. W. (1983). Development of a tool for measuring and analyzing computer user satisfaction. Management science, 29(5), 530-545.

Barcelos, R. H., Dantas, D. C., & Sénécal, S. (2018). Watch Your Tone: How a Brand's Tone of Voice on Social Media Influences Consumer Responses. Journal of Interactive Marketing, 41, 60-80.

Bazarova, N. N., Taft, J. G., Choi, Y. H., & Cosley, D. (2013). Managing impressions and relationships on Facebook: Self-presentational and relational concerns revealed through the analysis of language style. Journal of Language and Social Psychology, 32(2), 121-141.

Becker-Olsen, K. L. (2003). And now, a word from our sponsor--a look at the effects of sponsored content and banner advertising. Journal of Advertising, 32(2), 17-32.

Binsted, K. (1995, August). Using humour to make natural language interfaces more friendly. In Proceedings of the AI, ALife and Entertainment Workshop, Intern. Joint Conf. on Artificial Intelligence.

Budd, J. W. (2004). Employment with a human face: Balancing efficiency, equity, and voice. Cornell University Press.


Cann, A., Calhoun, L. G., & Banks, J. S. (1997). On the role of humour appreciation in interpersonal attraction: It’s no joking matter. Humour-International Journal of Humour Research, 10(1), 77-90.

Chakrabarti, C., & Luger, G. F. (2015). Artificial conversations for customer service chatter bots: Architecture, algorithms, and evaluation metrics. Expert Systems with Applications, 42(20), 6878-6897.

Christodoulides, G., De Chernatony, L., Furrer, O., Shiu, E., & Abimbola, T. (2006). Conceptualising and measuring the equity of online brands. Journal of Marketing Management, 22(7-8), 799-825.

Clément, & Guitton. (2015). Interacting with bots online: Users’ reactions to actions of automated programs in Wikipedia. Computers in Human Behavior, 50, 66-75.

Dunbar, R. I. (2012). Social cognition on the Internet: Testing constraints on social network size. Philosophical Transactions of the Royal Society of London: B Biological Sciences, 367, 2192–2201.

Dybala, P., Ptaszynski, M., Rzepka, R., & Araki, K. (2009, May). Humouroids: conversational agents that induce positive emotions with humour. In Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems-Volume 2 (pp. 1171-1172). International Foundation for Autonomous Agents and Multiagent Systems.

Facebook IQ. (2018). Topics to Watch in the United States for January 2018. Retrieved from https://www.facebook.com/business/news/insights/2018-01-topics-to-watch-united-states?ref=search_new_0

Ferrara, E., Varol, O., Davis, C., Menczer, F., & Flammini, A. (2016). The rise of social bots. Communications of the ACM, 59(7), 96-104.

Fishbein, M., & Cappella, J. (2006). The role of theory in developing effective health communications. Journal of Communication, 56(Suppl.), S1-S17.

Grãdinaru, C. (2016). The painting that moves: The internet aesthetics and the reception of GIFs. Hermeneia, (16), 81-91. Retrieved from http://proxy.uba.uva.nl:2048/docview/1833036240?accountid=14615

Gnewuch, U., Morana, S., & Maedche, A. (2017). Towards designing cooperative and social conversational agents for customer service.

Grace, D., & O'cass, A. (2005). Examining the effects of service brand communications on brand evaluation. Journal of Product & Brand Management, 14(2), 106-116.

Grand View Research, Inc. (2017). Chatbot Market Size To Reach $1.25 Billion By 2025 | CAGR: 24.3%. Retrieved from https://www.grandviewresearch.com/press-release/global-chatbot-market

Gretry, A., Horváth, C., Belei, N., & van Riel, A. C. (2017). “Don't pretend to be my friend!” When an informal brand communication style backfires on social media. Journal of Business Research, 74, 77-89.

Grice, P. (1957). Meaning. The Philosophical Review, 66.


Grice, P. (1975). Logic and conversation. Syntax and Semantics, 3, 41–58.
 Grice, P. (1989). Studies in the way of words. Harvard University Press.



Griol, D., Carbo, J., & Molina, J. M. (2013). An automatic dialog simulation technique to develop and evaluate interactive conversational agents. Applied Artificial Intelligence, 27(9), 759-780.

Hampes, W. P. (1999). The relationship between humour and trust. Humour-International Journal of Humour Research, 12(3), 253-260.

Heeter, C. (1992). Being there: The subjective experience of presence. Presence: Teleoperators & Virtual Environments, 1(2), 262-271.

Highfield, T., & Leaver, T. (2016). Instagrammatics and digital methods: studying visual social media, from selfies and GIFs to memes and emoji. Communication Research and Practice, 2(1), 47-62.

Huang, T. H. K., Lasecki, W. S., Azaria, A., & Bigham, J. P. (2016, September). "Is There Anything Else I Can Help You With?" Challenges in Deploying an On-Demand Crowd-Powered Conversational Agent. In Fourth AAAI Conference on Human Computation and Crowdsourcing.

Kim, Y., & Sundar, S. S. (2012). Anthropomorphism of computers: Is it mindful or mindless?. Computers in Human Behavior, 28(1), 241-250.

Kuligowska, K. (2015). Commercial chatbot: performance evaluation, usability metrics and quality standards of embodied conversational agents. Professionals Center for Business Research, 02.

Kumar, N., & Benbasat, I. (2002). Para-social presence and communication capabilities of a web site: a theoretical perspective. E-Service, 1(3), 5-24.


Kurtzberg, T. R., Naquin, C. E., & Belkin, L. Y. (2009). Humour as a relationship-building tool in online negotiations. International Journal of Conflict Management, 20(4), 377-397.

Krämer, N. C., Bente, G., Eschenburg, F., & Troitzsch, H. (2009). Embodied conversational agents: research prospects for social psychology and an exemplary study. Social Psychology, 40(1), 26-36.

Larionov, M. (2017). Messenger Platform 1.1: Ratings, Quick Replies, Account Linking, and More. Messenger Platform. Retrieved from https://messenger.fb.com/blog/messenger-platform-1-1-ratings-quick-replies-account-linking-and-more/

Lee, K. M., Jung, Y., Kim, J., & Kim, S. R. (2006). Are physically embodied social agents better than disembodied social agents?: The effects of physical embodiment, tactile interaction, and people's loneliness in human–robot interaction. International Journal of Human-Computer Studies, 64(10), 962-973.

Loehr, D. (1996). An integration of a pun generator with a natural language robot. In Proc. International Workshop on Computational Humour, 1996 (pp. 161-172). University of Twente.

Luger, E., & Sellen, A. (2016, May). Like having a really bad PA: the gulf between user expectation and experience of conversational agents. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (pp. 5286-5297). ACM.

Lynch, O. H. (2002). Humorous communication: Finding a place for humour in communication research. Communication Theory, 12(4), 423-445.

McKeown, M. (2002). Why They Don't Buy: Make Your Online Customer Experience Work. Financial Times & Prentice Hall, London.

Mey, J. L. (2001). Pragmatics: An introduction (2nd ed.). Oxford: Blackwell.

Nass, C., & Moon, Y. (2000). Machines and mindlessness: Social responses to computers. Journal of Social Issues, 56(1), 81-103.

Nass, C., & Brave, S. (2005). Wired for speech: How voice activates and advances the human-computer relationship. MIT press.

Niculescu, A., van Dijk, B., Nijholt, A., Li, H., & See, S. L. (2013). Making social robots more attractive: the effects of voice pitch, humour and empathy. International journal of social robotics, 5(2), 171-191.

Nijholt, A. (2003). Humour and embodied conversational agents. Centre for Telematics and Information Technology, University of Twente.

Nijholt, A., Niculescu, A. I., Alessandro, V., & Banchs, R. E. (2017). Humour in human-computer interaction: A short survey. Institute for Infocomm Research.

Park, H., & Cameron, G. T. (2014). Keeping it real: Exploring the roles of conversational human voice and source credibility in crisis communication via blogs. Journalism & Mass Communication Quarterly, 91(3), 487-507.

Park, J. (2008). Linguistic politeness and face‐work in computer mediated communication, Part 2: An application of the theoretical framework. Journal of the American Society for Information Science and Technology, 59(14), 2199-2209.


Powers, A., & Kiesler, S. (2006, March). The advisor robot: tracing people's mental model from a robot's physical attributes. In Proceedings of the 1st ACM SIGCHI/SIGART conference on Human-robot interaction (pp. 218-225). ACM.

Reeves, B., & Nass, C. I. (1996). The media equation: How people treat computers, television, and new media like real people and places. Cambridge University Press.

Romero, E. J., & Cruthirds, K. W. (2006). The use of humour in the workplace. Academy of Management Perspectives, 20(2), 58-69.

Saygin, A. P., & Ciceklib, I. (2002). Pragmatics in human–computer conversation. Journal of Pragmatics, 34, 227–258.


Schiffrin, D. (1985). Conversational coherence: the role of well. Language, 640-667.

Shawar, B. A., & Atwell, E. (2007, April). Different measurements metrics to evaluate a chatbot system. In Proceedings of the workshop on bridging the gap: Academic and industrial research in dialog technologies (pp. 89-96). Association for Computational Linguistics.

Shneiderman, B. (2010). Designing the user interface: strategies for effective human-computer interaction. Pearson Education India.

Spotts, H. E., Weinberger, M. G., & Parsons, A. L. (1997). Assessing the use and impact of humour on advertising effectiveness: A contingency approach. Journal of Advertising, 26(3), 17–32.


Strick, M., Van Baaren, R. B., Holland, R. W., & Van Knippenberg, A. (2009). Humour in advertisements enhances product liking by mere association. Journal of Experimental Psychology: Applied, 15(1), 35-45.

Tinholt, H. W., & Nijholt, A. (2007, July). Computational humour: utilizing cross-reference ambiguity for conversational jokes. In International Workshop on Fuzzy Logic and Applications (pp. 477-483). Springer, Berlin, Heidelberg.

Van Dolen, W. M., de Ruyter, K., & Streukens, S. (2008). The effect of humour in electronic service encounters. Journal of Economic Psychology, 29(2), 160-179.

Verhagen, T., Van Nes, J., Feldberg, F., & Van Dolen, W. (2014). Virtual customer service agents: Using social presence and personalization to shape online service encounters. Journal of Computer-Mediated Communication, 19(3), 529-545.

Vugt, H. C. V., Bailenson, J. N., Hoorn, J. F., & Konijn, E. A. (2010). Effects of facial similarity on user responses to embodied agents. ACM Transactions on Computer-Human Interaction (TOCHI), 17(2), 7.

Wendt, C. S., & Berg, G. (2009, September). Nonverbal humour as a new dimension of HRI. In The 18th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN 2009) (pp. 183-188). IEEE.

Wyer, R. S., & Collins, J. E. (1992). A theory of humour elicitation. Psychological Review, 99(4), 663.

Yu, Z., Xu, Z., Black, A. W., & Rudnicky, A. (2016). Chatbot evaluation and database expansion via crowdsourcing. In Proceedings of the chatbot workshop of LREC (Vol. 63, p. 102).

Zerfass, A., Moreno, Á., Tench, R., Verčič, D., & Verhoeven, P. (2017). European Communication Monitor 2017: How strategic communication deals with the challenges of visualisation, social bots and hypermodernity. Results of a survey in 50 countries. Brussels: EACD/EUPRERA, Quadriga Media Berlin.

Appendix

Appendix A. Chatbot stimuli in the experiment

1) Humorous and human-voice chatbot conversation

2) Humorous and corporate-voice chatbot conversation

3) Non-humorous and human-voice chatbot conversation

4) Non-humorous and corporate-voice chatbot conversation

Appendix B. Mean and Standard Deviation of every single item

Items of perceived anthropomorphism

Item         Mean   Std. Deviation
Human-like   4.20   2.03
Natural      4.47   1.84
Lifelike     4.22   1.94
Likeable     4.86   1.64
Sociable     4.28   1.82
Friendly     4.68   1.88
Personal     4.35   1.86

Items of satisfaction and enjoyment

Item             Mean   Std. Deviation
Satisfaction 1   4.68   1.67
Satisfaction 2   5.04   1.33
Satisfaction 3   4.82   1.51
Enjoyment 1      4.40   1.53
Enjoyment 2      4.82   1.34

Items of brand attitude

Item           Mean   Std. Deviation
Good           5.12   1.19
Favourable     4.86   1.33
Satisfactory   4.98   1.35
Positive       4.59   1.58
Liked          4.66   1.54

Appendix C. Crosstabs of independent variables (humour and voice of chatbot conversations) and control variables (familiarity with chatbots and attitude towards pizza)


Familiarity with chatbots by humour condition

Familiarity with chatbots   Humour   Non-humour   Total
Not familiar                25       48           73
Moderately familiar         58       40           98
Familiar                    36       36           72
Total                       119      124          243

Familiarity with chatbots by voice condition

Familiarity with chatbots   Human   Corporate   Total
Not familiar                34      39          73
Moderately familiar         39      59          98
Familiar                    43      29          72
Total                       116     127         243

Attitude towards pizza by humour condition

Attitude towards pizza             Humour   Non-humour   Total
I don't like pizza                 0        12           12
I neither like nor dislike pizza   63       47           110
I like pizza                       56       65           121
Total                              119      124          243

Attitude towards pizza by voice condition

Attitude towards pizza             Human   Corporate   Total
I don't like pizza                 7       5           12
I neither like nor dislike pizza   49      61          110
I like pizza                       60      61          121
Total                              116     127         243

Appendix D. Effects of control variables on chatbot evaluations and the explanations of dummy variables

Control variable            Dummy                                   Human likeness   Satisfaction   Enjoyment
Familiarity with chatbots   Dummy1 (1 = not familiar)               7.60**           3.98*          18.08***
                            Dummy2 (1 = moderately familiar)        11.66**          12.77***       2.77
Attitude towards pizza      Dummy1 (1 = dislike)                    0.32             6.03*          15.90***
                            Dummy2 (1 = neither like nor dislike)   0.01             0.95           0.02
