4.5 Qualitative

4.5.1 Perception and experience

We know from the quantitative section that Vincent was perceived to be equally intelligent, caring, likeable and trustworthy in all conditions, but these scales did not capture the full experience of interacting with Vincent.

Evaluation of conversation In the caregiving condition, it was possible to assess people’s evaluation of the self-compassionate letter-writing exercise directly through one of Vincent’s questions: “How did this make you feel? Did you like the exercise? Why (not)?”. Participants’ answers fell on an axis ranging from negative to positive. For example, a participant who disliked the exercise said the following:

“It made me uncomfortable. I wouldn’t do it if you weren’t paying me, and really, I feel bad right now that I’ve relived that, and I feel exploited and if you weren’t taking advantage of my poverty I would be in a much better mood.” [G35]

In contrast, a positive participant responded with: “I loved the exercise it makes me question what I believe in and how I view myself. It makes me feel more happy and positive about myself as a person.” [G80]

Most participants, however, were closer to being neutral:

“It was fine.” [G44]

“It was interesting to deep dive into those emotions.” [G50]

These responses show that the exercise was received differently by participants in the caregiving condition, but on its own that does not tell us much.

However, there is a more subtle way to gauge evaluations of the entire conversation: looking at how participants responded to Vincent’s last message. In all conditions, Vincent ended the conversation by asking participants to hang out again. The response to this question can be read as an evaluation of how enjoyable or satisfactory the conversation had been for the participant. Here, too, we see replies ranging from (very) positive:

“Sure, I would love that!:-) And thanks for talking to me as well!” [R38]

“Yeah, sounds good. It was nice talking to you, Vincent. Take care, and I hope you get to see the sequoias in-chatbot one day!” [C10]

Through neutral, or perhaps polite:

“Sure thing.” [C43]

“Okay.” [G39]

To (very) negative:

“No.” [R69]

“No! But you’re a robot, so you don’t care. So, bye!” [G81]

Comparing Vincent Apparently, not all participants enjoyed their conversation with Vincent equally: there was considerable variation in their final goodbyes. However, this variation is not reflected in the quantitative scores in Table 1: there, all questions assessing perception of the conversation received similar scores across conditions. Here, the open-ended question that ended the section on conversation perception may provide insight:

“Why did Vincent’s responses (not) resemble those of other chatbots?”

Although this question was intended to gauge whether people thought Vincent was a real chatbot9, responses hinted at the kind of chatbot participants compared Vincent with. As expected, many participants commented on Vincent’s limitations in understanding their input, ostensibly comparing him to a more clever and sophisticated chatbot. Others, however, remarked on his use of GIFs or his display of emotions, suggesting that they compared Vincent with a more typical, non-emotional, task-focused chatbot.

9 As opposed to being a set of conditional survey questions.

When we look at all answers across conditions, an underlying axis emerges, as shown in Table 5 below: some participants were surprised by Vincent’s limitations, others by his human-likeness, and yet another group by other details, such as his humor or the nature of the conversation.

Table 5
Limited chatbot - human axis

Axis: Surprised because of limitations

“I felt as though these were pre-determined responses without much thought. When the machine did not understand it just dealt with it the best way it could by trying to be funny and sidestepping. Most other chatbots tend to at least point you in a general direction and not be funny.” [R1]

“They were dumber and they tried too hard. But most chatbots I used are for things like finance.” [C104]

Axis: Surprised because of details

“Vincent interjected humor which was refreshing.” [G52]

“Other chatbots I have dealt with were designed with service in mind. This was a different kind of conversation.” [R15]

“Most that I’ve interacted with do not attempt to seem so human – they do not claim to have (and fail) programming exams or claim to feel bad about such things. Most are all, or almost all, business, so to speak.” [R10]

Axis: Surprised because of humanness

“He sounded like a real person to me, he was lovely and very caring, he seemed really worried for his failure, I felt like I was chatting with a real human.” [R92]

Combining evaluation and comparison For the participants in the caregiving condition, we could go one step further and combine the human-machine axis with the positive-negative axis identified earlier. We printed 14 responses10 that showed clear examples of positive, negative and neutral answers, and ordered them from negative to positive. We then ordered these participants’ answers on the human-machine axis, from human-like to machine-like11. A simplified outcome of this procedure is shown in Figure 2. The full visualization, including the participants’ actual replies and answers, can be found in Appendix C.

10 Originally, 15 responses were selected (an arbitrary number). The answer to one open-ended question was left out because the participant did not address Vincent’s humanness or limitations.

11 Another Master’s student also ordered the answers on this scale. The final ordering was an average of the author’s ordering and this other student’s.

Although the distances between observations in Figure 2 are not exact, the figure shows a trend: the more positive participants were about the exercise they completed with Vincent, the more they noted Vincent’s apparent humanness. In contrast, participants who were negative more often noted Vincent’s shortcomings as a machine. Even though the evaluations came before the comparison, we do not imply that this is a causal relation: rather, the two seem to correlate.

Figure 2: Responses ordered along two axes: human-like - machine-like and negative - positive

In fact, the different ways of saying goodbye showed a similar correlation: participants who were negative consistently remarked on Vincent’s limitations and gave considerably lower ratings when asked whether they felt they were having a real conversation. For example, consider the goodbye message from the following participant, who rated the realness of the conversation a 1:

“Yeah, thanks but I’m good. have fun taking over the World.” [C114]