
Cheap Frills? Effectiveness of Cheap-talk Scripts in Reducing Hypothetical Bias in Choice Based Conjoint Experiments through the Lens of Dual Process Theory


Academic year: 2021



MSc Thesis - Marketing (Intelligence track)

University of Groningen, Faculty of Business and Economics

Author

Jareef Bin Martuza

Completion Date

Submitted on 11th June 2020

Author

Jareef Bin Martuza
S 4074866
Antarestraat 23-36 (K-5)
9742LA Groningen
j.b.martuza@student.rug.nl
+47 9251 3344

Supervisor/ University
Dr. Felix Eggers
University of Groningen

Second Supervisor/ University
Dr. Keyvan Dehmamy


Table of Contents

Abstract
Acknowledgement
1 Introduction
2 Theoretical Framework and Hypotheses
2.1 HB in CBCE and Dual Process Theory
2.2 Mitigating HB in CBCE and the Boundaries of CT
3 Conceptual Model and Operationalizations of Constructs
4 Methodology
4.1 Experimental Design
4.2 Manipulated Variables
4.3 Dependent Variables
4.3 Conjoint Design
4.4 Conjoint Modeling
4.5 Sampling Method
5 Results
5.1 Preparing the Dataset
5.2 Sample Description
5.3 Creation of Condition Variables
5.4 Base Model
5.5 Comparison of Moderation Models
5.6 Interpreting Moderators
5.7 Model Comparison Across PF Split Samples
5.8 Post Hoc Latent Class Analysis
6 Implications
7 Limitations and Further Research
References
Appendix 1: Comparison of Estimates Across Models
Appendix 2: Choice Shares Comparison
Appendix 3: Detailed Sample Statistics
Appendix 4: Snapshot of a choice set
Appendix 5: Cheap-talk Script (Adapted from Cummings and Taylor ...


Abstract

Background: Choice based conjoint experiments (CBCE) have become an important tool for informing managerial decisions about customer preferences. However, hypothetical bias (HB) reduces the accuracy of CBCE. One much-debated method of mitigating HB in CBCE is the use of cheap-talk (CT) scripts.

Originality: Most previous studies on CT in the context of choice experiments examined its efficacy based on whether it reduced estimated willingness-to-pay (WTP) values. However, a lower WTP does not always indicate reduced HB. In fact, HB can depend on the type of product or on the cognitive resources used to make the purchase decision. Therefore, this thesis uses how close the estimated relative partworths are to the true partworths as the criterion for reduced HB. To the best of my knowledge, no studies have investigated the moderating role of purchase frequency (PF) on the effectiveness of CT using dual process theory.

Purpose: This thesis aims to add to the ongoing discussion by exploring how the effectiveness of cheap-talk scripts depends on whether the product has high purchase frequency (driven by system 1 thinking) or low purchase frequency (driven by system 2 thinking). The goal is to estimate relative partworths more accurately and thereby validate a cost-effective way of generating choice share predictions that are closer to reality.

Methodology: The study has a 2-condition (CT vs Hypothetical) between-subjects design for the conjoint experiment. The CBCE was conducted on an online sample of 300 respondents based in the USA, recruited through Prolific Academic, to generate the dataset for analysis.

Results: The model in which the influence of CT was moderated by PF performed better than a model without CT, a model with CT that did not factor in PF, and a model including both CT and PF without a CT*PF interaction. This supports the moderating role of purchase frequency in the effectiveness of cheap-talk scripts.


Acknowledgement

I would like to express my utmost gratitude to my supervisor Dr. Felix Eggers for his overall guidance, prompt feedback both written and verbal, and introducing me to the world of conjoint analysis and hypothetical bias. Also, I always wondered what it would feel like to use data for a thesis project without spamming my social network. Well, Dr. Felix made it happen for our thesis group; and I feel privileged to be able to use data from an online research platform. I am also grateful to our thesis group- Marit, Rick, and Samradha, for collaborating in designing the survey and checking up on each other’s progress. Our sessions were always fun and informative, and I am glad to have been part of this group.

I also want to express my appreciation to Dr. Keyvan Dehmamy for being the second supervisor. It’s poetic in that you have been a part of all us MI students’ lives except Block 2.1- the only time you did not teach us a course. You’re literally everywhere- on our Nestor pages and hearts.

Finally, I would like to thank my parents, A.K.M. Martuza Ahmed and Jesmin Ara Sultana, for constantly motivating me to put in the hours and complete this thesis. It’s nice to have parents cheering for you, especially so during strange times since the spring of 2020.

Thank you.

Everything passes. Nobody gets anything for keeps. And that is how we have got to live.


1 Introduction

Investigating consumer preferences, such as which attributes and attribute levels of an offering make or break the deal, is of paramount importance to marketers. An experimental method to measure such preferences is the choice based conjoint experiment (CBCE). Here, a product is conceptually deconstructed into a combination of attributes, with each attribute having several levels. For example, a smartphone can be deconstructed into the attributes processor power, memory, battery life, screen size, etc. Similarly, an attribute such as screen size can be 4 inches, 5 inches, or 6 inches, each representing one level of the attribute. The choices that participants make in a CBCE are used to model the relative partworth utilities of the attribute levels. These stated preferences reveal how much customers value changes in the levels of different attributes, for example, how much more someone would pay for a 5-inch screen than for a 4-inch screen, all else being equal. Because conjoint choice experiments reveal preferences for particular aspects of a product or service indirectly, their insights are often more valid than those from studies in which participants state their preferences directly.

However, since most choice experiments involve participants making hypothetical choices, the difference between a participant’s hypothetical preferences and actual preferences in the marketplace introduces a hypothetical bias (Murphy, Allen, Stevens, and Weatherhead, 2005). The experimental literature shows that participants behave differently under hypothetical conditions than under real conditions (Silva, Nayga, Campbell & Park, 2012). Since the usual goal of conducting CBCE is to understand what people would do in real-life scenarios, hypothetical bias (HB) can lead to biased preference measures, in the sense that stated preferences may be far from actual preferences. Therefore, to better understand what drives HB and to make insights from CBCE more accurate representations of reality, it is worth investigating different methods to reduce HB.


randomly selected for the incentive/reward (Ding, 2007) and this realization probability can have similar incentive-alignment effects (Voelckner, 2006). Yet, there are instances where the experimenter may find it infeasible to give out even one actual product (e.g., an electric car).

Another technique for mitigating HB is Cummings and Taylor’s (1999) cheap-talk script, i.e., a set of written or verbal instructions that makes respondents aware of HB before they complete the choice tasks. Findings on the effectiveness of cheap-talk scripts (CT) have been mixed so far. CT was effective only for respondents who did not have prior experience with the product (Champ, Moore & Bishop, 2009). This is consistent with Lusk’s (2003) finding that CT was ineffective for respondents who had knowledge or experience of goods similar to the one valued in the study. In addition, Brown, Ajzen, and Hrubes (2003) found CT to be more effective for higher-priced products than for lower-priced ones. The lack of theory to explain HB (Murphy, Allen, Stevens and Weatherhead, 2005) also means that we cannot explain when CT works (and when it does not) with an established theory. This calls for investigations into the boundary conditions of the effectiveness of CT in reducing HB to explain these mixed findings.

Kahneman (2011) popularized dual process theory, which proposes that we use two kinds of thinking in daily life: system 1 (fast, intuitive, heuristics based) and system 2 (slow, rational, and driven by logic). Since a key goal of CBCE is to simulate psychological realism as much as possible, the congruency of the mode of thinking (system 1 or 2) during the experiment and in real life can influence HB. The findings of Brown, Ajzen, and Hrubes (2003) and Lusk (2003) suggest that product type influences the efficacy of CT. This paper posits that what moderates the effectiveness of CT in reducing HB is how closely the CBCE leads respondents to behave as they would in the marketplace. Different products may evoke different modes of thinking during decision-making, and the congruence of psychological processes between real and experimental conditions can affect the efficacy of CT. More specifically, this paper aims to add to our understanding of HB by drawing from dual-process theory to examine the purchase frequency (PF) boundary condition of CT.

RQ: To what extent can dual process theory explain the effectiveness of CT to mitigate HB?


and Selove (2017). Managers can gain meaningful insights about their products by knowing when to use CT (or not) in their conjoint experiments. Moreover, CT can be a cheaper alternative to incentive alignment for obtaining more accurate estimates without compromising too much on validity. Finally, the use of dual process theory to explain the mitigation of HB by CT will hopefully inspire further contributions on the drivers and boundary conditions of HB.

The remainder of this thesis is structured as follows. First, a theoretical framework connects hypothetical bias, dual process theory, and cheap-talk scripts, leading to the hypotheses proposed for the study. Thereafter, the experimental and modeling components are discussed in the methodology section, followed by a discussion of key results and the testing of the hypotheses. The thesis concludes with theoretical and managerial implications and suggests avenues for future research based on current limitations and findings.

2 Theoretical Framework and Hypotheses

2.1 HB in CBCE and Dual Process Theory

Hypothetical bias (HB) is the difference between hypothetical valuations of a product and actual marketplace behavior (Penn & Hu, 2018). In CBCE, HB is the difference between how important people find an attribute level according to their stated preferences and how important it is according to their marketplace behavior. The prevalence of HB casts doubt on the insights generated from CBCE, since biased estimates of relative partworths lead to erroneous preference and market share predictions. In their meta-analysis across 28 stated preference valuation studies, Murphy, Allen, Stevens and Weatherhead (2005) found the median ratio of hypothetical to actual willingness-to-pay (WTP) to be 1,35. The finding that hypothetical WTP tends to be higher than real WTP is common to several studies (Hensher, 2010; Fifer, Rose & Greaves, 2014; Penn & Hu, 2018). Schmidt and Bijmolt (2019) find HB to vary with product value, category, sample, experimental procedures, and model specifications. At its core, HB can result from respondents not taking hypothetical tasks seriously, or not exerting enough cognitive effort because their choices usually lack consequences (De-Magistris, Garcia & Nayga, 2013).


our decisions are based on heuristics without much conscious thinking (Kahneman, 2011). Those decisions are therefore subject to biases to begin with (Hilbert, 2012), and over-correcting for them in experimental conditions may introduce new biases into the study. Contrary to extant findings, this paper posits that what matters is not so much whether participants exert enough cognitive effort, but whether the psychological processes in the experiment mirror those of real life.

Even though Schmidt and Bijmolt (2019) present salient characteristics of products and experimental procedures as antecedents of HB, we need to dig deeper into cognitive processes to achieve a better understanding of the drivers of HB.

Kahneman (2011) popularized the idea that people use two routes of thinking: a quick and intuitive one (system 1), and a rational one driven by logic (system 2). Depending on the context, some of our decisions depend more on system 1, while others rely more on system 2 thinking. System 1 thinking allows us to make fast and automatic decisions that may take place unconsciously, and it is therefore useful in everyday scenarios. The heuristic nature of system 1 thinking enables us to complete routine tasks easily but may lead to biases and error-prone judgment. In contrast, system 2 thinking is slow and effortful, takes place in our conscious mind, and is useful for making complex and first-time decisions. Although it offers the structure of weighing pros and cons and analyzing multiple angles of a decision, system 2 thinking comes at the price of being slow and consuming more mental energy.

Drawing from Murphy, Allen, Stevens, and Weatherhead’s (2005) definition of HB, the difference between real and hypothetical preferences can result from an incongruency between the psychological processes occurring in marketplace and experimental scenarios. For example, HB is expected to be higher if the way customers think when deciding which smartphone to buy differs from the way participants think in a CBCE investigating smartphone preferences. It is reasonable to assume that most people use system 2 thinking when buying a smartphone, and the onus is therefore on the experimental set-up to evoke system 2 thinking similar to that of the marketplace. If a real-life decision is driven by system 2 thinking, experimental conditions that also evoke system 2 thinking can bridge the gap between our thought processes in marketplace and hypothetical settings and thereby reduce HB. Conversely, if experimental conditions cannot evoke system 2 thinking sufficiently, the mismatch of mental processes between marketplace and hypothetical conditions can result in different valuations by respondents and lead to HB.


thinking by default. This is because their responses generally do not have real consequences and the survey is not “important enough” to warrant adequate mental effort. Many study samples and respondents on research platforms have prior experience with completing surveys (Smith, Roster, Golden & Albaum, 2016), which enables them to respond heuristically. In other words, mental processes while completing surveys are often driven by system 1 thinking, since the respondent may feel comfortable just “going through the motions”. Interestingly, many purchase decisions in real life, especially choices about frequently purchased items, are also based on heuristics (Kahneman, 2011). For example, most people use system 1 thinking to buy toilet paper, since having bought it many times before makes it possible to delegate this decision to system 1. This means that a CBCE investigating preferences for toilet paper attribute levels would have participants in a system 1 thinking mode by default, so experimental conditions already mirror real ones. As a result, respondents relying on system 1 thinking during hypothetical surveys mirror real-life scenarios better when the purchase decision studied in the CBCE also relies on system 1 thinking, and such studies thus have less HB to begin with.

In contrast, real decisions that require substantial cognitive resources (system 2 thinking) are more difficult to simulate in CBCE. Since people are generally cognitive misers (Szmigin & Piacentini, 2018), their reliance on system 1 for completing surveys is incongruent with real-life scenarios where the decision requires system 2 thinking. As suggested earlier, this discrepancy is more likely to induce bias, which can cause hypothetical valuations to differ from real-life ones. The relationship between the congruence of thought processes in the experiment and the marketplace, and where HB is present, is outlined below.

Experiment | Marketplace: System 1 | Marketplace: System 2
System 1 | No HB | HB
System 2 | HB | No HB


respondents are familiar with those decisions (Murphy, Allen, Stevens & Weatherhead, 2005). This means that for products that are bought more often (high purchase frequency), people become more familiar with those decisions and thereby a CBCE in that context would have lower HB.

Dickie, Fisher and Gerkin (1987) found that the amount people stated they would pay for a pint of strawberries did not differ from the amount they actually paid, providing further support that HB may not be relevant for goods one is familiar with buying. This is because familiarity with the purchase decision makes it easier to evoke marketplace processes in experimental conditions. Since HB stems from the difference between actual and hypothetical preferences, this again implies that hypothetical bias would be weaker for products that are frequently bought. As alluded to before, higher purchase frequency leads to greater familiarity, and the brain can delegate the task to system 1 since it becomes categorized as a habitual action. On the other hand, buying products that are not frequently bought requires more deliberative thinking and thereby relies more on system 2 thinking. Given the usually limited mental effort expended in completing surveys, a stronger HB is expected when the study is concerned with products that are not frequently bought.

To clarify how purchase frequency influences familiarity and the mode of thinking (system 1 or system 2) while making purchases, consider the example of riding a bicycle. When someone first learns to ride a bicycle, they use system 2 thinking since the brain wants to have conscious control. However, as riding a bicycle becomes familiar, it becomes more of a system 1 function since the person has been through the motions several times. By analogy, with higher purchase frequency, customers become more comfortable using system 1 for those purchases (such as buying toilet paper).

2.2 Mitigating HB in CBCE and the Boundaries of CT


“realization probability” may also overcome the lack of consequentiality. However, incentive alignment needs to be directly related to the participant expressing true preferences, or else behavior in the experiment would not be consistent with real behavior. Having valid incentives becomes even more difficult when a CBCE is conducted for products that are not on the market yet or are too expensive to give away (even if it is just one).

Another mechanism to reduce hypothetical bias and uncover respondents’ true preferences is the cheap-talk (CT) script introduced by Cummings and Taylor (1999). At its core, CT works by making respondents aware of the influence of the study’s hypothetical nature on their valuations of different choices. This can be done with a written script, verbal instructions, explanatory videos, or a combination of those, communicated before the choice tasks. By making respondents aware of HB, it is hoped that they will expend more of their cognitive resources and approach the study in a more serious frame of mind. One commonality between the HB mitigation methods is that they aim to make respondents decide with more cognition, suggesting that they try to evoke more system 2 thinking. However, there is mixed evidence regarding the effectiveness of CT in reducing HB (Silva, Nayga, Campbell & Park, 2012). Penn and Hu (2019) also question the efficacy of CT and reiterate the need to investigate boundary conditions. In their meta-analysis, the authors found that CT scripts were effective in reducing hypothetical bias in 23 out of 30 studies that were choice experiments. Brown, Ajzen, and Hrubes (2003) found that CT worked better for higher-priced items than for lower-priced ones. Notably, relatively expensive product categories also tend to be bought infrequently. Moreover, Murphy, Stevens and Weatherhead (2005) found that CT worked better when respondents did not have much prior experience in evaluating alternatives. These results suggest that CT works better when there is low familiarity with the purchase decision. This indicates that including CT in experimental set-ups would be more effective when the participants are less familiar with the decision, i.e., when they would have used system 2 thinking in the marketplace.


thought processes (system 1 or 2) in the marketplace and experiment can lead to stronger HB. The effectiveness of a mitigatory agent such as CT also depends on the congruency of the mental processes evoked during experimental conditions with those of the marketplace. This relationship is summarized below.

Marketplace | CT: No | CT: Yes
System 1 | Effective | Ineffective
System 2 | Ineffective | Effective

It becomes evident that the effectiveness of CT depends on whether the experimental set-up and marketplace decisions are made with similar psychological processes. More specifically, the psychological processes in the minds of participants are influenced by the familiarity of those decisions, operationalized by purchase frequency (PF). Drawing from the previous section, HB in CBCE also depends on decision familiarity, i.e., PF. Therefore, the effectiveness of CT should also be moderated by purchase frequency.

H1: The effectiveness of CT in reducing HB in CBCE is moderated by purchase frequency of the product in the study.

To reiterate, CT works by evoking system 2 thinking, and earlier studies suggest that CT is more effective when the choices in the study also require deliberative thinking in real-life. Since HB is weaker when hypothetical questions are used for products that are bought frequently, CT would not have sufficient HB to reduce in the first place. Since HB becomes weaker when experimental conditions mirror real conditions better, CT would be more effective when it is used in CBCE on products that are not frequently bought (low PF). This is because CT evokes system 2 thinking- which is usually used for making decisions that are infrequent, thereby facilitating the thought processes in CBCE to be closer to real life scenarios.

H1a: CT is more effective in reducing HB in CBCE when the product in the study is less frequently bought by the participant.


than reducing HB. Drawing from the previous chapter, a mismatch between the modes of thinking (system 1 or 2) used in the marketplace and in experimental conditions would lead to HB not being reduced. Therefore, if CT is the cause of the mismatch, i.e., if it evokes system 2 thinking in experimental conditions when system 1 thinking is used in the marketplace, it would fail to reduce HB and might even increase it. This means that for purchase decisions that people are quite familiar with (high PF), a CT prompt would overcorrect for an HB that is not that prevalent in this context. In other words, CT would be less effective in reducing HB if its presence increases the difference between how people think in the CBCE and how they think in real-life scenarios.

H1b: CT is less effective in reducing HB in CBCE when the product in the study is more frequently bought by the participant.

3 Conceptual Model and Operationalizations of Constructs

Figure 1: Conceptual Model


to manipulate whether there is a match (or mismatch) of modes of thinking (system 1 or system 2) between marketplace and experimental conditions.

4 Methodology

4.1 Experimental Design

The purpose of this study is to examine to what extent dual process theory can be used to explain what influences the effectiveness of cheap-talk (CT) scripts in mitigating hypothetical bias (HB) in choice based conjoint experiments (CBCE). To investigate this, an online CBCE on toothpaste preferences generated the dataset catering to four different studies of the thesis group on drivers of hypothetical bias. The relevant section for this thesis has a between-subjects design with participants either assigned to a CT or hypothetical condition.

4.2 Manipulated Variables

The manipulated variable is the presence of a cheap-talk script (CT condition) or the absence of it (hypothetical condition). Both conditions have the same structure for three sections: experimental instructions, a conjoint task, and demographics and control questions. The CT condition was manipulated by presenting a CT script as a page before the conjoint task. The CT script was adapted to this context (see Appendix 5) from the pre-existing CT script of Cummings and Taylor (1999). This additional page’s aim was to inform respondents about the existence of hypothetical bias in experiments and urge them to be more aware. The hypothetical condition had the conjoint task appear right after the instructions section.

Purchase frequency (PF) was not manipulated but derived from a two-item measure that captures the last time the participant bought toothpaste and how many times they bought toothpaste in the last 12 months. A median split was done to create the dummy variable PF (1: Low PF, 0: High PF). Toothpaste is an FMCG product that some people buy frequently and others infrequently, depending on the household. This implies that choosing toothpaste could be either a familiar task (system 1) or an unfamiliar one (system 2), depending on past purchase behavior. Because the study is limited to a single product, PF was not manipulated but determined from the past purchase behavior of participants in the CBCE.

4.3 Dependent Variables


explained compared to the error. The fit would be a proxy of the usefulness of having CT being moderated by PF.

Secondly, external validity is investigated by comparing the mean absolute errors (MAE) of the choice shares predicted by different models against the observed choice shares in the two holdout sets. This follows the logic that better models have lower mean absolute errors, which suggests greater predictive power. Moreover, the MAEs of the models are also compared against the two holdout sets from an incentive alignment (IA) condition.
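To make the MAE criterion concrete, the following is a minimal Python sketch of how predicted choice shares could be compared with observed holdout shares. The function name and all share values are hypothetical illustrations, not results or code from this study.

```python
def mean_absolute_error(predicted_shares, observed_shares):
    """Mean absolute difference between predicted and observed choice shares."""
    if len(predicted_shares) != len(observed_shares):
        raise ValueError("share vectors must have the same length")
    total = sum(abs(p - o) for p, o in zip(predicted_shares, observed_shares))
    return total / len(predicted_shares)


# Illustrative numbers only: shares for three toothpaste alternatives
# plus the None Option in one holdout set.
predicted = [0.40, 0.30, 0.20, 0.10]
observed = [0.35, 0.35, 0.15, 0.15]

mae = mean_absolute_error(predicted, observed)
print(round(mae, 4))
```

A model whose predicted shares deviate less from the holdout shares yields a lower MAE, which is the sense in which lower values indicate greater predictive power.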

Thirdly, we investigate the significance and estimated coefficients of the moderators and compare them across different models. We also check for the significance of CT in the low and high PF subsamples to further elaborate on the boundary conditions of the effectiveness of CT.

4.3 Conjoint Design

The CBC experiment that generated the dataset for this study comprises specific attributes of toothpaste, namely flavor, freshness, color, whitening, cleaning, ingredients, and price. These attributes are varied in both conditions: hypothetical and CT. Each attribute has 3 or 4 levels, which are designed to be mutually exclusive; keeping the number of levels similar across attributes limits the number-of-levels effect, in which one attribute dominates others. The attributes and levels were mostly determined by two other students in the thesis group, since these attributes and levels are key to their research questions. Nonetheless, the selection was motivated by extant literature, and we had multiple discussions to reach a design that is viable for everyone in the thesis group. The attributes and their levels are detailed in Table 1.

Table 1. Toothpaste Attributes and Levels

Attribute | Level 1 | Level 2 | Level 3 | Level 4
Flavor | Fennel | Peppermint | Watermelon | -
Freshness | Max Fresh | Cooling Blast | Fresh Breath | -
Color | Triple Color Paste | White Paste | Black Paste | -
Whitening | Advanced Whitening | Regular Whitening | No Whitening | -
Cleaning | Deep Clean (advanced cavity, tartar, and enamel protection) | Everyday Clean (regular cavity, tartar, and enamel protection) | Sensitive Clean (gentle cavity, tartar, and enamel protection) | -


To mitigate response fatigue, a fractional factorial conjoint design was implemented with three toothpaste alternatives and one None Option in each set, and 13 choice sets per respondent. The first choice set was for practice, and two of the sets were used as holdout sets. This left 10 choice sets per participant for estimating conjoint models. A snapshot of a choice set in the conjoint experiment can be found in Appendix 4.

4.4 Conjoint Modeling

The goal of this study is to investigate how purchase frequency moderates the effectiveness of CT in reducing hypothetical bias. The main criteria are whether the interaction of CT and purchase frequency is significant, and whether it improves model fit and predictive accuracy. As is the norm in CBCE studies, this study uses a random utility model based on random utility theory (RUT) to predict the participants’ toothpaste preferences. A toothpaste can be seen as a bundle of attributes, with its utility determined by the sum of the utilities of its attribute levels (such as color-white, freshness-max, flavor-peppermint, etc.). Preferences for different attribute levels are decomposed statistically from the stated preferences in the choice sets. Under RUT, participants base their choices on which of the given alternatives (combinations of attribute levels) provides the maximum utility. Mathematically, this can be expressed as:

$$U_{ij} = V_{ij} + \varepsilon_{ij}$$

Note that all the components are indexed per participant per product, meaning that they vary across participants and products. More specifically, the overall utility U of participant i for a particular toothpaste combination j is a latent construct that comprises a systematic component (rational utility) V and a stochastic component (random error) ε. Since the choice sets also contain a None Option, the utility of a particular toothpaste combination has to be higher than that of the None Option for it to be chosen.

A multinomial logit model (MNL) will be used to estimate the systematic utility component of different toothpaste offerings and predict choice probabilities. More specifically, we will later compare how variations of our models account for the impact of toothpaste attributes and levels on the systematic utility $V_{ij}$. Here, $V_{ij}$ is assumed to be a linear combination of the partworth utilities of the attribute levels.
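As an illustration of how an MNL turns systematic utilities into choice probabilities, the following minimal Python sketch assumes the standard logit form, in which the probability of choosing alternative j is exp(V_ij) divided by the sum of exp(V_ik) over all alternatives k in the choice set. This form follows when the error terms are independently and identically Gumbel distributed. The partworth values, the reduced two-attribute profiles, and the zero utility assigned to the None Option are hypothetical placeholders, not estimates or specifications from this study.

```python
import math


def mnl_choice_probabilities(systematic_utilities):
    """Return logit choice probabilities exp(V_j) / sum_k exp(V_k)."""
    # Subtracting the maximum utility before exponentiating improves
    # numerical stability and leaves the probabilities unchanged.
    v_max = max(systematic_utilities)
    exp_v = [math.exp(v - v_max) for v in systematic_utilities]
    denominator = sum(exp_v)
    return [e / denominator for e in exp_v]


# Hypothetical partworths (illustrative only).
partworths = {
    ("flavor", "Peppermint"): 0.6,
    ("flavor", "Fennel"): -0.4,
    ("whitening", "Advanced Whitening"): 0.3,
    ("whitening", "No Whitening"): -0.3,
}


def systematic_utility(profile):
    """V_ij as the sum of the partworths of the levels in a profile."""
    return sum(partworths[(attribute, level)] for attribute, level in profile.items())


alt_a = {"flavor": "Peppermint", "whitening": "Advanced Whitening"}
alt_b = {"flavor": "Fennel", "whitening": "No Whitening"}
alt_c = {"flavor": "Peppermint", "whitening": "No Whitening"}
none_option_utility = 0.0  # assumed reference utility for the None Option

utilities = [systematic_utility(p) for p in (alt_a, alt_b, alt_c)]
utilities.append(none_option_utility)
probabilities = mnl_choice_probabilities(utilities)
print([round(p, 3) for p in probabilities])
```

In this sketch, an alternative is more likely to be chosen than the None Option only when its systematic utility exceeds the None Option's utility, which mirrors the role of the None Option described above.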


4.5 Sampling Method

Respondents for this study were recruited through Prolific Academic (ProA), an online platform for recruiting human participants for research. Since this study investigates experimental techniques, one drawback of online research platforms is that respondents may have prior experience with similar research designs (such as the use of CT), and this non-naivety may bias responses. Moreover, since this thesis aims to investigate the differential effect of CT in high and low PF scenarios, it is important to secure a sample in which CT has a higher likelihood of registering as a stimulus and evoking system 2 thinking rather than being ignored due to prior experience. Peer, Brandimarte, Samat and Acquisti (2017) found that ProA samples generated higher quality data than the better-known Amazon MTurk. This enhances confidence that any observed effect of the manipulation is not mainly attributable to sample bias. Finding a platform such as ProA and running the CBCE there was made possible by Dr. Felix.

5 Results

5.1 Preparing the Dataset

The datasets (survey data and choice data) for this study were provided by Dr. Felix Eggers, our thesis group supervisor. The survey for the conjoint experiment was designed together with three other members of the thesis group under the supervision of Dr. Felix Eggers. The survey dataset initially comprised a sample of 300 respondents based in the United States of America. Conditions not relevant to this study were then filtered out, leaving a final sample of 158 respondents (Hypothetical: 105 vs CT: 53) and 6320 observations in total after merging the survey and choice datasets. All 158 were complete responses. The dataset was consistent (for example, survey completion times were within the expected range), and all responses were therefore deemed valid.

5.2 Sample Description


members, and a little over half of them (52,53%) spent less than $4 on a tube of toothpaste. Please refer to Appendix 3 for more details on the sample statistics.

5.3 Creation of Condition Variables

The Hypothetical and CT conditions were manipulated in the between-subjects design, with each participant placed in one of the two conditions; the condition is represented in the dataset as a dummy variable (CT: 1, Hypothetical: 0). A median split was done for purchase frequency (PF), coded as a dummy variable (1: low vs 0: high). Participants who bought toothpaste both in the last three months and more than three times a year were classified as having high purchase frequency, and the rest as having low purchase frequency. This gave an even split, with 79 participants in the PF-high group and 79 in the PF-low group.

5.4 Base Model

First, base model 1 (BM1) is estimated, which comprises only the main effects of all toothpaste attribute levels as partworth utilities (Table 2). Next, base model 2 (BM2) is estimated with price entered linearly. Likelihood-ratio chi-square tests show that BM2 fits significantly better than the null model (p < ,001) and does not differ significantly from BM1, in which price is in partworth format (p = ,5722). Moreover, BM2 has a higher adjusted 𝑅² (BM2: ,1781 vs BM1: ,1772), which suggests that it is the better model. Finally, the Akaike Information Criterion (AIC) is also lower for BM2 (BM2: 3600,7 vs BM1: 3602,7). Therefore, we proceed with the model in which price is linear: it fits no worse than BM1 while having three fewer parameters, which keeps the model parsimonious.

Table 2: Modeling Base Models

| | Null Model | BM1 (price-partworth) | BM2 (price-linear) |
|---|---|---|---|
| LL | -2190,3 | -1785,3 | -1786,3 |
| AIC | 4380,6 | 3602,7 | 3600,7 |
| 𝑅² | | ,1849 | ,1844 |
| Adj 𝑅² | | ,1772 | ,1781 |
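The fit statistics for BM2 can be retraced from its log-likelihood. The sketch below is an illustration rather than the original estimation code: it assumes 158 respondents, 10 choice sets, and 4 alternatives per set (including the None Option) for the null model, and 14 parameters for BM2, as used in the R script in Appendix 6. The helper `fit_stats` is a hypothetical name, and small deviations from Table 2 are expected because the log-likelihood is rounded.

```r
# Null log-likelihood: every alternative is equally likely in every choice set.
n_respondents <- 158
n_tasks <- 10
n_alternatives <- 4
ll_null <- n_respondents * n_tasks * log(1 / n_alternatives)

# McFadden pseudo R-squared, its adjusted version, and AIC
# for a model with log-likelihood `ll` and `k` estimated parameters.
fit_stats <- function(ll, k, ll_null) {
  c(r2     = 1 - ll / ll_null,
    adj_r2 = 1 - (ll - k) / ll_null,
    aic    = 2 * k - 2 * ll)
}

# BM2: rounded log-likelihood from Table 2, 14 parameters (assumed from the script).
bm2_stats <- fit_stats(ll = -1786.3, k = 14, ll_null = ll_null)
print(round(ll_null, 1))
print(round(bm2_stats, 4))
```

The adjusted 𝑅² penalizes the log-likelihood by the number of parameters, which is why a model with fewer parameters can score higher despite a slightly lower log-likelihood.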

Estimates of most attribute levels are significant at p < ,05, except for all three levels of freshness and the regular level of whitening (see Appendix 1 for details). The estimate of price is


5.5 Comparison of Moderation Models

BM2 is then extended to include the moderation effects of CT (cheap talk) and PF (purchase frequency) on price and the None Option. This is because the content of the CT script specifically urges participants to be more certain of their choices, which would be reflected in the share of None Option choices and in price sensitivity. When participants are more certain of their choices, stated preferences should be closer to actual preferences, and this would materialize in a higher proportion of choice sets in which the None Option is chosen because the alternatives are not attractive enough. Similarly, CT should make respondents pay more attention to price, so the combined price effects should be larger as well. Moreover, CT and PF are not expected to affect preferences across alternatives and are therefore not interacted with the attribute levels.

I take an incremental model-building approach. Moderation Model 1 (MM1) includes two interactions of CT, with the None Option and with price. Next, MM2 includes two interactions of PF, with the None Option and with price. Note that MM2 serves as a control for how PF behaves on its own, without the inclusion of CT. MM3 contains all four of these interaction terms: CT with the None Option and price, and PF with the None Option and price. Finally, MM4 adds to MM3 two three-way interaction terms, CT and PF with the None Option and CT and PF with price, and represents the proposed optimum model.


Table 3: Moderation Models

| | BM2 | MM1 | MM2 | MM3 | MM4 |
|---|---|---|---|---|---|
| Moderators | Base Model | CT*None, CT*price | PF*None, PF*price | CT*None, CT*price, PF*None, PF*price | CT*None, CT*price, PF*None, PF*price, CT*PF*None, CT*PF*price |
| LL | -1786,3 | -1782,3 | -1784,2 | -1780,1 | -1776,0 |
| AIC | 3600,7 | 3596,6 | 3600,5 | 3596,3 | 3592,1 |
| 𝑅² | ,1844 | ,1863 | ,1854 | ,1873 | ,1892 |
| Adj 𝑅² | ,1781 | ,1790 | ,1781 | ,1791 | ,1800 |
| MAE H1 | 11,5% | 7,92% | 7,17% | 8,47% | 4,87% |
| MAE H2 | 12,87% | 9,05% | 13,82% | 9,68% | 6,40% |
| MAE IA H1 | 9,14% | 5,56% | 4,81% | 6,11% | 2,51% |
| MAE IA H2 | 9,45% | 5,63% | 10,40% | 6,26% | 2,98% |

Since the overarching goal of this study is to investigate the boundary conditions of the effectiveness of CT in reducing hypothetical bias, the first step is to compare the internal validity of the estimated models. First, in terms of adjusted 𝑅² (Table 3), the models that account for CT (MM1, MM3, and MM4) perform better than the base model (BM2). Model fit (Table 3) also rises gradually as CT is taken into consideration (MM1), as both CT and PF are accounted for (MM3), and finally as the interaction effects of CT and PF are added (MM4): $R^2_{BM2} < R^2_{MM1} < R^2_{MM3} < R^2_{MM4}$. Moreover, the AIC (Table 3) shows that MM4 performs best, as it has the lowest value ($AIC_{BM2} > AIC_{MM1} > AIC_{MM3} > AIC_{MM4}$). This is in line with Murphy, Stevens, and Weatherhead's (2005) position that greater model fit suggests greater internal validity, and thereby indicates better performance of MM4 in reducing hypothetical bias.

Therefore, the model comparisons provide evidence that CT is more effective in reducing HB when considered in conjunction with PF (MM3 having better model fit than MM1). In addition, the results suggest that the effectiveness of CT depends on whether PF is high or low (MM4 having better model fit than MM3), which signals interaction effects between CT and PF.
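Because the moderation models are nested, the fit differences can also be checked with likelihood-ratio tests. The sketch below is an added illustration, not part of the original analysis. It uses the rounded log-likelihoods from Table 3, takes the MM3 log-likelihood as -1780,1 (the negative sign is implied by its AIC and 𝑅²), and assumes that each step adds two parameters. `lr_test` is a hypothetical helper, and the p-values are approximate because of the rounding.

```r
# Likelihood-ratio test of a restricted model against a fuller nested model.
lr_test <- function(ll_restricted, ll_full, df) {
  statistic <- -2 * (ll_restricted - ll_full)
  p_value <- pchisq(statistic, df = df, lower.tail = FALSE)
  c(statistic = statistic, df = df, p_value = p_value)
}

# Rounded log-likelihoods taken from Table 3.
ll <- c(BM2 = -1786.3, MM1 = -1782.3, MM3 = -1780.1, MM4 = -1776.0)

# Each comparison adds two interaction terms to the restricted model.
mm1_vs_bm2 <- lr_test(ll[["BM2"]], ll[["MM1"]], df = 2)
mm3_vs_mm1 <- lr_test(ll[["MM1"]], ll[["MM3"]], df = 2)
mm4_vs_mm3 <- lr_test(ll[["MM3"]], ll[["MM4"]], df = 2)

print(round(rbind(mm1_vs_bm2, mm3_vs_mm1, mm4_vs_mm3), 4))
```

Unlike a comparison of 𝑅² values, which can only rise when terms are added, a test of this kind indicates whether each added pair of interaction terms improves fit by more than chance would suggest.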


However, another important criterion in comparing models is external validity, assessed through prediction errors on out-of-sample observations. All five models are therefore used to predict the choice shares of two holdout sets (H1 & H2), from which mean absolute errors (MAE) are computed. Again, MM4 clearly outperforms the other models, predicting the choice shares of each of the two holdout sets with the lowest MAE. Surprisingly, although MM3 has a better fit than MM1, its predictive error is higher (even though it is lower than that of BM2). However, the stark difference in predictive performance between MM4 and MM3 suggests that it is the interaction between CT and PF that matters, rather than merely accounting for each of them. The differences between MM1 and MM4 in both model fit and predictive accuracy are especially indicative of the proposition that CT is more effective when accounting for PF. The following section provides more evidence on this proposition.

Since actual market choice shares cannot be obtained, predictive accuracy is also assessed by taking the holdout sets in the IA condition as a proxy for market shares. This is possible because one of the thesis group members wrote their thesis on IA. In the IA condition, all respondents had a random chance of receiving one of their chosen alternatives, so their choices can be regarded as a closer representation of marketplace behavior. Again, the results are consistent, with predictive accuracy (lower MAE) in the order MM4 > MM1 > BM2. This adds further evidence that including a CT*PF interaction in the model increases predictive power. The MAE comparison is illustrated below.
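To make the error measure concrete, the sketch below computes an MAE between predicted and observed choice shares for a single holdout set. It assumes that the MAE is the mean of the absolute share differences across the three alternatives and the None Option; the share values and the function name `mean_absolute_error` are hypothetical and are not taken from the study.

```r
# Mean absolute error between predicted and observed choice shares.
mean_absolute_error <- function(predicted, observed) {
  stopifnot(length(predicted) == length(observed))
  mean(abs(predicted - observed))
}

# Hypothetical shares for one holdout set (three options plus the None Option).
observed  <- c(option1 = 0.30, option2 = 0.25, option3 = 0.20, none = 0.25)
predicted <- c(option1 = 0.35, option2 = 0.20, option3 = 0.22, none = 0.23)

mae <- mean_absolute_error(predicted, observed)
print(mae)
```

The same function could be applied to each model's predicted shares, once against the hypothetical holdout shares and once against the IA holdout shares.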

Figure 2: Comparison of Model Prediction Errors (bar chart; y-axis: Mean Absolute Error, 0 to 0,16; x-axis: Models MM1, MM2, MM3, MM4)


5.6 Interpreting Moderators

Table 4: Comparing Coefficients of Moderators

| | MM1 | MM2 | MM3 | MM4 |
|---|---|---|---|---|
| None | -,391*** | -,649*** | -,493*** | -,615*** |
| Price | -,362*** | -,287*** | -,295*** | -,309*** |
| CT*None | -0,576* | | -0,594* | -,109 |
| CT*Price | ,0158 | | ,0267 | ,0722 |
| PF*None | | ,1871 | ,2229 | 0,459* |
| PF*Price | | -0,141* | -0,144* | -,115 |
| CT*PF*None | | | | -1,046* |
| CT*PF*Price | | | | ,084 |
| Combined None Option Effects | -,967* | -,649* | -1,087* | -1,202* |
| Combined Price Effects | -,362* | -,428* | -,439* | -,309* |

*significant at p<0,05, **significant at p<0,01, ***significant at p<0,001

From Table 4, we can see that the magnitude of the combined None Option effect increases in the order MM1 (|-,967|) < MM3 (|-1,087|) < MM4 (|-1,202|). This suggests that people are most likely to choose the None Option when CT*PF*None is incorporated in the model. Choosing the None Option more often indicates truer stated preferences: if a choice set does not contain an attractive enough alternative, the person chooses the None Option rather than stating a preference that is far from true. Moreover, as mentioned before, the finding that CT*None becomes insignificant once CT*PF*None is introduced suggests that PF moderates the effectiveness of CT.
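The combined effects reported in Table 4 appear to equal the base coefficient plus the interaction coefficients that are significant at p < 0,05. This is an inference from the reported numbers, not a rule stated in the text. The sketch below retraces the combined None Option effects for MM1 and MM4 under that assumption; `combined_effect` is a hypothetical helper.

```r
# Sum of the coefficients flagged as significant (assumed aggregation rule).
combined_effect <- function(estimates, significant) {
  sum(estimates[significant])
}

# None Option coefficients for MM1 (Table 4).
mm1_none <- c(None = -0.391, CT_None = -0.576)
mm1_sig  <- c(TRUE, TRUE)

# None Option coefficients for MM4 (Table 4); CT*None is not significant.
mm4_none <- c(None = -0.615, CT_None = -0.109, PF_None = 0.459, CT_PF_None = -1.046)
mm4_sig  <- c(TRUE, FALSE, TRUE, TRUE)

mm1_combined <- combined_effect(mm1_none, mm1_sig)
mm4_combined <- combined_effect(mm4_none, mm4_sig)
print(c(MM1 = mm1_combined, MM4 = mm4_combined))
```

Under this reading, the combined effect describes respondents for whom both dummies equal 1, that is, CT respondents in the PF group coded as 1.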

Surprisingly, our results do not suggest that respondents become more price sensitive when CT and PF are added to the model. This may be explained by the lower-than-normal toothpaste prices (25% off the normal price), which can make people perceive the higher-priced alternatives as carrying a higher absolute discount. Noticing this can even make people less price sensitive, since they may feel they get more value from choosing such alternatives, which would lead to this inconsistency.

All in all, we find strong support for H1 and can thereby conclude that the effectiveness of CT is moderated by PF.

5.7 Model Comparison Across PF Split Samples


Table 5: Comparing CT Across High vs Low PF Subsamples

| | Base Model (High PF) | CT Model (High PF) | Base Model (Low PF) | CT Model (Low PF) |
|---|---|---|---|---|
| LL | -921,1 | -920,7 | -853,3 | -845,2 |
| AIC | 1870,3 | 1873,7 | 1734,7 | 1722,6 |
| 𝑅² | ,1590 | ,1593 | ,2209 | ,2282 |
| Adj 𝑅² | ,1462 | ,1447 | ,2081 | ,2136 |
| MAE H1 | 8,76% | 8,83% | 16,69% | 8,20% |
| MAE H2 | 10,94% | 12,32% | 14,94% | 9,90% |

In the low PF subsample, the inclusion of CT interactions with price and the None Option significantly improves the model, as determined by the chi-squared test (p = ,0003). Moreover, the interaction term of CT and the None Option was highly significant (p < ,001). In addition, the adjusted 𝑅² is higher in the CT model (Base: ,2081 vs CT: ,2136), indicating that the addition of CT when PF is low does make the model better. This shows that CT is more effective when PF is low, thereby supporting H1a. However, in the high PF subsample, the inclusion of CT interactions with price and the None Option does not significantly improve the model, as determined by the chi-squared test (p = ,7183). Moreover, both CT interaction terms were insignificant (CT*price: p = ,4159, CT*none: p = ,6931). In addition, the adjusted 𝑅² is actually lower in the CT model (Base: ,1462 vs CT: ,1447), indicating that the addition of CT when PF is high makes the model worse. This suggests that the inclusion of CT does not reduce hypothetical bias when the product has a high PF and may even be counterproductive, thereby supporting H1b.

For external validity, choice shares were predicted on two holdout sets (H1 & H2) by the Base and CT models in both the high and low PF subsamples (Table 5). The MAE for the high PF subsample is not lower when CT is introduced to the model (Holdout 1: 8,83% for CT vs 8,76% for Base; Holdout 2: 12,32% for CT vs 10,94% for Base). However, the MAE is substantially lower when CT is included in the low PF subsample (Holdout 1: 8,20% for CT vs 16,69% for Base; Holdout 2: 9,90% for CT vs 14,94% for Base). This shows that CT helps predict choice shares more accurately when PF is low, whereas it is not effective when PF is high, thereby providing more support for H1a and H1b.


5.8 Post Hoc Latent Class Analysis

This section outlines a post hoc analysis investigating whether the consumers can be assigned to different segments. Since the study has a between-subjects design (CT vs Hypothetical), segmentation was based on one dummy variable (CT), and the model was therefore run for two classes. Increasing the number of classes to 3 or 4 did not improve model fit. The latent class model (with two classes) was found to be significantly different from MM4, as supported by a likelihood-ratio chi-squared test with p < 0,001. The latent class model also has a lower AIC than MM4 (3302 vs 3592), reaffirming that the respondents can be divided into CT vs Hypothetical in their modal class memberships.

The estimates of class2 (-,8762 at p < ,001) and CT*class2 (,3679 at p <,00) provide further evidence of customer heterogeneity across the CT and Hypothetical conditions. The class memberships are classified at 115 for Hypothetical (actual: 105) and 42 for CT (actual: 53), with a mean absolute error (MAE) of 13,29%.

The same latent class analysis is done on split samples of high and low purchase frequency (PF). Again, both class2 and CT*class2 were significant for both the high and low PF subsamples at p < ,001. For low PF subsample, the class memberships were classified at 58 (actual 55) for hypothetical and 21 (actual 24) for CT with a MAE of 7,59%. Similarly, for the low PF subsample, the class memberships were classified at 56 (actual 50) and 23 (actual 29) with a MAE of 15,19%. The discrepancy in MAE across subsamples is also explained by the result that basing the latent class modeling on two variables (CT and PF) did not make the model significantly better than basing the segmentation on CT alone.
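The reported classification errors appear to equal the summed absolute difference between classified and actual group sizes, divided by the 158 respondents of the full sample or the 79 respondents of a subsample. This is an inference from the reported counts rather than a definition given in the text. The sketch below retraces the three values under that assumption; `classification_error` is a hypothetical helper.

```r
# Share of respondents by which classified group sizes deviate from actual sizes.
classification_error <- function(classified, actual) {
  sum(abs(classified - actual)) / sum(actual)
}

# Counts as reported in the text, in the order (Hypothetical, CT).
full_sample <- classification_error(classified = c(115, 42), actual = c(105, 53))
subsample_a <- classification_error(classified = c(58, 21),  actual = c(55, 24))
subsample_b <- classification_error(classified = c(56, 23),  actual = c(50, 29))

print(round(100 * c(full_sample, subsample_a, subsample_b), 2))
```

Because only group sizes are compared, this measure does not show whether individual respondents were assigned to the class that matches their condition.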

All in all, while it is indeed possible to divide customers into two segments based on CT, the same cannot be done when basing the segmentation on PF. However, although customers may not behave differently based on PF alone, this thesis is concerned with how the effectiveness of CT is moderated by PF.

6 Implications


execution of CBCE when the choice sets involve products that are infrequently bought by the respondent, while making no significant difference for products that are frequently bought. This is in line with the findings of both Murphy, Allen, Stevens, and Weatherhead (2005) and Dickie, Fisher, and Gerking (1987) that HB is less prevalent for products that respondents are more familiar with. If there is less HB to begin with, adding an HB-reducing mechanism cannot be expected to improve model fit. For products with low PF, by contrast, CT can explain part of the variation that remains unexplained without it.

Referring to the research question, the results also suggest that dual process theory can be used to explain the effectiveness of CT. To address the mixed findings of Silva, Nayga, Campbell, and Park (2012), this study points out that the effectiveness of CT depends on whether the marketplace decision is driven by system 2 thinking. Since CT works by urging respondents to use more of their cognition, it facilitates system 2 thinking. Our results show that CT improved the execution of CBCE in the subsample with low PF of toothpaste, which suggests a lack of familiarity and thereby system 2 usage in the marketplace (if prompted). In the high PF subsample, by contrast, using CT did not make any significant difference. This is in line with Lusk's (2003) finding that CT is effective when respondents are unfamiliar with the product. Moreover, the finding that CT did not bring any improvement in the high PF subsample is in line with Champ, Moore, and Bishop's (2009) finding that CT is not effective when respondents have more prior experience with the product.


usually requires system 1 thinking in the marketplace, it would be better to let the participant also be on system 1 thinking by not introducing mechanisms that over-commit cognitive resources. Also, for researchers interested in better execution of CBCE, the findings of this study can help them better design their CBCE studies without extra costs. Finally, the findings are also relevant for students writing their bachelor’s or master’s thesis who want to increase the validity of their study in a cost-effective manner.

7 Limitations and Further Research

The findings and interpretations of this study are of course subject to limitations, such as violations of the assumptions of CBCE, the scope and design of the study, and sample validity. First, the fact that all levels of Freshness are insignificant may point to some ambiguity in how respondents perceived the different levels. Moreover, the assumption of independence of irrelevant alternatives may be violated, since toothpaste comes in a wide range of alternatives and it is not possible to incorporate them all without increasing respondent fatigue. Future studies investigating the boundary conditions of CT may benefit from choosing a more standard product. Second, even though the incorporation of holdout tasks allows us to test external validity, the market shares are extrapolated from the IA condition and are not actual market shares per se. In addition, the study did not have brand as an attribute, which might have enabled a comparison of brand choice shares in the study with those in the marketplace. It should also be noted that the CBCE was a collaborative effort, so the attributes and levels were balanced across hedonic and utilitarian aspects to fit the overarching goal of the thesis group. Future studies in this domain may therefore benefit from choosing a category with fewer brands (unlike toothpaste) and including brand as an attribute. Echoing Eggers, Hauser and Selove (2017), and to overcome the limitation that this study used only text-based stimuli, future studies may also benefit from using visual stimuli to better mirror the marketplace.


References

Berinsky, A. J., Margolis, M. F., & Sances, M. W. (2014). Separating the shirkers from the workers? Making sure respondents pay attention on self‐administered surveys. American Journal of Political Science, 58(3), 739-753.

Brown, T. C., Ajzen, I., & Hrubes, D. (2003). Further tests of entreaties to avoid hypothetical bias in referendum contingent valuation. Journal of Environmental Economics and Management, 46(2), 353-361.

Champ, P. A., Moore, R., & Bishop, R. C. (2009). A comparison of approaches to mitigate hypothetical bias. Agricultural and Resource Economics Review, 38(2), 166-180.

Cummings, R. G., & Taylor, L. O. (1999). Unbiased value estimates for environmental goods: a cheap talk design for the contingent valuation method. American Economic Review, 89(3), 649-665.

De-Magistris, T., Gracia, A., & Nayga Jr, R. M. (2013). On the use of honesty priming tasks to mitigate hypothetical bias in choice experiments. American Journal of Agricultural Economics, 95(5), 1136-1154.

Dickie, M., Fisher, A., & Gerking, S. (1987). Market transactions and hypothetical demand data: A comparative study. Journal of the American Statistical Association, 82(397), 69-75.

Ding, M. (2007). An incentive-aligned mechanism for conjoint analysis. Journal of Marketing Research, 44(2), 214-223.

Ding, M., Grewal, R., & Liechty, J. (2005). Incentive-aligned conjoint analysis. Journal of Marketing Research, 42(1), 67-82.

Eggers, F., & Eggers, F. (2011). Where have all the flowers gone? Forecasting green trends in the automobile industry with a choice-based conjoint adoption model. Technological Forecasting and Social Change, 78(1), 51-62.


Fifer, S., Rose, J., & Greaves, S. (2014). Hypothetical bias in Stated Choice Experiments: Is it a problem? And if so, how do we deal with it?. Transportation Research Part A: Policy and Practice, 61, 164-177.

Hauser, J. R., Eggers, F., & Selove, M. (2019). The Strategic Implications of Scale in Choice-Based Conjoint Analysis. Marketing Science, 38(6), 1059-1081.

Hensher, D. A. (2010). Hypothetical bias, choice experiments and willingness to pay. Transportation Research Part B: Methodological, 44(6), 735-752.

Hilbert, M. (2012). Toward a synthesis of cognitive biases: how noisy information processing can bias human decision making. Psychological Bulletin, 138(2), 211.

Howard, G., Roe, B. E., Nisbet, E. C., & Martin, J. F. (2017). Hypothetical bias mitigation techniques in choice experiments: do cheap talk and honesty priming effects fade with repeated choices?. Journal of the Association of Environmental and Resource Economists, 4(2), 543-573.

Kahneman, D. (2011). Thinking, fast and slow. Macmillan.

List, J. A., & Gallet, C. A. (2001). What experimental protocol influence disparities between actual and hypothetical stated values?. Environmental and Resource Economics, 20(3), 241-254.

Miller, K. M., Hofstetter, R., Krohmer, H., & Zhang, Z. J. (2011). How should consumers’ willingness to pay be measured? An empirical comparison of state-of-the-art approaches. Journal of Marketing Research, 48(1), 172-184.

Murphy, J. J., Allen, P. G., Stevens, T. H., & Weatherhead, D. (2005). A meta-analysis of hypothetical bias in stated preference valuation. Environmental and Resource Economics, 30(3), 313-325.

Murphy, J. J., Stevens, T., & Weatherhead, D. (2005). Is cheap talk effective at eliminating hypothetical bias in a provision point mechanism?. Environmental and Resource Economics, 30(3), 327-343.

Peer, E., Brandimarte, L., Samat, S., & Acquisti, A. (2017). Beyond the Turk: Alternative platforms for crowdsourcing behavioral research. Journal of Experimental Social Psychology, 70, 153-163.

Penn, J. M., & Hu, W. (2018). Understanding hypothetical bias: An enhanced meta-analysis. American Journal of Agricultural Economics, 100(4), 1186-1206.

Penn, J., & Hu, W. (2019). Cheap talk efficacy under potential and actual Hypothetical Bias: A meta-analysis. Journal of Environmental Economics and Management, 96, 22-35.

Smith, S. M., Roster, C. A., Golden, L. L., & Albaum, G. S. (2016). A multi-group analysis of online survey respondent data quality: Comparing a regular USA consumer panel to MTurk samples. Journal of Business Research, 69(8), 3139-3148.


Schmidt, J., & Bijmolt, T. H. (2019). Accurately measuring willingness to pay for consumer goods: A meta-analysis of the hypothetical bias. Journal of the Academy of Marketing Science, 1-20.

Silva, A., Nayga Jr, R. M., Campbell, B. L., & Park, J. L. (2012). Can perceived task complexity influence cheap talk's effectiveness in reducing hypothetical bias in stated choice studies?. Applied Economics Letters, 19(17), 1711-1714.

Tonsor, G. T., & Shupp, R. S. (2011). Cheap talk scripts and online choice experiments: “looking beyond the mean”. American Journal of Agricultural Economics, 93(4), 1015-1031.

Voelckner, F. (2006). An empirical comparison of methods for measuring consumers’ willingness to pay. Marketing Letters, 17(2), 137-149.

Wlömert, N., & Eggers, F. (2016). Predicting new service adoption with conjoint analysis: external validity of BDM-based incentive-aligned and dual-response choice designs.


Appendix 1: Comparison of Estimates Across Models

| Attribute | Level | BM2 | MM1 | MM2 | MM3 | MM4 |
|---|---|---|---|---|---|---|
| Flavor | Fennel | -0,2832*** | -0,2833*** | -0,2845*** | -0,2845*** | -0,2847*** |
| | Peppermint | 0,6459*** | 0,6458*** | 0,6468*** | 0,6467*** | 0,6466*** |
| | Watermelon | -0,3627*** | -0,3625*** | -0,3623*** | -0,3622*** | -0,3619*** |
| Freshness | Max | ,0378 | ,0384 | ,0392 | ,0399 | ,0399 |
| | Cooling | -,0210 | -,0215 | -,0238 | -,0247 | -,0254 |
| | Fresh | ,0168 | ,0169 | ,0154 | ,0152 | ,0145 |
| Color | Triple | 0,1348*** | 0,1343*** | 0,1356*** | 0,135*** | 0,1348*** |
| | White | 0,17435*** | 0,1744*** | 0,1721*** | 0,1722*** | 0,1726*** |
| | Black | -0,30915*** | -0,3087*** | -0,3077*** | -0,3072*** | -0,3074*** |
| Whitening | Advanced | 0,3168*** | 0,3174*** | 0,3161*** | 0,3166*** | 0,3162*** |
| | Regular | ,0708 | ,0705 | ,0693 | ,0689 | ,0699 |
| | No | -0,3168*** | -0,3174*** | -0,3161*** | -0,3166*** | -0,3162*** |
| Cleaning | Deep | 0,1799*** | 0,1796*** | 0,1795*** | 0,1791*** | 0,1784*** |
| | Everyday | -0,2026*** | -0,2016*** | -0,2021*** | -0,2011*** | -0,2006*** |
| | Sensitive | 0,03645*** | 0,03621*** | 0,0226*** | 0,022*** | 0,0222*** |
| Ingredients | Contains Fluoride | 0,1277** | 0,128** | 0,1277** | 0,12798** | 0,1272** |
| | Fluoride-free | -0,0948* | -0,0947* | -0,0929* | -0,0928* | 0,0926* |
| | Paraben-free | -0,0329** | -0,0333** | -0,0348** | -0,03518** | -0,2198** |
| Price & None | Price | -0,3561*** | -0,3616*** | -0,2866*** | -0,2949*** | -0,309*** |
| | None | -0,5576*** | -0,3908*** | -0,6485*** | -0,4933** | -0,615*** |
| Moderators | CT*None | | -0,5762* | | -0,5943** | -,1087 |
| | CT*Price | | ,0158 | | ,0267 | ,0722 |
| | PF*None | | | ,1871 | ,2229 | 0,4594* |
| | PF*Price | | | -0,1412* | -0,1436* | -,1148 |
| | CT*PF*None | | | | | -1,0455 |
| | CT*PF*Price | | | | | ,0841* |

***significant at p<0.001, **significant at p<0.01, *significant at p<0.05


Appendix 2: Choice Shares Comparison

Option 1 Option 2 Option 3 None MAE MAE


Appendix 3: Detailed Sample Statistics

Age: Nearly 82% between 18-39.

≤17 18-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 60-64 65-69 ≥70

1 57 27 28 17 7 8 3 5 3 1 1

Conditions: 33% CT

Hypothetical CT

105 53

US Region: Evenly spread.

West Southeast Northeast Southeast Midwest Other Not in USA

35 13 34 40 33 1 2

Survey Duration in Seconds

Mean SD Median Min Max

842,68 359,28 754,5 279 2038

Household Size in number of members: about 74% between 2-4 members per household

1 2 3 4 5 ≥ 6

22 43 41 33 14 5

Recency based on the last time they purchased toothpaste: 49.37% bought toothpaste last month

Last month Last 3 months Last 6 months Last 12 months More than 1 year

78 57 18 4 1

Frequency based on how many times the respondent bought toothpaste in the last 12 months: 81% bought between 1-6, 50,6% 3 or less.

Never 1-3 4-6 7-9 ≥ 10

2 78 50 10 18

Monetary based on money spent in USD per tube of toothpaste: 52.53% less than $4.

>2 2-3,99 3-4,99 5-6,99 ≥ 7


Current brand of toothpaste: 77,85% use Crest or Colgate, 84,8% use Crest, Colgate, or Sensodyne.

| Brand | Respondents |
|---|---|
| Crest | 62 |
| Colgate | 55 |
| Sensodyne | 17 |
| Arm & Hammer | 2 |
| Elmex | 1 |
| Oral-B | 1 |
| Tom’s of Maine | 6 |
| Hello | 1 |
| Coral | 0 |
| Parodontax | 0 |
| Aquafresh | 4 |
| Other | 3 |
| Do not know | 6 |

Elaboration as mean of elaboration items (scale: 0-6):

Mean SD Median Min Max

3,97 1,38 4,12 0 6

CRT as percentage correct on crt questions: 59% correct answers

Mean SD Median Min Max

,59 ,29 ,5 0 1

CRT as number of correct answers: 52.53% got two or less correct.

0 1 2 3 4


Appendix 4: Snapshot of a choice set

Appendix 5: Cheap-talk Script (Adapted from Cummings and Taylor

Caution:

In a recent study similar to this, participants just like you chose what they preferred most among different types of toothpaste. One group did not have to actually buy the toothpaste, and 38% said they would indeed buy one of the options they chose. The second group had an additional clause that they would have to actually buy one of their chosen options. This time, only 25% said that they would buy the toothpaste. That is quite a difference, right? We call this difference a hypothetical bias that shows how people’s hypothetical choices differ from real purchases in which they have to use their own money.

Therefore:

We request you to choose exactly as you would have chosen if you really were going to face the consequences, i.e., if you really had to buy the toothpaste you choose. So, during every task, please ask yourself: 1. What features of the toothpaste are most important to me? 2. Taking all options into account, will I really spend my money to buy that toothpaste? Please keep this in


Appendix 6: R Script

set.seed(2020) # set random seed to allow results to be replicated

# clearing the workspace and setting working directory
rm(list = ls())
setwd("C:/Users/jaree/Desktop/Thesis Data")

################# loading survey data ########################
library(tidyr)
dat <- read.csv("Survey_1294_data.csv")

############# descriptive stats ###########
# filtering IA out
dat1 <- subset(x = dat, subset = IA < 1)
# for IA estimation
dat2 <- subset(x = dat, subset = IA > 0)
colnames(dat1)

# gender: 180 female, 118 male, 2 preferred not to answer
table(dat1$X32_gender)
# age
table(dat1$X33_age)
# condition
table(dat1$CT)
# US region
table(dat1$X34_US.region)
# elapsed seconds
library(psych)
describe(dat1$elapsed_seconds)
# household size
table(dat1$X35_hh.size)

# for dat
dat$r <- 0
dat$r[dat$X37_recency > 0] <- 1
dat$f <- 0
dat$f[dat$X38_frequency > 0] <- 1
dat$purchase <- 0

# recency (dummy kept numeric so that the comparisons below behave as intended)
table(dat1$X37_recency)
dat1$r <- 0
dat1$r[dat1$X37_recency > 0] <- 1
# frequency
table(dat1$X38_frequency)
dat1$f <- 0
dat1$f[dat1$X38_frequency > 0] <- 1
# purchase frequency
dat1$purchase <- 0
dat1$purchase[dat1$r > 0 & dat1$f > 0] <- 1
table(dat1$purchase)

# monetary
table(dat1$X39_monetary)
# current toothpaste brand
table(dat1$X36_brand_item)

############# for purchase frequency
with(dat1, cor(X38_frequency, X37_recency))

############### loading choice data ###################
choices <- read.csv("Survey_1294_choices.csv")

# removing the first set which was a trial
cbc <- subset(x = choices, subset = choices$Set_id > 1)

library(dplyr)

# renaming set_id
cbc <- cbc %>%
  mutate(Set_id = replace(Set_id, Set_id == 11, 10))

# merging choice and survey data
library(tidyr)
cbc2 <- merge(cbc, dat, by.x = "Resp_id", by.y = "resp_id")
# sorting on the merged data itself (respondent, choice set, alternative)
cbc2 <- cbc2[order(cbc2$Resp_id, cbc2$Set_id, cbc2$Alternative_id), ]

# filtering IA out and keeping one for IA estimation
cbc3 <- subset(x = cbc2, subset = IA < 1)
cbc3.1 <- subset(x = cbc2, subset = IA > 0)

# converting the datasets into mlogit formats
library(mlogit)
cbc4 <- mlogit.data(cbc3, choice = "Selection_Dummy", shape = "long",
                    alt.var = "Alternative_id", id = "Resp_id")
cbc4.1 <- mlogit.data(cbc3.1, choice = "Selection_Dummy", shape = "long",
                      alt.var = "Alternative_id", id = "Resp_id")

##################### estimate with just attributes ########################
################################ base models ###############################
library(mlogit)

bm1 <- mlogit(Selection_Dummy ~ Flavor.1..Fennel. + Flavor.2..Peppermint. +
                Freshness.1..Max.Fresh. + Freshness.2..Cooling.Blast. +
                Color.1..Triple.color.paste. + Color.2..White.paste. +
                Whitening.1..Advanced.whitening. + Whitening.2..Regular.whitening. +
                Cleaning.1..Deep.clean. + Cleaning.2..Everyday.clean. +
                Ingredients.1..Contains.fluoride. + Ingredients.2..Fluoride.free. +
                Price.1..2.00. + Price.2...2.80. + Price.3...3.60. +
                None_option | 0, cbc4)
summary(bm1)

# log-likelihood of the null model: 158 respondents x 10 choice sets, 4 alternatives each
null <- 158*10*log(1/4)
R1 <- 1 - bm1$logLik[1]/null             # pseudo r-squared
R_adj1 <- 1 - (bm1$logLik[1] - 17)/null  # adjusted pseudo r-squared
chi1 <- -2*(null - bm1$logLik[1])        # likelihood ratio statistic against the null model
pchisq(chi1, df = 14, lower.tail = FALSE)

# estimate with price linear. linear is NOT significantly worse
# proceed with price linear

# creating price linear (kept numeric so that it enters the model as one linear term)
cbc4$price <- 4.40
cbc4$price[cbc4$Price.1..2.00. == '1'] <- 2.00
cbc4$price[cbc4$Price.2...2.80. == '1'] <- 2.80
cbc4$price[cbc4$Price.3...3.60. == '1'] <- 3.60

bm2 <- mlogit(Selection_Dummy ~ Flavor.1..Fennel. + Flavor.2..Peppermint. +
                Freshness.1..Max.Fresh. + Freshness.2..Cooling.Blast. +
                Color.1..Triple.color.paste. + Color.2..White.paste. +
                Whitening.1..Advanced.whitening. + Whitening.2..Regular.whitening. +
                Cleaning.1..Deep.clean. + Cleaning.2..Everyday.clean. +
                Ingredients.1..Contains.fluoride. + Ingredients.2..Fluoride.free. +
                price + None_option | 0, cbc4)
summary(bm2)

R2 <- 1 - bm2$logLik[1]/null             # pseudo r-squared
R_adj2 <- 1 - (bm2$logLik[1] - 14)/null  # adjusted pseudo r-squared
chi2 <- -2*(bm2$logLik[1] - bm1$logLik[1])  # likelihood ratio statistic: BM2 vs BM1
pchisq(chi2, df = 3, lower.tail = FALSE)

library(sjPlot)
library(sjmisc)
library(sjlabelled)
tab_model(bm1, bm2, show.aicc = TRUE, show.loglik = TRUE, show.est = TRUE,
          show.r2 = TRUE, show.p = TRUE, show.reflvl = TRUE)

# shares of holdout 1
table(dat1$X14_holdout1_best)/158
# shares of holdout 2
table(dat1$X15_holdout.2_best)/158

##### IA holdout #####
# holdout 1.1
table(dat2$X14_holdout1_best)/142
# holdout 2.1
table(dat2$X15_holdout.2_best)/142

################# estimate with just CT and PF moderators ##################
########################### moderation models ##############################
#
# interact CT with price and none option
# model does improve
# ct*price not significant, ct*none significant
mm1 <- mlogit(Selection_Dummy ~ Flavor.1..Fennel. + Flavor.2..Peppermint. +
                Freshness.1..Max.Fresh. + Freshness.2..Cooling.Blast. +
                Color.1..Triple.color.paste. + Color.2..White.paste. +
                Whitening.1..Advanced.whitening. + Whitening.2..Regular.whitening. +
                Cleaning.1..Deep.clean. + Cleaning.2..Everyday.clean. +
                Ingredients.1..Contains.fluoride. + Ingredients.2..Fluoride.free. +
                price + I(CT*price) + I(CT*None_option) +
