Dedication
To my parents, without whom I would never be who and where I am today, for their unconditional support and encouragement.
Acknowledgements
I would like to thank my supervisor Dr. Felix Eggers, who not only facilitated data collection but also offered inspiring and helpful feedback. I am deeply grateful for your guidance throughout the thesis process.
Abstract
Empirical research has shown that accounting for heterogeneous response behavior in probabilistic choice models is of great importance (e.g. Ben-Akiva & Lerman, 1985; Ben-Akiva & Morikawa, 1990; McFadden & Train, 2000). Heterogeneity in discrete choice data can be related to individual preferences and to differences in the amount of error variance (or scale heterogeneity). The latter describes the lack of consistency in an individual’s (expected) choices and is the focus of this research.
Recent progress in Latent Class modeling offers a new tool for separating

Table 1: Attributes and Levels
Attribute Nr.  Attribute  Levels              Specification
1              Meat       (1) Chicken         part-worth
                          (2) Chicken double
                          (3) Beef
                          (4) Beef double
2              Cheese     (0) None            part-worth
                          (1) Cheddar
                          (2) Cheddar double
3              Lettuce    (0) None            linear
                          (1) Lettuce
4              Tomatoes   (0) None            linear
                          (1) Tomatoes
5              Onions     (0) None            linear
                          (1) Onions
6              Sauce      (1) Ketchup         part-worth
Table 2: Summary of Measurement Items
Construct    Variables
Socio-Demographic Characteristics
  Age: in years
  Gender: male / female
  Occupation: pupil / student / employed / self-employed / unemployed / retired / other
Respondent Engagement
  Emotional Engagement
    Burger rating: scored on 7-point Likert scale from "do not like at all" (0) to "like very much" (6)
      "How much do you like burgers, such as Hamburger, or Chicken Burger?"
    Survey Satisfaction: scored on 7-point Likert scales from "do not agree" (0) to "fully agree" (6)
      i) I enjoyed participating in the survey.
      ii) The scenarios I was presented with were realistic.
      iii) The pictures of the burgers were appealing.
      iv) I was able to fully understand the tasks I was faced with.
      v) I always considered all attributes when making my decisions.
  Behavioral Engagement
    Response Time to CBC tasks: in seconds
    Trap Question: failure / success to select "fully agree"
      "For quality assurance purposes, please select Strongly Agree"
  Cognitive
$$P(i \mid q) = \frac{\exp(\beta_q' x_{n,i})}{\sum_{j} \exp(\beta_q' x_{n,j})}$$

In the scale-adjusted approach proposed by Magidson and Vermunt (2007), the model is bi-linear in the parameters, yielding a logit model with a term that simultaneously estimates utility and scale parameters (i.e., the log-linear scale model contains a multiplicative term for the scale classes and effects of the covariates). Respondents are described as having some probability of being in a particular latent preference class $q$, which distinguishes their preferences for each attribute level, while they are simultaneously described as having some probability of being in a certain latent scale class $\lambda_s$, $s = 1, \dots, S$, which distinguishes their level of choice consistency from other respondents (Vermunt and Magidson, 2007). Hence, the utility of respondent $n$ for alternative $i$ depends on $q$ and $s$. As such, this model structure is able to group respondents on the basis of similar preferences, whilst accounting for potentially confounding differences in error variability. That is, each of the latent preference classes contains some respondents with $\lambda_1 = 1$ and some respondents with a higher scale value (in case of lower uncertainty) and/or a lower scale value (in case of higher uncertainty). Hence, the probability of individual $n$'s choice $i$ in choice replication $t$, given preference class $q$ and scale class $s$, becomes:

$$P(i \mid q, s) = \frac{\exp(\lambda_s \beta_q' x_{n,i})}{\sum_{j} \exp(\lambda_s \beta_q' x_{n,j})}$$
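The role of the scale factor in the choice probability described above can be illustrated with a minimal Python sketch. The function name `choice_probabilities`, the part-worth vector `beta_q`, the attribute codings, and the scale values 1.0 and 0.3 are all hypothetical and chosen for illustration only; they are not estimates from this study.

```python
import math


def choice_probabilities(beta, alternatives, scale=1.0):
    """Logit probabilities exp(scale * beta'x_i) / sum_j exp(scale * beta'x_j)."""
    utilities = [
        scale * sum(b * x for b, x in zip(beta, alt)) for alt in alternatives
    ]
    # Subtract the maximum utility before exponentiating for numerical stability.
    max_u = max(utilities)
    weights = [math.exp(u - max_u) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical part-worths of one preference class and three alternatives.
beta_q = [0.5, -0.2]
alternatives = [[1.0, 2.0], [0.0, 1.0], [0.0, 0.0]]

# Same preferences, two different scale classes (hypothetical scale factors).
p_consistent = choice_probabilities(beta_q, alternatives, scale=1.0)
p_noisy = choice_probabilities(beta_q, alternatives, scale=0.3)
```

Both scale classes rank the alternatives identically, but the lower scale factor pulls the probabilities toward equal shares, which is the sense in which scale captures choice consistency rather than preference.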
2.6 Model Diagnostics
2.6.1 Goodness of Fit Measures
To test whether the estimated model parameters are significantly different from zero, Likelihood
Ratio Tests are performed using the Chi-squared test statistic, which is chi-square distributed with
degrees of freedom (df) equaling the number of parameters: $\chi^2 = -2\,[LL(0) - LL(\beta)]$, where $LL(0)$ is
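As a minimal sketch of the likelihood ratio test described in this section, the statistic can be computed from two log-likelihood values. The function name, both log-likelihoods, and the choice of 10 degrees of freedom are hypothetical; reading the first log-likelihood as that of a null model with all parameters set to zero is likewise an assumption. The value 18.307 is the standard 5% critical value of the chi-square distribution with 10 degrees of freedom.

```python
def likelihood_ratio_statistic(ll_null, ll_model):
    """Return -2 * (LL(0) - LL(beta))."""
    return -2.0 * (ll_null - ll_model)


# Hypothetical log-likelihoods, not taken from the study.
ll_null = -1000.0   # assumed: null model with all parameters equal to zero
ll_model = -950.0   # assumed: estimated model with 10 parameters

# Standard 5% critical value of the chi-square distribution with df = 10.
critical_value_df10 = 18.307

statistic = likelihood_ratio_statistic(ll_null, ll_model)
reject_null = statistic > critical_value_df10
```

Where SciPy is available, an exact p-value could be obtained with `scipy.stats.chi2.sf(statistic, df)` instead of a tabulated critical value.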
Table 3: Sample Statistics
Variable / Construct    Classification    Mean (S.D.)    Sample (%)
sample chose consistently in the identical Holdout sets with rotated attribute order (i.e., Holdout set 1 and 2). There was no significant difference in choice consistency between the two presentation formats (p = 0.37).
3.3 Estimation Results
A number of models were estimated iteratively in order to test the proposed model and to ensure reliable estimates under the final model specification. The final model (with 4 preference classes and 2 scale classes) was selected on the basis of minimizing BIC. The suggested SALC approach in Latent Gold consists of two steps: the first is to estimate LC choice models without scale classes, and the second is to estimate LC models with two (or more) scale classes (Statistical Innovations 2014). Table 4 shows model summary statistics for conventional LC models with 2-4 preference classes and for SALC models with 1-4 preference classes and 2 scale classes. The models were fitted using respondents’ gender, age, and occupation as well as the proposed engagement indicators.
Table 4: Model Summary Statistics
Model LL BIC AIC Npar Class. Error R2 adj. Hit Rate
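The BIC-based model selection described above can be sketched as follows, using the standard definitions BIC = -2 LL + Npar ln(N) and AIC = -2 LL + 2 Npar. The candidate models, their log-likelihoods, parameter counts, and the sample size of 300 are hypothetical placeholders, not the values from Table 4, and the exact definition used by the estimation software may differ in detail.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: -2 LL + Npar * ln(N)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def aic(log_likelihood, n_params):
    """Akaike information criterion: -2 LL + 2 * Npar."""
    return -2.0 * log_likelihood + 2.0 * n_params


# Hypothetical candidates: name -> (log-likelihood, number of parameters).
candidates = {
    "3-class": (-5200.0, 50),
    "4-class": (-5100.0, 67),
    "5-class": (-5080.0, 84),
}
n_respondents = 300  # hypothetical sample size

bic_values = {
    name: bic(ll, npar, n_respondents) for name, (ll, npar) in candidates.items()
}
best_model = min(bic_values, key=bic_values.get)
```

With these placeholder values the 4-class model has the lowest BIC: the 5-class model fits slightly better, but its additional parameters are penalized more heavily than the gain in log-likelihood.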
3.4 Parameter Interpretation
To assess how preferences differ across preference classes and how each attribute influences choices, relative attribute importance was calculated (i.e., the largest difference between the parameters of an attribute, expressed as a percentage of the sum of these differences over all attributes). While classes 1, 3, and 4 put greatest importance on price, meat is most important in class 2. Sauce and onions are the least important burger ingredients in classes 1, 3, and 4; in class 2, onions and tomatoes rank lowest (see Table 8).

Table 8: Attribute Importance 4-Class 2sClass Model
Preference Class   1       2       3       4
Meat               15,9%   45,9%   23,0%   19,0%
Cheese             16,3%   7,3%    12,8%   11,2%
Lettuce            18,7%   5,5%    17,3%   20,0%
Tomatoes           10,5%   3,6%    8,7%    10,0%
Onions             4,8%    1,9%    3,7%    2,0%
Sauce              1,1%    11,8%   4,7%    8,2%
Price              32,7%   24,0%   29,9%   29,7%

Moreover, Willingness-to-Pay (WTP) estimates were calculated as $\mathrm{WTP}_{ji} = -\frac{\exp(\lambda_2 - \lambda_1)\,\beta_{ji}}{\exp(\lambda_2 - \lambda_1)\,\beta_{j,\text{price}}} = -\frac{\beta_{ji}}{\beta_{j,\text{price}}}$. Because the scale term appears in both the numerator and the denominator, it cancels; thus, WTP is scale-free (see Table 9). Corresponding to the attributes’ importance,
Table 9: Willingness-to-Pay 4-Class 2sClass Model
Preference Class          1         2         3          4
Meat    Chicken           -0,10 €   -3,43 €   -0,03 €    -1,18 €
        Chicken double    0,77 €    -2,63 €   -1,02 €    0,33 €
        Beef              -1,17 €   1,85 €    -1,02 €    0,93 €
        Beef double       0,50 €    -1,69 €   0,18 €     1,38 €
Cheese  None              -1,19 €   -0,37 €   -0,62 €    -0,72 €
        Cheddar           0,39 €    -0,42 €   0,43 €     0,78 €
        Cheddar double    0,80 €    0,79 €    0,19 €     -0,06 €
Lettuce                   2,28 €    0,92 €    1,42 €     2,70 €
Tomatoes                  1,29 €    0,61 €    0,71 €     1,35 €
Onions                    0,59 €    0,31 €    0,30 €     0,26 €
Sauce   Ketchup           0,08 €    0,49 €    0,18 €     0,44 €
        Chili Sauce       -0,05 €   -1,22 €   0,01 €     -0,66 €
        Barbeque          -0,02 €   0,73 €    -0,20 €    0,22 €
None                      -1,67 €   -6,27 €   -12,63 €   7,28 €
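The two quantities used in this section, relative attribute importance and WTP, can be sketched in a few lines of Python. All part-worths, the price coefficient, the price levels, and the scale difference below are hypothetical and serve only to show the arithmetic, including why a common scale factor leaves WTP unchanged.

```python
import math


def relative_importance(level_utilities):
    """Share of each attribute's utility range in the sum of all ranges."""
    ranges = {
        attr: max(utils) - min(utils) for attr, utils in level_utilities.items()
    }
    total = sum(ranges.values())
    return {attr: r / total for attr, r in ranges.items()}


def willingness_to_pay(beta_level, beta_price):
    """WTP as the negative ratio of a level's utility to the price coefficient."""
    return -beta_level / beta_price


# Hypothetical parameters of one preference class.
beta_price = -0.25
price_levels = (3.0, 5.0, 7.0)  # hypothetical prices in euros
level_utilities = {
    "meat": [-0.4, 0.1, 0.3],
    "lettuce": [0.0, 0.6],
    "price": [beta_price * p for p in price_levels],
}

importance = relative_importance(level_utilities)
wtp_lettuce = willingness_to_pay(0.6, beta_price)

# Multiplying all utilities by a common scale factor leaves WTP unchanged.
scale = math.exp(0.5)  # hypothetical exp(lambda_2 - lambda_1)
wtp_lettuce_scaled = willingness_to_pay(scale * 0.6, scale * beta_price)
```

In this example price has a utility range of 1.0 out of a total of 2.3, so it is the most important attribute, and the WTP for lettuce is 2.40 euros regardless of the scale factor applied.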
3.5 Predictive Validity
In order to assess predictive validity, holdout choice shares and the mean absolute error (MAE) were calculated. As summarized in Table 10, the scale-adjusted model performs slightly worse than the conventional 4-class LC model (see Appendix B and C for choice share predictions for the specific Holdout Sets).
Table 10: MAE per Model and Holdout Set

MacLachlan, J. and Myers, J. (1983), "Using Response Latency to Identify Commercials That Motivate," Journal of Advertising Research, 23(5), 51-57.
Magidson, J. and Vermunt, J. K. (2007), "Removing the scale factor confound in multinomial logit choice models to obtain better estimates of preference," Proceedings 2007 Sawtooth Software Conference, Sawtooth Software, 139-154.
Magidson, J. and Vermunt, J. K. (2003), "New Developments in Latent Class Models," Proceedings 2003 Sawtooth Software Conference, 89-112.
Magidson, J., Dumont, J. and Vermunt, J. K. (2015), "A new modeling tool for identifying meaningful segments and their willingness to pay: Improving Validity by reducing the Confound between Scale and Preference Heterogeneity," Advanced Research Techniques Forum 2015, Statistical Innovations. Retrieved October 18, 2015 from https://www.statisticalinnovations.com/wp-content/uploads/ART2015_Magidson_Dumont_Vermunt_May27.pdf
McFadden, D. (1974), "Conditional logit analysis of qualitative choice behavior," in P. Zarembka (ed.), Frontiers in Econometrics, 105-142. New York: Academic Press.
McFadden, D. (1986), "The Choice Theory Approach to Market Research," Marketing Science, 5(4), 275-97.
McFadden, D. and Train, K. (2000), "Mixed MNL Models for Discrete Response," Journal of Applied Econometrics, 15, 447-470.
Park, J. and Hastak, M. (1994), "Memory-Based Product Judgments: Effects of Involvement at Encoding and Retrieval," Journal of Consumer Research, 21(3), 534-47.
Statistical Innovations (2014), "How to Estimate Scale-Adjusted Latent Class (SALC) Models and Obtain Better Segments with Discrete Choice Data," Latent Gold® Choice 5.0 tutorial #10B (1-file format), Oct. 30, 2014.
TrueSample (2010, March 7), "When Good Respondents Go Bad: How unengaging Surveys Lower Data Quality". Retrieved October 4, 2015 from http://truesample.com/white-paper/when-good-respondents-go-bad/
Tyebjee, T. (1979), "Response Time, Conflict, and Involvement in Brand Choice," Journal of Consumer Research, 6(3), 295-304.
Vermunt, J. K. (2013), "Categorical Response Data," The SAGE Handbook of Multilevel Modeling, 287.
Appendices
A. 4-Class Choice Parameters
Table A 1: Parameter Estimates 4-Class Choice Model
Class                            1          2          3          4
Class Size                       38,7%      31,0%      11,6%      18,7%
Meat     Chicken Burger          0.06       -0.04      -0.76***   -0.35**
         Chicken Burger double   0.27***    0.25***    -0.60***   -0.35**
         Beef Burger             -0.37***   -0.33***   0.44***    0.27**
         Beef Burger double      0.04       0.12*      0.93***    0.43***
Cheese   None                    -0.39***   -0.24***   0.06       -0.24**
         Cheddar                 0.11**     0.17***    -0.15*     0.23**
         Cheddar double          0.27***    0.07       0.09       0.00
Lettuce                          0.71***    0.48***    0.08       0.78***
Tomatoes                         0.45***    0.21***    0.14       0.39***
Onions                           0.13*      0.13*      0.09       0.02
Sauce    Ketchup                 0.05       0.05       0.00       0.20**
         Chilli Sauce            -0.05      0.04       -0.34***   -0.24**
         Barbecue Sauce          -0.00      -0.08      0.34***    0.05
Price                            0.32***    -0.23***   -0.22***   -0.27***
None Option                      -0.67***   -3.94***   -1.28***   2.21***
* Significance at 10% level; ** Significance at 5% level; *** Significance at 1% level
B. Predictive Validity 4-Class Model
Table B 1: Holdout Set 1 Hit Rates 4-Class
            Alt 1   Alt 2   Alt 3   None
Predicted   8,0%    16,3%   27,5%   48,2%
Actual      24,5%   27,7%   27,3%   20,5%
Error       16,5%   11,4%   0,2%    27,7%
MAE         14,0%

Table B 2: Holdout Set 2 Hit Rates 4-Class
            Alt 1   Alt 2   Alt 3   None
Predicted   34,5%   9,5%    27,2%   28,9%
Actual      30,9%   24,2%   19,1%   25,8%
Error       3,6%    14,7%   8,0%    3,0%
MAE         7,3%

Table B 3: Holdout Set 3 Hit Rates 4-Class
            Alt 1   Alt 2   Alt 3   None
Predicted   32,1%   9,7%    27,7%   30,6%
Actual      27,8%   22,5%   21,9%   27,8%
Error       4,3%    12,9%   5,8%    2,8%
MAE         6,4%

Table B 4: Holdout Set 4 Hit Rates 4-Class
            Alt. 1   Alt. 2   Alt. 3   Alt. 4   Alt. 5   Alt. 6   Alt. 7   Alt. 8   None
Predicted   10,2%    15,3%    3,5%     9,3%     10,2%    22,4%    1,7%     2,0%     25,5%
Actual      4,7%     14,7%    5,7%     9,3%     10,8%    24,7%    4,3%     5,0%     20,8%
Error       5,5%     0,6%     2,2%     0,1%     0,6%     2,3%     2,6%     3,0%     4,7%
MAE         2,4%

Table B 5: Holdout Set 5 Hit Rates 4-Class
            Alt. 1   Alt. 2   Alt. 3   Alt. 4   Alt. 5   Alt. 6   Alt. 7   Alt. 8   None
Predicted   7,7%     6,9%     18,2%    3,3%     20,3%    3,2%     7,8%     4,6%     28,0%
Actual      8,0%     19,1%    20,7%    8,0%     15,9%    4,0%     6,0%     6,4%     12,0%
Error       0,2%     12,2%    2,5%     4,6%     4,3%     0,8%     1,8%     1,8%     16,1%
MAE         4,9%
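The MAE reported with each holdout table is the mean of the absolute differences between predicted and actual choice shares. A minimal Python sketch, using the Holdout Set 3 shares from Table B 3 as input (the function name is a hypothetical choice):

```python
def mean_absolute_error(predicted, actual):
    """Mean of the absolute differences between two equally long share lists."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Choice shares in percent for Alt 1, Alt 2, Alt 3, None (Table B 3).
predicted_shares = [32.1, 9.7, 27.7, 30.6]
actual_shares = [27.8, 22.5, 21.9, 27.8]

# About 6.4 percentage points, in line with the MAE reported in Table B 3.
mae = mean_absolute_error(predicted_shares, actual_shares)
```

Small deviations between recomputed and tabulated errors (for example 12,8 versus 12,9 for Alt 2) are consistent with the tables having been computed from unrounded shares.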
C. Predictive Validity 4-Class 2sClass Model
Table C 1: Holdout Set 1 Hit Rates SALC
            Alt 1   Alt 2   Alt 3   None
Predicted   12,1%   17,6%   30,9%   39,4%
Actual      24,5%   27,7%   27,3%   20,5%
Error       12,4%   10,1%   3,6%    18,9%
MAE         11,3%

Table C 2: Holdout Set 2 Hit Rates SALC

Table C 3: Holdout Set 3 Hit Rates SALC
            Alt 1   Alt 2   Alt 3   None
Predicted   16,8%   18,5%   25,3%   39,4%
Actual      27,8%   22,5%   21,9%   27,8%
Error       11,0%   4,1%    3,4%    11,7%
MAE         7,5%

Table C 4: Holdout Set 4 Hit Rates SALC
            Alt. 1   Alt. 2   Alt. 3   Alt. 4   Alt. 5   Alt. 6   Alt. 7   Alt. 8   None
Predicted   3,9%     9,9%     5,2%     15,2%    9,2%     14,0%    3,7%     3,9%     34,9%
Actual      4,7%     14,7%    5,7%     9,3%     10,8%    24,7%    4,3%     5,0%     20,8%
Error       0,7%     4,8%     0,5%     5,9%     1,5%     10,7%    0,6%     1,1%     14,1%
MAE         4,4%

Table C 5: Holdout Set 5 Hit Rates SALC