Seed by Example
A conjoint solution to the cold-start problem in recommender systems
by
Recommender systems
• Important driver of online revenue
• Used by Amazon, Netflix, Reddit, LinkedIn
• Clearly a marketing topic, yet usually discussed in
machine learning and information retrieval domains
Research goals
• Mitigate the cold-start problem by using seed data
• Compare the performance of a seeded and an unseeded
recommender
Collaborative filtering
• Find preference patterns between similar users
• ‘Birds of a feather flock together’
[Figure: jobs liked by similar users (Marketing, Marketing Research, Product Management, Consultant); "?" marks an unknown preference]
Cold-start problem
• No information about new user
• No overlap, no similarity
• New online service has no data
[Figure: job preferences of users Annemarie, Maurice, Sophie and Krik (Marketing, Marketing Research, Product Management, Consultant); "?" marks an unknown preference]
• What if some users had always rated every job?
• Conjoint analysis to compute utility levels for every job
• Extract latent classes (underlying segments)
• Transform utility to choice probability (multinomial logit, MNL)
• Compute choice probability for each job, cut-off: 50%
• Generate seed node from transformed latent class profiles
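The seeding steps above can be sketched in Python as follows. This is a minimal illustration, not the study's implementation: the part-worth values, attribute levels, and job profiles are invented, and the choice probability assumes each job is evaluated against a "none" option with utility 0 (a binary special case of the multinomial logit), which the slides do not specify.

```python
import math

# Hypothetical part-worth utilities per latent class (illustrative values only).
CLASS_PARTWORTHS = {
    "class_1": {"field=marketing": 1.2, "field=consulting": -0.8,
                "salary=high": 0.5, "salary=low": -0.5},
    "class_2": {"field=marketing": -0.6, "field=consulting": 0.9,
                "salary=high": 0.4, "salary=low": -0.4},
}

# Hypothetical job profiles, each described by its attribute levels.
JOBS = {
    "job_1": ["field=marketing", "salary=high"],
    "job_2": ["field=consulting", "salary=low"],
    "job_3": ["field=consulting", "salary=high"],
    "job_4": ["field=marketing", "salary=low"],
}


def job_utility(partworths, levels):
    """Total utility of a job: the sum of the part-worths of its levels."""
    return sum(partworths[level] for level in levels)


def choice_probability(utility):
    """Assumed logit of the job against a 'none' option with utility 0."""
    return 1.0 / (1.0 + math.exp(-utility))


def seed_node(partworths, jobs, cutoff=0.5):
    """Rate every job: like (True) if the choice probability reaches the cut-off."""
    return {
        job: choice_probability(job_utility(partworths, levels)) >= cutoff
        for job, levels in jobs.items()
    }


# One seed node per latent class; each seed node has a rating for every job.
SEEDS = {name: seed_node(pw, JOBS) for name, pw in CLASS_PARTWORTHS.items()}
```

Because each seed node rates every job, any new user who rates at least one job overlaps with every seed node on that job.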
Conjoint experiment
• Existing dataset from a marketing engineering course,
winter 2014/2015
• N = 158
• Fractional factorial conjoint choice elicitation task
Model selection
• Estimate 8 models
• Numeric or nominal attributes
• Log-likelihood ratio test against the null model and
against each other
• Model 7 selected for latent class analysis
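For reference, the standard likelihood ratio statistic for comparing two nested models is shown below. The symbols are notation introduced here, not taken from the slides: \ell_0 and \ell_1 are the maximised log-likelihoods of the restricted and the fuller model, and k_0 and k_1 are their numbers of parameters. Whether the study's 8 models are all nested in one another is not stated.

```latex
\[
  \mathrm{LR} = -2\,(\ell_0 - \ell_1)
  \;\sim\; \chi^2_{\,k_1 - k_0}
  \quad \text{(asymptotically, under the restricted model)}
\]
```

A large LR value relative to the \chi^2 reference distribution favours the fuller model.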
Latent Class Analysis
• More than 5 classes (kNN = 5)
• Estimate several models with different numbers of classes
Multinomial logit
• Transform utility to choice probability
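The usual multinomial logit transformation is written out below. The notation is an assumption made here: U_{cj} is the utility of job j for latent class c, and S is the choice set. Which choice set the study used is not stated on the slides.

```latex
\[
  P_c(j \mid S) = \frac{\exp(U_{cj})}{\sum_{k \in S} \exp(U_{ck})}
\]
```

If S contains only job j and a "none" option with utility 0, this reduces to P_c(j) = 1 / (1 + \exp(-U_{cj})), which is at least 50% exactly when U_{cj} \ge 0.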
Seeded Collaborative Filter
Job 1    No    Yes   Yes   ?
Job 2    Yes   Yes   ?     ?
Job 3    No    No    ?     ?
Job 4    Yes   No    ?     No
Job 5    No    Yes   ?     ?
…        …     …     ?     ?
• Absence of overlap solved
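A small Python sketch of why seed nodes remove the absence of overlap. It is not the Raccoon implementation: the user names, seed names, and ratings are hypothetical, only "likes" are modelled, and the scoring rule (sum of neighbour similarities) is a simplification chosen for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets of liked jobs (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, likes, k=5):
    """Rank jobs liked by the target's k nearest neighbours with non-zero overlap."""
    mine = likes[target]
    scored = [(jaccard(mine, other), name)
              for name, other in likes.items() if name != target]
    neighbours = [(s, n) for s, n in sorted(scored, reverse=True)[:k] if s > 0]
    scores = {}
    for similarity, name in neighbours:
        for job in likes[name] - mine:  # only jobs the target has not liked yet
            scores[job] = scores.get(job, 0.0) + similarity
    return sorted(scores, key=lambda job: (-scores[job], job))


# Hypothetical data: one sparse existing user and one new user with a single like.
UNSEEDED = {"existing_user": {"job_3"}, "new_user": {"job_1"}}
# Hypothetical seed nodes: the liked jobs out of a complete set of ratings.
SEED_NODES = {"seed_1": {"job_1", "job_2"}, "seed_2": {"job_3", "job_5"}}

without_seed = recommend("new_user", UNSEEDED)                 # no overlap at all
with_seed = recommend("new_user", {**UNSEEDED, **SEED_NODES})  # overlaps with seed_1
```

In this toy data the new user shares nothing with the existing user, so the unseeded filter returns no recommendation, while the seeded filter can recommend through `seed_1`.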
Implementation
• CF implemented with Raccoon
• Experiment built as a custom survey in HTML /
CSS / JavaScript
Collaborative filter
k-Nearest Neighbor
Jaccard similarity
• Similarity measure between a user and their k nearest neighbors
• Divide intersection by union (overlap / total)
• Common binary like / dislike similarity measure
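Written out, with A and B denoting the sets of jobs liked by two users (notation introduced here for illustration):

```latex
\[
  J(A, B) = \frac{|A \cap B|}{|A \cup B|}, \qquad 0 \le J(A, B) \le 1
\]
```

For example, A = {Job 1, Job 2} and B = {Job 2, Job 3} share one job out of three distinct jobs, so J(A, B) = 1/3.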
Performance test
• 2 groups:
  • CF without seed nodes (N = 50)
  • CF with seed nodes (N = 50)
Conclusion
• Seeding outperforms not seeding
• Cold start partially addressed
• Managers can use this method without changes to the algorithm
• Step towards full conjoint recommender
(hybrid top-N recommender with conjoint choice sets)