SEED BY EXAMPLE:

A CONJOINT SOLUTION TO THE COLD-START PROBLEM IN

RECOMMENDER SYSTEMS

BY

BART VERSCHOOR


Author: Bart Verschoor (S2585855)
b.b.verschoor@student.rug.nl
President Kennedylaan 244-I
1079 NV, Amsterdam, The Netherlands
+31 (0) 64 15 73 35 74

1st supervisor: Dr. F. Eggers
f.eggers@rug.nl
Nettelbosje 2, Duisenberg Building (DUI321)
9747 AE, Groningen, The Netherlands
+31 (0) 50 363 70 65

2nd supervisor: Drs. N. Holtrop
n.holtrop@rug.nl
Nettelbosje 2, Duisenberg Building (DUI310)
9747 AE, Groningen, The Netherlands
+31 (0) 50 363 9621

Master Thesis
Date of completion: 08-06-2015
University of Groningen
M.Sc. Marketing (Intelligence profile)

1 Preface

This master thesis is an original intellectual product solely created by the author, Bart Verschoor. It is part of the master program Marketing (Intelligence profile) at the University of Groningen and is submitted to fulfill the graduation requirements set forth by the university. The thesis was developed between February and June of 2015. Major parts of the text build on research by others; I went to great lengths to provide accurate references to these sources. No intended or unintended plagiarism occurred, nor have any parts been published before. The treatment of the participants was in accordance with the ethical standards of the APA. Furthermore, the JavaScript code and HTML/CSS markup were written by me. I would like to thank the open source community for providing the technologies that enabled this study. In particular, the work of Guy Morita on the JavaScript recommendation library raccoon was instrumental to this study.

1.1 About the author

I spend much of my spare time reading up on technology, in particular on artificial intelligence techniques. There is a certain magic to the notion of automating an autonomous intelligent system. Inspired by my brother Joris, who is insanely adept at programming, as well as by a mixture of sci-fi novels like ‘I, Robot’ and ‘Snow Crash’ and movies like ‘Terminator’, ‘The Matrix’, ‘Her’ and ‘Ex Machina’, the elegance of automation has me spellbound.

The idea of solving problems in one domain by applying the lessons from another domain excited me, and I got to work on this interdisciplinary project. It was a challenging trial of my abilities. This project provided me with invaluable lessons on recommendation engines, a technique I plan on using for a future ecommerce project.

1.2 Acknowledgements

Prior to starting this master thesis I had heard the horror stories and tragic tales of the mythical monster that is the master thesis. The burden of completing it is well chronicled and had made me cautious of its perils. Contrary to the myths told in the hallways of the Duisenberg building, writing the master thesis was a very enjoyable process for me. I attribute this in large part to the pleasant cooperation with Prof. Dr. Felix Eggers, my supervisor and honours college mentor. His kind and calm approach has made this a more than memorable period and I will look back upon it with great satisfaction. In particular, I would like to thank him for his patience, his calm and collected demeanor and his uncanny expertise in conjoint methodologies. The feedback I received was insightful, precise and critical, yet at the same time kind and supportive. Credit is also due for generously providing me with the dataset for the conjoint experiment upon which a large part of this thesis rests, as well as the appropriate orthogonal array. Providing this information contained the already daunting scope of this project within reasonable bounds. Working with Prof. Dr. Felix Eggers was an enlightening experience. In the spirit of this master thesis’ topic, I would strongly recommend his supervision to anyone.


Mentorship is the greatest gift one can bestow on others. As you two have invested in me, so I will commit to invest in others. May the ripple effects of your mentorship echo onward, reaching lives from one degree of separation to the next.

Bart Verschoor

Amsterdam, the Netherlands
June 7th, 2015


2 Table of Contents

1 Preface
1.1 About the author
1.2 Acknowledgements
3 Abstract
4 Introduction
5 Theoretical framework
5.1 Recommender systems
5.2 Cold-start problem
5.3 Conjoint analysis (CA)
5.4 Proposed recommender system
6 Methodological overview
6.1 Conjoint utility model
6.2 Collaborative filtering model
6.3 Methods of collaborative filtering
6.3.1 k-NN classification
6.3.2 Similarity measure
7 Research design
7.1 Research context
7.2 Procedure
7.3 Conjoint experiment design
7.3.1 Stimuli
7.3.2 Job design
7.3.3 Seed node design
7.3.4 Latent classes
7.4 Implementation
7.5 Experimental setting
7.6 Predictive validity
8 Results
9.1 Findings
9.2 Limitations
9.3 Further research
10 References
11 Appendices
11.1.1 Appendix 1: Custom Survey
11.1.2 Appendix 2: Individual level cumulative hit rates per nth recommendation
11.1.3 Appendix 3: surveyserver.js
11.1.4 Appendix 4: jobrecsys.html
11.1.5 Appendix 5: thesis.html


3 Abstract

This study aims to reduce the cold-start problem in recommender systems and, in doing so, investigates whether a hypothetical conjoint-based recommender would be viable. The study links conjoint analysis to a collaborative filter by seeding a database with latent classes derived from an a priori conjoint experiment. It contrasts the performance over time of two conditions: a collaborative filter trained with latent classes and a benchmark condition using an untrained collaborative filter. The study finds a substantial improvement in mean hit rates for the seeding strategy over not seeding, partially mitigating the cold-start problem.

Keywords: RecSys, Recommender System, Conjoint Analysis, Cold-start, Collaborative Filter, Latent Class Analysis

4 Introduction

The ever increasing burden of information overload has caused an epidemic of infobesity amongst consumers (Adomavicius & Tuzhilin, 2005) and companies (Rogers, Puryear & Root, 2013). With the advent of big data, information became a ubiquitous commodity to be mined and analyzed. Information has exploded partly due to the increasing connectedness of the social graph, decreasing storage costs that grow the ‘memory’ of the internet, distributed computing propelled by the widespread adoption of HDFS1 (Hadoop Distributed File System), the compounding output of content creators whose work remains online to be shared indefinitely, and the absence of any metaphorical garbage truck for the internet that could periodically clean up outdated content.

On the one hand, consumer-related information overload presents an opportunity for companies to mine and synthesize behavioral data into consumer insights; on the other hand, product-related information overload obstructs the purchase decisions of consumers. Recommender systems offer a remedy for the scourging effects of product-related information overload by algorithmically separating signal from noise using statistical techniques. Consumer-related information overload may fuel recommender

1 The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.


systems that in turn reduce the burden of product-related information overload by offering fitting product recommendations. Paradoxically, despite the widespread availability of data, newly launched recommender systems suffer from data sparsity issues. To separate the wheat from the chaff, recommender systems ideally use input data that is as complete as possible. Automatically matching user preferences to items requires that the system gets to know the user over time so that it can identify patterns across other users and/or items. However, a side effect of the information abundance is that users have become increasingly impatient towards online services (Maier, Laumer, Eckhardt & Weitzel, 2012): users churn if a service is not immediately useful. Herein lies the crux of the cold-start problem: how can recommenders familiarize themselves with users from the start, without having any information about the users’ preferences? Many partial solutions to the cold-start problem have been proposed (Schein, Popescul & Ungar, 2002; Zhou, Yang & Zha, 2011; Liu, 2011).

This study enriches the database by seeding nodes derived from an a priori conjoint analysis into a recommender system based on collaborative filtering2. This may benefit companies that wish to launch new recommender systems without haphazardly altering the intricate subtleties of proven recommender techniques.

A conjoint approach to the cold-start issue may yield recommender systems that better predict product-user fit for new users in the early stages of the user lifecycle. Decision aids create a better customer experience (Xiao & Benbasat, 2007), which in turn affects the bottom line of the firm. As data grows, recommender systems will gain an increasingly prominent role in the reduction of choice overload. User retention may be improved by reducing the cold-start problem: better product recommendations early in the customer lifecycle may prevent users from churning before positive word-of-mouth effects from new users can cascade into more widespread service adoption.

The study contributes to firms by proposing a data enrichment extension that helps reduce the cold-start problem without having to alter previously implemented algorithms. This is noteworthy because many variations of recommender systems exist. It is not required to redesign each individual algorithm to implement the improvement. The methods proposed in this study are broadly applicable.


Literature on methodological issues in recommender systems is mostly concentrated in the fields of information retrieval, computer science, machine learning and artificial intelligence (see the next chapter for a review). This study attempts to answer the methodological challenge of the cold-start problem with an interdisciplinary approach, namely by taking a technique common to marketing researchers and applying it to recommender systems. Recommender systems belong to the burgeoning field of marketing automation, at the intersection of computer science, statistics and marketing; viewing their methodological problems from a marketing background brings a refreshing perspective, which surprisingly few studies have attempted in the past. Secondly, this study could be a stepping-stone toward a fully automated conjoint-based recommender system.

After a brief introduction to the focal topics and relevant techniques, a literature overview covering methodological issues and comparisons between recommenders is discussed, followed by a methodology section addressing the particular study design and implementation. Afterwards, the results of the study are reviewed, closed by a concluding discussion that presents the implications of the findings, notes the limitations and provides recommendations for future research.

5 Theoretical framework

5.1 Recommender systems


spending too much time searching for that product. The basic concept is that the better the underlying information filter can connect item characteristics to user preferences, the better the recommendation becomes, which results in a better customer experience and better business performance for the firm. This explains the widespread adoption of recommender systems by corporations across a multitude of industries. The main techniques fall into four groups; refer to table 1 for an overview of recommendation techniques.

Collaborative filtering (CF) ‘crowd-sources’ the recommendation task by clustering people who show similar preference behavior into groups. The adage ‘birds of a feather flock together’ is the core assumption upon which collaborative filtering rests: it recommends to likeminded people the products that other group members have purchased before them. CF segments the market using a similarity measure upon which it bases its predictions. CF is built solely upon the similarities between people and ignores individual product features. As preferences are revealed by the behavior of customers, the CF changes dynamically. Due to its social character, CF can offer serendipitous results. Another advantage of CF is that it is minimally affected by ramp-up problems3, meaning it is computationally efficient to run on large-scale databases (Beutel, Weimer & Minka, 2014). This makes CF suited for real-life production environments and explains its popularity amongst firms. However, CF requires a feedback mechanism (likes or purchases) in order to work, which makes it sensitive to data sparsity issues both for new items and for new users. This is often referred to as the cold start problem.
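The mechanics described above can be sketched in a few lines of JavaScript, the language used elsewhere in this thesis. This is an illustrative toy, not the thesis implementation (which relies on Guy Morita's raccoon library); taste profiles are assumed to be plain objects mapping item ids to ratings.

```javascript
// Cosine similarity between two taste profiles (objects of item -> rating).
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (const item in a) {
    normA += a[item] * a[item];
    if (item in b) dot += a[item] * b[item];
  }
  for (const item in b) normB += b[item] * b[item];
  return normA && normB ? dot / Math.sqrt(normA * normB) : 0;
}

// Rank all other users by similarity to the active user
// ("birds of a feather flock together").
function mostSimilarUsers(active, others) {
  return Object.entries(others)
    .map(([id, profile]) => ({ id, sim: cosineSimilarity(active, profile) }))
    .sort((x, y) => y.sim - x.sim);
}
```

Note how a user with no items in common with the active user scores zero: without overlapping feedback, the filter has nothing to go on, which is exactly the sparsity issue discussed above.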

Content-based recommenders (CB) build a user profile of product feature preferences that is used to make predictions. The system learns from users by taking note of feature preferences; in the case of movies, for instance, these include genre, actors and director. The CB compares previously liked or purchased products to products with a similar score on that specific mix of features. CBs assume that, for instance, if you liked Quentin Tarantino’s Reservoir Dogs, Pulp Fiction and Kill Bill, you will probably also like Django Unchained and Inglourious Basterds. One caveat of CB is that it requires

3 Ramp up refers to the incremental reassignment of computational containers (fractions of machines) as


content-descriptors that state the product’s features. Text mining algorithms such as TF-IDF4 remedy this problem. Using CB often leads to results that are not very surprising, because the CB tends to keep suggesting only very similar products. The problem here is that the main goal of recommender systems is to predict what type of product the user would like or purchase, not necessarily which products are similar to those previously liked or purchased. Content-based recommenders suffer from cold start problems since user preferences are initially unknown. Unlike CF, CB does not rely on a community of users for its predictions, so the CB performs similarly regardless of the activity and number of users feeding information to the recommender system.
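As an aside, the TF-IDF weighting mentioned above can be sketched as follows. This is the standard textbook formulation, not code from the thesis; representing a document as an array of terms is an assumption made for illustration.

```javascript
// TF-IDF score of a term: high when the term is frequent in this document
// but rare across the corpus, making it a useful content-descriptor.
function tfidf(term, doc, corpus) {
  const tf = doc.filter(w => w === term).length / doc.length;
  const docsWithTerm = corpus.filter(d => d.includes(term)).length;
  if (docsWithTerm === 0) return 0;
  const idf = Math.log(corpus.length / docsWithTerm);
  return tf * idf;
}
```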

Knowledge-based recommenders consist of hard-coded rules determined at the outset by a domain expert, so the quality of the recommender system is contingent on the knowledge of that expert. Knowledge-based recommendations are very consistent due to their rigidity, but they do not evolve with the behavior of users. Unfortunately, they do not capture market trends, and the rules can become outdated quickly. This limited staying power is due to the static nature of knowledge-based recommenders and the dynamic nature of markets. Since knowledge-based recommenders do not rely on user behavior, they do not suffer from the cold-start problem.

Hybrid methods combine content-based recommendation and collaborative filtering by applying both. A hybrid is a multiple classifier system that selects whether to use the prediction of the content-based recommender or the collaborative filter by means of a selection filter, usually some sort of weighting scheme. This mixed approach is often riddled with complexity, making it difficult to design and implement. However, the combination of CF and CB is very potent, and companies that rely on recommender systems are often willing to adopt hybrid recommenders because of the results they deliver.


4 TF-IDF refers to term frequency-inverse document frequency. It is a metric that states the importance of a word to a document within a collection of documents.


Collaborative Filtering
• Pros: negligible ramp-up effort; serendipitous results; learns market segments; dynamic model
• Cons: requires rating feedback; cold start for new users; cold start for new items

Content-Based
• Pros: no community required; comparison between item characteristics; dynamic model
• Cons: content-descriptors necessary; cold start for new users; no serendipity

Knowledge-based
• Pros: deterministic recommendations; consistent quality of recommendations; no cold-start
• Cons: knowledge engineering required; doesn’t respond to trends; static model

Hybrid
• Pros: combines an arbitrary number of CF and CB techniques to leverage pros and offset cons
• Cons: requires a selection algorithm; difficult implementation

Table 1: Taxonomy of recommender systems

Within each of these groups, a myriad of design tweaks can be taken into account, such as picking the appropriate similarity measure. Whichever recommender system one decides to adopt, the basic concept is that recommender systems automate the prediction process of items based on observed preferences from users by estimating utility levels and sorting groups of similar items and/or people together.

Some notable examples include LinkedIn’s item-based collaborative filtering platform Browsemaps (Wu et al., 2014), Netflix’s ALS-WR, an abbreviation of Alternating Least Squares with Weighted λ-Regularization (Y. Zhou, Wilkinson, Schreiber & Pan, 2008), Amazon’s item-item collaborative filter (Linden, Smith & York, 2003), Hacker News’ ranking algorithm (Salihefendic, 2010), Reddit’s Hot Ranking (Salihefendic, 2010) and Foursquare’s Explore (Moore, 2011).


Some recommenders are designed to resist manipulation by malicious users, such as hotels that game travel sites in order to increase their bookings (Chirita, Nejdl & Zamfir, 2005), while others aim to increase serendipity such that recommendations are not only accurate but also surprising (Kaminskas & Bridge, 2014). The main techniques can be largely categorized into four groups, namely: (1) content-based recommenders, (2) collaborative filters, (3) knowledge-based recommenders and (4) hybrid recommenders, as shown in table 1 (Jannach & Friedrich, 2011). The term collaborative filter was originally coined at Xerox (Goldberg, Nichols, Oki & Terry, 1992) for use in its mail service Tapestry. The GroupLens project at the University of Minnesota (Resnick, Iacovou & Suchak, 1994) developed a Usenet news client that supported collaborative filtering, notably improving on Tapestry. Other early uses include the music recommendation service Firefly, a spinoff from MIT’s project Ringo (Shardanand, 1994), PHOAKS (Terveen et al., 1997) and the historically significant web browser Mosaic. Content-based recommenders have their roots in information retrieval and cognitive filtering (Morita & Shinoda, 1994; Konstan, 2004). Hybrid recommenders combine various recommendation techniques and filter the best option (Adomavicius & Tuzhilin, 2005).

5.2 Cold-start problem

The cold start problem can arise in any system that requires automated data modeling. Also known as the sparsity problem, it refers to the inability of a recommender to draw reliable inferences for users or items about which it has not yet gathered sufficient information. A new item without any ratings likewise suffers from the cold start problem. One more occurrence exists: imagine a newly launched web shop with no users. The quality of the recommender system will be poor for the web shop’s first users because the system is still untrained. Yet again the cold start problem arises, potentially leading to customer churn. How critical this is depends on how strongly the e-tailer relies on decision aids.

Content-based recommenders (CBR) must first construct a sufficiently detailed model of the user’s tastes and preferences by querying or observing user behavior, for instance by taking into account likes, ratings or purchase history. Before the recommender system can make recommendations with some degree of intelligence, the user must train it by revealing his or her preferences. Once the system has created a sufficiently detailed user profile, the recommender system is able to perform as intended.

Collaborative filters (CF) identify users who share similar rating patterns. A collaborative filter suggests favored items to users that share a similar rating pattern, but only for those products that the active user has not seen yet (i.e. missing values). Without other cluster members, the collaborative filter will fail to recommend products. The cold-start problem means that unrated items will never be recommended, a major limitation of this technique.
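The missing-value estimation described here can be illustrated with a small sketch (hypothetical data; similarity values are assumed to come from some precomputed similarity measure). The `null` return makes the stated limitation concrete: an item that no neighbour has rated can never be recommended.

```javascript
// Predict the active user's score for an unseen item as the
// similarity-weighted average of the scores of likeminded neighbours.
// Each neighbour is { sim, ratings }, where ratings maps item -> score.
function predictScore(item, neighbours) {
  let weighted = 0, simSum = 0;
  for (const { sim, ratings } of neighbours) {
    if (item in ratings) {
      weighted += sim * ratings[item];
      simSum += Math.abs(sim);
    }
  }
  // null mirrors the cold-start limitation: no neighbour rated it,
  // so no prediction is possible.
  return simSum === 0 ? null : weighted / simSum;
}
```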

Knowledge-based recommenders (KBR) originate from the domain of case-based reasoning (Kolodner, 1983; Kübler, 2005; Schank, 1999), which involves keeping a set of example cases and searching that database for relevant solutions to similar problems from the past. Knowledge-based recommenders use explicit a priori knowledge about item attributes and user preferences. This explicit knowledge is hardcoded into a body of deterministic rules upon which recommendations are based. Traditionally, knowledge-based recommenders are used when


5.3 Conjoint analysis (CA)

The collaborative filtering procedure concerns itself with the estimation of utility levels, based on the notion that the item choices of users can be extrapolated to other users who show similar preference behavior patterns. Choice-based conjoint (CBC) analysis takes a different approach. Where recommender systems aim to estimate missing values for item ratings, CBC analysis aims to estimate the utilities of attribute levels given an arbitrarily large set of hypothetical item combinations. It does so by asking participants to make successive trade-offs between products. This study applies CBC analysis. However, as with recommender systems, there are many conjoint variants that differ in their approach to estimating the utility of a product. A taxonomy of these techniques can be formulated as (1) decompositional methods (e.g. ranking-based conjoint (Green & Srinivasan, 1989), choice-based conjoint (Louviere & Woodworth, 1983), transaction-based conjoint (Ding, Park & Bradlow, 2009; Netzer et al., 2008)), (2) compositional methods (e.g. the self-explicated method (Green & Srinivasan, 1989), paired comparisons (Scholz, Meissner & Decker, 2010), the adaptive self-explicated method (Efrati, Lin & Toubia, 2007)) and (3) hybrid methods (e.g. ACA (Green, Krieger & Bansal, 1988), ACBC (Chapman, Alford, Johnson, Weidemann & Lahav, 2009) and HIT-CBC (Eggers & Sattler, 2009)). The decompositional approach statistically decomposes attribute level preferences from the overall product evaluation.

Choice-based conjoint (CBC). Products are described as a composition of features called attribute levels. The estimation reveals the relative importance of each attribute in order to identify the most appealing combination of attribute levels. Utility levels are estimated based upon which choice options of the presented choice sets are preferred by the respondents. In CBC, choices are modeled as dependent variables. CBC allows for the inclusion of a no-choice option, which can be used to determine what product requirements make up the threshold for purchase consideration, or market relevance. A notable advantage of CBC is that it includes interaction effects. The method, however, requires rather complex designs, which require balanced, orthogonal arrays5. One must take precautions with experiments since participants may resort to reduction strategies when overwhelmed with choices. The number of choice sets and the information density of each product presented must be limited to prevent wear out effects from occurring.

Ranking-based conjoint (RBC). In RBC, participants are asked to rate products on an ordinal scale. It makes intuitive sense to use ordinal ratings both for conjoint analysis and for collaborative filtering, since we are interested in knowing which products are most preferred. However, several problems plague RBC. One major problem is that the distance between different items is assumed to be equal, and the limited information RBC provides greatly limits its applied use. For instance, consider a preference comparison between three cars, where the most preferred option is a Bugatti Veyron, the second most preferred option is a McLaren SLR and the least preferred option is a Fiat Panda. RBC assumes that the distance in preference between the Bugatti Veyron and the McLaren SLR is the same as the distance between the McLaren SLR and the Fiat Panda. Another problem is that RBC lacks a no-choice option, which makes it impossible to compute a consideration threshold.

Transaction-based conjoint (TBC) uses barter methods that facilitate ‘trade’ under challenging market conditions. Barter methods simulate markets in which participants (buyers and sellers) respond to barter offers made by other participants. The social nature of a market simulation where goods are exchanged makes TBC effective at capturing rich information. Price information is captured especially well because, unlike the discrete CBC price levels, buyers and sellers set prices based on their own judgment. Contrary to CBC’s no-choice option, used to derive the minimum requirements for consideration, TBC is designed such that less desirable products are also priced and traded. Because it simulates buyers and sellers, TBC is affected by the endowment effect, a bias whereby people value items more merely because they possess them (Kahneman et al., 1990). Although some studies have shown that barter methods outperform CBC (Ding, Park & Bradlow, 2009), barter methods take considerable implementation and participant coordination effort. Table 2 summarizes the decompositional conjoint methods in a brief taxonomy.


Ranking-based conjoint
• Method: participants are asked to rate products on an ordinal scale
• Pros: easy to implement; using ordinal ratings in both CA and CF makes sense
• Cons: distance between rated items assumed equal; lacks ‘no-choice’ option; difficult to interpret

Choice-based conjoint
• Method: choices as dependent variables; asks for the most preferred option among several alternatives
• Pros: allows choice predictions; includes consideration with a ‘no-choice’ option that signifies expected demand decrease; includes interactions between attributes (‘whole greater than sum of parts’); hierarchical Bayesian estimation allows individual-level estimation of part-worth utilities6
• Cons: complex design; participants resort to reduction strategies when faced with choice overload (wear out effects)

Transaction-based conjoint
• Method: uses barter methods to facilitate ‘trade’ under adverse conditions; market simulation where respondents digitally trade products
• Pros: dynamic customization based on participants’ responses and outcomes; collects substantially more information with limited wear out; less desirable offers also addressed with negative offers
• Cons: complex design; participants’ outcomes are dependent on the outcomes of other participants; prone to loss-aversion bias (Kahneman & Tversky, 1984)

Table 2: Taxonomy of decompositional conjoint methods

The fundamental difference between recommender systems and conjoint analysis is that conjoint analysis is a non-automated approach based on eliciting preferences from survey participants. Conjoint analysis and latent class analysis are traditionally used in marketing research, particularly for segmentation analysis and new product development, where obtaining the optimal feature configuration, willingness to pay and the brand premium is crucial for the competitive performance of firms.


According to Kramer (2007), the accuracy of recommender systems is influenced by ‘task transparency’: the predictive strength of recommender systems increases as users gain a better understanding of how the recommendations were derived. Conjoint methods provide a more intuitive interface that may capitalize on this bias towards task transparency. This paper takes a step towards creating a recommender based on the conjoint methodology.

5.4 Proposed recommender system

The proposed recommender is a collaborative filter whose database is seeded with latent classes obtained from an a priori preference measurement survey to which conjoint analysis is applied. CF is chosen for this study because it derives inferences from similar users. In the case of CBR, seeding the database is not useful because CBR is not based on the premise of user similarity. Although hybrid recommenders are arguably the best performing recommender systems, using one in the context of this study would unnecessarily complicate the analysis and implementation.

The methodology proposed in this paper is to seed information obtained from a conjoint analysis (CA), containing user-item preferences, into a database upon which collaborative filtering (CF) is applied. This pre-loaded information consists of representative profiles from a latent class analysis, subsequently referred to as seed nodes. The seed nodes provide the initial nodes from which the CF infers correlations for new users. Contrary to KBR, this approach does not restrict the system to static, deterministic rules. The advantage of this methodology is that as the taste profiles of users become more complete (i.e. the number of users and the number of ratings per user grow), users tend to correlate more strongly with each other than with the seed nodes, whereas at the start users correlate strongly with the seed nodes. In a sense, the seed nodes act as a set of training wheels against which the CF can compare new users. Since conjoint analysis can calculate the utility of every hypothetical item, the seed nodes provide complete information that can be used to segment new users efficiently based on their preference behavior patterns.


Because the utility of every hypothetical item (defined as a feature combination) can be calculated, the seed nodes will have zero missing values. The seed nodes populating the database are able to guide the CF by enabling it to recommend products that do not yet have any ratings. This is particularly useful for e-commerce databases that contain niche, long-tail products or job databases that contain a stream of job postings that can expire.
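A minimal sketch of this seeding idea, with invented attribute levels and part-worth values (the thesis's actual attributes and estimates appear later in the research design): because every job is just a combination of attribute levels, a latent-class seed node can score all of them, leaving no missing values.

```javascript
// Build a seed node's complete preference vector from a latent class's
// part-worth utilities: the utility of an item is the sum of the
// part-worths of its attribute levels.
function seedNodeScores(partWorths, items) {
  const scores = {};
  for (const [id, levels] of Object.entries(items)) {
    scores[id] = levels.reduce((u, level) => u + (partWorths[level] ?? 0), 0);
  }
  return scores;
}

// Hypothetical latent class and job catalogue, for illustration only.
const class1 = { marketing: 0.8, finance: -0.2, "salary-2500": 0.5, "salary-3000": 1.1 };
const jobs = {
  jobA: ["marketing", "salary-2500"],
  jobB: ["finance", "salary-3000"],
};
const seedNode = seedNodeScores(class1, jobs); // every job scored, rated or not
```

Seeding a handful of such complete profiles gives the collaborative filter something to correlate new users against from the very first interaction.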

In summary, the study attempts to utilize the benefits of CF (e.g. dynamic behavior, serendipitous results, integration of the social environment) while offsetting its main disadvantage, the cold-start problem.
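The seeding step described above can be sketched with a hypothetical in-memory ratings store. This is not the recommendationRaccoon API; the store, the function names and the example values are all illustrative:

```javascript
// Hypothetical in-memory ratings store (illustrative only). Seed nodes
// are inserted as ordinary users with a complete rating vector, so new
// users can correlate against them from their very first rating.
const ratings = new Map(); // userId -> Map(jobId -> 0|1)

function rate(userId, jobId, liked) {
  if (!ratings.has(userId)) ratings.set(userId, new Map());
  ratings.get(userId).set(jobId, liked ? 1 : 0);
}

// seedNodes: { nodeId: { jobId: 0|1, ... }, ... } derived from the
// latent class analysis; every job is rated, so nothing is missing.
function seedDatabase(seedNodes) {
  for (const [nodeId, prefs] of Object.entries(seedNodes)) {
    for (const [jobId, liked] of Object.entries(prefs)) {
      rate(nodeId, jobId, liked === 1);
    }
  }
}

// Illustrative values only:
seedDatabase({ seed1: { j1: 1, j2: 1, j3: 0 } });
```

Real users are then rated into the same store, and the filter never needs to distinguish seed nodes from ordinary users.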

6 Methodological overview

6.1 Conjoint utility model

CBC regresses the attribute levels (independent variables; e.g. marketing manager, €2500) on the product choice (dependent variable) in such a way that the utility of a product is a linear combination of part-worth utilities. Utility estimates can be further optimized using a vector or ideal-point specification for each attribute and then comparing the predictive results of the utility sum using likelihood ratio tests. The random utility function is formulated in equations 1 and 2 (Eggers, 2014):

u_ni = V_ni + ε_ni   (1)

Where:
n = user
i = product (item or job)
u_ni = utility of consumer n for product i
V_ni = systematic utility component (explained utility) of consumer n for product i
ε_ni = stochastic utility component (disturbance term) of consumer n for product i

V_ni = Σ_{k=1..K} β_nk · x_ik   (2)

Where:
k = 1, …, K number of attributes


β_nk = part-worth utility of consumer n for attribute k
x_ik = level of attribute k for product i

Choice-based conjoint is based on the nonlinear multinomial logit model (Kuhfeld, 2010). Here, the discrete dependent variable (the choice) is transformed into a continuous choice probability, such that a value can be calculated for each possible choice.

p(i|J) = exp(V_i) / Σ_{j=1..J} exp(V_j)   (3)
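As a sketch of how equation 3 turns utilities into choice probabilities (the function name is illustrative, not part of the thesis implementation):

```javascript
// Multinomial logit choice probabilities: p(i|J) = exp(V_i) / sum_j exp(V_j).
// Subtracting the maximum utility before exponentiating is a standard
// numerical-stability trick and leaves the probabilities unchanged.
function mnlProbabilities(utilities) {
  const max = Math.max(...utilities);
  const expV = utilities.map(v => Math.exp(v - max));
  const denom = expV.reduce((a, b) => a + b, 0);
  return expV.map(e => e / denom);
}
```

For example, four alternatives with equal utilities each receive a probability of 0.25, and the probabilities always sum to one.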

6.2 Collaborative filtering model

The core problem recommender systems aim to solve is the estimation of missing values for item choices. These missing ratings are items that have not yet been shown to the active user. Consider the following where:

N = set of all users

I = set of all items (e.g. jobs)
u = utility function

u: N × I → R, where R is an ordered set of real numbers within a certain range, such that each user n ∈ N chooses the i′ ∈ I that maximizes the user's utility.

∀n ∈ N:  i′_n = arg max_{i ∈ I} u(n, i)   (4)

Each user n in the set N can be defined by a profile that contains various attributes pertaining to the user's personal information, such as sex, age, birthdate, email address, and so forth. Likewise, each item i in the set I can be defined by a set of product features. The utility u is often only defined on a subset of N × I; thus the utility u must be extrapolated to the whole space N × I.


6.3 Methods of collaborative filtering

6.3.1 k-NN classification

In collaborative filtering, the utility u(n, i) of item i for user n is estimated based on the utilities u(n′, i) assigned to item i by those users n′ ∈ N who are similar to user n. Various similarity measures can be applied; the focus of this paper is on the k-Nearest Neighbors classifier in conjunction with the Jaccard similarity coefficient. The k-Nearest Neighbors classifier, formulated in equation 5, finds the closest k neighbors of each user given the similarity coefficient (Wen, 2008), formulated in equation 7. This method can be applied successfully in real-world situations because the algorithm is efficient in its computational resource consumption: it only compares against the closest k neighbors instead of the entire database.

P_n,i = Σ_{i′ ∈ N_n^K(i)} sim(i, i′) · R_n,i′  /  Σ_{i′ ∈ N_n^K(i)} |sim(i, i′)|   (5)

Where:
N_n^K(i) = {i′ : i′ belongs to the K items most similar to i and user n chose i′}
K = top 5 nearest neighbors
sim(i, i′) = the binary Jaccard similarity coefficient in equation 7
R_n,i′ = the existing choices of user n on item i′
P_n,i = the prediction for user n on item i

Where sim(n, n′) can be any similarity measure7 used to differentiate between users. In the implementation of this study, the Jaccard similarity coefficient is calculated for each user n. Then, a k-Nearest Neighbors classifier compares users only to

7 Some examples to choose from include using ant-based clustering (Nadi, Saraee, Jazi & Bagheri, 2011),


their top 5 nearest neighbors, derived from a sorted similarity list, to compute the final recommendation. The number of nearest neighbors (five) of the kNN classifier has implications for the latent class analysis discussed later in this paper. To distinguish between segments properly, the database must contain at least five segments for a fair comparison. If there are fewer than five users in the database, a new user is compared against fewer than five nearest neighbors, resulting in a minor reduction in predictive power for the first users.
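The neighbor selection described above can be sketched as follows; the function name and signature are illustrative and not the recommendationRaccoon API:

```javascript
// Rank all other users by similarity to the active user and keep the
// top k (five in this study). With fewer than k users in the database,
// everyone is returned, mirroring the reduced accuracy for early users.
function kNearestNeighbors(activeUser, otherUsers, similarity, k = 5) {
  return otherUsers
    .map(u => ({ user: u, sim: similarity(activeUser, u) }))
    .sort((a, b) => b.sim - a.sim) // highest similarity first
    .slice(0, k);
}
```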

6.3.2 Similarity measure

The Jaccard similarity coefficient is a metric that represents the similarity between users and their k-nearest neighbors by dividing the size of the intersection by the size of the union. This method is widely used in practice: in 2009, YouTube abandoned its 5-star rating system for a binary like/dislike system based on the Jaccard similarity coefficient (Rajaraman, 2009). The change was implemented because users rated videos either with five stars or one star, but hardly ever with two, three or four stars. The Jaccard similarity coefficient is formulated in equation 6.

J(n, K) = |n ∩ K| / |n ∪ K|   (6)

Where: 𝑛 = user

𝐾 = top 5 nearest neighbors

Given user n and the users that comprise the k-nearest neighbors K, each with x binary choices, the Jaccard coefficient measures the overlap between the choices of user n and kNN K. Choices are binary; thus each choice of user n and kNN K can be either 0 or 1 (e.g. no or yes, dislike or like). The complete combinations of choices x between user n and kNN K are:

M_11 = total choices where user n and kNN K are both 1
M_10 = total choices where user n is 1 and kNN K is 0
M_01 = total choices where user n is 0 and kNN K is 1
M_00 = total choices where user n and kNN K are both 0
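The binary Jaccard computation can be sketched as follows, assuming two equal-length 0/1 choice vectors (the function name is illustrative):

```javascript
// Binary Jaccard coefficient of equation 7: J = M11 / (M01 + M10 + M11),
// for two equal-length 0/1 choice vectors. M00 (both zero) is ignored.
function jaccard(a, b) {
  let m11 = 0, m10 = 0, m01 = 0;
  for (let x = 0; x < a.length; x++) {
    if (a[x] === 1 && b[x] === 1) m11++;
    else if (a[x] === 1 && b[x] === 0) m10++;
    else if (a[x] === 0 && b[x] === 1) m01++;
  }
  const denom = m11 + m10 + m01;
  return denom === 0 ? 0 : m11 / denom; // 0 when neither side chose anything
}
```

For example, the vectors [1, 1, 0] and [1, 0, 0] share one positive choice out of two non-M00 positions, giving J = 0.5.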


Equation 7 shows the binary Jaccard similarity coefficient between user n and the kNN K (Tan, 2007).

J = M_11 / (M_01 + M_10 + M_11)   (7)

7 Research design

7.1 Research context

The proposed methodology is applied to the prediction of job preferences amongst students. Job recommenders provide an interesting context for two reasons. Firstly, prior to embarking on a professional career, student profiles are reasonably homogeneous, given that students in the same field pursue similar academic degrees. Students often have limited work experience that could differentiate them more distinctly. This is exemplified by the growing need for students to differentiate themselves with extracurricular activities (Caplan, 2011) in order to become attractive to top-tier firms (e.g. McKinsey, Boston Consulting Group, Morgan Stanley, Goldman Sachs) and university admission officers (Allison, 2012). Lack of professional experience implies that students have inherently incomplete profiles. Recommender systems are capable of estimating missing values, in this case job preferences, regardless of prior job experience.

Secondly, students often do not know exactly what type of job they wish to apply for after they graduate; therefore, there exists a clear need for good recommendations amongst students. Landing the right job is important for anyone, but especially so for fresh graduates: their job choice is either a springboard that launches their professional career on a trajectory for success, or one that inhibits or dashes their professional aspirations.

Jobsites make use of recommender systems in order to match job requirements of vacancies to the skillsets and preferences of its users. The ability to leverage


application services providers (e.g. PeopleClick), E-recruiting consortium (e.g. DirectEmployers.com) to corporate career websites (Al-Otaibi, 2012). This study implements the proposed solution in a simulated experiment that could be applicable to any professional career site using recommender systems.

7.2 Procedure

The study design is split into several parts. Chapter 7.3 covers the design of the conjoint study and its subsequent latent class analysis, with the goal of creating the seed nodes. Chapter 7.4 covers how the collaborative filter was implemented and how the latent classes were integrated. Chapter 7.5 covers the experimental design, in which students were tasked to evaluate jobs using the recommender system, and chapter 7.6 covers how the predictive accuracy is measured.

7.3 Conjoint experiment design

The author of this study obtained permission to use the results of an employer choice survey dataset from Dr. Eggers. The sample (N = 158) consists of 51% males and 49% females, with age distributed around the mean (M = 23.14, SD = 1.597). The sample consists wholly of students following the graduate-level course Marketing Engineering at the University of Groningen in the Netherlands. This employer choice survey gathered information about job preferences at the start of the course in early November of 2013 and 2014. Students were incentivized to participate by awarding them an additional .3 grade points on their grade for an assignment; incentive alignment has been shown to improve predictive accuracy and reduce hypothetical bias (Eggers & Sattler, 2011). Demographic information asked prior to choice elicitation included the respondent's age, work


are relevant since students are nearing graduation and must actively seek jobs. The hypothetical jobs are related to the curriculum of the graduate program.

Model specification for conjoint analysis requires testing different combinations of attribute utility measures (linear, quadratic or nominal) in order to find the model with the best model fit and predictive power. Therefore, different models must be compared manually. The criteria used to assess the final model are the Pseudo-R², adjusted Pseudo-R², hit rate and MAE (mean absolute error), as well as the information criteria AIC and AIC3. The final model is then estimated using latent class analysis, a preference-based segmentation method for conjoint analysis. Each segment comprises utility values for each attribute. This assumes that utilities are distributed across participants who belong to discrete segments that vary in their preference patterns (latent classes). Respondents are classified into segments with a probability. These segments will become the initial seed nodes: special nodes in the database that are representative of the preferences of an entire segment, upon which the collaborative filter will make inferences.

Position
o Market researcher in a research company
o Market researcher within an organization
o Product manager
o Management consultant

Location
o Groningen
o Amsterdam
o Rotterdam
o The Hague

Company Size
o 50 employees
o 150 employees
o 500 employees
o 1500 employees

Holidays per Year
o 20 days
o 25 days
o 30 days
o 35 days

Income
o € 2489.-
o € 2766.-
o € 3042.-
o € 3180.-


7.3.1 Stimuli

Several prerequisite steps must be completed prior to starting the experiment, namely:

1. Determine the number of jobs and the selection of the job characteristics
2. Design the seed nodes using latent class analysis
3. Implement recommendationRaccoon, a Javascript library for collaborative filtering
4. Implement a custom web survey, seed the database with the seed nodes and integrate the collaborative filter

7.3.2 Job design

The jobs that are to be recommended during the experiment were designed based on an orthogonal array of size sixteen for five 4-level attributes, provided by Dr. Eggers. A full overview of the jobs is shown in table 4.

•j1 = {Market researcher in a research company, Groningen, 50, 20, €2489}

•j2 = {Market researcher in a research company, Amsterdam, 150, 25, €2766}

•j3 = {Market researcher in a research company, Rotterdam, 500, 30, €3042}

•j4 = {Market researcher in a research company, The Hague, 1500, 35, €3180}

•j5 = {Market researcher within an organization, Groningen, 150, 30, €3180}

•j6 = {Market researcher within an organization, Amsterdam, 50, 35, €3042}

•j7 = {Market researcher within an organization, Rotterdam, 1500, 20, €2766}

•j8 = {Market researcher within an organization, The Hague, 500, 25, €2489}

•j9 = {Product Manager, Groningen, 500, 35, €2766}

•j10 = {Product Manager, Amsterdam, 1500, 30, €2489}

•j11 = {Product Manager, Rotterdam, 50, 25, €3180}

•j12 = {Product Manager, The Hague, 150, 20, €3042}

•j13 = {Management Consultant, Groningen, 1500, 25, €3042}

•j14 = {Management Consultant, Amsterdam, 500, 20, €3180}

•j15 = {Management Consultant, Rotterdam, 150, 35, €2489}

•j16 = {Management Consultant, The Hague, 50, 30, €2766}

Table 4: Job definitions according to an orthogonal array

7.3.3 Seed node design

The number of seed nodes and their preferences were designed based on a latent class analysis following a conjoint experiment. In the next sections, the steps to determine the seed nodes are discussed.


The independent variables of the model can be specified as linear, quadratic or part-worth functions. There are five attributes plus a no-choice attribute. The attributes location and position are specified as part-worth utilities since these can only be defined categorically. The no-choice option is defined numerically for computational convenience. The remaining three variables allow for eight different models (2³ = 8). The eight possible permutations are shown in table 5. Several methods were used to assess model fit and predictive strength; a comprehensive overview of the test results for all models is shown in table 6.

Model    Position  Size  Location  Holidays  Income  No Choice  df  LL(0)     LL(β)
Model 1  nom       nom   nom       nom       nom     num        16  -2628.41  -2193.70
Model 2  nom       nom   nom       num       nom     num        14  -2628.41  -2197.58
Model 3  nom       num   nom       num       nom     num        12  -2628.41  -2197.64
Model 4  nom       num   nom       num       num     num        10  -2628.41  -2198.82
Model 5  nom       nom   nom       num       num     num        12  -2628.41  -2198.76
Model 6  nom       nom   nom       nom       num     num        14  -2628.41  -2194.89
Model 7  nom       num   nom       nom       num     num        12  -2628.41  -2194.92
Model 8  nom       num   nom       nom       nom     num        14  -2628.41  -2193.74

Table 5: Eight models for conjoint estimation (nom = nominal, num = numeric)

Likelihood Ratio Test

The models are compared against a null-model using a χ2 test. The χ2 test statistic and degrees of freedom are derived according to equation 8.

χ² = −2 (ln L(0) − ln L(β*))   (8)

Such that ln L(0) = n · c · ln(1/m) is the minimum likelihood and ln L(β*) = Σ_n Σ_j ln p(i_nj | J_nj) is the maximum likelihood.

Where:
m = number of alternatives per choice set J
n = number of consumers
c = number of choice sets per consumer


p(i|J) = the conditional probability of alternative i given choice set J, with df = npar of ln L(β*).

The aim is to reject H0, which states that no differences exist between the null-model and the specified model. Each model outperformed the null-model (see LL ratio test in table 6).
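The test can be sketched as follows (function names are illustrative). Plugging in model 1's log-likelihoods from table 6 (ln L(0) = −2628.41, ln L(β) = −2193.70) reproduces its reported χ² of roughly 869.42:

```javascript
// Null log-likelihood, ln L(0) = n * c * ln(1/m): n consumers, c choice
// sets per consumer, m alternatives per choice set.
function nullLogLikelihood(n, c, m) {
  return n * c * Math.log(1 / m);
}

// Likelihood ratio statistic of equation 8, chi-square distributed
// with df equal to the number of estimated parameters.
function lrStatistic(lnL0, lnLBeta) {
  return -2 * (lnL0 - lnLBeta);
}

lrStatistic(-2628.41, -2193.70); // model 1 in table 6, about 869.42
```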

Model  df  Pseudo-R²  Pseudo-R² adj  Hit rate  MAE    LL ratio           AIC(LL)  AIC3(LL)  BIC      CAIC
1      16  .1654      .1593          47.679%   4.92%  p(869.4198) < .01  4419.41  4435.41   4468.41  4468.51
2      14  .1639      .1586          47.046%   5.13%  p(861.6768) < .01  4423.15  4437.15   4466.03  4466.12
3      12  .1639      .1593          46.994%   5.05%  p(861.5425) < .01  4419.28  4431.28   4456.03  4456.11
4      10  .1634      .1594          46.835%   5.13%  p(859.1870) < .01  4417.64  4427.64   4448.27  4448.33
5      12  .1635      .1589          46.835%   5.13%  p(859.3094) < .01  4421.52  4433.52   4458.27  4458.35
6      14  .1649      .1596          47.046%   4.74%  p(867.0514) < .01  4417.78  4431.78   4460.65  4460.74
7      12  .1649      .1604          47.046%   4.76%  p(866.9862) < .01  4413.84  4425.84   4450.59  4450.67
8      14  .1654      .1600          47.521%   4.92%  p(869.3488) < .01  4415.48  4429.48   4458.36  4458.44

Table 6: Model fit tests (best values highlighted in the original)

After each model was tested against the null-model, the nested models were pitted against each other with additional χ² tests. To do this, the χ² test statistic and the degrees of freedom were computed according to equations 9 and 10.

df = npar_model2 − npar_model1   (9)

χ² = −2 (ln L(β*)_model1 − ln L(β*)_model2)   (10)


thus, when nested models do not differ significantly, the model with the least number of parameters is favored for parsimony (here, model 6).

Pseudo-R2

The Pseudo-R² provides insight into the predictive strength of the model. To calculate the Pseudo-R², equation 11 is used:

Pseudo-R² = 1 − ln L(β*) / ln L(0)   (11)

As expected, the highest Pseudo-R² is found in model 1 (Pseudo-R² = 0.165). This is due to the fact that the Pseudo-R² does not punish for the number of parameters (df = 16). Since goodness of fit always increases with more parameters, the adjusted Pseudo-R² is calculated using equation 12. The best model was model 7 (R²_adj = 0.16).

R²_adj = 1 − (ln L(β*) − npar) / ln L(0)   (12)
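Equations 11 and 12 can be sketched as follows (function names are illustrative). Model 1's values from table 6 (ln L(0) = −2628.41, ln L(β) = −2193.70, 16 parameters) reproduce its Pseudo-R² of .1654 and adjusted Pseudo-R² of .1593:

```javascript
// McFadden Pseudo-R2 of equation 11.
function pseudoR2(lnL0, lnLBeta) {
  return 1 - lnLBeta / lnL0;
}

// Adjusted Pseudo-R2 of equation 12: subtracts the number of estimated
// parameters from the fitted log-likelihood before comparing to the null.
function pseudoR2Adj(lnL0, lnLBeta, npar) {
  return 1 - (lnLBeta - npar) / lnL0;
}
```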

Hit rate

The hit rate calculates the percentage of observations that were predicted correctly (See equation 13).

Hit rate = (# observations predicted correctly) / (total # of observations)   (13)

Model 1 outperforms the others (HR = 0.477), since the formula does not penalize for the number of parameters.

Mean absolute error

To determine the predictive strength of the model, a holdout validation was performed. Descriptive analysis on the holdout sample revealed how often the different choices were actually selected, which was compared against how often the different choices were predicted. The results for model 7 are provided in table 7 and equation 16, based on the mean absolute error formula in equations 14 and 15.


                  Alternative 1  Alternative 2  Alternative 3  No choice
Observed shares   39.2%          29.1%          29.1%          2.5%
Predicted shares  32.23%         32.49%         35.28%         0%
Absolute error    6.97%          3.39%          6.18%          2.5%

Table 7: Absolute error for model 7 based on holdout sample

MAE_model7 = (6.97 + 3.39 + 6.18 + 2.50) / 4 = 4.76%   (16)

Model 7 (MAE = 4.76%) is the second-best model in terms of mean absolute error, after model 6 (MAE = 4.74%), as shown in table 6.
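The MAE computation can be sketched as follows (function name illustrative); the observed and predicted shares from table 7 reproduce model 7's MAE of 4.76%:

```javascript
// Mean absolute error over the alternatives: the average absolute gap
// between observed and predicted choice shares (in percentage points).
function meanAbsoluteError(observed, predicted) {
  let sum = 0;
  for (let i = 0; i < observed.length; i++) {
    sum += Math.abs(observed[i] - predicted[i]);
  }
  return sum / observed.length;
}

// Shares of table 7 (alternatives 1-3 and no choice):
meanAbsoluteError([39.2, 29.1, 29.1, 2.5], [32.23, 32.49, 35.28, 0]); // about 4.76
```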

Information criteria

Lastly, the AIC, AIC3, BIC and CAIC information criteria were checked. AIC3 penalizes complex models more heavily than AIC (see equations 17 and 18); BIC and CAIC penalize model complexity even more (see equations 19 and 20).

AIC = −2 ln L(β*) + 2 · npar   (17)

AIC3 = −2 ln L(β*) + 3 · npar   (18)

BIC = −2 ln L(β*) + ln(N) · npar   (19)

CAIC = −2 ln L(β*) + ln(N + 1) · npar   (20)
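The four criteria can be sketched in one function (the name is illustrative). With model 7's log-likelihood (−2194.92), 12 parameters and the sample size N = 158, this reproduces the values reported in table 6; note that the reported CAIC is only reproduced by the ln(N + 1) form:

```javascript
// Information criteria for a fitted model with log-likelihood lnLBeta,
// npar estimated parameters and N respondents. Lower values are better.
function infoCriteria(lnLBeta, npar, N) {
  const dev = -2 * lnLBeta; // deviance, shared by all four criteria
  return {
    AIC:  dev + 2 * npar,
    AIC3: dev + 3 * npar,
    BIC:  dev + Math.log(N) * npar,
    CAIC: dev + Math.log(N + 1) * npar,
  };
}

infoCriteria(-2194.92, 12, 158); // model 7: AIC 4413.84, BIC 4450.59, ...
```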

The most important model fit tests are the adjusted Pseudo-R², the log-likelihood ratio test between models and the information criteria, since these favor simple models. In the end, model 7 outperformed every other model on these metrics: it had the lowest scores on each information criterion (AIC = 4413.84, AIC3 = 4425.84, BIC = 4450.59, CAIC = 4450.67) as well as the highest adjusted Pseudo-R². The subsequent latent class segmentation is therefore based on model 7.


7.3.4 Latent classes

In order to find the optimal number of segments, the finite mixture model was estimated with two to nine classes. Several information criteria were assessed to select the appropriate model. BIC and CAIC are favored because these heavily penalize model complexity; according to these, the seven-class solution is most attractive (see table 8).

# of Classes  df  Class. Error  npar  LL          AIC(LL)    BIC        CAIC       AIC3(LL)
6             81  0.0541        77    -1705.9121  3565.8241  3801.644   3878.644   3642.8241
7             68  0.0395        90    -1666.6074  3513.2148  3788.8484  3878.8484  3603.2148
8             55  0.0486        103   -1638.1301  3482.2602  3797.7075  3900.7075  3585.2602
9             42  0.0503        116   -1621.484   3474.9679  3830.229   3946.229   3590.9679

Table 8: Information criteria of the latent class solutions (best performers highlighted in the original)

Table 8 reveals that there is no single best model based on the information criteria: the six- to nine-class solutions each score best on either AIC, AIC3, CAIC or BIC. Given this ambiguous result, further analysis was performed on the models with six to nine latent classes. This analysis included the estimation of the models (see table 9 for the nine-class solution), the calculation of sum utilities and a binomial logit model to construct probabilities according to equation 21. Refer to table 10 for the result of the sum utility calculation and the binomial logit transformation for the nine-class solution.

p_ij = exp(V_ij) / (exp(V_ij) + exp(β_i,nc))  →  p_ij < 50% = no,  p_ij ≥ 50% = yes   (21)

where:
p_ij = probability of taking job j for class i
V_ij = sum utility of class i for job j: the part-worth utilities of the job's position, location and holiday levels plus β·Income_j and β·Size_j for the numeric attributes
β_i,nc = no-choice utility of class i


After running this for each solution, the nine-class solution proved to contain the most distinct classes. The nine-class solution was favored over solutions that scored better on the information criteria because it discriminates better between the choice and no-choice options: the optimal solutions contained many classes that would be indistinguishable from one another after applying the logit transformation to a binary choice variable (yes/no). Therefore, the seed nodes are based on the slightly suboptimal nine-class model.
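The cut-off applied in equation 21 can be sketched as follows (the function name and the example utilities are illustrative):

```javascript
// Binomial logit cut-off: a class's summed utility for a job is pitted
// against its no-choice utility; p >= 50% is coded "yes", otherwise "no".
function takesJob(sumUtilityJob, utilityNoChoice) {
  const p = Math.exp(sumUtilityJob) /
            (Math.exp(sumUtilityJob) + Math.exp(utilityNoChoice));
  return { p, choice: p >= 0.5 ? "yes" : "no" };
}
```

Applying this to every class-job combination yields the binary matrix of table 10, from which the seed node rating vectors are read off directly.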

                                          Class 1   Class 2   Class 3   Class 4   Class 5   Class 6   Class 7   Class 8   Class 9
Position
  Market researcher in a research company  -0.375    1.4762    0.0307   -1.461     0.1061   -2.7862   -0.2363    1.3126    0.4304
  Market researcher in an organization      0.1673    1.1358   -0.0652   -1.6496   -0.0055   -1.5598   -1.7017    4.8919    1.2905
  Product manager                           0.1785   -1.2679   -0.4174    1.3874   -0.5354    2.8474    4.9449   -0.8514    2.6248
  Management consultant                     0.0292   -1.3441    0.4519    1.7233    0.4349    1.4986   -3.0069   -5.3531   -4.3457
Location
  Groningen                                -0.9003    0.6644    1.795     0.3825   -0.8701   -2.3234    1.2513   10.4116   -0.5032
  Amsterdam                                 0.4157   -0.3532    0.0267   -0.6317    1.5527    1.2954   -1.4975   -3.0463   -1.0338
  Rotterdam                                 0.2989   -0.1592   -1.4003   -0.3243   -0.3439    0.3445   -0.1548   -4.5445   -1.1551
  The Hague                                 0.1857   -0.152    -0.4213    0.5735   -0.3387    0.6835    0.401    -2.8208    2.692
Gross Income                                0.0029    0.0028    0.0011    0.0022    0.0031    0.0024    0.0035    0.0065   -0.0015
Company Size                               -0.0001   -0.0002   -0.0003   -0.0006    0.0003   -0.0008   -0.0012    0.0018   -0.0062
Holidays per Year
  20 days                                  -0.549    -0.4815   -0.2979   -0.4431   -0.6131   -0.2922   -0.7685   -2.2661   -1.8076
  25 days                                   0.0661   -0.1909   -0.0022    0.1002    0.1404   -0.506    -0.4078   -0.01     -1.3371
  30 days                                   0.1934    0.1984   -0.0139    0.2118    0.0881    0.6421    1.045     3.3148    1.2409
  35 days                                   0.2896    0.474     0.314     0.1311    0.3846    0.1561    0.1314   -1.0387    1.9038
No Choice                                  -1.6995    4.6279   -6.1534    4.4169    8.8505    7.9173    7.167    30.1338   -5.5334

Table 9: Latent class utilities

Table 10 must be interpreted as follows: job preferences with a probability < 50% are coded as 'No', whereas job preferences with a probability ≥ 50% are coded as 'Yes'. Classes one, two and three do not contain any job preferences below 50% and are therefore interpreted as identical. These duplicate classes were merged, reducing the total number of distinct classes to seven. Significance testing and


Job  Class 1   Class 2   Class 3   Class 4   Class 5   Class 6   Class 7   Class 8   Class 9
     (seed 9)  (seed 8)  (seed 1)  (seed 2)  (seed 3)  (seed 4)  (seed 5)  (seed 6)  (seed 7)
j1   99.92%    98.18%    100.00%   82.02%    15.26%    1.77%     94.36%    2.32%     13.51%
j2   99.99%    98.23%    99.99%    39.79%    82.75%    3.24%     54.80%    0.00%     18.45%
j3   100.00%   99.51%    99.97%    59.92%    64.11%    5.65%     97.16%    0.01%     16.61%
j4   100.00%   99.69%    99.99%    71.56%    83.33%    3.13%     92.12%    0.01%     2.90%
j5   100.00%   99.81%    100.00%   80.75%    56.59%    2.54%     98.76%    99.99%    86.55%
j6   100.00%   99.42%    99.99%    52.38%    92.60%    31.76%    58.72%    0.01%     94.38%
j7   99.99%    96.48%    99.91%    16.13%    31.25%    1.82%     12.89%    0.00%     0.01%
j8   99.99%    95.41%    99.97%    44.60%    23.35%    2.35%     31.78%    0.00%     79.34%
j9   99.99%    94.72%    100.00%   96.34%    24.11%    26.92%    99.97%    6.10%     90.97%
j10  99.99%    64.99%    99.97%    75.52%    60.37%    83.77%    98.58%    0.00%     0.93%
j11  100.00%   93.39%    99.96%    97.62%    57.04%    91.38%    99.97%    0.00%     64.24%
j12  100.00%   87.63%    99.98%    97.60%    29.68%    92.43%    99.96%    0.00%     97.21%
j13  99.99%    91.90%    100.00%   95.43%    49.54%    5.05%     30.07%    0.76%     0.00%
j14  99.98%    51.56%    99.99%    80.34%    59.63%    53.94%    1.30%     0.00%     0.02%
j15  99.99%    78.28%    99.97%    92.41%    35.11%    48.39%    15.90%    0.00%     6.14%
j16  100.00%   85.93%    99.99%    98.44%    48.09%    81.84%    70.95%    0.00%     65.96%

Table 10: Probability of taking a job by latent class (< 50% = no, ≥ 50% = yes)

7.4 Implementation

The collaborative filter was implemented using the recommendationRaccoon engine8 (Morita, 2014). The recommendationRaccoon engine uses the Jaccard similarity coefficient, a commonly used distance measure for binary data, which is appropriate for a binary like/dislike rating system. It uses a k-Nearest Neighbors (k-NN) algorithm to compare participants solely to their nearest neighbors in order to keep the calculations fast. The code was altered to work with the job preference data model. Representative user profiles derived from the a priori latent class analysis were uploaded to the database according to the information in table 10, with the cells labeled in red marked as 0 (No)

8  Github  repositories  for  recommendationRaccoon:  


and other cells marked as 1 (Yes). Refer to appendices 3 to 6 for details on the implementation.

7.5 Experimental setting

The experimental design uses two groups: an experimental group that is provided job recommendations via a collaborative filter based on a seeded database, and a control group that is provided recommendations via an untrained collaborative filter. A short, descriptive text outlining the purpose of the study is shown first. A web interface was built in which participants are asked to rate job recommendations from the collaborative filter (refer to figure 1 and appendix 1). Participants each receive 16 recommendations. After each answer, the database is updated and the collaborative filter provides a new, more accurate recommendation.

Figure 1: Collaborative Filter

7.6 Predictive validity

The predictive validity of the collaborative filter experiment is determined using hit rates. However, there is a crucial caveat when using hit rates for recommender systems. The hit rate increases when observed and expected values


depleted, the performance of the recommendation engine dips. This is only a problem for databases with a very small, humanly exhaustible number of jobs, such as in this controlled experiment, where a limited number of good job options is available. The hit rate is calculated after each nth recommendation and for each respondent to plot the evolution of the predictive strength of the collaborative filter. The hit rate provides an indication of predictive validity at the individual level (Melles, Laumann & Holling, 2000). After computing the hit rate of the nth recommendation for all participants, the mean hit rate for each nth recommendation is computed to obtain an overall performance statistic per nth recommendation.
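The aggregation step can be sketched as follows; the hits matrix and the function name are illustrative:

```javascript
// hits[r][n] is 1 if the n-th recommendation for respondent r was a hit,
// else 0. Averaging over respondents gives one hit rate per n-th
// recommendation, the statistic plotted over consecutive choices.
function meanHitRates(hits) {
  const nRecs = hits[0].length;
  const rates = [];
  for (let n = 0; n < nRecs; n++) {
    let sum = 0;
    for (const row of hits) sum += row[n];
    rates.push(sum / hits.length);
  }
  return rates;
}
```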

8 Results

Two groups of 50 respondents each (seed vs. benchmark) were analyzed. The student sample consists of 51% males (seed: N = 27; benchmark: N = 24) and 49% females (seed: N = 23; benchmark: N = 26). Age was distributed evenly (seed: M = 23.84, SD = 2.159; benchmark: M = 23.60, SD = 2.237). Respondents were approached at the Faculty of Economics and Business of the University of Groningen and asked to participate in person.

The 1st job was not recommended by the collaborative filter but was hardcoded into the software; ergo, every respondent started out evaluating the same job (see j1 in table 4). The CF starts recommending jobs on the 2nd choice. In this experiment the CF recommends jobs until all choices are exhausted. When there are no good


study provides are based on choices 2-9. Hit rates were calculated at the aggregate level (mean hit rate of all respondents per choice) to track the evolution of the overall performance of the CF over consecutive job choices (see figure 2 and table 12).

Figure 2: Hit rate evolution over consecutive job choices (seed vs. benchmark)

Figure 2 provides overall insight into the performance of the seed condition compared to the benchmark condition. Taking into account the cut-off point after the 9th job choice, the seed recommendation outperforms the benchmark recommendation at each nth choice.

In order to gain insight into the cold-start problem, the hit rates for each choice were also examined per individual respondent (see appendix 2). The seed condition outperformed the benchmark condition as the database became more populated, but there is no conclusive evidence that the seed condition differed from the benchmark condition before the 15th respondent.

[Figure 2: line chart of hit rate in % over 16 consecutive job choices, seed vs. benchmark; the underlying percentages are listed in table 12]


                 Benchmark                    Seed
Recommendation   Mean hit rate  Pred. error   Mean hit rate  Pred. error
1                60%            40%           76%            24%
2                64%            36%           82%            18%
3                62%            38%           64%            36%
4                60%            40%           62%            38%
5                52%            48%           56%            44%
6                48%            52%           66%            34%
7                54%            46%           62%            38%
8                46%            54%           72%            28%
9                44%            56%           38%            62%
10               40%            60%           56%            44%
11               34%            66%           50%            50%
12               40%            60%           38%            62%
13               60%            40%           22%            78%
14               42%            58%           20%            80%
15               38%            62%           38%            62%

Table 12: Hit rate and prediction error per choice

9 Conclusions and recommendations

9.1 Findings

This study showed how the integration of seed nodes derived from a conjoint experiment can boost the performance of a collaborative filter. The main goals of this study were to link conjoint analysis to collaborative filtering in a way that addresses the cold-start problem, and to evaluate whether the development of a fully automated conjoint recommender is worthwhile. The cold-start problem was partially addressed: on the aggregate level, the seed condition predicted the preferences of new respondents more accurately than the benchmark by at least 2% and up to 36%. This means that seeding the database with latent classes resulted in performance gains, effective even for databases with a low population (N < 50), and suggests that training the collaborative filter using latent classes is a worthwhile optimization technique. As mentioned in the previous chapter, both conditions suffer from a decline in hit rates after the 9th consecutive choice. The main explanation for this pattern is that the controlled experiment used only 16 jobs, for reasons discussed earlier. The
