Alleviating the cold-start problem of collaborative filtering by hybridising it with a demographic recommender system


Axel Hirschel (10656146)

Bachelor’s Project Artificial Intelligence (18EC) University of Amsterdam

June 2017

Supervisors:

Leonard Wolters, Crobox, Jollemanhof 17, 1019 GW Amsterdam


ABSTRACT

The cold-start problem is the difficulty that collaborative filtering recommender systems (RS) have in recommending products to new users. Demographic RS do not suffer from this problem, but the collaborative filtering RS is better almost 75 percent of the time on longer session lengths. Using data that is easily available to e-commerce companies, this research has constructed a context-aware demographic RS, which is hybridised in three ways with a collaborative filtering algorithm. The feature combination and cascade hybrids do not suffer from the cold-start problem, but both are significantly worse on longer session lengths. The switching hybrid improves performance on session lengths of 0 by using the demographic RS recommendation and is similar to the collaborative filtering RS on longer session lengths. However, it is not able to identify the 25 percent of recommendations of the demographic RS which are better on longer session lengths.

---

ABSTRACT

INTRODUCTION

THEORETICAL FRAMEWORK

Collaborative Filtering Recommender Systems

Content-Based Recommender Systems

Demographic Recommender Systems

Hybrid Recommender Systems

Context-Awareness in Recommender Systems

METHODOLOGY

Data Insight and Data Processing

Collaborative Filtering Recommender System

Demographic Recommender System

Hybrid Recommender Systems

Evaluation

RESULTS

Most Common Recommender System

Demographic Recommender System

Collaborative Filtering Recommender System

Switching Hybrid Recommender System

Feature Combination Hybrid Recommender System

Cascade Hybrid Recommender System

Comparative Results

CONCLUSION & DISCUSSION


INTRODUCTION

The increasing number of products on websites makes it difficult for users to find suitable items (Schafer, Konstan & Riedl, 1999). Recommender systems (RS) address this problem by helping users to find new, relevant items. Ricci, Rokach and Shapira (2011) provide an overview of reasons why RS can be useful from an e-commerce perspective. Some e-commerce companies sell more products, because the suggested items are specifically catered to individuals’ preferences. Users also have a better experience, because they spend more time browsing interesting products instead of analyzing irrelevant content. Lastly, a more detailed understanding of users can be built, and with this information new products can be created or marketed in a specific way.

These RS are categorized by Pazzani (1999) into three types: collaborative filtering, demographic RS and content-based RS. In collaborative filtering the similarity between users’ ratings is used to recommend new items. Demographic RS also depend on similarities between users; however, instead of users’ ratings these RS depend on demographic features. Alternatively, content-based RS produce recommendations based on product features. The performance of all these algorithms is sometimes boosted by merging them into hybrid RS (Burke, 2002; 2007).

One of the performance issues of collaborative filtering and content-based RS, the two most commonly used RS, that could be solved with hybrid RS is the so-called user cold-start problem. This user cold-start problem is the difficulty these RS have in recommending relevant items to users with few clicks on a website. With so little information it is complicated to identify similar users or products, and thus the recommendations are sometimes irrelevant (Rashid et al., 2002). The cold-start problem is particularly important for e-commerce companies, because they get new users regularly (Burke, 2007). For this reason this research aims to alleviate the user cold-start problem for e-commerce companies.

A commonly used technique to mitigate the cold-start problem is to combine collaborative filtering with a demographic RS into a hybrid RS. For example, Lika, Kolomvatsos and Hadjiefthymiades (2014) have employed demographic features to find similar users before applying the collaborative filtering algorithm. Another approach is to extend the features of collaborative filtering with demographic features, which has improved performance (Vozalis & Margaritis, 2004). However, the demographic features applied in both papers are not accessible to many e-commerce companies, as users do not normally disclose demographic data such as profession and age.

Although the features used in previous research are absent, the dataset within this research, provided by Crobox, does contain some demographic features, as well as contextual features. Adomavicius and Tuzhilin (2011) have pointed out that many RS do not employ contextual features, such as at what time, for how long and from where users are browsing. Nevertheless, these features have boosted web search applications (Borisov, Markov, de Rijke & Serdyukov, 2016) and travel guides (Adomavicius & Tuzhilin, 2011). Within this research the easily accessible contextual and demographic features are used to produce three hybrid RS, with the aim of outperforming standard collaborative filtering algorithms, especially in the short term.

The theoretical background of the three main RS, the various types of hybrid RS and context-aware RS is explored in the first chapter of this research. With this information, the methodology examines the construction of the basic demographic and collaborative filtering RS, as well as the three proposed hybrids, the information within the dataset and the evaluation criteria of the algorithms. Subsequently, the results of the experiments are presented, which are then examined in the discussion section. Lastly, the conclusion determines what can be deduced from this research.


THEORETICAL FRAMEWORK

Recommender systems (RS) are systems that guide users in a personalized way to interesting and useful products when there are many possibilities (Schafer, Konstan & Riedl, 1999). Various types of RS have been made, and each of these RS has specific advantages and disadvantages. The most common RS are collaborative filtering RS, content-based RS and demographic RS (Burke, 2002; 2007). The rest of this chapter describes these techniques, as well as the combination of these techniques, called hybrid recommendation. The final paragraph of this chapter is about how RS can be transformed into context-aware RS.

Collaborative Filtering Recommender Systems

The first RS is collaborative filtering, which uses people-to-people correlation to recommend according to personal preferences. The correlation can be identified via a neighbourhood-based or a model-based method (the latter also called latent factor models). Within the neighbourhood method the ratings of a user are compared to the ratings of other users. The positively rated items of the most similar users, called neighbours, are then recommended to the initial user (Desrosiers & Karypis, 2011). Within the model-based method the system transforms both the users and the items to the same latent factor space. The factors that characterize users and items are inferred from user feedback, and with these characterizations recommendations can be made (Koren & Bell, 2011). Collaborative filtering RS often suffer from a scarcity of data, because they depend on similarities between users. When few users have clicked the same item it is difficult to find good neighbours or create correct models. Moreover, if a specific user has not rated many items, it is difficult to recommend relevant items (Burke, 2002). This problem is also known as the cold-start problem and is often especially important for e-commerce companies: the data from Crobox indicates that slightly more than 50 percent of the users leave after one click on the overview pages. This possibly means that these users are unsatisfied with the products they have clicked.

Nevertheless, collaborative filtering RS are still widely used because of the various advantages they offer. Firstly, Burke (2002) suggests that collaborative filtering RS are able to suggest cross-genre items. Secondly, with collaborative filtering a more personalized and complete overview of a user can be generated, allowing these RS to outperform most demographic RS when enough ratings are given. Lastly, even without detailed descriptive data about items, it is possible for collaborative filtering RS to create results that can outperform content-based filtering RS (Burke, 2002).

This efficiency means that the collaborative filtering RS are still relevant despite their detriments in the short term. Within this research the collaborative filtering is therefore used as the leading algorithm, and the other RS and hybridising methods should complement the collaborative filtering to reduce the effects of the user cold-start problem.

Content-Based Recommender Systems

In contrast to collaborative filtering RS, content-based RS do not take ratings of other users into account. Content-based filtering RS look at the positively rated items of a user and try to find similar items based on the characteristics of those items. From the implicit feedback of users, the features that a user is interested in, or does not like, are extracted (Lops, De Gemmis & Semeraro, 2011). The big advantage of content-based filtering is that it is able to overcome the item cold-start problem, which is the difficulty of recommending new items. The features of the items are known without interaction with users, and therefore it is possible to compare the new items to other items (Burke, 2002).

As with collaborative filtering, enough ratings per user need to be present for the system to give relevant recommendations, thus content-based RS also suffer from the user cold-start problem. Also, enough descriptive data needs to be present for the algorithms to provide recommendations, and the provided dataset does not contain features of items. Therefore content-based RS are not used within this research to recommend items.

Demographic Recommender Systems

Demographic RS also recommend products derived from clusters of similar users; however, these clusters are based on users’ demographic data. New users are matched to these clusters, and items which have received positive ratings by people in these clusters are recommended to the new users. The demographic data used could include age, gender and profession; however, obtaining this data is generally challenging (Burke, 2002).

One of the major assets of demographic recommender systems is the ability to propose relatively relevant recommendations for new users, since these users can be compared to clusters even with little information available, and recommendations are made accordingly. An example of this is the research of Lika, Kolomvatsos and Hadjiefthymiades (2014), who have used demographic features to create clusters of similar users before applying a collaborative filtering algorithm. This approach succeeded in increasing the performance on relatively new users. On top of that, Al-Shamri (2016) has mentioned simplicity and speed as major benefits of demographic recommender systems. Moreover, Al-Shamri (2016) and Burke (2002) have recognized the ability to identify cross-genre niches within groups, similarly to the collaborative filtering RS. Lastly, as mentioned by Burke (2002), specific domain knowledge is not necessary when demographic recommender systems are used.

The main deficiency of demographic recommender systems is that many users are reluctant to share their personal information with websites. Therefore, some predictive features are difficult to apply within the algorithm, and this causes more generalized recommendations (Al-Shamri, 2016). Moreover, assigning users to exactly one group is often too simplistic. This is the so-called gray-sheep problem, because in reality users belong to multiple groups (Burke, 2002). Lastly, demographic recommender systems suffer from the new-item cold-start problem, because an item cannot be recommended if few users have interacted with it.

In conclusion, demographic RS have proven to produce well-performing recommendations, although the recommendations are often too generalized. Potentially, this means that hybridising the demographic RS with the collaborative filtering RS solves the cold-start problem of the latter. Therefore this research employs the demographic RS, in combination with the collaborative filtering RS, to boost the effectiveness on shorter session lengths.

Hybrid Recommender Systems

Combining the demographic RS and the collaborative filtering RS can be achieved with multiple approaches. Burke (2002; 2007) provides an overview of the various hybridisation techniques. This paragraph aims to explain the different techniques of hybridisation and to examine the most suitable approaches for this research.

The first technique is weighting the outcomes of multiple RS. Both RS produce a score for each item and these scores are combined with a pre-defined weight. The items with the highest combined score are eventually recommended to the users. As mentioned by Burke (2007), the assumption of weighted hybrids is that the RS have a similar performance regardless of the amount of information about the user and products. This technique is therefore less useful to overcome the cold-start problem of collaborative filtering RS, because the recommendations of collaborative filtering on short session lengths are also taken into account by a weighted hybrid. Similarly, the generalized recommendations of the demographic RS would also influence the more specific recommendations of the collaborative filtering RS on longer session lengths.

A second way of creating hybrids is mixing the results of two recommenders. This means that the recommendations are presented next to each other, and no interaction between the RS takes place. For example, if two RS recommend products, these products would be shown on websites next to each other. However, with mixed results ranking the recommendations is not possible, and ranking is necessary within this research (Burke, 2002).

The third possibility of hybridisation is feature combination, which is mainly used to translate the features of collaborative filtering RS to be used by content-based RS. However, Vozalis & Margaritis (2004) have also employed this technique to extend the matrices used in collaborative filtering to calculate similarity between the train and test data with demographic features. Although this dataset contains different features, which are discussed in the methodology, the success of that research implies possibilities for this research and therefore the feature combination hybrid is experimented with.

The fourth hybrid technique is switching between RS after a number of ratings by a user or based on the degree of certainty of a RS. Ghazanfar and Prugel-Bennett (2010) show that switching between a naive Bayes classifier and a collaborative filtering RS can reduce the item cold-start problem and improve performance as well. This shows that specific advantages of algorithms can be exploited by switching, which is again a reason for this hybrid to be used within this research.

The fifth technique is cascading, in which first one technique is employed and then a second technique is used to break ties between different items. Lika et al. (2014) have used this technique to cluster similar users with a demographic RS and then to find relevant recommendations with collaborative filtering. Again, the dataset in this research contains data which is typically available to e-commerce companies, and thus the result of Lika et al. (2014) cannot be directly applied, yet they have shown that cascading is a promising hybridising technique.

The sixth technique is feature augmentation, in which the output of one RS is used to create new features to be used by another RS. The last technique, meta-level, is similar to feature augmentation, but instead of features the whole model is used as input for the next RS. Burke (2007) mentions that it is often difficult to use feature augmentation or meta-level hybrids, because the RS are not built to produce features or models.

The outcome of this research should be a hybrid RS that exploits the benefits of collaborative filtering RS, but reduces the influence of its cold-start problem by merging it with a demographic RS. Three of the hybridising methods mentioned by Burke are promising based on both previous research and the goal of this research: the feature combination, the switching and the cascade hybrid.

Context-Awareness in Recommender Systems

Often, the context in which users visit a website is important for their preferences (Adomavicius & Tuzhilin, 2011). For example, in winter users are more likely to buy warmer clothes. As trivial as this might be, many systems do not adapt to this kind of context. Adomavicius and Tuzhilin (2011) discuss three possibilities to implement context awareness, which are visualized in Figure 1. Note that these are not RS, but boosts to their performance as they take context into account. Firstly, within contextual pre-filtering only data that was produced within the specific context is used. For example, if someone searches for clothes and that user browses in New York, only earlier ratings from people in New York are taken into account for the creation of a recommendation. It is important to notice that features can be generalized to prevent over-specialization. This means that data produced in the context of “saturday” and “sunday” can be combined in order to create the “weekend” context. The generalizations can be made by experts, or the filter can be evaluated on train data with different generalizations.

Secondly, within contextual post-filtering a list of recommended items can be reordered based on the given context. This filtering can be done based on merely the similarity of the current context to the context of the items or by taking into account the predicted rating as well. Both contextual pre-filtering and contextual post-filtering can be used with all RS.

Thirdly, within contextual modelling the contextual features are directly used in recommendation functions. Contextual modelling is applied either with heuristic calculations or with machine learning techniques. In contrast to the previous context-aware techniques, collaborative filtering RS and content-based RS cannot be combined with contextual modelling. Demographic RS can be combined with contextual modelling.
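Contextual pre-filtering with a generalized context, the first of the three techniques above, can be illustrated with a minimal sketch. The record layout (dictionaries with an "item" and a "day" key) and the function names are assumptions made for this illustration and are not taken from Adomavicius and Tuzhilin (2011) or from the dataset of this research.

```python
def generalise_day(day):
    """Generalise a weekday to a coarser context to prevent over-specialization."""
    return "weekend" if day in ("saturday", "sunday") else "weekday"


def prefilter(interactions, current_day):
    """Keep only the interactions that were produced in the same generalized context."""
    context = generalise_day(current_day)
    return [x for x in interactions if generalise_day(x["day"]) == context]


# Hypothetical interaction records
interactions = [
    {"item": "a", "day": "saturday"},
    {"item": "b", "day": "monday"},
    {"item": "c", "day": "sunday"},
]
weekend_data = prefilter(interactions, "sunday")
print([x["item"] for x in weekend_data])
```

Any RS can then be trained on the filtered data only, which is why pre-filtering is independent of the type of RS.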

Figure 1: Different ways of establishing context in Recommender Systems (Adomavicius & Tuzhilin, 2011)

Within this research, context-awareness is to a large extent achieved by boosting the demographic recommender system with contextual features. An overview of the contextual features can be found in Appendix A, and more explanation on the selection of the features is given in the methodology. This means that contextual modelling is applied within the demographic recommender system, since the features are directly added to the demographic features. The switching hybrid RS is thus partly contextualized, since the collaborative filtering algorithm is not constructed context-aware. The feature combination hybrid is also a contextual model, because the demographic RS is merged with the collaborative filtering RS into one system. The cascade algorithm is arguably a contextual pre-filter, since only collaborative filtering data of users with similar contextual features are used.


METHODOLOGY

Within this section the methodology of this research is further explained. In Figure 2 the major parts of this research are visualized. Firstly, the data has to be processed in order to be used by the various algorithms. A detailed overview of the data and the processing steps is given in the first section. Secondly, the functioning of the two major algorithms of this research, the collaborative filtering and the demographic RS, is explained. Thirdly, the primary steps for the construction of the hybrids between the two algorithms are analyzed. Lastly, the evaluation of the algorithms, the hybrids and their parameters is examined.

Figure 2: Overview of the different elements within this section

Data Insight and Data Processing

The data in this research is provided by Crobox, a company specialized in increasing the revenue of webshops using psychology, marketing and machine learning techniques. The provided dataset contains information about Crobox’s clients and the clients’ users, and is for privacy reasons not disclosed in this paper. However, this section explains the content of the data and the processing steps, so that researchers with comparable datasets can apply similar techniques to their research. The dataset used is that of a webshop specialized in sports clothing. Although much of the other research on RS has used either the MovieLens (Miller & Albert, 2016), the Entree (Pazzani, 1999; Gupta & Arora, 2016) or the Netflix Prize (Hallinan & Striphas, 2016) dataset, this research does not use these datasets for two reasons. Firstly and most importantly, the dataset provided by Crobox has other contextual data available, and thus the proposed techniques cannot be used in a similar way on those datasets. Secondly, this research aims to analyze the behaviour of RS on webshops, and although many parts of the research on RS in the entertainment industry can be applied, this research is directly applicable to webshops.

Within the dataset of Crobox the users do not give the products explicit ratings, so their implicit interests need to be considered. Many actions are monitored by Crobox, such as clicking on an item, adding it to the shopping cart and buying the item. For this research all interactions with an item are considered positive feedback on that item. Someone who clicks on an item multiple times clearly has more interest in that item; however, following the advice of Desrosiers and Karypis (2011), this research does not give extra weight to multiple clicks on the same item.

The dataset also contains demographic and contextual information about the sessions and users. Many RS have used demographic data, such as those of Pazzani (1999) and Gupta & Arora (2016); however, none of these studies gives an explicit definition of demographic data. On the other hand, contextual data is defined by Adomavicius and Tuzhilin (2011) as points that change during a user’s interaction with the website. Following this definition, we can consider parts of our data primarily contextual. A detailed overview of the features within the dataset can be seen in Appendix A.


Figure 3: Overview of the Data Processing Steps

The main parts of the processing of the data are visualized in Figure 3. The load-in step within the data processing starts with cleaning the data. This means that the homepage and detail pages and the color codes of products are discarded, which reduces both the skewness towards often occurring products and the number of products. Secondly, the actions of users are grouped as chronological sessions. The last step is to remove items from sessions which have been clicked on before, since within e-commerce it is not necessary to recommend known products again. For other applications, such as the creation of music playlists, the decision can be taken differently, because people do like songs they know in a playlist.

The second step in processing the data is to shuffle the sessions, because the sessions are initially ordered chronologically and all days should be represented in the train and test sets. The order of clicks within sessions is not shuffled. The shuffling happens before the data is modified for use in the algorithms. This means that the performance of the algorithms can be directly compared, because they use the same selection of data. The third step to prepare the data is the pre-processing, which consists of 5 steps. Firstly, cyclic data, such as time of day and weekday, is transformed to correctly represent its movement through time. Secondly, OneHot-encoding is used for the vectorization of categorical data, which makes it possible to calculate neighbourhoods even with non-numeric data. A disadvantage of OneHot-encoding is that too many features are created for a computer to use all the data. Therefore a cut-off after the 10 most common values of each feature is established, and the information of other categories is discarded. Thirdly, all numerical features are normalised using min-max normalisation, so that all features have a similar impact on the learning algorithms.
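The first three pre-processing steps can be sketched as follows. This is a minimal illustration and not the implementation used in this research: the sine/cosine mapping is one common way to encode cyclic data and is an assumption here, since the exact transformation is not specified, and all function names are hypothetical.

```python
import math
from collections import Counter


def encode_cyclic(value, period):
    """Map a cyclic value (e.g. hour 0-23) onto the unit circle (assumed encoding)."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def top_k_one_hot(values, k=10):
    """One-hot encode a categorical column, keeping only the k most common values."""
    kept = [v for v, _ in Counter(values).most_common(k)]
    # Values outside the top k become an all-zero row: their information is discarded.
    rows = [[1 if v == category else 0 for category in kept] for v in values]
    return kept, rows


def min_max(values):
    """Rescale a numeric column to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical example columns
browsers = ["chrome", "safari", "chrome", "firefox"]
kept, rows = top_k_one_hot(browsers, k=2)
print(kept, rows)
print(min_max([3, 5, 11]))
```

With the assumed circular encoding, hour 23 ends up close to hour 0, which a plain numeric encoding of the hour would not achieve.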

Figure 4 & 5: Occurrences of the top 10 values and the amount of data not represented of respectively browser types and cities of users.

The fourth step in the pre-processing is to remove features that contain product information, have no predictive value or are not present for many users. The features with product information are removed, because these features would give information about a click instead of information about the context of the user. Some features, such as timestamps and IDs, have no predictive value and are thus discarded. The last group of features that have been removed are those that are not present in all samples. Figures 4 and 5 give an overview of the variance of two features; since a large part of the data represented in Figure 4 is not covered by the top 10 values, that feature is discarded. A full overview of all the features in the dataset can be found in Appendix A.

Figure 6 visualizes the distribution of session lengths within the dataset. The longer sessions always entail the shorter sessions as well, so the total number of users within the dataset is 450650. As can be seen, just a small part of the visitors has many clicks on the overview pages, which means that after a certain time the evaluation is based on fewer samples. In total, just over 1.5 million clicks have been made within the dataset, which means that on average users click on 3.3 products.

Figure 6: Distribution of session lengths within the data

In total 1121 different items are given implicit feedback on, and to reduce the sparsity merely 1000 of the products are used in this research. A distribution of how often items are clicked on is visualized in Figure 7.

Figure 7: Distribution of item clicks in the data

The last step in processing the data is to divide the data set into five equal parts. Every version of every algorithm will be trained on ⅘ of this data, and evaluated with the last part. This is done five times in total, with each time employing a different part of the data as test data.
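The division into five parts, with each part serving once as test data, can be sketched as below. The function name, the fixed seed and the representation of a session as a list of clicked items are assumptions for this illustration.

```python
import random


def five_fold_splits(sessions, n_folds=5, seed=0):
    """Yield (train, test) pairs; every fold is used exactly once as test data."""
    shuffled = sessions[:]
    # Sessions are shuffled as a whole; the click order inside a session is untouched.
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        test = folds[i]
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, test


# Hypothetical sessions: each session is a list of clicked item ids
sessions = [[i, i + 100] for i in range(10)]
for train, test in five_fold_splits(sessions):
    print(len(train), len(test))
```

Because every algorithm receives the same splits, differences in performance cannot be caused by a different selection of data.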

Collaborative Filtering Recommender System

The first constructed RS is the Collaborative Filtering Recommender System, and an overview of the most important processes in this algorithm is visualized in Figure 8. The inputs of the algorithm are the test and training data gathered in the data processing step. As mentioned, the different versions of this algorithm are tested five times. The first part of this section addresses the creation of the test and train matrices. The second part focusses on the production of the eventual prediction. As can be seen in Figure 8, this prediction is then evaluated, which is explained in the last section of this chapter.


Figure 8: An overview of the Collaborative Filtering Recommender System.

The first step in producing the collaborative filtering algorithm is to produce a vector of the 1000 items which are clicked most frequently. With this vector a train matrix of n users by 1000 items can be constructed, with a one at each intersection where the user of that row has interacted with the item of that column and a zero otherwise.
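A minimal sketch of this construction is given below. It assumes that each session is a list of item identifiers; the function name and the example data are hypothetical.

```python
from collections import Counter


def build_train_matrix(sessions, n_items=1000):
    """Return the most frequently clicked items and a binary session-by-item matrix."""
    counts = Counter(item for session in sessions for item in session)
    item_vector = [item for item, _ in counts.most_common(n_items)]
    column = {item: j for j, item in enumerate(item_vector)}
    matrix = []
    for session in sessions:
        row = [0] * len(item_vector)
        for item in session:
            if item in column:  # items outside the top n_items are ignored
                row[column[item]] = 1
        matrix.append(row)
    return item_vector, matrix


# Hypothetical sessions of clicked item ids
sessions = [["a", "b"], ["a", "c"], ["a", "b"]]
item_vector, matrix = build_train_matrix(sessions, n_items=2)
print(item_vector)
print(matrix)
```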

Algorithm: Collaborative Filtering Recommender System

Parameters:

K-means clustering (number of clusters): 10, 30, 100, 300, 1000

Prediction based on: kNN, Weighted

k in kNN: 1, 3, 10

Table 1: An overview of the parameters of the constructed Collaborative Filtering Recommender Systems

Secondly, the various train matrices are constructed; these hold the data used to construct a prediction. The data is clustered to speed up the training, since the system needs to run smoothly if it is implemented at e-commerce companies. The clustering of the data is done with the K-means clustering algorithm. Multiple numbers of clusters are used in the evaluation; an overview of all parameters of this algorithm can be found in Table 1.
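For readers unfamiliar with K-means, the sketch below shows the standard alternation between an assignment step and an update step (Lloyd's algorithm) on the rows of a small binary matrix. It is an illustration only: the initialisation with the first k rows, the fixed number of iterations and the function name are assumptions, and the implementation used in this research may differ.

```python
def kmeans(rows, k, n_iter=20):
    """Cluster rows into k groups; returns the centroids and a label per row."""
    # Naive initialisation (assumption): the first k rows act as starting centroids.
    centroids = [[float(x) for x in row] for row in rows[:k]]
    labels = [0] * len(rows)
    for _ in range(n_iter):
        # Assignment step: every row goes to the centroid with the smallest squared distance.
        for i, row in enumerate(rows):
            distances = [sum((a - b) ** 2 for a, b in zip(row, c)) for c in centroids]
            labels[i] = distances.index(min(distances))
        # Update step: every centroid becomes the mean of its member rows.
        for c in range(k):
            members = [rows[i] for i in range(len(rows)) if labels[i] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Hypothetical binary train matrix with two obvious groups
rows = [[1, 1, 0, 0], [0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 1, 0]]
centroids, labels = kmeans(rows, k=2)
print(labels)
print(centroids)
```

Comparing a test vector with k centroids instead of with all n users is what makes the clustered version faster at prediction time.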

Thirdly, the test data is prepared to be compared with the train matrices. Figure 9 visualizes a possible scenario with less data, but it is exemplary for larger sets as well. For every item of each user a vector of the information up to that point is constructed. If it is the first click of a user, the vector consists of merely zeros. For every extra click a one is added to the vector, symbolizing the knowledge gained previously. This construction correctly displays the gathered information, and it does not use information it would not have in an application. Besides the vector, the correct item is also appended to a list of all clicked items. Similarly, the length of the session so far is appended to a vector of all session lengths.

Figure 9: The construction of the test matrix, the correct items vector and the session length
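The construction of the test vectors, the correct items and the session lengths can be sketched as follows. The function name and the example data are hypothetical; the item vector is assumed to be the list of most frequently clicked items from the training step.

```python
def build_test_rows(session, item_vector):
    """For every click, store the history before it, the clicked item and the length so far."""
    column = {item: j for j, item in enumerate(item_vector)}
    history = [0] * len(item_vector)
    rows, targets, lengths = [], [], []
    for n_seen, item in enumerate(session):
        rows.append(history[:])  # knowledge available before this click
        targets.append(item)     # the item the RS should have recommended
        lengths.append(n_seen)   # session length so far
        if item in column:
            history[column[item]] = 1
    return rows, targets, lengths


# Hypothetical session of two clicks
rows, targets, lengths = build_test_rows(["b", "a"], ["a", "b", "c"])
print(rows)     # [[0, 0, 0], [0, 1, 0]]
print(targets)  # ['b', 'a']
print(lengths)  # [0, 1]
```

The first row contains only zeros, which is exactly the cold-start situation: at session length 0 the collaborative filtering RS has no information about the user.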

The second process of the collaborative filtering algorithm is the establishment of a prediction. One at a time, the test vectors are compared with the clusters in the train matrix. The comparison is computed using the cosine similarity, which results in a list of similarities with all clusters. This list of similarities is used to produce predictions, and two possible methods of processing the similarities are used. Firstly, the kNN algorithm is used, in which the k most similar clusters are identified. The vectors of these k clusters are then merged, and the sum of their item clicks is used to sort the 1000 products and produce a list of recommendations.
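The kNN variant of the prediction can be sketched as below. This is a minimal illustration under the assumption that every cluster is represented by one vector of item clicks; the function names and example data are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity; defined as 0.0 when one of the vectors is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_recommend(test_vector, cluster_vectors, item_vector, k=3):
    """Rank items by the summed clicks of the k most similar clusters."""
    sims = [cosine(test_vector, c) for c in cluster_vectors]
    nearest = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)[:k]
    scores = [sum(cluster_vectors[i][j] for i in nearest) for j in range(len(item_vector))]
    order = sorted(range(len(item_vector)), key=lambda j: scores[j], reverse=True)
    return [item_vector[j] for j in order]


# Hypothetical cluster vectors over the items a, b and c
clusters = [[1, 1, 0], [0, 0, 1], [1, 0, 0]]
print(knn_recommend([1, 0, 0], clusters, ["a", "b", "c"], k=2))
```

In this sketch an all-zero test vector (session length 0) has a similarity of 0 with every cluster, so the selection of neighbours carries no personal information.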

The second method of producing the prediction is to multiply the similarity vector and the train matrix with each other. The result of this multiplication is a vector of size 1 × m, with one score per item. The item list is subsequently sorted according to the result of the multiplication. The produced recommendation is also a sorted list, with the item most likely to be bought in first place and so forth.
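The weighted variant can be sketched as a vector-matrix product: a similarity vector with one entry per row of the train matrix is multiplied with that matrix, which gives one score per item. The function name and the example values are hypothetical.

```python
def weighted_recommend(similarities, train_matrix, item_vector):
    """Multiply a 1 x n similarity vector with an n x m matrix and rank the m items."""
    n_items = len(item_vector)
    scores = [
        sum(s * row[j] for s, row in zip(similarities, train_matrix))
        for j in range(n_items)
    ]
    order = sorted(range(n_items), key=lambda j: scores[j], reverse=True)
    return [item_vector[j] for j in order], scores


# Hypothetical similarities with two rows of the train matrix
ranking, scores = weighted_recommend([0.5, 1.0], [[1, 0, 1], [0, 1, 1]], ["a", "b", "c"])
print(ranking)  # ['c', 'b', 'a']
print(scores)   # [0.5, 1.0, 1.5]
```

In contrast to the kNN variant, every row contributes to the score, but rows with a low similarity contribute little.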

Demographic Recommender System

The second constructed RS is the Demographic Recommender System, and an overview of the most important processes in this algorithm is visualized in Figure 10. The first step is to split the inputted train data into X and Y data; using K-means clustering, K clusters are then constructed from the X train data. For every constructed cluster the clicked items of the actions within that cluster are summed, and this sum is later used for a prediction.

Figure 10: Overview of the most important steps in the Demographic Recommender System

Similarly to collaborative filtering, the similarity of a test vector and the clusters is calculated. Using this similarity two possible approaches for the creation of a prediction can be taken. The first approach is to sum up the Y counts for the k closest clusters. A second approach is to weigh the count of each cluster by the similarity between the cluster and the test vector. All the different parameters which are tested are mentioned in Table 2.

| Algorithm | Demographic Recommender System |
| --- | --- |
| Feature input | various versions of features |
| Amount of clusters | 10, 30, 100 |
| Prediction based on | kNN, Weighted, pre-clustering |
| k in kNN | 1, 3, 10 |
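The cluster summing and the two prediction approaches of the demographic RS can be sketched as below. This is a hedged illustration: the cluster labels and centroids are assumed to come from a K-means step that is not shown, cosine similarity is assumed as the similarity measure, and all names and toy values are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equally long vectors (0.0 if one is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def sum_clicks_per_cluster(labels, clicked_items, n_clusters, n_items):
    """Count, per cluster, how often each item was clicked (the Y counts)."""
    counts = [[0] * n_items for _ in range(n_clusters)]
    for label, item in zip(labels, clicked_items):
        counts[label][item] += 1
    return counts


def predict_demographic(test_x, centroids, counts, k=None):
    """Rank items with kNN (k given) or with similarity weighting (k is None)."""
    sims = [cosine_similarity(test_x, c) for c in centroids]
    order = sorted(range(len(centroids)), key=lambda j: (-sims[j], j))
    if k is None:
        weights = [(j, sims[j]) for j in order]   # weighted: all clusters count
    else:
        weights = [(j, 1.0) for j in order[:k]]   # kNN: only the k closest clusters
    n_items = len(counts[0])
    scores = [sum(w * counts[j][i] for j, w in weights) for i in range(n_items)]
    return sorted(range(n_items), key=lambda i: (-scores[i], i))


# Toy data: two clusters, three items, five training actions.
centroids = [[1.0, 0.0], [0.0, 1.0]]
counts = sum_clicks_per_cluster([0, 0, 1, 1, 1], [2, 2, 0, 1, 1], 2, 3)
print(predict_demographic([0.9, 0.1], centroids, counts, k=1))
print(predict_demographic([0.9, 0.1], centroids, counts))
```

With k=1 only the closest cluster contributes, whereas the weighted variant lets the clicks of the distant cluster influence the order of the lower-ranked items.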


The performance of this kind of algorithm is largely dependent on the data input. Most demographic RS have been tested with manually selected data. To test whether the selected features are effective, some other variations of features have been tested as well.

The last version of the demographic recommender system is an RS in which the session length is also a defining feature. As a result, the data is split on the basis of session length before the clusters are constructed. The next step is to create k clusters per session length. The prediction is thus made by finding the session length of the test sample and comparing its demographic data with the clusters of that session length.

Hybrid Recommender Systems

Within this research three different hybrid recommendation systems are constructed: a switching hybrid, a cascade hybrid and a feature combination hybrid. This section explains how these hybrid RS function and how the hybrid techniques mentioned by Burke (2002) are used.

Figure 11: An overview of the Switching Hybrid Recommender System

The first constructed algorithm is a switching hybrid RS, which switches between the best versions of the collaborative filtering RS and the demographic RS; its parameters are displayed in Table 3. Figure 11 displays the general idea behind the built switching hybrid. During the training phase the algorithm evaluates how well both the demographic and the collaborative filtering algorithm perform per session length. The switching hybrid is then adapted to choose between both recommendations, based on the session length. On short session lengths the demographic RS can be used to solve the cold-start problem, and on longer session lengths the strength of collaborative filtering is used.

| Algorithm | Switching Hybrid Recommender System between a Collaborative Filtering and a Demographic Recommender System |
| --- | --- |
| Switching criterion | Session Length, Output of logistic classifier |
| Logistic classifier input | Cumulative performance on session, previous performance on session |

Table 3: An overview of the parameters used in the constructed switching hybrid recommender systems.
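Switching on session length can be sketched as a lookup table that is learned on the train set. The sketch below is an assumption about how such a table could be built, not the code of this research: the record format, the labels "CF" and "DRS", the tie-breaking in favour of collaborative filtering and the default for unseen session lengths are all hypothetical choices.

```python
from collections import defaultdict


def learn_switch_table(records):
    """records: iterable of (session_length, rank_cf, rank_drs) from the train set.

    For each session length the RS with the lower summed rank of the correct item
    is selected; within one session length this equals comparing the mean ranks.
    """
    totals = defaultdict(lambda: [0, 0])  # session length -> [sum rank CF, sum rank DRS]
    for length, rank_cf, rank_drs in records:
        totals[length][0] += rank_cf
        totals[length][1] += rank_drs
    # Ties go to collaborative filtering (assumption).
    return {
        length: ("DRS" if sum_drs < sum_cf else "CF")
        for length, (sum_cf, sum_drs) in totals.items()
    }


def switch(session_length, table, default="CF"):
    """Pick the RS for a test sample; unseen session lengths fall back to the default."""
    return table.get(session_length, default)


# Toy ranks: the DRS ranks the correct item higher on session length 0 only.
train_records = [(0, 288, 191), (0, 300, 180), (1, 116, 226), (1, 120, 230)]
table = learn_switch_table(train_records)
print(table)
print(switch(0, table), switch(7, table))
```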

Another version of this algorithm uses the output of a logistic regression classifier as switching criterion. The input for the classifier is the cumulative or the previous performance of both algorithms on users in the train set. For example, if before the 10th click the demographic RS has predicted better for this user 3 times, the cumulative performance, and thus the X-feature, for this train example is 3. If the demographic RS predicts the 10th click better, then the Y-value is 1, otherwise it is 0. If the 9th click was better predicted by the demographic RS, then the X-feature for the previous performance is 1. These features are used to classify when it is better to use the collaborative filtering RS or the demographic RS. Besides solving the cold-start problem, this classifier should be able to find sessions on which the demographic RS is also better on longer session lengths.
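The construction of one training example for the classifier can be sketched as follows. The helper name and the representation of a session as a list of 0/1 flags (1 if the demographic RS predicted that click better) are assumptions made for illustration, and clicks are numbered from 1 here; fitting the logistic regression itself is not shown.

```python
def build_training_example(drs_wins, click_number):
    """Return (cumulative, previous, label) for the given click of one user.

    drs_wins[i] is 1 if the demographic RS predicted click i + 1 better than
    collaborative filtering, otherwise 0. click_number counts from 1.
    """
    before = drs_wins[:click_number - 1]
    cumulative = sum(before)               # X-feature: wins of the DRS so far
    previous = before[-1] if before else 0  # X-feature: did the DRS win the last click?
    label = drs_wins[click_number - 1]     # Y-value: does the DRS win this click?
    return cumulative, previous, label


# The DRS was better on clicks 1, 4 and 9, and is also better on click 10.
wins = [1, 0, 0, 1, 0, 0, 0, 0, 1, 1]
print(build_training_example(wins, 10))  # (3, 1, 1)
```

This reproduces the example from the text: three earlier wins give a cumulative feature of 3, the win on the 9th click gives a previous-performance feature of 1, and the Y-value is 1.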

Figure 12: An overview of the different steps in the Feature Combination Hybrid Recommender System

The second hybrid is the feature combination hybrid RS, which is visualized in Figure 12; its parameters are displayed in Table 4. The algorithms of the demographic RS and the collaborative filtering RS within this research share many similarities in how they function. The main difference is the input data, but the methodology of calculating similarities can still be applied if the data is combined, and this is the main idea behind the feature combination hybrid. The matrices of the demographic RS and the collaborative filtering RS are appended to each other, followed by K-means clustering on the combined data. The following steps are equal to those of the earlier mentioned algorithms: the similarity is calculated and is then used to select the k nearest clusters or to produce a weighted recommendation.

| Algorithm | Feature Combination Hybrid Recommender System |
| --- | --- |
| Amount of clusters | 100, 300, 1000 |
| Prediction based on | kNN, Weighted |
| k in kNN | 1, 3, 10 |

Table 4: An overview of the parameters used in the constructed Feature Combination Hybrid Recommender System
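Appending the two matrices amounts to concatenating, per user, the demographic feature row with the click row. The sketch below assumes that both matrices list the same users in the same order; the function name and toy values are hypothetical, and any scaling of the two feature types is left out.

```python
def combine_features(demographic_rows, click_rows):
    """Append each user's click vector to that user's demographic feature vector."""
    if len(demographic_rows) != len(click_rows):
        raise ValueError("both matrices must contain one row per user")
    return [list(d) + list(c) for d, c in zip(demographic_rows, click_rows)]


# Two users, two demographic features and three items.
combined = combine_features([[1, 0], [0, 1]], [[3, 0, 2], [0, 5, 1]])
print(combined)  # [[1, 0, 3, 0, 2], [0, 1, 0, 5, 1]]
```

The combined rows can then be clustered and compared in the same way as the single-source matrices.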

The last hybrid is the cascade hybrid, which is depicted in Figure 13. The central idea of this cascade hybrid is that the demographic RS selects users that are similar to the test user before collaborative filtering is used to order the provided data into an ordered list of items. The collaborative filtering algorithm is thus only applied to a training set of users that belong to a similar demographic and contextual cluster. This merge can therefore also be viewed as a contextual pre-filter, as described by Adomavicius & Tuzhilin (2011).


Figure 13: An overview of the different steps in the Cascade Hybrid Recommender System

Within the cascade hybrid, demographic and contextual clusters are constructed first; these are the pre-clusters. Per pre-cluster, the corresponding collaborative filtering data of the users is then used to construct clusters within the demographic clusters. The predictions are created by calculating the similarity of the test user and the demographic clusters. The test user's collaborative filtering data is then compared to the collaborative filtering clusters within the most similar demographic cluster. Then either kNN or a weighted version is used to construct a recommendation for the test user. This parameter and all the others can be found in Table 5.

| Algorithm | Cascade Hybrid Recommender System |
| --- | --- |
| Amount of demographic pre-clusters | 10, 30, 100 |
| Amount of collaborative filtering clusters | 10, 30, 100, 300, 1000 |
| Prediction based on | kNN, weighted |
| k in kNN | 1, 3, 10 |

Table 5: An overview of the parameters used in the Cascade Hybrid Recommender System
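The two stages of the cascade can be sketched as below. This is a simplified illustration under assumptions: each pre-cluster is represented as a dictionary with a demographic centroid and a list of collaborative filtering cluster vectors, cosine similarity is used in both stages, and the names and toy data are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equally long vectors (0.0 if one is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cascade_predict(test_demo, test_clicks, pre_clusters, k=None):
    """Stage 1: pick the most similar demographic pre-cluster.
    Stage 2: rank items with the CF clusters inside that pre-cluster,
    using kNN (k given) or similarity weighting (k is None)."""
    best = max(pre_clusters, key=lambda p: cosine_similarity(test_demo, p["centroid"]))
    cf_clusters = best["cf_clusters"]
    sims = [cosine_similarity(test_clicks, c) for c in cf_clusters]
    order = sorted(range(len(cf_clusters)), key=lambda j: (-sims[j], j))
    if k is None:
        weights = [(j, sims[j]) for j in order]
    else:
        weights = [(j, 1.0) for j in order[:k]]
    n_items = len(test_clicks)
    scores = [sum(w * cf_clusters[j][i] for j, w in weights) for i in range(n_items)]
    return sorted(range(n_items), key=lambda i: (-scores[i], i))


# Toy data: two demographic pre-clusters, each holding two CF clusters over three items.
pre_clusters = [
    {"centroid": [1, 0], "cf_clusters": [[3, 0, 1], [0, 2, 2]]},
    {"centroid": [0, 1], "cf_clusters": [[0, 0, 5], [1, 4, 0]]},
]
print(cascade_predict([0.2, 0.8], [0, 1, 0], pre_clusters, k=1))
print(cascade_predict([0.9, 0.1], [0, 1, 0], pre_clusters, k=1))
```

The same click vector leads to different rankings for the two test users, because the demographic stage restricts the collaborative filtering data that is consulted.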

Evaluation

This section is dedicated to the evaluation of the predictions of the algorithms and Figure 14 visualizes the entire evaluation. As mentioned, all versions of all algorithms give a sorted list of the 1000 most common items within the data. Besides the prediction, also the correct item and the length of the session so far are given as input for the evaluation step.

Figure 14: An overview of the evaluation of predictions

The Average Rank of Correct recommendation (ARC) is used to measure the success of the algorithm. Burke (2007) uses this metric specifically because the tested algorithms can directly be compared to other algorithms. Moreover, the ARC shows how well an RS can discriminate certain items from others, as the output is an ordered list. The ARC is calculated by searching the ordered lists outputted by the RS and locating the item that is clicked on. The rank of this location is stored, and when all the ranks have been determined their mean is calculated. This mean is the ARC score, and an example of the calculation is visualized in Figure 15. A low ARC score means that the algorithm can recommend more accurately according to a user's preferences. Since the session lengths are also given as input, it is possible to calculate the ARC score per session length, which makes it possible to compare algorithms on different session lengths as well.
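The ARC calculation can be sketched as follows. The sketch assumes that ranks count from 1 and that the clicked item always occurs in the recommended list; the function name, the input format and the toy data are hypothetical.

```python
from collections import defaultdict


def average_rank_of_correct(predictions):
    """predictions: list of (ranked_items, clicked_item, session_length).

    Returns the overall ARC and a dictionary with the ARC per session length.
    """
    all_ranks = []
    ranks_by_length = defaultdict(list)
    for ranked_items, clicked_item, session_length in predictions:
        rank = ranked_items.index(clicked_item) + 1  # position 1 is the top recommendation
        all_ranks.append(rank)
        ranks_by_length[session_length].append(rank)
    overall = sum(all_ranks) / len(all_ranks)
    per_length = {length: sum(r) / len(r) for length, r in ranks_by_length.items()}
    return overall, per_length


predictions = [
    (["a", "b", "c"], "b", 0),  # rank 2
    (["c", "a", "b"], "c", 0),  # rank 1
    (["b", "c", "a"], "a", 1),  # rank 3
]
print(average_rank_of_correct(predictions))  # (2.0, {0: 1.5, 1: 3.0})
```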

The data is split into 5 equal parts, so that it is possible to check that the results of the various algorithms are not caused by overfitting to a particular part of the data. All RS are also compared with a baseline recommender, which always recommends a sorted list of the most occurring items, to see whether they really differ from it.

Figure 15: Example of an ARC Calculation

A final remark is that run time is not taken into account in the comparison of the various algorithms. However, the run time of the algorithms has to be taken into account when the algorithms are implemented for e-commerce companies. Since further increasing some parameters, such as the amount of clusters, would increase the run time considerably, this research has chosen not to increase the parameters beyond the values mentioned before.


RESULTS

Within this section the results of this research are presented. Firstly, the results of the baseline algorithm are presented. Then the outcomes of the collaborative filtering and the demographic recommender system are shown. Thirdly, the results of the hybrid recommender systems are visualized. The last part of this section contains an overview and comparison of the best performing versions of each algorithm.

Most Common Recommender System

The most common recommender system is the main baseline of this research, and the other RS are therefore compared to its results. As can be seen in Figure 16, this RS does not suffer from a cold-start problem. On the contrary, the most common recommender performs better on shorter session lengths than on longer session lengths.

Figure 16: The ARC per session length of the Most Common Recommender System

Demographic Recommender System

In Table 6 the findings of the various demographic RS are displayed. The best performing version is highlighted in green, and the worst performing variant in red. All the tested versions outperform the most common recommender regarding the average ARC-score, which shows that demographic data can contribute to increasing performance. However, the maximum improvement of the demographic RS is 4,3 percent.

The columns 0 to 21+ give the ARC-score on that session length.

| Algorithm | Amount of pre-clusters | Average ARC-score | 0 | 1 | 2 | 3 | 4 | 5-10 | 11-20 | 21+ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| most common | - | 255 | 203 | 237 | 258 | 272 | 282 | 298 | 317 | 331 |
| DRS weighted | 10 | 253 | 200 | 234 | 256 | 270 | 280 | 296 | 315 | 329 |
| DRS kNN k=1 | 10 | 247 | 195 | 228 | 250 | 264 | 274 | 291 | 310 | 326 |
| DRS kNN k=3 | 10 | 249 | 196 | 230 | 252 | 266 | 276 | 292 | 312 | 327 |
| DRS kNN k=10 | 10 | 252 | 199 | 232 | 254 | 268 | 281 | 296 | 315 | 332 |
| DRS prefilter | 10 | 250 | 195 | 233 | 256 | 270 | 279 | 296 | 319 | 313 |
| DRS weighted | 30 | 252 | 200 | 234 | 255 | 269 | 279 | 295 | 314 | 329 |
| DRS kNN k=1 | 30 | 246 | 193 | 227 | 249 | 263 | 273 | 290 | 310 | 326 |
| DRS kNN k=3 | 30 | 246 | 193 | 227 | 248 | 263 | 272 | 289 | 309 | 325 |
| DRS kNN k=10 | 30 | 248 | 196 | 230 | 251 | 265 | 275 | 292 | 311 | 326 |
| DRS prefilter | 30 | 250 | 194 | 233 | 255 | 269 | 279 | 295 | 319 | 312 |
| DRS weighted | 100 | 252 | 200 | 233 | 255 | 269 | 279 | 295 | 314 | 329 |
| DRS kNN k=1 | 100 | 250 | 196 | 231 | 253 | 267 | 277 | 293 | 312 | 328 |
| DRS kNN k=3 | 100 | 244 | 191 | 226 | 247 | 262 | 272 | 288 | 307 | 323 |
| DRS kNN k=10 | 100 | 245 | 192 | 226 | 247 | 262 | 272 | 288 | 308 | 323 |
| DRS prefilter | 100 | 250 | 194 | 232 | 255 | 269 | 279 | 295 | 318 | 312 |
| Improvement best compared to most common | - | 4,3% | 5,9% | 4,6% | 4,3% | 3,7% | 3,6% | 3,4% | 3,2% | 5,7% |

Table 6: Results of the various Demographic Recommender Systems
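The improvement percentages appear to follow from the relative decrease of the ARC-score with respect to the most common recommender. The sketch below states that assumed formula; the function name is hypothetical, and the printed values are recomputed from the rounded ARC-scores of the best demographic RS in Table 6.

```python
def improvement_percent(baseline_arc, arc):
    """Relative improvement over the baseline in percent (a lower ARC is better)."""
    return (baseline_arc - arc) / baseline_arc * 100


# Best demographic RS (kNN, k=3, 100 pre-clusters) against the most common recommender.
print(round(improvement_percent(255, 244), 1))  # average ARC-score: 4.3
print(round(improvement_percent(203, 191), 1))  # session length 0: 5.9
```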

The results suggest that the weighted demographic RS recommends similarly to the most common recommender, because its ARC-score on all session lengths is close to the ARC-score of the most common recommender. This indicates that the clusters constructed from the demographic data are similar to each other. Since the weighted version takes all clusters into account and these clusters are so similar, its predictions resemble those of the most common recommender.

Figure 17 & 18: Visualization of the parameter optimization in the kNN Demographic Recommender Systems

Figures 17 and 18 visualize the optimization of the main parameters of the kNN demographic recommender systems. The first graph shows the different numbers of neighbours that were tested. Although the optimum differs per amount of clusters, it can be concluded that the best version has three neighbours. The second graph shows the optimization of the amount of clusters. The two best performing versions have one hundred clusters. The graph still shows a decline, which means that more clusters could decrease the ARC-score even further. However, since the decline is limited and more clusters increase the run time, this research has not pursued further optimization.

The pre-filter variant of the demographic recommender system performs well on session lengths smaller than two. This implies that on shorter session lengths the data is more predictable as well, especially because the most common recommender also shows better performance on short sessions. Apart from this uplift in performance compared to the most common recommender on short sessions, the pre-filter version is outperformed by the kNN variant.

In addition to the listed versions of the demographic RS, 15 experiments with the best parameters and varying features have been conducted. During these experiments a number of features were removed and then the demographic algorithms were run. None of the fifteen experiments significantly improved performance, so this research did not pursue feature selection any further. On average the kNN demographic RS with 100 pre-clusters and 3 neighbours has the best performance of all demographic RS. The demographic RS are used to tackle the cold-start problem, and since this variant also has the best performance on short session lengths, this version is considered the best of the demographic RS.

Collaborative Filtering Recommender System

In Table 7 the results of the collaborative filtering RS are displayed. Many of the versions outperform the most common algorithm, and various versions also show the potential of collaborative filtering with performances that are over 30 percent better than the most common recommender. It is interesting that most versions also perform well on short session lengths of at least 1. However, if the session length is 0, then all the collaborative filtering algorithms are outperformed by the most common algorithm.

The columns 0 to 21+ give the ARC-score on that session length.

| Algorithm | Amount of pre-clusters | Average ARC-score | 0 | 1 | 2 | 3 | 4 | 5-10 | 11-20 | 21+ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| most common | - | 255 | 203 | 237 | 258 | 272 | 282 | 298 | 317 | 331 |
| CF kNN k=1 | 10 | 239 | 263 | 201 | 213 | 223 | 230 | 243 | 257 | 276 |
| CF kNN k=3 | 10 | 232 | 256 | 195 | 207 | 216 | 222 | 235 | 252 | 268 |
| CF kNN k=10 | 10 | 264 | 226 | 252 | 268 | 278 | 284 | 295 | 306 | 312 |
| CF weighted | 10 | 225 | 249 | 183 | 197 | 206 | 213 | 226 | 244 | 262 |
| CF kNN k=1 | 30 | 221 | 272 | 178 | 183 | 190 | 195 | 208 | 228 | 249 |
| CF kNN k=3 | 30 | 202 | 265 | 148 | 158 | 168 | 175 | 188 | 209 | 234 |
| CF kNN k=10 | 30 | 219 | 239 | 187 | 197 | 205 | 210 | 221 | 235 | 250 |
| CF weighted | 30 | 205 | 265 | 149 | 159 | 168 | 176 | 190 | 213 | 239 |
| CF kNN k=1 | 100 | 225 | 330 | 160 | 161 | 167 | 172 | 188 | 214 | 243 |
| CF kNN k=3 | 100 | 183 | 289 | 108 | 114 | 123 | 131 | 149 | 180 | 219 |
| CF kNN k=10 | 100 | 184 | 266 | 123 | 130 | 139 | 146 | 160 | 183 | 216 |
| CF weighted | 100 | 185 | 285 | 109 | 117 | 127 | 135 | 154 | 185 | 222 |
| CF kNN k=1 | 300 | 265 | 405 | 208 | 195 | 190 | 189 | 199 | 223 | 246 |
| CF kNN k=3 | 300 | 192 | 325 | 107 | 111 | 119 | 127 | 147 | 181 | 219 |
| CF kNN k=10 | 300 | 185 | 293 | 120 | 124 | 129 | 133 | 142 | 165 | 209 |
| CF weighted | 300 | 187 | 304 | 108 | 117 | 127 | 136 | 155 | 187 | 226 |
| CF kNN k=1 | 1000 | 293 | 381 | 287 | 254 | 234 | 228 | 232 | 258 | 287 |
| CF kNN k=3 | 1000 | 204 | 326 | 127 | 125 | 134 | 143 | 164 | 198 | 238 |
| CF kNN k=10 | 1000 | 178 | 288 | 116 | 111 | 113 | 117 | 133 | 167 | 213 |
| CF weighted | 1000 | 190 | 303 | 111 | 116 | 126 | 133 | 153 | 185 | 222 |
| Improvement best compared to most common | - | 30,2% | 0,0% | 54,9% | 57,0% | 58,5% | 58,5% | 55,4% | 48,0% | 36,9% |

Table 7: Results of the various collaborative filtering recommender systems.

Figures 19 and 20 show the optimization of, respectively, the amount of neighbours and the amount of pre-clusters. It can be seen that with a larger amount of clusters, a higher amount of neighbours increases the performance. Similarly, if k increases, more clusters perform better as well. Although increasing the amount of neighbours and clusters might be beneficial for the performance of the collaborative filtering algorithm, this research has not pursued this. The run speed is already slow, and further increasing the parameters would also increase the run time. As mentioned in the methodology, more run time is a considerable downside for many e-commerce companies.

Figures 19 & 20: Optimization of the k in kNN and amount of clusters in the Collaborative Filtering RS

The best version of the collaborative filtering RS is the version with 1000 pre-clusters and 10 neighbours. It is outperformed by some versions on short session lengths, but since the collaborative filtering algorithms within this research are used for their long-term performance this is negligible. These parameters have also proven most successful in the tests in which the known data was included. The increase in performance of this version compared to the most common RS on unknown data is just over 30 percent.

Switching Hybrid Recommender System

In Table 8 the results of the various switching hybrid recommenders are displayed. Specifically for this hybrid the results of the previous recommenders are also listed, because they strongly influence this hybrid as it switches between their results.

The columns 0 to 21+ give the ARC-score on that session length.

| Algorithm | Splitting criterion | Average ARC-score | 0 | 1 | 2 | 3 | 4 | 5-10 | 11-20 | 21+ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| most Common | - | 255 | 203 | 237 | 258 | 272 | 282 | 298 | 317 | 331 |
| DRS kNN k=3, clusters = 100 | - | 244 | 191 | 226 | 247 | 262 | 272 | 288 | 307 | 323 |
| CF kNN k=10, clusters = 1000 | - | 178 | 288 | 116 | 111 | 113 | 117 | 133 | 167 | 213 |
| Switching Hybrid | Session Length | 149 | 191 | 116 | 111 | 113 | 117 | 133 | 167 | 213 |
| Switching Hybrid | Logistic regression on cumulative ARC-score of algorithms | 149 | 191 | 116 | 111 | 114 | 117 | 133 | 167 | 213 |
| Switching Hybrid | Logistic regression on previous ARC-score of algorithms | 149 | 191 | 116 | 111 | 114 | 118 | 133 | 165 | 213 |
| Switching Hybrid | Logistic regression on cumulative wins of both algorithms | 149 | 191 | 116 | 111 | 113 | 117 | 134 | 168 | 216 |
| Improvement best compared to most Common | - | 41,6% | 5,9% | 51,1% | 57,0% | 58,5% | 58,5% | 55,4% | 48,0% | 35,7% |

Table 8: Results of the different versions of the Switching Hybrid Recommender System

As was expected, the switching hybrid RS are able to exploit the strengths of the other algorithms to produce a prediction that outperforms the earlier constructed algorithms. The strong performance of the demographic RS is used on the shortest session length, before switching to collaborative filtering on longer session lengths. This leads to an improvement of 41,6 percent compared to the most common recommender.

During the construction of the switching hybrid recommender system the performance per click of the demographic and the collaborative filtering RS has been evaluated as well. The results of these tests are listed in Table 9, which shows how many of the predictions per session length of the collaborative filtering RS were better than those of the demographic RS and vice versa. These results indicate that even on a session length of 0 collaborative filtering is sometimes better. Perhaps more surprising is that more than 24 percent of the time the demographic recommender system outperforms the collaborative filtering predictions on longer session lengths. When for each click the best recommendation of the two RS is selected, the overall ARC-score is 114.
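The per-click comparison can be sketched as follows. The sketch assumes that a click on which both RS give the correct item the same rank is counted for neither RS, which would be consistent with percentages in Table 9 that do not always sum to 100; the function name and the toy ranks are hypothetical.

```python
def compare_per_click(ranks_cf, ranks_drs):
    """Count which RS ranked the clicked item better per click and compute the
    ARC that would result if the better of the two were always chosen."""
    cf_better = sum(1 for c, d in zip(ranks_cf, ranks_drs) if c < d)
    drs_better = sum(1 for c, d in zip(ranks_cf, ranks_drs) if d < c)
    best_ranks = [min(c, d) for c, d in zip(ranks_cf, ranks_drs)]
    best_arc = sum(best_ranks) / len(best_ranks)
    return cf_better, drs_better, best_arc


# Ranks of the clicked item for four clicks; the second click is a tie.
print(compare_per_click([10, 5, 7, 3], [4, 5, 9, 1]))  # (1, 2, 4.25)
```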

| Session length | Amount CF better | Percentage CF better | Amount DRS better | Percentage DRS better | ARC if best recommender is chosen |
| --- | --- | --- | --- | --- | --- |
| Overall | 872958 | 58,67% | 614988 | 41,33% | 114 |
| 0 | 113763 | 25,24% | 330638 | 73,37% | 150 |
| 1 | 179682 | 71,23% | 70022 | 27,76% | 81 |
| 2 | 122988 | 74,72% | 40426 | 24,56% | 82 |
| 3 | 87542 | 75,32% | 27964 | 24,06% | 97 |
| 4 | 64995 | 75,26% | 20905 | 24,21% | 91 |
| 5-10 | 164226 | 73,79% | 58344 | 26,21% | 103 |
| 11-20 | 79809 | 68,88% | 36057 | 31,12% | 129 |
| 21+ | 59953 | 66,18% | 30632 | 33,82% | 165 |

Table 9: Performance of the DRS and CF per click relative to each other.

The logistic regression classifiers are specifically implemented to select the best performing algorithm for a specific click. The logistic regression classifiers have always predicted the demographic RS to predict better on the 0th click, since it is better 73 percent of the time. Unfortunately, on longer session lengths collaborative filtering is almost always selected, which means that roughly 25 percent of the time a worse prediction is chosen. The reason for this is that the previous performance of the algorithms is not predictive enough for the classifier to decide when the demographic RS is better. Figure 21 visualizes this with an example of the 10th click: the x-axis is the amount of times the demographic RS was better on the previous 9 clicks, and the y-axis shows the ratio of 10th clicks that is better predicted by the demographic RS. Figure 21 thus shows that the logistic classifiers only learn to predict the demographic RS to be better when 8 out of 9 clicks were so far better predicted by the DRS. This small amount is thus not enough to create a better performing splitting criterion.

Figure 21: Ratio of Demographic RS predictions better than Collaborative filtering RS predictions on the 10th click of a user
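The ratio plotted in Figure 21 can be computed along the lines of the sketch below. The representation of a session as a list of 0/1 flags (1 if the demographic RS predicted that click better), the function name and the toy sessions are assumptions; the toy example uses the 3rd click instead of the 10th to stay small.

```python
from collections import defaultdict


def drs_win_ratio(sessions, target_click):
    """For each number of earlier DRS wins, return the share of sessions in which
    the DRS also predicts the target click better. Clicks are numbered from 1."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for flags in sessions:
        if len(flags) < target_click:
            continue  # session too short to contain the target click
        previous_wins = sum(flags[:target_click - 1])
        totals[previous_wins] += 1
        wins[previous_wins] += flags[target_click - 1]
    return {n: wins[n] / totals[n] for n in sorted(totals)}


sessions = [[1, 1, 1], [1, 1, 0], [0, 0, 0], [0, 0, 1], [1, 0, 0], [1]]
print(drs_win_ratio(sessions, target_click=3))  # {0: 0.5, 1: 0.0, 2: 0.5}
```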

Feature Combination Hybrid Recommender System

The experiments of the feature combination hybrid recommender system have been carried out with the best performing cluster amounts of the collaborative filtering and demographic RS. The results of these experiments are listed in Table 10.


The columns 0 to 21+ give the ARC on that session length.

| Algorithm | Amount of pre-clusters | Average ARC | 0 | 1 | 2 | 3 | 4 | 5-10 | 11-20 | 21+ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| most Common | - | 255 | 203 | 237 | 258 | 272 | 282 | 298 | 317 | 331 |
| Feature Combination weighted | 100 | 254 | 201 | 235 | 257 | 271 | 280 | 298 | 316 | 326 |
| Feature Combination kNN k=1 | 100 | 250 | 198 | 232 | 255 | 270 | 279 | 298 | 314 | 298 |
| Improvement best compared to most Common | - | 4,7% | 4,4% | 3,8% | 3,1% | 2,6% | 3,2% | 4,7% | 5,7% | 12,4% |

Table 10: Results of the different versions of the Feature Combination Hybrid Recommender

The best performing version of the feature combination hybrid is the version with 1000 pre-clusters and ten nearest neighbours. The difference with the version with 300 pre-clusters and 10 neighbours is small, but on longer sessions the version with 100 pre-clusters is slightly better. The average rank of the best version improves on the performance of the most common RS by 4,7 percent.

The difference of this version compared to the most common RS and the demographic RS is small. This means that a large part of the predictions is caused by the demographic RS and not the collaborative filtering RS. However, the feature combination RS is distinctly different; on session lengths of 3 the demographic RS outperforms this version and on longer sessions the feature combination RS is better than the demographic RS.

Cascade Hybrid Recommender System

Since there are 48 versions of the cascade hybrid recommender system, only the versions with an average ARC-score lower than 200 are listed in Table 11. The 3 best performing versions of this hybrid all have 1000 pre-clusters of the collaborative filtering data. The best performing collaborative filtering algorithms also had this as their best performing parameter. However, in contrast to collaborative filtering, the best performing versions here are the 'weighted' versions. Since the DRS pre-clustering within this cascade hybrid is to a large extent similar to the DRS version with 1 neighbour, it is logical that almost all versions with 10 and 30 pre-clusters perform better than the version with 100 pre-clusters. Those parameters also led to the best results in the DRS experiments.

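The columns 0 to 21+ give the ARC on that session length.

| Algorithm | Amount of DRS pre-clusters | Amount of CF pre-clusters | Average ARC | 0 | 1 | 2 | 3 | 4 | 5-10 | 11-20 | 21+ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| most Common | - | - | 255 | 203 | 237 | 258 | 272 | 282 | 298 | 317 | 331 |
| Cascade weighted | 10 | 300 | 191 | 244 | 142 | 147 | 156 | 164 | 181 | 212 | 246 |
| Cascade weighted | 10 | 1000 | 174 | 230 | 125 | 126 | 133 | 141 | 159 | 191 | 230 |
| Cascade weighted | 30 | 300 | 193 | 238 | 151 | 153 | 159 | 168 | 185 | 213 | 248 |
| Cascade weighted | 30 | 1000 | 175 | 222 | 138 | 133 | 137 | 144 | 160 | 190 | 230 |
| Cascade weighted | 100 | 1000 | 189 | 215 | 181 | 161 | 160 | 164 | 175 | 201 | 240 |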


In Table 11 the best performing versions of the cascade hybrid RS are displayed, and it can be seen that these versions all beat the most common recommender on average. However, similarly to the collaborative filtering RS, their performance is lacking on a session length of 0. When users have at least 1 click on the website, these versions are able to outperform the most common recommender. The best performing version of the cascade hybrid RS is the weighted version with 10 DRS and 1000 CF pre-clusters.

Comparative Results

In Table 12 the best performing results are displayed, and it shows that the switching hybrid RS is the best RS constructed within this research. All of the RS have outperformed the most common recommender on average, and only 2 versions perform worse on session lengths of 0. The best performing algorithm is the switching hybrid with a splitting criterion based on the session length, which is 41,6 percent better than the most common recommender.

The columns 0 to 21+ give the ARC on that session length.

| Algorithm | Average ARC | 0 | 1 | 2 | 3 | 4 | 5-10 | 11-20 | 21+ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| most Common | 255 | 203 | 237 | 258 | 272 | 282 | 298 | 317 | 331 |
| CF kNN k=10 & 1000 pre-clusters | 178 | 288 | 116 | 111 | 113 | 117 | 133 | 167 | 213 |
| DRS kNN k=3 & 100 pre-clusters | 244 | 191 | 226 | 247 | 262 | 272 | 288 | 307 | 323 |
| Cascade weighted 10 DRS pre-clusters & 1000 CF pre-clusters | 174 | 230 | 125 | 126 | 133 | 141 | 159 | 191 | 230 |
| Feature Combination kNN k=10 & 1000 pre-clusters | 243 | 194 | 228 | 250 | 265 | 273 | 287 | 299 | 290 |
| Switching Hybrid split based on session length | 149 | 191 | 116 | 111 | 113 | 117 | 133 | 167 | 213 |
| Improvement best compared to most Common | 41,6% | 5,91% | 51,1% | 57,0% | 58,5% | 58,5% | 55,4% | 47,3% | 35,7% |

Table 12: Results per session length of the best performing Recommender Systems in this research

Figure 22 shows the performance of the various algorithms, and it is clear that the performance of the collaborative filtering algorithm, as expected, suffers from the cold-start problem. However, its performance from then onwards is unrivaled except by the switching algorithm, which uses exactly the same predictions on those session lengths. The cascade hybrid can outperform the collaborative filtering RS on a session length of 0, but its performance declines relative to the best performers. The demographic RS and the feature combination hybrid are slightly better than the most common recommender, but their performance on longer session lengths is significantly worse compared to the cascade hybrid, the collaborative filtering RS and the switching hybrid.


CONCLUSION & DISCUSSION

Within this section the results of this research are discussed, the main conclusion is formulated and the opportunities for further research are identified. The section starts with the analysis of counterintuitive results within the 5 tested algorithms. These unusual results are explained, and their implications for the research are argued. Secondly, from these results a conclusion is derived which clarifies whether the cold-start problem of the collaborative filtering RS is solved. Thirdly, the possibilities for further research into the topic of recommender systems for e-commerce companies are examined.

This research aims to combat the cold-start problem of collaborative filtering RS for e-commerce companies by merging the collaborative filtering algorithm with a context-aware demographic RS. The cold-start problem within the dataset is limited to session lengths of 0, since on longer session lengths the collaborative filtering RS immediately exceeds the performance of the most common recommender. The good performance on those relatively short session lengths is partly attributed to the large amount of products which have to be sorted. Since all categories are within the same list and most users do not switch between categories, the collaborative filtering algorithm largely becomes a category matcher which recommends items of similar categories to the user.

The performance of the collaborative filtering algorithm surprisingly declines when the session length becomes bigger than 5. The most common recommender shows a similar pattern, and therefore the decline is probably caused by the user interaction on the site. The first clicks on the website are mostly on more popular items, which are thus easier to predict. Later, the amount of clicks on the most bought products decreases, which makes recommending products harder.

Despite the good overall performance of collaborative filtering, on a session length of 0 it is surpassed by other versions based on the demographic RS. The context-aware demographic RS itself is superior to the most common recommender, but only by a small margin. This could mean that there is little variation in the demographic and contextual features. Each test sample is thus fairly similar to the clusters in the train set, and so many of the clusters are taken into account when predicting. Moreover, although the increase is significant, the demographic data used in the Entree dataset (Pazzani, 1999) (Burke, 2002) has shown more predictive value.

The good performance of the demographic RS on session lengths of 0 allows the switching hybrid RS to exploit it to mitigate the cold-start problem. This relatively simple algorithm is currently useful, because the algorithms clearly outperform their counterparts on certain session lengths. However, as shown in the results, on longer session lengths still roughly a quarter of the demographic RS predictions are better. Our methods to determine which RS would be better based on the performance of both RS on a user have not been successful, as these versions always predicted collaborative filtering to perform better for that user. This suggests that previous performance is not predictive of the future performance of the algorithms.

The other hybrid algorithms, feature combination and cascade, specifically used the demographic data to boost the performance of collaborative filtering, also on longer session lengths. The feature combination hybrid showed that using both features directly does not improve the recommendation. Since the demographic recommender outperformed the most common recommender on all session lengths, it was expected that the demographic prefiltering in the cascade version would be better than using all data within the dataset. However, the cascade hybrid has a decrease in performance on longer session lengths compared to the collaborative filtering version. This decrease in effectiveness implies that although the collaborative filtering data and the demographic data both have predictive value, they do not base their recommendations on the same users.

In conclusion, the switching hybrid which combines the collaborative filtering algorithm with the context-aware demographic RS to a large extent solves the cold-start problem of collaborative


that showed potential have also been unable to improve the performance of collaborative filtering on longer session lengths. E-commerce companies can use these findings to replace their current RS with the switching hybrid RS when recommending items to users.

The fact that collaborative filtering on this dataset only has a cold-start problem before the first click does not diminish the usefulness of this research. Over 40 percent of the users have no further interaction with the website, and improving the prediction for these users allows e-commerce companies to better suit their needs. Moreover, since the switching hybrid directly copies the recommendations of the collaborative filtering system, the recommendations on longer session lengths are still adequate.

The failure of both the cascade and the feature combination hybrid shows that intricate merging methods do not have the desired effects. The results suggest that the contextual and demographic data and the collaborative filtering data are not compatible with each other and that future research should not focus on alternative ways of merging these types of data with each other.

Instead, continuing this research can be done by further developing switching criteria between the two RS for the switching hybrid RS since the results show that a decent amount of the predictions of the demographic RS were better than the collaborative filtering. A possibility for this is to use supervised learning algorithms on the demographic and collaborative filtering data because these algorithms can then evaluate how well algorithms are doing with certain data. Similarly, the similarity between test users and the clusters can be used to train the supervised learning algorithms.

Secondly, Borisov et al. (2016) have shown that contextual features can be used to improve the quality of web search applications. Instead of using the contextual information as input for cluster algorithms they have used this data to weigh implicit feedback within their dataset. Since within this research it is argued that combining features does not improve performance, it should be researched whether with this dataset contextual data can be used for weighing the collaborative filtering data as well.