
Predicting customer repeat purchase and product return behavior for brick-and-mortar grocery retailers : an empirical study using probabilistic customer-base analysis models



Product Return Behavior for Brick-and-Mortar Grocery Retailers

Business Administration 2015-2016
Draft version of the master thesis in Marketing
University of Amsterdam, Faculty of Economics and Businesses

Author: dr. M. M. van der Kuyp (10266313)
milanvanderkuyp@gmail.com / milan_manu@hotmail.com
Supervisor: dr. E. Kormaz
Second Reader: dr. U. Konus
Submission date: 24 June 2016


Table of Contents

Abstract
I. Introduction
II. Literature review
Academic and managerial contributions
III. Theoretical background
Theories on past purchase behavior and product return behavior
IV. Research design
Data
Key variables
Method
V. Results
Preliminary analysis
Explanatory analysis
VI. Discussion
VII. Conclusion
VIII. Bibliography
IX. Appendix
Figures
Tables
Syntax
Data cleaning
Variable computation
Statistical analyses


Abstract

In recent years, various academic studies have focused on predicting repeat purchase behavior using probabilistic modeling approaches. The main purpose of this study is to contribute to this literature by investigating the relationship between past purchase behavior, including product return behavior, and future repeat purchase behavior in a brick-and-mortar grocery retail setting. Based on the expectancy-disconfirmation model, it is expected that consumers who returned more products will make fewer future repeat purchases. In other words, customers who return more products are more likely to be disgusted and therefore to drop out.

The study uses transaction data on 4,014 customers of a brick-and-mortar grocery retailer, covering the period from March 2, 2012 through March 30, 2013. The data were gathered from Kaggle, an open source platform that organizes competitions among data scientists. The BG/NBD model of Fader et al. (2005) is used to predict repeat purchases. The study conducts a hierarchical multiple regression analysis to test the relationship between past purchase behavior, as well as product return behavior, and repeat purchase behavior.

Results show that the BG/NBD model did not perform well enough on this particular dataset to predict repeat purchases. The grocery retailing setting is characterized by extremely frequent visits at short intervals, which could violate the purchase behavior assumptions of the BG/NBD model. Nevertheless, the strongest predictors of repeat purchases were found to be the number of products purchased and the amount of money spent. Further, the analyses show that product return behavior and repeat purchase behavior have a spurious relationship, whereby differences in the number of frequent repeat purchases are caused by past purchase behavior and not by product return behavior. Limitations of the dataset and of the methods adopted are addressed in an extensive discussion.

This thesis extends the limited empirical validation of the BG/NBD model by considering the product return behavior of customers in a grocery retailing setting.

Keywords BG/NBD model, repeat purchases, frequency of buying, customer retention,


I. Introduction

“If we can develop better relationship with our customers, we can help them accomplish their goal and they, in turn, can help us accomplish ours” (Johnson, 2016). It is crucial for businesses to understand customers and to build profitable relationships in order to become successful. As stated by marketing columnist Brent Johnson in his recent column “Marketing in 2016”, modern marketing is not just about selling your products and services but about developing long-term relationships with your customers so that they become repeat buyers. The more often a customer comes back, the stronger the relationship with this customer is; and a stronger relationship with customers eventually leads to more future purchases (Dwyer, Schurr, & Oh, 1987). It has been recognized that the most valuable customers are those who return and become repeat buyers (Gupta et al., 2006). Repeat buyers are more likely to use word-of-mouth and to spend more (e.g. increases in margins and cross-sales), while the costs of retention decrease over time (Reichheld & Teal, 2001). Over the last decades, marketers have paid enormous attention to predicting repeat purchase behavior in order to better understand how to build longer customer relationships.

The scope of this study is the prediction of customer repeat purchase behavior for a brick-and-mortar grocery retailer. The data have been gathered from an American grocery retail chain. The costs of customer retention are not taken into account in this study. Instead, special attention has been paid to past purchase behavior and product return behavior. Past purchase behavior has been measured by the number of products customers purchased and the amount of money they spent; product return behavior has been measured by the number of products customers returned and the amount of money they got back for returned products.

The study adds knowledge on the empirical validation of the BG/NBD model for predicting customer repeat purchases in brick-and-mortar grocery retailing. Repeat purchases in this setting are difficult to predict because the setting is characterized, first, by unobserved drop-out behavior and, second, by high customer heterogeneity in the frequency of visits. In our dataset, we observe a large group of customers with an extreme number of visits at short intervals, which could to some extent violate the purchase behavior assumptions of the BG/NBD model. Our results provide grocery retail managers with insights into the effects that product defects and the related product returns have on future sales. One major focus of this study is whether customers who return more products have a lower number of frequent repeat purchases.

The purpose of this study is to find the dimensions of customer behavior that best predict the number of frequent repeat purchases in brick-and-mortar grocery retailing. The study follows a research design comparable to that of Reinartz & Kumar (2000). In their study, the Pareto/NBD model is first used to predict whether customers of an online catalog company are still active in the future (measured as the customer lifetime value). Thereafter, it tests the relationship between customer profitability and the customer lifetime value. Instead of using the Pareto/NBD model to predict customer lifetime value, the present study applies the BG/NBD model to predict the number of frequent repeat purchases. It tests the relationship between past purchase behavior, as well as product return behavior, and repeat purchase behavior in order to find the strongest predictor. The following research question has been answered:

Which dimensions of customer behavior in a brick-and-mortar retailing setting predict the repeat purchases the best?

The dimensions of customer behavior are past purchase behavior and product return behavior. Past purchase behavior contains the variables: the number of products purchased, the average number of products purchased per visit, the amount of money spent and the average amount of money spent per visit; and product return behavior uses the variables: the number of products returned, the average number of products returned per visit, the amount of money back (due to product return) and the average amount of money back (due to product return) per visit1. The main research question has been answered using the following 8 sub-questions:

Q1. Do customers who have purchased a larger number of products during a fixed period of time have a higher number of frequent repeat purchases?

Q2. Do customers who have spent a larger amount of money during a fixed period of time have a higher number of frequent repeat purchases?

Q3. Do customers who have returned a larger number of products during a fixed period of time have a higher number of frequent repeat purchases?

Q4. Do customers who have got more money back (due to product returns) during a fixed period of time have a higher number of frequent repeat purchases?

Q5. Do customers who have purchased on average a larger number of products per visit have a higher number of frequent repeat purchases?

Q6. Do customers who have spent on average more money per visit have a higher number of frequent repeat purchases?

Q7. Do customers who have returned on average a larger number of products per visit have a higher number of frequent repeat purchases?

Q8. Do customers who have got on average a larger amount of money back (due to product returns) per visit have a higher number of frequent repeat purchases?

1 Product return behavior is the reverse of prior purchases. In other words, the purchased products are handed in along with the receipt of purchase, and in return customers get their money back. In the following, “number of products returned” and “amount of money back (due to product return)” are used to describe the reverse of prior purchases.
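The eight behavioral variables behind sub-questions Q1 to Q8 can be illustrated with a minimal Python sketch. It is not the syntax used in this study (that is listed in the appendix). The function name `behavior_variables` and the row layout are hypothetical, and the sketch assumes that a return is recorded as a row with a negative quantity and a negative amount.

```python
def behavior_variables(rows):
    """Summarize one customer's transactions in the estimation period.

    rows: list of (visit_date, quantity, amount) tuples.
    Assumption of this sketch: a product return is stored as a row with
    negative quantity and negative amount, and every distinct date on
    which the customer has at least one row counts as one visit.
    """
    if not rows:
        raise ValueError("at least one transaction row is required")
    n_visits = len({visit_date for visit_date, _, _ in rows})
    totals = {
        "products_purchased": sum(q for _, q, _ in rows if q > 0),
        "money_spent": sum(a for _, _, a in rows if a > 0),
        "products_returned": sum(-q for _, q, _ in rows if q < 0),
        "money_back": sum(-a for _, _, a in rows if a < 0),
    }
    # Per-visit averages: each total divided by the number of visits.
    averages = {f"avg_{name}_per_visit": value / n_visits
                for name, value in totals.items()}
    return {**totals, **averages}


example = behavior_variables([
    ("2012-03-02", 3, 6.0),    # three products bought
    ("2012-03-02", -1, -2.0),  # one product returned on the same visit
    ("2012-03-09", 1, 4.0),    # one product bought a week later
])
```

The four totals correspond to Q1 to Q4 and the four per-visit averages to Q5 to Q8.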

Figure 1 below presents the conceptual framework of the relationship between past purchase behavior, product return behavior and repeat purchase behavior. The theoretical explanation for the positive relationship between past purchase behavior and repeat purchase behavior holds that customers who purchased more products, as well as customers who spent more money, are the customers who visited the store more frequently. In turn, customers who visited the store more frequently have a stronger relationship with the grocery retailer and are therefore more likely to continue their purchase habits in the future. The study uses the same explanation for the negative relationship between the average number of products purchased, as well as the average amount of money spent, and the number of frequent repeat purchases. It assumes that customers who purchased on average fewer products per visit and spent on average less money per visit are the customers who visited the store more frequently. They visited the store more frequently because they spread their grocery expenses across multiple visits instead of buying all their groceries in just a few visits. In line with this reasoning, the study expects that customers who purchased on average fewer products and spent on average less money tend to have a stronger relationship with the retailer and to continue their purchase habits in the future, resulting in a higher number of frequent repeat purchases.


The expectancy-disconfirmation model has been used for the theoretical explanation of the negative relationship between product return behavior and repeat purchase behavior. Based on this model, the study expects that returning a product and getting money back (due to product return) leads to a negative experience with the grocery retailer, because the purchased product did not meet the expectations that the customer had about this product. Customers who returned more products and got more money back are more likely to be disgusted with the grocery retailer (due to the negative experiences with the products), and therefore make fewer future purchases. The study expects the same negative relationship between the average number of products returned, as well as the average amount of money back, and the number of frequent repeat purchases.

Figure 1: Conceptual framework of the relationship of past purchase behavior, product return behavior and repeat purchase behavior

The study uses transactional data on 4,170 customers of an American brick-and-mortar grocery retail chain. The timespan of the data is from March 2, 2012 through March 30, 2013. The transactions of the first 28 weeks have been used to estimate the BG/NBD model and to measure past purchase behavior and product return behavior. The study conducts a correlation analysis to compare the predictions of the BG/NBD model with the actual repeat purchases of the last 28 weeks and thereby tests how well the model performs. Furthermore, the study conducts a hierarchical multiple regression analysis to test the relationship between the different dimensions of customer behavior and repeat purchase behavior.
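The split into an estimation period and a validation period, followed by a correlation check, can be sketched as below. This is an illustration under stated assumptions, not the syntax used in this study. The helper names `split_transactions` and `pearson_r` are hypothetical, and the predicted and actual values are made-up numbers.

```python
import math
from datetime import date, timedelta


def split_transactions(dates, start, calibration_weeks=28):
    """Split purchase dates into an estimation part and a validation part."""
    cutoff = start + timedelta(weeks=calibration_weeks)
    calibration = [d for d in dates if d < cutoff]
    holdout = [d for d in dates if d >= cutoff]
    return calibration, holdout


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical example: one customer's purchase dates.
calibration, holdout = split_transactions(
    [date(2012, 3, 2), date(2012, 9, 13), date(2012, 9, 14), date(2013, 3, 30)],
    start=date(2012, 3, 2),
)

# Hypothetical predicted and actual repeat purchases for four customers.
predicted = [1.2, 3.4, 0.5, 7.9]
actual = [1, 4, 0, 6]
r = pearson_r(predicted, actual)
```

A correlation close to 1 would indicate that the model ranks customers in the validation period much as their actual purchases do; it says nothing about whether the predicted levels are correct.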


The structure of the chapters is as follows. Chapter 2 gives an overview of the literature that focuses on the prediction of repeat purchase behavior. Chapter 3 discusses the theoretical background, which has been used to formulate 8 different hypotheses. Chapter 4 presents the research design of this study. Chapter 5 reports the results of the correlation analysis and the hierarchical multiple regression analysis. Chapter 6 discusses the theoretical background, the research design and the findings of the study. Lastly, chapter 7 presents the conclusion by reviewing the main results and findings of this study. Additional tables, figures and the syntax can be found in the appendix.


II. Literature review

With the arrival of the digital age, the business environment and the way businesses interact with their customers have changed dramatically. Developments in database technology have made it possible for businesses to capture large amounts of customer data, and this data is in turn used to gain customer insights. The concepts of relationship equity, customer retention and customer repeat purchases have been widely discussed within the marketing literature (Slater & Narver, 1998; Vesanen & Raulas, 2006; Arons, van den Driest, & Weed, 2014; Avery, Fournier, & Wittenbraker, 2014). Further, the research agenda of the Marketing Science Institute for 2014-2016 prioritizes developing marketing analytics for a data-rich environment and getting deep customer insight. In response, academics and practitioners have paid enormous attention to predicting repeat purchase behavior, applying advanced statistical models to past purchase behavior.

In the literature that focuses on predicting repeat purchase behavior, theoretical work mostly takes a data-driven approach, and its focus is largely on the application of several techniques to customers’ purchases. Repeat purchases have been the object of research for many years. Starting in the 1950s, mass marketing techniques based on mail orders, such as catalogs, were used to communicate with customers and to collect data on them. This changed dramatically after the 1960s, when computers were introduced and marketers started using customer loyalty cards to collect extensive amounts of customer data. Prepackaged statistical programs such as SAS and SPSS have also allowed marketers to analyze the customer data and build models of customer behavior (Petrison, Blattberg, & Wang, 1997; Stone & Shaw, 1988).

To describe and predict repeat purchases of customers in a non-contractual setting, Schmittlein, Morrison, & Colombo (1987) developed the Pareto/NBD model. The setting is characterized by unobservable dropout behavior and high customer heterogeneity in the number of purchases. “The model assumes that customers buy at a steady rate for a certain period of time and then become inactive” (Fader et al., 2005). It uses two different levels to estimate the parameters of the “purchase rate” and the “dropout rate”. The number of repeat purchases is modeled using the NBD (negative binomial distribution, a Poisson-gamma mixture counting model), and customer dropout is modeled using the Pareto (exponential-gamma mixture) model. To make predictions on repeat purchases, the model requires transaction history information on the number of past purchases and the recency of the last purchase.
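The transaction history summary that such models take as input can be sketched as follows. This is a minimal illustration rather than the procedure of the cited papers. It assumes the usual summary per customer of frequency `x` (number of repeat purchases), recency `t_x` (time of the last purchase, measured from the first purchase) and the length `T` of the observation window since the first purchase. The function name `summarize_customers` is hypothetical, and treating each distinct purchase day as one purchase is an assumption of the sketch.

```python
from datetime import date, timedelta


def summarize_customers(transactions, start, end):
    """Return {customer_id: (x, t_x, T)} with time measured in weeks.

    transactions: iterable of (customer_id, purchase_date) pairs.
    Assumption: several purchases on the same day count as one purchase.
    """
    purchase_days = {}
    for customer_id, day in transactions:
        if start <= day <= end:
            purchase_days.setdefault(customer_id, set()).add(day)

    summary = {}
    for customer_id, days in purchase_days.items():
        first, last = min(days), max(days)
        x = len(days) - 1                 # repeat purchases after the first
        t_x = (last - first).days / 7     # recency of the last purchase
        T = (end - first).days / 7        # time observed since first purchase
        summary[customer_id] = (x, t_x, T)
    return summary


start = date(2012, 3, 2)
end = start + timedelta(weeks=28)
summary = summarize_customers(
    [("A", date(2012, 3, 2)), ("A", date(2012, 3, 9)), ("A", date(2012, 3, 23)),
     ("B", date(2012, 4, 6))],
    start, end,
)
```

In this made-up example, customer A has two repeat purchases, the last one three weeks after the first, and customer B has none.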


Since its development, several studies have shown its strength in predicting repeat purchases (Fader et al., 2005; Reinartz & Kumar, 2000). Wübben & Wangenheim (2008) compare different techniques to provide a better understanding of their application. In their study, the effectiveness of a simple heuristic model is tested against two stochastic models, the BG/NBD and the Pareto/NBD. Results show that the stochastic models perform better than the heuristic model in predicting repeat purchases.

However, despite the wide interest of academics in modelling customer behavior, marketers have failed to adopt the Pareto/NBD model due to its complicated estimation procedure, which involves numerous evaluations of the Gaussian hypergeometric function. Verhoef, Spring, Hoekstra & Leeflang (2003) examine the usage of statistical models in businesses and find that most businesses still use heuristic methods such as cross-tabulation and the RFM model for predictive analysis instead of more advanced methods like the Pareto/NBD model. The authors show the importance of fit between business practice and academic research, and argue that researchers should consider the applicability of new techniques (Verhoef, Spring, Hoekstra, & Leeflang, 2003).

In response, academics have tried to develop statistical techniques which are faster and easier to implement. Fader & Hardie (2001) use transaction data from an online context of CD purchases to predict future transactions and sales. The authors use a simplified stochastic model, which can be implemented using spreadsheet software. The study shows how past purchases can be used to predict future sales (Fader & Hardie, 2001).

A few years later, Fader et al. (2005) developed the BG/NBD model. The model is almost identical to the Pareto/NBD model, except that it assumes that customer dropout occurs only directly after a purchase. Instead of a Pareto (exponential-gamma mixture) model, it uses a beta-geometric model for dropout. Due to this slight variation, the model can be implemented much faster and more easily. Whereas the Pareto/NBD model needs advanced computation software like MATLAB to estimate its parameters, the BG/NBD model can be implemented with spreadsheet software and is therefore more accessible to businesses. Results show that the predictions of the Pareto/NBD and the BG/NBD are almost the same (Fader et al., 2005; Wübben & Wangenheim, 2008).
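To illustrate why the BG/NBD model is simple to implement, the sketch below evaluates the log-likelihood of a single customer in the closed form given by Fader et al. (2005), as far as we can reproduce it here: a term for a customer who is still active at the end of the window, plus, for customers with at least one repeat purchase, a term for a customer who dropped out right after the last purchase. The function names are hypothetical, the parameter names `r`, `alpha`, `a`, `b` follow the usual notation for the gamma and beta mixing distributions, and the inputs `x`, `t_x`, `T` are frequency, recency and window length. Parameter estimation (maximizing the sum over customers) is not shown.

```python
import math


def log_beta(p, q):
    """Logarithm of the beta function B(p, q)."""
    return math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)


def bgnbd_log_likelihood(r, alpha, a, b, x, t_x, T):
    """BG/NBD log-likelihood for one customer (sketch, not fitted code)."""
    # Part shared by both terms: gamma-mixed Poisson purchasing.
    common = math.lgamma(r + x) - math.lgamma(r) + r * math.log(alpha)
    # Customer survived all x dropout opportunities and is active at T.
    ll_active = (log_beta(a, b + x) - log_beta(a, b)
                 + common - (r + x) * math.log(alpha + T))
    if x == 0:
        # Without a repeat purchase there was no opportunity to drop out.
        return ll_active
    # Customer dropped out immediately after the x-th repeat purchase at t_x.
    ll_dropped = (log_beta(a + 1, b + x - 1) - log_beta(a, b)
                  + common - (r + x) * math.log(alpha + t_x))
    # Numerically stable log of the sum of the two terms.
    m = max(ll_active, ll_dropped)
    return m + math.log(math.exp(ll_active - m) + math.exp(ll_dropped - m))


# Hypothetical parameter values, for illustration only.
ll = bgnbd_log_likelihood(r=0.5, alpha=2.0, a=1.0, b=3.0, x=3, t_x=5.0, T=10.0)
```

For a customer without repeat purchases the expression reduces to `r * log(alpha / (alpha + T))`, which only requires a logarithm; this is the kind of simplicity that makes a spreadsheet implementation feasible.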

In testing the relationship between past purchase behavior and future customer purchasing, academic studies find that the amount of money customers spent forms a good [...] find a positive relationship between customer profitability and customer lifetime value. Further, Cheng & Chen (2009) use an RFM model to predict repeat purchases. Again, the results show that the amount of money customers spent is a good predictor for repeat purchases.

This study extends previous studies on the relationship between past purchase behavior and repeat purchases by including product return behavior. Literature on product return behavior focuses mostly on ready-made garments and the related clothing industry, and much less on brick-and-mortar grocery retailing. This is because it is quite common to return clothes, whereas grocery products are only returned when the quality of the purchased product is insufficient. Related studies in this setting focus mostly on the relationship between satisfaction and customer return (Anderson & Mittal, 2000; Verhoef, 2003). The relationship between product return behavior and future purchases has not been studied using transaction data. Moreover, we find limited application of the BG/NBD model in the brick-and-mortar grocery retail setting. This study therefore validates the BG/NBD model in a brick-and-mortar grocery retail setting, and tests the relationship between past purchase behavior, including product return behavior, and repeat purchase behavior. The following research question is answered:

Which dimensions of customer behavior in a brick-and-mortar retailing setting predict the repeat purchases the best?

Academic and managerial contributions

By predicting future customer purchases in brick-and-mortar grocery retailing, the study adds knowledge about the empirical validation of the BG/NBD model. As mentioned, the dataset contains a large group of customers with an extreme number of visits at short intervals, which could violate certain purchase behavior assumptions of the model. By validating the model, the study establishes whether the model performs well enough on this particular dataset. Further, its empirical validation can be compared with previous studies that validate the BG/NBD model in a different setting.

Fader et al. (2005) use transaction data on customers of an online CD company and find a correlation of r = 0.626 (p = 0.000) between the predicted repeat purchases and the actual repeat purchases. To draw further conclusions about whether such a correlation indicates good predictive performance, the two studies are compared with each other. Managers from different settings can use this study to decide whether to apply the model.


Moreover, the study extends previous studies on past purchase behavior by including product return behavior. Product return behavior has rarely been studied in a brick-and-mortar retail setting. The results of this study provide various insights for academics and managers on the importance of product returns. To start, the results give managers insights into the amount of product returns in brick-and-mortar grocery retailing. Using a large amount of data, indications can be given of how many products customers return relative to the number of products they purchase.

Further, the study tests the relationship between past purchases, as well as product return behavior, and repeat purchase behavior. It thereby adds knowledge on the different dimensions of customer behavior that predict future purchases. Moreover, the results provide grocery retail managers with insight into the effects that product defects and the related product returns have on future sales. The results could help managers to think about the implications of product defects, and they can be used to adjust current product return policies.

Lastly, basic probabilistic customer-base analysis can be improved with the results of this study. The current BG/NBD model uses only the purchase history, namely the frequency of purchases and the recency of the last purchase. However, the model can be improved when covariates are layered into it. Given our purpose of finding predictors of repeat purchases, the dimensions that best predict repeat purchases can be used to improve the model. For instance, a covariate for product return behavior could be used to improve the predictions of the BG/NBD model.


III. Theoretical background

Theories on past purchase behavior and product return behavior

This section presents the theories on the relationship between past purchase behavior, as well as product return behavior, and future repeat purchases in brick-and-mortar grocery retailing. It first discusses the relationship between past purchase behavior and repeat purchases, and thereafter the relationship between product return behavior and repeat purchases.

Past purchase behavior. Starting with the former, academic studies indicate that past purchase behavior forms a good predictor for repeat purchase behavior (Dwyer et al., 1987; Reichheld & Teal, 2001; Reinartz & Kumar, 2003; Cheng & Chen, 2009). This is because customers who visited the store more frequently have a stronger relationship with the store, and customers who have a stronger relationship with the store are likely to maintain their repeat purchase habits in the following period. In turn, the number of visits is associated with the number of products purchased and the amount of money spent. Customers who visited the store more frequently also purchased more products and spent more money at this store. They have a stronger relationship with the store than customers who visited the store less frequently, and therefore make more future purchases. Therefore, the study expects that:

H1: Customers who have purchased a larger number of products during a fixed period of time have a higher number of frequent repeat purchases.

H2: Customers who have spent a larger amount of money during a fixed period of time have a higher number of frequent repeat purchases2.

2 The numbers of the hypotheses correspond to the numbers of the sub-questions. H1 (the relationship between products purchased and repeat purchases) corresponds to Q1, H2 (the relationship between money spent and repeat purchases) corresponds to Q2, etc.

Furthermore, the number of frequent visits and repeat purchases is associated with the average number of products purchased and the average amount of money spent per visit. Customers who visited the store more frequently are the customers who purchased fewer products and spent less money per visit. They spread their grocery expenses across multiple visits instead of buying all their groceries in one visit, and therefore visited the store more frequently. In addition, they have a stronger relationship with the store and are likely to continue their purchase habits in the future. This study expects that:

H5: Customers who purchased on average fewer products per visit have a higher number of frequent repeat purchases.

H6: Customers who spent on average less money per visit have a higher number of frequent repeat purchases.

Product return behavior. If the quality of a product is insufficient, customers can return the product(s) and get their money back. This study focuses on both types of product return behavior: the number of products returned and the amount of money back (due to product return). Since product return behavior has rarely been studied for brick-and-mortar grocery retail chains, the study proposes three different lines of reasoning on the relationship between product return behavior and repeat purchase behavior.

The first stream of reasoning holds that the product return behavior of customers in brick-and-mortar grocery retailing is irrelevant for predicting the number of frequent repeat purchases. To start, its relationship with repeat purchases is irrelevant because returning products in brick-and-mortar grocery retailing rarely takes place3. In brick-and-mortar grocery retailing, customers return a purchased product when it is broken or its quality is insufficient. The product return policies are stricter than in the clothing industry, where firms offer generous return periods of 10 to 30 days after the purchase. Because customers rarely return a product, the relationship between product return behavior and repeat purchases is irrelevant.

Secondly, if there is a relationship between the number of products customers returned and the number of frequent repeat purchases, this relationship is spurious. To return a product, a customer first needs to purchase that product. Customers who purchase more products are more likely to return a product (compared to customers who purchase fewer products) because they are more likely to buy a defective product. The relationship between the number of products returned and the number of frequent repeat purchases is therefore a logical outcome of the number of products that the customer purchases.
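The spuriousness argument can be made concrete with a first-order partial correlation, which removes the part of an association that runs through a third variable. This is a simpler stand-in for the hierarchical multiple regression used in this study, not a replacement for it, and the three correlation values below are made-up numbers chosen for illustration.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y after controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical correlations:
#   x = products returned, y = repeat purchases, z = products purchased.
r_returns_repeat = 0.48
r_returns_purchased = 0.80
r_repeat_purchased = 0.60

controlled = partial_correlation(
    r_returns_repeat, r_returns_purchased, r_repeat_purchased
)
```

With these illustrative values, the raw correlation of 0.48 between returns and repeat purchases drops to approximately zero once the number of products purchased is held constant, which is the pattern a spurious relationship would produce.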

Thirdly, if product return behavior influences customers’ satisfaction with the brick-and-mortar grocery retailer, this will not lead to dropout behavior. In studying the relationship between customer satisfaction and customer switching behavior, Anderson & Mittal (2000) find that dissatisfaction doesn’t necessarily lead to dropout behavior. Customers can be dissatisfied but still continue shopping at the same store and, vice versa, be satisfied but switch to a competitor. This shows that purchase habits are more important than satisfaction with the grocery retailer, and therefore the affective changes related to product return behavior are irrelevant for future purchases (Dick & Basu, 1994; Keaveney, 1995). Therefore, in line with this argumentation, the null hypotheses expect that:

H03: the number of frequent repeat purchases doesn’t differ if customers have returned more or fewer products during a fixed period

H04: the number of frequent repeat purchases doesn’t differ if customers have got more or less money back (due to returning a product) during a fixed period

In contrast, according to the expectancy-disconfirmation model, customers hold certain expectations about products, and if these expectations are not met, the likelihood of customer disgust increases (Oliver, 1980; Alexander, 2012). Returning a product is associated with customer disgust when the initial expectations about the product are not met by the outcome of the product. Disgust is a function of negative affect (grief) plus a negative surprise (Alexander, 2012). Returning a product causes negative affect or negative surprise because the outcome of the product doesn’t meet the initial expectation of the product. For instance, when the date of the milk has expired, the customer experiences grief because he has to return the product and cannot drink it right away. The outcome of the milk didn’t meet his expectations. Customers who returned more products and got more money back are more likely to be disgusted because they have had more negative experiences with the grocery retailer. The alternative hypotheses therefore expect that:

Ha3: the more products customers returned during a fixed period of time, the lower the number of frequent repeat purchases

Ha4: the more money customers got back (due to returning a product) during a fixed period of time, the lower the number of frequent repeat purchases

As mentioned, product returns rarely take place in brick-and-mortar grocery retailing. The study applies the same competing expectations to the relationship between the average product return behavior per visit and the number of frequent repeat purchases, namely that:

H07: the number of frequent repeat purchases doesn’t differ if customers returned on average more or fewer products per visit.

H08: the number of frequent repeat purchases doesn’t differ if customers got on average more or less money back (due to returning a product) per visit.

Ha7: customers who returned on average more products per visit have fewer frequent repeat purchases

Ha8: customers who got on average more money back per visit (due to returning a product) have fewer frequent repeat purchases

Alternatively, the third line of reasoning holds that product return behavior leads to delight instead of disgust. According to Alexander (2012), delight is a function of the positive affect ‘joy’ and positive surprise. Customer expectations are formed by previous experiences together with social norms. In brick-and-mortar grocery retailing, the rarity of product returns lowers customer expectations about returning a product. If customers get their money back (due to returning a product), this outcome exceeds their initial expectations about returning the product, leading to a positive surprise (Alexander, 2012). Customers who returned products are thus assumed to enjoy a positive surprise and to be more delighted than those who don’t return products. The third stream of reasoning therefore expects that higher product return behavior leads to higher repeat purchase behavior.

Since a product return is mostly the result of insufficient quality, it is not likely that customers will have a positive experience when returning their products. Therefore, the study formulates no hypothesis based on the third line of reasoning.


IV. Research design

Data

To answer the research question, this study uses the “Acquire Valued Shoppers” (AVS) challenge data from Kaggle4. Kaggle is an open source platform that links data problems to data solutions. Users of the platform come from all over the world. They form the largest community of data scientists, consisting of tens of thousands of PhDs in quantitative fields (e.g. computer science, statistics, econometrics, math and physics). The data is publicly available to all scientists for the purpose of the competition5. The scientists who find the best solution to a complex data science problem receive a predetermined amount of prize money. In return, the company with the data problem (or the sponsor) pays a fee. Academics can work in teams and use forums to share issues and results6.

The AVS data was collected by 134 brick-and-mortar grocery retailers located in 34 different geographical regions. In total, 350 million transactions of 311.541 customers were recorded. Each store chain recorded all transactions of each shopping cart during a period of coupon promotion. Customers who redeemed the coupon offer were selected for the data. All purchase information on customers and products was anonymized to protect customer and sales information. Names of customers, brands, companies and store chains were replaced by unique identification numbers that correspond to the names. The original purpose of the challenge was to find the best way of predicting which shoppers would become repeat buyers of the product in the coupon offer. However, the transaction information can be used for different research purposes. This study uses the transaction information to measure how many products customers purchased or returned, and to measure the amount of money customers spent or got back (due to product return). It then aims to find the dimensions of customer behavior that best predict the number of frequent repeat purchases. Table 1 describes the key variables of the initial dataset, which have been used to construct the different dimensions of customer behavior.

4 See https://www.kaggle.com/c/acquire-valued-shoppers-challenge/data for information on the data and to find the data.

5 Kaggle stated that it is not responsible for the credibility of the data. To increase validity, considerable effort was put into cleaning the data before using it for analysis (Saunders, Lewis, & Thornhill, 2009, pp. 325-331). Further, inaccurate records have been reported in the appendix.

6See the Kaggle forums https://www.kaggle.com/c/acquire-valued-shoppers-challenge/forums for appropriate


To answer the research question, the grocery retailer with the highest number of customers has been selected. Since all purchase information has been anonymized, there is no additional information on this grocery retailer except the overall statistics of customer purchases7. The retailer has transaction information on 32.640 customers, and the timespan of their transactions ranges from March 2, 2012 through July 23, 2013. The store offers a wide selection of products from over 25.000 different brands. Most of the products sold are priced between $0,50 and $5, but products priced at $10 or higher are also sold. Remarkably, the retailer collected an extensive amount of data over a period of almost one and a half years. During this period the shopping cart information of over 3 million visits, with an average of around 92 visits per customer, has been traced. It is assumed that the company uses membership cards and that customers need to scan their card each time they visit the store. From the initial 32.640 customers, a sample of 4.208 customers (with 5.669.001 transactions) has been used. The final dataset contains all customers who made their “first purchase” during March and April 2012. The 38 customers who made their first purchase after April 30, 2012 have been removed because they belong to a different customer cohort. The remaining sample of this study contains 4.170 customers8. Further, the timespan of the transactions has been adjusted from “March 2, 2012 through July 23, 2013” to “March 2, 2012 through March 30, 2013” because most of the transactions after March 30, 2013 were not traced9. The adjusted dataset has been used to create two time periods of 28 weeks: the calibration period and the validation period. The calibration period ranges from March 2, 2012 through September 14, 2012 and the validation period from September 15, 2012 through March 30, 2013. The variables of past purchase behavior and product return behavior have been calculated from the calibration period. The past purchase behavior has been used to construct the BG/NBD model and predict the number of frequent repeat purchases. The actual repeat purchases have been calculated from the validation period.
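The split into the two 28-week periods can be sketched as follows. This is a minimal illustration, not the study's own syntax: the function names are hypothetical, and the validation period is assumed to end on March 30, 2013, in line with the adjusted timespan.

```python
from datetime import date

# Period boundaries as described in the text (assumption: the validation
# period ends on March 30, 2013, matching the adjusted timespan).
CALIBRATION_START = date(2012, 3, 2)
CALIBRATION_END = date(2012, 9, 14)
VALIDATION_START = date(2012, 9, 15)
VALIDATION_END = date(2013, 3, 30)


def assign_period(transaction_date):
    """Return 'calibration', 'validation', or None for dates outside both periods."""
    if CALIBRATION_START <= transaction_date <= CALIBRATION_END:
        return "calibration"
    if VALIDATION_START <= transaction_date <= VALIDATION_END:
        return "validation"
    return None


def period_length_in_weeks(start, end):
    """Length of a period in weeks, counting both boundary days."""
    return ((end - start).days + 1) / 7
```

Both periods come out at 197 days when the boundary days are included, which is 28 weeks after rounding.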

7 See Table in the appendix for additional statistics on the grocery retailer.

8 The selection of customers from the same cohort is based on the study of Fader et al. (2005). The authors select the customers who made their first purchase during the first quarter, whereby the full dataset consists of five quarters. This study selects the customers who made their first purchase during the first two months, whereby the full dataset covers 13 months.

9 See Figures in the appendix for the customer activity from March 2, 2012 through July 23, 2013. The customer activity measures the number of visits per day at the studied grocery retailer. On average, between 900 and 1200 customers visit the grocery retailer per day. From March 30, 2013 the customer activity decreases every day. The same decrease has been found for other grocery retailers in the Acquire Valued Shoppers dataset. The study expects that the grocery retailers did not provide the full transaction data from March 30, 2013 onwards and therefore adjusts the timespan.


To assess the applicability of the data, the study discusses the advantages and disadvantages of the final dataset. Starting with the advantages, the dataset is suitable for empirical validation of the BG/NBD model. The calibration period can be used to build the model and the validation period can be used to test how well the model performs with respect to the actual number of frequent repeat purchases.

Furthermore, the key variables of the initial dataset (see Table 1) can be used to construct the different dimensions of customer behavior and to measure repeat purchases. The transactions contain not only information on the number of products purchased (purchase_quantity positive) and the amount of money spent (purchase_amount positive) but also information on the number of products returned (purchase_quantity negative) and the amount of money customers got back due to returning the products (purchase_amount negative). Therefore, the transaction information can be used to measure different dimensions of past purchase behavior as well as product return behavior. In addition, the BG/NBD model requires information on the number of purchases (frequency of buying) and the recency of the last purchase (recency). The date of transaction (date) can be used to calculate both the frequency of buying and the recency.

On the other hand, the data also has multiple disadvantages. To start, there is no description of the brick-and-mortar grocery retailer, and whether it concerns a non-contractual or a contractual setting remains unknown. As mentioned, Kaggle anonymized the transaction data and therefore it is unknown what products the grocery retailer sold. Further, it is also unknown how the company collected this large amount of transaction data. The study assumes that the retailer uses some kind of membership card, whereby each time a customer visits the grocery retailer he uses the card to register his purchases. Yet this is an assumption that cannot be verified. Following from this assumption, it remains unknown whether the customers have a contract with the retailer or not. If the customers have a contract and pay a membership fee, the results only apply to brick-and-mortar grocery retailers in a contractual setting. Academic studies find differences in customer behavior between contractual and non-contractual settings (Tsai, Huang, Jaw, & Chen, 2006; Woisetschläger, Lentz, & Evanschitzky, 2011)10. Thus, the results of this study could have been influenced by the setting of the data.


Further, there could be a selection bias in the data. As mentioned, the grocery retailers collected the data on customers who were selected for a coupon promotion and who redeemed the coupon. No additional information is given on how the customers were selected. It could be that customers were selected because of previous purchases. In this case the data is biased towards customers with a certain previous purchase behavior. Moreover, the study also doesn’t know if customers who did redeem the coupon differ in behavior from customers who didn’t redeem the coupon. If so, the purchase behaviors in the data are biased as well.

Finally, the data contains some systematic measurement errors. As mentioned, the timespan of the final dataset has been adjusted because the data stops tracing most of the transactions after March 30, 2013. The transactions after March 2013 can therefore not be used for this study. Further, a descriptive analysis of the key variables of Table 1 shows that 0,34% of all transactions miss information on the number of products purchased (purchase_quantity positive), the amount of money spent (purchase_amount positive), the number of products returned (purchase_quantity negative) and the amount of money back (purchase_amount negative). The study expects that the missing information did not influence the results. According to the statement on Kaggle’s website, it is common to find some noise in real-world data (A Note On Data Quality, 2013). Furthermore, the study finds that the noise is spread across different customers and not concentrated in the transactions of only a few customers.

Table 1: Key variables of original dataset

Variable                     Description                                      Range
id                           Unique number representing the customer          [1 – 32.640]
chain                        Unique number representing the store chain       [21]
date                         Date of transaction                              [2012-03-02 – 2013-07-30]
purchase_quantity positive   Number of products purchased per transaction     [0 – 95]
purchase_quantity negative   Number of products returned per transaction      [0 – 50]
purchase_amount positive     Amount of money spent per transaction            [0 – 1.200]
purchase_amount negative     Amount of money returned per transaction         [0 – 100]

Considering the advantages and disadvantages, the study concludes that the dataset is suitable for answering the research question. Although there is no additional information on the grocery retailer and the data contains some systematic measurement errors, the data can be used to measure different dimensions of customer behavior and repeat purchase behavior, and to statistically test the relationship between past purchase behavior as well as product return behavior and repeat purchase behavior. In this way the study can find the best predictor of repeat purchases and answer the research question.

Key variables

The key variables have been computed using the spreadsheet software Excel. In this section the computation of the variables is discussed along with some descriptive statistics of the variables. Table 2 gives the definitions of the key variables and Table 3 provides its range and distribution. For a detailed description of the formulas that have been used to compute the key variables, see the variable computation in the appendix.

Past purchase behavior and product return behavior. As discussed, the initial dataset contains information at the transaction level. For the purpose of this research, the transaction-level data has been transformed into customer-level data. Firstly, the transactions that occurred on the same day have been added up to calculate the total sum of purchases per visit for each customer. The study uses “purchase_quantity positive” and “purchase_quantity negative” to calculate how many products the customers purchased per visit and how many products the customers returned per visit. Further, “purchase_amount positive” and “purchase_amount negative” have been used to calculate the amount of money the customers spent per visit and the amount of money the customers got back (due to product return) per visit. Secondly, the study uses the sum of all purchases during the first 28 weeks to calculate “the number of products purchased”, “the amount of money spent”, “the number of products returned” and “the amount of money back”. Next, the study calculated the per-visit average of the four variables to compute “the average number of products purchased”, “the average amount of money spent”, “the average number of products returned” and “the average amount of money back”.
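The two aggregation steps described above (transactions to visits, visits to customer totals and per-visit averages) can be sketched as below. The variables in this study were computed in Excel; this Python version is only an illustration. The function name, the tuple layout, and the coding of returns as negative quantities and amounts are assumptions.

```python
from collections import defaultdict


def aggregate_customer(transactions):
    """Aggregate one customer's calibration-period transactions.

    `transactions` is a list of (visit_date, quantity, amount) tuples.
    Assumption: negative quantities and amounts denote returned products
    and money given back.
    """
    visits = defaultdict(lambda: {"bought": 0, "returned": 0, "spent": 0.0, "back": 0.0})
    for visit_date, quantity, amount in transactions:
        visit = visits[visit_date]  # all transactions on one day form one visit
        if quantity >= 0:
            visit["bought"] += quantity
        else:
            visit["returned"] += -quantity
        if amount >= 0:
            visit["spent"] += amount
        else:
            visit["back"] += -amount

    if not visits:
        raise ValueError("customer has no transactions")

    n_visits = len(visits)
    keys = ("bought", "returned", "spent", "back")
    totals = {key: sum(v[key] for v in visits.values()) for key in keys}
    # Averages are taken per visit, not per transaction.
    averages = {"avg_" + key: value / n_visits for key, value in totals.items()}
    return {"visits": n_visits, **totals, **averages}
```

Averaging over visits rather than over transactions follows the text, where same-day transactions are first summed into one visit.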

Predicted repeat purchases and repeat purchases. The BG/NBD model has been used to predict the number of frequent repeat purchases. The model requires two types of information, namely the number of purchases and the recency of the last purchase. Again, the transactions that occurred on the same day were added up. Next, the study calculated how many purchases each customer made during the first 28 weeks. This variable is called the “frequency of buying”. Recency is the week number of the last purchase during the first 28 weeks. To calculate recency, the number of days from the first purchase through the last purchase is divided by seven (the number of days in a week). Lastly, to obtain the number of frequent repeat purchases, the study calculates how many purchases each customer made during the last 28 weeks.
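A minimal sketch of the frequency and recency computation for one customer, following the description above. The function name and input format are hypothetical; the input is assumed to contain only dates from the calibration period.

```python
from datetime import date


def frequency_and_recency(purchase_dates):
    """Return (frequency, recency_in_weeks) for one customer.

    `purchase_dates` holds the dates of the customer's transactions in the
    calibration period; several transactions on one day count as one purchase.
    """
    visit_days = sorted(set(purchase_dates))
    if not visit_days:
        raise ValueError("customer has no purchases in the calibration period")
    frequency = len(visit_days)
    # Days from the first through the last purchase, expressed in weeks.
    recency_weeks = (visit_days[-1] - visit_days[0]).days / 7
    return frequency, recency_weeks
```

A customer who first buys on March 2, 2012 and last buys on September 14, 2012 gets a recency of 28 weeks, the maximum reported for this variable.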

Table 2: Description of variables

Variable                      Description
products purchased            Total number of products purchased during the calibration period
products returned             Total number of products returned during the calibration period
average products purchased    Average number of products purchased per visit during the calibration period
average products returned     Average number of products returned per visit during the calibration period
money spent                   Total amount of money spent during the calibration period
money back                    Total amount of money back (due to product return) during the calibration period
average money spent           Average money spent per visit during the calibration period
average money back            Average money back per visit during the calibration period
frequency of buying           Number of purchases during the calibration period
recency                       The number of the week when the last purchase occurred
predicted repeat purchases    Number of expected repeat purchases during the validation period
repeat purchase               Number of repeat purchases during the validation period

Notes: See section 9.3.2 (appendix) for additional descriptions of how the variables were computed


Table 3: The range and distribution of the key variables

                                                Percentiles
Variable                      Range             25%       50%       75%
products purchased            [71 – 7.380]      1.174     1.695,5   2.340
products returned             [0 – 225]         2         7         18
average products purchased    [2,9 – 144,7]     15        21,2      39,3
average products returned     [0 – 3,82]        0,03      0,1       0,2
money spent                   [265 – 23.310]    3.109,2   4.568,9   8.422,5
money back                    [0 – 1.309]       3         11        26,2
average money spent           [8,59 – 323]      40,1      57,1      79,7
average money back            [0 – 13,5]        0,1       0,3       0,5
frequency of buying           [1 – 186]         27        37        54
recency                       [6,3 – 28]        26,7      27,1      27,7
period of purchase            [19,6 – 28]       27,3      27,7      27,9
predicted repeat purchases    [0 – 173]         29,24     38,5      53,4
repeat purchase               [4 – 220]         30        43        61

Note: values are rounded to one decimal place.

Method

The study uses a bivariate correlation analysis of the frequency of buying, predicted repeat purchases and repeat purchases to test how well the BG/NBD model performs. If the model performs well enough, the predicted repeat purchases are used; otherwise the study uses the actual repeat purchases of the last 28 weeks. Next, the relationship between the independent variables and the dependent variable is tested with a bivariate and a partial correlation analysis. Finally, the study conducts a hierarchical multiple regression analysis to answer the research question. The standardized partial regression coefficient β is used to compare the strength of each effect. Further, the study uses the explained variance statistic R² to see which dimension of customer behavior explains most of the variance in the number of frequent repeat purchases.
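For reference, the bivariate (Pearson) correlation used throughout the analysis can be computed as in the sketch below. This is a plain standard-library illustration with a hypothetical function name, not the statistical software used in the study, and it does not compute p-values.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variation_x = sum((a - mean_x) ** 2 for a in x)
    variation_y = sum((b - mean_y) ** 2 for b in y)
    # Raises ZeroDivisionError if either variable is constant.
    return covariation / sqrt(variation_x * variation_y)
```

Applied to two customer-level columns, for example frequency of buying and repeat purchases, it returns a value between -1 and 1.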


V. Results

To answer the research question, the study tests the relationship between past purchase behavior as well as product return behavior and the number of frequent repeat purchases. The section is divided into the preliminary analysis and the explanatory analysis. The preliminary analysis tests the predictive accuracy of the BG/NBD model and measures the bivariate correlations between the independent and dependent variables. The explanatory analysis uses partial correlation analyses and hierarchical multiple regression analyses to test which dimension of customer behavior predicts repeat purchases best. The regression analyses use the standardized (partial) regression coefficient β to compare the different effects, and the explained variance statistic R² to find the best predictor of repeat purchases.

Preliminary analysis

Firstly, to assess the predictive accuracy of the BG/NBD model, the bivariate correlations between frequency of buying, predicted repeat purchases and repeat purchases have been computed. If the model predicts well enough, the study uses the predicted repeat purchases to test their relationship with past purchase behavior and product return behavior. The results of the bivariate correlations between frequency of buying, predicted repeat purchases and repeat purchases are reported in Table 3. The study finds a high, positive correlation (r = 0,999, p = 0,000) between frequency of buying and predicted repeat purchases. This strong correlation shows that the number of frequent repeat purchases predicted by the BG/NBD model is almost identical to the frequency of buying. Surprisingly, the correlation between frequency of buying and repeat purchases is slightly larger than the correlation between predicted repeat purchases and repeat purchases, although both round to the same value: the correlation between frequency of buying and repeat purchases is (r = 0,864, p = 0,000), while the correlation between predicted repeat purchases and repeat purchases is (r = 0,864, p = 0,000). This analysis shows that the BG/NBD model performs slightly less well than the frequency of buying in predicting repeat purchases.


Table 3: Bivariate correlations of repeat purchases, predicted repeat purchases and frequency of buying

Bivariate correlations            (1)      (2)      (3)
(1) Repeat purchases              1
(2) Predicted repeat purchases    0,864    1
                                  (0,000)
(3) Frequency of buying           0,864    0,999    1
                                  (0,000)  (0,000)

In addition, Figure 2 visualizes how well the model performs across customers with the same level of frequency of buying. The horizontal axis shows the frequency of buying. The vertical axis shows the average number of predicted repeat purchases and the average number of repeat purchases for the customers with the same frequency of buying. Looking at both lines, the average predicted repeat purchases and the average repeat purchases are almost the same for customers with a low frequency of buying. As the frequency of buying increases, the average predicted repeat purchases and the average repeat purchases deviate from each other. The increase in deviation shows that the average predictions of the BG/NBD model are more accurate for customers with a low frequency of buying, and that the predictions of the model become less accurate as the frequency of buying increases. The deviation between the predicted repeat purchases and the repeat purchases is likely to be a result of the assumptions of the BG/NBD model. The dataset is characterized by extremely frequent visits at short intervals, and it is therefore likely that these assumptions are violated for this particular data.
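The per-level averages plotted in Figure 2 can be obtained with a simple grouping step, sketched below. The function name and the tuple layout are hypothetical; the predicted values would come from the fitted BG/NBD model, which is not reproduced here.

```python
from collections import defaultdict


def average_by_frequency(customers):
    """Average predicted and actual repeat purchases per frequency level.

    `customers` is an iterable of (frequency, predicted, actual) tuples.
    Returns {frequency: (mean_predicted, mean_actual)} ordered by frequency.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # predicted sum, actual sum, count
    for frequency, predicted, actual in customers:
        bucket = sums[frequency]
        bucket[0] += predicted
        bucket[1] += actual
        bucket[2] += 1
    return {f: (p / n, a / n) for f, (p, a, n) in sorted(sums.items())}
```

Plotting the two averages against the frequency of buying gives the two lines compared in the figure.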

Figure 2: The average number of predicted repeat purchases versus repeat purchases per level of frequency of buying


Finally, the study compares the correlation results with those for the CDNOW dataset of Fader et al. (2005). In that study, the authors test how well the BG/NBD model performs relative to the Pareto/NBD model in predicting repeat purchases. Using the CDNOW data, the correlation between predicted repeat purchases and repeat purchases is (r = 0,626, p = 0,000), and the correlation between frequency of buying and repeat purchases is (r = 0,557, p = 0,000). Both correlations are weaker than the correlations of the previous analysis, which shows that past purchase behavior in brick-and-mortar grocery retailing predicts the number of frequent repeat purchases better than customer purchase behavior in online CD retailing. However, the BG/NBD model did improve on the frequency of buying for the CDNOW data: there the correlation increases from r = 0,557 to r = 0,626, whereas on this particular data it decreases marginally (both values round to r = 0,864). Because the BG/NBD model did not perform well enough on this data, the study uses the number of actual repeat purchases of the validation period for the final analysis instead of the predictions of the BG/NBD model. This provides a better measure of repeat purchases and thereby increases the internal validity of this study.

Furthermore, the study tests the correlation between customers' past purchase behavior as well as product return behavior and repeat purchases. It has been assumed that both the number of products purchased and the amount of money spent have a positive correlation with repeat purchases, and that the number of products returned and the amount of money back (due to product return) both have a negative correlation with repeat purchases.

Table 4 presents the results of the correlation analysis of the independent and dependent variables. To start, products purchased has a correlation of (r = 0,469, p = 0,000) with repeat purchases and money spent has a correlation of (r = 0,456, p = 0,000). This shows that customers who purchased more products and spent more money have a higher number of frequent repeat purchases.

Next, as expected, a negative correlation has been found between the average number of products purchased and the average amount of money spent on the one hand and the number of frequent repeat purchases on the other. The customers who purchase on average fewer products per visit and spend on average less money per visit are the customers who visit the store more frequently. Therefore, customers who purchased on average fewer products and spent on average less money have a higher number of frequent repeat purchases. The correlation between average number of products purchased and repeat purchases is (r = -0,358, p = 0,000), and the correlation between average amount of money spent and repeat purchases is (r = -0,353, p = 0,000). Compared with “number of products purchased” and “amount of money spent”, the “average number of products purchased” and “average amount of money spent” have a weaker correlation with repeat purchases.

Focusing on product return behavior, the number of products returned has a positive correlation of (r = 0,128, p = 0,000) with repeat purchases, and the amount of money back (due to product return) a positive correlation of (r = 0,140, p = 0,000). Finding a positive correlation for both variables is surprising, since the study expects that customers who return more products and get more money back (due to product return) have a lower number of frequent repeat purchases. The positive correlation is in line with the third line of reasoning on the relationship between product return behavior and repeat purchases, which holds that customers who return more products and get more money back are more delighted and therefore come back more often.

Yet it could be that customers who return more products also purchased more products and therefore have a higher number of frequent repeat purchases. Taking into account the number of visits, the relationship between average number of products returned and repeat purchases has a correlation of (r = -0,156, p = 0,000). Further, the relationship between average amount of money back (due to product return) and repeat purchases has a correlation of (r = -0,105, p = 0,000). Finding a negative correlation confirms the assumption that customers who return on average more products and get on average more money back (due to product return) are more likely to be disgusted with the grocery retailer and therefore have a lower number of frequent repeat purchases. Nevertheless, irrespective of their direction, the variables “products returned”, “money back”, “average products returned” and “average money back” all have an absolute correlation below 0,17 with repeat purchases. Given that the variables “products purchased”, “money spent”, “average products purchased” and “average money spent” all have a strong correlation with repeat purchases, it is likely that the relationship between product return behavior and repeat purchases will weaken when controlling for past purchase behavior. The next section therefore tests partial correlations, including the relationship of past purchase behavior as well as product return behavior with repeat purchases.

Lastly, the study examines the relationships between the independent variables. Starting with the relationship between past purchase behavior and product return behavior, it has been assumed that customers who purchased more products and spent more money also returned more products and got more money back (due to product return). This is because customers first need to purchase a product in order to return one, and the more products a customer purchased, the more likely it is that the customer purchased a defective product. The correlation between number of products purchased and number of products returned is (r = 0,278, p = 0,000). Further, a correlation of (r = 0,201, p = 0,000) has been found between amount of money spent and amount of money back (due to product return). Thus, customers who purchased more products also returned more products, and customers who spent more money also got more money back. The results are in line with the prior expectations of this study.

Secondly, the study assumes that customers who purchased more products also spent more money, and that customers who returned more products also got more money back. The correlation between number of products purchased and amount of money spent is (r = 0,952, p = 0,000). The strong correlation confirms that customers who purchase more products also spend more money11. Noteworthy is that the relationship between products returned and money back (due to product return) shows a weaker correlation of (r = 0,711, p = 0,000). A possible explanation for the weaker correlation is the missing information on “purchase_quantity negative” and “purchase_amount positive”.

11 The same (strong) correlation holds for average number of products purchased and average amount of money


Table 4: Bivariate correlations of products purchased, products returned, money spent and money back with repeat purchases

Bivariate correlations           (1)      (2)      (3)      (4)      (5)      (6)      (7)      (8)      (9)
(1) Repeat purchases             1
(2) Products purchased           0,469    1
                                 (0,000)
(3) Products returned            0,128    0,278    1
                                 (0,000)  (0,000)
(4) Average products purchased   -0,358   0,499    0,145    1
                                 (0,000)  (0,000)  (0,000)
(5) Average products returned    -0,156   0,130    0,855    0,324    1
                                 (0,000)  (0,000)  (0,000)  (0,000)
(6) Money spent                  0,456    0,952    0,239    0,465    0,092    1
                                 (0,000)  (0,000)  (0,000)  (0,000)  (0,000)
(7) Money back                   0,140    0,206    0,711    0,073    0,584    0,201    1
                                 (0,000)  (0,000)  (0,000)  (0,000)  (0,000)  (0,000)
(8) Average money spent          -0,352   0,459    0,103    0,950    0,270    0,513    0,065    1
                                 (0,000)  (0,000)  (0,000)  (0,000)  (0,000)  (0,000)  (0,000)
(9) Average money back           -0,105   0,085    0,623    0,219    0,719    0,079    0,886    0,201    1
                                 (0,000)  (0,000)  (0,000)  (0,000)  (0,000)  (0,000)  (0,000)  (0,000)

Note: The values in the brackets represent the P-values

Explanatory analysis

The previous correlation analysis finds a moderate to strong correlation between past purchase behavior and repeat purchase behavior, and a weak correlation between product return behavior and repeat purchase behavior. Further, the analysis shows that customers who purchased more also returned more products. Therefore, it is likely that the correlation between product return behavior and repeat purchase behavior weakens when controlling for past purchase behavior. In Table 5 this statement has been tested using a partial correlation analysis. To start, when controlling for the number of products purchased, the relation between number of products returned and number of frequent repeat purchases shows a partial correlation of (r = -0,003, p = 0,862). This partial correlation is insignificant with a value of almost zero, which indicates that, when taking the number of products purchased into account, customers do not differ in number of frequent repeat purchases whether they have returned more or fewer products. Yet the positive partial correlation between products purchased and repeat purchases of (r = 0,448, p = 0,000) shows that, when taking the number of products returned into account, customers with a higher number of products purchased also have a higher number of frequent repeat purchases.

Furthermore, money spent has a partial correlation with repeat purchases of (r = 0,441, p = 0,000) and money back has a partial correlation with repeat purchases of (r = 0,055, p = 0,000). Again, the analysis shows a moderate correlation between money spent and repeat purchases, while the correlation between money back and repeat purchases almost disappears. This shows that, when taking into account the amount of money customers spent, customers do not differ in number of frequent repeat purchases whether they got more or less money back (due to product return). Further, the more money a customer spent, the higher the number of frequent repeat purchases.
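As a worked check, a first-order partial correlation can be recomputed from the three bivariate correlations involved. The sketch below applies the standard formula to the rounded values of Table 4; the function and variable names are hypothetical, and small deviations from the reported values can arise from rounding of the inputs.

```python
from math import sqrt


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation between x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Inputs are the rounded bivariate correlations of Table 4
# (decimal commas written as decimal points); x is repeat purchases.
returned_given_purchased = partial_corr(r_xy=0.128, r_xz=0.469, r_yz=0.278)
back_given_spent = partial_corr(r_xy=0.140, r_xz=0.456, r_yz=0.201)
spent_given_back = partial_corr(r_xy=0.456, r_xz=0.140, r_yz=0.201)
```

Rounded to three decimals, these evaluate to about -0,003, 0,055 and 0,441, which agrees with the partial correlations reported for products returned, money back and money spent.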

Finally, looking at the third and fourth partial correlations, the correlation of the average number of products returned and the average amount of money back (due to product return) with the number of frequent repeat purchases has weakened in comparison to the previous bivariate correlation analysis, while the correlation of the average number of products purchased and the average amount of money spent with the number of frequent repeat purchases has almost the same value. The results confirm the assumption that customers who purchased on average more products per visit have a lower number of frequent repeat purchases, and that customers who spent on average more money per visit have a lower number of frequent repeat purchases. Furthermore, they disconfirm the assumption that customers who returned on average more products have a higher number of frequent repeat purchases, and that customers who got on average more money back (due to product return) have a higher number of frequent repeat purchases. It is therefore likely that product return behavior is irrelevant in brick-and-mortar grocery retailing and that differences in the number of repeat purchases are a logical outcome of the number of products purchased, which increases the likelihood that a product is broken or defective.


Table 5: Partial correlations of products purchased, products returned, money spent and money back on repeat purchases

| Partial correlations | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Products purchased | 0,459 (0,000) | | | |
| Products returned | -0,003 (0,862) | | | |
| Money spent | | 0,441 (0,000) | | |
| Money back | | 0,055 (0,000) | | |
| Average products purchased | | | -0,329 (0,003) | |
| Average products returned | | | -0.0456 (0,000) | |
| Average money spent | | | | -0,341 (0,000) |
| Average money back | | | | -0,037 (0,017) |

Note: the values in the brackets represent the p-values.

Next, the study conducts a hierarchical multiple regression analysis to test the hypotheses. To clarify the 8 hypotheses, the conceptual framework of Figure 1 is presented again below (see Figure 4). Further, Table 6 presents the descriptive statistics of the key variables, which are used to interpret the regression coefficients b. The results are analyzed in the same order as the partial correlation analysis: starting with the hierarchical multiple regression analysis of the effect of the number of products purchased (hypothesis 1) and the effect of the number of products returned (hypothesis 3) on the number of frequent repeat purchases, followed by the amount of money spent (hypothesis 2) and the amount of money back (hypothesis 4).

The hierarchical multiple regression analysis has two levels. The first level estimates the regression coefficient b and the standardized regression coefficient β of the number of products purchased on the number of frequent repeat purchases, and the regression coefficient b and the standardized regression coefficient β of the number of products returned on the number of frequent repeat purchases. Thereafter, the second level tests the effects of both variables together using the partial regression coefficient b and the standardized partial regression coefficient β. The standardized regression coefficients β are used to compare both effects with each other.


Further, at the end of the section, the explained variance R² statistic is used to determine which dimension of customer behavior is the best predictor of repeat purchase behavior.
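The stepwise procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the syntax used for the analyses in this study: it uses Python with NumPy, the helper `ols` is hypothetical, and the data are simulated. Each step returns the unstandardized coefficients b, the standardized coefficients β (b rescaled by the ratio of the predictor's standard deviation to that of the outcome) and the explained variance R².

```python
import numpy as np

def ols(y, *predictors):
    """Least-squares fit of y on the predictors plus an intercept.

    Returns (b, beta, r2): unstandardized slopes, standardized slopes, R squared.
    """
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ coef
    r2 = 1.0 - resid.var() / y.var()          # share of variance explained
    b = coef[1:]                               # drop the intercept
    sd_x = np.array([p.std() for p in predictors])
    beta = b * sd_x / y.std()                  # standardized coefficients
    return b, beta, r2

# Simulated (hypothetical) customers: returns depend only on purchase volume.
rng = np.random.default_rng(1)
n = 5000
purchased = rng.gamma(4.0, 500.0, size=n)
returned = rng.poisson(0.005 * purchased).astype(float)
repeat = 25 + 0.013 * purchased + rng.normal(0, 15, size=n)

b1, beta1, r2_1 = ols(repeat, purchased)             # step 1: first predictor alone
b2, beta2, r2_2 = ols(repeat, returned)              # step 2: second predictor alone
b3, beta3, r2_3 = ols(repeat, purchased, returned)   # step 3: both predictors together

for label, beta, r2 in [("step 1", beta1, r2_1), ("step 2", beta2, r2_2), ("step 3", beta3, r2_3)]:
    print(label, "beta =", np.round(beta, 3), "R2 =", round(float(r2), 3))
```

Comparing the β values of step 3 with those of steps 1 and 2 shows how much of each single-predictor effect survives once the other predictor is included.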

Figure 4: Conceptual framework of the relationship of products purchased, products returned, money spent and money back on repeat purchases

Table 6: Descriptive statistics of the variables used for the hierarchical multiple regression analysis

| Variable | Range | 25th percentile | 50th percentile | 75th percentile |
|---|---|---|---|---|
| Products purchased | [71 – 7.380] | 1.174 | 1.695,5 | 2.340 |
| Products returned | [0 – 225] | 2 | 7 | 18 |
| Average products purchased | [2,9 – 144,7] | 15 | 21,2 | 39,3 |
| Average products returned | [0 – 3,8] | 0,03 | 0,1 | 0,2 |
| Money spent | [265 – 23.310] | 3.109,2 | 4.568,9 | 8.422,5 |
| Money back | [0 – 1.309] | 3 | 11 | 26,2 |
| Average money spent | [8,6 – 322,9] | 40,1 | 57,1 | 79,7 |
| Average money back | [0 – 13,5] | 0,1 | 0,3 | 0,5 |
| Repeat purchase | [4 – 220] | 30 | 43 | 61 |

Note: values are rounded to one number after the decimal.

Table 7 presents the results of the hierarchical multiple regression analysis of the effects of the number of products purchased and the number of products returned on the number of frequent repeat purchases. To start, the first two steps show that products purchased significantly predicts repeat purchases (b = 0,013, β = 0,469, t = 34,27, p < 0,001), and that products returned significantly predicts repeat purchases (b = 0,163, β = 0,128, t = 8,35, p < 0,001). The variable products purchased ranges from 71 through 7.380, whereby 75% of the customers purchased between 71 and 2.340 products; the variable products returned ranges from 0 through 225, whereby 75% of the customers returned between 0 and 18 products (see Table 6). As expected (based on the previous correlation analysis), the standardized regression coefficient β of products purchased is higher than the standardized regression coefficient β of products returned, which means that products purchased is the stronger predictor of repeat purchases. Further, the explained variance R² indicates that the number of products purchased explains around 22% of the variance in the number of frequent repeat purchases, whereas products returned explains only 1,6% of the variance in the number of frequent repeat purchases.
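To make the size of the unstandardized coefficients more tangible, the short calculation below (Python, added here purely as an illustration) multiplies each b from steps 1 and 2 by the distance between the 25th and 75th percentile of its predictor in Table 6. Using the interquartile distance is a choice made for this example, not a step of the original analysis. The last two lines check that, in a regression with a single predictor, R² equals β squared.

```python
# Values copied from Tables 6 and 7 (decimal commas written as points).
b_purchased, beta_purchased = 0.013, 0.469
b_returned, beta_returned = 0.163, 0.128
q25_purchased, q75_purchased = 1174, 2340
q25_returned, q75_returned = 2, 18

# Predicted difference in repeat purchases between a customer at the
# 75th percentile and a customer at the 25th percentile of each predictor.
diff_purchased = b_purchased * (q75_purchased - q25_purchased)
diff_returned = b_returned * (q75_returned - q25_returned)

# With one predictor, the explained variance is the squared standardized coefficient.
r2_purchased = beta_purchased ** 2
r2_returned = beta_returned ** 2

print(round(diff_purchased, 1), round(diff_returned, 1))
print(round(r2_purchased, 3), round(r2_returned, 3))
```

On these figures the interquartile difference corresponds to roughly 15 repeat purchases for products purchased and fewer than 3 for products returned, and the squared β values reproduce the reported R² of 0,220 and 0,016.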

Testing both effects together, the analysis shows that products returned does not significantly predict repeat purchases (b = -0,003, β = -0,002, t = -0,17, n.s.), whereas products purchased significantly predicts repeat purchases (b = 0,013, β = 0,470, t = 32,97, p < 0,001). Hypothesis 1, which assumes that customers who purchase a higher number of products have a higher number of frequent repeat purchases, is therefore confirmed. Further, the study rejects hypothesis 3, since there is no significant difference in the number of frequent repeat purchases between customers who return more products and customers who return fewer products.

Table 7: Hierarchical multiple regression analysis of products purchased and products returned on repeat purchases

| Step | Variable | b | SE | t | p | β | R² |
|---|---|---|---|---|---|---|---|
| Step 1 | Products purchased | 0,013 | 0,000 | 34,27 | < 0,001 | 0,469 | 0,220 |
| | Constant | 25,020 | 0,771 | 32,43 | | | |
| Step 2 | Products returned | 0,163 | 0,019 | 8,35 | < 0,001 | 0,128 | 0,016 |
| | Constant | 46,217 | 0,483 | 95,65 | | | |
| Step 3 | Products purchased | 0,013 | 0,000 | 32,97 | < 0,001 | 0,470 | 0,220 |
| | Products returned | -0,003 | 0,018 | -0,17 | n.s. | -0,002 | |
| | Constant | 25,030 | 0,773 | 32,36 | | | |


Next, the study conducted a hierarchical multiple regression analysis to see whether the amount of money spent and the amount of money back (due to product return) predict the number of frequent repeat purchases. Table 8 shows that money spent significantly predicts repeat purchases (b = 0,004, β = 0,456, t = 33,03, p < 0,001) and that money back significantly predicts repeat purchases (b = 0,096, β = 0,140, t = 9,11, p < 0,001). To interpret the regression coefficients: the variable money spent ranges from 265 through 23.310, whereby 75% of the customers spent between 265 and 8.422 dollars; the variable money back ranges from 0 through 1.309, whereby 75% of the customers got between 0 and 26 dollars back due to product return. Considering the range of the variable money back, the regression coefficient b only weakly predicts the number of frequent repeat purchases. Further, the standardized regression coefficient β for money spent is higher than the standardized regression coefficient β for money back. Thus, the amount of money customers spent predicts the number of frequent repeat purchases better than the amount of money customers got back (due to product return). Accordingly, the amount of money customers got back (due to product return) explains only 2% of the variance in the number of frequent repeat purchases, while the amount of money customers spent explains 20,7% of the variance in the number of frequent repeat purchases.

Testing the effects of both variables together, money spent significantly predicts repeat purchases (b = 0,004, β = 0,445, t = 31,69, p < 0,001) and money back significantly predicts repeat purchases (b = 0,034, β = 0,050, t = 3,58, p < 0,001). The study confirms hypothesis 2, since the analysis shows that customers who spent more money have a higher number of frequent repeat purchases. It is noteworthy that the effect of money back on repeat purchases almost vanishes when the effect of money spent is included, while the effect of money spent on repeat purchases remains nearly the same. This indicates that, taking into account the amount of money customers spent, there are almost no differences in the number of frequent repeat purchases between customers who got more money back (due to product return) and customers who got less money back. The study rejects hypothesis 4, which assumes that customers who get more money back (due to product return) have a lower number of repeat purchases.
