
Product returns: What is the effect of shopping cart composition?


Academic year: 2021


Product returns: What is the effect of

shopping cart composition?

An analysis of a Dutch online fashion retailer

By Jeroen Aarts


January 2017
S2801728
Hofstraat 12, 9712 JB Groningen
j.j.m.l.aarts@student.rug.nl
0634917029

MSc thesis Marketing Management & Marketing Intelligence

University of Groningen
Faculty of Economics & Business
Department of Marketing
PO Box 800, 9700 AV Groningen


MANAGEMENT SUMMARY

This research focuses on the composition of the shopping basket in relation to product return probability. Because the number of returns has increased over the years with the rise of online shopping, retailers have a growing need to reduce return rates. To provide further insight into the drivers of product returns, this research examines the influence of substitute items in the basket on product return probability, the effect of complementary items on the relationship between the number of substitute items and product return probability, and the influence of product bundling (i.e. when items belong to an outfit) on product return probability. In addition, the influence of item price on product return probability is examined.

These aspects are analysed using transactional data from a Dutch online fashion retailer for women. At item level, the dataset contains, among other things, the product category, whether the item is part of an outfit, price, order date, time of ordering, the device used for the transaction and the incoming channel. Several customer-related factors are also taken into account, such as age, total amount previously spent, number of previous returns, and how often and how recently a customer has purchased. These variables are linked by a unique customer and transaction ID. To determine whether an item is a substitute or a complement, the product categories were analysed to see whether items could be worn together (complements, e.g. a sweater and pants) or would replace each other (substitutes, e.g. two sweaters). To model the return probability (item returned yes/no), a random-effects logit model was used. This method is considered most appropriate because the data follow a panel-like structure, with multiple item lines for each transaction, meaning that items are nested within baskets.


However, when the number of different product categories is large (i.e. with a high number of complementary items), this increases the chance that an item does not match with the other categories, reducing the strength of the decreasing effect.

When an item is part of an outfit, the research shows counter-intuitive results. The results indicate an increase in product return probability when an item is part of an outfit compared to when an item is part of a collection. One would expect fit uncertainty to be lower when an item is part of an outfit, because the website already shows that the items match each other. A possible explanation for this increase is that when a customer purchases items belonging to an outfit and one of the items does not match expectations, the whole set is returned, leading to a higher product return probability. For item price, the results confirm expectations and earlier research: an increase in price leads to an increase in product return probability (e.g. Anderson 2009). The reason is that customers become more critical in evaluating the product specifications, leading to a higher return probability.

The control variables also yield interesting results: when customers make a purchase via mobile, the return probability is lowest, compared to desktop, and when customers come to the website via Email and Affiliate websites, the return probability is highest.

Combining these findings, retailers are recommended to reduce the number of substitutes in a basket and to increase the number of complementary items in baskets that contain substitutes. Managers could do this, for example, by reducing upsell possibilities and increasing cross-sell possibilities on the website, which can be achieved by altering the way products are displayed. Another option to reduce the return probability for outfits is to provide multiple options for the same item within an outfit (e.g. show two types of vests that both match the rest of the outfit), to encourage customers to return only the item that lacks fit instead of the whole outfit.


PREFACE

I am proud to have finalized this thesis in the area of marketing intelligence and marketing management. Although the process was sometimes difficult, I look back on it with a good feeling. Writing this thesis improved my ability to carry out independent academic research and enhanced my analytical skills. Furthermore, since I wrote the thesis for a retailer, the results are valuable input for the retailer's marketing strategy, which also gives me a great feeling.

First, I would like to thank my supervisor Tammo Bijmolt for his support and feedback during the process. I would also like to thank Sander Beckers from De Nieuwe Zaak, who guided and supported me in gathering the data and provided valuable insights and tips that I could use in analysing and interpreting the data. Furthermore, I would like to thank my family and friends for giving me the opportunity to find a balance between writing a thesis and free time, especially during the busy weeks.

TABLE OF CONTENTS

MANAGEMENT SUMMARY
PREFACE
1. INTRODUCTION
1.1. THEORETICAL BACKGROUND
1.2. RELEVANCE
1.3. CONTRIBUTION OF THIS RESEARCH
1.4. STRUCTURE
2. LITERATURE REVIEW
2.1. PRODUCT RETURNS
2.2. BASKET COMPOSITION
2.2.1. Substitute items
2.2.2. Product bundling
2.3. ITEM PRICE
2.4. EFFECT COMPLEMENTARY ITEMS ON SUBSTITUTE ITEMS
3. METHODOLOGY
3.1. DATA COLLECTION
3.2. DATA DESCRIPTION
3.2.1 Operationalization of key variables
3.2.2. Operationalization of control variables
3.2.3. Data inspection and data cleaning
3.2.4. Missing value analysis
3.2.5. Correlations and multicollinearity
3.2.6. Description of sample
3.3. REPRESENTATIVENESS OF DATA
3.4. MODEL
3.4.1. Model specification
4. RESULTS
4.1. DESCRIPTIVE RESULTS
4.2. RESULTS FINAL MODEL
4.2.1. Model fit comparison
4.2.3. Results key variables
4.2.4. Results control variables
5. CONCLUSIONS
6. RECOMMENDATIONS, LIMITATIONS AND FUTURE RESEARCH
6.1. MANAGERIAL AND THEORETICAL IMPLICATIONS
6.2. LIMITATIONS AND FUTURE RESEARCH
7. REFERENCES
8. APPENDICES
8.1. APPENDIX 1: Correlation matrix
8.2. APPENDIX 2: VIF-scores
8.3. APPENDIX 3: Division product categories in substitute and complementary items


1. INTRODUCTION

1.1. THEORETICAL BACKGROUND

Online shopping has grown steadily over the last decade (Oestreicher-Singer & Sundararajan 2012), and consumers spend an increasing amount of resources online. Research shows that in 2015, 20 percent of US consumers purchased products online via mobile, tablet or computer (Statista 2016). In addition, digital customers in the U.S. spend on average nearly $3,000 online (Statista 2016). The Internet has advantages but also disadvantages for both consumers and retailers. When consumers go to an offline store, they can immediately evaluate what the product looks like in real life before purchasing (Hong & Pavlou 2014). This is difficult in an online environment, because the product is not immediately tangible (Shulman, Cunha & Saint Clair 2015).

The intangibility results in an increasing amount of items that are returned to the retailer. From a retailer perspective this leads to high logistics cost and a reduction in firm performance (Hammond & Kohler 2001). Specifically, Hess & Mayhew (1997) found that customers are returning 25% of their products, and a study from Kerr (2013) found that product returns cost U.S. manufacturing and retailers nearly $264 billion per year. For some industries, such as the fashion industry, return rates can be even higher, with percentages up to 50% (Asdecker 2015). Because of these high return rates and costs that are involved, product return behavior has gained interest from both practitioners and researchers.

A definition of product return behavior is derived from Guide & Souza et al. (2006). They define product return behavior as: “The act of sending back products for any reason within a specific time period after sale” (Guide & Souza et al. 2006). Several researches have been done


Besides, research has focused on reducing fit uncertainty by looking at the effects of different web technologies, such as a zoom function and additional pictures (De, Hu & Rahman 2013), and at pre-purchase information effects (Shulman, Cunha & Saint Clair 2015).

1.2. RELEVANCE

Because the online fashion industry has such a high return rate, it is interesting to examine purchase and return behavior within this industry further. Within fashion, consumers have a large variety of items to choose from (Trabold, Heim & Field 2006). This benefits consumers, because they seek variety when making a purchase (Cheng & Chang et al. 2012). However, this large number of options also leads to a wide variety of items being ordered and to increased complexity of basket composition.

Until now, a large amount of research has analysed the market basket, which often leads to interesting insights for retailers to optimize strategies and performance (e.g. Russell & Kamakura 1997). However, research on market basket analysis in relation to product returns is limited. The fact that so many different items are ordered in the fashion industry, in different sizes and combinations, makes it interesting to study the complex shopping basket in order to gain insight into the effects that different types of items have on product returns.

Therefore, this research will focus on these aspects by means of answering the following research question.

“What is the effect of basket composition on predicting product returns?”

To gain insight into the effect of different types of items in a basket, a distinction is made between substitute and complementary items, and it is taken into account whether a customer has bought (part of) an outfit (product bundle) or single items. In addition, the price of an item is taken into account.

This leads to a split in the following sub-questions.

- What is the influence of substitute items in the basket on the product return probability?

- What is the influence of outfits (product bundling) in the basket on the product return probability?

- What is the influence of item price on product return probability?

- What is the influence of the number of complementary items on the relationship between substitute items and product return probability?

The research questions are analysed by using transactional data from a Dutch online fashion retailer for women. The retailer offers a large assortment ranging from for example sweaters and jeans to different kinds of accessories.

1.3. CONTRIBUTION OF THIS RESEARCH

This research makes several contributions. From a managerial perspective, it provides insights that help (online) fashion retailers make better predictions of product returns for a given composition of the shopping basket. With these results, retailers gain better insight into the effect of different product types (i.e. substitute and complementary) on product return probability. This research shows that customers who purchase substitute items are more likely to return an item. The effect of substitute items is, however, diminished when a complementary item is included in the basket. This dampening effect of complementary items on the relationship between substitute items and product returns is strongest when the number of complementary items is low. When the number of complements is high, the dampening effect is still present, but to a lesser extent. Therefore, retailers should stimulate the purchase of complementary items in a transaction in order to reduce return rates. One way to do this is to stimulate cross-selling and reduce upsell possibilities, leading to an increase in complementary items and a decrease in substitutes. This can be done, for example, by focusing on the presentation of the products on the website (De et al. 2012), or by focusing on the website lay-out, matching the right items together (Zhou et al. 2006).


Theoretically, this research contributes by making a distinction between substitute and complementary items in relation to product returns. Former research mainly focused on product characteristics (e.g. price of an item) (Anderson et al. 2008) and type of products (e.g. electronics versus furniture) (Minnema et al. 2016), without distinguishing between items that complement or replace each other. In addition, this research contributes by addressing the effect of product bundling on product returns, by investigating the effect of complete outfits versus collection items. Previous research mainly focused on the effects of individual products on product returns.

1.4. STRUCTURE

The rest of this paper is structured as follows. First, a literature review addresses the concepts and relationships between product returns, the elements of basket composition and the price promotion effect on this relationship. Next, a section elaborates on the methodology used to gather and analyse the data. This is followed by the results section and a section containing conclusions, interpretations and recommendations. Finally, this report concludes with a section on limitations and directions for further research.

2. LITERATURE REVIEW

This section elaborates on the theoretical background of the concepts that are researched. First, an overview is given of previous research on product returns in general. Then the literature review focuses on basket composition, specifically on the relationship between substitute items, complementary items, outfits and item price on the one hand and product return probability on the other. In addition, the effect of complementary items on the relationship between substitute items and product return probability is discussed.

2.1. PRODUCT RETURNS


In addition, research found that the amount of pre-purchase information affects product returns. Specifically, Shulman, Cunha & Saint Clair (2015) found that return rates increase when the information leads to higher expectations, and decrease when the information reduces uncertainty in expectations. Especially in the online fashion industry, customers have limited options to evaluate and try the product, and it is therefore difficult for retailers to meet expectations (Shulman, Cunha & Saint Clair 2015; Hong & Pavlou 2014).

The fit uncertainty is thus higher for fashion compared to for example electronics, and therefore the return ratio for fashion is higher (Mollenkopf et al. 2007). In line with this, retailers try to reduce fit uncertainty and provide realistic customer expectations, by providing pre-purchase information on the website regarding the product, such as reviews and product specifications (Minnema et al. 2016). Retailers use technologies such as pictures and a zoom function (De, Hu & Rahman 2013) to reduce fit uncertainty and thus product returns.

Despite all these measures, consumers still experience uncertainty, leading to unpredictable purchasing behaviour and indirectly to a shopping basket that is complex for the retailer to analyse. Therefore, this research aims to give insight into the influence of the composition of a shopping basket on product return probability. A visualization of the researched effects is shown in figure 1. Specifically, the number of substitutes in a basket and product bundles of products that together create a complete outfit are taken into consideration. In addition, the effect of product price on product returns is taken into account. The relationship between the number of substitutes and product return probability is expected to be moderated by the number of complementary items in a basket.

Figure 1: Conceptual model

[Figure: basket composition, consisting of composition (H1: Nr. of substitute items (+); H2: Outfit (product bundling) (-)) and H3: Item price (-), in relation to product returns.]


2.2. BASKET COMPOSITION

It is expected that product returns are influenced by the composition of the shopping basket. To give the retailer insight into the shopping basket, market basket analysis is often carried out, which can be defined as the analysis of the composition of products purchased during a shopping experience (Russell & Petersen 2000).

Research on market baskets provides insights for optimizing advertising strategies, store layout and product placement (Blischok 1995). More importantly, the analysis gives better insight into customer purchase behaviour within and across categories and thus enhances decision-making on product bundling, because brand and product associations can be made (Russell & Kamakura 1997; Peacock 1998).

Although market basket analysis gives insight into purchase behaviour, literature on the relationship between the composition of the basket and product return behaviour is limited. For example, Hong & Pavlou (2014) studied the distinction between experience and search goods and found that product return rates are higher for experience goods than for search goods, because customers perceive more fit uncertainty for experience goods (Hong & Pavlou 2014). Online fashion items in particular are goods that customers have to try before they can evaluate the fit (Hammond & Kohler 2001).

A significant amount of literature exists on substitute and complementary products in relation to sales (e.g. Levy & Grewal et al. 2004), but literature on their relation to product returns is limited. The following sections dive deeper into this distinction and its relation to product returns.

2.2.1. Substitute items

Products are considered substitutes in economic terms if raising the price of one product leads to an increase in sales of another (Bucklin, Russell, & Srinivasan 1998; Russell & Bolton 1988; Russell and Petersen 2000).


The customer can have preference for one or the other product based on personal taste rather than quality, but they do negatively influence each other’s sales (Shocker et al. 2004). When relating this to fashion products, a customer’s specific purpose could be to buy a t-shirt. When the choice is between the same item (t-shirt) in different colours (e.g. red and blue), they are considered to be substitutes-in-use since the products are identical, serve the same specific purpose and have the same customer (Shocker et al. 2004).

Occasional substitutes serve a higher-order, more generic buying purpose (Shocker et al. 2004). Shocker et al. (2004) explain that the more general the buying goal, the more products compete with each other to achieve this goal. When relating this to fashion, a customer goal could be to purchase clothing that provides warmth. Then the choice would for example be between a coat or a vest, but a customer will only purchase one of the two because of financial resource limitations (Lehmann & Winer 2000). When linking these prior findings to product returns for this research, a positive relationship can be expected for both types of substitutes (substitute-in-use or occasional substitute).

When two substitute items are to be found in the basket, the probability that one of them is returned is expected to be higher, because a consumer has a limited budget, and therefore it is unlikely that a customer buys two products for the same purpose (i.e. buy a coat and a vest, or two coats or vests) (Shocker et al. 2004; Heath & Soll 1996). This increases the likelihood that one of the items from a category is returned.

However, the product return probability depends not only on whether an item is a substitute, but also on the number of substitutes in the shopping basket. When a customer buys more items and these are substitutes for the focal product, the return probability for this item is expected to increase even further compared to when the customer purchases only one substitute. Thus, the number of substitutes in a basket is expected to increase the product return probability even further.


So when more substitute items are purchased in the same transaction it increases the effect on the product return probability compared to when only a single substitute item is purchased. Based on this reasoning the following hypothesis can be derived.

H1: The number of substitute items in a basket increases the product return probability.

2.2.2. Product bundling

A special form of combining products is to bundle them. Bundling is a popular sales promotion tool (Yang & Lay 2006) and is increasingly used in e-commerce. A product bundle consists of items from different categories that together serve the needs of the customer (Shocker et al. 2004). Retailers often create bundles that customers can select or adapt to their needs (e.g. a standard package for a desktop) (Shocker et al. 2004). At the retailer of interest, customers can likewise either buy a complete bundle (complete outfit) or select only the items from the outfit that they need.

Influential literature regarding product bundling comes from Stremersch & Tellis (2002), who define product bundling as: ‘The sale of two or more separate complementary products in a package.’ They identify two forms of bundling: pure bundling and mixed bundling. With pure bundling, a firm sells only the bundle and not the products separately. With mixed bundling, the firm sells a product bundle, but also the same products separately (Stremersch & Tellis 2002). The assortment of the online retailer that participated in this research contains product bundles of items that together create a complete outfit, but the items are also offered separately. Therefore, this research focuses on mixed bundling.


In line with this, Harris & Blair (2006) found that consumers prefer product bundles over individual items, because it reduces functional compatibility risk, especially when consumers are less sure of their product knowledge.

When individual items are displayed as product bundles, consumers know directly that certain items match together. Another reason for preferring product bundles over individual items was that bundles provide benefits to the customer by reducing search and assembly effort before making a purchase (Harris & Blair 2006).

When relating this prior literature to product returns it seems that when customers are buying a product bundle it reduces uncertainty and perceived risk, which has in turn a negative effect on product returns (Hong & Pavlou 2014; Petersen & Kumar 2015). Therefore a negative relationship is expected between the purchase of (part of) an outfit/product bundle and the number of product returns, indicating a decrease in product return probability. This leads to the following hypothesis.

H2: The purchase of (part of) an outfit decreases the product return probability.

2.3. ITEM PRICE

Anderson et al. (2008) found that the probability that a product is returned also depends on the price of an item. This relationship is confirmed by other research, indicating that customers are more critical of more expensive products and are thus more likely to return a product that lacks fit (e.g. Hong & Pavlou 2014; Anderson et al. 2009; Hess & Mayhew 1997).

For example, Anderson et al. (2009) found that customers become more critical when the price is higher and are more likely to return an item. Additional evidence for this relationship comes from Petersen & Kumar (2009). They found that when customers bought products on a price promotion (i.e. lower priced products), the amount of returns was significantly lower. Based on this literature, the following hypothesis can be derived.


2.4. EFFECT COMPLEMENTARY ITEMS ON SUBSTITUTE ITEMS

In contrast to substitute items, a basket can also consist of complementary items. Products are considered complementary when a decrease in the price of one product leads to an increase in sales of another product (Bucklin, Russell, & Srinivasan 1998; Russell and Bolton 1988; Russell and Petersen 2000). Based on this definition, several studies try to make the distinction between different types of complementary products more specific.
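The economic definitions of substitutes and complements used in this review can be expressed through the sign of the cross-price elasticity: positive for substitutes (a price increase of one product raises sales of the other) and negative for complements (a price decrease of one product raises sales of the other). The snippet below is only an illustrative sketch; the function names and the example numbers are assumptions made for illustration and do not come from the thesis data.

```python
def cross_price_elasticity(price_a_old, price_a_new, sales_b_old, sales_b_new):
    """Percentage change in sales of B divided by percentage change in price of A."""
    pct_price_change = (price_a_new - price_a_old) / price_a_old
    pct_sales_change = (sales_b_new - sales_b_old) / sales_b_old
    return pct_sales_change / pct_price_change


def classify(elasticity, tolerance=1e-9):
    """Classify a product pair by the sign of its cross-price elasticity."""
    if elasticity > tolerance:
        return "substitutes"   # price of A up -> sales of B up
    if elasticity < -tolerance:
        return "complements"   # price of A down -> sales of B up
    return "independent"


# Hypothetical numbers: price of A rises from 20 to 22, sales of B rise from 100 to 105.
print(classify(cross_price_elasticity(20, 22, 100, 105)))
# Hypothetical numbers: price of A falls from 20 to 18, sales of B rise from 100 to 108.
print(classify(cross_price_elasticity(20, 18, 100, 108)))
```

Note that this research does not estimate elasticities; it classifies items by product category, as described in the methodology.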

Berry & Kumar et al. (2014) make a distinction between quantity and quality types of complements. With quantity complements the value of one item leads to higher value for another, for example a left shoe and a right shoe. Quality complements imply that higher quality of one item leads to higher perceived quality of the other item, for example a suit and tie (Berry & Kumar et al. 2014).

In line with these findings, Shocker et al. (2004) identified two types of complements: complements-in-use and occasional complements. With complements-in-use, one product enhances the value of the other and they can hardly function without each other. For example, PC and software have a positive influence on each other’s growth. One product has limited value without the other (Shocker et al. 2004).

For this research, following Shocker et al. (2004), the focus is on occasional complements. These are products that are intended to be used together, but can also function independently. For example, products that are displayed next to each other may have a reminder effect. Following Berry & Kumar (2014), the focus of this research is on quality complements.


For this research both descriptions are used to identify complementary goods. An item is considered to be complementary in this research when a product from another category (e.g. jeans and sweater) is found in the same basket, and when you can wear them at the same time (for example a coat and dress). An overview of the distinction between the categories can be found in appendix 3, table 16.

Research indicates that when customers buy from more categories, they buy more compared to when they only buy in a single category (Kumar, George & Paneras 2008). In line with this, Petersen & Kumar (2009) argue that when customers buy more, they also return more, indicating a positive relation between cross-buying and product returns.

However, the argument of increased purchases and product returns holds for all products (regardless of the type of product). When customers buy more, they could also purchase more substitute products, which also leads to an increase in product returns. Therefore, in this research the line of reasoning of Shocker et al. (2004) is leading, indicating that items bought together complement each other and enhance the total value of the purchase (Shocker et al. 2004). Because of this value enhancement, customers are less likely to return a product, since a return would reduce the value added.

Based on this reasoning, the relationship between the number of substitutes and complementary items can be explained. It is expected that the return probability of substitute items is reduced when complementary items are included in the shopping basket. The reason is that when additional complementary items are included in a basket, the probability increases that these complementary items match one of the substitute items, which in turn reduces the return probability of that item. Therefore the following hypothesis can be derived.

H4: The number of complementary items in a basket decreases the effect of the number of substitute items in a basket on product return probability.
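The moderation expected in H4 can be illustrated with a logit that contains a substitutes-by-complements interaction term. The sketch below is a simplified illustration only: the coefficient values are made up, the main effect of complements and all control variables are left out, and the basket-level random effect of the random-effects logit used in this research is omitted.

```python
import math


def return_probability(n_substitutes, n_complements,
                       intercept=-1.0, b_sub=0.30, b_interaction=-0.05):
    """Return probability from a logit with a moderation (interaction) term.

    All coefficients are hypothetical. A positive b_sub reflects H1,
    a negative b_interaction reflects H4.
    """
    linear = (intercept
              + b_sub * n_substitutes
              + b_interaction * n_substitutes * n_complements)
    return 1.0 / (1.0 + math.exp(-linear))


def substitute_effect(n_complements, b_sub=0.30, b_interaction=-0.05):
    """Effect of one extra substitute on the log-odds, given the number of complements."""
    return b_sub + b_interaction * n_complements


for n_comp in (0, 1, 2):
    print(n_comp, round(substitute_effect(n_comp), 2),
          round(return_probability(2, n_comp), 3))
```

With a negative interaction coefficient, each additional complementary item lowers the effect of an extra substitute on the log-odds of a return, which is the pattern H4 predicts.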


3. METHODOLOGY

3.1. DATA COLLECTION

For this research, cross-sectional transactional data from a Dutch online fashion retailer for women is used. This retailer has both a web shop and physical stores, meaning that customers can purchase and, more importantly, also return products in a physical store in addition to the web shop. The scope of this research, however, is limited to purchases and returns via the web shop. This might influence the number of returned items that is observed. However, since the dataset contains a large number of observations, enough data points are available to provide reliable estimates of the return probability.

3.2. DATA DESCRIPTION

The dataset contained information on all orders placed via the company’s web shop. The raw dataset comprised a total of 517,558 items divided over 207,273 transactions and 147,882 customers. The time span of measurement is from 1 January 2016 until 31 May 2016 (5 months).
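As a quick orientation, the raw totals above imply the average basket size and purchase frequency computed below. This is simple arithmetic on the reported raw counts (before any data cleaning); the variable names are chosen for illustration.

```python
# Raw dataset totals as reported in the text.
items = 517_558
transactions = 207_273
customers = 147_882

items_per_transaction = items / transactions
transactions_per_customer = transactions / customers

print(f"{items_per_transaction:.2f} items per transaction")
print(f"{transactions_per_customer:.2f} transactions per customer")
```

This gives roughly 2.5 items per transaction and 1.4 transactions per customer over the five-month period.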

For each item purchased, the dataset contains several information entries, shown in table 1. For the analysis backend data from the retailer is used, enriched with data from Google Analytics. Furthermore, external data from the Dutch Institute of Meteorology (KNMI) was used to obtain data regarding weather effects (temperature and rain) and demographic data from the Central Bureau of Statistics (CBS) was used to obtain the number of inhabitants per ZIP-code.


| Variable | Data source | Coding | Explanation of coding |
|---|---|---|---|
| Product returns | Back-end system retailer | Dummy 0/1 | 1 when returned |
| Substitute | Back-end system retailer | Dummy 0/1 | 1 when 2 items come from same product category |
| Complementary item | Back-end system retailer | Dummy 0/1 | 1 when 2 items from different product categories or can be worn together |
| Substitute*complementary items | Back-end system retailer | Dummy 0/1 | 1 when item is complementing an item and substituting another |
| Individual item | Back-end system retailer | Dummy 0/1 | Other individual items (ref. category) |
| Outfit | Google Analytics | Dummy 0/1 | 1 when customer has made a transaction and visited an outfit product page |
| Item price | Back-end system retailer | In euros | List price inc. VAT |
| Product category (i.e. sweater, leggings, pants etc.) | Back-end system retailer | Dummy 0/1 | 1 when an item belongs to a certain category (Tops as ref. category) |
| Age | Back-end system retailer | In years | |
| Urbanity | Central Bureau of Statistics (CBS) | x1000 inhabitants | ZIP-code from customer is linked to number of inhabitants |
| Relationship length with retailer | Back-end system retailer | In weeks | To calculate, date of first transaction was used |
| Recency | Back-end system retailer | In weeks | Time since last purchase |
| Frequency | Back-end system retailer | Dummy 0/1 | 1 when more than one purchase is made |
| Monetary value | Back-end system retailer | x1000 euros | Total amount spent until last purchase |
| Nr of previous returns | Back-end system retailer | In quantity | |
| Device category | Google Analytics | Dummy 0/1 | 1 when it belongs to the category desktop, mobile, tablet or multiple (desktop = ref. category) |
| Incoming channel | Google Analytics | Dummy 0/1 | 1 when it belongs to a channel (Email as ref. category) |
| Time an item is active on website | Back-end system retailer | In weeks | Time an item is active on the website |
| Temperature | Dutch Institute of Meteorology (KNMI) | In °C | Temperature at day of purchase |
| Rain | Dutch Institute of Meteorology (KNMI) | In millimeters | Amount of rain at day of purchase |
| Time of purchase | Back-end system retailer | Dummy 0/1 | 1 when the purchase occurs outside office hours (9:00-17:00) |
| Part of week of purchase | Back-end system retailer | Dummy 0/1 | 1 when the purchase occurs in the weekend (Saturday or Sunday) |
| Delivery policy | Back-end system retailer | Dummy 0/1 | 1 when transaction date is after the change in free delivery policy from 25 to 30 euros per transaction (25 April 2016) |


3.2.1 Operationalization of key variables

To measure product returns, information regarding whether a product was returned or not is used (Petersen & Kumar 2009). This variable is operationalized as an indicator variable, which gets value 1 when an item is returned. During the research period, the return policy did not change. Customers could return items within 14 days via a parcel delivery service. Customers also had the option to return items ordered online in a physical store within 14 days after purchase, but these returns are not taken into account in this research because they were not recorded in the database.

First the items were divided into 14 different product categories (e.g. sweater, vest, trousers, jackets, leggings, accessories and other) with dummy coding, to be able to make a distinction between substitute and complementary items. When two items from the same category (e.g. two jackets) were found in the same basket, they are considered to be substitutes. When two items from different categories are found in the same basket (e.g. a pair of jeans and a sweater), or when they can be worn together, they are considered to be complementary items. When only a single type of item is included in a basket (e.g. only one pair of jeans), it is considered an individual item, which is used as the reference category. For use in the model, the number of items in a basket was summed separately for substitute and complementary items.
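The classification rule described above can be illustrated with a minimal Python sketch. It assumes that substitutes are counted as the other items from the same category and complements as the items from any other category; the function name `classify_basket` and the example basket are hypothetical, and the "can be worn together" judgment is not modelled here.

```python
from collections import Counter

def classify_basket(categories):
    """Label each item in one basket by item type.

    `categories` lists the product category of every item in the basket.
    Returns one (n_substitutes, n_complements, label) tuple per item.
    """
    counts = Counter(categories)
    total = len(categories)
    result = []
    for cat in categories:
        n_sub = counts[cat] - 1        # other items from the same category
        n_comp = total - counts[cat]   # items from a different category
        if n_sub and n_comp:
            label = "both"
        elif n_sub:
            label = "substitute"
        elif n_comp:
            label = "complementary"
        else:
            label = "single"
        result.append((n_sub, n_comp, label))
    return result

# Hypothetical basket: two jackets and one sweater.
item_types = classify_basket(["jacket", "jacket", "sweater"])
```

In this example both jackets are labelled "both" (one substitute, one complement each), while the sweater is labelled "complementary".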


3.2.2. Operationalization of control variables

Next to the main effects, several other factors might influence the product return decision as well. Therefore in the model several control variables are taken into account.

First of all, customer characteristics might play a role in the return probability (e.g. age). The number of previous returns also influences the return probability and is controlled for. Research shows that the return probability is higher for customers who already returned items in the past (Petersen & Kumar 2009; Minnema et al. 2016). Next to this, the device a customer uses to make a purchase (mobile, tablet, desktop or multiple) and the channel through which customers come to the website (i.e. via Paid Search, Direct, Organic, Social, Affiliate, Referral, Display or Other) might also play a role in the return probability. These categorical variables are included as separate dummy variables in the model, in order to be able to specify a reference category manually.

Product characteristics might also play a role. To control for category effects, the product category of the item is included in the model as separate dummy variables (e.g. sweater, jacket, pants, skirts) that get value 1 when an item belongs to that category. The product category ‘Tops’ is used as reference category, because these items are purchased the most. Also, to control for the popularity and newness effect of an item, the time that an item has been on the website is included in the model, measured in weeks (Time on market) (Minnema et al. 2016).

During the period of measurement, the delivery policy changed. From 25 April 2016 onwards, the threshold for free delivery was raised from 25 to 30 euros per transaction. Customers might therefore spend more per order in order to qualify for free delivery. To control for this effect, a dummy variable is included that gets value 1 after the change in delivery policy. In order to link customer and purchase behavioral variables with each other, a unique customer identification number and a unique transaction identification number were used.
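As an illustration of how the time-related dummy variables could be derived from an order timestamp, a minimal Python sketch follows. The function name `purchase_dummies` is hypothetical, and the handling of the boundaries (17:00 exactly counts as outside office hours, 25 April 2016 itself counts as after the change) is an assumption, not something the data description specifies.

```python
from datetime import datetime

POLICY_CHANGE = datetime(2016, 4, 25)  # date of the change in free delivery policy

def purchase_dummies(ts):
    """Return (outside_office_hours, weekend, delivery_policy) for one order."""
    outside_office = 0 if 9 <= ts.hour < 17 else 1  # office hours 9:00-17:00
    weekend = 1 if ts.weekday() >= 5 else 0         # Saturday = 5, Sunday = 6
    policy = 1 if ts >= POLICY_CHANGE else 0        # on or after 25 April 2016
    return outside_office, weekend, policy

# Hypothetical order placed on Saturday 30 April 2016 at 20:15.
example = purchase_dummies(datetime(2016, 4, 30, 20, 15))
```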

3.2.3. Data inspection and data cleaning


In addition, the descriptive statistics were used to identify outliers. The first method that was used was the Mahalanobis distance method. This is a common measure to detect multivariate outliers (Cousineau & Chartier 2010). However, based on this method half of the dataset would be labeled as outliers and excluded from analysis. The second method was based on the 2,2 multiplier rule of the IQR, to determine the lower and upper limit of each variable (Seo 2006). This method would also exclude too many data points. Therefore the following method is used. Only the most extreme cases were excluded from analysis in order to maintain variance and reliability in the data, based on the average number of units purchased per order and the average amount spent per order. In order to identify the most extreme cases, boxplots were used for visualization. In addition, the total number of orders, basket size per order, total amount spent per customer and total number of units purchased per customer were taken into account to identify customers that showed an extremely deviating purchasing and spending pattern. Table 2 shows the cut-off values that are used to determine the extreme observations. All observations above the cut-off value are excluded from analysis, maintaining the long tail in the data. In total 3.283 items in 928 transactions were excluded, divided over 40 customers.

Selection variable Cut-off value

Average amount of units purchased per order 35 units

Average amount spent per order 4000 euros

Total amount of orders per customer 40 orders

Basket size in units per order 40 units

Basket size in euros per order 990 euros

Total amount spent per customer 4220 euros

Total amount of units purchased per customer 110 units

Table 2: Variables used for data cleaning with cut-off value
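To illustrate the difference between the IQR rule and a fixed cut-off value, a minimal Python sketch is given below. The data are invented, the function name `iqr_fences` is hypothetical, and the quartile definition (`method="inclusive"`) is an assumption; statistical packages differ in how they compute quartiles.

```python
import statistics

def iqr_fences(values, multiplier=2.2):
    """Lower and upper outlier limits based on the interquartile range."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    return q1 - multiplier * spread, q3 + multiplier * spread

# Invented basket sizes (units per order) with one extreme order.
basket_sizes = [1, 2, 2, 3, 3, 3, 4, 5, 6, 60]
low, high = iqr_fences(basket_sizes)
flagged_by_iqr = [v for v in basket_sizes if v < low or v > high]

# Fixed cut-off as in table 2: basket size above 40 units is excluded.
kept_by_cutoff = [v for v in basket_sizes if v <= 40]
```

With skewed purchase data the IQR limits can lie close to the bulk of the observations, which is one reason a rule of this kind may flag many data points, whereas a fixed cut-off only removes the most extreme cases.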

When inspecting the age of the customers, initial analysis indicated that 25 customers had an age between 0 and 13 years. Since these are anomalies in the dataset and not representative of the other customers, who have an average age of 46,95, the 131 cases belonging to these customers were recoded as missing values, to maintain a reliable dataset.


Also, some transactions had multiple device categories recorded. This affected 7,2% (11.409) of the total number of transactions. Since this is a large number, the data was first inspected to see which device category was used the most in the other transactions. In the majority of all transactions (63%), the desktop was used to make the transaction. However, because the observations where multiple devices are used might provide valuable information, these duplicate cases were labeled as ‘Multiple’.
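The labelling of transactions with several device recordings can be sketched as follows. This is a minimal illustration with invented rows; the function name `device_per_transaction` and the input layout of (transaction ID, device) pairs are assumptions.

```python
def device_per_transaction(records):
    """Collapse (transaction_id, device) rows into one device label each."""
    devices = {}
    for transaction_id, device in records:
        devices.setdefault(transaction_id, set()).add(device)
    # One distinct device keeps its name; several distinct devices become 'Multiple'.
    return {
        tid: next(iter(seen)) if len(seen) == 1 else "Multiple"
        for tid, seen in devices.items()
    }

rows = [(1, "desktop"), (2, "mobile"), (2, "desktop"), (3, "tablet"), (3, "tablet")]
device_labels = device_per_transaction(rows)
```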

For the number of days the item was on the website (popularity), 533 cases had negative values, which means that customers have ordered the product before it was available. To solve this, the negative values were recoded as missing values in order to maintain valid data. In addition, the relationship length (days since first purchase), and recency (days since previous purchase) also contained negative values for February. This is because of coding issues with date and time format. To solve this, the negative values were set to 0 in order to maintain a valid dataset.

3.2.4. Missing value analysis

After the recoding, descriptive statistics showed that some of the variables had a large number of missing values (see table 3). For example, for the variable age 57% of the 512.173 values were missing. These missing values affect the results of the analysis, because cases with missing values are traditionally deleted listwise and not included in the model estimation. Age recordings are primarily missing because of non-response or non-recording in the back-end system of the retailer. Device and incoming channel recordings are missing because of discrepancies between Google Analytics and the back-end data from the retailer. For the days that an item is active on the website, the values are unknown for some of the items because of measurement error in the back-end system of the retailer.

Variable Valid N Nr of missing cases % Missing Mean

Age 217.963 294.210 57,44% 46,97

Inhabitants 511.963 235 0,05% 87741

Device category 406.361 105.812 20,66% N/A

Incoming Channel 390.799 121.374 23,70% N/A

Time item is active on website 480.987 31.186 6,09% 193,65


To address this problem, the categorical variables ‘Device Category’ and ‘Incoming channel’ are recoded by adding an additional category named ‘Missing’. For the numeric variables age, inhabitants, and days active on website, a method of mean replacement is used, based on the mean from the initial descriptive statistics. This method is used because it is a fast and valid way to deal with a large amount of missing values (Cousineau & Chartier 2010).
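The two treatments of missing values described above can be sketched in Python as follows. The values are invented and the function names `impute_mean` and `add_missing_category` are hypothetical; missing values are represented by `None` for the purpose of this illustration.

```python
from statistics import mean

def impute_mean(values):
    """Replace missing numeric values (None) by the mean of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def add_missing_category(values, label="Missing"):
    """Give missing categorical values (None) their own category."""
    return [label if v is None else v for v in values]

ages = impute_mean([40, None, 50, None, 60])
devices = add_missing_category(["desktop", None, "mobile"])
```

Mean replacement leaves the mean of a variable unchanged but reduces its variance, which is worth keeping in mind for a variable such as age where more than half of the values are replaced.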

3.2.5. Correlations and multicollinearity

This section elaborates on the dependency between the independent variables, by testing and controlling for multicollinearity between the predictor variables. This is done by means of a correlation matrix. The full matrix can be found in appendix 1, table 14.

The number of previous orders and the total amount previously spent showed a sign of multicollinearity (r= ,844 with p < 0,01) (see appendix 1, table 14). To correct for this, the number of previous orders was recoded as a dummy variable, which gets value 1 if a customer has placed more than one order. This led to a correlation of r= ,553 with p < 0,01.

However, this recoding led to a higher correlation between the number of previous orders (i.e. frequency) and the days since first purchase (i.e. relationship length) (r= ,861). Frequency has a VIF score of 4,085 and relationship length has a VIF score of 4,224 (see appendix 2, table 15). Both scores are below the cut-off value of 5, so this correlation is not considered to distort the results.
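The relation between a correlation and the VIF can be made explicit with a short Python sketch. It uses the standard definition VIF = 1 / (1 - R²), where R² comes from regressing one predictor on the other predictors. With only two predictors, R² equals the squared pairwise correlation, which is the simplifying assumption made here; the function name is hypothetical.

```python
def vif_from_r_squared(r_squared):
    """Variance inflation factor for a predictor with auxiliary R-squared."""
    return 1.0 / (1.0 - r_squared)

# Pairwise correlation between frequency and relationship length (r = 0.861).
r = 0.861
vif_two_predictors = vif_from_r_squared(r ** 2)

# R-squared at which the VIF reaches the cut-off value of 5.
r_squared_at_cutoff = 1.0 - 1.0 / 5.0
```

The pairwise value is a lower bound for the VIF in the full model, since adding further predictors can only increase the auxiliary R². This is consistent with the reported VIF scores being somewhat higher than the two-predictor value.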


Additionally, the moderating effect of number of substitute*complementary items showed marginal multicollinearity with the complementary (r= ,637 with p < 0,01) and substitute (r= ,578 with p < 0,01) item variables. This can be explained by the fact that this variable is a combination of the number of substitutes and complementary items. It is therefore logical that it is correlated with both of these variables. According to a recent paper by Disatnik & Sivan (2016) this is not considered to be a problem, because the multicollinearity exists because of interval scaling.

Also, the order date dummy variable is correlated with the temperature (r= ,625 with p< 0,01). This can be explained by the fact that the delivery policy was changed on 25 April, which is also around the time that the temperature naturally starts to rise. In addition, the order date dummy variable is moderately negatively correlated with the days since last order (i.e. recency) (r= -,536).

3.2.6. Description of sample

After the data preparation the final dataset contained 202.143 transactions in total, including 508.890 different purchased items (observations). These items are divided into 14 product categories (see table 4). The items are divided into substitutes, complementary items or both, based on product category.

This results in 125.155 substitutes, 105.744 complementary items, 197.461 items classified as both and 80.530 items classified as single item. The average item price is 24,76 euros.


3.3. REPRESENTATIVENESS OF DATA

Besides the data preparation issues addressed above, the data also have limitations. Next to the option of purchasing and returning items online, the retailer also provides the option to purchase items online and return them offline. However, the retailer does not keep track of which items are bought online and returned offline. Therefore only items that are purchased and returned online are taken into account in this research. This might influence the results and might not provide a full picture of the effect on product returns. However, the dataset contains enough valid data points to provide a reliable overview of the estimates.

In addition, only customers who made a purchase are taken into account. Google Analytics data showed that some consumers returned items, without having made a purchase. This is considered as a measurement error, occurring when customers are closing the browser before the Google Analytics script has loaded on the web page. Since customers have to make at least one purchase in order to return an item, only these customers and transactions are taken into account for the analysis.

When further examining the data, it appears that males are also making purchases. However, since the retailer only sells women's apparel, the male customers are treated as female customers for analysis purposes, and gender is not included in the analysis.

3.4. MODEL


3.4.1. Model specification

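When including all the variables in a function, this leads to the discrete choice regression function

\[
P_i = F(\mathrm{B}'\mathrm{X}_{ij}) = \frac{e^{\mathrm{B}'\mathrm{X}_{ij}}}{1 + e^{\mathrm{B}'\mathrm{X}_{ij}}},
\]

which indicates the probability that an item $i$ is returned, given a certain transaction $j$.

$\mathrm{B}'\mathrm{X}_{ij}$ can be defined with the following regression formula, where C indicates a control variable, and where the variables vary for each item $i$ and per basket (transaction) $j$.

\[
\begin{aligned}
\mathrm{B}'\mathrm{X}_{ij} ={} & \beta_0 + \beta_1\,\text{NrSubstitute\_items}_i + \beta_2\,\text{NrComplementary\_items}_i \\
& + \beta_3\,\text{NrSubstitute*Complementary\_items}_i + \beta_4\,\text{Outfit\_dummy}_i + \beta_5\,\text{Item\_price}_i \\
& + \beta_7\,\text{Sweater}_i + \beta_8\,\text{Vest}_i + \beta_9\,\text{Blouse}_i + \beta_{10}\,\text{Tuniek}_i + \beta_{11}\,\text{Shirt}_i \\
& + \beta_{12}\,\text{Pants}_i + \beta_{13}\,\text{Leggings}_i + \beta_{14}\,\text{Dress}_i + \beta_{15}\,\text{Skirt}_i + \beta_{16}\,\text{Blazers}_i \\
& + \beta_{17}\,\text{Jackets}_i + \beta_{18}\,\text{Accessories}_i + \beta_{19}\,\text{Other}_i + \beta_{20}\,\text{C\_Age}_j \\
& + \beta_{21}\,\text{C\_Urbanity}_j + \beta_{22}\,\text{C\_Device\_mobile}_j + \beta_{23}\,\text{C\_Device\_tablet}_j + \beta_{24}\,\text{C\_Device\_multiple}_j \\
& + \beta_{25}\,\text{C\_Device\_Unknown}_j + \beta_{26}\,\text{C\_Channel\_Organic}_j + \beta_{27}\,\text{C\_Channel\_Direct}_j \\
& + \beta_{28}\,\text{C\_Channel\_PaidSearch}_j + \beta_{29}\,\text{C\_Channel\_Social}_j + \beta_{30}\,\text{C\_Channel\_Referral}_j \\
& + \beta_{31}\,\text{C\_Channel\_Affiliates}_j + \beta_{32}\,\text{C\_Channel\_Display}_j + \beta_{33}\,\text{C\_Channel\_Other}_j \\
& + \beta_{34}\,\text{C\_Channel\_Unknown}_j + \beta_{35}\,\text{C\_Recency}_j + \beta_{36}\,\text{C\_Frequency}_j + \beta_{37}\,\text{C\_Monetary\_value}_j \\
& + \beta_{38}\,\text{C\_Previous\_returns}_j + \beta_{39}\,\text{C\_Relationship\_length}_j + \beta_{40}\,\text{C\_Time\_market\_item}_i \\
& + \beta_{41}\,\text{C\_Delivery\_policy}_j + \beta_{42}\,\text{C\_Time\_Purchasing}_j + \beta_{43}\,\text{C\_PartofWeek}_j + \beta_{44}\,\text{C\_Temperature}_j + \dots
\end{aligned}
\]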
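How the regression formula translates into a return probability can be sketched in a few lines of Python. The coefficients and item values below are hypothetical and only chosen to illustrate the computation; they are not estimates from this research, and only two of the predictors are included. The basket-level random effect of the final model is left out of the sketch.

```python
import math

def return_probability(xb):
    """Logistic function: map the linear predictor B'X onto a probability."""
    return 1.0 / (1.0 + math.exp(-xb))  # equal to e^xb / (1 + e^xb)

def linear_predictor(coefficients, values, intercept):
    """Intercept plus the sum of coefficient * value over all predictors."""
    return intercept + sum(coefficients[name] * values[name] for name in values)

# Hypothetical coefficients and item values, for illustration only.
betas = {"NrSubstitute_items": 0.3, "Item_price": 0.01}
item = {"NrSubstitute_items": 2, "Item_price": 25.0}
p = return_probability(linear_predictor(betas, item, intercept=-2.8))
```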


4. RESULTS

This chapter elaborates on the results, based on the model described in the previous section. First some descriptive results are given to get an initial overview of the data and subsequently the results from the logit regression analysis are discussed.

4.1. DESCRIPTIVE RESULTS

To provide a first overview of the data related to purchase and return behavior, some descriptive results are given in tables 4-8. Of the total quantity of items purchased, 136.207 items were returned, leading to an overall return percentage of 26% (see table 4). Shirts and accessories are returned the least (14% and 13% respectively), and dresses are returned the most (33%). When an item is classified as both substitute and complementary, it is returned the most (36%), and when it concerns a single item it is returned the least (12%) (see table 5). The return percentages for substitute and complementary items are nearly the same (23% and 22% respectively). Items belonging to an outfit are returned slightly more often than individual items (28% compared to 26%) (see table 6). Customers return the fewest items when using a mobile device (21%) to make a transaction, and the most when they use multiple devices (37%) (see table 7).

For incoming channel, customers are returning the least when they come on the website via Social (23,6%), and the most when they come via Affiliates (30%) (see table 8).

Product category Purchased Returned Return%
Sweater 9909 2766 28%
Vest 49721 11416 23%
Blouse 53285 15890 30%
Tuniek 27085 8465 31%
Tops 146634 36279 25%
Shirt 11101 1555 14%
Pants 115536 30791 27%
Leggings 26431 5439 21%
Dress 35625 11803 33%
Skirt 7220 2174 30%
Blazers 13889 4342 31%
Jackets 13548 3842 28%
Accessories 10352 1336 13%
Other 567 109 19%
TOTAL 520903 136207 26%


Item type Purchased Returned Return%

Substitute 128784 30261 23%

Complementary 107145 23317 22%

Both 200156 72747 36%

Single item 84818 9882 12%

TOTAL 520903 136207 26%

Table 5: Return percentage per item type

Outfit vs. collection Purchased Returned Return%

Collection 423197 108723 26%

Outfit 97706 27484 28%

TOTAL 520903 136207 26%

Table 6: Return percentage outfit versus collection

Device category Purchased Returned Return%

Missing 107303 26066 24%
Desktop 246107 67832 28%
Mobile 46659 9611 21%
Tablet 83493 18805 23%
Multiple 37341 13893 37%
TOTAL 520903 136207 26%

Table 7: Return percentage per device category

Incoming channel Purchased Returned Return%

Missing 122694 33863 28%
Direct 62908 16192 26%
Email 125929 33588 27%
Organic Search 109289 26683 24%
Paid Search 40262 10029 25%
Referral 5017 1415 28%
Affiliates 17113 5199 30%
Display 26062 6449 25%
Social 4974 1175 24%
(Other) 6655 1614 24%
TOTAL 520903 136207 26%

Table 8: Return percentage per incoming channel


Variable Chi-square value

Item type 20087,266*** (3)

Product category 4022,950*** (13)

Outfit 195,636*** (1)

Device category 4027,505*** (4)

Incoming channel 523,334*** (9)

Note: *** when p< ,001, numbers in brackets indicate the degrees of freedom

Table 9: Chi-square test of differences in return percentage per category.
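For readers who want to see how such a test statistic is built up, the sketch below computes a Pearson chi-square statistic for a 2x2 table of item group by returned yes/no, using the counts from table 6. The function name is hypothetical. The outcome need not coincide with the value reported in table 9, which may be based on a different sample or test procedure.

```python
def chi_square_2x2(returned_a, purchased_a, returned_b, purchased_b):
    """Pearson chi-square statistic for returned yes/no in two groups."""
    observed = [
        [returned_a, purchased_a - returned_a],
        [returned_b, purchased_b - returned_b],
    ]
    total = purchased_a + purchased_b
    col_totals = [returned_a + returned_b, total - returned_a - returned_b]
    row_totals = [purchased_a, purchased_b]
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of group and return outcome.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed[i][j] - expected) ** 2 / expected
    return statistic

# Counts from table 6: collection items versus outfit items.
chi2_outfit = chi_square_2x2(108723, 423197, 27484, 97706)
```

A 2x2 table has one degree of freedom, which matches the number in brackets for the outfit variable in table 9.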

4.2. RESULTS FINAL MODEL

The bivariate statistics described above give a broad idea of the return probability, but do not take into account all the factors that might influence it. For example, the weather at the time of purchase and the number of previous returns of a customer are not taken into account. Therefore a more sophisticated method is used that takes these and other effects into account when determining the product return probability. Since the dataset matches a panel data structure (multiple lines with items for each transaction), a panel random effects (XT) logit model is used.

To correct for the fact that different items occur in the same basket (transaction), a random effects logit model for panel data is estimated, paneled by transaction ID. An advantage of a random effects model is that it assumes that individual heterogeneity varies across items (De, Hu & Rahman 2013). Additionally, all observations are included in a random effects model, including the ones without variability (De, Hu & Rahman 2013), which increases the representativeness of the sample.

The remainder of this section elaborates on the results from this logistic regression analysis. For the analysis different models were estimated to determine which model has the best fit to the data, based on the model specified in section 3.4.1.


4.2.1. Model fit comparison

After having estimated both models, with and without interaction effects, this section elaborates on the model fit of both models.

Measures | Intercept-only model (panelled) | Model 1 - XT-logit RE | Model 2 - Full XT logit RE with interaction effect
McFadden R2 | | 0,102 | 0,103
-2 Log Likelihood | 533.039,20 | 478.784,92 | 478.311,12
Likelihood ratio Chi2 | | 22.283,16*** (43) | 22.756,95*** (44)
AIC | 533.039,20 | 478.870,92 | 478.399,12

Table 10: Goodness of fit measures XT-logit models

The full model with the interaction effect (model 2) has a slightly better fit than the model without the interaction effect, based on the McFadden R2 (R2= 0,103 compared to R2= 0,102) (see table 10). The likelihood ratio Chi-square relative to the intercept-only model is also higher for the full model (Chi2= 22.756,95) than for the model without interaction (Chi2= 22.283,16). In addition, the AIC, which corrects for adding additional predictors to the model, is lower and thus indicates a better fit (AIC= 478.399 compared to AIC= 478.871). This means that when the moderating effect is added to the model, the fit to the data is slightly better and the product return probability of an item is predicted better.
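The fit measures in table 10 can be recomputed from the -2 log likelihood values with a short Python sketch. It uses the standard definitions of McFadden's R2 (one minus the ratio of the model log likelihood to the intercept-only log likelihood) and the AIC (-2 log likelihood plus two times the number of parameters). Taking the number of parameters equal to the degrees of freedom in table 10 (43 and 44) is an assumption.

```python
def mcfadden_r2(neg2ll_model, neg2ll_null):
    """McFadden pseudo R-squared from -2 log likelihood values."""
    return 1.0 - neg2ll_model / neg2ll_null

def aic(neg2ll_model, n_parameters):
    """Akaike information criterion: -2 log likelihood plus 2k."""
    return neg2ll_model + 2 * n_parameters

NULL = 533039.20     # intercept-only model
MODEL_1 = 478784.92  # without interaction effect
MODEL_2 = 478311.12  # with interaction effect

r2_model_1 = mcfadden_r2(MODEL_1, NULL)
r2_model_2 = mcfadden_r2(MODEL_2, NULL)
aic_model_1 = aic(MODEL_1, 43)  # assumed number of parameters
aic_model_2 = aic(MODEL_2, 44)  # assumed number of parameters
```

Rounded to three decimals the two R2 values are 0,102 and 0,103, and the AIC values are 478.870,92 and 478.399,12, in line with table 10.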

4.2.2. Model results

After having discussed the fit of both models, this section elaborates on the estimates and results from the two models. In table 11, the results are presented in odds ratios (Exp(B)). In general, most of the variables show a significant effect on product return probability.

Variable | Model 1 – Xtlogit RE, Exp(B) | Model 2 – Full Xtlogit RE with interaction effect, Exp(B)
Nr of Substitute items | 1,322*** | 1,396***
Nr of Complementary items | 1,255*** | 1,287***
Substitute*complementary items | | ,981***
Outfit dummy | 1,055** | 1,052**
Item price | 1,012*** | 1,012***
Constant | ,060*** | ,056***
Sweater | 1,475*** | 1,476***
Vest | ,913*** | ,898***
Blouse | 1,421*** | 1,418***
Tuniek | 1,593*** | 1,592***
Shirt | ,498*** | ,497***
Pants | 1,319*** | 1,317***
Leggings | 1,001 n.s. | ,999 n.s.
Dress | 1,891*** | 1,776***
Skirt | 1,405*** | 1,411***
Blazers | 1,530*** | 1,514***
Jackets | 1,341*** | 1,308***
Accessoires | ,348*** | ,331***
Other | ,727** | ,670**
Age | ,999 n.s. | ,999 n.s.
Nr of Inhabitants (x1000) | ,999*** | ,999***
Device Mobile | ,662*** | ,667***
Device Tablet | ,783*** | ,787***
Device Multiple | 2,003*** | 1,989***
Device Unknown | ,556*** | ,551***
Channel Unknown | 1,695*** | 1,772***
Channel 1 Direct | ,927*** | ,922***
Channel 3 Organic | ,905*** | ,903***
Channel 4 Paid Search | ,965 n.s. | ,965 n.s.
Channel 5 Referral | ,981 n.s. | ,979 n.s.
Channel 6 Affiliate | 1,088** | 1,082**
Channel 7 Display | ,984 n.s. | ,990 n.s.
Channel 8 Social | ,817*** | ,821***
Channel 9 Other | ,860** | ,859**
Recency (weeks) | 1,000 n.s. | 1,001 n.s.
Frequency | ,951** | ,947**
Monetary value (x100euros) | ,989*** | ,989***
Previous returns | 1,073*** | 1,071***
Relationship length (weeks) | 1,003*** | 1,002***
Time on market (item) | ,997*** | ,997***
Delivery policy | 1,033 n.s. | 1,030 n.s.
Time of purchasing (out of office hours) | 1,037** | 1,034**
Part of week (weekend) | ,999 n.s. | ,999 n.s.
Temperature | 1,001 n.s. | 1,001 n.s.
Rain | 1,003 n.s. | 1,003 n.s.

Note: * when p<.10, ** when p<.05, and *** when p<0.01, n.s. = not significant.


In general, the odds ratios of the key variables (substitute and complementary items) increase when the interaction effect is included in the model (see table 11). The odds ratio for substitutes increases by 0,074 (from 1,322 to 1,396), and the odds ratio for the direct effect of complementary items increases by 0,032 (from 1,255 to 1,287). This means that the effects get stronger when complementary items are added to the basket. The odds ratios of the other variables remain similar when the interaction effect is added. Because the model fit is best when the interaction effect is included, the results from the full model are interpreted in the next section.

4.2.3. Results key variables

The strongest effect in the model (table 11) comes from the number of substitute items in a basket (p< ,001 with Exp(B)= 1,396). This means that when an additional substitute item is added to the basket, the odds of returning an item increase by a factor of 1,396 (= 39,6%), given that all other factors in the model are held constant. This result supports hypothesis 1, indicating that the number of substitutes in a basket has a positive effect on the product return probability of an item. For example, when a customer buys a pair of jeans and adds another pair of jeans, the odds that a pair of jeans is returned increase by 39,6%, because the two pairs of jeans replace each other’s value.
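Because an odds ratio acts on the odds and not directly on the probability, the change in probability depends on the starting level. The Python sketch below illustrates this for the odds ratio of 1,396, using the overall return share of 26% as an example baseline. The choice of baseline is an assumption for illustration, the function name is hypothetical, and the basket-level random effect of the model is ignored.

```python
def apply_odds_ratio(baseline_probability, odds_ratio):
    """Probability after multiplying the baseline odds by `odds_ratio`."""
    odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)

baseline = 0.26  # overall return share, used here as example baseline
p_after = apply_odds_ratio(baseline, 1.396)
relative_change = p_after / baseline - 1.0
```

At this baseline the relative increase in the probability is smaller than the 39,6% increase in the odds, and the gap widens as the baseline probability gets higher.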

This significant positive effect is also visible for the direct effect of complementary items, with an odds ratio of 1,287 (p< ,001), indicating that the odds of return increase by 28,7% when one additional complementary item is purchased. This is counter-intuitive, since complementary items are expected to enhance each other’s value (Shocker et al. 2004), leading to a lower product return probability. A possible explanation for the positive direct effect could be that two items bought together might in the end not fit together in terms of product characteristics (e.g. color), which increases the return probability of the item.


An initial explanation for this effect could be that the addition of a complementary item increases the chance that it matches with one of the other items in the basket, leading to a lower return probability of the substitutes.

One could argue that this significant moderating effect is due to the increase of items in the basket, following the line of reasoning of Kumar, George & Pancras (2008), who found that customers purchase more when purchasing from different categories, compared to when they only buy in a single category. In line with this, Petersen & Kumar (2009) found that customers who buy more also return more. In order to control for this potential basket size effect, the same model was estimated with substitute and complementary items included as dummy variables (getting value 1 when the item was a substitute or complement) and with basket size in units included. The results from this estimation can be found in appendix 4, table 18. These results still showed a significant decreasing effect of complementary items on the relationship between a substitute item and product returns, confirming the results from the initial model estimates. The direct effects of substitutes and complementary items also showed a similar significant increasing effect on the product return probability (see appendix 4, table 18). The dummy model even provided stronger odds ratios for the direct effects of substitute and complementary items compared to the initial model estimates, indicating that basket size does not affect the direction of the effects.


For outfit, the effect is included as X=1 because they are returned more than collection items. For device and incoming channel the effects are also included as 0 because these values occur the most within this variable. For the frequency dummy variable, the value of 1 is included for X because most of the customers have purchased more than once. For the delivery policy dummy variable, the value is included as 0, because most orders occurred before the change in free delivery policy.

Also with time of ordering, the dummy variable is included as 1 to maintain the effect of out-of-office hours (most purchased) in the constant. For part of the week, the dummy variable is included as 0, because during weekdays the effect on return is the highest.

Variable Mean Standard deviation
Item price 24,7609 11,512
Age 46,9625 7,009
Temperature 7,963617 4,922
Rain 25,22 41,711
Recency (weeks) 7,3931 6,015
Monetary value (x100euros) 4,5997 8,546
Previous returns ,9399 3,012
Relationship length (weeks) 40,0797 39,939
Time on market (item) 27,5581 20,089
Nr of inhabitants (x1000) 87,8142 132,075
Number of substitutes 1,518 1,987
Number of complements 1,759 2,423

Table 12: Mean and standard deviation from numeric variables


Figure 2: Interaction effect of amount of substitute items*complementary items on product return probability.

The difference in slopes of the two lines for complementary items indicates a negative interaction effect: complementary items weaken the (increasing) effect of substitute items on product return probability (see figure 2). When the number of complementary items in a basket is low (i.e. 1 item), the decreasing effect on the relationship between the number of substitute items and return probability is larger compared to when the number of complementary items in a basket is high (i.e. 9 items). In short, this means that when a low number of complementary items is added to a basket with substitute items, the return probability of an item is lower than when a high number of complementary items is added to the same basket.

A possible explanation for this result is that when only one additional complementary item is added to the basket, it increases the chance that the complementary item matches with the substitute items, leading to a decrease in return probability of the substitute items. However, when the number of complements in the basket is large, this decreasing influence on the product return probability of substitute items is lower. An explanation could be that with a large number of complementary items, the number of different product categories in a basket also becomes large (Kumar, George & Pancras 2008). This increases the chance that the different items do not match with each other, making the earlier mentioned ‘matching-effect’ less strong.
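On the odds scale, the interaction can be illustrated with a small Python sketch based on the odds ratios of model 2 in table 11. It assumes that the interaction term enters the model as the product of the number of substitute and complementary items, so that the odds ratio of one extra substitute equals 1,396 multiplied by ,981 for every complementary item in the basket. The function name is hypothetical, and the sketch describes odds, not the probabilities plotted in figure 2.

```python
def substitute_odds_ratio(n_complements, or_substitute=1.396, or_interaction=0.981):
    """Odds ratio of one extra substitute, given the number of complements."""
    return or_substitute * or_interaction ** n_complements

low_complements = substitute_odds_ratio(1)   # low amount of complementary items
high_complements = substitute_odds_ratio(9)  # high amount of complementary items
```

Under this assumption each additional complementary item lowers the odds ratio of an extra substitute by about 1,9%, while the odds ratio stays above 1 for the basket compositions considered here.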

For outfit, analysis shows that when customers purchase (part of) an outfit, it increases the product return probability compared to when they would buy an item from the collection.

[Figure 2 plot. Vertical axis: return probability (0,00 to 1,00). Horizontal axis: low amount of substitutes (1) and high amount of substitutes (7). Legend: low amount of complementary items in basket (1). Plotted values: 0,726; 0,946; 0,944; 0,978.]


This means that hypothesis 2, suggesting a negative effect on the product return probability, is not confirmed. When customers purchase (part of) an outfit, the probability that an item will be returned increases with a factor 1,052 (or with 5,2%) compared to when customers purchase an item from the collection. A reason for this could be that when customers have an item from an outfit that does not match their expectations, the whole outfit is returned because it belongs to a total set. Another explanation could be that customers think that they are less likely to find other matching items together with the items from the outfit that do match expectations. This then might lead to the return of the whole outfit as well.

The price of an item has a small positive significant effect on the product return probability (Exp(B)= 1,012). This confirms hypothesis 3, which states that the price of an item has a positive effect on the product return probability of an item. When the item price increases by 1 euro, the odds that an item is returned increase by 1,2%. This is in line with previous research (e.g. Anderson 2009; Petersen & Kumar 2009), which found that consumers are more critical in their evaluation when items are more expensive.

4.2.4. Results control variables


In line with previous research (e.g. Anderson et al. 2009) the results indicate that customers become less critical in the evaluation when a commodity item is purchased, because the price is generally lower compared to other types of clothing, leading to a lower product return probability. The results also indicate that Leggings are not significantly different from the benchmark category ‘Tops’ (Exp(B)= ,999 with p= ,986).

Note: A positive percentage (red bar) means an increase in the return probability compared to the reference category ‘Tops’. A negative percentage (green bar) means a decrease in the return probability compared to the reference category ‘Tops’. Grey bars are not significant.

Figure 3: Effect on product return probability in percentage per product category

The amount of previous returns has a slightly positive effect on the product return probability (Exp(B)= 1,071). This research shows that for each additional previous item that is returned, the return probability increases with 7,1%. This is in line with previous research of Petersen & Kumar (2009) who found similar results. They found that when customers have previously returned an item, they have experience with the return process, leading to an increase in probability that future purchased items are returned. The frequency (number of previous orders) has a small significant negative effect (Exp(B)= ,947), meaning that when customers have ordered more than once, the product return probability decreases with 5,3% compared to when they did not previously order at the retailer.

This is in line with previous literature suggesting that the more familiar customers are with the retailer and its assortment, the better they know what they want and the less uncertainty they experience, leading to a lower return probability (Petersen & Kumar 2009; Minnema & Bijmolt et al. 2016).
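The per-item effect of previous returns can be extended to several previous returns. The sketch below assumes that the number of previous returns enters the logit model as a continuous predictor, so that the odds ratio compounds multiplicatively per additional returned item; that specification is an assumption here, and the function name is illustrative.

```python
PREVIOUS_RETURNS_EXP_B = 1.071  # Exp(B) per additional previously returned item (from the text)


def cumulative_odds_ratio(exp_b: float, units: int) -> float:
    """Odds ratio for a change of `units` in a predictor with per-unit odds ratio exp_b."""
    return exp_b ** units


for k in (1, 2, 5, 10):
    ratio = cumulative_odds_ratio(PREVIOUS_RETURNS_EXP_B, k)
    pct = (ratio - 1.0) * 100.0
    print(f"{k:>2} previous returns: odds multiplied by {ratio:.3f} ({pct:+.1f}%)")
```

Under this assumption the effect is not additive: two previous returns raise the odds by somewhat more than twice 7,1%.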

For device category, purchases made via mobile and tablet have a negative effect on the product return probability (Exp(B)= ,667 (= 33,3%) and Exp(B)= ,787 (= 21,3%) respectively) compared to desktop, which is used as the reference category. An explanation could be that most customers purchase via desktop, leading to the highest return probability for this category, following the line of reasoning from Petersen & Kumar (2009) that customers who purchase more also return more. When customers come to the website with multiple devices, the odds that an item is returned increase by a factor of 1,989 compared to desktop, holding all other factors in the model constant. A reason for this could be that customers who use multiple devices are doing an extensive product search, making them extra critical in the evaluation, leading to an increase in the product return probability. When the device is unknown, the odds are multiplied by a factor of ,568 (a decrease of 44,2%) compared to when customers purchase via desktop.
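The device effects can be put side by side by converting each reported Exp(B) into a percentage change in the odds relative to desktop. This is a minimal sketch using only the factors quoted in the paragraph above; the dictionary and function names are illustrative. For the unknown-device factor of ,568 this conversion gives 43,2%, slightly different from the 44,2% quoted in the text; the sketch simply reports what the arithmetic yields.

```python
# Exp(B) values per device category as quoted in the text (reference: desktop).
DEVICE_EXP_B = {
    "mobile": 0.667,
    "tablet": 0.787,
    "multiple devices": 1.989,
    "unknown": 0.568,
}


def pct_change_in_odds(exp_b: float) -> float:
    """Percentage change in the odds implied by an odds ratio Exp(B)."""
    return (exp_b - 1.0) * 100.0


# Round to one decimal to match the precision used in the text.
device_effects = {
    device: round(pct_change_in_odds(exp_b), 1)
    for device, exp_b in DEVICE_EXP_B.items()
}

for device, pct in device_effects.items():
    print(f"{device:<17} {pct:+.1f}% vs. desktop")
```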

The effects of incoming channel on product return probability are mostly negative compared to the reference category Email (see figure 4).

Note: A positive percentage (red bar) means an increase in the return probability compared to the reference category ‘Email’. A negative percentage (green bar) means a decrease in the return probability compared to the reference category ‘Email’. Grey bars are not significant.

Figure 4: Effect on product return probability in percentage per incoming channel

[Figure 4: bar chart per incoming channel (Direct, Organic, Paid Search, Referral, Affiliate, Display, Social, Other); axis ranging from -40,0% to 100,0%]
