Identifying the characteristics of online traffic builders: The importance of first purchases

Bas L. Nijhuis

MSc. Marketing Intelligence MSc. Marketing Management

Supervisor: prof. dr. T.H.A. Bijmolt

2nd Supervisor: dr. P.S. van Eck

Date: 15 January 2021

Jozef Israelsstraat 53, 9718GD Groningen B.L.Nijhuis@student.rug.nl

ABSTRACT

MANAGEMENT SUMMARY

The concept of an online traffic builder is central to this thesis. In the offline environment, traffic builders are products that are discounted in order to attract customers to a store, where these customers subsequently purchase more products. It is interesting to know whether such products also exist in an online context. This thesis posits the online traffic builder concept and tries to identify its characteristics. It defines an online traffic builder as a first purchase that increases the number of additional purchases and results in a larger basket size. The main research question is: What product characteristics determine the first online purchase and what product characteristics of first purchases increase the number of products bought?

In the theoretical framework, several characteristics are hypothesized to affect the probability of being a first purchase and the number of additional purchases. These characteristics are price, price promotions, product quantity, and product variety.

The data is provided by a Dutch retailer of construction materials and covers more than ten years of orders from six different webstores. The dataset contained 181,858 observations; however, many observations needed to be removed, leaving 70,936 observations for the analysis.

Two models are created to analyze the different characteristics affecting the probability of a first purchase and additional purchases. To inspect the first purchase probability, a logit model is used, and to inspect the expected number of additional purchases, a Negative Binomial model is employed.

The results show that several hypothesized effects are supported by the analysis. Products that are higher priced and purchased in larger quantities are significantly more likely to be the first purchase. In contrast, higher-priced first purchases decrease the expected number of additional purchases. Furthermore, first purchases of products that themselves offer more variety (in terms of several sizes and colours) increase the expected number of additional purchases. Contrary to expectations, discounts did not show a substantial effect.

implications, an online retailer should not focus on discounting its products to increase the number of additional purchases. The thesis suggests that online retailers should focus their advertising on products that are often bought in higher volumes, and should try to balance the prices of their products against the variety those products offer in order to pursue the online traffic builder effect. Finally, this thesis notes several theoretical considerations and suggests ideas for further research.

1. INTRODUCTION

§1.1 Introduction

In the past, offline retailers have found products that can increase traffic to their stores when they are offered at reduced prices. Retailers still use this tactic by employing deep discounting strategies (Gangwar, Kumar and Rao 2014). The discounted products are so-called traffic builders. These products, e.g. beer and diapers, are promoted and discounted to attract customers and increase traffic to the stores. The ultimate purpose of traffic builders is that customers will do their regular grocery shopping at these stores. Traffic builders could be considered one of the main reasons for visiting the retailer, as the reduced pricing attracts the customers.

In turn, most online retailers are interested in increasing traffic to their website and try to do so through digital advertising. The amount of money spent on digital advertising by retailers in the United States is substantial and is expected to reach 28.23 billion dollars in 2020. These expenses are lower than initially expected due to the current COVID-19 pandemic. Nevertheless, they are very high and are expected to grow substantially in the coming years (Cramer-Flood 2020). The most important and most used digital advertising tools are search advertising, social media advertising, banner advertising and video advertising (Statista 2020). Traffic to the website is essential for an online business, as no traffic means no sales. It might be expected that companies that invest substantially in these advertising possibilities know why customers visit their website and why they buy their products. However, this is mostly not the case. Most companies focus on understanding the customer journey, which encompasses all the actions a customer takes to arrive at the moment of purchase (Lemon and Verhoef 2016), and therefore collect data on how customers land on their website and on which products. The data retrieved from digital advertising is instrumental in understanding this customer journey. Several academic papers focus on understanding and influencing the path that a customer takes (De Haan et al. 2018; De Haan, Wiesel and Pauwels 2015) and provide useful insights in this context.

customers' visiting intentions and the reasoning behind their purchases. With this knowledge, companies can use advertising channels more efficiently.

In the online shopping environment, retailers have not yet discovered products that function as traffic builders. This thesis tries to identify these online traffic builders to understand why customers visit websites and purchase specific products. Identifying and understanding the product characteristics that determine the first purchase, increase traffic and increase the number of additional purchases could be of immense value to online and multichannel retailers. With this knowledge, online retailers could aim their digital advertising expenses more efficiently at the products that matter, focus their SEO activities more effectively and optimize their inventory management. To my understanding, no products have yet been identified that represent these traffic builders in an online environment. Therefore, the central question to be addressed in this study is:

What product characteristics determine the first online purchase and what product characteristics of first purchases increase the number of products bought?

To answer this question sufficiently, the research question is divided into two sub-questions. The first sub-question is:

RQ1: What products are remarkably more often a first purchase instead of additional purchases?

This question determines the most important aspect of the research, as the distinction between first and additional purchases needs to be made to establish potential online traffic builders. Without it, the concept of an online traffic builder is hard to posit. The first purchase mimics the role of the product that is the reason for visiting a store in an offline environment. Therefore, finding products that are purchased mostly as a first purchase rather than as an additional purchase is very interesting. Once those products are found, the characteristics that determine being the first purchase can be analyzed.

RQ2: How do different first purchases influence the basket size?

This question covers another relevant issue of this thesis: what makes a traffic builder valuable is the additional purchases that the customer makes in the store. If customers only purchase the (discounted) traffic builder, this would have serious implications for retailers' strategy. In essence, the traffic builder increases the basket size in terms of the number of different products bought by the customer. Determining the effect that different first purchases have on the basket size is therefore essential in defining the products that act as a traffic builder. This question will give insight into whether traffic builders exist in an online environment. To investigate this relation, the characteristics that drive additional purchase decisions, conditional on the first purchase decision, should be analyzed.

§1.2 Scope of the thesis

This thesis is focused on identifying online traffic builders and is thus closely related to online traffic. However, it concentrates on product-specific characteristics that may influence the traffic on the website. The observed characteristics should be attributable to the specific products. This research will not cover the impact of SEO or other digital advertising possibilities. SEO is, of course, extremely important for online traffic generation. However, this research tries to find other aspects that are valuable for online traffic, and in particular relations that can eventually improve SEO and digital advertising. Thus, these advertising opportunities themselves will not be researched in this thesis.

Further, this research will not investigate the demographics of customers, although it could be exciting to find relations between age and differing purchasing patterns. However, as the provided data is from a website that sells building materials, most sales emanate from companies. Therefore, this thesis cannot focus on finding any relationship between demographics and online purchasing behaviour.

§1.3 Theoretical and social relevance

Having answers to the proposed questions is very interesting for online retailers. These questions help to understand online shopping behaviour and how this behaviour might be influenced. Understanding the characteristics of first and additional purchases, and the potential differences between them, will benefit online retailers by allowing them to adjust their assortment accordingly, develop more effective pricing strategies and make more effective use of marketing instruments. Furthermore, this thesis adds to the understanding of which product characteristics determine the order sequence. The research also aids in understanding which psychological aspects can play a role in the purchasing process in an online environment.

§1.4 Structure of the thesis

This thesis uses sales data to analyze characteristics of first and additional purchases in order to answer the above-mentioned questions. The data is analyzed with two modelling techniques: logistic regression and Negative Binomial regression. These techniques and their usage are elaborated upon in chapter 4. The data is provided by an online retailer in the Netherlands that specializes in construction materials and covers more than ten years of sales.

2. THEORETICAL FRAMEWORK

§2.1 Online traffic builder

This thesis is trying to identify the characteristics of a traffic builder in an online environment. A traffic builder in an offline environment attracts customers to stores, and subsequently, customers purchase more in that store. In the offline environment, it is hard to allocate the sequence of the purchase of products. In an online environment, however, data availability makes it possible to track the purchase sequence at a large scale. This has some implications for the definition of the traffic builder in an online environment.

This study defines an online traffic builder as a first purchase that increases the number of additional purchases and results in a larger basket size. By definition, a traffic builder can only be the first product added to the online basket by a customer. However, not every first product that is added to the online basket can be considered a traffic builder. Only those first purchases that increase the basket size, in terms of the number of unique products, can be considered an online traffic builder. Therefore, a traffic builder has two characteristics: it is a first purchase and it increases the basket size.

§2.2 Conceptual model

The conceptual model used in this thesis is provided in Figure 1. The expected relations are discussed in §2.3 to §2.6.

Figure 1. Conceptual model of an online traffic builder.

§2.3 Price

Researchers have reached consensus about the price being one of the most important determinants for customers in whether or not to buy a product at a specific store (Lichtenstein, Ridgway and Netemeyer 1993; Tellis 1988). Customers perceive the price as the amount of money that must be given up, and if this amount is higher, this negatively affects the purchase probability (Lichtenstein, Ridgway and Netemeyer 1993).

Because product prices can and do fluctuate over time, the price concept used in this section needs to be defined. This section uses the non-promotional price of a product, that is, the price when no temporary price reductions are applied. This does not mean that the product price cannot be lowered. Product prices fluctuate in the long run, for instance, based on competitive and strategic rationales or increases in the products' cost price. Therefore, when a lower price is discussed in this section, this indicates that the price is lower than the average non-promotional price.

an offline store (Kukar-Kinney and Close 2010). This stresses the comprehensiveness of the pricing concept and the factors that affect different pricing strategies' effectiveness.

In the deep discounting strategy, higher-priced products are offered with larger discounts to increase traffic and sales (Gangwar, Kumar and Rao 2014). Following the reasoning that discounting products with a higher monetary value could increase sales, this research expects that higher-priced products are more likely to be a first purchase. Therefore, hypothesis 1a is proposed:

Hypothesis 1a: First purchases are more often higher-priced products than low-priced products.

Furthermore, the price fairness theory is expected to have an underlying effect on different shopping behaviours. Price fairness theory proposes that consumers evaluate whether a price is fair by using external reference prices. The perception of whether a price is fair or unfair stems from their assessment of whether the price is reasonable, acceptable, or justifiable (Homburg, Lauer and Vomberg 2019). The presence of utilitarian or hedonic products does also play a role in this perception. For utilitarian products, the differences in specific products are clearer, whereas differences for hedonic products are harder to estimate (Okada 2005). Thus, it is more reasonable to be able to make a proper fairness indication about utilitarian products. Given that customers for utilitarian products compare more and more websites (Li et al. 2020), products' prices seem highly important and could make the difference. The effect of the price fairness perception might emphasize the effect of lower prices. In addition to that, following the reasoning that prices play an instrumental role in choosing one webstore over another, lower prices can attract customers to their webstores. This thesis proposes the following hypothesis.

Hypothesis 1b: Lower priced products as first purchase increase the number of additional purchases, compared to high-priced products.

§2.4 Price Promotions

takes into account consumer behaviour, competitor actions, and supply in order to determine product prices (Fisher, Gallino and Li 2018), ‘everyday low price’ and ‘HiLo’, which stand for always having low prices and temporarily offering deep discounts, respectively (Bell and Lattin 1998). All these strategies are used to increase revenue in a particular situation, and most of them make use of price promotions. When using price promotions, however, a retailer ultimately does not want to sell only the products that are discounted or on promotion. A retailer wants to use promotions effectively to determine the most profitable pricing strategy. An important part of an effective pricing strategy is knowing whether, and which, promotions actually increase the number of products bought by a customer.

Price promotions can increase sales substantially in the short run (Bijmolt, Van Heerde and Pieters 2005). Multi-unit promotions increase sales even more than single-unit promotions (Drechsler et al. 2017). The short-run sales increase is a positive effect. However, Bijmolt, Van Heerde and Pieters (2005) state that in the long run the positive effect of price promotions could be somewhat diminished, as price promotions are mostly followed by post-promotional dips for the specific products. The finding that price promotions can increase sales substantially is interesting, as it suggests that products on promotion are more likely to be the first purchase of a customer.

Hypothesis 2a: First purchases are more often promoted on price, compared to additional purchases.

Hypothesis 2b: The number of additional purchases is higher when first purchases are promoted on price, compared to non-promoted first purchases.

§2.5 Product Quantity

The amount a customer buys of one product could play a role in identifying characteristics of a traffic builder. Partly based on the quantity matching heuristic, which posits that respondents are more likely to choose an assortment when the number of options in the assortment matches their purchase quantity goals (Chernev 2008), this thesis investigates the effect that the volume of a product bought has on the number of additional purchases. According to the quantity matching heuristic, customers determine the place they purchase based on the store's offering and the quantity they intend to buy (Chernev 2008). Purchasing a large volume of one product implies that the visited store matches their quantity goals and that customers intend to buy more products. Therefore, the product bought in a high volume could be the reason for the visit and might be more likely to be a first purchase. Further, as the quantity matching heuristic assumes that this high-volume purchase accompanies other product purchases, it might be expected that a product bought in high volume increases the number of additional purchases. Therefore, the following hypotheses are proposed:

Hypothesis 3a: First purchases are more often bought in higher volumes, compared to low volumes.

Hypothesis 3b: The number of additional purchases is higher when first purchases are bought in higher volume, compared to low volume first purchases.

§2.6 Product Variety

assumption, Mallapragada, Chandukala and Liu (2016) state that having a wider variety available facilitates greater motivation for customers to purchase and subsequently spend more on purchases. Therefore, it can be stated that offering a wider variety of products would be beneficial for webstores. However, webstores can offer different kinds of products: products that themselves offer more variety, e.g. different colours and sizes, or products that only come in one size and/or colour. Products that themselves offer the customer a greater variety could extend the product offering and impact the customers' purchase behaviour. Furthermore, Berger, Draganska and Simonson (2007) found that customers without any brand preference prefer the brand that offers more product variety, indicating that products offered in more variations are preferred and might therefore be bought earlier than products offering less variation. Based on this, and on the idea that greater product variety increases the motivation to purchase more products and possibly spend more, this thesis hypothesizes the following:

3. RESEARCH DESIGN

§3.1.1 Data

The dataset used is provided by a Dutch online retailer of construction materials. The retailer operated several webstores and provided data from six of them. It sells its products solely online and does not have any physical stores. The dataset contains order data for the period between March 2010 and August 2020 and includes product prices, discount percentages, the number of products ordered and the type of product variation.

The provided data contained 181,858 observations. Each observation represents a specific product that belongs to a specific order. Outliers were present in the data and needed to be handled. Many of these outliers were orders that could not be attributed to specific products and therefore needed to be removed. In total, 1,301 observations were removed for this reason; the strings used to identify these outliers can be found in appendix A. Furthermore, 19 observations contained a value of ‘0’ for the variable store_id. This ‘0’ did not indicate an actual webstore, and these observations were therefore removed.

As this research is interested in finding first purchases, the orders that were actually placed must be determined. The initial number of unique orders was 51,151. However, not every order in the dataset was actually placed, and therefore a substantial number of orders needed to be deleted. An order is considered actually placed if the variable qty_ordered is equal to the variable qty_invoiced. This restriction reduced the number of unique orders to 31,498 and the number of observations to 70,936.

Furthermore, as first purchases are one of the main concepts of interest, they need to be determined. In the data, the product placed first in an order is considered the first purchase, as the ranking within an order is based on the moment a product is placed in the online basket. The number of first purchases therefore coincides with the number of orders and amounts to 31,498. What is less obvious is the number of first purchases that were followed by additional purchases. This number amounts to 16,743, and the maximum number of unique products per order is 46.
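The order-level steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure actually used for the thesis: the function name and the toy rows are hypothetical, rows are assumed to arrive in basket order, each row is assumed to be one unique product, and the restriction that qty_ordered equals qty_invoiced is assumed to apply to every row of an order.

```python
from collections import defaultdict


def flag_first_purchases(rows):
    """Keep placed orders and flag the first product of each order.

    `rows` is a list of dicts with the keys order_id, product_id,
    qty_ordered and qty_invoiced, listed in the order in which the
    products were added to the basket (an assumption of this sketch).
    """
    by_order = defaultdict(list)
    for row in rows:
        by_order[row["order_id"]].append(row)

    flagged = []
    for items in by_order.values():
        # Assumption: an order counts as placed only if every row was
        # invoiced in full (qty_ordered == qty_invoiced).
        if any(r["qty_ordered"] != r["qty_invoiced"] for r in items):
            continue
        basketsize = len(items)  # assumes one row per unique product
        for position, r in enumerate(items, start=1):
            flagged.append({
                **r,
                "number": position,
                "firstpurchase": 1 if position == 1 else 0,
                "basketsize": basketsize,
                # Products added to the basket after this one.
                "additionalpurchase": basketsize - position,
            })
    return flagged


# Hypothetical toy data: order 2 was never invoiced and is dropped.
toy_rows = [
    {"order_id": 1, "product_id": "A", "qty_ordered": 10, "qty_invoiced": 10},
    {"order_id": 1, "product_id": "B", "qty_ordered": 2, "qty_invoiced": 2},
    {"order_id": 2, "product_id": "C", "qty_ordered": 5, "qty_invoiced": 0},
    {"order_id": 3, "product_id": "B", "qty_ordered": 1, "qty_invoiced": 1},
]
placed = flag_first_purchases(toy_rows)
n_orders = len({r["order_id"] for r in placed})
n_first = sum(r["firstpurchase"] for r in placed)
n_first_with_additional = sum(
    1 for r in placed if r["firstpurchase"] == 1 and r["additionalpurchase"] > 0
)
```

By construction the number of first purchases equals the number of retained orders, which mirrors the observation above that the two counts coincide.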

§3.1.2 Created and used variables

to increase the readability. Afterwards, in Table 2, an overview of the relevant basic statistics of these variables is provided. Ultimately, the operationalized versions of the variables used in the models are presented in Table 3.

Existing variables

product_id Indicates the unique, specified value of a specific product and is used to make the distinction between different products clearer

order_id Indicates the number of the order wherein the product was purchased

price Represents the non-promotional price for which the product is bought

qty_ordered Indicates the ordered quantity of a specific product that is purchased

discount_percent Indicates the discount percentage a product received

product_type Represents the variety wherein that product is offered. Where ‘simple’ indicates that the product was sold without any variety, and ‘grouped’ indicates that a product was sold in more variety

Created Variables

pricecat Represents the created categories of the variable price. A value of 1 indicates prices between €0.00 and €5.49, a value of 2 indicates prices between €5.50 and €19.99, a value of 3 indicates prices between €20.00 and €64.99 and a value of 4 indicates prices of €65.00 or above

qty_orderedcat Represents the created categories of the variable qty_ordered. A value of 1 indicates a quantity between 1 and 2. A value of 2 indicates a quantity between 3 and 9. A value of 3 indicates a quantity between 10 and 29 and a value of 4 indicates a quantity of 30 and more

discount_percentcat Represents the created categories of the variable discount_percent. A value of 1 indicates a discount percentage between 0.00 and 0.99, a value of 2 indicates a discount percentage between 1.00 and 4.99 and a value of 3 indicates a discount percentage of 5.00 or more

product_typeBin A binary variable of product_type where ‘single’ is represented by a ‘0’ and ‘grouped’ by a ‘1’

number Represents the order in which the product is added to the basket. Where a four represents that the product is added to the basket as fourth

firstpurchase A binary variable, and a 0 represents that a purchased product was not a first purchase and a 1 represents that a product was a first purchase. This variable is created based on the sequence in the order id. The first product in the order id is considered the first purchase, and the other purchased products are considered additional purchases

additionalpurchase A numerical variable where the values represent the number of products that are added to the order after the specific product was added. For clarification: a product received a value of seven when the product itself was the 2nd

Quarter Represents seasonality. The value is created based on the date the order is placed and received a value of 'Q1' if the order is placed in the first three months of the year. The quarters do not represent the seasonality perfectly as winter actually starts on the 21st of December, and quarter one starts on the 1st of January

basketsize Represents the total number of unique products in that specific product's placed order. The variable is created by taking the maximum value of the variable number per order id. Thus, an observation receives the value of that order’s basket size

totalbasketvalue Provides the total monetary value of the placed order of that specific product, including discounts. This variable is created by summing the purchase amount per order and subtracting it with the sum of discount amount

totalquantityordered Presents the total quantity of the products bought in a specific order

NumberFirstPurchases Counts the number of first purchases of a specific product id. The variable is created by summing firstpurchase per product id

NumberAdditionalPurchases Counts the times a purchase of a specific product resulted in additional purchases in that order

NumberPurchases Counts the number of purchases of a specific product id. The variable is created by counting the number of occurrences of a specific product

PercentageFirstPurchase Represents the percentage of the times a product is bought, that product is the first purchase

PercentageAdditionalPurchase Represents the percentage of the times a product is bought that it resulted in additional purchases

store.2.5.6 Dummy variable representing a ‘1’ if a product is bought in store 2, 5 or 6

Table 1: Existing and created variables.
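The three categorized variables in Table 1 can be reproduced with a simple threshold lookup. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the bounds are copied from the category descriptions in Table 1, and each category is treated as a half-open interval that starts at its lower bound.

```python
from bisect import bisect_right

# Lower bounds of the second and higher categories, taken from Table 1.
PRICE_BOUNDS = [5.50, 20.00, 65.00]   # euros
QUANTITY_BOUNDS = [3, 10, 30]         # units ordered
DISCOUNT_BOUNDS = [1.00, 5.00]        # discount percentage


def categorize(value, bounds):
    """Return the 1-based category of `value` for ascending lower bounds."""
    return bisect_right(bounds, value) + 1


def pricecat(price):
    return categorize(price, PRICE_BOUNDS)


def qty_orderedcat(quantity):
    return categorize(quantity, QUANTITY_BOUNDS)


def discount_percentcat(discount_percent):
    return categorize(discount_percent, DISCOUNT_BOUNDS)
```

For example, `pricecat(19.99)` falls in category 2 and `qty_orderedcat(30)` in category 4, in line with the descriptions in Table 1.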

variety, and 28732 observations were sold with variety, indicating that 40.5% of the products are sold 'grouped'. Lastly, in the second quarter, most products are sold.

Variable                        Min     Mean    Max
price                           0.05    44.0    1284.30
discount_percent¹               2.00    7.48    41.65
qty_ordered                     1       17      5760
basketsize                      1       2.26    46
totalbasketvalue                0.4     581     11541
additionalpurchase              0       1.26    45
NumberFirstPurchase             0       4.5     1050
NumberAdditionalPurchases       0       3.3     1383
NumberPurchases                 1       10.3    1936
totalquantityordered            1       38      5760
PercentageFirstPurchase         0       0.49    1
PercentageAdditionalPurchase    0       0.273   1

Binary variables                0       1
firstpurchase                   39438   31498
product_typeBin                 42204   28732
Store.2.5.6                     41299   29637

                                Single  Grouped
product_type                    42204   28732

                                Q1      Q2      Q3      Q4
Quarter                         14038   27943   18563   10392

Categorized variables           1       2       3       4
pricecat                        18059   17978   17501   17398
discount_percentcat             64851   4606    1479    -
qty_orderedcat                  25850   21237   14902   8947

Table 2: Basic descriptive statistics.
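As a quick consistency check on Table 2, the level counts of each categorical variable should add up to the 70,936 observations in the dataset, and the share of 'grouped' observations should match the 40.5% mentioned above. The snippet below only re-uses numbers printed in Table 2; the variable names in the code are chosen for illustration.

```python
TOTAL_OBSERVATIONS = 70936

# Level counts as printed in Table 2.
level_counts = {
    "firstpurchase": [39438, 31498],
    "product_type": [42204, 28732],
    "Store.2.5.6": [41299, 29637],
    "Quarter": [14038, 27943, 18563, 10392],
    "pricecat": [18059, 17978, 17501, 17398],
    "discount_percentcat": [64851, 4606, 1479],
    "qty_orderedcat": [25850, 21237, 14902, 8947],
}

# True for every variable whose levels sum to the dataset size.
consistent = {
    name: sum(counts) == TOTAL_OBSERVATIONS
    for name, counts in level_counts.items()
}

# Share of observations sold as 'grouped' products, in percent.
grouped_share = round(100 * 28732 / TOTAL_OBSERVATIONS, 1)
```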

Variable Description

Price

price_j The non-promotional price of product j

pricecat_j The category of the non-promotional price of product j. Where 1 = €0.00 - €5.49, 2 = €5.50 - €19.99, 3 = €20.00 - €64.99, 4 = > €65

Price Promotions

discount_percent_j The amount of discount percentage of product j

discount_percentcat_j The category of the discount percentage of product j. Where 1 = 0.0 - 1.0%, 2 = 1.0 - 5.0%, 3 = > 5.0%

Quantity Ordered

qty_ordered_ij The number ordered of product j in basket i

qty_orderedcat_ij The category of the number ordered of product j in basket i. Where 1 = 1 - 2, 2 = 3 - 9, 3 = 10 - 29, 4 = > 30

Product Variety

product_type_j The type of product j: Single or Grouped

product_typeBin_j The type of product j. Where 0 = Single and 1 = Grouped

Seasonality

Quarter_j The quarter wherein product j is purchased

Dummy variable

Store.2.5.6_j Dummy variable for different stores. Where 1 = product j is purchased in stores 2, 5 or 6

Table 3: Operationalized variables used for model building.

§3.1.3 Different stores

The provided data covers six different webstores. As briefly mentioned in section 3.1.1, store 0 represents 'orders' of the admin; as these do not represent actual orders, these observations are removed. Store 1 was mainly concerned with selling all kinds of bricks. Store 2 was focused on light tools and equipment, such as angle brackets and wall-cramps. Store 3 concentrated mainly on construction materials, like roof tiles and concrete bricks. Store 4 was more focused on wooden construction materials, such as wooden shelves and baseboards. Store 5 was focused on all kinds of different paving, and store 6 was concerned with all kinds of floor tiles.

As the stores sell different products, differences in sales between the various stores are present. Table 4 shows the basic descriptives of the different stores. To indicate potential differences, it also provides the values of the variables for products that are a first purchase more than 75% of the times they are purchased, and for products whose purchase resulted in an additional purchase more than 75% of the times.

                                  Store 1   Store 2   Store 3   Store 4   Store 5   Store 6
Mean basket size                  3.39      1.47      2.55      2.88      1.85      1.86
- First Purchase > 75%            3.27      1.45      2.67      2.8       1.70      2.01
- Additional Purchase > 75%       4.79      4.71      5.23      4.46      3.76      3.72
Mean basket value                 363.65    57.45     730.53    315.15    579.22    522.54
- First Purchase > 75%            388.63    58.21     729.73    395.06    531.82    556.25
- Additional Purchase > 75%       411.84    92.43     925.43    356.59    737.78    617.65
Mean order quantity               66.84     7.95      39.67     26.62     36.19     28.98
- First Purchase > 75%            62.41     8.01      52.35     20.28     30.64     34.00
- Additional Purchase > 75%       62.06     14.95     109.42    37.39     65.52     46.54
Number of unique products sold    1383      1376      1226      407       2246      276
Number of orders                  2932      1584      11510     711       13887     874
- First Purchase > 75%            774       1397      2953      225       4062      359
- Additional Purchase > 75%       1291      86        963       200       331       39

Table 4 shows differences between the stores: store 2 has the lowest mean basket value and store 3 the highest. In addition, stores 3 and 5 have the highest number of orders. The most important observation is that stores 2, 5 and 6 show a mean basket size below 2, whereas stores 1, 3 and 4 show a mean basket size above 2.5. This smaller mean basket size is also reflected in the number of orders per store. The number of orders involving a product that is a first purchase more than 75% of the times is relatively high for stores 2, 5 and 6, indicating that many orders in those stores include just one product. In contrast, the number of orders involving a product whose purchase resulted in an additional purchase more than 75% of the times is relatively high for stores 1, 3 and 4, indicating that those stores sell more additional products.

To visualize these differences, Figure 2 plots, per store, the percentage of the times that a specific product is a first purchase against the percentage of the times that the purchase of that product resulted in an additional purchase. Only products that are purchased more than five times are plotted. The figure shows that stores 2, 5 and 6 indeed include more products that are more often bought as a first purchase, whereas the products in stores 1, 3 and 4 more often result in additional purchases. However, for store 3, this is less clear.
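The two percentages plotted in Figure 2 can be derived per product from the observation-level variables. The following sketch is a hedged illustration: the function name and toy rows are hypothetical, and it assumes that a purchase "resulted in an additional purchase" when at least one product was added to the basket after it (additionalpurchase > 0).

```python
from collections import Counter


def product_percentages(rows, min_purchases=6):
    """Return {product_id: (share first purchase, share followed)}.

    `rows` are dicts with product_id, firstpurchase (0 or 1) and
    additionalpurchase (a count). Only products bought more than five
    times (at least `min_purchases` rows) are kept.
    """
    purchases = Counter()
    first = Counter()
    followed = Counter()
    for r in rows:
        pid = r["product_id"]
        purchases[pid] += 1
        first[pid] += r["firstpurchase"]
        # Assumption: "resulted in an additional purchase" means that
        # at least one product was added after this one.
        followed[pid] += 1 if r["additionalpurchase"] > 0 else 0
    return {
        pid: (first[pid] / n, followed[pid] / n)
        for pid, n in purchases.items()
        if n >= min_purchases
    }


# Hypothetical toy data: (product_id, firstpurchase, additionalpurchase).
toy = [("A", 1, 2), ("A", 1, 0), ("A", 1, 1), ("A", 1, 0),
       ("A", 0, 1), ("A", 0, 0), ("B", 1, 0), ("B", 0, 0)]
toy_rows = [
    {"product_id": p, "firstpurchase": f, "additionalpurchase": a}
    for p, f, a in toy
]
shares = product_percentages(toy_rows)
```

In the toy data, product B is bought only twice and is therefore left out, just as products with five or fewer purchases are left out of Figure 2.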

Figure 3: Mean percentage of first purchases and additional purchases of different stores.

§3.2.1 Data analysis

This research is interested in identifying first purchases and the potential basket size of the customer. Therefore, two models are needed: one that inspects the utility of a customer purchasing a product first, and one that examines the utility of a customer purchasing additional products. In the first model, a customer decides to purchase a product first if the utility of purchasing that product first is higher than the utility of not purchasing that product first. In the second model, regarding the additional purchases, the customer decides to purchase more additional products if the utility of purchasing more products is higher than the utility of purchasing fewer products. In this research, it is assumed that customers strive to maximize their utility.

The dependent variables in these models are the first purchase and additional purchases. The created variable firstpurchase, discussed in section 3.1.2, is used as the first model's dependent variable. Regarding the additional purchases, the created variable additionalpurchase is used as the dependent variable and is considered count data. The creation

Furthermore, the First Purchase model requires all factors in the full dataset that affect the probability of the dependent variable, so the dataset described in §3.1.1 is used for this model. For the Additional Purchase model, however, an adaptation of this dataset is needed. Because the impact of different aspects of first purchases needs to be assessed, the dataset can only contain observations of actual first purchases. The dataset is therefore filtered on the created variable firstpurchase: an observation is included in the adapted dataset only if it contains a '1' for this variable. This results in a dataset of 31498 observations.
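The filtering step described above can be sketched in a few lines. This is a minimal illustration, not the code used in the thesis: only the variable name firstpurchase comes from the text, while the example rows, the other field names and the function name are hypothetical.

```python
# Hypothetical order lines; only the field name `firstpurchase` is taken
# from the thesis, the other fields and all values are made up.
order_lines = [
    {"order_id": 1, "product": "A", "firstpurchase": 1},
    {"order_id": 1, "product": "B", "firstpurchase": 0},
    {"order_id": 2, "product": "C", "firstpurchase": 1},
]


def keep_first_purchases(rows):
    """Return only the rows that are flagged as a first purchase."""
    return [row for row in rows if row["firstpurchase"] == 1]


first_purchases = keep_first_purchases(order_lines)
print(len(first_purchases), "of", len(order_lines), "rows kept")
```

Applied to the full dataset, a filter of this kind would leave only the observations that enter the Additional Purchase model.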

To test hypotheses H1a, H2a, H3a and H4a, a model is built with first purchase as the dependent variable (DV). Since this DV is binary and a logistic distribution is assumed, logistic regression is employed to build a logit model (Leeflang et al. 2015, p. 264-266).
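For reference, the general form of a binary logit model can be written as follows. The notation is an assumption made for illustration: x_i stands for the vector of explanatory variables of observation i and β for the coefficient vector; neither symbol is taken from the thesis.

```latex
\[
P(\text{firstpurchase}_i = 1 \mid x_i)
  = \frac{\exp(x_i^{\top}\beta)}{1 + \exp(x_i^{\top}\beta)},
\qquad
\ln\frac{P(\text{firstpurchase}_i = 1 \mid x_i)}
        {1 - P(\text{firstpurchase}_i = 1 \mid x_i)}
  = x_i^{\top}\beta .
\]
```

The second expression shows that the coefficients act linearly on the log-odds, which is why odds ratios are a natural way to interpret them.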

For the second model, which tests H1b, H2b, H3b and H4b, a model is built with additional purchases as the dependent variable. As the DV represents count data, this research employs a Negative Binomial model (Leeflang et al. 2015, p. 285-288). Several steps have been taken to conclude that a Negative Binomial model is the best model for the data. In the first instance, it is assumed that a Poisson model could fit the data. To test this, several assumptions have to be checked. One of these concerns dispersion. Therefore, an overdispersion test is performed. The results (a significant p-value and 𝛼 = 0.795) show that overdispersion was present.
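The idea behind checking for overdispersion can be illustrated with a rough, moment-based sketch. It is not the formal overdispersion test used in the thesis: it ignores the explanatory variables and simply compares the sample variance of a count variable with its mean, assuming a variance function of the form Var(Y) = μ + αμ². The counts below are made up.

```python
from statistics import mean, variance


def dispersion_summary(counts):
    """Return the mean, the sample variance and a moment-based alpha.

    Under a Poisson model the variance equals the mean. Assuming
    Var(Y) = mu + alpha * mu**2, a positive alpha points to
    overdispersion. This is an intercept-only approximation.
    """
    m = mean(counts)
    s2 = variance(counts)  # sample variance with n - 1 in the denominator
    alpha = (s2 - m) / (m * m)
    return m, s2, alpha


# Made-up numbers of additional purchases per order.
counts = [0, 0, 0, 1, 1, 2, 3, 5, 8, 10]
m, s2, alpha = dispersion_summary(counts)
print(f"mean={m:.2f} variance={s2:.2f} alpha={alpha:.2f}")
```

When the variance clearly exceeds the mean, as in this made-up sample, a Negative Binomial model is usually preferred over a Poisson model.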

§3.2.2 Model specification

The following section discusses the variables that are specified in the proposed models.

Product characteristics

Price. To build the model, this research argues that price is one of the main determinants for increasing the first purchase utility. The price included for every observation is the non-promotional price of the product; if a promotion is offered on that product, this price does not change. Furthermore, price is categorized into four categories to investigate the effect of different price classes.
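A categorization into four classes could look like the sketch below. The class boundaries used in the thesis are not reported in this section, so the quartile cut points, the function name and the example prices are assumptions made purely for illustration.

```python
from bisect import bisect_right
from statistics import quantiles


def make_categorizer(values, n_classes=4):
    """Build a function that maps a value to a class 1..n_classes.

    The cut points are sample quantiles of `values`. This is an
    assumption: the thesis does not state its class boundaries.
    """
    cut_points = quantiles(values, n=n_classes)

    def categorize(value):
        # Values equal to a cut point fall into the higher class.
        return bisect_right(cut_points, value) + 1

    return categorize


# Made-up non-promotional prices.
prices = [5, 10, 20, 40, 80, 160, 320, 640]
to_pricecat = make_categorizer(prices)
pricecat = [to_pricecat(p) for p in prices]
print(pricecat)
```

The same helper could be reused for any other continuous variable that is turned into classes.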

Price promotions. As price could be of importance, this research argues that promotions could have an actual impact on the dependent variables. Therefore, the promotions variable is included in both models. This variable represents the discount percentage that is offered on a specific product. Again, a categorized version of this variable is created to see the impact of different discount classes.

Quantity ordered. The number of units ordered of a product is expected to influence the dependent variables in this research and is therefore incorporated in the models. Like the previous variables, quantity ordered is also categorized to measure its impact and to draw sounder conclusions about the variable's effect.

Product variety. The variety in which a product is offered is expected to have an impact. The variable product_type represents this variety. It is a factor with the levels ‘simple’ and ‘grouped’, where ‘simple’ indicates that only one product variation is offered and ‘grouped’ that more than one variation is offered. This variable is made binary to make the interpretation of the results clearer.

Control Variables

Seasonality. Besides the variables that represent the product characteristics, this research also includes a control variable for seasonality, to control for possible influences of the season. The variable Quarter is included and represents the four seasons of the year.

Store dummy. The dataset contains data from six webstores. To account for potential differences

Models

The logit model, which is created to analyze the characteristics of first purchases, includes firstpurchase as the dependent variable. The explanatory variables are the variables that represent price, price promotions, quantity ordered, product variety and seasonality, and the store dummy.

The Negative Binomial model, which is created to analyze the effect of the characteristics of first purchases on the number of additional purchases, includes additionalpurchase as the dependent variable. The explanatory variables are the variables that represent price, price promotions, quantity ordered, product variety and seasonality, and the store dummy.

§3.2.3 Model comparison

First of all, testing for multicollinearity in the models resulted in Variance Inflation Factor (VIF) scores below 2. These values are provided in appendix B. This result gives reason to assume that multicollinearity is not an issue in the analysis.
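To make the VIF concrete: for predictor j it equals 1 / (1 - R²_j), where R²_j is obtained by regressing predictor j on all other predictors. The sketch below covers only the special case of two predictors, in which R²_j is the squared Pearson correlation. The data and function names are hypothetical, and the general case with more predictors requires an auxiliary regression that is not shown here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def vif_two_predictors(x1, x2):
    """VIF when a model has exactly two predictors: 1 / (1 - r**2)."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r * r)


# Made-up predictor values with a correlation of 0.8.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
print(round(vif_two_predictors(x1, x2), 2))
```

A VIF of 1 means that a predictor is uncorrelated with the others; larger values indicate more shared variance.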

To ensure that the models predict better than a model without any explanatory variables (null model), the models used in the analysis are compared to the null model. To test the models' fit, the dataset is randomly divided into a training set and a test set, which are used to compute the hit rate and the Root Mean Squared Error (RMSE). The R² measures used in this thesis to indicate the explained variance are Nagelkerke, McFadden, and CoxSnell. Furthermore, several diagnostics are used to determine the fit of the different models. Besides the AIC and BIC values and the R² values, the hit rate and log-likelihood are used for the logit model. The RMSE is used next to the AIC, BIC and R² values for the Negative Binomial model.
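The two predictive measures mentioned above can be sketched as follows. The split ratio, the random seed, the 0.5 classification threshold and all example values are assumptions for illustration; the thesis does not report them in this section.

```python
import random
from math import sqrt


def train_test_split(rows, test_share=0.3, seed=42):
    """Randomly divide rows into a training set and a test set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_share)
    return shuffled[n_test:], shuffled[:n_test]


def hit_rate(actual, predicted_prob, threshold=0.5):
    """Share of binary outcomes that are classified correctly."""
    hits = sum(
        1 for y, p in zip(actual, predicted_prob)
        if (p >= threshold) == (y == 1)
    )
    return hits / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error between observed and predicted values."""
    squared = [(y - f) ** 2 for y, f in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))


train, test = train_test_split(list(range(10)))
print(len(train), len(test))
print(hit_rate([1, 0, 1, 0], [0.9, 0.2, 0.4, 0.6]))
print(round(rmse([0, 2, 4], [1, 2, 2]), 3))
```

The hit rate applies to the binary first purchase outcome, whereas the RMSE applies to the predicted number of additional purchases.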

§3.2.4 Model development

values of the model to test whether the model improved in quality. If the values increased, that model was discarded and another change to the model was made. The model with the lowest AIC and BIC values and the most significant variables was used to estimate the results.
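The information criteria and the three pseudo-R² measures used for model comparison can all be computed from log-likelihoods. The sketch below uses the standard textbook definitions; the log-likelihoods, the number of parameters and the number of observations are hypothetical and are not taken from the thesis.

```python
from math import exp, log


def aic(log_lik, n_params):
    return 2 * n_params - 2 * log_lik


def bic(log_lik, n_params, n_obs):
    return n_params * log(n_obs) - 2 * log_lik


def mcfadden_r2(ll_model, ll_null):
    return 1 - ll_model / ll_null


def cox_snell_r2(ll_model, ll_null, n_obs):
    return 1 - exp(2 * (ll_null - ll_model) / n_obs)


def nagelkerke_r2(ll_model, ll_null, n_obs):
    # Cox-Snell rescaled by its maximum attainable value.
    max_r2 = 1 - exp(2 * ll_null / n_obs)
    return cox_snell_r2(ll_model, ll_null, n_obs) / max_r2


# Hypothetical values for illustration only.
ll_null, ll_model, n_params, n_obs = -1000.0, -900.0, 12, 2000
print(aic(ll_model, n_params), round(bic(ll_model, n_params, n_obs), 1))
print(round(mcfadden_r2(ll_model, ll_null), 3))
print(round(cox_snell_r2(ll_model, ll_null, n_obs), 3))
print(round(nagelkerke_r2(ll_model, ll_null, n_obs), 3))
```

Lower AIC and BIC values indicate a better trade-off between fit and complexity, while higher pseudo-R² values indicate more explained variation relative to the null model.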

§3.2.5 Interpretation of the effects

To interpret the effects of the models, only variables that show a significant effect are used. For the categorized continuous variables in the logit model, the marginal effects are used to explain their effects. For the binary variables in the logit model, the odds ratio is used. The interpretation of the store-specific effects differs between these types of variables. For the continuous variables, the marginal effect of the main effect and the marginal effect of its interaction with the store dummy are summed to obtain the effect for the stores captured by the dummy. For the binary variables, the exponent of the sum of the coefficients of the main effect and the interaction with the dummy is calculated to obtain the effect for those stores.
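As a reminder of how these two quantities relate to a logit coefficient, the standard expressions are given below. The symbols are assumptions made for illustration: β_j is the coefficient of regressor x_j and P is the predicted probability of a first purchase. Whether the thesis evaluates the marginal effect at the sample means or averages it over observations is not stated in this section.

```latex
\[
\frac{\partial P}{\partial x_j} = \beta_j \, P \,(1 - P),
\qquad
\text{OR}_j = \exp(\beta_j).
\]
```

The marginal effect therefore depends on where it is evaluated, whereas the odds ratio is constant across observations.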

4. RESULTS

§4.1.1 First Purchase Model

For the First Purchase model, several models have been estimated to check which model has the best relative quality. This is judged based on each model's AIC and BIC values, and several R² measures are used to analyze the explained variance. The first model (Model 1) included all the IVs discussed above and the main and interaction effects of the store dummy. The interaction effect of Quarter with the store dummy was not significant and is removed in Model 2. In Model 2, only the dummy's interaction effect with the variable qty_ordered was not significant, and it is therefore not used in Model 3. All included variables in Model 3 showed highly significant effects (p-value < 0.01); thus, in trying to improve the model further, the variable price is categorized in Model 4. In Model 5, the variable qty_ordered is categorized. The variable discount_percent is categorized in Model 6. For the first time, the newly created model (Model 6) did not improve the AIC and BIC values, and therefore Model 6 is discarded. No remaining variables could be categorized; therefore, no new models are created. Model 5 showed the best AIC and BIC values, as well as the best R² values, and is used as the final estimation model. The steps taken in this process can be found in Table 5. The explanatory variables in the final model are pricecat, discount_percent, qty_orderedcat, product_typeBin, Quarter, store.2.5.6 and the interaction effects of store.2.5.6 with pricecat, discount_percent and product_typeBin.

Model fit

Variables                            Null model  Model 1  Model 2  Model 3  Model 4  Model 5  Model 6
price                                    -          x        x        x        -        -        -
pricecat                                 -          -        -        -        x        x        x
discount_percent                         -          x        x        x        x        x        -
discount_percentcat                      -          -        -        -        -        -        x
qty_ordered                              -          x        x        x        x        -        -
qty_orderedcat                           -          -        -        -        -        x        x
product_typeBin                          -          x        x        x        x        x        x
Quarter                                  -          x        x        x        x        x        x
store.2.5.6                              -          x        x        x        x        x        x
price * store.2.5.6                      -          x        x        x        -        -        -
pricecat * store.2.5.6                   -          -        -        -        x        x        x
discount_percent * store.2.5.6           -          x        x        x        x        x        -
discount_percentcat * store.2.5.6        -          -        -        -        -        -        x
qty_ordered * store.2.5.6                -          x        x        -        -        -        -
product_typeBin * store.2.5.6            -          x        x        x        x        x        x
Quarter * store.2.5.6                    -          x        -        -        -        -        -
AIC                                    97430      92637    92634    92635    91892    91572    91654
BIC                                    97439      92783    92753    92745    92002    91682    91764
Nagelkerke R²                            0        0.088    0.088    0.088    0.101    0.107    0.105
McFadden R²                              0        0.050    0.050    0.049    0.057    0.060    0.060
CoxSnell R²                              0        0.066    0.066    0.066    0.075    0.080    0.078

(x: variable included in the model; -: variable not included)

Table 5: Model development of different First Purchase models.

§4.1.2 Additional Purchases Model

quality of the model. Thus, Model 5 showed the best AIC and BIC values, as well as the best R² values, and is used as the final model for estimation. The steps taken in this process can be found in Table 6. The explanatory variables in the final model are pricecat, discount_percent, qty_orderedcat, product_typeBin, Quarter, store.2.5.6 and the interaction effects of store.2.5.6 with discount_percent, qty_orderedcat and product_typeBin.

Model fit

To test whether the final model (Model 5) fits better than the null model, the test set is used to compute the Root Mean Squared Error (RMSE). The RMSE of the final model (1.978) is lower than the RMSE of the null model (2.028). This indicates that Model 5 predicts better than the null model and thus has a better model fit. The R² values are the highest for the final model, indicating that the model explains more of the variance in the data than the other models. However, the R² values are relatively low, indicating that many other factors still play a role in the additional purchase decision.

Variables                            Null model  Model 1  Model 2  Model 3  Model 4  Model 5  Model 6
price                                    -          x        x        x        x        -        -
pricecat                                 -          -        -        -        -        x        x
discount_percent                         -          x        x        x        x        x        -
discount_percentcat                      -          -        -        -        -        -        x
qty_ordered                              -          x        x        x        -        -        -
qty_orderedcat                           -          -        -        -        x        x        x
product_typeBin                          -          x        x        x        x        x        x
Quarter                                  -          x        x        x        x        x        x
store.2.5.6                              -          x        x        x        x        x        x
price * store.2.5.6                      -          x        x        -        -        -        -
pricecat * store.2.5.6                   -          -        -        -        -        -        -
discount_percent * store.2.5.6           -          x        x        x        x        x        -
discount_percentcat * store.2.5.6        -          -        -        -        -        -        x
qty_ordered * store.2.5.6                -          x        x        x        -        -        -
qty_orderedcat * store.2.5.6             -          -        -        -        x        x        x
product_typeBin * store.2.5.6            -          x        x        x        x        x        x
Quarter * store.2.5.6                    -          x        -        -        -        -        -
AIC                                    97381      92805    92802    92800    92641    92201    92312
BIC                                    97398      92947    92919    92909    92750    92310    92421
Nagelkerke R²                            0        0.143    0.142    0.142    0.147    0.160    0.156
McFadden R²                              0        0.047    0.047    0.047    0.049    0.053    0.052
CoxSnell R²                              0        0.136    0.136    0.136    0.140    0.152    0.149

(x: variable included in the model; -: variable not included)

§4.2.1 Estimating First Purchases

The effects of the independent variables on the first purchase decision are reported in Table 7.

First Purchase                      Coefficient   SE        Marginal effects   Odds Ratio

Price
  pricecat                            0.442***    (0.011)      0.109***         1.555***
  pricecat * store.2.5.6             -0.174***    (0.017)     -0.043***         0.840***
Price Promotions
  discount_percent                    0.045***    (0.005)      0.011***         1.046***
  discount_percent * store.2.5.6     -0.054***    (0.006)     -0.013***         0.948***
Quantity Ordered
  qty_orderedcat                      0.187***    (0.009)      0.046***         1.205***
Product Variety
  product_typeBin                    -0.410***    (0.024)     -0.100***         0.664***
  product_typeBin * store.2.5.6       0.177***    (0.039)      0.044***         1.193***
Seasonality
  Quarter Q2                         -0.078***    (0.022)     -0.019***         0.925***
  Quarter Q3                          0.067**     (0.024)      0.017**          1.069**
  Quarter Q4                          0.088**     (0.027)      0.022**          1.092**
Dummy
  store.2.5.6                         1.183***    (0.055)      0.286***         3.267***

AIC                                   91572
BIC                                   91682

*: p < 0.05   **: p < 0.01   ***: p < 0.001

Table 7: Effect of IV’s in the First Purchase model.

increases the probability of a first purchase by 6.6 percentage points, given that the product is bought in store 2, 5 or 6.

H2a is partially supported, as price promotion shows a positive, significant effect on the probability of a first purchase. This indicates that if the discount percentage increases, the probability of a first purchase becomes higher. Increasing the discount percentage marginally, e.g. from 2 percent to 3 percent, increases the probability of a first purchase by 1.1 percentage points. However, this positive effect is not present for products bought in store 2, 5 or 6. When the discount percentage increases marginally, the probability of a first purchase decreases by 0.2 percentage points (1.1 - 1.3 = -0.2), given that a product is bought in store 2, 5 or 6.

In H3a this thesis posited that a first purchase is more often bought in higher volumes. This hypothesis is supported by the data as the categorized variable of the quantity ordered shows a positive, significant effect on the probability of a first purchase. The analysis shows that if the quantity ordered category increases marginally, i.e. changing from category 1 to 2, the probability of a first purchase increases by 4.6 percentage points.

Contrary to the other hypotheses regarding the probability of a first purchase, H4a is not supported by the analysis. The variable product_type shows a negative, significant effect on the probability of a first purchase. Based on the odds ratio, the analysis shows that when a product is a ‘grouped’ product, the odds of a first purchase versus not a first purchase decrease by 33.6%. This effect, however, is moderated by the different stores. The exponent of the sum of the coefficients is exp(-0.410 + 0.177) = exp(-0.233) = 0.792, which means that the odds decrease by 20.8% when a product is a ‘grouped’ product compared to a ‘simple’ product, given that the product is bought in store 2, 5 or 6.
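The store-specific effects discussed for the First Purchase model can be reproduced from the values reported in Table 7. The sketch below only repeats that arithmetic; the function and variable names are chosen for illustration.

```python
from math import exp

# (main effect, interaction with store.2.5.6), as reported in Table 7.
marginal_effects = {
    "pricecat": (0.109, -0.043),
    "discount_percent": (0.011, -0.013),
}
product_type_coefs = (-0.410, 0.177)


def store_marginal_effect(main, interaction):
    """Marginal effect for stores 2, 5 and 6: sum of both effects."""
    return main + interaction


def store_odds_ratio(main_coef, interaction_coef):
    """Odds ratio for stores 2, 5 and 6: exponent of summed coefficients."""
    return exp(main_coef + interaction_coef)


price_effect = store_marginal_effect(*marginal_effects["pricecat"])
discount_effect = store_marginal_effect(*marginal_effects["discount_percent"])
grouped_odds_ratio = store_odds_ratio(*product_type_coefs)

print(round(price_effect, 3))        # 0.066
print(round(discount_effect, 3))     # -0.002
print(round(grouped_odds_ratio, 3))  # 0.792
```

Multiplying the marginal effects by 100 gives the changes in percentage points, and subtracting the odds ratio from 1 gives the 20.8% decrease in the odds.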

Control Variables

§4.2.2 Estimating Additional Purchases

The effects of the independent variables on the additional purchase decision are reported in Table 8.

Additional Purchase                 Coefficient   SE        Exp(coefficient)

Price
  pricecat                           -0.333***    (0.009)      0.717***
Price Promotions
  discount_percent                   -0.048***    (0.004)      0.953***
  discount_percent * store.2.5.6      0.046***    (0.006)      1.048***
Quantity Ordered
  qty_orderedcat                     -0.206***    (0.011)      0.814***
  qty_orderedcat * store.2.5.6        0.050***    (0.015)      1.051***
Product Variety
  product_typeBin                     0.374***    (0.022)      1.454***
  product_typeBin * store.2.5.6      -0.265***    (0.033)      0.767***
Seasonality
  Quarter Q2                          0.072***    (0.020)      1.074***
  Quarter Q3                         -0.071**     (0.022)      0.931**
  Quarter Q4                         -0.087***    (0.025)      0.912***
Dummy
  store.2.5.6                        -0.826***    (0.034)      0.438***

AIC                                   92201
BIC                                   92310

*: p < 0.05   **: p < 0.01   ***: p < 0.001

Table 8: Effect of IV’s in the Additional Purchase model.

The results support H1b, as the price category shows a negative, significant effect. This indicates that higher prices decrease the number of additional purchases. Thus, lower-priced first purchases have the highest expected number of additional purchases. The exponents of the coefficients are used to examine the effect of the variable. If the price category increases by one (keeping all other variables in the model constant), the expected number of additional purchases is multiplied by exp(-0.333) = 0.717; hence, it decreases by 28.3%.
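The multiplicative interpretation follows from the log link that count models such as the Negative Binomial model commonly use. The notation below is an assumption made for illustration: y_i is the number of additional purchases, x_i the vector of explanatory variables and β_j the coefficient of variable x_j.

```latex
\[
E[y_i \mid x_i] = \exp(x_i^{\top}\beta)
\quad\Longrightarrow\quad
\frac{E[y_i \mid x_{ij} + 1]}{E[y_i \mid x_{ij}]} = \exp(\beta_j),
\]
\[
\exp(-0.333) \approx 0.717
\quad\Longrightarrow\quad
(0.717 - 1) \times 100\% \approx -28.3\% .
\]
```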

effect of discount_percent. Thus, the effect for stores 2, 5 and 6 is the multiplication by the exponent of -0.048 + 0.046 = -0.002, that is, exp(-0.002) = 0.998. Hence, if the discount percentage increases by one (keeping all other variables in the model constant), the expected number of additional purchases for stores 2, 5 and 6 decreases by 0.2%.

The results do not support H3b; the categorized variable of the quantity ordered shows a negative, significant effect, indicating that it decreases the number of additional purchases. If the quantity ordered category increases by one (keeping all other variables in the model constant), the expected number of additional purchases is multiplied by exp(-0.206) = 0.814; hence, it decreases by 18.6%. The effect is moderated for stores 2, 5 and 6. The effect for these stores is calculated in the same manner as for H2b and is exp(-0.206 + 0.050) = exp(-0.156) = 0.856. Thus, the expected number of additional purchases for stores 2, 5 and 6 decreases by 14.4% if the category of the quantity ordered increases by one (keeping all the other variables in the model constant).

H4b is supported by the data. The coefficient shows a positive, significant effect on the number of additional purchases. If the product is sold as a ‘grouped’ product rather than a ‘simple’ product (keeping all other variables constant), the expected number of additional purchases is multiplied by exp(0.374) = 1.454. Hence, the expected number of additional purchases increases by 45.4%. The effect, however, is moderated by the different stores. The effect of the product type for stores 2, 5 and 6 is calculated in the same manner as the previous calculations. The expected number of additional purchases for stores 2, 5 and 6 is therefore multiplied by exp(0.374 - 0.265) = exp(0.109) = 1.115; hence, it increases by 11.5% if the first purchase is a ‘grouped’ product rather than a ‘simple’ product (keeping all the other variables in the model constant).
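The multipliers discussed for the Additional Purchase model can be recomputed from the coefficients in Table 8. The sketch below only repeats that arithmetic; the function and variable names are chosen for illustration.

```python
from math import exp

# (main effect, interaction with store.2.5.6), as reported in Table 8.
# pricecat has no interaction term in the final model.
coefficients = {
    "pricecat": (-0.333, None),
    "discount_percent": (-0.048, 0.046),
    "qty_orderedcat": (-0.206, 0.050),
    "product_typeBin": (0.374, -0.265),
}


def rate_ratio(main, interaction=None):
    """Multiplier of the expected count for a one-unit increase."""
    total = main if interaction is None else main + interaction
    return exp(total)


def percent_change(ratio):
    """Translate a multiplier into a percentage change."""
    return (ratio - 1.0) * 100.0


for name, (main, interaction) in coefficients.items():
    base = rate_ratio(main)
    line = f"{name}: {base:.3f} ({percent_change(base):+.1f}%)"
    if interaction is not None:
        store = rate_ratio(main, interaction)
        line += f", stores 2, 5 and 6: {store:.3f} ({percent_change(store):+.1f}%)"
    print(line)
```

For discount_percent, the summed coefficient for stores 2, 5 and 6 is -0.048 + 0.046 = -0.002, which corresponds to a multiplier of about 0.998.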

Control variables

5. CONCLUSION

§5.1 Discussion

In this thesis, the concept of an online traffic builder is posited, and the identification of different aspects of an online traffic builder is central. The main research question pertained to identifying first online purchases and identifying product characteristics of first purchases that increase the number of products bought. To answer this question, two research questions were drawn up: RQ1, which was concerned with the identification of first purchases, and RQ2, which examined the effects of different characteristics of first purchases on the number of unique products in the basket. Therefore, this thesis investigated the relations between product characteristics such as price, price promotions, product variety and product volume on the one hand, and the probability of a first purchase and the number of additional purchases on the other, employing a logit model and a Negative Binomial model, respectively. These models are estimated on a dataset that encompasses more than 10 years of data provided by a Dutch online construction materials retailer. Several hypotheses are proposed, and Table 9 presents an overview of whether the proposed hypotheses are supported.

Hypothesis                                                                Hypothesized direction   Supported?

Price
1a: First purchases are more often higher-priced products than                     +                Yes
    low-priced products
1b: Lower priced products as first purchase increase the number of                 -                Yes
    additional purchases, compared to high-priced products.

Price Promotions
2a: First purchases are more often promoted on price, compared to                  +                Partially
    additional purchases
2b: The number of additional purchases is higher when first purchases              +                No
    are promoted on price, compared to non-promoted first purchases

Product Quantities
3a: First purchases are more often bought in higher volumes, compared              +                Yes
    to low volumes
3b: The number of additional purchases is higher when first purchases              +                No
    are bought in higher volume, compared to low volume first purchases.

Product Variety
4a: First purchases are more often products which are offered in more              +                No
    variations
4b: The number of additional purchases is higher when the first                    +                Yes
    purchase is a product offered in more variations.

The most important findings in answering the first research question, which concerned the characteristics of first purchases, are the effects of product prices and product volumes. Although the effect differs between stores, hypothesis 1a is supported by the data: higher-priced products are more likely to be a first purchase. This indicates that customers, in their customer journey, are more likely to add a higher-priced product to their basket first than a lower-priced product. Another important finding regarding the characteristics of first purchases is that products purchased in higher volumes are more often bought as a first purchase than as an additional purchase. The results therefore indicate that customers are more likely to add the product of which they intend to buy higher volumes to their shopping basket first. As the thesis hypothesized more possible effects regarding the probability of a first purchase, there are also proposed effects that are not supported, or only partially supported, by the research in this thesis. Hypothesis 2a is partially supported, as the results indicate that the effects differ between stores. The effect is slightly negative for stores 2, 5 and 6, whereas the effect for stores 1, 3 and 4 is positive. In stores 2, 5 and 6, the share of first purchases is higher, indicating that the basket size in these stores is smaller than in stores 1, 3 and 4. Thus, the effect of discounts does not appear to be present in stores where the number of unique products per order is relatively low, whereas in stores where the number of unique products per order is higher, the discount seems to have a positive effect on the probability of being a first purchase. However, the discount effect is not substantial.
The thesis failed to find support for the hypothesis that products that offer, by themselves, more variety for the customer are more likely to be bought first by the customer. This indicates that having more variety of choice per specific product does not lead a customer to purchase that product first. On the contrary, the results show that offering more variety per product decreases the probability that customers add that product to their basket first. Consequently, in answer to RQ1, this thesis identified two characteristics that increase the probability of a product being a first purchase: higher product prices and higher purchase volume.

product first are more likely to purchase additional products at that retailer than customers who bought a higher-priced product first. Another important finding regarding the second research question is that the results suggest that first purchases offered in more variety increase the number of additional purchases. This indicates that greater choice variety at the first purchase level affects the customer in such a way that the customer decides to purchase more additional products. Again, not all hypotheses regarding the expected number of additional purchases found support. The proposed effect of price promotions on the number of additional purchases was not supported and, conversely, shows a decreasing effect on the basket size (in terms of unique products). Likewise, the proposed effect of a first purchase bought in higher volumes does not seem to be present. The opposite seems true: if a first purchase is bought in high volumes, the customer is less likely to make additional purchases.

To answer the main research question, the conclusions drawn from the two research questions can be combined. This thesis found that products that are higher priced and are bought in higher volumes are more likely to be bought as a first purchase. First purchases that offer more choice for the consumer and are lower priced increase the probability of a larger basket size in terms of unique products. A combination of these characteristics could show the desired online traffic builder effects: a product that is of a higher monetary value and offers variety to the customer can exhibit an online traffic builder's effect. However, based on this research, this is hard to conclude, as the characteristics show opposite effects on the probability of a first purchase and the number of additional purchases. This creates a situation in which one characteristic increases the probability of being a first purchase, while the other decreases it. The results show that these effects are moderated by the stores in which the products are bought, indicating that certain types of products and stores could show greater effects than others if those products have both characteristics. Consequently, the right balance between the characteristics and the product type needs to be found to realize the potential traffic builder effect.

§5.2.1 Theoretical implications

that probability are identified. However, combining these characteristics to achieve the desired effect of an online traffic builder is hard, as the characteristics of the products are likely to have opposite effects on the two aspects. Therefore, actual online traffic builders are not identified. Nevertheless, the findings of this thesis add to the understanding of the potential existence of online traffic builders.

Furthermore, this thesis confirms that variety in the product offering can be beneficial for online retailers and adds to the literature that the variety offered by a product itself can increase the number of products bought by the customer. This indicates that a product that offers a wider variety on its product page can have important implications for that retailer's sales.

Lastly, the results of this thesis add to the understanding of a customer's path in the customer journey. This thesis indicates that customers are more likely to add products with a higher monetary value to their basket first. In addition, products that are bought in higher volumes are more likely to be purchased first. Both of these aspects have negative effects on the expected number of additional purchases, indicating that both products with a higher monetary value and products bought in higher volumes are likely to be the customer's main reason for visiting the website.

§5.2.2 Managerial implications

This thesis provides some insights that might benefit online retailers. As higher-priced products are more likely to be first purchases, and products that themselves offer more variety for the customer increase the number of additional purchases, products that combine both characteristics could be advertised by an online retailer. As these products are more likely to be purchased first and could increase the additional purchases, advertising these products could increase the advertising effectiveness of the online retailer. However, the characteristics can show opposite effects. Thus, an online retailer needs to find the right balance between these aspects.

and advertising those products that are often bought in larger quantities to attract that specific customer could be a wise thing to do for a retailer to maintain and increase its sales.

Another relevant finding for online retailers is that discounts have a small effect or almost no effect. Although discounts did show a significant positive effect for certain stores, this effect was small, and for the other stores the effect was even negative. This implies that online retailers trying to realize the effect of an online traffic builder should not focus on discounting their products. Neither aspect of online traffic builders seems to be substantially affected by the discount offered on the products.

§5.3 Limitations

In this thesis, product variety is only defined in terms of a product offering having variety or not. The effect of the number of different sizes and colours of specific products is not researched. The presence of much variety is not researched either, and the optimal amount of variety offered is not determined. Many different sizes and colours could show different results, which this thesis was not able to examine.

The discount showed only a small effect in determining a first purchase. In this research, the discount percentage offered exceeded 40% only once, whereas discount percentages can of course be higher. Thus, the effect has only been tested over a limited range, and higher discount percentages could show other effects.

§5.4 Future research

Follow-up research is needed to identify actual online traffic builders. As this research is one of the first to highlight the potential existence of online traffic builders and identifies several characteristics that affect the building blocks of an online traffic builder, future research needs to focus on confirming these findings and identifying other characteristics. Research on data that contains many different product categories can be beneficial for theoretical and practical purposes. Analyzing such databases could be the first step in further understanding the characteristics of online traffic builders. Knowing which product categories potentially show the desired effect can be valuable for retailers and can help researchers to discover the main rationale for a product to be an online traffic builder.

REFERENCES

Bell, David R. and James M. Lattin (1998), “Shopping behavior and consumer preference for store price format: Why "large basket" shoppers prefer EDLP,” Marketing Science, 17(1), 66-88.

Berger, Jonah, Michaela Draganska and Itamar Simonson (2007), “The influence of product variety on brand perception and choice,” Marketing Science, 26(4), 460-472.

Bijmolt, Tammo H.A., Harald J. Van Heerde and Rik G.M. Pieters (2005), “New Empirical Generalizations on the Determinants of Price Elasticity,” Journal of Marketing Research, 42(2), 141-156.

Chang, Chingching (2011), “The Effect of the Number of Product Subcategories on Perceived Variety and Shopping Experience in an Online Store,” Journal of Interactive Marketing, 25, 159-168.

Cheng, Andong and Cynthia Cryder (2018), "Double Mental Discounting: When a Single Price Promotion Feels Twice as Nice," Journal of Marketing Research, 55(2), 226-238.

Chernev, Alexander (2008), “The Role of Purchase Quantity in Assortment Choice: The Quantity-Matching Heuristic,” Journal of Marketing Research, 45, 171-181.

Cramer-Flood, Ethan (2020), “US Retail Digital Ad Spending 2020,” (accessed December 9, 2020), [available at https://www.emarketer.com/content/us-retail-digital-ad-spending-2020].

De Haan, Evert, Thorsten Wiesel and Koen H. Pauwels (2015), “The effectiveness of different forms of online advertising for purchase conversion in a multiple-channel attribution framework,” International Journal of Research in Marketing, 33, 491-507.

———, Pallassana K. Kannan, Peter C. Verhoef and Thorsten Wiesel (2018), “Device Switching in Online Purchasing: Examining the Strategic Contingencies,” Journal of
