ONLINE SHOPPING


The effect of email marketing on customer churn.

By

Joep Schaap

University of Groningen

Faculty of Economics and Business


ABSTRACT

The effect of email marketing on customer churn.

Companies' expenditures on email marketing are still rising, even though digital marketing is changing rapidly and customers are contacted through more and more channels. Because the digital market is becoming more accessible, churning becomes easier by the day. This paper studies the effectiveness of email marketing in preventing customer churn. A logistic regression is applied with the average product price as a moderator of the relationship between email marketing and customer churn, and with average order quantity as an additional explanatory variable. The results show that the effect of email marketing in preventing customer churn is changing. The number of emails opened by a customer at the start of their relationship with the company increases the probability of churn. Possible explanations for this effect are personal relevance and the number of emails sent.


1. INTRODUCTION

The internet is becoming increasingly important for companies because the e-commerce market is still growing. In 2020, global e-commerce sales are projected to exceed 3.5 trillion dollars, which accounts for 12% of worldwide sales (Lindner, 2015). The growth of the e-commerce market also encourages overseas shopping. According to EMarketer (2018), 57% of online shoppers worldwide purchase from overseas retailers. The e-commerce market is crowded with competitors as well as customers. Retaining existing customers and gaining new ones thus becomes more of a challenge each day.

Preventing customer churn is a top priority for most companies since it is tied to profitability and value (Neslin et al., 2006). According to Bensoussan et al. (2014), phone companies report a churn rate of 2% each month, while pay-TV firms report a churn rate of 1% per month (Green, 2016). This costs the companies 10 billion dollars annually (Ascarza et al., 2016). Likewise, companies can increase the average net present value of a customer by up to 95% by boosting retention rates by 5% (Kim et al., 2013). Attracting new customers can be very expensive (Lin & Wang, 2006). Acquiring new customers may sometimes cost five times more than retaining existing customers (Bhattacherjee, 2001). These studies show how important customer retention is for companies, especially for subscription-based companies such as telecom providers.

Compared to the offline market, the number of interactions with a customer is limited for online firms. With a limited number of interactions with a customer in the online market, email becomes vital as a communication tool in customer relation


for achieving repeat traffic in 2005. Reimers et al. (2016) concluded that permission-based email marketing provides traffic and return visits to a specific website. Approximately 1.5 billion dollars was spent on email marketing in 2011 (Kim et al., 2011). In 2017, 2.81 billion dollars was spent on email marketing globally (Scott, 2018). This strong growth in spending indicates the large role email marketing still plays for companies.

The Email Statistics Report (2018) states that the number of email users worldwide will reach 3.8 billion in 2022. Half of the world's population will have an email account and can thus potentially be reached through email. Email marketing therefore remains a very important tool for companies, and its importance seems to be increasing over the years.

The average order quantity is studied as an explanatory variable for customer churn to gain insight into whom to target with retention programs. Volume segmentation is considered an important segmenting variable in marketing since heavy and light shoppers behave differently (Chiou & Pan, 2009; Spencer, 2010).


The following research question is formulated to study the effect of email marketing, price and quantity on customer churn: how does email marketing affect customer churn?

A sample of 130,000 customers is drawn from the database of Alleeninkt.nl. Subsequently, a logistic regression is applied to study the effect of email marketing on customer churn. This company was chosen for several reasons. First, Alleeninkt.nl has actual sales data available for the past two years, which makes it possible to study actual email marketing effectiveness instead of reported or intended effectiveness. Second, the company has sent emails randomly to its customer base, which ensures that any differences between and within customers are not systematic at the outset of the experiment.

Email marketing is split into the number of opens and the number of clicks. The results suggest that email marketing still significantly influences customer churn. A


2. CONCEPTUAL FRAMEWORK

The conceptual model is displayed in figure 1. The model studies the effect of average product price on the relationship between email marketing and customer churn. The average order quantity is also added as an explanatory variable for customer churn. The main focus of the model is on email marketing as an explanatory variable of customer churn, because multiple studies have shown that email marketing is still one of the most important cornerstones of online companies.


2.1 DEPENDENT VARIABLE: CUSTOMER CHURN

The growth of the commerce market is in line with the accessibility of the e-commerce market, which becomes greater each day. It therefore becomes easier for customers to switch between companies. Switching between companies is referred to as churning (Ascarza, 2018). Customer churn can have a negative effect on profit, erode price premiums and increase acquisition costs (Athanassopoulos, 2000).

Customer churn occurs in contractual as well as non-contractual settings. In contrast to contractual businesses, non-contractual businesses cannot observe when a customer has churned, since customers are not bound to a contract. Customers can buy the product at any time from any company they want (Fader & Hardie, 2009). Modeling customer churn in a non-contractual setting is therefore considered a challenge (Fader & Hardie, 2009). Since this is both challenging and profitable for a company, this study focuses on customer churn in non-contractual businesses.

In other studies, customer churn is defined as the percentage of customers who change provider at their convenience and without notification (Radosavljevik et al., 2010). Since this study focuses on individual-level data, customer churn is defined as the probability that a customer has churned.

2.2 INDEPENDENT VARIABLE: EMAIL MARKETING


seller-2003)(Reimers et al., 2016). Quite some research has already been conducted in the field of email marketing. However, almost all studies on email marketing and churn rate are more than ten years old. In a study by DuFrene et al. (2005), email marketing is found to be one of the most effective tools for achieving repeat traffic after a first-time visit. However, with new devices like the iWatch and Google Glass, more possibilities for online direct communication arise. Companies thus have access to more channels, which may affect the role of email marketing. Therefore, these studies are not considered representative of the effect of email marketing on customer churn in the current digital market.

Another factor that could have changed the role of email marketing is the greater accessibility of overseas online shops. This accessibility has resulted in a rapid expansion of cross-border e-commerce (Huang & Chang, 2017). Competition between online shops becomes fiercer as the reach of each online shop grows. Customers thus have a larger number of online shops from which they can buy their products. The effect and importance of direct communication tools such as email marketing may thus change. Given the rapid change of the e-commerce market described above, it is considered important to include email marketing in this study.

In this study, email marketing is split into two variables, namely opens and clicks. Opens is defined as the number of emails that a recipient has opened, and clicks is defined as the number of links, provided by the emails, that a recipient has clicked on. A study by Sahni et al. (2018) argues that a higher number of emails opened leads to more sales from existing customers. Therefore it is hypothesized that:

H1: The amount of opens negatively influences the customer churn

Micheaux (2011) measures the success of email newsletters in terms of the

click-through rate. This measure captures the clicking behavior of the consumer. Search

advertising platforms rely on per-click models (Lee et al., 2018). These models are

based on the assumption that more clicks lead to higher sales for the advertiser.

Therefore, it is hypothesized that:

H2: The amount of clicks negatively influences the customer churn

2.4. INDEPENDENT VARIABLE: AVERAGE ORDER QUANTITY

This study defines average order quantity as the average number of products that a customer buys per order. A study by Chiou & Pan (2009) argues that heavy shoppers pay more attention to the trust they have in an online shop. Light shoppers are considered less involved in online shopping, and heavy shoppers are thus considered more loyal (Sur, 2015; Wang et al., 2006). One of the factors driving this behavior is that switching costs are much higher for heavy shoppers than for light shoppers (Kim et al., 2001). It can thus be argued that heavy shoppers are less likely to churn.

Highly involved customers are argued to be more brand loyal than less involved customers (Ferreira, 2015). According to Holmes et al. (2013), customers are more involved when the price of a transaction is higher. Since the overall transaction price rises when an order consists of more products, people are expected to become more involved in the transaction. This suggests that people with a higher average order quantity are less likely to churn. It is therefore argued that:

H3: The average order quantity negatively influences customer churn.

2.3. MODERATOR: AVERAGE PRODUCT PRICE

According to McConnell (1968), customers develop more loyalty towards higher-priced products, which may result in a higher tolerance for email marketing. However, this study is very old and therefore not representative of the current market. Including the average product price in this study is therefore of interest. The average product price is defined as the average price a customer paid per product (i.e., if someone buys one product for two euros and one product for eight euros, the average product price is five euros).

According to Mishra et al. (2012), the consumer perception of sales promotion is linked with price, since a sales promotion is perceived as a reduction in price or as a reduction in the amount of resources spent by the consumer. Price is very influential in generating brand loyalty (Sohail et al., 2017; Mishra et al., 2012). For more expensive products, a price promotion is argued to be more effective in generating loyalty (luxury goods excluded). Therefore it is argued that cheaper prices should lead to higher loyalty.

Osuna et al. (2016) argue that customers who buy products of brands with a higher

relative price (for example, high price in absolute terms with respect to the average

price of the category) are more likely to redeem a coupon. Since price is related to

brand loyalty and to promotional incentives, it is hypothesized that:

H4: The average product price positively influences the effect of the amount of opens

on the customer churn


H5: The average product price positively influences the effect of the amount of clicks

on the customer churn

It implies: A higher average product price will strengthen the relationship between the

amount of clicks and customer churn

The direct effect of the average product price on customer churn is hypothesized as follows:

H6: The average product price has a positive effect on customer churn


3. DATA DESCRIPTION

The empirical study focuses on the effect that email marketing and average order quantity have on customer churn. Sales data and email data are required for the empirical analyses. Therefore a customer database and an email database are employed. Sales data are extracted from the Alleeninkt.nl customer database, and the Alleeninkt.nl email database provides data on the emails sent to its customers. Together, these data include email, emails opened, emails clicked, product id, date, price of the product, customer id, title of the product and quantity.

SAMPLE

The sample (observation period) used for analysis contains all the customers from the 1st of October 2018 until the 19th of April 2019. The sample is based on this timeframe since email data are only available from October 2018 onwards. If customer data from before October 2018 were used, the effect of email marketing in that period could not be captured. The sales data available run from 3 May 2017 until 19 April 2019. The first transaction of a customer could have taken place before 3 May 2017, but this is not captured. Therefore each customer who has not transacted with the company between 3 May 2017 and 30 September 2018 is considered a new customer.


can be divided by the amount of emails they may have received. For example, a customer who has been with the company for two years has received twice as many emails as a customer who has been with the company for one year. The period from 3 May 2017 until 30 September 2018 is divided into two periods to capture that effect. This results in three cohorts: cohort 1 with customers who have been with the company since the 1st of October 2018, cohort 2 with customers who joined the company between 15 January 2018 and 30 September 2018, and cohort 3 with customers who joined the company between 3 May 2017 and 14 January 2018. The emails used for the analyses contain solely promotional incentives.
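The cohort definition can be sketched as a small function of a customer's first observed purchase date. This is a minimal illustration that uses the boundary dates stated above; the function and constant names are hypothetical and not taken from the author's analysis code.

```python
from datetime import date

# Boundary dates as stated in the text; all names are illustrative.
DATA_START = date(2017, 5, 3)
COHORT2_START = date(2018, 1, 15)
OBSERVATION_START = date(2018, 10, 1)
DATA_END = date(2019, 4, 19)


def assign_cohort(first_purchase: date) -> int:
    """Return cohort 1, 2 or 3 based on the first observed purchase date."""
    if not DATA_START <= first_purchase <= DATA_END:
        raise ValueError("first purchase lies outside the available sales data")
    if first_purchase >= OBSERVATION_START:
        return 1  # joined during the observation period
    if first_purchase >= COHORT2_START:
        return 2  # joined between 15 January 2018 and 30 September 2018
    return 3      # joined between 3 May 2017 and 14 January 2018
```

For example, `assign_cohort(date(2018, 1, 14))` falls in cohort 3, while one day later falls in cohort 2.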

Table 1 displays the descriptives from cohort 1, table 2 contains the descriptives from

cohort 2 and table 3 shows the descriptives from cohort 3.

On average, a customer in cohort 1 spends 33.38 euros, buys 1.54 products per order and makes 1.063 transactions during the observation period. A customer in cohort 2 spends on average 39.68 euros, buys 1.78 products per order and makes 1.13

TABLE 1: DESCRIPTIVES COHORT 1

TABLE 2: DESCRIPTIVES COHORT 2


euros, buys 1.86 products per order and makes 1.14 transactions on average in the observation period. On average, customers in cohort 2 and cohort 3 open almost three times as many emails and click more than five times as often on links as customers in cohort 1 during the observation period. This difference probably arises because customers in cohort 1 could have become customers at the end of the observation period, so they have not received many emails yet.

Appendix A includes some descriptives for customers of cohort 2 and cohort 3. On average, customers from cohort 2 bought 1.95 products, spent 31.31 euros and made 1.57 transactions before the observation period. Customers in cohort 3 spent on average 44.74 euros, bought 3.12 products and made 2.39 transactions before the observation period.

MEASURES

Customer churn

Two possible measures are considered to determine churn. First, the Pareto/NBD model is applied. If this model turns out not to be applicable for determining customer churn, a threshold is used instead.

Pareto/NBD


(Fader et al., 2005). Compared with other models, the Pareto/NBD model is very easy to implement. The Pareto/NBD model assumes customers behave as if they transact randomly around their mean propensity until they churn. It can thus be used to infer the probability that a customer with a particular purchase history is still a customer (Ascarza et al., 2018). Therefore it is argued that the probability of being alive (P(Alive)) is a necessary intermediate step for the key modeling effort of this study.

Threshold

A threshold of six months is used to determine customer churn. If a customer has not made a repeat transaction within six months, he or she is considered a churner. The threshold is set at six months since the observation period consists of just six and a half months. This is considered the best option in the current context with a limited timeframe. The customers of Alleeninkt.nl transact at low frequency, and therefore the largest feasible threshold is used. Appendix B shows that determining customer churn with a threshold is robust. The significant parameter estimates change little and do not switch sign. There are just some minor changes in the significance of variables. Changing the cut-off point to five and a half months results in a significant average order quantity parameter for cohort 1 (p-value from 0.131 to 0.018). The average product price becomes significant for cohort 2 (p-value from 0.051 to 0.008) and cohort 3 (p-value from 0.115 to 0.009).
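The threshold rule can be sketched as follows. The text does not state the reference point for the six months or how a month is counted, so this sketch makes two assumptions: the window starts at the customer's first purchase in the data passed in, and six months is approximated as 183 days. The function name is hypothetical.

```python
from datetime import date, timedelta

# Assumption: "six months" is approximated as 183 days.
THRESHOLD = timedelta(days=183)


def is_churner(purchase_dates, threshold=THRESHOLD):
    """True if no repeat transaction follows the first purchase within the threshold.

    purchase_dates: non-empty list of purchase dates for one customer,
    with same-day orders already merged into one transaction.
    """
    ordered = sorted(purchase_dates)
    first = ordered[0]
    # A repeat transaction is any later purchase inside the threshold window.
    return not any(first < d <= first + threshold for d in ordered[1:])
```

Changing `threshold` to roughly five and a half months gives the alternative cut-off used in the robustness check.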

Email marketing


provided by the email. In total 52 mailings were sent between October 2018 and April 2019, resulting in 104 dataframes. Two variables are composed for each mailing: one to indicate whether a customer has opened the email and one to indicate whether a customer has clicked on the link provided by the email. This results in 104 variables. To limit the complexity of the model, all variables for opening an email are summed into one variable and all variables for clicks are summed into one variable. The variable opens then represents the number of emails a customer has opened, and the variable clicks represents the number of emails in which a customer has clicked on the provided link at least once.
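The aggregation of the per-mailing indicators into the two variables opens and clicks can be sketched as below. The data layout (one dictionary per mailing, mapping a customer id to an opened/clicked pair) is a hypothetical stand-in for the author's dataframes.

```python
def summarise_engagement(mailings):
    """Sum per-mailing 0/1 indicators into (opens, clicks) totals per customer.

    mailings: list of dicts, one per mailing, mapping
    customer id -> (opened, clicked) indicator pair.
    """
    totals = {}
    for mailing in mailings:
        for customer_id, (opened, clicked) in mailing.items():
            opens, clicks = totals.get(customer_id, (0, 0))
            # Each mailing counts at most once for opens and once for clicks.
            totals[customer_id] = (opens + int(bool(opened)),
                                   clicks + int(bool(clicked)))
    return totals


# Three illustrative mailings for two made-up customers.
example_mailings = [
    {"c1": (1, 1), "c2": (1, 0)},
    {"c1": (1, 0), "c2": (0, 0)},
    {"c1": (0, 0)},
]
engagement = summarise_engagement(example_mailings)
```

Because the indicators are binary per mailing, clicks counts the mailings in which a customer clicked at least once, not the total number of clicks.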

Average product price

To measure the average product price, a new variable is constructed from the existing price variable. The original price variable contains the total price of an order for each customer. All prices of products bought in the observation period are summed and divided by the number of products bought to compute the average product price. This results in a variable for each customer with the average price per item bought.
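As a minimal sketch of this computation, assuming each order is available as a pair of total order price and number of products (a hypothetical layout):

```python
def average_product_price(orders):
    """Average price per item for one customer.

    orders: list of (total_order_price, quantity) pairs from the observation period.
    """
    total_spent = sum(price for price, _ in orders)
    total_items = sum(quantity for _, quantity in orders)
    if total_items == 0:
        raise ValueError("customer bought no products")
    return total_spent / total_items


# One product for two euros and one for eight euros.
example_price = average_product_price([(2.0, 1), (8.0, 1)])
```

With one product bought for two euros and one for eight euros, the result is five euros.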

Average order quantity

The average order quantity is measured using data from the sales database of Alleeninkt.nl. The database captures each transaction of a customer and transforms it into one order per product type (i.e., if someone buys two different types of products in one transaction, the database transforms this transaction into two orders, one for each type of product). Therefore all orders of a customer that were made on the same date are merged.
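The merge step can be sketched as follows, assuming each database row is available as an order date and a quantity (a hypothetical layout). Rows with the same date are treated as one transaction before the average is taken.

```python
from collections import defaultdict
from datetime import date


def average_order_quantity(order_lines):
    """Average number of products per merged order for one customer.

    order_lines: list of (order_date, quantity) rows, one row per product type.
    """
    if not order_lines:
        raise ValueError("customer has no orders")
    per_day = defaultdict(int)
    for order_date, quantity in order_lines:
        per_day[order_date] += quantity  # merge rows made on the same date
    return sum(per_day.values()) / len(per_day)


# Two product types bought on one day, one product on another day.
example_lines = [
    (date(2018, 11, 5), 1),
    (date(2018, 11, 5), 2),
    (date(2019, 1, 10), 1),
]
example_quantity = average_order_quantity(example_lines)
```

Here the two rows of 5 November are merged into one order of three products, giving an average of two products per order.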


4. METHODOLOGY

The main goal is to measure the effectiveness of email marketing in a churn context. A Pareto/NBD model is employed to estimate customer churn. The parameters of the Pareto/NBD model are estimated using all the available data from the sales database (from 3 May 2017 until 19 April 2019). The sample is split into a calibration sample and a validation sample to validate the model. The cut-off point is set at 26 April 2018 since this is exactly halfway through the data sample. Therefore, 50% of the data can be used to estimate the model and the other 50% to validate it. Note that both treated and untreated customers are included in both samples, that is, customers who have opened emails and clicked on the links as well as customers who have not opened the emails or clicked on the links.
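The cut-off date can be checked by taking the midpoint of the sales data period. The split function is an illustrative sketch; whether the cut-off date itself belongs to the calibration sample is an assumption.

```python
from datetime import date

DATA_START = date(2017, 5, 3)
DATA_END = date(2019, 4, 19)

# Midpoint of the available sales data (716 days in total, so 358 days in).
cutoff = DATA_START + (DATA_END - DATA_START) / 2


def split_transactions(transaction_dates, cutoff_date):
    """Split dates into a calibration part (up to and including the cut-off) and a hold-out part."""
    calibration = [d for d in transaction_dates if d <= cutoff_date]
    holdout = [d for d in transaction_dates if d > cutoff_date]
    return calibration, holdout
```

The midpoint evaluates to 26 April 2018, which agrees with the cut-off stated above.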

PARETO/NBD


The predictive capability of the model is quite good (figure 2). The conditional expectation (figure 3) validates that the fit of the model also holds in the hold-out period. The information in bins 0, 1 and 2 is considered most important since these are by far the biggest bins. The predictive capability for these bins therefore matters much more than for bin 3+.

P(ALIVE)

The distribution parameters of the Pareto/NBD model are obtained through maximum likelihood estimation (MLE). Next, the probability of being alive is calculated for each customer from day 1 to day 100 of the observation window.

The calculated probabilities of still being alive for the customers of the calibration period appear to be very high (>99.99%). This high chance of being alive is possibly the result of customers making transactions at low frequency. This is substantiated by the expectation function of the Pareto/NBD model, which suggests that a new customer will make 0.15 repeat transactions in the first year. The conditional expected transactions function returns a value of 0.02. This signifies that a customer from the calibration sample makes 0.02 transactions in the hold-out period (day 101 until 200 from the observation window). The drop-out rate is very low because customers transact at a low frequency. The MLE parameters of the Pareto/NBD model already suggested these outcomes (r = 2.26e-01, alpha = 5.64e+02, s = 2.41e-08, beta = 1.13e+01).

FIGURE 2: FREQUENCY OF REPEAT
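The first-year figure can be reproduced from the reported parameters with the standard closed-form Pareto/NBD expectation for a new customer, E[X(t)] = r·beta / (alpha·(s − 1)) · [1 − (beta / (beta + t))^(s − 1)]. The sketch below assumes that time is measured in days, which the text does not state explicitly; the function name is illustrative.

```python
def expected_transactions(t, r, alpha, s, beta):
    """Pareto/NBD expected number of transactions in (0, t] for a new customer."""
    if abs(s - 1.0) < 1e-12:
        raise ValueError("this closed form requires s != 1")
    return (r * beta) / (alpha * (s - 1.0)) * (1.0 - (beta / (beta + t)) ** (s - 1.0))


# Reported MLE parameters; t = 365 assumes the time unit is days.
first_year = expected_transactions(365, r=0.226, alpha=564.0, s=2.41e-8, beta=11.3)
```

Under the day assumption the result is about 0.146, which rounds to the 0.15 repeat transactions mentioned above. Because s is practically zero, the expression is almost exactly r·t/alpha, which illustrates why the model sees virtually no drop-out.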

There is no variation in customer churn since the Pareto/NBD model returns P(Alive) values above 99.99% for each customer. Customer churn is therefore determined through another method: a threshold is used to determine whether a customer has churned or not.

FUNCTIONAL FORM

After estimation of P(Alive), a threshold is used to determine customer churn. Customer churn is then modeled as a function of email marketing, average order quantity and average product price in a binary choice model.

The binary choice model assumes that the choices customers make depend on the utility they gain from each of the choices (Franses & Paap, 2003). However, utility is not observed and does not have a proper scale. Figure 4 shows the function for utility.

Where $U_i$ is the latent utility of a customer's choice, $\alpha$ is the constant, $\beta$ is the parameter of $x_i$, $x_i$ is the observed independent variable and $\varepsilon_i$ is the error term.

FIGURE 4: UTILITY FUNCTION
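Based on the symbol definitions above, the latent utility function presumably takes the standard linear form below. This is a reconstruction from those definitions, not a copy of the author's figure:

```latex
U_i = \alpha + \beta x_i + \varepsilon_i
```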


This latent utility is translated into the observable decision choice of the customer ($Y_i$), which is called probability and has a scale between 0 and 1 (figure 5). It assumes that if $U_i \le 0$ then $Y_i$ is 0 and if $U_i > 0$ then $Y_i$ is 1.
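The mapping from latent utility to the observed choice can be written out as follows. The logistic form of the probability assumes that the error term follows a standard logistic distribution, which is the usual assumption behind a logit model; a probit model would use the standard normal distribution function instead.

```latex
Y_i =
\begin{cases}
1 & \text{if } U_i > 0 \\
0 & \text{if } U_i \le 0
\end{cases}
\qquad
P(Y_i = 1) = P(U_i > 0) = \frac{\exp(\alpha + \beta x_i)}{1 + \exp(\alpha + \beta x_i)}
```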

A logistic regression and a probit regression can both be applied to study the proposed conceptual framework. The choice between them is made according to the information criteria and the log-likelihood of the models. The information criteria are estimators of the relative quality of the model (Franses & Paap, 2003). The log-likelihood is a measure of how close the model is to the real data.

The estimated model is as follows:

With:

i = customer

Where in customer i:

Y = customer churn

EO = emails opened

EC = emails clicked

PC = price class
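A sketch of how the estimated logit model could be written, assuming one main effect per explanatory variable plus the two price interactions implied by H4 and H5. The coefficient symbols $\beta_0, \dots, \beta_6$ and the abbreviation $AOQ$ for average order quantity are assumptions introduced here, not the author's notation:

```latex
\ln\frac{P(Y_i = 1)}{1 - P(Y_i = 1)}
  = \beta_0 + \beta_1 EO_i + \beta_2 EC_i + \beta_3 PC_i + \beta_4 AOQ_i
  + \beta_5 (EO_i \times PC_i) + \beta_6 (EC_i \times PC_i)
```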


5. RESULTS

Outliers are excluded from the sample before estimation of the Pareto/NBD model to obtain proper results from the analysis. According to the data, all quantities above one are outliers, which amounts to 33,970 data points. Excluding these data points from the dataset would result in a big loss of data. After dialogue with the owners of Alleeninkt.nl, it was decided to exclude only the data points with a quantity above 5, since these orders are likely to be for business purposes and not for private use. As a result, 679 observations are excluded from the sample.

The outlier check for price per unit results in 19,396 outliers. Excluding all of these would again lead to a big loss of data. The threshold for excluding outliers, set in consultation with the owners of Alleeninkt.nl, is 80. All unit prices above 80 are likely to be orders for business purposes. As a result, 6,140 observations are excluded from the sample.

The outlier check for the number of opens and the number of clicks suggests that opens has 6,520 outliers and clicks has 7,461 outliers. These outliers are not excluded from the database since no justifiable threshold could be set and excluding all of them would result in a huge data loss (more than 20%).

MULTICOLLINEARITY


First, multicollinearity is tested using variance inflation factor (VIF) scores. The most commonly used rule of thumb is a threshold of 10 (O'Brien, 2007). If the VIF scores of explanatory variables exceed 10, multicollinearity is present. No multicollinearity is identified for any of the three cohorts according to this test. All explanatory variables of the logistic model have VIF scores below 10 (Appendix C).

A correlation matrix is constructed to substantiate the results of the VIF test (Appendix D). According to this correlation matrix, no explanatory variable appears to be multicollinear in any cohort. All values across all cohorts are below 0.32, which indicates very little correlation. Therefore it can be stated that there is no multicollinearity.
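To relate the two checks: the VIF of a variable equals 1 / (1 − R²), where R² comes from regressing that variable on the other explanatory variables. The sketch below applies this definition to the simplified case of only two explanatory variables, where R² is the squared pairwise correlation. This simplification is an assumption for illustration; the actual VIF scores are those in Appendix C.

```python
def vif_from_r_squared(r_squared):
    """Variance inflation factor for a predictor with the given auxiliary R-squared."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared)


# Two-variable case: the largest reported correlation of 0.32 gives a VIF close to 1.
vif_at_max_correlation = vif_from_r_squared(0.32 ** 2)

# The rule-of-thumb threshold of 10 corresponds to an auxiliary R-squared of 0.9.
vif_at_threshold = vif_from_r_squared(0.9)
```

In this simplified case, a correlation of 0.32 corresponds to a VIF of roughly 1.11, far below the threshold of 10.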

INTERPRETATION OF THE LOGISTIC REGRESSION

For interpretation of the results, three ways of assessing the impact of the explanatory variables are used. First, the estimates of the original model are interpreted (table 4). Second, the odds ratio is used (table 5). The odds ratio represents the likelihood of an event happening versus not happening (i.e., if the odds ratio is 3, then the odds of happening versus not happening are 3 to 1) (Franses & Paap, 2003). Third, the average marginal effects are used (table 6). The average marginal effects display by how much the probability of observing an event increases if a binary variable changes from 0 to 1 or if a continuous variable changes instantaneously (Williams, 2012).


TABLE 4: ESTIMATED PARAMETERS LOGISTIC MODEL

TABLE 5: ODDS RATIO LOGISTIC MODEL

TABLE 6: AVERAGE MARGINAL EFFECTS LOGISTIC MODEL

Email marketing


H1: The amount of opens negatively influences the customer churn.

H2: The amount of clicks negatively influences the customer churn

No evidence is found to support H1, although the number of opens does significantly influence customer churn for cohort 1. The number of opens appears to positively influence customer churn, with an estimate of 0.13905 (p-value=0.000) (table 4). This suggests that the probability that a customer from cohort 1 churns increases if that customer opens more emails. If that customer opens one more email, keeping all other parameters constant, the odds of churning versus not churning increase by 14.92% (table 5). A marginal change in the number of opens leads to an increase in the probability of churning of 1.09% (p-value=0.000) (table 6).
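The link between the logit estimate and the odds interpretation is the exponential function: a coefficient b multiplies the odds by exp(b) for each additional unit. A minimal check using the cohort 1 estimate for opens (the function name is illustrative):

```python
import math


def odds_change_percent(coefficient):
    """Percentage change in the odds for a one-unit increase in the variable."""
    return (math.exp(coefficient) - 1.0) * 100.0


# Reported logit estimate for opens in cohort 1.
opens_cohort1 = odds_change_percent(0.13905)
```

The estimate of 0.13905 corresponds to an odds ratio of about 1.149, that is, the 14.92% increase in the odds of churning per additional opened email.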

No statistically significant relationship is found between the number of opens and customer churn in cohort 2 and cohort 3. It can thus be concluded that the more emails a customer in cohort 1 opens, the higher the probability that he or she churns.


in cohort 2 will decrease by 68.19% if that customer clicks on one more link (table 5). A marginal change in the number of clicks leads to a decrease in the probability of churning of 4.66% for a customer in cohort 2 (p-value=0.000) (table 6).

The chance of churning versus not churning if a customer in cohort 3 clicks on one

the amount of clicks will lead to a decrease for the possibility to churn of 5.08% for a

customer in cohort 3 (p-value=0.000) (table 6).

It can thus be stated that the probability that a customer, in any cohort, will churn decreases if he or she clicks on more links.

Average order quantity

It is hypothesized that the average order quantity is negatively related to customer churn. This effect is tested with the following hypothesis.

H3: The average order quantity negatively influences customer churn.

No evidence is found to support H3 since the parameters of the average order quantity appear to be insignificant for all cohorts (p-value=0.131 for cohort 1, p-value=0.713 for cohort 2 and p-value=0.599 for cohort 3) (table 4). It is thus suggested that the average order quantity does not affect customer churn.

Average product price


measured for both emails opened and emails clicked to measure this effect, resulting in the following hypotheses:

H4: The average product price positively influences the effect of emails opened on the

customer churn

H5: The average product price positively influences the effect of emails clicked on the

customer churn

Also a direct effect is tested:

H6: The average product price has a positive effect on customer churn

No evidence is found to support H4, H5 or H6 (table 4). H4 is not supported since the parameters of the moderation variable for opens are not significant for any cohort (p-value=0.372 for cohort 1, p-value=0.898 for cohort 2 and p-value=0.919 for cohort 3). No support for H5 is found since the parameters of the moderation variable for clicks are insignificant for all cohorts (p-value=0.074 for cohort 1, p-value=0.434 for cohort 2 and p-value=0.089 for cohort 3). H6 is not supported since the direct effect parameters of the average product price are insignificant for all cohorts (p-value=0.716 for cohort 1, p-value=0.051 for cohort 2 and p-value=0.115 for cohort 3). It is thus suggested that the average product price does not moderate the effect of email marketing on customer churn.

MODEL VALIDATION


methods to measure the relative quality of the statistical model. The model with the lowest value has the best relative quality (Franses & Paap, 2003). The log-likelihood (LL) measures how close the predictions of the model are to the observed data. The model with the highest value has the predictions closest to the observed data.

Table 7 displays the AIC, BIC and LL of the different models for each cohort. ‘Logitcoh1’, ‘Logitcoh2’ and ‘Logitcoh3’ are the logistic regression models of cohorts 1, 2 and 3. ‘Probitcoh1’, ‘Probitcoh2’ and ‘Probitcoh3’ are the probit regression models of cohorts 1, 2 and 3. The probit model scores best for cohort 1 on AIC (19643 compared to 19648 for logistic), BIC (19702.29 compared to 19707.34 for logistic) and LL (-9814.727 compared to -9817.248 for logistic). However, the logistic model scores better than the probit model for cohorts 2 and 3 on AIC (2082.3 and 3109.5 compared to 2100.4 and 3122.2 for probit), BIC (2127.54 and 3156.545 compared to 2145.645 and 3169.122 for probit) and LL (-1034.144 and -1547.748 compared to -1043.191 and -1554.037 for probit). So according to the AIC, BIC and LL, a logistic regression scores best for two of the three cohorts. Therefore it is suggested that a logistic regression is preferred.
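The reported AIC values can be related to the log-likelihoods through the definition AIC = 2k − 2·LL, where k is the number of estimated parameters. The sketch below assumes k = 7 (an intercept, four main effects and two interaction terms). The text does not state this count; it is inferred from the model description.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2*LL."""
    return 2 * n_params - 2 * log_likelihood


K = 7  # assumption: intercept + 4 main effects + 2 interaction terms

logit_cohort1 = aic(-9817.248, K)
probit_cohort1 = aic(-9814.727, K)
logit_cohort2 = aic(-1034.144, K)
logit_cohort3 = aic(-1547.748, K)
```

With seven parameters the formula gives approximately 19648 and 19643 for cohort 1 and 2082.3 and 3109.5 for the logistic models of cohorts 2 and 3, in line with the values quoted above. This also indicates that the log-likelihoods of the logistic models for cohorts 2 and 3 are negative.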

Another measure used to choose between a logistic and a probit regression is the pseudo R-squared. A simple definition of R-squared does not apply to binary data since the dependent variable is either 1 or 0. Therefore the Cox & Snell R-squared, Nagelkerke R-squared and McFadden R-squared are applied. Table 8 shows the results of these three pseudo R-squared measures. A higher R-squared value represents a better explanation of the dependent variable by the independent variables (Smith & McKenna, 2013).

The probit regression scores best on the McFadden (0.03218 compared to 0.03193 for logistic), Nagelkerke (0.04263 compared to 0.04230 for logistic) and Cox & Snell R-squared (0.01955 compared to 0.01940 for logistic) for cohort 1. However, as with the AIC, BIC and LL, a logistic regression is favored for cohorts 2 and 3 according to the McFadden (0.06873 and 0.06405 compared to 0.06058 and 0.06025 for probit), Nagelkerke (0.08469 and 0.08145 compared to 0.07480 and 0.07670 for probit) and Cox & Snell R-squared (0.03166 and 0.03396 compared to 0.02796 and 0.03198 for probit). Again, a logistic regression scores best for cohorts 2 and 3, and therefore it is argued that logistic regression is preferred.

A final, non-statistical argument for applying a logistic regression instead of a probit
regression is that logistic regression is favored by marketers and customer churn
modelers due to its ease of use, interpretability and robustness (Tamaddoni et al.,
2016).

A log-likelihood ratio test is applied to further validate the logistic model (Franses &
Paap, 2001). The test measures whether the logistic model outperforms the null
model. It is significant for all three cohorts, which rejects the H0 of the test
(the variables are redundant). Therefore it can be argued that the logistic
regression is significantly better than the null model for all three cohorts.
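The mechanics of such a test can be sketched as follows. This is a minimal illustration, not the procedure used in this study: the log-likelihood values and the degrees of freedom are hypothetical, and the chi-square tail probability is computed with a simple series expansion so that the example needs only the standard library.

```python
import math

def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with df degrees of freedom.

    Uses the series expansion of the regularized lower incomplete gamma
    function; very small p-values are returned as (approximately) 0.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-16 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

def likelihood_ratio_test(ll_full: float, ll_null: float, df: int):
    """Return the LR statistic 2 * (LL_full - LL_null) and its p-value."""
    stat = 2.0 * (ll_full - ll_null)
    return stat, chi2_sf(stat, df)

# Hypothetical values: df is the number of explanatory variables dropped
# in the null (intercept-only) model.
stat, p_value = likelihood_ratio_test(ll_full=-1030.0, ll_null=-1100.0, df=6)
reject_h0 = p_value < 0.05  # H0: the variables are redundant
print(stat, p_value, reject_h0)
```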

For face validation, the coefficients of the model are evaluated to judge whether they
make sense. Odd estimates indicate bad face validity. At first sight it seems odd
that emails opened has a positive effect on customer churn, since clicks has a strong
negative effect and the literature on email marketing suggested that email marketing
reduces churn as well. However, clicks and opens are not the same and can therefore
have different effects. A plausible reason for the positive effect of emails opened is
explained further in the general discussion.

MAIN EFFECTS

A main effects only model is estimated to assess the influence of the moderation variable.
Table 9 shows the AIC and BIC for both the full model and the main effects only
model. ‘Logitmaincoh1’, ‘Logitmaincoh2’ and ‘Logitmaincoh3’ represent the main
effects only models for cohort 1, 2 and 3. The predictive capability of the model
increases slightly for all cohorts when the moderation variable is excluded from the
model.

TABLE 9: AIC AND BIC FOR MAIN EFFECT AND

The estimated parameters for the main effects model of all cohorts are displayed in
table 10, table 11 and table 12. These results show marginal changes for all
parameters compared to the model with the moderation effect, except for the direct
effect of average product price in cohort 2. This model suggests that the average
product price has a positive effect on customer churn for customers in cohort 2
(p-value=0.021). This means that customers who pay a higher average product price are
more likely to churn (table 10). The odds of churning versus not churning
increase by 0.96% when a customer in cohort 2 has an average product price that is 1
euro higher (table 11). A marginal change in the average product price leads to an
increase of 0.04% in the probability of churning.
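The interpretation of the odds ratio can be illustrated with a short sketch. The odds ratio of 1.0096 is implied by the reported 0.96% increase in the odds; the coefficient is back-derived from it and is therefore an approximation, not the value from table 10. The baseline probability passed to the marginal effect function is hypothetical, and the function names are illustrative only.

```python
import math

def odds_pct_change(beta: float, delta: float = 1.0) -> float:
    """Percentage change in the odds for a change of `delta` in the regressor."""
    return (math.exp(beta * delta) - 1.0) * 100.0

def marginal_effect(beta: float, p: float) -> float:
    """Change in churn probability per unit of the regressor at probability p."""
    return beta * p * (1.0 - p)

# Odds ratio implied by the reported 0.96% increase in the odds per euro.
ODDS_RATIO_PRICE = 1.0096
beta_price = math.log(ODDS_RATIO_PRICE)  # back-derived logit coefficient

pct_one_euro = odds_pct_change(beta_price)              # effect of +1 euro
pct_ten_euro = odds_pct_change(beta_price, delta=10.0)  # effect of +10 euro

# Hypothetical baseline churn probability, for illustration only.
effect_at_half = marginal_effect(beta_price, p=0.5)
print(pct_one_euro, pct_ten_euro, effect_at_half)
```

The effect on the odds compounds multiplicatively, so a price difference of 10 euro raises the odds by slightly more than ten times the effect of 1 euro. The marginal effect on the probability depends on the baseline probability and is largest at p = 0.5.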

TABLE 10: ESTIMATED PARAMETERS MAIN EFFECT MODEL

TABLE 11: ODDS RATIO MAIN EFFECT MODEL

The significance of the average product price in the main effects model is not a
surprise, since this variable was only slightly insignificant in the full model for cohort 2
(p-value=0.051) (table 4). However, it is odd that the average product price is
significant for cohort 2 only. It suggests that the average price level increases the
probability of churning for customers who have been with the company for between
six and a half and fifteen months. It does not affect customers who have been with the
company for longer than fifteen months or shorter than six and a half months.

Overall, the main effects only model predicts customer churn slightly better than the
full model, and the estimated parameters change only very slightly. So using a
main effects only model does not affect the results of the study.

GENERAL DISCUSSION AND MANAGERIAL IMPLICATIONS

emails that do not contain any relevant or personal information, they become
unreceptive towards these emails. The emails, in this particular context, could be
considered not relevant, since it is found that Alleeninkt.nl sells low-frequency
products. Customers do not have to buy the products often and may therefore find
receiving emails a couple of times per month bothersome. This is also argued by
Bruner & Kumar (2007), who found that people who have opted in for email marketing
still find receiving too many messages bothersome.

Thus firms should reconsider the personal relevance of the emails they send.
Improving the personal relevance of an email should improve its effectiveness
(Batra & Keller, 2016). One example is the number of emails sent. Firms that
have low-frequency customers might reconsider the number of emails they send. For
this type of customer, a couple of emails per month seems to be too many and leads
to an increase in customer churn in the first period of the relationship with the
company. Push strategies, like sending email, become ineffective and therefore firms
must employ new and effective internet marketing strategies.

So it turns out that emails opened do not influence customer churn in the way the
literature suggested. Shankar et al. (2016) argue that the use of mobile for marketing
purposes is growing dramatically. Besides mobile marketing, other new marketing
channels are developing and growing as well. According to Opreana & Vinerean (2015),
these developments make email marketing less effective, since customers can
choose their object of interaction. So online firms must reconsider their traditional
marketing strategies to reduce customer churn and must respond to the development
of the digital market to keep their churn-prevention strategies effective.

As the literature suggested, the number of clicks does decrease customer churn. This
effect is larger for cohort 2 than for cohort 1 but then remains almost the same for
cohort 3. Generating more clicks through email marketing thus reduces customer
churn. Online firms should focus on email optimization to generate more clicks and
thus reduce customer churn.

It is argued that the average product price does not moderate the effect email
marketing has on customer churn, despite what the literature claims. Also, no
significant direct effect of the average product price on customer churn is recorded.
So customers with different price orientations do not respond differently to email
marketing in a customer churn context. Kaura et al. (2015) discuss the effect of price
fairness on customer loyalty. A fair price leads to higher customer loyalty than an
unfair price. However, price fairness can be equal among different product prices. So
the price of a more expensive cartridge can be considered as fair as that of a cheaper
one. For

but the people with highest sensitivity leads to a decrease in customer churn.

However, targeting the email marketing on different product prices is not effective.

The average order quantity has no significant effect on churn rate in any cohort,
contrary to what the literature suggested. Customers that buy private label brands are
less involved than customers that buy manufacturer brands (Vahie & Paswan, 2006). The
goods Alleeninkt.nl sells are utilitarian private label products, and therefore it can be
argued that, no matter the quantity, customers of Alleeninkt.nl have low involvement. So
targeting customers on their order quantity is not effective for firms with
low-involvement customers.

LIMITATIONS AND SUGGESTIONS FOR FURTHER RESEARCH

The way of calculating customer churn in this context is not ideal. Modeling
customer churn with low-frequency customers requires a lot more data. Almost two
years of data were used for calculating the P(Alive) (May 2017 until April 2019);
however, this was not enough. There was no variation in the P(Alive), and thus it is
suggested that modeling customer churn in this context is quite challenging and that
data from a longer period (more than two years) is needed. This study had limited time,
and therefore customer churn was determined through a threshold
instead of other models. Other models to determine customer churn could have been
studied and applied if more time had been available. So future research could focus on
modeling customer churn for low-frequency firms.

The actual (prior) conceptual framework could not be studied through some

rain, temperature) had to be excluded, which results in a very thin conceptual
framework. Another consequence of misfortunes in data gathering is the lack of
customer characteristics and control variables. Data storage and collection is new for
Alleeninkt.nl, and this study is the very first performed with their data. This
study is part of a graduation program and had a limited timeframe. Only after eight
weeks in the program did the university give permission to use the data of Alleeninkt.nl,
with the assurance that the AVG (GDPR) privacy rules would not be violated. The data
could not be made available before this assurance. Therefore there was not much time to
find other data after the shortcomings of the data became apparent.

The effect of email marketing on customer churn in the digital market is decreasing.

Nowadays customers are faced with promotions and advertisements through more

channels than ten years ago (Opreana & Vinerean, 2015). Therefore the effectiveness

of these new channels in a customer churn context is of interest for future research.

The average product price is slightly insignificant for cohort 2 in the full model and
significant in the main effects only model. This means that neither new customers nor
the oldest customers are affected by the average product price in their churn
probability. No literature was found to support this result. Therefore future research is
needed to explain this phenomenon.

REFERENCES

Ascarza, E. (2018). Retention futility: Targeting high-risk customers might be

ineffective. Journal of Marketing Research, 55(1), 80-98.

Ascarza, E., Iyengar, R., & Schleicher, M. (2016). The perils of proactive churn

prevention using plan recommendations: Evidence from a field experiment. Journal

of Marketing Research, 53(1), 46-60.

Ascarza, E., Netzer, O., & Hardie, B. G. (2018). Some customers would rather leave

without saying goodbye. Marketing Science, 37(1), 54-77.

Athanassopoulos, A. D. (2000). Customer satisfaction cues to support market

segmentation and explain switching behavior. Journal of business research, 47(3),

191-207.

Batra, R., & Keller, K. L. (2016). Integrating marketing communications: New

findings, new lessons, and new ideas. Journal of Marketing, 80(6), 122-145.

Bensoussan, L., Del Bosch, L. M., Naud, F., & Benichou, P. (2014). Churn value management:
How can companies retain customers in an increasingly volatile world. Accessed July
24, 2018, http://www.oliverwyman.com/content/dam/oliver-wyman/global/en/2014/oct/201408_Churn_value_management_Screen.pdf.

Bhattacherjee, A. (2001), “An empirical analysis of the antecedents of electronic

commerce service continuance”, Decision Support Systems, Vol. 32 No. 2, pp.

201-214.

Chiou, J. S., & Pan, L. Y. (2009). Antecedents of internet retailing loyalty: differences

between heavy versus light shoppers. Journal of Business and Psychology, 24(3), 327.

Chittenden, L. and Rettie, R. (2003), “An evaluation of email marketing and factors

affecting response”, Journal of Targeting, Measurement and Analysis for Marketing,

Vol. 11 No. 3, pp. 203-217.

DuFrene, D., Engelland, B., Lehman, C. and Pearson, R. (2005), “Changes in

consumer attitudes resulting from participation in a permission e-mail campaign”,

Journal of Current Issues and Research in Advertising, Vol. 27 No. 1, pp. 65-77.

Email Statistics Report, 2018 - 2022. 2018. Page on Radicati’s website. Retrieved
from
https://www.radicati.com/wp/wpcontent/uploads/2018/01/Email_Statistics_Report,_2018-2022_Executive_Summary.pdf

eMarketer. (2018). What Percent of Purchases Are Digital Buyers Worldwide Making
Domestically vs. Cross-Border? (% of respondents, May 2018). Consulted on
2-4-2019, from:
https://www.emarketer.com/Chart/What-Percent-of-Purchases-Digital-Buyers-Worldwide-Making-Domestically-vs-Cross-Border-of-respondents-May-2018/223102

Fader, P. S., & Hardie, B. G. (2009). Probability models for customer-base

analysis. Journal of interactive marketing, 23(1), 61-69.

Franses, Philip H. and Richard Paap (2001), Quantitative Models in Marketing

Research. Cambridge, UK: Cambridge University Press.

Green H (2016) Economic analysis of proposed Sky/Vodafone merger. Report,

Axiom Economics, Petersham, Australia.

Grimes, G., Hough, M. and Signorella, M. (2007), “Email end users and spam:

relations of gender and age group to attitudes and actions”, Computers in Human

Behavior, Vol. 23 No. 1, pp. 318-332.

Holmes, A., Byrne, A., & Rowley, J. (2013). Mobile shopping behaviour: insights

into attitudes, shopping process involvement and location. International Journal of

Retail & Distribution Management, 42(1), 25-39.

Huang, S. L., & Chang, Y. C. (2017, January). Factors that impact consumers'

intention to shop on foreign online stores. In Proceedings of the 50th Hawaii

international conference on system sciences.

Kaura, V., Durga Prasad, C. S., & Sharma, S. (2015). Service quality, service

convenience, price and fairness, customer loyalty, and the mediating role of customer

satisfaction. International Journal of Bank Marketing, 33(4), 404-422.

Kim, B. D., Shi, M., & Srinivasan, K. (2001). Reward programs and tacit

collusion. Marketing Science, 20(2), 99-120.

Kim, Y., Trail, G. and Ko, Y. (2011), “The influence of relationship quality on sport

consumption behaviors: an empirical examination of the relationship quality

Kim, Y. S., Lee, H., & Johnson, J. D. (2013). Churn management optimization with

controllable marketing variables and associated management costs. Expert Systems

with Applications, 40(6), 2198-2207.

Kumar, S. and Sharma, R. (2014), “An empirical analysis of unsolicited commercial

e-mail”, Paradigm, Vol. 18 No. 1, pp. 1-19.

Lee, J. Y., Fang, E., Kim, J. J., Li, X., & Palmatier, R. W. (2018). The effect of online

shopping platform strategies on search, display, and membership revenues. Journal of

Retailing, 94(3), 247-264.

Lin, H.-H. and Wang, Y.-S. (2006), “An examination of the determinants of customer

loyalty in mobile commerce contexts”, Information & Management, Vol. 43 No. 3,

pp. 271-282.

Lindner, M. (2015). Global e-commerce sales set to grow 25% in 2015. Internet

Retailer, July, 29.

Martínez, S. C., & Guillén, M. J. Y. (2006). Can price promotions improve tourist

loyalty to tour operators?. Journal of Hospitality & Leisure Marketing, 14(4), 33-46.

McConnell, J. D. (1968). The development of brand loyalty: an experimental

study. Journal of Marketing Research, 5(1), 13-19.

Micheaux, A. L. 2011. “Managing E-mail Advertising Frequency from the Consumer

Perspective.” Journal of Advertising 40 (4): 45–66.

Mishra, U. S., Das, J. R., Mishra, B. B., & Mishra, P. (2012). Perceived benefit

analysis of sales promotion: a case of consumer durables. International Research

Nagar, K. (2009). Evaluating the effect of consumer sales promotions on brand loyal

and brand switching segments. Vision, 13(4), 35-48.

Neslin, Scott A., Sunil Gupta, Wagner Kamakura, Junxiang Lu, and Charlotte H.
Mason (2006), “Defection Detection: Measuring and Understanding the Predictive
Accuracy of Customer Churn Models,” Journal of Marketing Research, 43 (2), 204–11.

O’Brien, R. M. (2007). A caution regarding rules of thumb for variance inflation
factors. Quality & Quantity, 41(5), 673-690.

Opreana, A., & Vinerean, S. (2015). A new development in online marketing:

Introducing digital inbound marketing. Expert Journal of Marketing, 3(1).

Osuna, I., González, J., & Capizzani, M. (2016). Which categories and brands to

promote with targeted coupons to reward and to develop customers in supermarkets.

Journal of Retailing, 92(2), 236-251.

Radosavljevik, D., van der Putten, P., & Larsen, K. K. (2010). The impact of

experimental setup in prepaid churn prediction for mobile telecommunications: What

to predict, for whom and does the customer experience matter?. Trans. MLDM, 3(2),

80-99.

Reimers, V., Chao, C. W., & Gorman, S. (2016). Permission email marketing and its

influence on online shopping. Asia Pacific Journal of Marketing and Logistics, 28(2),

308-322.

Sahni, N. S., Wheeler, S. C., & Chintagunta, P. (2018). Personalization in email

marketing: The role of noninformative advertising content. Marketing Science, 37(2),

236-258.

Scott, S. (2018, 16 april). No, advertising spend is not moving online – here’s why.

Consulted at 1-5-2019, from:

https://www.thedrum.com/opinion/2018/04/16/no-advertising-spend-not-moving-online-heres-why

Smith, T. J., & McKenna, C. M. (2013). A comparison of logistic regression pseudo

R2 indices. Multiple Linear Regression Viewpoints, 39(2), 17-26.

Sohail, M. S., Al-Jabri, I. M., & Wahid, K. M. (2017). Relationship between

marketing program and brand loyalty: Is there an influence of gender?. Journal for

Global Business Advancement, 10(2), 109-124.

Spencer, D. M. (2010). Segmenting special interest visitors to a destination region

based on the volume of their expenditures: An application to rail-trail users. Journal

of Vacation Marketing, 16(2), 83-95.

Sur, S. (2015). The role of online trust and satisfaction in building loyalty towards

online retailers: Differences between heavy and light shopper groups. In LISS 2014

(pp. 489-494). Springer, Berlin, Heidelberg.

Tamaddoni, A., Stakhovych, S., & Ewing, M. (2016). Comparing churn prediction

techniques and assessing their performance: a contingent perspective. Journal of

service research, 19(2), 123-141.

Vahie, A., & Paswan, A. (2006). Private label brand image: its relationship with store

image and national brand. International Journal of Retail & Distribution

Wang, H.-C., Pallister, J. G., & Foxall, G. R. (2006). Innovativeness and involvement

as determinants of website loyalty: III. Theoretical and managerial contributions.

Technovation, 26(12), 1374-1383.

APPENDICES

Appendix A: Descriptives older customers

TABLE 13: DESCRIPTIVES OLDER CUSTOMERS COHORT 2

TABLE 14: DESCRIPTIVES COHORT 3 OLDER CUSTOMERS

Appendix B: Robustness check

Cohort 1, cohort 2 and cohort 3 are estimates for the 6 months threshold. Cohort 2.1, cohort 2.2 and cohort 2.3 are estimates for the 5.5 months threshold. Cohort 3.1, cohort 3.2 and cohort 3.3 are estimates for the 5 months threshold.

Appendix C: VIF scores

TABLE 16: ESTIMATES THRESHOLD 5.5 MONTHS

TABLE 17: ESTIMATES THRESHOLD 5 MONTHS
