ONLINE SHOPPING
The effect of email marketing on customer churn.
By
Joep Schaap
University of Groningen
Faculty of Economics and Business
ABSTRACT
Companies' expenditures on email marketing are still rising even though digital
marketing is changing rapidly and customers are contacted through more and more
channels. With the greater accessibility of the digital market, churning becomes
easier by the day. In this paper, the effectiveness of email marketing in preventing
customers from churning is studied. A logistic regression is applied, with the average
product price as a moderator of the relationship between email marketing and customer
churn and with average order quantity as an extra explanatory variable. The results
show that the effect of email marketing in preventing customers from churning is
changing. The number of emails opened by a customer at the start of their relationship
with the company increases the probability of churn. Possible explanations for this
effect are personal relevance and the number of emails sent.
1. INTRODUCTION
The internet is becoming more and more important for companies since the e-commerce
market is still growing. In 2020 global e-commerce sales will exceed 3.5 trillion
dollars, which accounts for 12% of sales worldwide (Lindner, 2015). The growth
of the e-commerce market also stimulates overseas shopping. According to EMarketer
(2018), 57% of online shoppers worldwide purchase from overseas retailers. The
e-commerce market is crowded with both competitors and customers. Retaining old
customers and gaining new customers thus becomes more of a challenge each day.
Preventing customers from churning is a top priority for most companies since it is
tied to profitability and value (Neslin et al., 2006). According to Bensoussan et al.
(2014) phone companies report a churn rate of 2% each month, while pay-TV firms report
a churn rate of 1% per month (Green, 2016). This results in a cost for the companies
of 10 billion dollars annually (Ascarza et al., 2016). Likewise, companies can increase
the average net present value of a customer by up to 95% by boosting retention rates
by 5% (Kim et al., 2013). Attracting new customers can be very expensive (Lin &
Wang, 2006). Acquiring new customers may sometimes cost five times more than
retaining old customers (Bhattacherjee, 2001). These studies show how important
customer retention is for companies, especially for telecom providers
(subscription-based companies).
Compared to the offline market, the number of interactions with a customer is limited
for online firms. With a limited number of interactions with a customer in the online
market, email becomes vital as a communication tool in customer relation
for achieving repeat traffic in 2005. Reimers et al. (2016) concluded that
permission-based email marketing provides traffic and return visits to a specific
website. Approximately 1.5 billion dollars was spent on email marketing in 2011 (Kim et
al., 2011). In 2017, 2.81 billion dollars was spent on email marketing globally (Scott,
2018). This strong growth in spending indicates the large role email marketing still
has for companies.
The email statistics report (2018) states that the number of email users worldwide
will reach 3.8 billion in 2022. Half of the world's population will then have an email
account and can thus potentially be reached through email. Email marketing therefore
remains a very important tool for companies and its importance seems to be increasing
over the years.
The average order quantity is studied as an explanatory variable for customer churn to
gain insight into whom to target with retention programs. Volume is considered an
important segmentation variable in marketing since heavy and light shoppers behave
differently (Chiou & Pan, 2009; Spencer, 2010).
The following research question is formed to study the effect of email marketing,
price and quantity on customer churn: how is email marketing affecting customer
churn?
A sample of 130,000 customers is drawn from the database of Alleeninkt.nl.
Subsequently a logistic regression is applied to study the effect of email marketing on
customer churn. This topic is studied for this specific company for several
reasons. First, Alleeninkt.nl has actual sales data available for the past two years,
which makes it possible to study actual email marketing effectiveness instead of
reported or intended effectiveness. Second, the company has sent emails randomly to its
customer base, which ensures that any differences between and within customers are not
systematic at the outset of the experiment.
Email marketing is split into the amount of opens and the amount of clicks. The results
suggest that email marketing still significantly influences customer churn. A
2. CONCEPTUAL FRAMEWORK
The conceptual model is displayed in figure 1. The model studies the effect of average
product price on the relationship between email marketing and customer churn. The
average order quantity is also added as an explanatory variable for customer churn.
The main focus in this model is on the effect of email marketing as an explanatory
variable of the customer churn, because multiple studies have shown that email
marketing is still one of the most important cornerstones of online companies.
2.1 DEPENDENT VARIABLE: CUSTOMER CHURN
The growth of the e-commerce market goes hand in hand with its accessibility, which
becomes greater each day. It therefore gets easier for customers to switch between
companies. Switching between companies is referred to as churning (Ascarza, 2018).
Customer churn can have a negative effect on profit, erode price premiums and increase
acquisition costs (Athanassopoulos, 2000).
Customer churn occurs in contractual settings and also in non-contractual
settings. In contrast to contractual businesses, non-contractual businesses are unaware
that their customers have churned since the customers are not bound by a contract.
The customers can buy the product at any time from any company they want (Fader &
Hardie, 2009). Modeling customer churn in a non-contractual setting is therefore
considered a challenge (Fader & Hardie, 2009). Since this is considered both
challenging and profitable for a company, this study focuses on customer churn in
non-contractual businesses.
In other studies customer churn is defined as the percentage of customers who change
provider at their convenience and without notification (Radosavljevik et al., 2010).
Since this study focuses on individual-level data, customer churn is defined as the
probability that a customer has churned.
2.2 INDEPENDENT VARIABLE: EMAIL MARKETING
seller-2003)(Reimers et al., 2016). Quite some research has already been conducted in
the field of email marketing. However, almost all studies on email marketing and churn
rate are more than ten years old. In a study performed by DuFrene et al. (2005) email
marketing is found to be one of the most effective tools for achieving repeat traffic
after a first-time visit. However, with new devices like the iWatch and Google Glass,
more possibilities for online direct communication arise. Companies thus have access
to more channels, which may affect the role of email marketing. Therefore, these
studies are not considered representative of the effect of email marketing on customer
churn in the current digital market.
Another factor that could have changed the role of email marketing is the greater
accessibility of online shops overseas. This accessibility has resulted in a rapid
expansion of cross-border e-commerce (Huang & Chang, 2017). The competition among
online shops becomes fiercer since the reach of each online shop is growing. Customers
thus have a larger number of online shops from which they can buy their products.
The effect and importance of direct communication tools such as email marketing may
thus change. Given the rapid change of the e-commerce market described above, it is
considered important to include email marketing in this study.
In this study email marketing is split into two variables, namely opens and clicks.
Opens is defined as the number of emails that a recipient has opened and clicks is
defined as the number of links, provided by the emails, that a recipient has clicked
on. A study conducted by Sahni et al. (2018) argues that a higher number of emails
opened leads to more sales from already existing customers. Therefore it is
hypothesized that:
H1: The amount of opens negatively influences the customer churn
Micheaux (2011) measures the success of email newsletters in terms of the
click-through rate. This measure captures the clicking behavior of the consumer. Search
advertising platforms rely on per-click models (Lee et al., 2018). These models are
based on the assumption that more clicks lead to higher sales for the advertiser.
Therefore, it is hypothesized that:
H2: The amount of clicks negatively influences the customer churn
2.4. INDEPENDENT VARIABLE: AVERAGE ORDER QUANTITY
This study defines average order quantity as the average number of products that a
customer buys per order. A study conducted by Chiou & Pan (2009) argues that heavy
shoppers pay more attention to the trust they have in an online shop. Light shoppers
are considered less involved in online shopping and heavy shoppers are thus considered
more loyal (Sur, 2015; Wang et al., 2006). One of the factors driving this behavior is
that the switching costs for heavy shoppers are much higher than for light shoppers
(Kim et al., 2001). It can thus be argued that heavy shoppers are less likely to churn.
Highly involved customers are argued to be more brand loyal than less involved
customers (Ferreira, 2015). According to Holmes et al. (2013) customers are more
involved when the price of a transaction is higher. Since the overall transaction price
becomes higher when the order consists of more products, it is expected that
people will be more involved in the transaction. This suggests that people with a
higher average order quantity are less likely to churn. It is therefore argued that:
H3: The average order quantity negatively influences customer churn.
2.3. MODERATOR: AVERAGE PRODUCT PRICE
According to McConnell (1968) customers develop more loyalty for higher priced
products, which may result in higher tolerance for email marketing. However, this
study is very old and therefore not representative of the current market. Including
the average product price in this study is therefore of interest. The average product
price is defined as the average price a customer paid for a product (i.e., if
someone buys one product for two euros and one product for eight euros, the
average product price is five euros).
According to Mishra et al. (2012) the consumer perception of sales promotion is
linked with price since sales promotion is considered as a reduction in price or as a
reduction in the amount of resources spent by the consumer. Price is very influential
in generating brand loyalty (Sohail et al., 2017; Mishra et al., 2012). For more
expensive products a price promotion is argued to be more effective in generating
loyalty (luxury goods excluded). Therefore it is argued that cheaper prices should lead
to higher loyalty.
Osuna et al. (2016) argue that customers who buy products of brands with a higher
relative price (for example, a high price in absolute terms with respect to the average
price of the category) are more likely to redeem a coupon. Since price is related to
brand loyalty and to promotional incentives, it is hypothesized that:
H4: The average product price positively influences the effect of the amount of opens
on the customer churn
H5: The average product price positively influences the effect of the amount of clicks
on the customer churn
It implies: A higher average product price will strengthen the relationship between the
amount of clicks and customer churn
The direct effect that the average product price has on customer churn is
hypothesized as follows:
H6: The average product price has a positive effect on customer churn
3. DATA DESCRIPTION
The empirical study is focused on the effect that email marketing and average order
quantity have on customer churn. Sales data and email data are required for the
empirical analyses. Therefore a customer database and an email database are
employed. Sales data is extracted from the Alleeninkt.nl customer database and the
Alleeninkt.nl email database provides data on the emails sent to its customers. Together
this data includes email, emails opened, emails clicked, product id, date, price of the
product, customer id, title of the product and quantity.
SAMPLE
The sample (observation period) used for analysis contains all the customers from the
1st of October 2018 until the 19th of April 2019. The sample is based on this
timeframe since email data is only available from October 2018 onwards. If customer
data from before October 2018 were used, the effect of email marketing in that period
could not be captured. The sales data available runs from 3 May 2017 until 19 April
2019. The first transaction of a customer could have been before 3 May 2017, but this
is not captured. Therefore each customer who has not transacted with the company
between 3 May 2017 and 30 September 2018 is considered a new customer.
can be divided by the amount of emails they may have received. For example, a
customer who has been with the company for two years has received twice as many emails
as if he or she had been with the company for one year. The period from 3 May
2017 until 30 September 2018 is divided into two periods to capture that effect. This
results in three cohorts: cohort 1 with customers that have been with the company
since the 1st of October 2018, cohort 2 with customers that joined the company
between 15 January 2018 and 30 September 2018 and cohort 3 with customers that
joined the company between 3 May 2017 and 14 January 2018. The emails used for the
analyses contain solely promotional incentives.
Table 1 displays the descriptives from cohort 1, table 2 contains the descriptives from
cohort 2 and table 3 shows the descriptives from cohort 3.
On average a customer in cohort 1 spends 33.38 euros, buys 1.54 products per order
and makes 1.063 transactions during the observation period. A customer in cohort 2
spends on average 39.68 euros, buys 1.78 products per order and makes 1.13
TABLE 1: DESCRIPTIVES COHORT 1
TABLE 2: DESCRIPTIVES COHORT 2
euros, buys 1.86 products per order and makes 1.14 transactions on average in the
observation period. On average, customers in cohort 2 and cohort 3 open almost
three times more emails and click on more than five times more links than
customers in cohort 1 during the observation period. This difference probably arises
because customers in cohort 1 may have become customers only at the end of the
observation period, so these customers have not received many emails yet.
Appendix A includes some descriptives for customers of cohort 2 and cohort 3. On
average, customers from cohort 2 have bought 1.95 products, have spent 31.31 euros
and have made 1.57 transactions before the observation period. Customers in cohort 3
have spent on average 44.74 euros, bought 3.12 products and made 2.39
transactions before the observation period.
MEASURES
Customer churn
Two possible measures are considered to determine churn. First the Pareto/NBD
model is applied to determine churn. However, if this model is not suitable for
determining customer churn, a threshold is used.
Pareto/NBD
(Fader et al., 2005). In parallel with other models, the Pareto/NBD model is very easy
to implement. The Pareto/NBD model assumes customers behave as if they transact
randomly around their mean propensity until they churn. It thus can be used to infer
the probability that a customer with a particular purchase history is still a customer
(Ascarza et al., 2018). Therefore it is argued that the probability of being alive
(P(Alive)) is a necessary intermediate step for the key modeling effort of this study.
Threshold
A threshold of six months is used to determine customer churn. If a customer has
not made a repeat transaction within six months, he or she is considered a churner.
The threshold is set at six months since the observation period consists of just six
and a half months. This is considered the best option in the current context with a
limited timeframe. The customers of Alleeninkt.nl transact at low frequency and
therefore the largest feasible threshold is used. Appendix B shows that determining
customer churn according to a threshold is robust. The significant estimated
parameters do not change much and they do not change sign. There are just some minor
changes in the significance of variables. Changing the cut-off point to five and a half
months results in a significant average order quantity parameter for cohort 1 (p-value
from 0.131 to 0.018). The average product price becomes significant for cohort 2
(p-value from 0.051 to 0.008) and cohort 3 (p-value from 0.115 to 0.009).
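The threshold rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `is_churner`, the approximation of six months as 182 days and the choice to measure the window from a customer's first purchase are assumptions, not details given in the study.

```python
from datetime import date, timedelta

# Assumption: "six months" is approximated as 182 days.
SIX_MONTHS = timedelta(days=182)


def is_churner(purchase_dates, threshold=SIX_MONTHS):
    """Return True when no repeat purchase follows the first one within `threshold`.

    `purchase_dates` holds the distinct transaction dates of one customer
    (hypothetical input format; same-day orders are assumed to be merged already).
    """
    if not purchase_dates:
        raise ValueError("at least one purchase date is required")
    ordered = sorted(purchase_dates)
    first = ordered[0]
    has_repeat = any(later - first <= threshold for later in ordered[1:])
    return not has_repeat


one_time_buyer = [date(2018, 10, 1)]
quick_returner = [date(2018, 10, 1), date(2019, 1, 15)]
late_returner = [date(2018, 10, 1), date(2019, 4, 19)]  # repeat after 200 days
```

Under these assumptions the one-time buyer and the late returner are labeled as churners, while the quick returner is not.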
Email marketing
provided by the email. In total 52 mailings were sent between October 2018 and
April 2019, resulting in 104 dataframes. Two variables are composed for each mailing,
one to indicate whether a customer has opened the email and one to indicate whether a
customer has clicked on the link provided by the email. This results in 104 variables.
To limit the complexity of the model, all variables for opening an email are summed
into one variable and all variables for clicks are summed into one variable. The
variable opens then represents the number of emails a customer has opened and the
variable clicks represents the number of emails in which a customer has clicked on the
link provided by the email at least once.
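As a minimal sketch of this aggregation step, the 104 per-mailing indicators of one customer can be collapsed into the two count variables as follows. The function name `summarise_engagement` and the list-based input format are hypothetical; the example flags are made up for illustration only.

```python
def summarise_engagement(opened_flags, clicked_flags):
    """Collapse per-mailing 0/1 indicators into the counts `opens` and `clicks`."""
    if len(opened_flags) != len(clicked_flags):
        raise ValueError("one open flag and one click flag per mailing expected")
    return {"opens": sum(opened_flags), "clicks": sum(clicked_flags)}


N_MAILINGS = 52  # number of mailings reported in the text

# Illustrative customer: opened three mailings and clicked in two of them.
opened = [1, 0, 1, 1] + [0] * (N_MAILINGS - 4)
clicked = [1, 0, 0, 1] + [0] * (N_MAILINGS - 4)

engagement = summarise_engagement(opened, clicked)
```

Because each click flag is binary per mailing, `clicks` counts mailings with at least one click rather than the total number of clicks.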
Average product price
To measure the average product price a new variable is constructed from the existing
price variable. The original price variable contains the total price of an order for
each customer. To compute the average product price, all prices of products bought in
the observation period are summed and divided by the number of products bought. This
results in a variable for each customer with the average price of each item
bought.
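The computation can be illustrated with a short sketch. The function name `average_product_price` and the `(order_total, quantity)` tuple format are assumptions made for the example; the first call reproduces the two-euro and eight-euro example used earlier in the text.

```python
def average_product_price(orders):
    """Average price per item for one customer.

    `orders` is a list of (order_total_price, quantity) tuples
    (hypothetical input format).
    """
    total_spent = sum(price for price, _ in orders)
    total_items = sum(quantity for _, quantity in orders)
    if total_items == 0:
        raise ValueError("customer has no items in the observation period")
    return total_spent / total_items


# One product for two euros and one product for eight euros.
example_price = average_product_price([(2.0, 1), (8.0, 1)])
```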
Average order quantity
The average order quantity is measured using data from the sales
database of Alleeninkt.nl. The database captures each transaction of a
customer and transforms this transaction into an order for each product (i.e., if
someone buys two different types of products in one transaction, the database
transforms this transaction into two orders, one for each type of product). Therefore
all orders of a customer that were made on the same date are merged.
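A minimal sketch of this merging step is given below. It assumes that each database row can be reduced to an `(order_date, quantity)` pair for one customer; the function name `average_order_quantity` and the example rows are hypothetical.

```python
from collections import defaultdict
from datetime import date


def average_order_quantity(order_lines):
    """Merge rows with the same date into one order and average the quantities.

    `order_lines` is an iterable of (order_date, quantity) rows for one customer.
    """
    quantity_per_order = defaultdict(int)
    for order_date, quantity in order_lines:
        quantity_per_order[order_date] += quantity
    if not quantity_per_order:
        raise ValueError("customer has no orders")
    return sum(quantity_per_order.values()) / len(quantity_per_order)


rows = [
    (date(2018, 11, 5), 1),  # two product types bought in one transaction
    (date(2018, 11, 5), 2),
    (date(2019, 2, 1), 1),
]
example_quantity = average_order_quantity(rows)
```

In this example the two rows of 5 November are merged into one order of three products, so the customer has two orders with an average quantity of 2.0.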
4. METHODOLOGY
The main goal is to measure the effectiveness of email marketing in a churn context.
A Pareto/NBD model is employed to estimate customer churn. The parameters
of the Pareto/NBD model are estimated using all the available data from the sales
database (from 3 May 2017 until 19 April 2019). The sample is split into a calibration
sample and a validation sample to validate the model. The cut-off point is set at
26 April 2018 since this is exactly halfway through the data sample. Therefore, 50%
of the data can be used to estimate the model and the other 50% to validate the model.
Note that both treated and untreated customers are included in both samples, that is,
customers that have opened emails and clicked on the links as well as customers that
have not opened the emails or clicked on the links.
PARETO/NBD
The predictive capability of the model is quite good (figure 2). The conditional
expectation (figure 3) confirms that the fit of the model also holds in the hold-out
period. The information in bins 0, 1 and 2 is considered most important since these
are by far the biggest bins. The predictive capability for these bins therefore matters
a lot more than for bin 3+.
P(ALIVE)
The distribution parameters of the Pareto/NBD model are obtained through
maximum likelihood estimation (MLE). Next, the probability of being alive for each
customer is calculated from day 1 to day 100 of the observation window.
FIGURE 2: FREQUENCY OF REPEAT
The calculated probabilities of still being alive for the customers of the calibration
period appear to be very high (>99.99%). This high chance of being alive is possibly
the result of customers making transactions at low frequency. This is
substantiated by the expectation function of the Pareto/NBD model, which
suggests that a new customer will make 0.15 repeat transactions in the first year. The
conditional expected transactions function returns a value of 0.02. This signifies that
a customer from the calibration sample makes 0.02 transactions in the hold-out period
(day 101 until 200 of the observation window). The estimated drop-out rate is very low
because customers transact at a low frequency. The MLE parameters of the
Pareto/NBD model already suggested these outcomes (r = 2.26e-01, alpha =
5.64e+02, s = 2.41e-08, beta = 1.13e+01).
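The first-year figure can be checked against the standard Pareto/NBD expectation, E[X(t)] = r*beta / (alpha*(s-1)) * (1 - (beta/(beta+t))^(s-1)). The sketch below plugs in the reported MLE parameters; it assumes that time is measured in days (so one year is t = 365), which is not stated explicitly in the text, and the function name is hypothetical.

```python
def expected_transactions(t, r, alpha, s, beta):
    """Pareto/NBD expected number of transactions in (0, t] for a new customer."""
    if s == 1.0:
        raise ValueError("this closed form is not defined for s == 1")
    survival_term = (beta / (beta + t)) ** (s - 1.0)
    return (r * beta) / (alpha * (s - 1.0)) * (1.0 - survival_term)


# MLE parameters as reported in the text.
R, ALPHA, S, BETA = 2.26e-01, 5.64e+02, 2.41e-08, 1.13e+01

# Assumption: the time unit is days.
first_year = expected_transactions(365.0, R, ALPHA, S, BETA)
print(round(first_year, 2))
```

Under the daily time unit this evaluates to roughly 0.15, in line with the value reported above. Because s is almost zero, the expression is close to r*t/alpha, which reflects a negligible estimated drop-out rate.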
There is no variation in customer churn since the Pareto/NBD model returns P(Alive)
values above 99.99% for each customer. Therefore customer churn is
determined through another method: a threshold is used to determine whether a
customer has churned or not.
FUNCTIONAL FORM
After estimation of P(Alive) a threshold is used to determine customer churn.
Customer churn is then modeled as a function of email marketing, average order
quantity and average product price in a binary choice model.
The binary choice model assumes that the choices customers make depend on the
utility they gain from each of the choices (Franses & Paap, 2003). However, utility is
not observed and does not have a proper scale. Figure 4 shows the function for utility.
Where U_i is the latent utility of a customer's choice, α is the constant, β is the
parameter of x_i, x_i is the observed independent variable and ε_i is the error term.
FIGURE 4: UTILITY FUNCTION
This latent utility is translated into the observable choice of the customer
(Y_i), which is modeled as a probability on a scale between 0 and 1 (figure 5). It
assumes that if U_i <= 0 then Y_i is 0 and if U_i > 0 then Y_i is 1.
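Based on the symbol definitions above, the latent utility and the decision rule can be written out as follows. The last line is the standard logistic form of the choice probability and assumes a logistically distributed error term; it is added here for illustration and is not taken from the figures.

```latex
U_i = \alpha + \beta x_i + \varepsilon_i,
\qquad
Y_i =
\begin{cases}
  1 & \text{if } U_i > 0 \\
  0 & \text{if } U_i \le 0
\end{cases}

% Assumption: \varepsilon_i follows a standard logistic distribution
P(Y_i = 1 \mid x_i) = \frac{\exp(\alpha + \beta x_i)}{1 + \exp(\alpha + \beta x_i)}
```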
A logistic regression and a probit regression can be applied to study the proposed
conceptual framework. The choice of which to use will be made according to the
information criteria and the log likelihood of the models. The information criteria are
estimators of the relative quality of the model (Franses & Paap, 2003). The log
likelihood is a measure to check how close the model is to the real data.
The estimated model is as follows:
With:
i = customer
Where in customer i:
Y = customer churn
EO = emails opened
EC = emails clicked
PC = price class
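One plausible way to write such a model, given the variables listed above and the hypotheses on moderation, is sketched below. This is an assumption and not the author's stated equation: the symbol OQ for average order quantity is hypothetical, as are the ordering and naming of the coefficients.

```latex
% Sketch under stated assumptions; OQ_i (average order quantity) is a hypothetical symbol
\ln\!\left(\frac{P(Y_i = 1)}{1 - P(Y_i = 1)}\right)
  = \beta_0 + \beta_1 EO_i + \beta_2 EC_i + \beta_3 PC_i + \beta_4 OQ_i
  + \beta_5 \,(EO_i \times PC_i) + \beta_6 \,(EC_i \times PC_i)
```

The two interaction terms correspond to the moderating role of price in H4 and H5, and the direct price term corresponds to H6.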
5. RESULTS
Outliers are excluded from the sample before estimation of the Pareto/NBD model to
obtain proper results from the analysis. According to the data, all quantities
above one turn out to be outliers, which would amount to 33,970 data points. Excluding
these data points from the dataset would result in a big loss of data. After dialogue
with the owners of Alleeninkt.nl it is decided to exclude only the data points with a
quantity above 5, since these orders are likely to be for business purposes and not
for private use. As a result, 679 observations are excluded from the sample.
The outlier check for price per unit results in 19,396 outliers. Excluding all of these
would again lead to a big loss of data. The threshold for excluding outliers, in
consultation with the owners of Alleeninkt.nl, is set at 80. All unit prices above 80
are likely to be orders for business purposes. As a result, 6,140 observations are
excluded from the sample.
The outlier check for the amount of opens and the amount of clicks suggests that the
amount of opens has 6,520 outliers and that the amount of clicks has 7,461 outliers.
These outliers are not excluded from the database since no justifiable threshold could
be set and excluding all of them would result in a huge data loss (more than 20%).
MULTICOLLINEARITY
First multicollinearity is tested with the VIF score test. The most commonly
used rule of thumb is 10 (O’Brien, 2007). If the VIF scores of explanatory variables
exceed 10 then multicollinearity is present. No multicollinearity is identified for
any of the three cohorts according to this test. All explanatory variables of the
logistic model have VIF scores below 10 (Appendix C).
A correlation matrix is constructed to substantiate the results of the VIF test
(Appendix D). According to this correlation matrix no explanatory variable appears to
be multicollinear in any cohort. All values across all cohorts are less than 0.32,
which means very little correlation. Therefore it can be stated that there is no
multicollinearity.
INTERPRETATION OF THE LOGISTIC REGRESSION
For interpretation of the results three ways of assessing the impact of the explanatory
variables are used. First, the estimates of the original model are interpreted (table 4).
Second, the odds ratio is used (table 5). The odds ratio represents the odds
of an event happening versus not happening (i.e., if the odds ratio is 3, then the odds
of happening versus not happening are 3 to 1) (Franses & Paap, 2003). Third, the
average marginal effects are used (table 6). The average marginal effects
display by how much the probability of observing an event increases if a
binary variable changes from 0 to 1 or if a continuous variable changes instantaneously
(Williams, 2012).
TABLE 4: ESTIMATED PARAMETERS LOGISTIC MODEL
TABLE 5: ODDS RATIO LOGISTIC MODEL
TABLE 6: AVERAGE MARGINAL EFFECTS LOGISTIC MODEL
Email marketing
H1: The amount of opens negatively influences the customer churn.
H2: The amount of clicks negatively influences the customer churn
No evidence is found to support H1, although the amount of opens does significantly
influence customer churn for cohort 1. The amount of opens appears to
positively influence customer churn with an estimate of 0.13905
(p-value=0.000) (table 4). This suggests that the probability that a customer from
cohort 1 churns increases if that customer opens more emails. If that customer opens
one more email, keeping all other parameters constant, the odds of churning versus not
churning increase by 14.92% for that customer (table 5). A marginal change in
the amount of opens leads to an increase in the probability of churn of 1.09%
(p-value=0.000) (table 6).
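The link between the logit estimate and the percentage change in the odds can be verified with a short calculation: the odds ratio is the exponential of the coefficient. The helper name `odds_change_pct` is hypothetical; the coefficient is the cohort 1 estimate for opens reported above.

```python
import math


def odds_change_pct(coefficient):
    """Percentage change in the odds for a one-unit increase in the regressor."""
    return (math.exp(coefficient) - 1.0) * 100.0


# Cohort 1 estimate for the amount of opens.
opens_cohort1 = odds_change_pct(0.13905)
print(round(opens_cohort1, 2))
```

This reproduces the reported increase of 14.92% in the odds of churning for each additional email opened.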
No statistically significant relationship is found for the amount of opens and customer
churn in cohort 2 and cohort 3. So it can be concluded that the more emails a
customer in cohort 1 opens, the higher the probability is that he or she churns.
in cohort 2 will decrease by 68.19% if that customer clicks on one more link (table
5). A marginal change in the amount of clicks leads to a decrease in the
probability of churn of 4.66% for a customer in cohort 2 (p-value=0.000) (table 6).
The odds of churning versus not churning decrease by 63.61% if a customer in cohort 3
clicks on one more link (table 5). A marginal change in
the amount of clicks leads to a decrease in the probability of churn of 5.08% for a
customer in cohort 3 (p-value=0.000) (table 6).
It can thus be stated that the probability that a customer, in any cohort, will churn
decreases if he or she clicks on more links.
Average order quantity
It is hypothesized that the average order quantity is negatively related to customer
churn. This effect is tested according to the following hypothesis.
H3: The average order quantity negatively influences customer churn.
No evidence is found to support H3 since the parameters of the average order quantity
appear to be insignificant for all cohorts (p-value=0.131 for cohort 1,
p-value=0.713 for cohort 2 and p-value=0.599 for cohort 3) (table 4). It is thus
suggested that the average order quantity does not affect customer churn.
Average product price
measured for both emails opened and emails clicked to measure this effect. Resulting
in the following hypotheses:
H4: The average product price positively influences the effect of emails opened on the
customer churn
H5: The average product price positively influences the effect of emails clicked on the
customer churn
Also a direct effect is tested:
H6: The average product price has a positive effect on customer churn
No evidence is found to support H4, H5 or H6 (table 4). H4 is not supported since the
parameters of the moderation variable for opens are not significant for any cohort
(p-value=0.372 for cohort 1, p-value=0.898 for cohort 2 and p-value=0.919 for cohort
3). No support for H5 is found since the parameters of the moderation variable
for clicks are insignificant for all cohorts (p-value=0.074 for cohort 1, p-value=0.434
for cohort 2 and p-value=0.089 for cohort 3). H6 is not supported since the direct
effect parameters of the average product price are insignificant for all cohorts
(p-value=0.716 for cohort 1, p-value=0.051 for cohort 2 and p-value=0.115 for cohort 3).
It is thus suggested that the average product price does not moderate the effect of
email marketing on customer churn.
MODEL VALIDATION
methods to measure the relative quality of the statistical model. The model with the
lowest value has the best relative quality (Franses & Paap, 2003). The log-likelihood
(LL) measures how close the predictions of the model are to the observed data. The
model with the highest value has the closest predictions relative to the observed data.
Table 7 displays the AIC, BIC and LL of the different models for each cohort.
‘Logitcoh1’, ‘Logitcoh2’ and ‘Logitcoh3’ are the logistic regression models of cohort
1, 2 and 3. ‘Probitcoh1’, ‘Probitcoh2’ and ‘Probitcoh3’ are the probit regression
models of cohort 1, 2 and 3. The probit model scores best for cohort 1 on AIC (19643
compared to 19648 for logistic), BIC (19702.29 compared to 19707.34 for logistic)
and LL (-9814.727 compared to -9817.248 for logistic). However, the logistic model
scores better than the probit model for cohort 2 and 3 on AIC (2082.3 and 3109.5
compared to 2100.4 and 3122.2 for probit), BIC (2127.54 and 3156.545 compared to
2145.645 and 3169.122 for probit) and LL (-1034.144 and -1547.748 compared to
-1043.191 and -1554.037 for probit). So according to the AIC, BIC and LL a logistic
regression scores best for two of the three cohorts. Therefore it is suggested that a
logistic regression is preferred.
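The relationship between the reported log-likelihoods and the information criteria can be illustrated with the standard definitions AIC = 2k - 2LL and BIC = k ln(n) - 2LL. The sketch below assumes k = 7 estimated parameters (an intercept, four main effects and two interaction terms), which is not stated in the text, and enters the log-likelihoods as negative numbers. The number of observations n per cohort is not reported in this section, so BIC is only defined and not evaluated for the cohorts.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2LL."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2LL."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


K = 7  # assumption: intercept + 4 main effects + 2 interactions

reported_ll = {
    "Logitcoh1": -9817.248,
    "Probitcoh1": -9814.727,
    "Logitcoh2": -1034.144,
    "Logitcoh3": -1547.748,
}
aic_values = {name: aic(ll, K) for name, ll in reported_ll.items()}
```

With k = 7 the computed AIC values agree with the reported figures for these four models up to rounding, which supports reading the log-likelihoods of the logistic models for cohort 2 and 3 as negative values.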
Another measure used to choose between a logistic and a probit regression is the
pseudo R-squared. A simple definition of R-squared does not apply to binary data
since the dependent variable is either 1 or 0. Therefore the Cox &
Snell R-squared, Nagelkerke R-squared and McFadden R-squared are applied. Table
8 shows the results of these three different pseudo R-squared measures. A higher
R-squared value represents a better explanation of the dependent variable by the
independent variables (Smith & McKenna, 2013).
The probit regression scores best on the McFadden (0.03218 compared to 0.03193 for
logistic), Nagelkerke (0.04263 compared to 0.04230 for logistic) and Cox & Snell
R-squared (0.01955 compared to 0.01940 for logistic) for cohort 1. However, as
with the AIC, BIC and LL, a logistic regression is favored for cohort 2 and 3
according to the McFadden (0.06873 and 0.06405 compared to 0.06058 and 0.06025 for
probit), Nagelkerke (0.08469 and 0.08145 compared to 0.07480 and 0.07670 for
probit) and Cox & Snell R-squared (0.03166 and 0.03396 compared to 0.02796 and
0.03198 for probit). Again, for cohort 2 and 3 a logistic regression scores best and
therefore it is argued that logistic regression is preferred.
A final, non-statistical argument for applying a logistic regression instead of a probit
regression is that logistic regression is favored by marketers and customer churn
modelers due to its ease of use, interpretability and robustness (Tamaddoni et al.,
2016).
A log-likelihood ratio test is applied to further validate the logistic model (Franses &
Paap, 2001). The test measures whether the logistic
model outperforms the null model. It is significant for all three
cohorts, which rejects the H0 of the test (the variables are redundant). Therefore it can
be argued that the logistic regression is significantly better than the null model for all
three cohorts.
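The likelihood ratio test described above can be sketched in a few lines. The statistic is twice the difference between the log-likelihood of the fitted model and that of the null model, and under H0 it follows a chi-squared distribution with as many degrees of freedom as there are explanatory variables. The function names, the log-likelihood values and the choice of 6 degrees of freedom are assumptions for illustration. The closed-form tail probability used here is valid for an even number of degrees of freedom only.

```python
import math


def lr_statistic(ll_model: float, ll_null: float) -> float:
    """Likelihood ratio statistic: 2 * (LL_model - LL_null)."""
    return 2.0 * (ll_model - ll_null)


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) of a chi-squared variable, even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 0.0
    for i in range(df // 2):
        total += term           # adds half**i / i!
        term *= half / (i + 1)  # prepares half**(i+1) / (i+1)!
    return math.exp(-half) * total


# ASSUMPTION: made-up log-likelihoods and 6 explanatory variables.
statistic = lr_statistic(ll_model=-90.0, ll_null=-100.0)
p_value = chi2_sf_even_df(statistic, df=6)
print(statistic, p_value)
```

A p-value below the chosen significance level rejects H0, meaning the explanatory variables jointly improve on the null model.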
For face validation, the coefficients of the model are evaluated to judge whether they
make sense. Odd estimates indicate bad face validity. At first sight it seems odd
that emails opened has a positive effect on customer churn, since clicks has a strong
negative effect and the literature on email marketing suggested a favorable effect of email as well.
However, clicks and opens are not the same and can therefore have different effects.
A plausible reason for the positive effect of emails opened is explained further in
the general discussion.
MAIN EFFECTS
A main effects model is estimated to discuss the influence of the moderation variable.
Table 9 shows the AIC and BIC for both the entire model and the main
effects only model. ‘Logitmaincoh1’, ‘Logitmaincoh2’ and ‘Logitmaincoh3’
represent the main effects only models for cohorts 1, 2 and 3. The predictive
capability of the model increases slightly when the moderation
variable is excluded from the model, for all cohorts.
TABLE 9: AIC AND BIC FOR MAIN EFFECT AND
The estimated parameters for the main effects model of all cohorts are displayed in
table 10, table 11 and table 12. These results show marginal changes for all
parameters compared to the model with the moderation effect, except for the direct effect
of average product price in cohort 2. This model suggests that the average product
price has a positive effect on customer churn for customers in cohort 2
(p-value=0.021). It means that customers who pay a higher average product price are
more likely to churn (table 10). The odds of churning versus not churning
increase by 0.96% when a customer in cohort 2 has an average product price that is 1
euro higher (table 11). A marginal change in the average product price leads to an
increase of 0.04% in the probability of churning.
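The link between a logit coefficient, the change in the odds and the marginal effect on the probability can be sketched as follows. An increase of 0.96% in the odds corresponds to an odds ratio of 1.0096, so the implied coefficient is the natural logarithm of that value. The function names are hypothetical, and the baseline churn probability used for the marginal effect is an assumed placeholder, not a figure from the paper.

```python
import math


def percent_change_in_odds(beta: float) -> float:
    """Percentage change in the odds for a one-unit increase: (exp(beta) - 1) * 100."""
    return (math.exp(beta) - 1.0) * 100.0


def marginal_effect(beta: float, p: float) -> float:
    """Logit marginal effect on the probability at baseline probability p."""
    return beta * p * (1.0 - p)


# Coefficient implied by the reported 0.96% increase in the odds.
beta_price = math.log(1.0096)
# ASSUMPTION: placeholder baseline churn probability, for illustration only.
p_baseline = 0.10

print(round(percent_change_in_odds(beta_price), 2))
print(marginal_effect(beta_price, p_baseline))
```

The marginal effect depends on the baseline probability at which it is evaluated, which is why it is much smaller than the percentage change in the odds.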
TABLE 10: ESTIMATED PARAMETERS MAIN EFFECT MODEL
TABLE 11: ODDS RATIO MAIN EFFECT MODEL
The significance of the average product price in the main effects model is not a
surprise, since this variable was only just insignificant in the full model for cohort 2
(p-value=0.051) (table 4). However, it is odd that the average product price is
significant for cohort 2 only. It suggests that the average price level increases the
probability of churning for customers that have been with the company for between
six and a half and fifteen months. It does not affect customers who have been with the
company for longer than fifteen months or shorter than six and a
half months.
Overall, the main effects only model predicts customer churn slightly better than the
full model. Also, there is only a very small change in the estimated parameters. So using a
main effects only model does not change the results of the study.
GENERAL DISCUSSION AND MANAGERIAL IMPLICATIONS
emails that do not contain any relevant or personal information, they become
unreceptive towards these emails. The emails, in this particular context, could be
considered not relevant, since Alleeninkt.nl sells low frequency
products. Customers do not have to buy the products often and therefore may find
receiving emails a couple of times per month bothersome. This is also argued by
Bruner & Kumar (2007), who found that people who have opted in for email marketing
still find receiving too many messages bothersome.
Thus firms should reconsider the personal relevance of the emails they send.
Improving the personal relevance of the emails should improve their effectiveness
(Batra & Keller, 2016). An example is the number of emails sent. Firms that
have low frequency customers might reconsider the number of emails they send. For
this type of customer, a couple of emails per month seems to be too much and leads
to an increase in customer churn in the first period of the relationship with the
company. Push strategies, like sending email, become ineffective and therefore firms
must employ new and effective internet marketing strategies.
So it turns out that emails opened do not influence customer churn in the way the literature
suggested. Shankar et al. (2016) argue that the use of mobile for marketing purposes
is growing dramatically. Besides mobile marketing, other new marketing channels are
developing and growing as well. According to Opreana & Vinerean (2015) these
developments make email marketing less effective, since customers can
choose their object of interaction. So online firms must reconsider their traditional
marketing strategies to reduce customer churn and must respond to the development
of the digital market to retain effective customer churn strategies.
As the literature suggested, the number of clicks does decrease customer churn. This
effect gets bigger for cohort 2 compared to cohort 1 but then remains almost the same for
cohort 3. Generating more clicks through email marketing thus reduces customer
churn. Online firms should focus on email optimization to generate more clicks and
thus reduce customer churn.
It is argued that the average product price does not moderate the effect email
marketing has on customer churn, despite what the literature claims. Also, no
significant direct effect of the average product price on customer churn is recorded.
So customers with different price orientations do not respond differently to email marketing in
a customer churn context. Kaura et al. (2015) discuss the effect of price fairness on
customer loyalty. A fair price leads to higher customer loyalty than an unfair price.
However, price fairness can be equal among different product prices. So the price of a
more expensive cartridge can be considered as fair as a cheaper one. For
but the people with highest sensitivity leads to a decrease in customer churn.
However, targeting the email marketing on different product prices is not effective.
The average order quantity has no significant effect on churn rate in any cohort,
contrary to what the literature suggested. Customers that buy private label brands are less involved
than customers that buy manufacturer brands (Vahie & Paswan, 2006). The goods
Alleeninkt.nl sells are utilitarian products from a private label and therefore it can be
argued that, no matter the quantity, customers of Alleeninkt.nl have low involvement. So
targeting customers on their order quantity is not effective for firms with low-involvement
customers.
LIMITATIONS AND SUGGESTIONS FOR FURTHER RESEARCH
The way of calculating customer churn in this context is not ideal. For modeling
customer churn with low frequency customers a lot more data is needed. Almost two
years of data are used for calculating the P(Alive) (May 2017 until April 2019); however,
this was not enough. There was no variation in the P(Alive), and thus it is suggested
that modeling customer churn in this context is quite challenging and that data
from a longer time period (more than two years) is needed. This study had limited time
and therefore customer churn is determined through a threshold
instead of other models. Other models to determine customer churn could have been
studied and applied if more time had been available. So future research could focus on
modeling customer churn for low frequency firms.
The actual (prior) conceptual framework could not be studied through some
rain, temperature) had to be excluded, which results in a very thin conceptual
framework. Another consequence of the misfortunes in data gathering is the lack of
customer characteristics and control variables. Data storage and collection is new for
Alleeninkt.nl and this study is the very first study performed with their data. This
study is part of a graduation program and had a limited timeframe. Only after eight
weeks in the program did the university agree to the use of the data of Alleeninkt.nl,
with assurance that the AVG (GDPR) privacy rules would not be violated. The data could
not be made available before this assurance. Therefore there was not much time to
find other data after finding out about the lack of data.
The effect of email marketing on customer churn in the digital market is decreasing.
Nowadays customers are faced with promotions and advertisements through more
channels than ten years ago (Opreana & Vinerean, 2015). Therefore the effectiveness
of these new channels in a customer churn context is of interest for future research.
The average product price is slightly insignificant for cohort 2 in the full model and
significant in the main effects only model. It means that neither new customers nor the oldest
customers are affected by the average product price in their churn
probability. No literature is found to support this result. Therefore future research is
needed to explain this phenomenon.
REFERENCES
Ascarza, E. (2018). Retention futility: Targeting high-risk customers might be
ineffective. Journal of Marketing Research, 55(1), 80-98.
Ascarza, E., Iyengar, R., & Schleicher, M. (2016). The perils of proactive churn
prevention using plan recommendations: Evidence from a field experiment. Journal
of Marketing Research, 53(1), 46-60.
Ascarza, E., Netzer, O., & Hardie, B. G. (2018). Some customers would rather leave
without saying goodbye. Marketing Science, 37(1), 54-77.
Athanassopoulos, A. D. (2000). Customer satisfaction cues to support market
segmentation and explain switching behavior. Journal of business research, 47(3),
191-207.
Batra, R., & Keller, K. L. (2016). Integrating marketing communications: New
findings, new lessons, and new ideas. Journal of Marketing, 80(6), 122-145.
Bensoussan, L., Del Bosch, L. M., Naud, F., & Benichou, P. (2014). Churn value management:
How can companies retain customers in an increasingly volatile world. Accessed July
24, 2018, http://www.oliverwyman.com/content/dam/oliver-wyman/global/en/2014/oct/201408_Churn_value_management_Screen.pdf
Bhattacherjee, A. (2001). An empirical analysis of the antecedents of electronic
commerce service continuance. Decision Support Systems, 32(2), 201-214.
Chiou, J. S., & Pan, L. Y. (2009). Antecedents of internet retailing loyalty: differences
between heavy versus light shoppers. Journal of Business and Psychology, 24(3), 327.
Chittenden, L., & Rettie, R. (2003). An evaluation of email marketing and factors
affecting response. Journal of Targeting, Measurement and Analysis for Marketing,
11(3), 203-217.
DuFrene, D., Engelland, B., Lehman, C., & Pearson, R. (2005). Changes in
consumer attitudes resulting from participation in a permission e-mail campaign.
Journal of Current Issues and Research in Advertising, 27(1), 65-77.
Email Statistics Report, 2018-2022. (2018). Page on Radicati’s website. Retrieved
from
https://www.radicati.com/wp/wpcontent/uploads/2018/01/Email_Statistics_Report,_2018-2022_Executive_Summary.pdf
eMarketer. (2018). What Percent of Purchases Are Digital Buyers Worldwide Making
Domestically vs. Cross-Border? (% of respondents, May 2018). Consulted at
2-4-2019, from:
https://www.emarketer.com/Chart/What-Percent-of-Purchases-Digital-Buyers-Worldwide-Making-Domestically-vs-Cross-Border-of-respondents-May-2018/223102
Fader, P. S., & Hardie, B. G. (2009). Probability models for customer-base
analysis. Journal of interactive marketing, 23(1), 61-69.
Franses, P. H., & Paap, R. (2001). Quantitative models in marketing
research. Cambridge, UK: Cambridge University Press.
Green, H. (2016). Economic analysis of proposed Sky/Vodafone merger. Report,
Axiom Economics, Petersham, Australia.
Grimes, G., Hough, M., & Signorella, M. (2007). Email end users and spam:
relations of gender and age group to attitudes and actions. Computers in Human
Behavior, 23(1), 318-332.
Holmes, A., Byrne, A., & Rowley, J. (2013). Mobile shopping behaviour: insights
into attitudes, shopping process involvement and location. International Journal of
Retail & Distribution Management, 42(1), 25-39.
Huang, S. L., & Chang, Y. C. (2017, January). Factors that impact consumers'
intention to shop on foreign online stores. In Proceedings of the 50th Hawaii
international conference on system sciences.
Kaura, V., Durga Prasad, C. S., & Sharma, S. (2015). Service quality, service
convenience, price and fairness, customer loyalty, and the mediating role of customer
satisfaction. International Journal of Bank Marketing, 33(4), 404-422.
Kim, B. D., Shi, M., & Srinivasan, K. (2001). Reward programs and tacit
collusion. Marketing Science, 20(2), 99-120.
Kim, Y., Trail, G. and Ko, Y. (2011), “The influence of relationship quality on sport
consumption behaviors: an empirical examination of the relationship quality
Kim, Y. S., Lee, H., & Johnson, J. D. (2013). Churn management optimization with
controllable marketing variables and associated management costs. Expert Systems
with Applications, 40(6), 2198-2207.
Kumar, S., & Sharma, R. (2014). An empirical analysis of unsolicited commercial
e-mail. Paradigm, 18(1), 1-19.
Lee, J. Y., Fang, E., Kim, J. J., Li, X., & Palmatier, R. W. (2018). The effect of online
shopping platform strategies on search, display, and membership revenues. Journal of
Retailing, 94(3), 247-264.
Lin, H.-H., & Wang, Y.-S. (2006). An examination of the determinants of customer
loyalty in mobile commerce contexts. Information & Management, 43(3),
271-282.
Lindner, M. (2015). Global e-commerce sales set to grow 25% in 2015. Internet
Retailer, July, 29.
Martínez, S. C., & Guillén, M. J. Y. (2006). Can price promotions improve tourist
loyalty to tour operators?. Journal of Hospitality & Leisure Marketing, 14(4), 33-46.
McConnell, J. D. (1968). The development of brand loyalty: an experimental
study. Journal of Marketing Research, 5(1), 13-19.
Micheaux, A. L. (2011). Managing e-mail advertising frequency from the consumer
perspective. Journal of Advertising, 40(4), 45-66.
Mishra, U. S., Das, J. R., Mishra, B. B., & Mishra, P. (2012). Perceived benefit
analysis of sales promotion: a case of consumer durables. International Research
Nagar, K. (2009). Evaluating the effect of consumer sales promotions on brand loyal
and brand switching segments. Vision, 13(4), 35-48.
Neslin, S. A., Gupta, S., Kamakura, W., Lu, J., & Mason, C. H. (2006). Defection
detection: Measuring and understanding the predictive accuracy of customer churn
models. Journal of Marketing Research, 43(2), 204-211.
O’Brien, R. M. (2007). A caution regarding rules of thumb for variance inflation
factors. Quality & quantity, 41(5), 673-690.
Opreana, A., & Vinerean, S. (2015). A new development in online marketing:
Introducing digital inbound marketing. Expert Journal of Marketing, 3(1).
Osuna, I., González, J., & Capizzani, M. (2016). Which categories and brands to
promote with targeted coupons to reward and to develop customers in supermarkets.
Journal of Retailing, 92(2), 236-251.
Radosavljevik, D., van der Putten, P., & Larsen, K. K. (2010). The impact of
experimental setup in prepaid churn prediction for mobile telecommunications: What
to predict, for whom and does the customer experience matter?. Trans. MLDM, 3(2),
80-99.
Reimers, V., Chao, C. W., & Gorman, S. (2016). Permission email marketing and its
influence on online shopping. Asia Pacific Journal of Marketing and Logistics, 28(2),
308-322.
Sahni, N. S., Wheeler, S. C., & Chintagunta, P. (2018). Personalization in email
marketing: The role of noninformative advertising content. Marketing Science, 37(2),
236-258.
Scott, S. (2018, 16 April). No, advertising spend is not moving online – here’s why.
Consulted at 1-5-2019, from:
https://www.thedrum.com/opinion/2018/04/16/no-advertising-spend-not-moving-online-heres-why
Smith, T. J., & McKenna, C. M. (2013). A comparison of logistic regression pseudo
R2 indices. Multiple Linear Regression Viewpoints, 39(2), 17-26.
Sohail, M. S., Al-Jabri, I. M., & Wahid, K. M. (2017). Relationship between
marketing program and brand loyalty: Is there an influence of gender?. Journal for
Global Business Advancement, 10(2), 109-124.
Spencer, D. M. (2010). Segmenting special interest visitors to a destination region
based on the volume of their expenditures: An application to rail-trail users. Journal
of Vacation Marketing, 16(2), 83-95.
Sur, S. (2015). The role of online trust and satisfaction in building loyalty towards
online retailers: Differences between heavy and light shopper groups. In LISS 2014
(pp. 489-494). Springer, Berlin, Heidelberg.
Tamaddoni, A., Stakhovych, S., & Ewing, M. (2016). Comparing churn prediction
techniques and assessing their performance: a contingent perspective. Journal of
service research, 19(2), 123-141.
Vahie, A., & Paswan, A. (2006). Private label brand image: its relationship with store
image and national brand. International Journal of Retail & Distribution
Wang, H.-C., Pallister, J. G., & Foxall, G. R. (2006). Innovativeness and involvement
as determinants of website loyalty: III. Theoretical and managerial contributions.
Technovation, 26(12), 1374-1383.
APPENDICES
Appendix A: Descriptives older customers
TABLE 13: DESCRIPTIVES OLDER CUSTOMERS COHORT 2
TABLE 14: DESCRIPTIVES COHORT 3 OLDER CUSTOMERS
Appendix B: Robustness check
Cohort 1, cohort 2 and cohort 3 are estimates for the 6 months threshold. Cohort 2.1, cohort 2.2 and cohort 2.3 are estimates for the 5.5 months threshold. Cohort 3.1, cohort 3.2 and cohort 3.3 are estimates for the 5 months threshold.
Appendix C: VIF scores
TABLE 16: ESTIMATES THRESHOLD 5.5 MONTHS
TABLE 17: ESTIMATES THRESHOLD 5 MONTHS