Do you accept my apology, or do you want more?

A field experiment investigating the effects of a service failure for an online retailer

Master Thesis

MSc Marketing Intelligence

Date: January 2019

Author: George H. G. Radix

Address: Violenstraat 20-4, 9712 RJ Groningen

Email: ghg.radix@gmail.com

Internal supervisor: Prof. dr. Jaap E. Wieringa (j.e.wieringa@rug.nl)

Second supervisor: dr. Jelle T. Bouma

External supervisor: Wilson Spoor (wspoor@wehkamp.nl)

University of Groningen

Faculty of Economics & Business

Department of Marketing

Management summary

The online retail industry is becoming increasingly important for Dutch customers as a channel of choice. Because online competition is fierce, building a strong relationship with customers is becoming more important as well. Moreover, strong relationships have a positive effect on the profitability of a company (De Wulf, Odekerken-Schröder & Iacobucci, 2001). Wehkamp is one company that is working on building stronger relationships with its customers. Sometimes these relationships are negatively influenced by errors that occur in the process of making a purchase.

This study is conducted on behalf of Wehkamp to investigate the effects of so-called service failures. We investigate the effects of a failure that occurs during the delivery process of a purchase: the customer receives the package outside the timeframe designated when making the purchase. These cases are documented, and some customers receive an apology or a coupon, which are defined as service recoveries. Recoveries are used to negate the negative effects of the failure on the customer relationship. Service failures often affect the satisfaction, loyalty, and buying behaviour of customers.

The results show that there is indeed a negative effect when customers receive their package outside the timeframe. This study focusses on the effects on actual buying behaviour. We find that the amount spent, the number of orders placed, and the number of products bought are all negatively affected by the service failure. We also find that the coupon is an effective recovery strategy to negate the negative effects on buying behaviour.

We find these effects by comparing the customers that experienced the failure with a sample of normal customers. All these customers were further divided into pre-existing segments: new, occasional, regular, formerly regular, and inactive customers. We find that the failure is experienced most negatively by the customers lower in loyalty. Furthermore, the results indicate no difference between segments in the effectiveness of the coupon as recovery strategy. The apology was not very effective, showing only a few positive effects. So, we conclude that this service failure is not

The online retail industry in the Netherlands is becoming increasingly important for Dutch customers as a channel of choice. The user base is expected to grow from 78% (13.4 mln) in 2018 to about 81% (14.1 mln) in 2022. Also, the market volume of €12,706 mln in 2018 is expected to grow by about 8% annually until 2022 (“Statista Market Forecast”, 2018). This suggests that the share of wallet for online purchases will increase in the coming years, and companies want to capture a bigger part of that increasing share of wallet. To do this, companies try to build strong relationships with their customers.

Wehkamp is one company that is working on building stronger relationships with its customers. This online retailer has an impressive history: it was founded in 1952 and managed to make the transition from mail-order company to pure online retailer in 2006. Now Wehkamp is facing the challenge of building a lasting relationship with its customers. Its ambition is to more than double its regular customer base from 900,000 to 2 million regular customers in 2021. Wehkamp recognises that building a strong relationship with its customers is one of the ways to contribute to this growth.

Building a strong relationship with customers has a positive effect on the profitability of a company (De Wulf, Odekerken-Schröder & Iacobucci, 2001). However, in the online environment the nature of building this relationship is profoundly different from the traditional offline setting, since there is almost no interpersonal contact with the customer (Forbes, Kelley & Hoffman, 2005). Combined with low switching barriers (Kelley et al., 1993; Mutum, Ezlika, Bang & Arnott, 2014), this leads to an environment with high competition and increasingly knowledgeable customers (McMullan & Gilmore, 2008). This makes it hard to build a lasting relationship with customers. This study investigates one topic that could help Wehkamp to build a strong relationship, thereby reaching one of its goals.

When buying a product or service at a retailer, be it online or offline, it can happen that a defect or a mistake occurs. This experience is called a “service failure”. Several studies have shown that service failures adversely affect the loyalty and satisfaction of customers (Kelley, Hoffman & Davis, 1993; Forbes et al., 2005; Roschk & Gelbrich, 2013; Van Vaerenbergh & Orsingher, 2016).

terms of satisfaction and/or loyalty. Both the service failure and the service recovery come in many different forms (Kelley et al., 1993; Forbes et al., 2005) and are experienced differently in online and offline situations.

Research on service failures started with classifying what failures exist and what possible recoveries can be deployed (Kelley et al., 1993). This was augmented by many lines of research. Some explored the differences between online and offline types of failures and recoveries (Forbes et al., 2005). Others investigated the effects for different types of customers (Gelbrich et al., 2016), varying levels of intimacy (Jeon & Kim, 2016), or relationship factors such as past experience (Hess Jr. et al., 1998).

Customers often respond negatively when experiencing a failure. Research has most often found that these negative responses affect customer loyalty, satisfaction, and repurchase intention (Smith & Bolton, 2002; Grewal et al., 2008; Ha & Jang, 2009; Roschk & Gelbrich, 2013; Crisafulli & Singh, 2017). In turn, when a company deploys a recovery strategy to diminish the negative effects, these studies report widely varying effectiveness. One study reports the existence of a service recovery paradox (Smith & Bolton, 1998): customers that experience a failure and a subsequent satisfactory recovery are more satisfied than normal, non-failure customers. However, this effect is not found in any of the other studies reviewed for this report. Another study finds evidence that customers with different levels of loyalty react differently to failures and recovery strategies (Gelbrich et al., 2016). A detailed overview of the many differences we find is presented in chapter 2.

What many of these studies have in common is the analysis approach. The most commonly used techniques are surveys, the critical incident technique (CIT), or a meta-analysis of earlier literature. It is notable that in the literature many types of failures and recoveries are analysed simultaneously. In almost all cases this is done by having subjects answer questions about hypothetical situations (Grewal, Roggeveen & Tsiros, 2008; Ha & Jang, 2009; Roschk & Gelbrich, 2013; Gelbrich et al., 2016; Jeon & Kim, 2016). Other research examines previous experiences of test subjects (Smith & Bolton, 2002; Wang, Wu, Lin & Wang, 2011). The analysis is mostly done at an aggregated level, with little to no information about individual customers or segments, with the exception of some more qualitative findings (McMullan & Gilmore, 2008; Forbes et al., 2005).

consumers. These studies use relatively small samples, between 100 and 609 per study, and all of them ask consumers to give opinions about situations retrospectively. We attempt to contribute in this area with a field experiment with large test and control groups spread over different types of customers, who have all experienced the same service failure but received a different recovery strategy afterwards. We track the buying behaviour of these customers over an observation period of eight months, which gives insight into real-world effects through data recorded in the data warehouse of Wehkamp.

The aim of this report is to contribute to, and provide a real-world application of, the academic foundations laid in recent years. The previous paragraphs make clear that over the last ten years much has been done to conceptualize the effects of service failures and recovery strategies on a multitude of concepts. In this report we combine these academic insights. In cooperation with Wehkamp, a setting was developed in which real-world service failures were tracked, customers received different treatments, and information was stored starting from the 1st of September 2017.

1.1 Research motive

To examine the effects that service recovery has on the customer, a cooperation was established with Wehkamp, one of the top 5 e-tailers of the Netherlands ("Statista Market Forecast", 2018). This company agreed to share data. Up until this research the effect of the service recovery policy was unclear, although €6.5 million was spent on this measure in 2017. Because of these high costs, the question arose what works and for which customers. Management aims to reduce costs in these areas. Nevertheless, when the spending is appropriate and/or effective, nothing stands in the way of the continued use of service recovery. This report aims to clarify which customers react positively to the service recovery efforts of Wehkamp. With this clarification Wehkamp will be able to allocate its resources to appropriate efforts. Besides the managerial motivation for the research, some academic arguments also signal an urgency to deepen the knowledge on this subject.

on interviews and the self-reported behaviour of consumers, which always contains some level of interviewer or self-reporting bias. Addressing this is explicitly mentioned as a suggestion for future research by Grewal et al. (2008). The set-up of the different studies offers a chance for a new approach. When asking customers about their experience with service failures through the CIT, it is impossible to control the severity of the failure the customer recalls. So, some customers will recall a minor failure, while others recall one of the worst failures they ever experienced. For the classification of the types of failures this is not a problem; it is even desirable (Kelley et al., 1993; Forbes et al., 2005). But when the goal is to find the effect of a failure, control of the situation is needed. Compared to previous studies, the difference in this study is that it is now possible to observe real-world behaviour and to control the type and treatment of the failure.

In multiple studies distinctions are made between customers. This signals heterogeneity in the response of different customers to service failures and recovery strategies. In one case, McMullan and Gilmore (2008) interview customers across three loyalty levels. These low, medium, and high loyalty customers all respond differently to the question of what companies should do to increase the loyalty of their customers. According to McMullan and Gilmore, this difference can be explained by the fact that high and medium loyalty customers seek to further develop the relationship, which could be achieved when the company reciprocates. This reciprocation could take the form of a recovery strategy when a customer is confronted with a service failure. A later study by Jeon and Kim (2016) explores how high versus low intimacy customers experience failures. Intimacy in this case represents the closeness and mutual understanding between the customer and the company. It seems that highly intimate customers are more tolerant of certain failures (e.g. a broken product) than their low intimacy counterparts, but that for other failures (e.g. a broken promise) the experience of intimate customers is worse. In another study, Gelbrich et al. (2016) find a significant difference in the response to compensation between first-time and regular customers of a hotel: within different failure scenarios, the regular customers respond better to higher compensations than the new customers. Grewal et al. (2008) explicitly call for segmentation in future research, also pointing to the difference between frequent and infrequent customers.

differences found in what is the most appropriate recovery strategy for certain failures. One study shows that, when a failure occurs in the delivery of a service, an apology has the best effect as recovery strategy (Smith, Bolton & Wagner, 1999). In another study, however, this is classified as a failed service that should be re-performed to minimize the negative effects on satisfaction and loyalty (Roschk & Gelbrich, 2013). Additionally, findings differ on which customers react more positively to certain types of failures and recovery strategies. Ha and Jang (2009) describe how monetary recoveries, such as coupons and discounts, would work best for customers with low relationship quality (consisting of low loyalty, trust and satisfaction), while Gelbrich et al. (2016) conclude that monetary compensation works better for the best customers of a company. Finally, competing views exist about the timing of a recovery. One study shows that for one type of service (e.g. freight transport) delaying a resolution would give more positive results in terms of customer response. On the other hand, Crisafulli and Singh (2017) found that an immediate response to an online failure has a positive effect on customer attitude.

1.2 Research question

The aim of this report is captured in one main research question:

What is the effect of service failure and -recovery, for different customer segments, on the customer buying behaviour for an online retailer?

This report aims to contribute toward four different aspects of the academic service recovery literature.

1. The elimination of self-reporting and interviewer bias

2. Controlling the type of failure and the analysis of real-world buying behaviour

3. Insight into the contradictions that are found between different types of customers

Firstly, there is the question of what exactly is defined as service failures and recovery strategies, and especially how these influence buying behaviour. Within the existing literature, several aspects and concepts of the customer are influenced by service failures and recoveries. To identify relevant aspects and concepts for this study, the first sub question is:

Which types of service failure and recovery strategies, that influence online buying behaviour, can be identified?

Secondly, the heterogeneity among customers should not be overlooked, and more specifically which classifications can be made and along which attributes. The second sub question is therefore defined as:

What customer segments are identified where the effect of service failures (recoveries) is weakest (strongest)?

Within the service recovery literature many studies have been carried out (Kelley et al., 1993; Forbes et al., 2005; Toufaily et al., 2013; Van Vaerenbergh & Orsingher, 2016). This research attempts to address limitations of previous studies by analysing a larger volume of data for a single service failure situation and multiple recovery strategies for this failure. Another benefit is that in this field experiment the data consists of observed behaviour rather than self-reported data gained through surveys or interviews. Through this effort we expect to be able to explain the difference between pre- and post-failure customer buying behaviour, at least partially.
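The comparison of pre- and post-failure buying behaviour described above can be illustrated with a minimal sketch. It is not the analysis used in this report: the record layout, the group labels and all amounts are hypothetical and chosen only to show the idea of comparing the change in spending of a failure group with the change in a control group within one segment.

```python
from statistics import mean

# Hypothetical records: (segment, group, spend_before, spend_after) in euros.
# All values are made up for illustration; they are not Wehkamp data.
records = [
    ("new", "control", 100.0, 100.0),
    ("new", "failure_no_recovery", 100.0, 80.0),
    ("new", "failure_coupon", 100.0, 95.0),
    ("regular", "control", 200.0, 210.0),
    ("regular", "failure_no_recovery", 200.0, 190.0),
    ("regular", "failure_coupon", 200.0, 205.0),
]


def mean_change(rows, segment, group):
    """Average post-minus-pre spend for one segment and one group."""
    changes = [after - before
               for seg, grp, before, after in rows
               if seg == segment and grp == group]
    return mean(changes)


def effect_vs_control(rows, segment, group):
    """Change in the given group relative to the change in the control group."""
    return mean_change(rows, segment, group) - mean_change(rows, segment, "control")


for segment in ("new", "regular"):
    for group in ("failure_no_recovery", "failure_coupon"):
        print(segment, group, effect_vs_control(records, segment, group))
```

Subtracting the control group's change is one simple way to separate the effect of the failure from general trends in spending over the observation period; other designs are possible.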

2. Literature review

In this chapter we provide an overview of what has been studied in the service failure (and recovery) literature over the past three decades. The first section reviews two important studies with developments in the classification of failures and recoveries alike. The second section describes studies that have found effects of service failures and recovery strategies, especially on different aspects of customer attitudes and behaviour. The third and final section describes how different types of customers show different reactions to a service failure and recovery strategy.

2.1 History of service failures and recovery strategies

In the first chapter we touched upon the first types of service failures and different recovery strategies. In this section we start with the history of the service failure. Kelley et al. (1993) made a first attempt at classifying the different types of failures and recovery strategies that existed within the retail environment. Although the main classification was based on earlier research (Bitner, Booms & Tetreault, 1990), Kelley and his colleagues were the first to apply it to a retail situation. They concluded that these types held up sufficiently when applied to the specific situation of service failures and recovery strategies in a retail environment.

In order to understand the different types of service failures that exist, we go back to this first typology developed by Kelley et al. (1993). As mentioned in the introduction, three main groups can be identified when classifying service failures:

1. Response to service delivery system/product failure

2. Response to customer needs and requests

3. Unprompted and unsolicited actions

the online retail environment, these failures, matched to their main group, are described in table 2.1.

Categories | Description

1. Response to service delivery system/product failure
Slow/unavailable service | Cases where customers experienced delays in the delivery of products. In particular, products that did not arrive on time but eventually were delivered without any customer follow-up
System pricing | This category involves any issues where the pricing caused a customer to be mischarged. Often as a result of a website error. E.g. due to technical errors, the customer was double-charged
Packaging errors | When a packaging error occurred, it meant that the customer received only partial shipment, wrong items, or more items than initially ordered
Out of stock | A customer experiences an out-of-stock failure when they place an order to which the e-tailer notified them of the back-order or out of stock situation
Product defect | All cases here present incidents where the customer received an order that was broken, damaged, missing pieces, or not functioning
Bad information | In this case the customer is provided with poor, incorrect or even misleading information about a product or its features
Website system failure | This is one of the unique online service failures. The situation in which the customer had problems navigating the website or were unable to make a purchase due to technical reasons.

2. Response to customer needs and requests
Special order/request | All cases where the customer specified a certain customization and where this was done incorrectly by the company. E.g. a requested engraving that was incorrect
Customer error | Sometimes the mistake is caused by error of the customer. Whether the customer would admit to the error or not, it would have been clear that a mistake was made. E.g. giving the wrong shipping address
Size variation | Specific cases in where the failure was caused by the fact that the merchandise varied from the normal sizing. E.g. customer always has a size 44 sneaker shoe, but in this case the size 44 was too small

Table 2.1 Overview of the online retail failures based on Forbes et al. (2005)

study into a suitable framework for the online retail industry. To do this, one recovery strategy had to be dropped: manager intervention was not possible online. Another was dropped because it was absent from the incident reports. In addition, one new strategy was added specifically for online situations: replacing or returning a product at a brick and mortar store. Besides these changes, the recovery strategies remain the same. In accordance with Forbes et al. (2005), the company has eleven different strategies it can employ to react to service failures. Table 2.2 presents the different online service recovery strategies with a short description.

Recovery strategy | Description

Discount | In this case, the company offers the customer a discount on the product or service as a compensation for any problems or inconvenience experienced due to the failure
Correction | When a correction occurs, the company simply corrected its mistake. Examples are finding misplaced items or making a repair. In any case the retailer did nothing extra besides the correction of the initial error
Correction Plus | This recovery strategy is almost the same as “Correction”, except that here the company goes beyond the initial repair and offers additional compensation. This could be in the form of discounts, extra service or upgrades of the product
Replacement | Replacement of defective products
Apology | Offering of a sincere apology by the employee
Refund | Refund of the amount paid by the customer
Store credit | Often used as a replacement for the refund, where the customer receives a credit for the store in question. The key characteristic is that the credit is only spendable at the store of the original failure
Unsatisfactory correction | The last type of “Correction”, where the customer perceived the effort of the company as not enough to satisfactorily fix the problem. An example is repairs after a very long delay
Failure escalation | This covers all the cases where the recovery made matters worse from a customer’s perspective. In some cases, customers were blamed for the failure, or repairs were done incorrectly
Replace at brick and mortar store | Situations where customers purchased a product online, but were able to return it to one of the traditional brick and mortar stores
Nothing | In this case the company made no attempt at all to recover from a failure that was experienced by the customer. In some cases, the retailer was unaware of any failure. But in most cases a decision was made to do nothing.

One interesting finding of the study by Forbes et al. (2005) is that, regardless of the recovery strategy, customers are less likely to repurchase if they have experienced a failure. This is a first indication that the effects of the failure can be severe in terms of repurchase intention. Repurchase intention is, however, just one of the customer reactions that is affected. To determine which customer reactions are affected by service failures and recovery strategies, the next section provides an overview of this matter.

2.2 Exploring the effects of service failures and recoveries

We differentiate between two different types of studies. First the effects of studies taking a customer perspective are highlighted. Here we examine the effects of service failures and recovery strategies on three key variables. These studies consist of retrospective types of research, usually through surveys and experiments. Following this, we further explore effects of different types of failures and recoveries, considering differences between customers. We examine the effects for a specific recovery strategy, the compensation.

2.2.1 Customer perspective

In the last twenty years the most common approach used in the service recovery literature stems from a customer perspective. The researchers have employed surveys and experiments to gather as much information about the attitudes and intentions of customers as they could. These studies have found that service failures have negative effects on loyalty, repurchase intention, and customer satisfaction in many different situations (Forbes et al., 2005; Grewal et al., 2008; McMullan & Gilmore, 2008; Wang et al., 2011; Roschk & Gelbrich, 2013; Jeon & Kim, 2016; Crisafulli & Singh, 2017). Some of these studies found highly significant effects (Wang et al., 2011; Roschk & Gelbrich, 2013; Jeon & Kim, 2016), while others only found partially significant results (Crisafulli & Singh, 2017; Grewal et al., 2008).

this would serve as a signal to highlight that there are still many uncertainties regarding the service recovery literature.

To illustrate the differences in significance and effects we provide examples. In one experiment, the scenario describes a failure to complete the order process, with a 30% discount on a next purchase as recovery (Crisafulli & Singh, 2017). In other research a situation is described where the customer is overcharged, combined with delayed or immediate compensation (Roschk & Gelbrich, 2013). The variation in effect size could partly be explained by the fact that some experiments use up to 200% compensation of the suffered loss (Gelbrich et al., 2016), while others only grant an exchange of damaged goods or a refund (Roschk & Gelbrich, 2013). In the most minimal case, only a $10 voucher to buy a snack is offered as recovery for a cancelled flight (Grewal et al., 2008). We conclude that many situations of failure can be imagined, along with different possibilities of recovery options. Yet the wide variety of recoveries and the hypothetical nature of these studies do not permit any generalised conclusion about the size of the effect. The only conclusion we can draw is that the effects of a recovery will be positive.

Even more problematic is that all these studies relied on retrospective appreciation of situations to gather their data, which was marked as problematic by Van Vaerenbergh and Orsingher (2016). These researchers identified more than 500 relevant studies about service recovery, of which only three used observed customer behaviour. When analysing all these studies, they also found strong evidence that the most important outcome of the process of service failure and recovery is customer satisfaction. When examining the studies that are used as foundation for this report, we draw a similar conclusion. Table 2.3 gives an overview of the most used concepts of customer reaction. Here we find that customer satisfaction is used most of all, closely followed by repurchase intention.

As much as the treatments used in studies differ over the years, so similar are the outcomes affected by the service failure and recovery strategies. As stated before, the most recurring themes that seem to be affected, can be

Construct | Freq. | Studies

(E-)Loyalty | 5 | Forbes et al. (2005), McMullan & Gilmore (2008), Wang et al. (2011), Roschk & Gelbrich (2013), Toufaily et al. (2013)
Repurchase intention | 8 | Grewal et al. (2008), Ha & Jang (2009), Toufaily et al. (2013), Zhou et al. (2014), Sarkar Sengupta et al. (2015), Jeon & Kim (2016), Crisafulli & Singh (2017), Hazée et al. (2017)
Customer satisfaction | 9 | Smith & Bolton (2002), Forbes et al. (2005), Roschk & Gelbrich (2013), Toufaily et al. (2013), Sarkar Sengupta et al. (2015), Gelbrich et al. (2016), Jeon & Kim (2016), Crisafulli & Singh (2017), Hazée et al. (2017)

Table 2.3 Overview of occurrence of constructs within 13 previous studies (frequencies add up to more because multiple constructs can be present within one study)

We examine all the findings of these studies and conclude that service failures and recovery strategies have a well-established effect on the customer reaction towards the company. This holds for the attitude towards a company, measured by satisfaction and certain stages of loyalty, and for the behavioural intentions towards a company, measured by repurchase intention and certain stages of loyalty. We do see that the antecedents for any of the three themes are the same. The situation starts with a service failure, which negatively impacts the loyalty, satisfaction, or repurchase intention of the customer. Because this effect is undesirable for companies, different recovery strategies are used to combat it. The failures can be divided into many groups, though all impact the customer reactions negatively. The underlying cause of the negative effect is that the online retailer does not meet the expectation of the customer. The many ways in which these expectations can be violated are described by Forbes et al. (2005). All the studies we have examined since 1993 underpin this negative effect of service failures on the customer reaction. This leads to the first hypothesis:

After the customer experiences a failure, the company is motivated to apply a recovery strategy to minimize the negative effects on the customer reaction. As with the types of failure, the different types of recovery are numerous, all depending on the specific situation of a customer. The different situations, specifically for online retail, have been identified by Forbes et al. (2005). The common ground of all these recovery strategies, except the one where a company does nothing, is that they influence the customer reaction positively when applied after the occurrence of a service failure. The case where a company does nothing is not counted as an actual recovery strategy because of its passive nature compared to the 10 other strategies. The remaining 10 recovery strategies all illustrate different ways in which a company could act to recover what failed in the first instance. Other studies pick one or more of the possible recovery strategies and present them to respondents in experiments and surveys, leading to a measurement of the effect of the recovery strategy on customer reactions. In all cases this effect is found to be positive: all these strategies were experienced by customers as at least satisfactory. This leads to the second hypothesis, on the effect of the recovery strategy:

H2: A recovery strategy has a positive effect on the customer buying behaviour

2.2.2 (E-)Loyalty

In recent years Toufaily et al. (2013) described in their meta-analysis what loyalty is with respect to a commercial website. After the careful examination of forty-four studies, the following definition was created:

“The customer's willingness to maintain a stable relationship in the future and to engage in a repeat behaviour of visits and/or purchases of online products/service, using the company's website as the first choice among alternatives, supported by favourable beliefs and positive emotions toward the online company, despite situational influences and marketing efforts that lead to transfer behaviour” (Toufaily et al., 2013)

intention) and customer satisfaction are entangled. Perhaps loyalty is the overarching construct that combines repurchase intention with satisfaction. One example of the entanglement can be found in the study of Wang et al. (2011), where loyalty is described as the behavioural intention to repurchase from a specific online retailer. Another is found in the research of Forbes et al. (2005), who use the propensity to switch and satisfaction to measure the effects of service failures and recovery strategies, and call the former of the two loyalty.

2.2.3 Repurchase intention

When we look at some of the more recent studies that use repurchase intention, we see similarities in how this concept is defined. One study clearly formulates the concept as: “The repurchase intention is the customer’s intention to re-purchase the goods or service from the same company, based on their current status and circumstances” (Jeon & Kim, 2016). Repurchase intention is influenced positively by recovery efforts throughout all the studies, though with varying effectiveness. Crisafulli & Singh (2017) investigated this positive relationship more recently and found that the effect is still consistent with previous research, though they give no reasoning beyond this point. Wang et al. (2011) report a more interesting finding: compensating customers of an online retailer does not increase their post-failure repurchase intention. However, Grewal et al. (2008) find several situations in which compensation does in fact lead to enhanced repurchase intentions. The main difference that Grewal et al. identify lies in the cause of the failure. When the failure is caused by the company itself, compensation tends to work. If the failure is beyond the control of the company, an explanation alone is enough to achieve the effect on repurchase intention.

2.2.4 Customer satisfaction

One of the biggest distinctions made for customer satisfaction is between satisfaction with the service recovery and overall satisfaction. Some studies use both (Smith & Bolton, 1998, 2002). Others use only satisfaction with the recovery (Roschk & Gelbrich, 2013; Gelbrich et al., 2016), or only overall satisfaction with the company (Sarkar

individual customer characteristics is emotional response (Smith & Bolton, 2002). This study found that individual customer characteristics partly explain how customers' different reactions to recoveries are formed. A customer with a negative emotional response to a service failure places more value on the recovery attributes (such as discounts and vouchers) than on the interaction (the manner of communication and transparency), compared to a customer with no negative emotional response. This has consequences for the effectiveness of a recovery and thus, ultimately, for its effect on satisfaction.

2.2.5 Types of failures and recoveries

In this section we further discuss how the effects differ between types of failures. Moreover, we investigate two frequently used recovery strategies and their effects. Studies have shown that there are different levels of failure. One way to classify failures is by their severity (Wang et al. 2011). Severity is mentioned as an important factor in determining the negative effect experienced by a customer in the context of e-tailing: customers who perceive a failure as more severe tend to show lower loyalty levels afterwards than customers who perceive a failure as less severe. In another study, failures are generalized into two categories, outcome failure and process failure (Jeon & Kim, 2016). An outcome failure represents problems with the product (e.g. delivery of a wrong or broken product). A process failure represents more intangible problems (e.g. breaking a promise). The researchers show that high-intimacy customers have higher satisfaction scores than low-intimacy customers in cases of outcome failure. However, when a process failure occurs, the difference between high- and low-intimacy customers disappears, which indicates a stronger decrease in satisfaction for the high-intimacy customers. This leads to the conclusion that customers with high intimacy towards a company display more tolerance for outcome failures.

The next study examines customer behaviour after a service recovery (Evanschitzky, Brock & Blut, 2011). They tracked customers that reported a complaint and analysed their subsequent post-recovery behaviour. While tracking their purchase behaviour it was found that a satisfactory recovery leads to higher purchase volume. Though, for more loyal

Another study differentiates customers based on their relationship quality. The distinction is made between high and low relationship quality. This study shows that customers with low relationship quality reacted better to receiving a discount or coupon than to an apology or prompt addressing of the problem. For customers with high relationship quality, addressing the problem promptly is the best solution. These findings are grounds for two hypotheses, on the effect of the failure and of the recovery.

H3: The service failure effect is weaker (stronger) for high (low) loyalty customers

H4: The service recovery effect is weaker (stronger) for high (low) loyalty customers

2.2.6 Compensation

In this section we develop a further understanding of how compensation is applied when a customer experiences a service failure. We divide compensation into two categories: financial and psychological. Financial compensation takes the form of a discount, coupon or product replacement. Psychological compensation takes the form of an apology or other response from the company or employee (Gelbrich & Roschk, 2011). This study finds that, of the two types, monetary compensation performs better in restoring satisfaction. One exception is that non-monetary compensation restores satisfaction more strongly in interactional situations. This leads to the conclusion that in most cases monetary compensation works best, except when the customer is treated unfairly or indecently.

Another comparable study investigates the effects of outcome and process failures on customer satisfaction (Smith & Bolton, 1999). The recovery strategies of an apology and monetary compensation are included in the model. The authors conclude from their results that customers are more satisfied with compensation after experiencing an outcome failure, but that the apology works better when customers experience a process failure. In a later study they investigate negative emotional responses that might occur with a customer. When such a response occurs, the apology and communication with the customer are appreciated. Otherwise, in cases of low or no negative emotions, the customer focusses more on the outcome of the recovery (e.g. a coupon or discount) (Smith & Bolton, 2002).

we learn from these qualitative results is that compensation could lead to higher satisfaction scores. For the propensity to switch (i.e. the reverse of loyalty), however, customers show similar results for compensation and an apology. Based on the findings presented in this paragraph, we expect that monetary compensation will have a stronger recovery effect than the apology. This expectation leads to the final hypothesis:

H5: The recovery effect for a financial compensation is stronger than for an apology

2.3 Conceptual model

The literature gives us some direction for the effects we expect to find in this report. These expectations, in the form of the hypotheses formulated, are captured in figure 2.1. Notice that H5 is not graphically depicted in this figure, as both types of compensation fall within the over-arching term of recovery strategy. In the next chapter we continue with the methodology by which the proposed hypotheses are tested.

3. Methodology

In this chapter we present an overview of the methodology that is used for this study. The research design is the starting point and is described in the first section. The second section investigates the sample distribution of customers included in the dataset. In the third section the manipulation of the independent variables is explained. The fourth section reviews some descriptive statistics of the data. Finally, in the fifth section we specify and validate the proposed models, while also dealing with outliers.

3.1 Research design

This research analyses different groups of customers to investigate the effect of a service failure (delivery outside the time frame) on buying behaviour. Buying behaviour is measured with three variables: number of orders, total amount spent, and number of products bought. Each is measured by comparing buying behaviour in a four-month period before the failure with a four-month period after it. The delta between these two periods is taken as the dependent variable for this analysis. Additionally, the percentage change between the periods is calculated. This creates a total of six dependent variables. The four-month period is chosen based on the maximum repurchase period found in the test group data: if a customer makes a repurchase, this happens within 120 days in almost all cases.
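The construction of the delta and percentage-change variables can be sketched as follows. This is a minimal illustration in Python, not the procedure used in the study: the function names, the example orders, the 120-day window length, and the choice to count an order placed on the reference date in the "before" window are all assumptions.

```python
from datetime import date, timedelta

# Assumption: the four-month observation window is taken as 120 days.
WINDOW = timedelta(days=120)


def window_total(orders, start, end):
    """Sum the amounts of all orders with start < order date <= end."""
    return sum(amount for day, amount in orders if start < day <= end)


def delta_and_pct_change(orders, reference_date):
    """Return (delta, percentage change) of spending around reference_date.

    orders: list of (order_date, amount) tuples for one customer.
    Assumption: an order on reference_date itself counts as 'before'.
    """
    before = window_total(orders, reference_date - WINDOW, reference_date)
    after = window_total(orders, reference_date, reference_date + WINDOW)
    delta = after - before
    # The percentage change is undefined when nothing was spent beforehand.
    pct = None if before == 0 else 100.0 * delta / before
    return delta, pct


# Hypothetical customer: two orders before the failure, one after.
orders = [
    (date(2017, 10, 1), 80.0),
    (date(2017, 12, 15), 40.0),
    (date(2018, 2, 1), 60.0),
]
delta, pct = delta_and_pct_change(orders, reference_date=date(2018, 1, 10))
print(delta, pct)  # -60.0 -50.0
```

The same function applies to the number of orders or products by passing counts instead of amounts.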

did not experience the failure at all. A detailed explanation of the sample is provided in the next section; first we discuss the manner of analysis.

The analysis consists of two parts. First, the main effects of the service failure and the recovery strategies are tested for all customers with a multiple regression analysis. The interaction effect for the recovery strategies is included in this model as well, because a recovery can only take place after a failure has happened: a recovery never occurs independently of a failure. Hypotheses 1 - 3 are tested through this model.

The second part investigates the moderation effects proposed in hypothesis 4. Here the multiple regression model is expanded to include the interaction effects of the service failure combined with different customer segments. This second model also includes the main effects and will test the entire model proposed in the conceptual model.

3.2 Sample

3.2.1 Sample selection

The dataset we analyse in this study is built from two separate datasets. The first dataset includes the test group and contains all customers that have experienced the service failure. It contains 54.371 observations of unique customers, divided over three important sub-groups: the control group, the apology group, and the coupon group. Data was collected from the 1st of September 2017 until the 31st of May 2018. The second dataset is a random sample of ‘normal’ customers, called the baseline group. From a population of 2,6 million customers, 43.365 were randomly sampled. To find the most similar customers possible we used a method that links a baseline customer to each customer in the test group. The criteria for the linkage are the same order date and the same customer segment. This method worked in all cases, except within one segment: the “new customer” segment was about 60% smaller in the baseline group than in the test group.
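The linkage on order date and customer segment can be illustrated with a short sketch. It is a hypothetical reconstruction, not the procedure used in the study: the record layout, the identifiers, and the choice to use each baseline customer at most once are assumptions.

```python
from collections import defaultdict


def link_baseline(test_customers, baseline_customers):
    """Pair each test customer with an unused baseline customer that has the
    same order date and segment. Returns (pairs, unmatched_test_ids).

    Assumption: matching is done without replacement.
    """
    # Group the available baseline ids by (order_date, segment).
    pool = defaultdict(list)
    for cust in baseline_customers:
        pool[(cust["order_date"], cust["segment"])].append(cust["id"])

    pairs, unmatched = [], []
    for cust in test_customers:
        candidates = pool.get((cust["order_date"], cust["segment"]), [])
        if candidates:
            pairs.append((cust["id"], candidates.pop()))
        else:
            unmatched.append(cust["id"])
    return pairs, unmatched


# Hypothetical records: two new test customers compete for one baseline match.
test_group = [
    {"id": "t1", "order_date": "2017-09-01", "segment": "New"},
    {"id": "t2", "order_date": "2017-09-01", "segment": "New"},
    {"id": "t3", "order_date": "2017-09-02", "segment": "Dormant"},
]
baseline_group = [
    {"id": "b1", "order_date": "2017-09-01", "segment": "New"},
    {"id": "b2", "order_date": "2017-09-02", "segment": "Dormant"},
]
pairs, unmatched = link_baseline(test_group, baseline_group)
print(pairs)      # [('t1', 'b1'), ('t3', 'b2')]
print(unmatched)  # ['t2']
```

The unmatched customer in the example mirrors the shortfall described for the “new customer” segment, where fewer baseline customers than test customers were available.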

3.2.2 Selection of dependent variables

post-observation period. We use the delta of the pre- versus the post-observation period as the dependent variables in our analysis.

3.2.3 Sample distributions

To ensure that effects are accurately estimated in our analysis we check whether the customer segments are evenly distributed across the different treatment groups. Upon investigating the distributions, we find some reason for concern. In table 3.1 we see the distributions of all customer segments across the baseline and test groups. One distribution stands out as being particularly uneven: the new customer segment is highly over-represented in the service recovery groups, where it represents 55,3% and 52,8% of the total group. In the no treatment group, it represents only 5,6% of the total group. Because new customers show a lower repeat purchase frequency, a lower average order value and higher churn, this could lead to biased results when estimating a model for the recovery effects. As the new customer segment is a very interesting group in terms of future potential, we estimate unit-by-unit models for all segments. By using this method, we can include this important segment in the analysis.

Table 3.1 The distribution of segments across baseline and test groups

Segments      Baseline          No treatment      Apology           Coupon            Subtotal
              (N=43365)         (N=20743)         (N=4938)          (N=28690)         (N=54371)
              Freq.    Perc.    Freq.    Perc.    Freq.    Perc.    Freq.    Perc.    Freq.    Perc.
New            8040    18,5%     1159     5,6%     2729    55,3%    15158    52,8%    19046    35,0%
Occasional     4422    10,2%     2373    11,4%      319     6,5%     1730     6,0%     4422     8,1%
BFO           13069    30,1%     7075    34,1%      794    16,1%     5200    18,1%    13069    24,0%
CB             3959     9,1%     1930     9,3%      260     5,3%     1769     6,2%     3959     7,3%
FB             3244     7,5%     1689     8,1%      220     4,5%     1335     4,7%     3244     6,0%
BH             1765     4,1%      798     3,8%      119     2,4%      848     3,0%     1765     3,2%
Disengaged     4785    11,0%     3055    14,7%      271     5,5%     1459     5,1%     4785     8,8%
Dormant        4081     9,4%     2664    12,8%      226     4,6%     1191     4,2%     4081     7,5%
Total                 100,0%            100,0%            100,0%            100,0%            100,0%

3.3 Manipulation of the independent variables

Each customer in the dataset was dummy coded with a segment identifier. As mentioned before, there are five main segments, which are discussed in detail in the following section. Besides belonging to a customer segment, each customer is in one of four groups: baseline, no treatment, apology, or coupon.

3.3.1 Customer segments

The customer segments at Wehkamp are divided into five main categories. We refer to figure 3.1 for a graphical overview of the segments and their (potential) value to the firm. The categorising of the customer to the segment takes place on the date of their failure order. Customers can transition through these segments; we take the date of the failure as the reference point for the assignment of segments.

We start the explanation of the segments with the new customers. These so-called “newbies” have mostly placed only one order. The period that follows is crucial for the further development of these customers: data shows that 46% of customers place only one order, and if a new customer does not place a second order within 4 to 6 months, the chances that this customer will do so are virtually zero. If a new customer does place a second order, the status is updated to occasional customer. Occasional customers have typically placed two orders within an eighteen-month period. When an occasional customer places three or more orders within this eighteen-month period, another upgrade takes place and the customer becomes an engaged customer.

Figure 3.1 The customer lifecycle at Wehkamp

Engaged customers are the most active customers and on average, they place at least three orders per eighteen months. Since these customers are more active than other customers, more data on their behaviour is obtained through their buying and browsing activity. Here the segment is further divided into four subcategories. These are based on several characteristics, which we will discuss per category.

customers typically buy across the women, men, and kids product categories, from which we derive that these customers have a partner, a child, or both.

The second subcategory is unique to Wehkamp. This segment is called “Credit Buyers” (CB). These customers have a credit account with Wehkamp, which allows them to pay for products in instalments. Another characteristic is that these customers buy many products in the electronics category. Products like laptops and TVs are often purchased by these customers.

The third subcategory is called “Functional Buyer” (FB). These customers are characterised by predominantly purchasing products from the electronics and home & garden categories. Another typical FB characteristic is the low number of product views per session.

The fourth and final subcategory is the “Bargain Hunter” (BH). These customers are characterised by a high share of discounted “sale” products in their purchases, even though they buy these products outside the promotional “sale” periods. To illustrate this, figure 3.2 plots the share of the segment over time against the sale peaks of the total customer base.

Figure 3.2 Bargain hunters buying behaviour over time

3.3.2 Baseline and test groups

Besides the segments, the customers are in one of four groups. These are: the baseline, no treatment, apology or coupon group. The baseline group consists of customers that have not experienced the service failure. The difference between the customers that do experience the failure is whether they also receive the recovery attempt. One group receives no treatment. Another group receives an apology email and the third group also receives a coupon with the apology email.

3.4 Descriptive statistics

3.4.1 Independent variables & control variables

Table 3.2 shows the independent variables and their respective descriptive statistics. All dummy coded variables that act as independent variables are described per level, with frequency and relative frequency. The operationalization is also briefly described. The independent variables contain no missing values. The control variables contain a total of 89 missing values for the age class.

Table 3.2 Descriptive statistics of the independent variables

3.4.2 Dependent variables

calculated by taking the total amount spent at Wehkamp by a customer in the four-month period before the service failure and subtracting this figure from the total amount spent in the four-month period after the service failure. For the baseline group, the order date through which a customer is linked to the test group is taken as the mid-point of the observation period. The variables delta number of products and delta number of orders are calculated similarly: the total number of products or orders is taken for both periods and the before figure is subtracted from the after figure. There are no missing values for the dependent variables.

Table 3.3 Descriptive statistics for the dependent variables

We notice that for all customers a general decline in their buying behaviour is observed. This could be explained by the fact that, over time, customers have a chance to churn or shop less frequently.

3.5 Model specification and validation

This section starts with an evaluation of the dependent variables. We check the correlation between the dependent variables and create boxplots to identify possible outliers. Following this we specify the models using formulas that hold the regression equation. Lastly, we assess if the assumptions of the linear regression hold for each model.

3.5.1 Dependent variables: correlation

In figure 3.3 we see the correlation matrix for the three dependent variables. The delta number of products has the strongest correlation with the delta amount spent (correlation = 0,74), closely followed by its correlation with the delta amount of orders (correlation = 0,72). The remaining correlation, between delta amount spent and delta amount of orders, is still moderate (correlation = 0,63). High correlations are expected between these variables since they partially depend on each other.

N=97736

Segments
New            -54,85   -0,523   -0,713
Occasional     -42,27   -0,423   -0,805
BFO            -91,45   -0,507   -1,572
CB            -150,90   -0,568   -1,394
FB             -94,20   -0,630   -1,402
BH             -60,23   -0,601   -1,359
Disengaged     -29,43   -0,248   -0,401
Dormant        -11,03   -0,146   -0,195
Total          -67,93   -0,465   -1,010

Figure 3.2 Correlation matrix of the dependent variables

3.5.2 Distribution of the dependent variables and outliers

In this section we investigate and evaluate the distribution of the dependent variables. We also discuss the treatment of possible outliers. The dependent variables we discuss are: delta amount spent, delta number of products, and delta number of orders. The boxplots for these variables are found in figure 3.2. For all three dependent variables the most extreme values are well outside the interquartile range of the boxplot, both positively and negatively. Judging by the boxplots, the delta number of orders is least affected by outliers. The cut-off values and the treatment of the outliers are discussed in the next paragraph.

Figure 3.2 Boxplots of the three dependent variables

3.5.3 Dealing with outliers

We have found that there are some potential hazards in the estimation of our models with the current data. The outliers have very extreme values and could strongly influence the parameter estimates. Some of these values are more than 170 times higher than the mean. These cases are likely to represent actual customer situations; however, such extreme cases are not the main subject of this research, which tries to find meaningful differences across generalised customer segments. We therefore chose to delete these extreme cases to prevent the model estimates from being biased toward them. To determine a suitable cut-off value we use a property of the normal distribution: 99,7% of the values drawn lie within three standard deviations of the mean. We take a distance of three standard deviations from the mean as the cut-off value for our dependent variables and apply this to the dataset. Cleaning these outliers resulted in the deletion of 3624 rows of data. We assess the validation of the models with and without outliers further in this chapter. New boxplots, shown in figure 3.3, show some improvement in the distribution of the dependent variables.
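The three-standard-deviation rule can be illustrated with a short sketch. It is a minimal example under stated assumptions rather than the cleaning script used for this study: the column name, the example values, the use of the sample standard deviation, and the rule that a row is dropped when any listed variable falls outside its bounds are all assumptions.

```python
from statistics import mean, stdev


def three_sigma_bounds(values):
    """Return (lower, upper) cut-offs at three standard deviations from the mean."""
    m = mean(values)
    s = stdev(values)  # assumption: sample standard deviation
    return m - 3 * s, m + 3 * s


def drop_outlier_rows(rows, columns):
    """Keep only rows whose value lies within the bounds for every column."""
    bounds = {c: three_sigma_bounds([r[c] for r in rows]) for c in columns}
    return [
        r for r in rows
        if all(bounds[c][0] <= r[c] <= bounds[c][1] for c in columns)
    ]


# Hypothetical data: 21 ordinary deltas and one extreme value.
rows = [{"delta_spent": float(v)} for v in [-10, 0, 10] * 7]
rows.append({"delta_spent": 5000.0})

kept = drop_outlier_rows(rows, ["delta_spent"])
print(len(rows), len(kept))  # 22 21
```

Note that the bounds are computed on data that still contain the outliers, so very extreme values widen the interval they are tested against.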

3.5.4 Specification of the model

Our model examines the effect of the service failure and the two recovery strategies. The data is analysed by means of a multiple linear regression model, which fits the linear equation that best describes our data. The fit is expressed through parameters called betas: each independent variable is assigned its own beta, chosen such that the predicted values of the equation are as close to the observed values as possible. The following equation defines the main effects:

$$y_i = \alpha_i + \beta_{i,1}x_{i,1} + \beta_{i,2}x_{i,2} + \beta_{i,3}x_{i,3} + \beta_{i,4}x_{i,4} + \beta_{i,5}x_{i,5} + \varepsilon_i$$

Where:

$y_i$ indicates one of the continuous dependent variables to be predicted (delta amount spent, delta number of products bought, or delta number of orders placed) for each segment $i$.

$\alpha_i$ is the intercept for segment $i$, and gives the value of $y_i$ when all the independent variables in the model are set at 0.

$x_{i,1}$ is a binary variable that indicates for each customer in segment $i$ whether they experienced a service failure.

$x_{i,2}$ is a binary variable that indicates for each customer in segment $i$ whether they received an apology following the failure.

$x_{i,3}$ is a binary variable that indicates for each customer in segment $i$ whether they received a coupon following the failure.

$x_{i,4}$ is a control variable that indicates to which age group a customer in segment $i$ belongs.

$x_{i,5}$ is a control variable that indicates how long a customer of segment $i$ has been a customer of Wehkamp, measured in months.

$\varepsilon_i$ expresses the error term, called the residual, that represents the difference between the predicted value $\hat{y}_i$ and the observed value $y_i$ for each customer in segment $i$.

3.5.5 Model validation

Before we examine the results of our model we discuss the assumptions that must hold for a multiple linear regression. First, for each of the main effect models (with and without outliers) the mean of the residuals must be zero, which is the case for all models. The second assumption is that the residuals do not correlate with the independent variables. This is checked through a correlation test. Next, we check the equal variance of the residuals for all dependent variables: the residuals should show a random pattern throughout the plot. The variance is plotted for each segment and corresponding dependent variable and included in appendix 7.1. Following this we check whether the residuals are normally distributed. The plots for each model can be found in the appendix. We summarize the results of these checks in table 3.4.

Table 3.4 Model validation for each dependent variable

Model validation

              Heteroske-  Res. non-   VIF         Heteroske-  Res. non-   VIF         Heteroske-  Res. non-   VIF
              dasticity?  normality?  score       dasticity?  normality?  score       dasticity?  normality?  score
New           No          Yes         Acceptable  No          Yes         Acceptable  Yes         Yes         Acceptable
Occasional    Yes         Yes         Good        Yes         Yes         Good        Yes         Yes         Good
Engaged       No          Acceptable  Good        No          Acceptable  Good        No          Acceptable  Good
Disengaged    No          Yes         Good        Yes         Yes         Good        No          Acceptable  Good
Dormant       Yes         Yes         Good        Yes         Yes         Good        Yes         Yes         Good

All models have trouble with the normal distribution of the residuals. For only three of the fifteen models the distribution is minimally acceptable. We already treated the models by removing outliers; at this stage no further action is taken to improve the distribution. The negative consequences for the t-values of the estimates are acknowledged, but we accept the non-normality in this study. The variance of the residuals showed patterns in many of the models, especially those estimated with fewer observations. The models with more observations showed more homoscedastic residual variance, and are therefore more accurate than the models with fewer observations. Finally, we check for the presence of multicollinearity by checking the VIF scores for each model. None of the models has variables with intolerably high VIF scores, so no problems with multicollinearity are present within the models.

4. Results

In this chapter we present the results for the models that are estimated. We start with the model fit and significance for all models we include in the analysis. This model is estimated three times per segment, with the different dependent variables that represent the buying behaviour of customers.

4.1 Model results

The expected main effects of experiencing a service failure, receiving an apology, and receiving a coupon are estimated in the models, for each segment separately. These models test the effects of the different treatments with respect to the baseline group of customers. Tables 4.1, 4.2, and 4.3 present the estimation and interpretation metrics of the models.

4.1.1 Model fit and significance

After inspecting the differences in fit between the models with and without outliers, we conclude that the models that exclude the outliers outperform the models that include them. The adjusted R-squared, which is used to compare models, increases for all models. Moreover, the standard error of the residuals is lower for each of the models. These metrics indicate that the removal of outliers resulted in more accurate models. For completeness, the estimates of the models that include the outliers are added in appendix 7.2. We then investigate whether the overall models are significant and interpretation is justified. All the models are significant at the P<0,001 level. The overall explanatory power of a regression model is measured with the R-squared. We find a minimum of 0,0012 (delta amount spent, engaged customers) and a maximum of 0,1816 (delta amount of orders, dormant customers). These values can be translated to percentages that measure the amount of variance explained by the models: our models explain between 0,1% and 18,16% of the total variance found in the data. Given that we do not include any further explanatory variables of buying behaviour, we did not expect high R-squared values.
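As a worked illustration of how the adjusted R-squared relates to the R-squared, the sketch below applies the standard adjustment formula to the values reported for the new customer segment in table 4.1 (R2 = 0,0517, 26561 observations, 9 predictors). The function name is hypothetical, and the example is a check on the reported figures rather than part of the original analysis.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Standard adjustment of R-squared for the number of predictors."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Values reported in table 4.1 for new customers (delta amount spent).
r2_new = 0.0517
adj = adjusted_r2(r2_new, n_obs=26561, n_predictors=9)

print(round(adj, 4))           # 0.0514
print(round(100 * r2_new, 2))  # 5.17 (percent of variance explained)
```

With this many observations and only nine predictors the adjustment is very small, so the R-squared and adjusted R-squared are nearly identical in these models.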

4.1.2 Effects on delta amount spent

segment this could be explained by the fact that a substantial number of customers only order once at Wehkamp. The decrease for the engaged customer segment could be due to a self-selection mechanism: the probability of being included in the test group increases when customers buy more, which in turn could lead to the most active customers being in the test group.

The second variable is service failure. All customers that experienced the failure, regardless of their treatment, are represented through this variable. In three of the five cases we find a significant (P<0,01) negative effect on buying behaviour, which is in line with H1, which predicts that a service failure has a negative effect on buying behaviour. For the new segment, however, we find that the service failure has a positive effect, which is the opposite of the expected effect, both for H1 and especially for H3. H3 leads us to expect a weaker effect for customers high in loyalty, not an increase in buying behaviour among the customers assumed to be low in loyalty. The assumption that the segments correctly proxy loyalty levels is challenged by this finding. Further evidence for H3 is not found, since the estimate for the loyal, engaged segment is insignificant and cannot be compared.

The recovery strategy of the apology is the third variable. The only significant effect found (P<0,01) is positive, for new customers. This is in line with H2, which predicts that the service recovery has a positive effect on buying behaviour. The other estimates show no significant effect, indicating that the apology did not make a difference for the buying behaviour of these customers.

The effect of the second recovery strategy, the coupon, is estimated in the fourth variable. For three of the five models the coupon has a significant (P<0,01 & P<0,05) positive effect. We compare the new and engaged segments by plotting the effects of the coupon for both situations in figure 4.1. These plots show a difference in slopes, in support of H4, which states that the service recovery has a weaker effect for higher loyalty customers.
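To make the interpretation of the dummy coefficients concrete, the sketch below adds up the estimates reported in table 4.1 for the new customer segment. It is an illustration only and rests on assumptions: the customer is in the reference age class with relationship length set to zero, and the coupon group is coded with the coupon dummy alone (the text does not state whether the apology dummy is also set for coupon recipients). The function and variable names are hypothetical.

```python
# Coefficients for new customers, delta amount spent (table 4.1).
COEF_NEW = {
    "intercept": -138.30,
    "service_failure": 63.65,
    "apology": 12.53,
    "coupon": 19.95,
}


def predicted_delta(coef, failure=0, apology=0, coupon=0):
    """Predicted delta amount spent for the reference age class and
    relationship length zero (both assumptions of this sketch)."""
    if (apology or coupon) and not failure:
        # A recovery can only take place after a failure has happened.
        raise ValueError("a recovery requires a service failure")
    total = (
        coef["intercept"]
        + coef["service_failure"] * failure
        + coef["apology"] * apology
        + coef["coupon"] * coupon
    )
    return round(total, 2)


print(predicted_delta(COEF_NEW))                        # -138.3  (baseline)
print(predicted_delta(COEF_NEW, failure=1))             # -74.65  (no treatment)
print(predicted_delta(COEF_NEW, failure=1, apology=1))  # -62.12  (apology)
print(predicted_delta(COEF_NEW, failure=1, coupon=1))   # -54.7   (coupon)
```

Read this way, each recovery coefficient is the additional change relative to customers who experienced the failure but received no treatment.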

4.1.3 Control variables

The control variable age class is referenced to the age group of customers below 25 years of age. The negative effects for age class indicate that spending decreases as customers get older, which does not follow the typical logic that customers have more to spend as they get older. These estimates could be biased, in the sense that they measure not only the effect of age but also effects otherwise captured in the error term. Relationship length shows logical results, in that a longer relationship predicts an increase in spending. Significant results for this are found for the new, occasional, and engaged segments (P<0,01).

Figure 4.1 Coupon effect plotted for new and engaged segments

Table 4.1 Estimation of the model for dependent variable delta amount spent

                      New customers  Occasional   Engaged      Disengaged   Dormant
Intercept             -138,30***     0,59         -96,48***    -5,83        12,54**
Service failure       63,65***       -56,80***    -0,22        -51,53***    -69,14***
Apology               12,53**        -9,15        -5,42        3,63         5,15
Coupon                19,95***       14,86***     11,58**      7,75         1,6
Age class 2           -11,18***      -2,54        -7,44        0,02         2,01
Age class 3           -12,40***      -2,16        -19,65***    -12,7        1,38
Age class 4           -16,46***      -10,34*      -26,31***    -11,61       -4,14
Age class 5           -10,56***      -5,48        -15,08*      -17,29       2,81
Age class 6           -2,33          -5,4         -12,48       -17,68       -7,51
Relationship length   10,14***       -0,36***     0,10***      -0,02        0,02
Observations          26561          8706         41410        9329         8023
R2                    0,0517         0,0332       0,0013       0,016        0,0784
Adjusted R2           0,0514         0,0332       0,0011       0,015        0,0773
Residual Std. Error   169,37         161,77       321,54       193,03       121,58
Degrees of freedom    26551          8696         41400        9319         8013
F Statistic (df=9)    160,99***      33,26***     6,30***      16,84***     75,75***

Note: Significance scores: * = P<0,1 ** = P<0,05 *** = P<0,01. Outliers excluded.

Multiple regression:

4.1.4 Effects on delta number of orders

We established that the model for the delta amount of orders is significant and that interpretation is justified. We estimated five separate models, one for each segment. The intercept of each model can be interpreted as the baseline group of customers, who did not experience a service failure. Of the three intercepts that show a significant estimate, those for the new and engaged segments are negative. This would indicate that customers in these segments show a decrease in buying behaviour over time. The cause could be similar to that discussed for the delta amount spent model.

The second variable is service failure. Every customer who experienced the failure, regardless of treatment, is represented through this variable. In all five models we find a significant (P<0,01) effect on buying behaviour. For the new and engaged customers, however, these effects are positive. This contradicts H1, which predicts that a service failure will have a negative effect on buying behaviour. The remaining three estimates, for occasional, disengaged, and dormant customers, give evidence in support of H1. The positive result for the engaged segment gives evidence in favour of H3, which proposes that customers higher in loyalty are less sensitive to the negative effects of a service failure.

The recovery strategy of the apology is the third variable. The only significant effect (P<0,01) is a positive one for new customers. This is in line with H2, which predicts that service recovery will have a positive effect on buying behaviour. The other estimates show no significant effect.

The effect of the second recovery strategy, the coupon, is estimated by the fourth variable. In all five models the coupon has a significant (P<0,01) positive effect on the number of orders placed. We compare the segments by plotting the effect of the coupon for the five segments in Figure 4.2. The slopes of these plots differ only slightly, providing no substantial evidence in favour of H4.
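As a worked example, the service failure and coupon estimates from Table 4.2 can be added to obtain the estimated change in delta number of orders for a customer who experienced the failure and received the coupon, relative to the baseline group. The short Python sketch below does this arithmetic. It assumes that the coupon dummy applies on top of the service failure dummy, as the description of the service failure variable suggests, and it ignores the uncertainty of the estimates. The function name is hypothetical.

```python
# Service failure and coupon estimates copied from Table 4.2
# (decimal commas written as decimal points).
coefficients = {
    "new":        {"failure": 0.35,  "coupon": 0.16},
    "occasional": {"failure": -0.65, "coupon": 0.12},
    "engaged":    {"failure": 0.29,  "coupon": 0.09},
    "disengaged": {"failure": -0.45, "coupon": 0.10},
    "dormant":    {"failure": -0.73, "coupon": 0.12},
}

def net_effect_with_coupon(segment):
    """Sum of the failure and coupon estimates, rounded to two decimals."""
    c = coefficients[segment]
    return round(c["failure"] + c["coupon"], 2)

for segment in coefficients:
    print(segment, net_effect_with_coupon(segment))
```

Under these assumptions the sum remains negative for the occasional, disengaged, and dormant segments, so in this model the coupon offsets only part of the estimated decrease in orders for those segments.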


Table 4.2 Estimation of the model for dependent variable delta number of orders

                      New customers  Occasional     Engaged        Disengaged     Dormant
Intercept             -1,21***       -0,02          -1,00***       -0,07          0,05*
Service failure       0,35***        -0,65***       0,29***        -0,45***       -0,73***
Apology               0,08***        0,07           -0,11          0,09           0,03
Coupon                0,16***        0,12***        0,09***        0,10***        0,12***
Age class 2           0,07***        0,04           0,06           0,01           0,12***
Age class 3           0,05***        0,06*          0,02           -0,08          0,08**
Age class 4           0,01           -0,003         -0,03          0,11*          0,005
Age class 5           0,01           -0,005         -0,07          0,11*          0,01
Age class 6           -0,02          -0,08          0,12*          -0,24***       -0,04
Relationship length   0,10***        -0,003***      0,001***       0,0002         0,0004***
Observations          26561          8706           41410          9329           8023
R²                    0,128          0,0859         0,0054         0,0394         0,1816
Adjusted R²           0,128          0,0849         0,0052         0,0385         0,1807
Residual Std. Error   0,8446         1,015          2,239          1,142          0,7859
Degrees of freedom    26551          8696           41400          9319           8013
F Statistic (df=9)    435,09***      90,81***       25,31***       42,56***       197,62***

Note: Multiple regression. Significance levels: * P<0,1; ** P<0,05; *** P<0,01. Outliers excluded.

Figure 4.2 Coupon effects plotted for all segments

