Making artificial neural networks more interpretable:
A case study of switching costs

By: Yao Zhang

MASTER THESIS

University of Groningen
Faculty of Economics and Business
Department of Marketing
MSc Marketing Intelligence

June 2018
Supervisor: K. Dehmamy

Abstract

In recent years, statistical modeling has become the dominant method in management research. Moreover, to deal with increasingly complicated marketing issues, scholars have started to adopt increasingly sophisticated methods. For example, artificial neural networks (ANNs) are now often used to deal with complicated marketing problems. Despite their exceptional predictive power compared to some conventional methods, ANNs have also been tagged as a “black box” because of their opaque prediction process. The ANN technique provides little


Keywords: connection weights; randomization approach; compare; regression method

Preface

I would like to thank my supervisor Keyvan Dehmamy for his careful guidance during the writing of my thesis and for introducing me to a really interesting topic: understandable machine learning. My supervisor always gave me feedback as soon as possible, so that I knew what to do next. Furthermore, I want to thank my second supervisor Peter van Eck in advance for reading and evaluating my thesis. Finally, I want to thank my family and friends for supporting me during this process.

I hope that you will enjoy reading my thesis!

Table of contents

1. Introduction
2. Empirical Example
2.1 Switching Costs
2.2 Satisfaction
2.3 Customer Loyalty
2.4 Customer Characteristics
2.5 Hypotheses
3. Artificial Neural Networks
3.1 Interpreting network architecture and back propagation algorithm
3.2 Interpreting the Input Contribution
4. Methodology
4.1 Data description
4.2 Variable introduction
4.3 Preparation of the data
4.4 Methods introduction
4.4.1 Neural Network Diagram
4.4.2 Garson Algorithm
4.4.3 Sensitivity Analysis
4.4.4 Randomization approach for ANNs
4.4.5 ANNs versus regression method
5. Conclusion
6. Limitations and further research

1. Introduction

ANNs have recently received increasing attention in marketing as a powerful statistical technique for market prediction.¹ Many scholars today adopt ANNs instead of traditional statistical methods to solve customer churn prediction problems (e.g., Tsai & Lu, 2009; Sharma & Panigrahi, 2013).

The most common use of ANNs is predicting the value of a dependent variable from a set of input variables. Their predictive performance is remarkable, but scholars often criticize their explanatory power because the effect of each explanatory variable on the predicted response is obscure. Consequently, the relationship between the independent variables and the dependent variable cannot be readily understood (Anderson, 1995; Bishop, 1995). Compared with conventional statistical methods, which can quantify the effect of input variables, this “black box” property is the main drawback of ANNs.

Garson (1991) proposed a way to determine the overall influence of each explanatory variable. However, interpreting the relationships between the independent variables and the dependent variable remains difficult, because the strength and direction of every axon connection weight in the network must be examined directly. According to Bishop (1995), a pruning algorithm can eliminate connection weights that do not influence the prediction performance of ANNs. Simply put, the pruning method begins with a highly connected network and then successively removes weak connections. The remaining problem is the threshold for deciding whether a connection should be eliminated, which can be addressed by conducting a randomization test for ANNs. The randomization test provides a statistical pruning technique for removing weak connections that are very unlikely to influence the predicted output. By applying the randomization test to identify the contribution of connection weights (in terms of effect size and direction), researchers can quantitatively evaluate both the individual and the interactive effects of independent variables within the network. Three commonly used traditional methods for understanding the neuron connections are also introduced in this study. Comparing the results of these three traditional methods with those of the randomization approach shows that the effect of an independent variable on the dependent variable can be statistically evaluated using the randomization approach. To check that the results of the randomization test for ANNs are reliable, I compare them with results obtained by the regression method (RM). To achieve these goals, I use an empirical example regarding the customer loyalty of Bank of America.

2. Empirical Example

To illuminate the “black box” of the ANN method, this paper uses data provided by Bank of America on customer loyalty to show how ANNs can be used for hypothesis testing. According to Bauer et al. (2002), obtaining new clients is approximately five times as costly as retaining existing clients. Hence, a defensive strategy that aims to maintain existing markets is more viable than an aggressive strategy that focuses on developing new markets. To retain existing clients, one of the most important goals of an enterprise is to gain customer loyalty. Extensive research has shown that satisfaction is one of the most important antecedents of loyalty.² Furthermore, the relationship between satisfaction and loyalty is affected by switching costs, i.e., the costs that clients incur when changing brands, suppliers, or products (Jones et al., 2002; Burnham et al., 2003). This paper retests the relationship between satisfaction and loyalty and seeks to explain how switching costs moderate it. The study contributes to the existing literature by showing that positive and negative switching costs can exert different moderating effects on the relationship between satisfaction and loyalty in the banking industry. It also contributes by studying attitudinal loyalty and behavioral loyalty separately.

2.1 Switching Costs

As mentioned, firms are very concerned about customer loyalty (Yang & Peterson, 2004). Previous scholars have suggested that customer loyalty is positively correlated with the firm's profits (Lam et al., 2004). Moreover, switching costs are considered one of the main drivers of customer retention, which can help firms achieve customer loyalty (Jones, Mothersbaugh & Beatty, 2000). Switching costs can also create a so-called ‘lock-in’ effect that prevents consumers from shifting to a substitute provider (Yanamandaram & White, 2006). In this scenario, the benefits that consumers could obtain from switching do not offset the costs of moving to a substitute provider. Firms may welcome this phenomenon, since it results in consumers keeping their relationship with the firm.

relationship”). For example, the time and effort required to get to know a new service provider can be seen as a kind of negative switching cost.

Switching costs are positive when they derive primarily from producing value and benefits for clients; for example, by switching to another service provider, clients may lose the special discounts offered by their current provider or run the risk that the new provider will deliver inferior service.

2.2 Satisfaction

Oliver (1997) highlighted that customer satisfaction can be understood as the customer's response to the state of fulfillment and the evaluation of that state. In this dissertation, satisfaction is treated as a post-consumption assessment of the bank's services or products. According to Cronin and Taylor (1992), customer satisfaction is a valuable antecedent of customer loyalty. Furthermore, according to Day et al. (1990), satisfaction is the key decisive factor in keeping current customers. The existing literature thus treats satisfaction as a direct determinant of customer loyalty, and this dissertation adopts the same view.

2.3 Customer Loyalty

be the metrics of customer loyalty (Oh, 1995; Kumar & Shah, 2004; Cheng, 2011). Consequently, when a customer holds a loyal attitude towards a company, repeated purchases can symbolize true loyalty, since this attitude ensures that loyal behavior will continue in the future (Amine, 1998). Thus, a loyal attitude towards the firm is indispensable for loyalty. Dick and Basu (1994) explained that consumers' recognition of the firm and their preference for it over competing companies reflect attitudinal loyalty (Cheng, 2011). The most common way to determine whether consumers are attitudinally loyal is to see whether they are willing to recommend their current service provider to those around them (Bansal, Irving, & Taylor, 2004). Furthermore, attitudinal loyalty is not just about frequent purchasing behavior (Shankar, Smith & Rangaswamy, 2003). While high attitudinal loyalty does not in itself represent the purchase behavior of customers, it can lead to positive word of mouth and a willingness to recommend the firm's products and services.

According to Oliver (1999), measuring loyalty solely by customer behavior, such as purchase rates or repeat purchases, has been criticized for its lack of explanatory power and for ignoring the psychological significance of loyalty. Hence, it is logical and useful to distinguish behavioral loyalty from attitudinal loyalty in this research.

2.4 Customer Characteristics


2.5 Hypotheses

Existing literature indicates that satisfaction influences customer loyalty (Henard, 2001). When customers are highly satisfied, the company's services and products meet or exceed customer expectations, such that customers are more willing to recommend the provider to other clients and continue the relationship (Mittal & Kamakura, 2001). Based on these arguments, the following hypothesis is proposed.

H1: customer satisfaction positively affects customer loyalty

Furthermore, we propose three hypotheses about switching costs and loyalty. Clients who perceive negative switching costs “have to stay” with the service provider (or perceive that they have to, irrespective of the satisfaction created in the relationship). Under this circumstance, customers recommend the service provider to others less than those who are in less restricted situations (Julander & Söderlund, 2003). In contrast, when clients perceive positive switching costs, they want to continue the relationship. Here, benefits arising from the relationship, such as discounts and identification with a firm's employees, lead customers to perceive positive switching costs and stimulate them to recommend the company. These arguments lead to the following hypotheses:

H2a: positive switching costs positively affect attitudinal loyalty
H2b: negative switching costs negatively affect attitudinal loyalty
H2c: switching costs positively affect behavioral loyalty

switching costs. However, they do not distinguish between positive and negative switching costs when analyzing their moderating roles in the relationship between satisfaction and loyalty. This research attempts to fill this theoretical gap by looking at both positive and negative switching costs. When clients perceive positive switching costs, satisfied clients keep the relationship with the service provider because of the positive benefits they receive (e.g., identification with employees and brand, and similar benefits deriving from loyalty programs). In this case, clients do not believe that they can enjoy the same services at other companies. Consequently, satisfied clients are more willing to recommend and continue the relationship with their current provider than clients who feel that other companies can provide similar high-quality services. In other words, these positive benefits may increase both the behavioral and the attitudinal loyalty of customers. This leads to the following hypothesis:

H3: the positive relationship between satisfaction and loyalty will be stronger when clients perceive positive switching costs

When clients perceive that they have to continue the relationship, even though they are moderately satisfied with the current service provider, they are more inclined to hold the attitude “I could have chosen a better one.” In this case, even if the enterprise takes measures to improve customer satisfaction, these clients will still feel that they deserve a better service provider and are therefore less likely to recommend the enterprise to people around them. Furthermore, they cannot continue or terminate their relationship with the company as they wish. In this case, satisfaction plays a smaller role in determining behavioral loyalty. This leads to the following hypothesis:

Finally, we propose two hypotheses about customer characteristics. According to Wakefield and Baker (1998), customer age can moderate the relationship between customer satisfaction and loyalty because senior citizens rarely collect new information (Wells & Gubar, 1966). Therefore, we expect senior citizens to rely more on their perceived satisfaction, while young people are more likely to seek information that might influence their loyalty. This leads to the following hypothesis:

H5: the positive relationship between satisfaction and loyalty will be stronger for senior citizens than for young clients

Graph 1- conceptual model for empirical example

3. Artificial Neural Networks

According to Bishop (1995), there are many types of ANNs. Here we introduce the most common structure: a feed-forward artificial neural network made up of three layers and trained by the back propagation algorithm (Rumelhart et al., 1986).

3.1 Interpreting network architecture and back propagation algorithm


FIGURE-1 one hidden layer, feed-forward neural network architecture

The weight assigned to the connection between two neurons can be understood as the strength of the signal sent through the axon. In feed-forward networks, the signal is transmitted in a single direction, from the input layer through the hidden layer to the output layer. The values of the independent variables determine the states of the input neurons. The state of each hidden neuron is determined by the weighted sum of the signals from the input layer plus a bias term; the activation function, which is a differentiable function, transforms this weighted sum into the state of the hidden neuron. The state of each output neuron is determined in the same way from the hidden layer. The entire process can be expressed by the following formula:

$$ y_k = \phi_o\Big\{ \beta_k + \sum_j w_{jk}\, \phi_h\Big( \beta_j + \sum_i w_{ij}\, x_i \Big) \Big\} \tag{1} $$

Where:

$x_i$: the input signals

$y_k$: the output signals

$w_{ij}$: the connection weights between input neuron i and hidden neuron j

$w_{jk}$: the connection weights between hidden neuron j and output neuron k

$\beta_j$, $\beta_k$: the bias terms belonging to the hidden and output layer

$\phi_h$, $\phi_o$: the activation functions used in the hidden and output layer

The most commonly used activation function is the logistic function, which can be expressed by this formula:

$$ \phi(z) = \frac{1}{1 + e^{-z}} \tag{2} $$
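The forward pass described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `logistic` and `forward`, the nested-list weight layout, and the example weights are hypothetical and not taken from the thesis, and the logistic function is assumed for both the hidden and the output layer.

```python
import math


def logistic(z):
    """Logistic activation function: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_ih, b_h, w_ho, b_o):
    """Forward pass of a feed-forward network with one hidden layer.

    x    : list of input signals
    w_ih : w_ih[i][j] is the weight from input neuron i to hidden neuron j
    b_h  : bias term of each hidden neuron
    w_ho : w_ho[j][k] is the weight from hidden neuron j to output neuron k
    b_o  : bias term of each output neuron
    """
    # State of each hidden neuron: activation of (bias + weighted sum of inputs).
    hidden = [
        logistic(b_h[j] + sum(w_ih[i][j] * x[i] for i in range(len(x))))
        for j in range(len(b_h))
    ]
    # State of each output neuron: the same rule applied to the hidden states.
    return [
        logistic(b_o[k] + sum(w_ho[j][k] * hidden[j] for j in range(len(hidden))))
        for k in range(len(b_o))
    ]


# Hypothetical network: two inputs, two hidden neurons, one output neuron.
example_output = forward(
    x=[0.3, 0.8],
    w_ih=[[0.5, -1.0], [1.5, 0.2]],
    b_h=[0.1, -0.1],
    w_ho=[[1.2], [0.35]],
    b_o=[0.0],
)
print(example_output)
```

Because the output neuron also uses the logistic function here, every prediction lies between zero and one, which is why the dependent variable has to be rescaled to that interval before training.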

During training, the back propagation algorithm is used to find the weights that make the gap between the output signal and the observed value small. For a continuous dependent variable, the algorithm minimizes a fitting criterion. The most common criterion is the sum of squared errors:

$$ E = \sum_n \lVert y_n - t_n \rVert^2 \tag{3} $$

However, the fitting criterion changes with the type of dependent variable. For a binary dependent variable, the fitting criterion is the cross-entropy:

$$ E = -\sum_n \big\{ t_n \ln(y_n) + (1 - t_n) \ln(1 - y_n) \big\} \tag{4} $$

Where:

$y_n$: predicted value

$t_n$: observed value
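As a small illustration of the two fitting criteria, the sketch below computes a sum-of-squares criterion for a continuous response and a cross-entropy criterion for a binary response. It assumes the usual textbook forms of both criteria; the function names and the clipping constant `eps` are hypothetical choices made here for numerical safety.

```python
import math


def sum_of_squares(predicted, observed):
    """Least-squares criterion for a continuous dependent variable."""
    return sum((y - t) ** 2 for y, t in zip(predicted, observed))


def cross_entropy(predicted, observed, eps=1e-12):
    """Cross-entropy criterion for a binary (0/1) dependent variable.

    Predictions are clipped to [eps, 1 - eps] so that log(0) never occurs.
    """
    total = 0.0
    for y, t in zip(predicted, observed):
        y = min(max(y, eps), 1.0 - eps)
        total -= t * math.log(y) + (1 - t) * math.log(1 - y)
    return total


# Two observations predicted at 0.5, with observed values 1 and 0.
print(sum_of_squares([0.5, 0.5], [1, 0]))
print(cross_entropy([0.5, 0.5], [1, 0]))
```

Both criteria are zero only when every prediction equals its observed value, so smaller values indicate a better fit.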

The back propagation algorithm revises the connection weights in the reverse direction, layer by layer, following the direction of gradient descent to minimize the error function (Olden, 2001). One iteration of the gradient descent can be expressed by the following formula:

$$ \Delta w_{st} = -\eta \, \frac{\partial E}{\partial w_{st}} \tag{5} $$

Where:

$\Delta w_{st}$: the weight change between neuron s and neuron t in the next layer

$\eta$: the learning rate (step size)

The entire training process is recursive: the training data enter the network in turn, and the connection weights are modified according to equation 5. This iterative process continues until the stopping rule is reached, that is, when the gap between prediction and observation is small or when the risk of over-fitting the data is minimal.
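The training loop described above can be sketched as follows. This is a minimal illustration rather than the procedure used in the thesis: it assumes a single output neuron, logistic activations, the squared-error criterion, online (one observation at a time) updates, and a stopping rule based only on a small training error. The function name `train` and the parameters `eta`, `max_epochs`, `tol`, and `seed` are hypothetical.

```python
import math
import random


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(X, T, n_hidden=3, eta=0.5, max_epochs=2000, tol=1e-3, seed=0):
    """Train a one-hidden-layer network with one output by back propagation."""
    rng = random.Random(seed)
    n_in = len(X[0])
    w_ih = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    b_h = [0.0] * n_hidden
    w_ho = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b_o = 0.0
    error = float("inf")
    for _ in range(max_epochs):
        error = 0.0
        for x, t in zip(X, T):
            # Forward pass: hidden states, then the output state.
            h = [logistic(b_h[j] + sum(w_ih[i][j] * x[i] for i in range(n_in)))
                 for j in range(n_hidden)]
            y = logistic(b_o + sum(w_ho[j] * h[j] for j in range(n_hidden)))
            error += (y - t) ** 2
            # Backward pass: derivative of the squared error with respect to
            # the weighted sum entering the output and each hidden neuron.
            d_o = 2.0 * (y - t) * y * (1.0 - y)
            d_h = [d_o * w_ho[j] * h[j] * (1.0 - h[j]) for j in range(n_hidden)]
            # Gradient descent step: weight change = -eta * dE/dw.
            for j in range(n_hidden):
                w_ho[j] -= eta * d_o * h[j]
                b_h[j] -= eta * d_h[j]
                for i in range(n_in):
                    w_ih[i][j] -= eta * d_h[j] * x[i]
            b_o -= eta * d_o
        # Stopping rule: the gap between prediction and observation is small.
        if error < tol:
            break
    return w_ih, b_h, w_ho, b_o, error
```

In practice the stopping rule would usually also monitor a hold-out sample to guard against over-fitting; that part is left out here to keep the sketch short.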

3.2 Interpreting the Input Contribution

As mentioned before, although many scholars indicate that ANNs have better predictive ability than conventional statistical methods (e.g., Lek et al. 1996), researchers often call them a “black box” method because they do not reveal the relationship between the explanatory variables and the response variable. Fortunately, several methods have been proposed to address this problem, such as growing and pruning algorithms (Bishop, 1995), partial derivatives (Dimopoulos et al. 1995), and asymptotic t-tests.

negative effects on neurons and lower the predicted value of the response variable. Conversely, positive connection weights increase the predicted value of the response variable.

Because the connection weights are essential for evaluating the contribution of the input variables, one topic deserves further attention. Some scholars indicate that the contribution of input variables can be interpreted by calculating connection weights (e.g., Aoki and Komatsu; Chen and Ware 1999). Other scholars indicate that overall variable importance can be quantified by using all the weights of the network (e.g., Garson 1991) or by using sensitivity analysis to determine the range of independent variable contributions (e.g., Lek et al. 1996). Although these methods can be used to determine the contribution of each explanatory variable, the relationship between the independent variables and the dependent variable is hard to evaluate because the individual connection weights must be examined directly. This is a very complicated task. For example, if a network consists of five hidden neurons and eight input neurons, 40 input-hidden connection weights need to be examined. According to Bishop (1995), connections with small weights should be removed. However, it is difficult to decide which connections should be removed. I therefore adopt the randomization test for artificial neural networks to address this question.

4. Methodology

4.1 Data description


2015). The data collection process is completed through surveys. The dataset includes individual-level data.

4.2 Variable introduction

The overview of these variables is presented in table 1, table 2, and table 3 below. By this method, these variables are easier to understand.

[INSERT TABLES 1-3 ABOUT HERE]

4.3 Preparation of the data

Before training the network, the variables need to be transformed. A continuous dependent variable should be rescaled to the interval from zero to one so that the sigmoid function can be used. This can be achieved by using this formula:

$$ r_n = \frac{t_n - \min(T)}{\max(T) - \min(T)} $$

Where:

$t_n$: the value before conversion for observation n

$r_n$: the value after conversion for observation n

$T$: the set of all observed values of the dependent variable
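The conversion of the dependent variable to the interval from zero to one can be sketched as a min-max rescaling. The function name `min_max_scale` is hypothetical, and the example values are made up for illustration.

```python
def min_max_scale(values):
    """Rescale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; rescaling is undefined")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical loyalty scores measured on a 2-to-6 range.
print(min_max_scale([2, 4, 6]))  # → [0.0, 0.5, 1.0]
```

After this conversion the smallest observed value is exactly 0 and the largest exactly 1, which matches the output range of the logistic activation function.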

The independent variables also need to be converted, because we need to ensure that a similar percentage change in the weighted sum of inputs leads to a similar change in the unit output (Olden, 2001). This can be achieved by using this formula:

Currently, all respondents' data are in the same dataset. The key point of the empirical example is to figure out the different impacts brought by different kinds of switching costs, so the dataset is divided into two sub-datasets. All clients who perceive positive switching costs belong to one dataset, and all clients who perceive negative switching costs belong to another dataset.

4.4 Methods introduction

In this section, several traditional methods for quantifying the contribution of input variables are introduced. Next, the randomization test is introduced and applied to show the difference between the traditional methods and the randomization approach, and to explain how the relationship between the explanatory variables and the response variable can be statistically evaluated. Before assessing the effect of the input variables, the structure of the artificial neural network (the number of hidden neurons) must be determined. This is done by comparing the predictive performance of networks with different numbers of hidden neurons. For both the networks with a binary output variable and the networks with a continuous output variable, three hidden neurons give the best performance in terms of the Pearson correlation coefficient and the accuracy rate.
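The two performance measures used to compare network structures can be sketched as follows. This is an illustration only: the function names, the 0.5 classification threshold, and the scores in the usage example are hypothetical and are not results from the thesis.

```python
import math


def pearson_r(predicted, observed):
    """Pearson correlation between predictions and observations (continuous output)."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted))
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed))
    return cov / (sd_p * sd_o)


def accuracy(predicted_prob, observed, threshold=0.5):
    """Share of correctly classified observations (binary output)."""
    hits = sum(1 for p, o in zip(predicted_prob, observed)
               if (p >= threshold) == bool(o))
    return hits / len(observed)


def best_hidden_size(scores):
    """Return the number of hidden neurons with the highest performance score."""
    return max(scores, key=scores.get)


# Made-up scores for networks with 2, 3 and 4 hidden neurons.
print(best_hidden_size({2: 0.71, 3: 0.78, 4: 0.75}))
```

The performance scores should be computed on data that were not used for training, otherwise larger networks will tend to look better simply because they fit the training data more closely.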

4.4.1 Neural Network Diagram

product of the input-hidden connection weights and the hidden-output connection weights. If the direction (sign) of these two connections is the same, the input variable is positively related to the output variable. If the direction of these two connections is not the same, the input variable exerts a negative effect on the output variable. Figure 2 shows neural interpretation diagrams for the empirical example.

Interpreting the connection weights is not an easy task, since a large number of connections makes the effect of each variable very complicated to interpret. Moreover, as the number of hidden neurons increases, the interpretation of variable effects becomes even more complex. For regression models, we only interpret the influence of significant variables. In a neural network, we likewise need to determine which connections do not need to be interpreted. The neural network diagram cannot tell us which connections are not significant. The randomization method introduced later can identify non-significant connections.

Figure 2b customers who perceive positive switching costs-attitudinal loyalty (Thicker lines mean greater weights. Black lines mean positive signals and gray lines mean negative signals)

Figure 2d customers who perceive negative switching costs-attitudinal loyalty (Thicker lines mean greater weights. Black lines mean positive signals and gray lines mean negative signals)

4.4.2 Garson Algorithm

Garson (1991) proposed an approach for determining the relative importance of each input variable. A simple example is used here to illustrate the principle of the method (see Box 1). Because the Garson algorithm uses the absolute values of the connection weights to calculate relative importance, it cannot show the direction of the relationship between an explanatory variable and the response variable. Figure 3 shows the relative importance of each input variable for the empirical example.

Table 4 Input-hidden and hidden-output connection weights

            Hidden A    Hidden B
Input 1      -1.52       -1.21
Input 2       0.11       -0.88
Input 3      -0.57        2.01
Output        1.2         0.35

Table 5 The contribution of each input variable through each hidden unit (e.g. -1.52 * 1.2 = -1.824)

            Hidden A    Hidden B
Input 1      -1.824      -0.424
Input 2       0.132      -0.308
Input 3      -0.684       0.704

Table 6 Relative contribution of each input neuron (e.g. |-1.824| / (|-1.824| + |0.132| + |-0.684|) = 0.69) and the sum of the contributions of each input neuron (e.g. 0.99 = 0.69 + 0.30)

            Hidden A    Hidden B    Sum
Input 1       0.69        0.30      0.99
Input 2       0.05        0.21      0.26
Input 3       0.26        0.49      0.75

Table 7 The relative importance of each input variable (e.g. 0.50 = 0.99 / (0.99 + 0.26 + 0.75))

            Relative importance
Input 1       0.50
Input 2       0.13
Input 3       0.37

BOX 1 Garson algorithm principle (this example consists of two hidden neurons (A and B) and three inputs (1, 2 and 3))
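The steps in Box 1 can be reproduced with a short Python sketch. The function name `garson` and the list layout are hypothetical. The Input 3 weights used here (-0.57 and 2.01) are back-calculated from the Table 5 contributions (-0.684 / 1.2 and 0.704 / 0.35), so they should be read as an assumption of this example.

```python
def garson(w_ih, w_ho):
    """Relative importance of each input variable following the steps in Box 1.

    w_ih : w_ih[i][j] is the weight from input i to hidden neuron j
    w_ho : w_ho[j] is the weight from hidden neuron j to the output
    """
    n_in, n_hid = len(w_ih), len(w_ho)
    # Table 5: contribution of each input through each hidden neuron.
    contrib = [[w_ih[i][j] * w_ho[j] for j in range(n_hid)] for i in range(n_in)]
    # Table 6: share of each input within a hidden neuron, using absolute values.
    col_sum = [sum(abs(contrib[i][j]) for i in range(n_in)) for j in range(n_hid)]
    share = [[abs(contrib[i][j]) / col_sum[j] for j in range(n_hid)]
             for i in range(n_in)]
    # Table 7: sum the shares of each input and normalise them to add up to one.
    row_sum = [sum(row) for row in share]
    total = sum(row_sum)
    return [s / total for s in row_sum]


weights_input_hidden = [[-1.52, -1.21], [0.11, -0.88], [-0.57, 2.01]]
weights_hidden_output = [1.2, 0.35]
importance = garson(weights_input_hidden, weights_hidden_output)
print([round(v, 2) for v in importance])
```

The code works with unrounded intermediate values, so the result for Input 1 (about 0.49) differs slightly from the 0.50 in Table 7, which is computed from the rounded sums in Table 6. Because only absolute values enter the calculation, the output carries no information about the sign of each effect.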

Figure 3b customers who perceive positive switching costs-attitudinal loyalty

Figure 3c customers who perceive negative switching costs-behavioral loyalty

4.4.3 Sensitivity Analysis

Sensitivity analysis was proposed to determine the range of input variable contributions in a neural network. Traditional sensitivity analysis varies one independent variable across its entire range while holding the other independent variables fixed, and then repeats the process for each of the other input variables. However, the interpretation is very complex because of the number of variable combinations. To simplify the process, Ozesmi (1999) suggested first calculating critical measures for each variable (e.g., the 20th and 40th percentiles) and then varying each input variable across its entire range while holding the remaining independent variables at these critical measures. In this study, we also conduct a sensitivity analysis for our empirical example. Figure 4 shows the plots for each input variable, with all other independent variables held at their key measures (20th, 40th, 60th, and 80th percentiles). Many variables show similar response curves, so we only interpret several representative variables in this part.
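The simplified sensitivity analysis can be sketched as below. This is a minimal illustration: `predict` stands for any fitted model that maps one observation to a prediction, and the function names, the linear-interpolation percentile rule, and the number of grid steps are hypothetical choices.

```python
def percentile(values, q):
    """Percentile with linear interpolation, q between 0 and 100."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def sensitivity_profile(predict, X, var, quantiles=(20, 40, 60, 80), steps=11):
    """Response curves for input `var`, with all other inputs held at given percentiles.

    predict   : function taking one observation (list of inputs), returning a prediction
    X         : list of observations
    var       : index of the input variable that is varied across its range
    quantiles : percentiles at which the remaining inputs are held constant
    """
    columns = list(zip(*X))
    lo, hi = min(columns[var]), max(columns[var])
    grid = [lo + (hi - lo) * s / (steps - 1) for s in range(steps)]
    curves = {}
    for q in quantiles:
        base = [percentile(col, q) for col in columns]
        curve = []
        for value in grid:
            x = list(base)
            x[var] = value  # vary only the variable of interest
            curve.append(predict(x))
        curves[q] = curve
    return grid, curves
```

Plotting each curve in `curves` against `grid` gives one response curve per percentile level, which corresponds to the kind of plot described for Figure 4.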

According to figure 4a, the influence of positive switching costs on behavioral loyalty shows a left-skewed curve: the independent variable has a strong impact when its value is high and a weak influence when its value is low.

Figure 4b customers who perceive negative switching costs-behavioral loyalty

According to figure 4b, when all other variables are low in value, the influence of negative switching costs on behavioral loyalty shows a Gaussian response curve: the independent variable has a strong impact at intermediate values and a weak impact at low and high values.

According to figure 4c, the influence of negative switching costs on attitudinal loyalty shows a decreasing response curve: the impact of the independent variable decreases as its value increases.

Figure 4d age of customers who perceive negative switching costs-behavioral loyalty

According to figure 4d, the influence of age on attitudinal loyalty shows a flat response curve: the impact of the independent variable is very stable across its entire range.

4.4.4 Randomization approach for ANNs

To make the interpretation of ANNs easier, we need to determine which connections do not need to be explained. The randomization approach identifies the connections that need to be explained, namely those that differ significantly from random. The following describes how this approach works.

In the first step, we construct an artificial neural network model and record the following value: (1) the product of the input-hidden connection weights and the hidden-output connection weights. In the second step, we randomize the dependent variable and use it to construct a new artificial neural network model. This process is repeated 999 times, recording value (1) each time. The resulting null distribution is then compared with the original value to determine the significance level.
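The two steps above can be sketched in Python as follows. This is a sketch under stated assumptions, not the implementation used in the thesis: the small trainer `fit_products`, the two-sided comparison of absolute values, counting the observed value as part of the null distribution, and reusing the same starting weights for every fit (so that hidden neurons stay comparable across runs) are all choices made for this illustration.

```python
import math
import random


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_products(X, t, n_hidden=3, eta=0.5, epochs=200, seed=0):
    """Train a small network and return the products
    (input-hidden weight) * (hidden-output weight) as products[i][j]."""
    rng = random.Random(seed)  # fixed seed: identical starting weights in every fit
    n_in = len(X[0])
    w_ih = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    b_h = [0.0] * n_hidden
    w_ho = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b_o = 0.0
    for _ in range(epochs):
        for x, target in zip(X, t):
            h = [logistic(b_h[j] + sum(w_ih[i][j] * x[i] for i in range(n_in)))
                 for j in range(n_hidden)]
            y = logistic(b_o + sum(w_ho[j] * h[j] for j in range(n_hidden)))
            d_o = 2.0 * (y - target) * y * (1.0 - y)
            d_h = [d_o * w_ho[j] * h[j] * (1.0 - h[j]) for j in range(n_hidden)]
            for j in range(n_hidden):
                w_ho[j] -= eta * d_o * h[j]
                b_h[j] -= eta * d_h[j]
                for i in range(n_in):
                    w_ih[i][j] -= eta * d_h[j] * x[i]
            b_o -= eta * d_o
    return [[w_ih[i][j] * w_ho[j] for j in range(n_hidden)] for i in range(n_in)]


def randomization_test(X, t, n_perm=999, seed=1, **fit_args):
    """Return the observed weight products and their randomization P values."""
    rng = random.Random(seed)
    observed = fit_products(X, t, **fit_args)  # step 1: original data
    n_in, n_hid = len(observed), len(observed[0])
    # Start each count at 1 so that the observed value is part of the null distribution.
    count = [[1] * n_hid for _ in range(n_in)]
    for _ in range(n_perm):  # step 2: randomized dependent variable
        shuffled = list(t)
        rng.shuffle(shuffled)
        null = fit_products(X, shuffled, **fit_args)
        for i in range(n_in):
            for j in range(n_hid):
                if abs(null[i][j]) >= abs(observed[i][j]):
                    count[i][j] += 1
    p_values = [[c / (n_perm + 1) for c in row] for row in count]
    return observed, p_values
```

With 999 randomizations the smallest attainable P value is 1/1000, so a connection can only be declared significant at the 0.05 level if at most 49 of the randomized products are at least as extreme as the observed one.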

Tables 8a, 8b, 8c, and 8d report the P values from the randomization tests. According to these tables, only a small number of input-hidden-output connection weights differ significantly from random. Hence, we can eliminate the connections that do not differ significantly from random.

Table 8a- the estimation results of randomization test for positive switching costs-behavioral loyalty model

Input variable

Note. W represents the input-hidden connection weight * hidden-output connection weight. P values for the input-hidden-output connection weights are based on the randomization approach. P ≤ 0.05 means statistical significance.

In light of the results of the behavioral loyalty-positive switching costs model, the input variables positive SC and satisfaction significantly influence behavioral loyalty through hidden neurons B and A, respectively. Customer satisfaction and positive SC are negatively related to customer churn, which means that as customer satisfaction and positive SC increase, clients are less likely to switch to another service provider. This finding is in line with our hypothesis. However, no moderation effect of age is found, which does not support our hypothesis. Furthermore, the connection weights of the variable Satc*pos indicate that the positive relationship between satisfaction and behavioral loyalty becomes stronger as positive switching costs increase.

Table 8b- the estimation results of randomization test for positive switching costs-attitudinal loyalty model

Input variable    Hidden neuron A      Hidden neuron B      Hidden neuron C
                  W          P         W          P         W           P
Age                 0.96     0.24      -0.08      0.17        210.71    0.02
Satisfaction      -16.64     0.04       3.37      0.26       2426.15    0.00
Positive SC       -11.76     0.06       2.23      0.32        604.37    0.02
Satc*age           -3.24     0.88       0.12      0.66      -1580.72    0.00
Satc*pos          -26.41     0.06       2.28      0.74      10762.44    0.01

In the positive switching costs-attitudinal loyalty model, the input variable satisfaction significantly influences the output variable through hidden neurons A and C. However, the negative influence through hidden neuron A is counteracted by the positive influence through hidden neuron C, and hence the relationship between satisfaction and attitudinal loyalty is positive. The input variable positive switching costs significantly influences the output variable through hidden neuron C, and the connection weights indicate that positive switching costs are positively related to attitudinal loyalty. Furthermore, we find that customer age and positive switching costs influence the positive relationship between customer satisfaction and attitudinal loyalty. This relationship becomes weaker as customer age increases and stronger as perceived positive switching costs increase.

Table 8c- the estimation results of randomization test for negative switching costs-behavioral loyalty model

Input variable    Hidden neuron A      Hidden neuron B      Hidden neuron C
                  W          P         W          P         W          P
Age                25.43     0.04     -19.94      0.06       4.15      0.29
Satisfaction      -13.31     0.04      10.24      0.05      -0.01      0.86
Negative SC         6.95     0.06      -9.91      0.03      -0.87      0.81
Satc*Neg SC      -106.05     0.00      45.92      0.03       1.73      0.37
Satc*age           -9.99     0.03      -0.69      0.64      -4.20      0.12

The outcome of the behavioral loyalty-negative switching costs model indicates that negative switching costs and satisfaction are both negatively related to customer churn. In terms of input

negative influence through hidden neuron A, so satisfaction is negatively related to customer churn. Negative switching costs significantly influence the output variable through hidden neuron B, and the connection weights indicate that clients are less likely to switch to another service provider as perceived negative switching costs increase. The connection weights of the variable age*satc indicate that the negative relationship between satisfaction and customer churn becomes stronger as client age increases. Furthermore, we find that negative switching costs play a role in the relationship between satisfaction and behavioral loyalty. For the variable satc*Neg sc, the positive influence through hidden neuron B is counteracted by the negative influence through hidden neuron A, which means that the negative relationship between satisfaction and customer churn becomes stronger as perceived negative switching costs increase.

Table 8d- the estimation results of randomization test for negative switching costs-attitudinal loyalty model

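| Input variable | Hidden neuron A: W | Hidden neuron A: P | Hidden neuron B: W | Hidden neuron B: P | Hidden neuron C: W | Hidden neuron C: P |
|---|---|---|---|---|---|---|
| Age | -0.72 | 0.16 | 0.40 | 0.24 | 0.43 | 0.35 |
| Satisfaction | 4.52 | 0.05 | 10.00 | 0.03 | -5.21 | 0.05 |
| Negative SC | -7.11 | 0.04 | 2.64 | 0.13 | 0.35 | 0.87 |
| Satc*Neg SC | 5.02 | 0.05 | -5.59 | 0.04 | -1.73 | 0.31 |
| Satc*age | 0.04 | 0.44 | -0.16 | 0.44 | 1.67 | 0.67 |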


can influence the relationship between satisfaction and attitudinal loyalty. The positive relationship between satisfaction and attitudinal loyalty becomes weaker as perceived negative switching costs increase.
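
The P values reported in Tables 8c and 8d come from a randomization test. As a rough illustration of the general idea (repeatedly shuffle the response, re-estimate the model, and count how often the shuffled estimate is at least as extreme as the observed one), the following sketch applies the procedure to a simple regression slope rather than to a trained network. The slope statistic, the function names, the synthetic data and the 999 permutations are all assumptions made for illustration; in an ANN setting the statistic would instead be a connection weight product obtained by retraining the network on each shuffled response.

```python
import random


def slope(x, y):
    """Least-squares slope of y on x (a stand-in for a connection weight)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def randomization_p_value(x, y, statistic, n_perm=999, seed=0):
    """Two-sided randomization p-value for statistic(x, y)."""
    rng = random.Random(seed)
    observed = statistic(x, y)
    shuffled = list(y)
    as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between x and y
        if abs(statistic(x, shuffled)) >= abs(observed):
            as_extreme += 1
    # Count the observed value itself so the p-value is never exactly zero.
    return observed, (as_extreme + 1) / (n_perm + 1)


# Synthetic example (not thesis data): y rises with x, plus a little noise.
data_rng = random.Random(42)
x = [float(i) for i in range(40)]
y = [0.5 * xi + data_rng.gauss(0.0, 1.0) for xi in x]

observed, p_value = randomization_p_value(x, y, slope)
print(f"observed slope = {observed:.3f}, randomization p = {p_value:.3f}")
```

A small p-value means that few shuffled data sets produce an estimate as large in magnitude as the observed one, which is how the P columns in the tables are read.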

4.4.5 ANNs versus regression method

To verify that the “coefficients” obtained by ANNs are credible, we compare the outcomes of the ANNs with those of regression methods. For customer attitudinal loyalty (a continuous dependent variable), the ANN results are compared with those of a linear regression model; for customer churn (a binary dependent variable), they are compared with those of a logistic regression model.

To account for interaction effects in linear regression analysis, the model needs to include the products of the relevant variables. In this case, we add the product terms satisfaction * age and satisfaction * switching costs to the model. Another thing worth noting is that both logistic


The results of the ANN model for predicting customer attitudinal loyalty are very similar to those of the linear regression model. Table 9A reports the results of the attitudinal loyalty model. When clients perceive positive switching costs, the regression coefficient for satisfaction (0.03) is significant (p = 0.00), and the coefficient of positive switching costs (0.05) is also significant (p = 0.04). These two coefficients indicate that customer attitudinal loyalty is positively related to both satisfaction and positive switching costs, which is in line with the results of the ANN model. Furthermore, the coefficient of the variable Positive SC*Satc (0.08) is significant (p = 0.04), indicating that the positive relationship between satisfaction and attitudinal loyalty is stronger when clients perceive positive switching costs. When clients perceive negative switching costs, the regression coefficient for satisfaction (0.08) is highly significant (p = 0.00), and the coefficient of negative switching costs (-0.18) is also significant (p = 0.00). These two coefficients demonstrate that customer attitudinal loyalty is positively related to customer satisfaction and negatively related to negative switching costs. Furthermore, the coefficient of the variable Negative SC*Satc (-0.03) is significant (p = 0.028), indicating that the positive relationship between satisfaction and attitudinal loyalty is weaker when clients perceive negative switching costs. This conclusion is in line with the findings of the ANN model for predicting customer attitudinal loyalty.

Table 9A- estimation results for attitudinal loyalty model

| DV: attitudinal loyalty | Model 1 (positive SC): Estimate | Model 1 (positive SC): P | Model 2 (negative SC): Estimate | Model 2 (negative SC): P |
|---|---|---|---|---|
| Intercept | 0.96 | 0.00*** | 0.51 | 0.00*** |
| Satisfaction | 0.03 | 0.00*** | 0.08 | 0.00*** |
| Age | 0.01 | 0.24 | -0.01 | 0.52 |
| Positive switching costs | 0.05 | 0.04* | | |
| Positive SC*Satc | 0.08 | 0.04* | | |
| Satc*Age | -0.02 | 0.07. | 0.0005 | 0.98 |
| Negative switching costs | | | -0.18 | 0.00*** |
| Negative SC*Satc | | | -0.03 | 0.028* |
| R squared | 0.10 | | 0.77 | |


Note. p < 0.05 (*), p < 0.01 (**), p < 0.001 (***)
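
The interaction coefficients in Table 9A imply that the slope of satisfaction is not a single number but varies with the moderators. As a worked illustration, the sketch below computes this conditional slope (satisfaction coefficient + SC interaction coefficient × SC + age interaction coefficient × age) from the reported estimates. It assumes that the estimates apply to the variables as they were entered in the regression, since the scaling is not stated in this section; the dictionary keys and the function name are hypothetical.

```python
# Estimates copied from Table 9A (DV: attitudinal loyalty).
MODEL_1_POSITIVE_SC = {"satisfaction": 0.03, "sc_x_sat": 0.08, "sat_x_age": -0.02}
MODEL_2_NEGATIVE_SC = {"satisfaction": 0.08, "sc_x_sat": -0.03, "sat_x_age": 0.0005}


def satisfaction_slope(coefs, switching_costs, age=0.0):
    """Conditional effect of satisfaction on loyalty at the given moderator values."""
    return (coefs["satisfaction"]
            + coefs["sc_x_sat"] * switching_costs
            + coefs["sat_x_age"] * age)


# Slope of satisfaction at three assumed switching-cost levels, with age held at 0.
for sc in (0.0, 1.0, 2.0):
    pos = satisfaction_slope(MODEL_1_POSITIVE_SC, sc)
    neg = satisfaction_slope(MODEL_2_NEGATIVE_SC, sc)
    print(f"SC = {sc:.0f}: slope with positive SC = {pos:.2f}, with negative SC = {neg:.2f}")
```

With age held at zero, the slope rises from 0.03 to 0.19 as positive switching costs go from 0 to 2, and falls from 0.08 to 0.02 for negative switching costs, which mirrors the strengthening and weakening effects described above.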

Table 9B reports the results for behavioral loyalty. According to the logistic regression model for predicting customer churn, when clients perceive negative switching costs, the coefficients of satisfaction (β = -0.15, p = 0.01) and negative switching costs (β = -0.03, p = 0.00) indicate that clients become less likely to switch to another service provider as satisfaction and negative switching costs increase. When clients perceive positive switching costs, the coefficients of satisfaction (β = -0.03, p = 0.00) and positive switching costs (β = -0.02, p = 0.02) likewise indicate that clients become less likely to switch to another service provider as satisfaction and positive switching costs increase. However, because none of the interaction terms is significant, moderation effects cannot be inferred directly from the coefficients of the logistic regression models.

Table 9B-estimation results for behavioral loyalty model

| DV: behavioral loyalty | Model 1 (positive SC): Estimate | Model 1 (positive SC): P | Model 2 (negative SC): Estimate | Model 2 (negative SC): P |
|---|---|---|---|---|
| Intercept | -1.92 | 0.178 | -1.76 | 0.00*** |
| Satisfaction | -0.03 | 0.00*** | -0.15 | 0.01* |
| Age | 0.008 | 0.03* | 0.09 | 0.74 |
| Positive switching costs | -0.02 | 0.02* | | |
| Positive SC*Satc | -0.28 | 0.22 | | |
| Satc*Age | 0.23 | 0.61 | -0.09 | 0.85 |
| Negative switching costs | | | -0.03 | 0.00*** |
| Negative SC*Satc | | | -0.14 | 0.66 |

Note. p < 0.05 (*), p < 0.01 (**), p < 0.001 (***); behavioral loyalty: 0 = stay, 1 = switch to another service provider.


Norton, 2003; Wiersema & Bowen, 2009). According to Ai and Norton (2003), the magnitude of the interaction effect depends on all independent variables in the model, which means that it can differ across observations. Hence, we interpret the interaction effect by plotting the relationship between the independent variable and the dependent variable while holding the moderator at its key values.
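
The plotting approach described above can be sketched as follows: predicted churn probabilities are computed over a grid of satisfaction scores while the moderator is held at a few fixed levels, giving one curve per level. The coefficients below are hypothetical placeholders chosen only for illustration (they are not the Table 9B estimates), and the moderator levels and function name are likewise assumptions.

```python
import math

# Hypothetical logit coefficients for illustration only (NOT the Table 9B estimates).
COEFS = {"intercept": -1.0, "sat": -0.20, "sc": -0.30, "sat_x_sc": -0.10}


def churn_probability(sat, sc, coefs=COEFS):
    """Predicted probability of switching (1 = switch, 0 = stay) from a logit model."""
    logit = (coefs["intercept"] + coefs["sat"] * sat
             + coefs["sc"] * sc + coefs["sat_x_sc"] * sat * sc)
    return 1.0 / (1.0 + math.exp(-logit))


moderator_levels = {"low": 0.0, "medium": 1.0, "high": 2.0}  # assumed "key values"
satisfaction_grid = [0, 2, 4, 6, 8, 10]                      # satisfaction is scored 0 to 10

# One predicted-probability curve per moderator level.
curves = {
    label: [churn_probability(sat, sc) for sat in satisfaction_grid]
    for label, sc in moderator_levels.items()
}

for label, probs in curves.items():
    print(label, [round(p, 3) for p in probs])
```

Each list corresponds to one line of an interaction plot such as Figure 5a. Because the logistic function is non-linear, the change in predicted probability for a given change in satisfaction differs along each curve, which is the point Ai and Norton (2003) make about interaction effects varying across observations.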

Figure 5a- interaction effect between satisfaction and positive switching costs

According to the three non-parallel lines in Figure 5a, the positive relationship between satisfaction and behavioral loyalty (1 = leave, 0 = stay) becomes stronger as perceived positive switching costs increase.


According to Figure 5b, age does not affect the relationship between satisfaction and behavioral loyalty when clients perceive positive switching costs, because the three lines have almost the same slope.

Figure 5c- interaction effect between satisfaction and negative switching costs


Figure 5d- age-satisfaction effect for clients who perceive negative switching costs

Based on Figure 5d, we find that age plays a role in the relationship between satisfaction and behavioral loyalty when clients perceive negative switching costs. In this case, the negative relationship between satisfaction and customer churn becomes stronger as clients' age increases. To compare the results obtained by the two methods directly, we present the hypotheses tested by both methods in Table 10.

Table 10- hypothesis testing results for these two methods

| Hypothesis | ANNs | Regression |
|---|---|---|
| Customer satisfaction positively affects customer loyalty | Support | Support |
| Positive switching costs positively affect attitudinal loyalty | Support | Support |
| Negative switching costs negatively affect attitudinal loyalty | Support | Support |
| Switching costs positively affect behavioral loyalty | Support | Support |
| The positive relationship between satisfaction and attitudinal loyalty will be stronger when clients perceive positive switching costs | Support | Support |
| The positive relationship between satisfaction and behavioral loyalty will be stronger when clients perceive positive switching costs | Support | Support |
| The positive relationship between satisfaction and attitudinal loyalty will be weaker when clients perceive negative switching costs | Support | Support |
| The positive relationship between satisfaction and behavioral loyalty will be weaker when clients perceive negative switching costs | Rejected | Rejected |
| The positive relationship between satisfaction and attitudinal loyalty will be stronger for senior citizens than for young clients | Rejected | Rejected |
| The positive relationship between satisfaction and behavioral loyalty will be stronger for senior citizens than for young clients | Support (Neg SC) | |


The two methods differ in their interpretation of some variables. For example, when clients perceive positive switching costs, the positive relationship between satisfaction and attitudinal loyalty becomes weaker as clients' age increases in the artificial neural network (W = -1580.72, P = 0.00), whereas in the linear regression model this relationship is not clearly influenced by clients' age because the variable Satc*age (β = -0.02, P = 0.07) is only marginally significant. Furthermore, according to Figure 5c, no moderation effect of negative switching costs is found in the logistic regression model, which rejects the corresponding hypothesis. Although the ANN results also reject this hypothesis, the connection weights for the variable Satc*Neg SC indicate that the positive relationship between satisfaction and behavioral loyalty becomes stronger as perceived negative switching costs increase. Although the two methods (ANNs and regression) do not interpret the effects of the variables in exactly the same way, they are consistent in the results of the hypothesis tests. Moreover, the variables that differ in significance are consistent in the direction of their influence. These discrepancies may be caused by the different working principles of the two models.

5 Conclusion


influence the neural network modeling process. Therefore, ANNs can tell readers how explanatory variables affect response variables. Furthermore, the ANN technique has some advantages that conventional statistical methods do not. According to Tu (1996), these include the following. First, the ANN technique does not need to meet several assumptions required by regression methods (e.g., those concerning multicollinearity, normality and heteroskedasticity). Second, the ANN technique can better handle data that exhibit non-linear characteristics. Hence, the ANN technique, which combines explanatory insight with powerful predictive ability, can be considered a promising method for understanding and predicting market phenomena, and it can provide better decision support for managers.

Furthermore, several managerial implications can be derived from the empirical example. In line with the existing marketing literature (Cronin & Taylor, 1992; Fornell, 1992; Anderson & Sullivan, 1990; Boulding, Kalra, Staelin, & Zeithaml, 1993), this study reveals that customer satisfaction is positively related to customer loyalty. Firms should do their best to provide products and services that meet or exceed customer expectations. A service provider can improve customers' repurchase intentions (behavioral loyalty) and positive recommendations (attitudinal


relationship between satisfaction and loyalty. Therefore, switching costs can be seen as an important factor in understanding the customer churn rate.

Based on the previous findings, it is unwise for service providers to make clients perceive high negative switching costs. Although high negative switching costs can, to some extent, prevent customers from leaving, they reduce customer attitudinal loyalty, which means that clients are less likely to make positive recommendations. Thus, high negative switching costs may have negative long-term consequences for firms (Dwyer, Schurr, & Oh, 1987).

The results of our study support Jones et al.'s (2007) idea that differentiating between positive and negative switching costs helps to avoid a one-sided view of switching costs. Our findings suggest that service providers should take measures that create positive switching costs rather than measures that create negative switching costs. Negative switching costs are positively related to behavioral loyalty but negatively related to attitudinal loyalty. Consequently, although negative switching costs can help firms profit from their current clients, the competitive advantage generated in this way is very hard to maintain in the long term. By comparison, positive switching costs make customers more willing to make positive recommendations while also preventing them from leaving. This can help firms establish long-term relationships with clients.


this goal, firms can establish personal relationships with their clients, offer a points system in which clients earn points and convert them into rewards, and provide personalized service according to customer needs. Furthermore, when clients perceive high negative switching costs, they develop a sense of being locked into a relationship, and it is important for companies to avoid this. Negative switching costs are mainly made up of learning and setup costs and monetary costs. Managers should therefore be cautious in implementing measures that make customers perceive such negative switching costs.

The last implication is that age can also influence the relationship between satisfaction and behavioral loyalty. When clients perceive negative switching costs, the positive relationship between satisfaction and customer loyalty is stronger for senior citizens than for young clients. Therefore, when resources are limited and firms want to raise the satisfaction of customers who perceive negative switching costs, they should prioritize improving services for older clients.

6 Limitations and further research


Furthermore, the dataset used in this study is based on American consumers. Hence, whether the results of this study can be generalized is debatable. According to De Mooij (2002), consumer behavior is affected by cultural differences, so consumers from cultures other than that of the United States are likely to behave differently.


7 References

Ai, C., & Norton, E. C. (2003). Interaction terms in logit and probit models. Economics letters, 80(1), 123-129.

Allison, P. (2012). When can you safely ignore multicollinearity. Statistical Horizons, 5(1).

Amine, A. (1998). Consumers' true brand loyalty: the central role of commitment. Journal of strategic marketing, 6(4), 305-319.

Anderson, J. A. (1995). An introduction to neural networks. MIT, Cambridge, MA, 650 pp.

Aydin, S., & Özer, G. (2005). The analysis of antecedents of customer loyalty in the Turkish mobile telecommunication market. European Journal of marketing, 39(7/8), 910-925.

Bansal, H. S., Irving, P. G., & Taylor, S. F. (2004). A three-component model of customer commitment to service providers. Journal of the Academy of marketing Science, 32(3), 234-250.

Bayon, T., Gutsche, J., & Bauer, H. (2002). Customer Equity Marketing: Touching the Intangible. European Management Journal, 20(3), 213-222.

Bishop, C. M. (1995). Neural networks for pattern recognition. Oxford university press.

Boulding, W., Kalra, A., Staelin, R., & Zeithaml, V. A. (1993). A dynamic process model of service quality: from expectations to behavioral intentions. Journal of marketing research, 30(1), 7-27.

Burnham, T. A., Frels, J. K., & Mahajan, V. (2003). Consumer switching costs: a typology, antecedents, and consequences. Journal of the Academy of marketing Science, 31(2), 109-126.

Caruana, A. (2003). The impact of switching costs on customer loyalty: A study among corporate customers of mobile telephony. Journal of Targeting, Measurement and Analysis for marketing, 12(3), 256-268.

Chen, D. G., & Ware, D. M. (1999). A neural network model for forecasting fish stock recruitment. Canadian Journal of Fisheries and Aquatic Sciences, 56(12), 2385-2396.

Colasanti, R. L. (1991). Discussions of the possible use of neural network algorithms in ecological modeling. Binary: Computing in Microbiology, 3(1), 13-15.

Cronin Jr, J. J., & Taylor, S. A. (1992). Measuring service quality: a reexamination and extension. Journal of marketing, 56(3), 55-68.

Day, G. S. (1990). Market driven strategy: Processes for creating value (pp. 1018). New York: Free Press.


Dwyer, F. R., Schurr, P. H., & Oh, S. (1987). Developing buyer-seller relationships. Journal of marketing, 51(2), 11-27.

Edwards, M., & Morse, D. R. (1995). The potential for computer-aided identification in biodiversity research. Trends in ecology & evolution, 10(4), 153-158.

Fornell, C. (1992). A national customer satisfaction barometer: The Swedish experience. Journal of marketing, 56(1), 6-21.

Garson, G. D. (1991). Interpreting neural-network connection weights. AI expert, 6(4), 46-51.

Geman, S., Bienenstock, E., & Doursat, R. (1992). Neural networks and the bias/variance dilemma. Neural computation, 4(1), 1-58.

Jones, M. A., Mothersbaugh, D. L., & Beatty, S. E. (2002). Why customers stay: measuring the underlying dimensions of services switching costs and managing their differential strategic outcomes. Journal of business research, 55(6), 441-450.

Jones, M. A., Reynolds, K. E., Mothersbaugh, D. L., & Beatty, S. E. (2007). The positive and negative effects of switching costs on relational outcomes. Journal of Service Research, 9(4), 335-355.

Julander, C. R., & Söderlund, M. (2003). Effects of switching barriers on satisfaction, repurchase intentions and attitudinal loyalty. SSE/EFI Working paper series in Business Administration, 1, 1-21.

Kübler, R. V., Wieringa, J. E., & Pauwels, K. H. (2017). Machine learning and big data. In Advanced methods for modeling markets (pp. 631-670). Springer, Cham.

Kumar, V., & Shah, D. (2004). Building and sustaining profitable customer loyalty for the 21st century. Journal of retailing, 80(4), 317-329.

Lam, S. Y., Shankar, V., Erramilli, M. K., & Murthy, B. (2004). Customer value, satisfaction, loyalty, and switching costs: an illustration from a business-to-business service context. Journal of the academy of marketing science, 32(3), 293-311.

Lek, S., Delacoste, M., Baran, P., Dimopoulos, I., Lauga, J., & Aulagnier, S. (1996). Application of neural networks to modelling nonlinear relationships in ecology. Ecological modelling, 90(1), 39-52.

Mittal, V., & Kamakura, W. A. (2001). Satisfaction, repurchase intent, and repurchase behavior: Investigating the moderating effect of customer characteristics. Journal of marketing research, 38(1), 131-142.


Olden, J. D., & Jackson, D. A. (2002). Illuminating the “black box”: a randomization approach for understanding variable contributions in artificial neural networks. Ecological modelling, 154(1-2), 135-150.

Oliver, R. L. (1977). Effect of expectation and disconfirmation on post exposure product evaluations: An alternative interpretation. Journal of applied psychology, 62(4), 480.

Patterson, P. G., & Smith, T. (2003). A cross-cultural study of switching barriers and propensity to stay with service providers. Journal of retailing, 79(2), 107-120.

Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1988). Learning representations by back-propagating errors. Cognitive modeling, 5(3), 1.

Shankar, V., Smith, A. K., & Rangaswamy, A. (2003). Customer satisfaction and loyalty in online and offline environments. International journal of research in marketing, 20(2), 153-175.

Sharma, A., Panigrahi, D., & Kumar, P. (2013). A neural network based approach for predicting customer churn in cellular network services. arXiv preprint arXiv:1309.3945.

Sirdeshmukh, D., Singh, J., & Sabol, B. (2002). Consumer trust, value, and loyalty in relational exchanges. Journal of marketing, 66(1), 15-37.

Szymanski, D. M., & Henard, D. H. (2001). Customer satisfaction: A meta-analysis of the empirical evidence. Journal of the academy of marketing science, 29(1), 16.

Tsai, C. F., & Lu, Y. H. (2009). Customer churn prediction by hybrid neural networks. Expert Systems with Applications, 36(10), 12547-12553.

Tu, J. V. (1996). Advantages and disadvantages of using artificial neural networks versus logistic regression for predicting medical outcomes. Journal of clinical epidemiology, 49(11), 1225-1231.

Wakefield, K. L., & Baker, J. (1998). Excitement at the mall: determinants and effects on shopping response. Journal of retailing, 74(4), 515-539.

Wiersema, M. F., & Bowen, H. P. (2009). Research notes and commentaries: The use of limited dependent variable techniques in strategy research: Issues and methods. Strategic Management Journal, 30(6), 679-692.


Table 1: definition and measurements of variables

| Variable | Definition | Measurement |
|---|---|---|
| Pos SC | To what extent do you agree with the statement? I am not changing my incumbent bank since I feel uncertain about whether other suppliers can give the same service as this one. | Score of clients on this question from 0 to 5 (higher scores mean they agree more with this statement) |
| Neg SC | To what extent do you agree with the statement? I feel locked into this supplier. | Score of clients on this question from 0 to 5 (higher scores mean they agree more with this statement) |
| Age | The actual age of customer | Younger than 20 = 1; 21-30 = 2; 31-40 = 3; 41-50 = 4; 51-60 = 5; 61-70 = 6; Over 70 = 7 |
| Gender | The gender of clients | Male = 1; Female = 2 |
| Satisfaction | What is your overall satisfaction with the service provider | Score of clients on this variable from 0 to 10 |
| Attitudinal loyalty | To what extent do you agree with the statement? I will say positive things about my current suppliers and recommend my current suppliers to my friends. | Score of clients on this variable from 0 to 10 |
| Behavioral loyalty | Whether clients switch to another service provider at the end of the wave | |


Table 2 descriptive statistics for independent and dependent variables

| Variable | Mean | Min | Max | SD |
|---|---|---|---|---|
| Pos SC | 0.35 | 0.00 | 3.29 | 0.42 |
| Neg SC | 0.35 | 0.00 | 4.28 | 0.76 |
| Attitudinal loyalty | 7.39 | 0.00 | 9.00 | 2.32 |
| Switchstr | 0.12 | 0.00 | 1 | 0.32 |
| Satisfaction | 4.81 | 0.50 | 10.00 | 2.39 |

Table 3 Customer age distribution