
Measuring Online Conversion and Exploring its Attribution: a Customer Segmentation Approach



by

Mark Arjan Johannes Wolters

Master Thesis


MSc Marketing Intelligence & MSc Marketing Management

June 17, 2019

Mark Arjan Johannes Wolters

s3031659

Willem de Kooningstraat 12

7556 PM Hengelo (ov)

markwolters3@gmail.com

University of Groningen

Faculty of Economics and Business

Department of Marketing

9700 AV Groningen


Abstract: This research explores differences in online behaviour across customer segments. Latent class analysis was used to cluster individual customers based on a set of behavioural variables; ultimately, a five-segment solution was adopted and investigated in detail. Linear regression analysis indicates that journey length positively affects the number of page views for all segments. For the share of customer-initiated touch points in the purchase journey, this was not the case. The effects of journey length and the share of customer-initiated touch points on conversion probability were examined using binary logistic regression. For all segments, journey length showed the same pattern: when it was measured as only the (active) time spent on touch points, conversion probability was positively affected, whereas when the time between two touch points was included in the journey length, a negative effect arose. The share of customer-initiated touch points was only significant for Segment 5. Channel-level attribution using the second-order Markov model yielded similar results for all segments: two channels appear most prominent in assigning conversion credit. However, the distribution of credit among all channels differed more across segments. The combination of techniques revealed differences among the identified customer segments, which lead to several implications.

Key words: customer segmentation; latent class analysis; conversion; logistic regression;


PREFACE

During the process of writing this thesis, I got the opportunity to work with real data

involving the online customer journey. This has always been one of the topics within the

marketing field that attracted my attention. Although writing this thesis was most of the time a

highly challenging, demanding and sometimes frustrating process, I enjoyed working with the

data. By finalising this report, my days as a student come to an end. In my opinion, this is a

positive and healthy development given that it provides the opportunity to take the next step

forward. After almost nine years of studying, I’m relieved and eager to launch my

professional career in marketing and data analysis.

I would like to express my sincere thanks to the people who supported me during my academic journey. First of all, I thank my supervisor, dr. Peter van Eck, who helped me write this thesis. The constructive comments and feedback were valuable and forced me to reconsider and improve my research. Next, I want to thank Martine van der Heide for taking the time to act as co-assessor. I also want to thank my fellow students in my thesis group for being responsive and attentive. Moreover, I would like to thank the University, and in particular the Department of Marketing, for providing a programme with challenging academic courses supervised by inspiring professors and teachers.

Lastly, and most importantly, I want to express my gratitude towards my family who provided

unconditional support, care and love throughout the years. I could not have reached my

objectives without your loyalty. I am grateful to be part of your family circle.

Mark Wolters


TABLE OF CONTENTS

1. INTRODUCTION
1.1 Problem Statement and Research Questions
1.2 Relevance and Contribution
1.3 Structure of the Thesis
2. THEORETICAL FRAMEWORK
2.1 Online Customer Journey
2.2 Page Views and Conversion
2.3 Conversion Attribution
2.4 Customer Segmentation
2.5 Characteristics of Online Behaviour
2.6 Conceptual Model
3. RESEARCH DESIGN
3.1 Data Collection
3.2 Techniques of Analysis
3.3 Variables
3.4 Model Specification
3.4.1 Customer Segmentation
3.4.2 Modelling Page Views
3.4.3 Modelling Conversion Probability
3.4.4 Modelling Conversion Attribution
4. RESULTS
4.1 Preliminary Analysis
4.2 Descriptive Statistics
4.3 Segmentation using Latent Class Analysis
4.4 Model Estimation and Model Selection
4.4.1 Page Views
4.4.2 Conversion Probability
4.4.3 Conversion Attribution
4.5 Hypothesis Testing
5. CONCLUSION
5.1 Discussion
5.2 Managerial Implications
5.3 Theoretical Implications
5.4 Limitations and Future Research
REFERENCES
APPENDICES
Appendix A: Descriptive Statistics


1. INTRODUCTION

According to the IAB Report on Digital Advertising Spend, published by Deloitte and IAB (2018), spending on online digital advertising in the Netherlands was 1.97 million euro in 2017, compared to 1.16 million euro in 2012. Additionally, according to the Central Bureau of Statistics in the Netherlands, the share of consumers buying products and services online for private use increased from 64% in 2012 to 78% in 2018. This briefly highlights the rise of the internet and the importance of online digital advertising. This irreversible development comes with various challenges for companies and marketers, for example in designing online strategies, setting up the marketing mix and deciding which customers to serve.

In the online environment, marketers try to persuade individual customers to visit their websites. Ultimately, many companies encourage consumers to buy their products or services. Nowadays, potential customers are reached via various firm-initiated channels and media, such as paid search, advertisements, retargeting, display, and newsletters. Simultaneously, customers visit company websites on their own initiative, for example by directly typing the web address of the company. Due to the increasing number of touch points (i.e., channels), customers have an unprecedented range of options to personalise their online customer journey. As a result, the customer journey becomes more complex (Lemon & Verhoef, 2016). To illustrate, before concluding a purchase transaction, customers often enter company websites multiple times via various touch points (Li & Kannan, 2014; Mallapragada, Chandukala & Liu, 2016). They might even visit competitor websites during their online journey, for example to compare prices or characteristics of a product or service. This indicates how comprehensive and complicated the online customer journey is. Furthermore, every touch point influences the subsequent steps a customer takes during their journey. This gives rise to carryover effects (i.e., visits in the past may affect future visits) and spillover effects (i.e., cross-channel effects), and forms the basis for future touch point exposure (Anderl, Becker, von Wangenheim & Schumann, 2016).

For marketers, insight into the way customers act, behave, and make purchases is valuable. This is not a new phenomenon; it originated in traditional advertising media (i.e., TV and print), where advertisers resorted to marketing mix models by means of

(7)

opportunity for determining conversion and attribution is offered because advertisers possess disaggregated, individual customer-level data. This type of data allows for measuring channel effectiveness for individual customers. Neslin and Shankar (2009) identified and addressed the challenge of conversion attribution in a multichannel environment. Additionally, determining the degree to which individual channels actually contribute to a company's success is demanding (Anderl et al., 2016). To illustrate, one could find a pair of shoes when searching for it via Google, and later purchase those shoes after clicking on an advertisement. Which channel generated this purchase? Probably both. Hence, in more complex journeys, proper attribution is more demanding and less straightforward. Recently, an increasing number of scholars have focused on attributing such credit to touch points in the online customer journey (Shao & Li, 2011; Berman, 2015; Kireyev, Pauwels & Gupta, 2016; Anderl et al., 2016). To analyse these convoluted paths to purchase, various analytical attribution models were used, including logistic regression (Shao & Li, 2011), a game theory-based approach (Berman, 2015), VAR models (Kireyev et al., 2016), and graph-based Markov models (Anderl et al., 2016).

Because individual-level data is so widely used, little attention has been paid to comparing conversion and attribution among customer segments. Nevertheless, it is worth looking into segments and their conversion and attribution, since segmentation can help firms in their customer relationship management efforts (Kumar, 2018). Besides, a proper segmentation yields valuable insights for allocating marketing mix resources across customer segments (Verhoef & Lemon, 2013). Therefore, this research addresses this gap in the literature.

1.1 Problem Statement and Research Questions

Determining where, which and how firms should implement different kinds of marketing channels involves complicated decision-making. Online marketers are responsible for designing effective ways to allocate budgets in order to obtain revenue. As previously mentioned, a considerable number of studies have already focused on conversion attribution. Yet, these studies focused on conversion attribution for individual customers (Shao & Li, 2011; Berman, 2015; Kireyev et al., 2016; Anderl et al., 2016). Limited research focused on segmenting customers based on online behavioural characteristics, and

(8)

this research will do so. In addressing this problem, the following research question is

formulated:

Which differences in online behaviour can be identified across various customer segments?

In order to thoroughly research this problem, the following sub-questions were formed:

1) How do online behavioural characteristics affect page views and conversion probability

across different customer segments?

2) How do customer segments vary in conversion attribution for different marketing

channels? That is, what are unique features in conversion attribution for specific customer

segments?

1.2 Relevance and Contribution

Many researchers have focused on individual customer attribution using various analytical techniques. This study examines different customer segments and focuses on determining conversion probability and conversion attribution across these segments. This is useful, since scholars suggest further investigating conversion attribution across customer segments (Li & Kannan, 2014; Kannan & Li, 2017). Markovian graph-based data mining techniques (as applied by Anderl et al. (2016)) will be used to identify differences in conversion attribution. This approach fills a gap in the literature because conversion attribution for various marketing channels will be estimated per customer segment using Markov models. Given that such research has not been conducted before, a significant contribution to the literature will be made. Ultimately, the outcome of this research helps in identifying, defining, and discussing the differences in online behaviour and conversion across customer segments. This results in a better understanding of the online customer journey. Once differences are explored, segment-specific recommendations can help companies, for example in targeting customers, designing processes efficiently, and optimising algorithms.

1.3 Structure of the Thesis

(9)

findings, conclusions and managerial implications are formed. The thesis is concluded with its

limitations and relevant suggestions for future research.

2. THEORETICAL FRAMEWORK

2.1 Online Customer Journey

Generating sales, revenue, and profits are essential objectives for many firms (Teece, 2010). In doing so, firms can act online, offline, or both. Given that offline and online customers vary in their purchase behaviour, this choice should be made deliberately. Moe and Fader (2004) defined key characteristics of online customers. Firstly, online customers can be expected to visit multiple websites without any intention of buying, simply because the costs of visiting are negligible. Hence, lower conversion rates are observed online. Secondly, due to the low cost of visiting websites, online shoppers are more likely to delay their purchasing decision since they can return easily. Consequently, online shoppers are assumed to be more likely to make multiple trips to websites before concluding their purchase. These characteristics affect customers’ purchasing behaviour and, subsequently, the way in which customers shape their online journey. This in turn has an impact on marketers’ decision-making regarding the use of the online marketing mix to influence customers during their journey. In all, this affects the way in which online customer journeys are composed.

Given the uncertainty on both the supplier and buyer side, many researchers have studied the customer journey and how it should be interpreted and defined. Yet, there is no generally used, uniform definition of the ‘online customer journey’. For example, some scholars argue that customer journeys are clearly delimited processes (Whittle & Foster, 1991), while others state that the online customer journey is more flexible and can be considered an open-ended process without a start or ending point (Nichita, Vulpoi & Toader, 2013). In the field of marketing, the focus has been on customers’ decision processes,

(10)

Vetvik, 2009). According to Lemon and Verhoef (2016), the online customer journey consists of three important stages, namely the pre-purchase, purchase and post-purchase stages, whereas Anderl et al. (2016, p. 457) define the online customer journey “as including all touch points over all online marketing channels preceding a potential purchase decision that lead to a visit of an advertiser's website”.

This research defines the online customer journey as a flexible open-ended process including all touch points over all online marketing channels that lead to a visit (i.e., page view) or purchase (i.e., conversion) at an advertiser’s website. Following Lemon and Verhoef (2016) and Court et al. (2009), this means that the definition includes the pre-purchase (awareness, familiarity, consideration) and purchase stages. Since the focal interest of this thesis is investigating conversion and its attribution, there is no need to account for the post-purchase (loyalty) stage in the online customer journey. This results in a definition that is less comprehensive than widely used interpretations of the online customer journey, such as Court et al. (2009) and Lemon and Verhoef (2016).

2.2 Page Views and Conversion

(11)

However, given that a more comprehensive measurement is beneficial in order to capture

strategic interests, conversion is included additionally.

2.3 Conversion Attribution

Lemon and Verhoef (2016) state that the multiple touch points a customer uses throughout the journey affect the usage of future touch points in the journey. Although accounting for them generates valuable insights into how online marketing channels need to be used, the multiple touches a customer makes are rarely taken into account when measuring online marketing effectiveness (Li & Kannan, 2014). Therefore, the issue of attributing conversion to the successive steps in the online customer journey was addressed recently (Li & Kannan, 2014; Anderl et al., 2016).

Customers visit multiple websites through various online marketing channels. These different kinds of visits might influence customers’ subsequent marketing channel choices when they want to return to a website (Anderl et al., 2016). Additionally, each website visit through a certain marketing channel “exposes customers to additional information about the attractiveness of the product and service in relation to competing and complementary offers” (Li & Kannan, 2014, p. 40). More specifically, visit experiences affect future visits to that particular website through the same or other channels. These effects are the so-called spillover and carryover effects. Whereas spillover effects can be defined as

(12)

2.4 Customer Segmentation

Customer segmentation consists of splitting up a heterogeneous market, which is characterised by varying individual demand, into smaller homogeneous markets whose members act and respond similarly within a segment, but differently across segments (Smith, 1956). In essence, segmentation groups customers with similar needs and behaviours into segments using one or more variables (Kotler, 1997; Venter, Wright & Dibb, 2015). Conventionally, bases for segmentation are geographic, demographic, psychographic, and behavioural (Kotler, 1997). Other variables that might be of relevance include situational variables (e.g., purchase/use occasion) and customer preferences (Kotler & Armstrong, 1999).

When designing their marketing strategy, firms want to gain an advantage over competitors (Bharadwaj & Varadarajan, 1993; Day & Wensley, 1988). Firms can use multiple constructs to generate such an advantage, and segmentation is one that is widely used. According to Hunt and Arnett (2004), customer segmentation generates competitive advantages when (1) segments are identified, (2) these segments are targeted, and (3) customised marketing mixes per segment are defined. In customer segmentation, customers are grouped within segments based on similar behavioural characteristics (Konus, Verhoef & Neslin, 2008). A proper segmentation of a customer base yields valuable insights for firms and customers, mainly because investigating and creating customer groups contributes to the effectiveness of customer relationship management efforts (Kumar, 2018). Moreover, successful customer segmentation enables firms to allocate marketing mix resources across customers more precisely. This helps to customise firms’ strategies per customer group, which has valuable implications not only for firms but for customers as well, since their needs are served better and the firm has more insight into their wishes. Accordingly, in this research segmentation is used to understand what drives customers.

In the field of multichannel (i.e., offline and online) research, many scholars have addressed customer segmentation (Nunes & Cespedes, 2003; Konus et al., 2008). Bhatnagar and Ghose (2004) focused on segmenting e-shoppers only. In the literature, there is a clear distinction regarding the origin of the data used for segmentation. Nakano and Kondo (2018)

(13)

used actual data to verify the effects of different purchase channels (i.e., online and offline).

Moreover, multiple researchers argue for generalisation of findings by validating outcomes

with actual behavioural purchase data (Konus et al., 2008; Wang, Yang, Song & Ling, 2014).

The above-mentioned suggestions require clarification when it comes to the usefulness of these recommendations. Firstly, the discrepancy between purchase intention and actual behaviour needs to be emphasised. Research in consumer psychology argues that purchase intention is a reasonable predictor of actual purchase behaviour. Yet, the two terms cannot be used interchangeably, simply because their measurements vary in outcomes. The literature on the quality of surveys measuring respondents’ intentions has identified multiple weaknesses of these survey methods. Such discrepancies lead to misprediction of consumers’ tendency and communication effects (Abeele, Beullens & Roe, 2013; Boase & Ling, 2013).


Table 1: Behavioural segmentation variables

Behavioural variable
    Literature

Device switching
    Conversion rate is significantly higher when customers switch from a more mobile device (i.e., smartphone) to a less mobile device (i.e., desktop), especially when perceived risk is higher, price is higher, and experience with the company is lower (De Haan, Kannan, Verhoef & Wiesel, 2018).
    When booking travel services, mobile is usually used for searching, while PCs are used for both searching and booking (Murphy, Chen & Cossutta, 2016).

Focal brand consideration
    Customers who first react to FICs and later visit the website via CICs show an increased purchase probability (Anderl et al., 2016).
    Customers who visit a company’s website more often have a stronger preference for that particular firm, resulting in being more loyal and more responsive to ads (Bowman & Narayandas, 2001).

Price sensitivity
    Customers with high price sensitivity have larger consideration sets (Mehta, Rajiv & Srinivasan, 2003).
    Intensity of customer search is higher for products that have greater price variability (Mehta et al., 2003).
    Customers react to price levels and price changes (Goldsmith, 2005).

Number of unique touch points used
    The effect of individual touch points is contingent on their occurrence in the customer journey (Lemon & Verhoef, 2016).
    Specific touch points might affect the purchase probability. For example, email retargeting increases the purchase probability for some customers, while for others it works in reverse (Li & Kannan, 2014).
    Variety seeking occurs when an individual seeks to increase stimulation from sources in their environment (Menon & Kahn, 1995). Therefore, customers might prefer more unique touch points simply because they want more variety in their journeys.

When segmenting customers, various analytical approaches can be adopted. Within these

methods one can distinguish supervised and unsupervised learning of the data (Baer, 2012).

Supervised learning in segmentation implies that the number of segments is prespecified by

the researcher, and thus known a priori. In unsupervised learning the algorithm decides on the

number of segments. Given this difference, unsupervised segmentation is often viewed as a

clustering task, whereas supervised segmentation can be called a classification task (Manning

& Schütze, 2000). Considering the explorative nature of this research, unsupervised

(15)

then output a tree of several clusters” (Li, Wang & Xu, 2009, p. 227). The algorithm clusters on (dis)similarity across a set of segmentation variables. Hierarchical clustering is suitable in principle since segments are not known a priori. However, hierarchical clustering has some noteworthy disadvantages. Users of this approach are likely to discard much of the information deriving from its most important output, namely the dendrogram (Arabie, Carroll, DeSarbo & Wind, 1981). Besides, using a dendrogram to decide on the number of segments does not provide an accurate solution (Fonseca, 2011). Lastly, Tripathi and Bhardwaj (2018) highlight the downside of using hierarchical clustering on a large number of observations. Given these disadvantages, hierarchical clustering is considered inappropriate for this study.

Green, Carmone and Wachspress (1976) argue that latent class analysis can be used for marketing applications (e.g., customer segmentation). Latent class analysis methods have a lot in common with the above-described hierarchical clustering, in that heterogeneous customers are clustered in homogeneous groups and the number of segments is not known a priori. However, latent class analysis is able to “accommodate categorical and continuous data, and descriptive or predictive segmentation” (Cohen & Ramaswamy, 1998, p. 14). Moreover, Cohen and Ramaswamy (1998) argue that the most substantial difference between the two is the type of problems they can be used for. The authors claim that hierarchical clustering is a more descriptive methodology with a predictor-outcome relationship, whereas latent class analysis can be used for simultaneously segmenting and predicting. The literature provides multiple advantages of latent class analysis compared to traditional clustering techniques, namely: (1) latent class analysis helps in determining the number of clusters, since it provides means to select segments (McLachlan & Peel, 2000), (2) latent class analysis can deal with different measurement levels in the data set, as mentioned above (Cohen & Ramaswamy, 1998; Vermunt & Magidson, 2002), and (3) traditional approaches are outperformed by latent class analysis (Vidden, Vriens & Chen, 2016). Therefore, latent class analysis is adopted in this study.
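To make advantage (1) concrete, the sketch below compares candidate solutions with the Bayesian information criterion (BIC), one commonly used criterion for choosing the number of latent classes. It is an illustration only: the text above does not state which criterion is used in this study, and the log-likelihoods, parameter counts and sample size are hypothetical numbers, not results from this thesis.

```python
import math


def bic(log_likelihood, n_parameters, n_observations):
    """Bayesian information criterion: lower values indicate a better trade-off."""
    return -2.0 * log_likelihood + n_parameters * math.log(n_observations)


def select_segments(fits, n_observations):
    """Return the number of segments with the lowest BIC, plus all scores.

    `fits` maps a number of segments to (log-likelihood, number of parameters).
    """
    scores = {k: bic(ll, p, n_observations) for k, (ll, p) in fits.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical fit statistics for 2-, 3- and 4-segment solutions (not thesis results).
example_fits = {2: (-5200.0, 9), 3: (-5100.0, 14), 4: (-5090.0, 19)}
best_k, bic_scores = select_segments(example_fits, n_observations=1000)
print(best_k, bic_scores)
```

In this made-up example the four-segment solution fits slightly better than the three-segment one, but not by enough to offset the penalty for its extra parameters, so the three-segment solution is preferred.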

2.5 Characteristics of Online Behaviour

An online purchase journey includes multiple page visits, through which the consumer

processes the gathered information before making a purchase (Mallapragada et al., 2016).

Accordingly, measurements in order to determine effectiveness are necessary to study

(16)

product. The aspiration to purchase a good (i.e., high purchase involvement) drives the general information search (Park & Lessig, 1981). Furthermore, it leads to the desire to gather information during the pre-purchase product search (Beatty & Smith, 1987). This implies that customers are willing to deliberately use various information sources (i.e., different online channels) in their buying process when a product or service is characterised as high involvement. Given that travel services are considered a high involvement service, it is reasonable to assume that online customer journeys will be longer. Blackwell, Miniard and Engel (2006) argue that customers’ degree of involvement affects information processing. High involvement services involve a high degree of complexity, which requires more deliberate consideration and thought. Zaichkowsky (1985) claims that customers need to feel connected with high involvement products. This affects the online customer journey in that it results in a longer and more extensive information search. Therefore, it is hypothesised that:

H1: Customer journeys that take longer (in time) are more likely to have more page views.

H2: Customer journeys that take longer (in time) are more likely to end in a conversion.

H3: Customer journeys including more page views are more likely to end in a conversion.

Traditionally, firms initiate the greater part of marketing activities. Yet, in the online environment marketing activities are often initiated by customers (Shankar & Malthouse, 2007). Hence, an important distinction between online marketing channels is that between customer-initiated contacts (CICs) and firm-initiated contacts (FICs). Both have their unique characteristics: for FICs, the advertiser determines timing and exposure and the

(17)

websites through FICs. Based on the preceding discussion of CICs and FICs, the following is

hypothesised:

H4: Customers who engage in CICs more (i.e., share of CICs in their journey), are more

likely to view more pages.

H5: Customers who engage in CICs more (i.e., share of CICs in their journey), are more

likely to have a higher conversion probability.

2.6 Conceptual Model

Figure 1: Conceptual model of the present research

3. RESEARCH DESIGN

3.1 Data Collection

Market research institute GfK provided data that were gathered from May 31st, 2015 until October 31st, 2016. The data are quantitative panel data of Dutch customers from an anonymous travel agency. Members of the panel gave their approval to participate and to have their online (purchase) journeys recorded. According to Leeflang, Wieringa, Bijmolt and Pauwels (2015), panel data involves keeping track of time series for individual members of the panel. This has consequences in terms of data quality. Lohse, Bellman and Johnson (1999) argue that using panel data benefits companies in that individual behaviour is observed over time, which helps in determining consumer behaviour more accurately. The data consist of two separate data sets. The first includes the browsing and purchase behaviour of 2,456,414 sessions in total, while the second consists of the demographics of the 9,678 panellists.

[Figure 1 labels: Conversion; H1, H2, H3, H4, H5; Customer Segments; Control Variables]

3.2 Techniques of Analysis

Given the objectives of this research, as described in the previous chapter, a combination of several modelling techniques and the appropriate data aggregation level is required. Firstly, this study identifies different customer segments based on a set of behavioural segmentation variables. Therefore, advanced segmentation modelling is required in order to obtain valuable customer segments. Given the reasoning provided in Section 2.4, latent class analysis is used to determine the segments. The customer segmentation analysis is performed on user-level because the aim is to cluster individual customers based on relevant behavioural segmentation variables. Next, the focus shifts towards investigating the hypothesised relations between browsing behaviour and conversion probability across the identified customer segments. From this point in the analysis, data aggregated on journey-level are used, since this type of aggregation provides the true purchase journeys, which are the focal interest of this study. By analysing these hypothesised relations, knowledge is gained regarding which online behaviour drives customers to visit pages and, ultimately, to convert. Linear regression and logistic regression models provide these insights. Lastly, conversion attribution is explored on journey-level using higher order Markov models. Such models yield insights into the most profitable touch points. Moreover, spillover and carryover effects are examined subsequently.

3.3 Variables

This section provides information regarding the variables that are incorporated in this study, organised by the techniques discussed in the previous section. The following variables are derived on user-level and applied in the segmentation analysis: device switching (i.e., a binary variable indicating whether an individual user used one or two devices in their journey), focal brand consideration (i.e., a binary variable indicating whether an individual user visited at least one touch point of the focal brand), price sensitivity (i.e., a binary variable indicating whether an individual user visited at least one comparison site; if so, the user is listed as a price sensitive customer), and number of unique touch points used (i.e., a count variable indicating how many unique touch points an individual user visited during their journey). After

(20)

In the regression models, data are aggregated on journey-level. Therefore, the following variables are included: journey length (defined in two ways: (1) the amount of time the journey takes in seconds, and (2) the sum of the time spent on all used touch points in seconds), CIC usage (i.e., the share of CICs relative to FICs in a purchase journey), page views (i.e., the number of touch points used within a journey), and purchase (i.e., binary variable that indicates conversion).
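
The journey-level variables listed above can be illustrated with a minimal Python sketch. It assumes a hypothetical record per touch point with a start time, an active duration and an initiator label. The field names, the example values, the choice to let the elapsed time run until the end of the last touch point, and the choice to express the CIC share as a percentage of all touch points in the journey are assumptions made for illustration; they are not taken from the dataset.

```python
from dataclasses import dataclass


@dataclass
class TouchPoint:
    start: int       # start of the contact in seconds since an arbitrary origin (assumed field)
    duration: int    # active time spent on the touch point in seconds
    initiator: str   # "CIC" (customer-initiated) or "FIC" (firm-initiated)


def journey_variables(touch_points, purchased):
    """Aggregate the touch points of one journey into journey-level variables."""
    ordered = sorted(touch_points, key=lambda tp: tp.start)
    first, last = ordered[0], ordered[-1]
    # Definition (1): elapsed time, including the gaps between touch points.
    elapsed = (last.start + last.duration) - first.start
    # Definition (2): only the active time spent on the touch points.
    active = sum(tp.duration for tp in ordered)
    n_cic = sum(1 for tp in ordered if tp.initiator == "CIC")
    return {
        "journey_length_elapsed": elapsed,
        "journey_length_active": active,
        "cic_share": 100.0 * n_cic / len(ordered),
        "page_views": len(ordered),
        "purchase": int(purchased),
    }


# Hypothetical journey with four touch points spread over roughly one day.
journey = [
    TouchPoint(start=0, duration=120, initiator="CIC"),
    TouchPoint(start=3600, duration=60, initiator="FIC"),
    TouchPoint(start=86400, duration=300, initiator="CIC"),
    TouchPoint(start=90000, duration=20, initiator="CIC"),
]
variables = journey_variables(journey, purchased=True)
print(variables)
```

In this example the two definitions of journey length differ by a factor of about 180, which is why both are carried into the regression models as separate variables.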

Considering the completeness criteria by Little (1970), it is conceivable to assume that the

relation between browsing behaviour and conversion probability is affected by control

variables. Therefore, the following control variables are added: day of the week effects, end of

the month effect, weather (i.e., (1) temperature and (2) sunshine), and a seasonal variable.

Table 2 provides reasoning for these control variables.

Table 2: Reasoning for control variables

| Control variable | Reasoning |
|---|---|
| Day of the week | Given that travel services are high involvement products, customers prefer to make a deliberate decision before making a purchase. Therefore, customers are more likely to purchase in the weekends. |
| | People tend to shop online on Mondays (Tuttle, 2012). |
| End of the month | Salary payment is at the end of the month. Given that travel services are expensive, people might wait for their salary before they book. Therefore, they are more likely to buy at the end of the month. |
| Weather | In literature, associations between weather and shopping behaviour are observed (Ibrahim & McGoldrick, 2006; Shih, Nicholls & Holecek, 2009). |
| | Sunlight positively affects willingness-to-pay (Murray, Di Muro, Finn & Popkowski Leszczyc, 2010). |
| Seasonal effect | People start searching for and purchasing their (summer) holidays in January, February and March. |

Lastly, the Markov models are also estimated on journey-level. For these models, purchase acts as the input variable that identifies conversion. The chronological sequence of used touch points in a journey determines the path taken within a session.

3.4 Model Specification

This paragraph provides insights and mathematical details regarding the model specifications. Firstly, customer segmentation using latent class analysis is discussed by means of (1) the model specification, (2) how individuals are assigned to segments and (3)


3.4.1 Customer Segmentation

Given the explorative nature of this research, it is not known a priori how customer heterogeneity is structured. The segmentation technique should therefore account for this characteristic. Multiple techniques might be of service here, so a well-considered choice is needed. According to Wedel & Kamakura (1998), latent class analysis is superior to classical clustering techniques (e.g., hierarchical clustering, K-means), mainly because it combines classical clustering procedures with regular statistical estimation methods. Hence, this research adopts the latent class analysis approach. Following Vermunt and Magidson (2002), the latent class analysis model is written as in equation 1:

$$f(y_i \mid \theta) = \sum_{k=1}^{K} \pi_k \, f_k(y_i \mid \theta_k) \tag{1}$$

Where $y_i$ denotes customer i’s scores on the set of behavioural segmentation variables, K is the number of segments, and $\pi_k$ denotes the prior probability of belonging to latent segment k. The distribution of $y_i$ given the model parameters $\theta$, $f(y_i \mid \theta)$, is assumed to be a mixture of class-specific densities, $f_k(y_i \mid \theta_k)$.

In latent class analysis, classification is performed using posterior class membership

probabilities, in which each customer i is assigned to segment k with the highest posterior

probability. For classification, equation 2 is employed:

$$\pi_{k \mid y_i} = \frac{\pi_k \, f_k(y_i \mid \theta_k)}{\sum_{k} \pi_k \, f_k(y_i \mid \theta_k)} \tag{2}$$
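
As an illustration of the mixture density and the posterior classification rule, the following Python sketch computes posterior membership probabilities for a hypothetical two-segment model. It assumes, purely for illustration, that the three binary segmentation variables follow Bernoulli distributions, that the count variable follows a Poisson distribution, and that the variables are independent within a segment. All parameter values are invented and are not estimates from this study.

```python
import math


def class_density(y, theta):
    """f_k(y_i | theta_k): Bernoulli terms for the binary indicators times one
    Poisson term for the count variable (local independence is assumed)."""
    density = 1.0
    for value, p in zip(y["binary"], theta["p"]):
        density *= p if value == 1 else (1.0 - p)
    rate = theta["rate"]
    density *= math.exp(-rate) * rate ** y["count"] / math.factorial(y["count"])
    return density


def posterior(y, priors, thetas):
    """Posterior probability of each segment for one customer."""
    joint = [pi_k * class_density(y, theta_k) for pi_k, theta_k in zip(priors, thetas)]
    mixture = sum(joint)  # the mixture density f(y_i | theta)
    return [value / mixture for value in joint]


def assign(y, priors, thetas):
    """Modal assignment: index of the segment with the highest posterior."""
    post = posterior(y, priors, thetas)
    return max(range(len(post)), key=post.__getitem__)


# Hypothetical parameters: a low-activity segment (0) and a high-activity segment (1).
priors = [0.6, 0.4]
thetas = [
    {"p": [0.05, 0.10, 0.10], "rate": 1.5},
    {"p": [0.40, 0.80, 0.90], "rate": 8.0},
]
active_customer = {"binary": [1, 1, 1], "count": 9}
passive_customer = {"binary": [0, 0, 0], "count": 1}

print(assign(active_customer, priors, thetas))
print(assign(passive_customer, priors, thetas))
```

The sketch only covers the classification step given known parameters; estimating the parameters themselves requires an iterative procedure such as the EM algorithm, which is left to the statistical software.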

Latent class analysis provides multiple segment solutions, which can be compared and


in these criteria originates from the penalties for number of parameters and sample size. BIC

and CAIC penalize more heavily than AIC. For the purpose of this study, the three

above-mentioned information criteria will be used in order to determine the number of segments.

Additionally, to assure the validity, AIC3 is added and evaluated as well.
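
The difference in penalties can be made concrete with a short Python sketch based on the standard definitions of the four criteria. The log likelihood and the number of parameters in the example are hypothetical; only the sample size of 9677 users corresponds to this study.

```python
import math


def information_criteria(log_likelihood, n_params, n_obs):
    """Standard definitions; lower values indicate a better trade-off."""
    deviance = -2.0 * log_likelihood
    return {
        "AIC": deviance + 2 * n_params,
        "AIC3": deviance + 3 * n_params,
        "BIC": deviance + n_params * math.log(n_obs),
        "CAIC": deviance + n_params * (math.log(n_obs) + 1),
    }


# Hypothetical log likelihood and parameter count for one segment solution.
criteria = information_criteria(log_likelihood=-1000.0, n_params=10, n_obs=9677)
for name, value in criteria.items():
    print(name, round(value, 2))
```

For any sample size above roughly 20 observations, ln(N) exceeds 3, so the penalty per parameter increases in the order AIC, AIC3, BIC, CAIC.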

3.4.2 Modelling Page Views

After performing the latent class analysis, page views will be modelled. The dependent variable of interest, page views, can be interpreted as a count variable given that it can only take values higher than 0. Leeflang et al. (2015) state that the General Linear Model is not the best technique for dealing with count data, because the distribution of the dependent variable differs from the normal distribution. Instead, Wooldridge (2012) argues that the Poisson distribution is more suitable for count data. Furthermore, Oppong, Chongsi and Agyapong (2017, p. 457) state that “zero-truncated Poisson regression is used to model count data for which the values (response) cannot be zero”. In principle, therefore, truncated regression is more suitable to determine the effect of journey length and CIC usage on page views. However, the assumptions for count models should be satisfied, otherwise linear regression is more appropriate (Bijmolt, 2018). Count data models are appropriate when the mean of the dependent variable is ≤ 10; above that threshold the distribution is considered approximately normal. Given that page views has a mean of 132 for the whole dataset, Ordinary Least Squares is more appropriate. Therefore, the following model is specified in equation 3:

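$$y_i = \alpha + \beta_1 \mathit{JL}_i + \beta_2 \mathit{JS}_i + \beta_3 \mathit{CICS}_i + \beta_4 \mathit{DotW}_i + \beta_5 \mathit{EotM}_i + \beta_6 \mathit{TEMP}_i + \beta_7 \mathit{SUN}_i + \beta_8 \mathit{SEASON}_i + \varepsilon_i \tag{3}$$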

Where:

$y_i$ = Number of touch points member of segment k visits in purchase journey i;

$\mathit{JL}_i$ = Amount of time the journey takes in seconds in purchase journey i;

$\mathit{JS}_i$ = Sum of time spent on all used touch points in seconds in purchase journey i;

$\mathit{CICS}_i$ = Share of CICs in purchase journey i;

$\mathit{DotW}_i$ = Day of the week effect in purchase journey i;

$\mathit{EotM}_i$ = Dummy variable if purchase at end of the month in purchase journey i;

$\mathit{TEMP}_i$ = Temperature effect in purchase journey i;

$\mathit{SUN}_i$ = Sunshine effect in purchase journey i;


3.4.3 Modelling Conversion Probability

Secondly, a logistic regression model shows how journey length, CIC usage and page views contribute to the probability to convert. In this model, purchase acts as the dependent variable. Given that purchase records whether a customer makes a purchase or not via a 1 or 0, this variable is a binomial response variable. Following Leeflang et al. (2015), the preferred methods to analyse marketing-related problems involving a binomial dependent variable are binary logit and/or probit models. Following Pituch and Stevens (2016), four assumptions are associated with logistic regression models. Assumption 1 states that a correct specification of the model is required. A correct specification implies that (1) the correct link function is used, and (2) only appropriate predictors are included. Given that binary response models can use either a logit or a probit link, a choice has to be made. Logit and probit models are much alike, in that their outcomes are very similar. However, due to mathematical considerations, logit modelling is often preferred (Leeflang et al., 2015). For the identification of predictor variables, Pituch and Stevens (2016) provide means for selecting explanatory variables, of which this research relies on theory and common sense. Furthermore, a stepwise modelling approach contributes to selecting the best model and allows for comparison of multiple models. Therefore, a stepwise approach is adopted when estimating the models. Assumption 2 holds that the observations are independent of each other. Due to the way in which panel data are recorded, it is quite possible that observations are not independent of each other. To illustrate: a user can have multiple purchase journeys, which are measured separately. Therefore, it cannot be guaranteed that the observations within segment solutions are completely independent. Assumption 3 holds that each explanatory variable is measured without measurement error. Given that GfK provides the data for the analysis, this assumption is assumed to be satisfied: GfK is a highly professionalised and experienced organisation, and its data collection is considered non-problematic. Assumption 4 states that a large enough sample size should be used. According to Leeflang et al. (2015), at least five observations are needed for every parameter included in the model. However, Long (1997) advises ten observations per parameter. Given that the segmentation analysis determines the number of observations for each segment, it is not possible to verify this assumption at this point.


zero. Furthermore, given that odds need to be estimated, equation 4 is deployed (Allison,

2012):

$$\pi_i = \frac{1}{1 + \exp\left(-\{x_i'\beta\}\right)} \tag{4}$$

Using this notation, the model specification for conversion probability across various

segments is as in equation 5:

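$$\pi_i = \frac{1}{1 + \exp\left(-\left\{\beta_0 + \beta_1 \mathit{JL}_i + \beta_2 \mathit{JS}_i + \beta_3 \mathit{PV}_i + \beta_4 \mathit{CICS}_i + \beta_5 \mathit{DotW}_i + \beta_6 \mathit{EotM}_i + \beta_7 \mathit{TEMP}_i + \beta_8 \mathit{SUN}_i + \beta_9 \mathit{SEASON}_i\right\}\right)} \tag{5}$$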

Where:

$\pi_i$ = Probability that member of segment k converts in purchase journey i;

$\mathit{JL}_i$ = Amount of time the journey takes in seconds in purchase journey i;

$\mathit{JS}_i$ = Sum of time spent on all used touch points in seconds in purchase journey i;

$\mathit{PV}_i$ = Sum of all used touch points in purchase journey i;

$\mathit{CICS}_i$ = Share of CICs in purchase journey i;

$\mathit{DotW}_i$ = Day of the week effect in purchase journey i;

$\mathit{EotM}_i$ = Dummy variable if purchase at end of the month in purchase journey i;

$\mathit{TEMP}_i$ = Temperature effect in purchase journey i;

$\mathit{SUN}_i$ = Sunshine effect in purchase journey i;

$\mathit{SEASON}_i$ = Seasonal effect for purchase journey i.
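
To show how the logistic specification turns a linear predictor into a conversion probability, a minimal Python sketch is given below. The coefficient values and the two journeys are hypothetical and chosen only for illustration; they are not estimates from this study, and only three of the predictors are included to keep the example short.

```python
import math


def conversion_probability(x, coefficients, intercept):
    """Logistic transformation of the linear predictor of one journey."""
    linear_predictor = intercept + sum(
        beta * x[name] for name, beta in coefficients.items()
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical coefficients for total journey time (JL), active time (JS)
# and page views (PV).
coefficients = {"JL": -2e-7, "JS": 1e-4, "PV": 2e-3}

journey_a = {"JL": 5_000_000, "JS": 8_000, "PV": 150}
journey_b = {"JL": 5_000_000, "JS": 16_000, "PV": 150}  # twice the active time

p_a = conversion_probability(journey_a, coefficients, intercept=-1.5)
p_b = conversion_probability(journey_b, coefficients, intercept=-1.5)
print(round(p_a, 3), round(p_b, 3))
```

Because the transformation is monotone, a positive coefficient always raises the predicted probability, but the size of the increase depends on where the journey sits on the logistic curve.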

Given that multiple models will be estimated using stepwise modelling, the following comparison criteria are used to evaluate the models: hit rate, Nagelkerke’s pseudo R², Akaike Information Criterion (AIC) and the top-decile lift (TDL).
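
Two of these criteria, the hit rate and the top-decile lift, can be computed directly from predicted probabilities. The Python sketch below uses the common definitions: the hit rate is the share of correctly classified journeys at a cut-off of 0,5, and the TDL is the conversion rate among the 10% of journeys with the highest predicted probability divided by the overall conversion rate. The cut-off and the example data are assumptions made for illustration.

```python
def hit_rate(actual, predicted_prob, threshold=0.5):
    """Share of journeys whose predicted class matches the observed outcome."""
    hits = sum(
        1 for y, p in zip(actual, predicted_prob) if (p >= threshold) == (y == 1)
    )
    return hits / len(actual)


def top_decile_lift(actual, predicted_prob):
    """Conversion rate in the top 10% of predictions relative to the overall rate."""
    ranked = sorted(zip(predicted_prob, actual), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, len(ranked) // 10)
    top_rate = sum(y for _, y in ranked[:top_n]) / top_n
    overall_rate = sum(actual) / len(actual)
    return top_rate / overall_rate


# Hypothetical outcomes and predicted probabilities for 20 journeys.
actual = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
predicted = [0.9, 0.8, 0.6, 0.4, 0.55, 0.1, 0.2, 0.1, 0.3, 0.2,
             0.1, 0.2, 0.3, 0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.2]

print(hit_rate(actual, predicted))
print(top_decile_lift(actual, predicted))
```

A TDL above 1 means that the model concentrates converting journeys among its highest predictions, which is useful when the overall conversion rate is low and the hit rate alone is dominated by non-converting journeys.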

3.4.4 Modelling Conversion Attribution

According to Anderl et al. (2016), determining to which channel, and to what degree, a conversion should be attributed is a demanding and complicated process. In conversion attribution literature, several techniques for assigning credit to channels have been used. These techniques can be divided into simple heuristic metrics (e.g., first click or last click


Montgomery and Srinivasan (2004) highlight the importance of statistical models in an online environment. Their proposed advanced model makes probabilistic assessments about the future paths customers take, including whether the customer will convert or not. They argue that their model can be used for setting marketing mix variables, improves purchase conversion rates, and enhances profits. This is closely related to carryover and spillover effects, introduced by Li and Kannan (2014). Both effects help to determine the future steps a customer takes. Carryover effects mean that past visits may influence future visits, such that customers return to a website through the same channel. Spillover effects concern cross-channel effects, for example when visits via SEO lead to visits via display advertising. Both form the basis for future touch point exposure by customers (Anderl et al., 2016).

Contingent on the objectives and purpose of the analysis, marketers can draw upon one or more attribution techniques. However, it has been shown that heuristic models lead to incorrect results (Abhishek et al., 2015; Li & Kannan, 2014; Xu et al., 2014). Moreover, more advanced analytical attribution models outperform heuristic ones (Anderl et al., 2016). Given this research framework, an attribution mechanism that can uncover the effectiveness of individual marketing channels and derive insights on the interaction of channels (i.e., spillover and carryover effects) in a multichannel environment is beneficial (Anderl et al., 2016). Therefore, the Markovian graph-based data mining technique as applied by Anderl et al. (2016) is proposed for this study. Anderl et al. (2016) employ the approach originally applied for individual customer conversion attribution. In these Markov models, the chronological sequence of states (i.e., touch points, or channels) a customer visits during the purchase journey is taken into account. The collection of possible states is the state set; mathematically, this set is specified as in equation 6:

$$S = \{s_1, \ldots, s_n\} \tag{6}$$


able to shed light on the interplay of channels, i.e., spillover and carryover effects (Anderl et al., 2016). Per customer segment, three higher order Markov models will be specified. The predictive performance of these models is assessed by comparing the AUC (i.e., area under the receiver operating characteristic curve) between an estimation and evaluation sample of the dataset per segment. Afterwards, one model per segment is selected and elaborated on. Mathematically, higher order Markov models are specified according to equation 7:

$$P(x_i \mid x_{i-1}, x_{i-2}, \ldots, x_1) = P(x_i \mid x_{i-1}, \ldots, x_{i-n}) \tag{7}$$
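
The idea behind a higher order Markov model can be sketched in Python as follows: the probability of the next state is estimated from the observed frequencies of transitions, conditional on a fixed number of preceding states. The channel names, the four example journeys, the artificial start, conversion and null states, and the choice to use a shorter history at the beginning of a journey are assumptions made for illustration; this is not the exact estimation procedure of Anderl et al. (2016).

```python
from collections import Counter, defaultdict

START, CONVERSION, NULL = "(start)", "(conversion)", "(null)"


def transition_probabilities(journeys, order):
    """Estimate the probability of the next state given the last `order` states."""
    counts = defaultdict(Counter)
    for channels, converted in journeys:
        path = [START] + list(channels) + [CONVERSION if converted else NULL]
        for i in range(1, len(path)):
            # History of at most `order` preceding states (shorter at the start).
            history = tuple(path[max(0, i - order):i])
            counts[history][path[i]] += 1
    return {
        history: {state: n / sum(nxt.values()) for state, n in nxt.items()}
        for history, nxt in counts.items()
    }


# Hypothetical journeys: (chronological channel sequence, converted or not).
journeys = [
    (["search", "comparison", "website"], True),
    (["search", "website"], False),
    (["email", "search", "comparison", "website"], True),
    (["search", "comparison"], False),
]

first_order = transition_probabilities(journeys, order=1)
second_order = transition_probabilities(journeys, order=2)
print(first_order[("website",)])
print(second_order[("comparison", "website")])
```

In the first order model a website visit is followed by a conversion in two out of three cases, whereas the second order model separates website visits that were preceded by a comparison site from those preceded by search. This is the kind of channel interplay that higher order models can capture.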

4. RESULTS

4.1 Preliminary Analysis

In order to obtain insightful and valid results from the multiple analysis techniques, it is essential to check whether the data are ready for analysis. A first glance at the data did not reveal many inconsistencies or oddities, which can be explained by the fact that the dataset consists of actual behavioural purchase data. Nonetheless, specific values in some variables were considered problematic due to oddities, outliers or missing values. This paragraph discusses how these difficulties were solved.

Firstly, the variables used for the analysis were checked for outliers by means of boxplots (Appendix A1). Not all values lie within the whiskers of the boxplots. The majority of these values is nonetheless considered realistic, since the observations are based on real customer behaviour. However, a closer inspection of the data identified one customer who used 64.503 touch points in total. This customer differs greatly from the other customers in the dataset, since the maximum among the remaining customers was 13.813 touch points. Given this difference, this customer is left out of the analysis.


percentages per grouped number of touch points in purchase journeys, whereas Appendix A3 provides a scatterplot of the number of touch points against conversion probability. As can be noted from both appendices, a high number of non-converting short journeys was observed. Hence, it was decided to exclude journeys with ≤ 10 touch points from the analysis.

Thirdly, missing values were observed in the variable duration. In total, the duration of 141.030 touch points was not registered. Appendix A4 gives an overview of the touch points for which these NAs occur. The NAs in CICs were replaced by the mean for that particular journey. It is worth noting that, due to technical reasons, the duration for FICs was not registered consistently. Yet, these FICs should still be considered valuable. Given this, it was decided to replace the NAs in the FICs by the average duration of website visits in the journey.
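
The replacement rule for missing durations can be sketched in Python as below. The record layout (the keys `type`, `is_website_visit` and `duration`) is hypothetical, and the sketch assumes that "the mean for that particular journey" refers to the mean of the observed CIC durations within the same journey; the text does not specify this in more detail.

```python
def impute_durations(journey):
    """Return a copy of the journey in which missing durations (None) are replaced."""
    def mean(values):
        return sum(values) / len(values) if values else None

    # Assumed reading: mean of the observed CIC durations in this journey.
    cic_mean = mean([tp["duration"] for tp in journey
                     if tp["type"] == "CIC" and tp["duration"] is not None])
    # Mean observed duration of the website visits in this journey.
    website_mean = mean([tp["duration"] for tp in journey
                         if tp["is_website_visit"] and tp["duration"] is not None])

    imputed = []
    for tp in journey:
        duration = tp["duration"]
        if duration is None:
            duration = cic_mean if tp["type"] == "CIC" else website_mean
        imputed.append({**tp, "duration": duration})
    return imputed


# Hypothetical journey with one missing CIC duration and one missing FIC duration.
journey_records = [
    {"type": "CIC", "is_website_visit": True, "duration": 100},
    {"type": "CIC", "is_website_visit": True, "duration": 200},
    {"type": "CIC", "is_website_visit": False, "duration": None},
    {"type": "FIC", "is_website_visit": False, "duration": None},
    {"type": "CIC", "is_website_visit": False, "duration": 60},
]
completed = impute_durations(journey_records)
print([tp["duration"] for tp in completed])
```

If a journey contains no observed duration of the required kind, the sketch leaves the value missing rather than borrowing information from other journeys.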

Lastly, given that the latent class analysis is performed on user-level, it is necessary to account for unique PurchaseIDs across the users once the data are aggregated on journey-level. One would expect each unique UserID to have one or more unique PurchaseIDs assigned to it. For example, UserID #1 had three different journeys within the observation period, which were recorded as PurchaseID numbers #3, #4, and #5. Ideally, these three PurchaseIDs are unique to UserID #1 and therefore cannot be found under other UserID numbers. However, the data show that some PurchaseIDs do occur under more than one UserID. This inconsistency might arise because the data in PurchaseID were registered on cookie level, resulting in equal PurchaseIDs for people in the same household. Given this, it cannot be guaranteed that a PurchaseID represents a unique user. Figure 2 provides a graphical overview of this finding.


In total, 2387 problematic PurchaseIDs were observed, of which 2104 were found in 2

different UserIDs, 217 in 3, 54 in 4, and 12 in 5. To solve this inconsistency, after conducting

the latent class analysis it will be investigated whether these problematic PurchaseIDs are

found in the same cluster solution. If so, these cases will be deleted from the dataset to ensure

that purchase journeys represent unique users.
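
A check of this kind can be sketched in Python as follows. The function identifies PurchaseIDs that occur under more than one UserID and tabulates how many users share each of them. The (UserID, PurchaseID) pairs in the example are hypothetical.

```python
from collections import Counter, defaultdict


def shared_purchase_ids(records):
    """Map each PurchaseID that occurs under more than one UserID to its users."""
    users_per_purchase = defaultdict(set)
    for user_id, purchase_id in records:
        users_per_purchase[purchase_id].add(user_id)
    return {pid: users for pid, users in users_per_purchase.items() if len(users) > 1}


# Hypothetical (UserID, PurchaseID) pairs.
records = [(1, 3), (1, 4), (1, 5), (2, 5), (2, 6), (3, 5), (4, 7), (5, 7)]

conflicts = shared_purchase_ids(records)
# Number of problematic PurchaseIDs per number of distinct users sharing them.
by_user_count = Counter(len(users) for users in conflicts.values())
print(conflicts)
print(dict(by_user_count))
```

Once segment memberships are available, the same mapping can be used to look up whether the users who share a PurchaseID were assigned to the same segment.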

4.2 Descriptive Statistics

In order to get a straightforward interpretation of the data, key descriptive statistics are provided in this paragraph. These present the data in raw form, which helps to visualize and understand them. Firstly, the aim is to provide insights into the variables on user-level. Table 3 decomposes the demographic structure of the users in the dataset. Missing values are reported per variable. It was decided not to impute or replace the missing values, since the demographic variables are not used as input variables for any of the analysis techniques. Instead, they are used for profiling the segments; maintaining the true distribution was therefore considered more appropriate, because replacing or imputing does not provide the true values per


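Table 3: Descriptive statistics on user-level; demographic variables

| Variable | Group | Count |
|---|---|---|
| Age | 17-34 | 1265 |
| | 35-50 | 2445 |
| | 51-64 | 2361 |
| | 65+ | 2003 |
| | NA | 1603 |
| Gender | Male | 3220 |
| | Female | 4854 |
| | NA | 1603 |
| Education | Low | 2073 |
| | Middle | 3030 |
| | High | 2543 |
| | NA | 2031 |
| Region | North | 956 |
| | East | 1745 |
| | South | 2112 |
| | West | 3261 |
| | NA | 1603 |
| Income | Below average | 2862 |
| | Average | 3044 |
| | Above average | 831 |
| | Rather not tell | 1337 |
| | NA | 1603 |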


Table 4: Descriptive statistics on user-level; behavioural segmentation variables

| Variable | Mean | S.D. |
|---|---|---|
| Device switching | 0,1239 | 0,329487 |
| Focal brand consideration | 0,2367 | 0,425108 |
| Price sensitivity | 0,5857 | 0,492623 |
| Number of unique touch points used | 4,0832 | 2,550443 |

Once customer segmentation on user-level is conducted, the focus shifts to analysing purchase behaviour on journey-level. Therefore, descriptive statistics on journey-level are reported as well in Table 5. It can be concluded that 20,66% of the 15.912 journeys ended in a purchase. This means a total of 3288 converting journeys, of which 174 were at the focal brand. The average journey length was 5.157.098 seconds, implying an average journey of 59 days, 16 hours and 31 minutes. During this timeframe, customers spent on average 7690,17 seconds on touch points. Furthermore, customers used on average about 147 touch points during their journey, of which almost 99% were customer-initiated.
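
The conversion of the average journey length from seconds into days, hours and minutes can be verified with a few lines of Python; the function name is chosen for illustration only.

```python
def split_seconds(total_seconds):
    """Split a duration in seconds into whole days, hours, minutes and seconds."""
    days, rest = divmod(total_seconds, 86_400)
    hours, rest = divmod(rest, 3_600)
    minutes, seconds = divmod(rest, 60)
    return days, hours, minutes, seconds


print(split_seconds(5_157_098))  # (59, 16, 31, 38)
```

The result of 59 days, 16 hours, 31 minutes and 38 seconds corresponds to roughly 59,7 days.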

Table 5: Descriptive statistics on journey-level

| Variable | Mean | S.D. |
|---|---|---|
| Total journey time (seconds) | 5.157.098 (59,7 days) | 6.476.097 (74,9 days) |
| Duration of touch points (seconds) | 7690,17 | 19059,05 |
| Number of contacts | 147,3 | 319,9 |
| CIC | 144,6 | 312,4 |
| FIC | 2,68 | 26,02 |
| Share of CIC (%) | 98,77 | 6,43 |
| Purchase any | 0,2066 | 0,4049 |
| Purchase own | 0,0109 | 0,1040 |
| End of the Month purchases | 0,2097 | 0,4070 |
| Day of the Week purchases (Monday is 1, Sunday is 7) | 3,932 | 2,1216 |

4.3 Segmentation using Latent Class Analysis


either binary or of ratio scale, an R package able to handle a combination of these variable types is required. The depmixS4 package is suitable for such a mixture of variables; therefore, this package was used to perform the latent class analysis in R, using RStudio version 1.2.1335.

After estimating six different segment solutions ranging from 2 to 7 segments, the number of users assigned to each segment is retrieved. Table 6 reports the segment sizes for the six different solutions. In the two-segment solution, the users were almost equally distributed (52% in Segment 1, and 48% in Segment 2). The three-segment solution provides segment sizes of 41% for Segment 1, 35% for Segment 2 and 24% for Segment 3. The four-, five- and six-segment solutions show greater variation in segment sizes.

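Table 6: Segment sizes (in users) x number of segments (N = 9677)

| Segment | K=2 | K=3 | K=4 | K=5 | K=6 | K=7 |
|---|---|---|---|---|---|---|
| 1 | 5003 | 3951 | 3329 | 1622 | 1894 | 681 |
| 2 | 4674 | 3361 | 2812 | 2365 | 1321 | 1622 |
| 3 | | 2365 | 2561 | 848 | 1745 | 2299 |
| 4 | | | 975 | 2250 | 966 | 637 |
| 5 | | | | 2592 | 2129 | 2102 |
| 6 | | | | | 1622 | 1622 |
| 7 | | | | | | 714 |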

In determining the optimal segment solution, information criteria are assessed. Table 7

provides insights in which segment solution is supported by the information criteria. In

Appendix B1, a graphical overview can be found. It should be noted that there is not much

difference among the values of the information criteria for segment solutions 2, 3 and 4. The

same applies for the differences between segment solutions 5, 6 and 7. Based on the


criteria between the five, six and seven-segment solution is not of much extent, the

five-segment solution is considered most appropriate and adopted.

Table 7: Information Criteria for various segment solutions

| Number of segments | Log likelihood | AIC | BIC | CAIC | AIC3 |
|---|---|---|---|---|---|
| 2 | -34162,92 | 68347,84 | 68426,80 | 68363,55 | 68337,84 |
| 3 | -33047,72 | 66129,44 | 66251,46 | 66133,15 | 66107,44 |
| 4 | -32540,83 | 65127,66 | 65292,74 | 65119,37 | 65093,66 |
| 5 | 17688,53 | -35319,07 | -35110,92 | -35339,35 | -35365,06 |
| 6 | 17726,32 | -35382,65 | -35401,35 | -35414,93 | -35440,64 |
| 7 | 17952,34 | -35822,68 | -35528,41 | -35866,97 | -35892,68 |

Based on the preceding discussion, the segments within the five-segment solution are investigated in more detail. In order to perform valid comparisons among these five segments, an appropriate statistical procedure is required. Given the substantial differences in segment sizes, the homogeneity of variance assumption is affected (Tomarken & Serlin, 1986). The Levene’s and Fligner-Killeen tests on variable level indicate a violation of the homogeneity of variance for device switching (p < 0,001), focal brand consideration (p < 0,001), price sensitivity (p < 0,001), number of unique touch points used (p < 0,001), and age (p < 0,001). A Kruskal-Wallis analysis of variance was conducted for the variables that were significant in the homogeneity of variance test (Vargha & Delaney, 1998). This test does not assume a normal distribution of the residuals. Given that gender, education, region and income were measured on a nominal or ordinal scale, a Chi-square test was used to investigate statistical differences. Table 8 contains the means and p-values for the focal variables per segment. To ease interpretation and comparison between segments, percentages are reported for gender, education, region and income. According to the p-values, segments differ significantly on all variables except gender and region; however, it is not evident between which segments these differences are significant. A pairwise


Table 8: Latent class analysis results: variable comparison

| | Segment 1 | Segment 2 | Segment 3 | Segment 4 | Segment 5 | p-value |
|---|---|---|---|---|---|---|
| Size (%) | 1622 (17%) | 2365 (24%) | 848 (9%) | 2250 (23%) | 2592 (27%) | |
| Device switching | 0,0068 | 0,2419 | 0,4080 | 0,0627 | 0,0498 | 0,000*** |
| Focal brand consideration | 0,0216 | 0,5480 | 0,8774 | 0,0960 | 0 | 0,000*** |
| Price sensitivity | 0,1005 | 0,8723 | 0,9976 | 0,0018 | 1 | 0,000*** |
| Number of unique touch points used | 1 | 6 | 9,5896 | 2,7013 | 3,6617 | 0,000*** |
| Age ᵃ | 51,89 | 50,67 | 47,22 | 52,8 | 54,21 | 0,000*** |
| Gender ᵃ | | | | | | 0,9247 |
| Male | 39% | 40% | 39% | 40% | 40% | |
| Female | 61% | 60% | 61% | 60% | 60% | |
| Education ᵃ | | | | | | 0,000*** |
| Low | 30% | 23% | 18% | 32% | 29% | |
| Middle | 41% | 41% | 38% | 40% | 38% | |
| High | 29% | 37% | 44% | 28% | 33% | |
| Region ᵃ | | | | | | 0,3497 |
| North | 12% | 12% | 10% | 12% | 12% | |
| East | 23% | 22% | 19% | 22% | 22% | |
| South | 26% | 26% | 28% | 24% | 27% | |
| West | 39% | 41% | 43% | 42% | 39% | |
| Income ᵇ | | | | | | 0,000*** |
| Below average | 45% | 39% | 31% | 50% | 42% | |
| Average | 44% | 47% | 52% | 40% | 45% | |
| Above average | 11% | 13% | 17% | 10% | 13% | |

Sign. codes: 0 ‘***’ 0,001

ᵃ Missings occur in these variables. For age, gender, education and region Segment 1 has 357 missings, Segment 2 33, Segment 3 110, Segment 4 413 and Segment 5 390.

ᵇ For income, ‘rather not tell’ was included in the missings. Therefore, this variable had 463 missings in Segment 1, 418 in Segment 2, 143 in Segment 3, 511 in Segment 4 and 496 in Segment 5.

The preceding discussion profiled the segments in terms of numbers. Below, the segments are described more comprehensively.

The Unaffected Ones (Segment 1)

The first segment is the second smallest (17%) and scores low on the behavioural


The Active Ones (Segment 2)

The second segment is the second largest (24%). This segment scores above average on device switching, focal brand consideration, price sensitivity, and the number of unique touch points used. These numbers imply active participation in the online journey. The majority of this segment has at least a middle education, and the largest share (47%) earns an average income.

The Smart Youngsters (Segment 3)

Segment 3 is the smallest, covering only 9% of the market. This segment displays the highest scores on device switching, focal brand consideration and number of unique touch points used. Only Segment 5 scores higher on price sensitivity, but this difference is negligible. With a mean age of 47,22 years, Segment 3 is the youngest. Considering education, the largest share of this segment (44%) is highly educated.

The Passive Ones (Segment 4)

Given that low scores are displayed for the behavioural segmentation variables, Segment 4 is a low involvement group. On price sensitivity, this segment has the lowest score, whereas the scores for device switching, focal brand consideration and number of unique touch points used are well below average. Segment 4 is one of the oldest segments (52,8 years old on average). When examining income, it is noted that half of the group earns below average.

The Oldies (Segment 5)

This segment is the largest (27%) and oldest (just over 54 years old on average). The focal brand was not considered by the customers in this segment. On the other hand, all customers in this segment visited at least one comparison site, which indicates that they are price sensitive. The variation in the distribution of education is smallest for this segment, and the largest share (45%) has an average income.

4.4 Model Estimation and Model Selection


distributed among the five segments. It should be noted that Segment 1 is rather small, especially compared to Segment 2 and Segment 5. According to Leeflang et al. (2015), each parameter should have at least five observations, while Long (1997) states that ten observations per parameter are sufficient. In the model selection, nine parameters were included, so following Long (1997) a minimum of 90 observations is required. With 243 journeys, the number of observations in Segment 1 is therefore considered sufficient.

Table 9: Distribution of journeys per segment

| | Number of journeys | Share of journeys |
|---|---|---|
| Segment 1 | 243 | 2% |
| Segment 2 | 5709 | 40% |
| Segment 3 | 2724 | 19% |
| Segment 4 | 1831 | 13% |
| Segment 5 | 3793 | 26% |
| Total | 14.300 | 100% |

According to Leeflang et al. (2015), VIF scores can be used to assess the extent to which each independent variable can be expressed as a linear combination of the others. By estimating a multiple linear regression model, VIF scores can be generated and assessed. Following Chong and Jun (2005), VIF scores above 5 indicate multicollinearity, which requires additional investigation into the problematic variables. Table 10 reports the VIF scores per segment for the focal variables that are included in the regression models. For Segment 1, $\mathit{CICS}_i$ was excluded because no variation within this variable was observed. Given
