
Shifting considerations - Insights from the consumer purchase journey


Academic year: 2021


Eleni Marangotidou
University of Groningen
MSc Marketing Intelligence
Master Thesis
17-06-2019

Supervisor (First): dr. P.S. (Peter) van Eck

p.s.van.eck@rug.nl

Supervisor (Second): prof. dr. J.E. (Jaap) Wieringa

j.e.wieringa@rug.nl

University of Groningen Faculty of Economics and Business


SUMMARY

Marketing has always aimed to reach consumers at critical moments and to influence their decisions. Some years ago, the consumer’s path to purchase was translated into the marketing funnel. However, consumer journeys can no longer be fully depicted by this concept because of the explosion of digital channels, which has led to a myriad of touchpoints and highly informed consumers. As a result, consumers have an enormous variety of alternatives to consider for their purchases. Consideration sets therefore play an important role when consumers actively engage in decision journeys, since alternatives can be excluded or added at any time. Online journeys can consist of contacts initiated by customers, contacts initiated by firms, or both. Marketers should take into account both how the different touchpoints influence consumers’ decisions and which of these consumers can be of most value. Translating consumers’ paths to purchase into conversions can be challenging; if leveraged correctly, however, journeys can be of high value for companies. Consequently, this research contributes to the existing literature by examining the following questions:

1. Which of the touchpoints are of most value to the firms?

2. Which online consumer segments can be most influenced by firm-initiated contacts?

3. Can firm-initiated contacts assist in shifting considerations?

Data from a Dutch travel company, obtained from GfK, are used for this study. Based on a literature review, several hypotheses are formed, which are addressed by two binomial logistic regression models: one studying total market behavior and the other addressing the effect of the focal firm’s initiated touchpoints on the competitor’s booking conversions. First, segmentation is performed in two steps: a hierarchical clustering method is followed by a partitioning method, which identified four segments. The segmentation is followed by the two binomial logistic regression models. Their results indicate that both customer- and firm-initiated contacts have a positive effect on purchases. With respect to segments, only two out of four (segments 1 and 4) indicated an influence on bookings; however, they behaved differently when coming across the focal brand’s advertising. More specifically, segments two and three are influenced by the focal brand’s advertising.


considerations and result in negative booking probabilities for the competitor. However, it is also discovered that these interactions can backfire and result in the opposite of the expected outcomes for the focal brand.


Preface

This thesis is the final outcome of my study period and academic education for now.

When I first came to Groningen, I had not yet fully discovered my passion for data; however, the inspiring professors and motivating fellow students made me realize that it is something I want to build my career on.

A great work opportunity arrived at the end of the master program. Combining the master thesis with a full-time working schedule was challenging, yet a full-time learning experience. Studying for the thesis gave me a lot of theoretical knowledge, which contributed to my job and gave me hands-on experience. At this point I would like to thank my colleagues and especially my manager for their support throughout this journey.

I would also like to express my gratitude to my supervisor dr. Peter van Eck, for his valuable insights and constructive feedback. I would also like to thank my thesis groupmates for their tips and suggestions.

I also extend my appreciation to my friends and family, without whom I could not have achieved this much. Lastly, I would like to thank my partner for his constant and unconditional support during this difficult time.

Contents

1. Introduction
2. Literature Review
2.1 CIC and booking conversions
2.2 Consumers’ characteristics and behavior
2.3 Firm-initiated contacts and booking conversions
2.4 Consideration stage
2.5 Interaction between firm- and customer-initiated contacts
2.6 Control variables
2.7 Conceptual model
3. Methodology
3.1 Data description
3.2 Research design
3.3 Plan of analysis
4. Results
4.1 Data manipulation
4.2 Aggregation Level
4.3 Segmentation
4.3.1 Treating for outliers and extreme values
4.3.2 Clustering analysis
4.3.3 Segment description
4.4 Preliminary analysis
4.5 Descriptive statistics
4.6 Checking logistic regression assumptions
4.7 Model selection
4.8 Hypothesis H1, H2, H3a Assessment
4.9 Hypothesis H3b H4 Assessment
4.10 Graphical representation of the moderating effects
5. Discussion
5.1 Effect on CICs and FICs on booking conversions
5.2 Moderating effects of firm-initiated contacts on segments
5.3 Moderating effects of firm-initiated on customer-initiated contacts


1. Introduction

We live in times where the internet is growing more rapidly than ever: it doubles in size every 12 months, and by 2020 it will occupy about 44 zettabytes (Dold & Groopman, 2017). Consequently, online consumers are exposed to a variety of offerings and face a great number of options before making a desired purchase (Scheibehenne et al., 2010). This means that consumers have many feasible alternatives, channels and media to consider when deciding which option is best for them (Lemon & Verhoef, 2016).

Translating these options into communication channels, one can make inferences about the consumer’s path to purchase. A consumer purchase journey is the part of the general customer experience that consists of distinct contacts (touchpoints) between the customer and the firm. Thus, the purchase journey can be defined as “the process a customer goes through, across all stages and touchpoints” (Lemon & Verhoef, 2016).

Businesses can derive insights from the navigational path in order to understand, and possibly predict, online consumer behavior (Montgomery et al., 2004; Bucklin & Sismeiro, 2009). A touchpoint refers to “a customer contact point, or a medium through which the firm and the customer interact” (Neslin et al., 2006). Various researchers (e.g. Bowman & Narayandas, 2001; Lemon & Verhoef, 2016; Anderl et al., 2016) have divided touchpoints into two basic categories: customer-initiated or customer-owned contacts (CIC) and firm-initiated or brand-owned contacts (FIC). A customer-initiated contact is, by definition, a contact initiated by a customer. This can be a generic or more specific query in a search engine, typing in the address of the company’s or a competitor’s website, a visit to a third-party website, or an action made in an application (Wiesel et al., 2011; Lemon & Verhoef, 2016; Li & Kannan, 2014; Anderl et al., 2016). A firm-initiated contact, in contrast, is any touchpoint that results from a company’s initiative and is usually also managed by it. Advertisements such as display, retargeting, affiliate and informational e-mail advertising have been classified as contacts initiated by the firm (Wiesel et al., 2011; Lemon & Verhoef, 2016; Li & Kannan, 2014; Anderl et al., 2016). It is interesting to identify which touchpoint types are more effective and can lead to successful conversions for companies.


interactions with the various channels and advertising, can result in purchases from competitors (Ghose & Todri-Adamopoulos, 2016).

Yet not all consumers express the same preferences in their journeys. It is important to consider consumer heterogeneity across options rather than treating consumer behavior as a whole (Konuş et al., 2008). Demographics and behavioral patterns differ among segments and play a role throughout the path to purchase (Keaveney & Parthasarathy, 2001; Konuş et al., 2008; Patterson, 2007).

Consumers who initiate contacts themselves are expected to have more conversions on a firm’s website (Kumar & Venkatesan, 2005; Wiesel et al., 2011). When moving across CICs, consumers are believed to be in an active consideration stage in which they evaluate their possible alternatives (Court et al., 2009). However, the consumer purchase journey involves more than CICs: customers also encounter firm-initiated contacts, which can influence their purchase decisions (Li & Kannan, 2014). Advertising can influence the consumer’s consideration over time (Mitra, 1995), and companies can leverage this to their benefit by shaping consumers’ considerations (Singer, 2015).

However, to the researcher’s knowledge, little or no research has yet addressed the effect of firm-initiated contacts on the purchase journey across the various consumer segments. Therefore, the main research question of this paper is:

Do Firm-initiated contacts influence the consumers’ considerations through the purchase journey?

In order to shed light on the main question, the following sub-questions are employed:

1. Which of the touchpoints are of most value to the firms?


impact journeys which also include competitors’ contacts, accounting for consumer’s consideration stage.

For the scope of this study, data from a travel website, provided by GfK, will be used. These data contain series of various touchpoints as well as consumer characteristics. Therefore, when the term conversion is used, it refers to travel bookings.

The paper is structured as follows: it continues with a literature review and the formulation of the hypotheses; then a conceptual model is presented, followed by an explanation of the methodology, the presentation of the model formulas and a discussion of the findings. Lastly, managerial implications and limitations of the study are discussed.

2. Literature Review

In this chapter, academic literature related to the main research question and sub-questions is discussed. The literature forms the basis for multiple hypotheses which, in turn, are depicted in a conceptual model.

2.1 CIC and booking conversions

As previously discussed, a customer-initiated contact is triggered by a consumer’s own initiative. A CIC can be a direct visit to the firm’s website, a visit to a comparison website/app, a generic or a branded search in a search engine (this includes both paid and organic results) (Anderl et al., 2016).


De Haan et al. (2016) mention that customers who initiate contacts search for further information, evaluate alternatives or are about to buy the product/ service. Anderl et al. (2016) have also demonstrated that, in general, all the customer-initiated contacts through the consumers’ journey have shown very high percentages of conversions.

According to Li et al. (2016), customers who engage in repeated CICs for a focal firm, are more attracted to it with respect to the competitors and thus, these touchpoints can indicate greater conversion probabilities.

More specifically, Murphy et al. (2016) found that search engines are widely preferred when it comes to travel planning, since consumers use them both for information searching and for bookings. Therefore, it can be inferred that consumers who are considering a travel booking will more actively engage in typing search terms into these engines. In line with this, Yang and Ghose (2010) reported that online search terms mentioning brand names predicted direct conversions. Another form of customer-initiated touchpoint is third-party sources (such as comparison websites and tour operators). According to Keaveney and Parthasarathy (2001), these information sources provide accurate and objective information; therefore, consumers trust them more and are more likely to convert when relying on them. Furthermore, such contacts are highly likely to be followed by search or direct website visits (Mulpuru et al., 2011).

Following the aforementioned, it is expected that:

H1: CICs will have a positive effect on booking conversions.

2.2 Consumers’ characteristics and behavior


above mentioned, it can be assumed that the total time of journey, the number of touchpoints visited, as well as the average time spent on touchpoints will describe different consumer behavior and vary across segments.

Ieva and Ziliani (2018) also found that demographic characteristics such as gender, age and geographic area of residence, were significantly related to touchpoint exposure.

In addition to this, Cambra-Fierro et al. (2017) have also reported several socio-demographic variables which could influence profitable relationships with firms from a CIC perspective. On the contrary, Bowman and Narayandas (2001) did not find any significant effect of gender, age and income on when customers engage in CICs. Finally, in the study of Girard & Silverblatt, educational differences among consumers showed significant results for customers who were purchasing frequently online.

By considering the above, this study will segment consumers of the travel sector based on both active characteristics as well as demographics:

● Total number of touchpoints visited

● Average time spent on each touchpoint

● Total time spent on all orientations (journeys)

● Gender, age, geographic area of residence, education level and income

2.3 Firm-initiated contacts and booking conversions

As already discussed, in a firm-initiated contact the company determines the timing and the exposure of the advertisement (e.g. Anderl et al., 2016; Wiesel et al., 2011). Anderl et al. (2016) have classified firm-initiated contacts into display (banner/pre-rolls), retargeting, email and affiliate.


Pre-rolls are 15-30 second video ads which auto-play when a consumer clicks on a video touchpoint. This type of display advertising does not seem to play a role when it comes to purchases (Ghose & Todri-Adamopoulos, 2016); however, it would be interesting to study whether this effect differs in the travel sector.

Another form of display advertising is banner ads (Rutz & Bucklin, 2012). This advertising type also plays an important role when it comes to bookings, since it can increase the propensity of conversions after the customers are being exposed to it (Manchanda et al., 2006).

Apart from the aforementioned, consumers can also be reached through advertisements based on their previous online behavior; this type of firm-initiated contact is called retargeting and it seems that it can increase purchases by 26.12% (Ghose & Todri-Adamopoulos, 2016).

Email advertising has also been found to have a positive and significant effect on purchase probability (Anderl et al., 2016). This is also supported by Breuer et al. (2011), who found that email advertising has the highest (and positive) long-term impact on sales compared to the other firm-initiated channels.

From the above discussed, the second hypothesis can be formulated as follows:

H2: Firm-initiated contacts will have a positive effect on booking conversions.

2.4 Consideration stage

It is difficult to distinguish which purchase stage consumers are in. Court et al. (2009) propose that the decision journey is circular, and according to Anderl et al. (2016), the consideration set is evoked when consumers actively consider the brands chosen for actual purchases. It can thus be concluded that consideration is closely related to a consumer’s intention to purchase; it is defined as “the extent to which the customer would consider buying the brand in the near future” (Baxendale et al., 2015).


for information. Their research found that two-thirds of the touchpoints during the active-consideration phase involve CICs. According to Szmigin and Piacentini (2018), consumers do not switch from brand to brand; rather, they choose from a subset of brands that are acceptable to them. Thus, it is highly possible for consumers to initiate contacts with multiple firms.

However, the consideration set has been shown to change over time, and it has been reported that advertising can affect the composition of the set (Mitra, 1995). Edelman and Singer (2015) state that firms can take advantage of the channels and shape consumers’ purchase decisions, delivering value to both consumers and the firms themselves.

Therefore, it would be interesting to study which of the identified consumer segments can be influenced by firm-initiated contacts and change their consideration option, both at the total market level and at the company level. The following hypotheses can be formed:

H3a: The interaction effect of FIC on consumer segments will have an effect on booking conversions (market level)

H3b: The interaction effect of FIC on consumer segments will have an effect on booking conversions (company level)

2.5 Interaction between firm- and customer-initiated contacts

According to Anderl et al. (2016), interaction effects among customer- and firm-initiated contacts can indicate purchase probabilities and can be used as proxies for conversions during the consumer purchase journey and consideration stage. Interestingly, their study indicated that consumers who previously engaged in generic CICs followed by clicks on FICs were more likely to make a purchase (positive and significant effect on conversions).

H4a: The interaction effect of retargeting on generic search will have a positive effect on booking conversions.

H4b: The interaction of retargeting on branded search will have a positive effect on booking conversions.

H4c: The interaction of retargeting on branded website visit will have a positive effect on booking conversions.

As previously discussed, academic literature has not found evidence that pre-rolls have an effect on booking conversions; however, according to Ghose & Todri-Adamopoulos (2016), display advertising seems to significantly increase the probability of a consumer visiting the firm’s website directly or through organic search. Assuming that website visits have a high propensity to lead to purchases, the following hypotheses can be formed:

H4d: The interaction of pre-rolls on generic search will have a positive effect on booking conversions.

H4e: The interaction of pre-rolls on branded website will have a positive effect on booking conversions.

De Haan et al. (2016) mention that email advertising needs to be further explored: customers seem to react to emails only when they have decided where and what to buy. This could translate into branded searches and visits to the brand’s website. From this, the following hypotheses can be formed:

H4f: The interaction of email advertising on branded search will have a positive effect on booking conversions

H4g: The interaction of email advertising and branded website visits will have a positive effect on booking conversions,

while


2.6 Control variables

A control variable is defined as a constant whose value does not change throughout the analysis process. It accounts for otherwise unobserved variance in the dataset and can strongly influence the results of an experiment; however, it is not the main interest of the research. The control variables used are described below.

Regarding customers’ browsing behavior, Mulpuru et al. (2011) found that about 50% of purchases were attributed to at least two contacts. This suggests that the number of touchpoints visited will have a significant effect on booking conversions.

During travel planning, consumers are using multiple devices. Both mobile and personal computers are used while searching, however, most bookings come from fixed devices (Murphy et al. 2016). Therefore, the device which was used first to initiate the journey, as well as the device which was used during the last day of the journey will be accounted as control variables.


2.7 Conceptual model

Based on the above discussed literature, the following conceptual framework is visualized as presented below.

3. Methodology

3.1 Data description


The observation period runs from 31 May 2015 to 31 October 2016 and contains, in total, 29,012 consumer journeys (identified by PurchaseID) of 9,678 unique users (UserID). There are 20 touchpoint variables: 15 for the customer-initiated contacts and 5 for the firm-initiated contacts. If a journey has led to a purchase, it is assumed that the purchase was placed after the last touchpoint encounter of that journey (last-click attribution).
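The last-click assumption can be illustrated with a minimal sketch. The journey structure, the field names and the example touchpoint sequences below are hypothetical and do not reflect the actual GfK data format; only the touchpoint codes are borrowed from Table 1.

```python
from collections import Counter

def last_click_attribution(journeys):
    """Credit each converted journey's booking to its final touchpoint."""
    credit = Counter()
    for journey in journeys:
        # Only journeys that ended in a booking and contain touchpoints earn credit.
        if journey["booked"] and journey["touchpoints"]:
            credit[journey["touchpoints"][-1]] += 1
    return credit

# Hypothetical journeys; touchpoints are listed in chronological order.
journeys = [
    {"purchase_id": 1, "touchpoints": ["GNR", "ICW", "WBF"], "booked": True},
    {"purchase_id": 2, "touchpoints": ["BAN", "WBC"], "booked": True},
    {"purchase_id": 3, "touchpoints": ["GNR", "RTG"], "booked": False},
]

credit = last_click_attribution(journeys)
```

Under this rule, earlier touchpoints in a converted journey receive no credit, which is why the later models use touchpoint counts per journey rather than the attributed touchpoint alone.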

The data set contains information about both the journeys and the demographic characteristics of the consumers. This information includes the type of touchpoints visited (customer- or firm-initiated), whether a journey has led to a conversion, the type of device used to access a contact, and consumer sociodemographic characteristics such as age, income, occupation, social class, education level, household size etc.

The firm-initiated contacts are available only for the focal firm; thus, when the term FIC is used, it refers only to contacts that the focal firm has initiated. An overview of the firm- and customer-initiated contacts can be seen in Table 1.

Table 1 – Variable description

Variable   Type of touchpoint                 Category
AWB        Accommodations website             Customer-initiated contact
APA        Accommodations App                 Customer-initiated contact
SA         Accommodations Search              Customer-initiated contact
ICW        Information / comparison Website   Customer-initiated contact
ICA        Information / comparison App       Customer-initiated contact
ICS        Information / comparison Search    Customer-initiated contact
WBC        Website Competitor                 Customer-initiated contact
APC        App Competitor                     Customer-initiated contact
SC         Search Competitor                  Customer-initiated contact
WBF        Website Focus brand                Customer-initiated contact
SF         Search Focus brand                 Customer-initiated contact
FTW        Flight tickets Website             Customer-initiated contact
FTA        Flight tickets App                 Customer-initiated contact
GNR        Generic search                     Customer-initiated contact
AFF        Affiliates                         Firm-initiated contact (Focus brand)
BAN        Banner                             Firm-initiated contact (Focus brand)
EML        Email                              Firm-initiated contact (Focus brand)
PR         Pre-rolls                          Firm-initiated contact (Focus brand)
RTG        Retargeting                        Firm-initiated contact (Focus brand)

3.2 Research design

It has already been discussed that different consumer segments may not behave in the same way during online journeys. In order to account for this consumer heterogeneity, different segments will be formed. Two segmentation techniques are discussed for this study in order to argue which method is the most appropriate: latent class analysis (LCA) and k-means clustering.

LCA is frequently used to classify consumers based on various underlying or unobservable (latent) variables, by assigning to each individual a probability of belonging to a class (Patterson, Dayton, & Graubard, 2002). According to Vermunt and Magidson (2003), latent class analysis assumes that consumers of the same segment have a common joint probability distribution among variables, which provides the classes and their within-class structure. Although LCA is very effective at identifying segments based on behavioral and socio-demographic consumer characteristics, it has been found not to be the optimal solution when accounting for heterogeneity in preferences (Teichert et al., 2008). Because heterogeneity in preferences is central to the scope of this study, this method will not be used.


Because of this limitation, this method will be combined with a hierarchical clustering algorithm, which is first used to determine the number of clusters that is later used in k-means (Punj & Stewart, 1983). One of the most popular hierarchical methods is Ward’s algorithm, which generates clusters by minimizing within-cluster variance based on squared Euclidean distance (Ward, 1963); this is the procedure used in this research.

The outcomes of cluster analysis will give the different consumer segments and will be used in the later analysis of this paper.
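The first step of this two-step procedure can be sketched in a few lines. The code below is a minimal, unoptimized illustration of Ward's criterion on hypothetical two-dimensional points; the function names and the "largest jump in merge cost" rule for suggesting the number of clusters are assumptions made for the example, not the procedure documented in this thesis.

```python
def ward_merge_costs(points):
    """Agglomerate points with Ward's criterion; return the cost of each merge."""
    # Each cluster is stored as (centroid, size).
    clusters = [(tuple(p), 1) for p in points]
    costs = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                (ci, ni), (cj, nj) = clusters[i], clusters[j]
                sq_dist = sum((a - b) ** 2 for a, b in zip(ci, cj))
                # Increase in within-cluster sum of squares caused by merging i and j.
                cost = ni * nj / (ni + nj) * sq_dist
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        cost, i, j = best
        (ci, ni), (cj, nj) = clusters[i], clusters[j]
        merged = tuple((ni * a + nj * b) / (ni + nj) for a, b in zip(ci, cj))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append((merged, ni + nj))
        costs.append(cost)
    return costs

def suggest_k(costs):
    """Suggest the number of clusters present just before the largest cost jump."""
    n = len(costs) + 1
    jumps = [costs[m] - costs[m - 1] for m in range(1, len(costs))]
    m = jumps.index(max(jumps)) + 1  # index of the first "expensive" merge
    return n - m

# Hypothetical data: three well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
costs = ward_merge_costs(points)
k = suggest_k(costs)
```

The suggested `k` would then be passed to k-means as the number of clusters. The pairwise search makes this sketch cubic in the number of points, so it is only meant to show the logic.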

This study proceeds with hypothesis testing using a binary logistic model that includes moderation (interaction) terms. This model is commonly used in marketing for classification problems with two outcomes, which can be framed as yes/no questions (Leeflang et al., 2015).

Two models will be employed. The first accounts for market behavior: the effect on total booking conversions of firm-initiated contacts, customer-initiated contacts, and the interaction of firm-initiated contacts with the previously identified segments. Therefore, the dependent variable for this model is the probability of (any) booking.

The second binary model studies more specific behavior and accounts for the shift in brand consideration. Since firm-initiated contact information is only available for the focal firm, the dependent variable of this model is the probability of purchasing at the competitor. Negative effects of firm-initiated contacts on the competitor’s booking conversion probability are interpreted as a shift in considerations.


The models are represented below:

Equation 1:

$$P_{book,j} = \frac{1}{1 + \exp\left(-\left(\beta_0 + \beta_1 (Seg_i \times FIC_f) + \beta_2 FIC_f + \beta_3 CIC_a + \beta_4 FirDev_j + \beta_5 LastDev_j + \beta_6 JournDays_j\right)\right)}$$

Where

$P_{book,j}$ = Probability of a journey j making a booking from any firm

$Seg_i \times FIC_f$ = Moderating effect of $FIC_f$ on $Seg_i$, f=1…5

$FIC_f$ = Number of firm-initiated contacts f on journey j, f=1…5

$CIC_a$ = Number of customer-initiated contacts a on journey j, a=1…20

$FirDev_j$ = First device used for the journey j

$LastDev_j$ = Last device used for the journey j

$JournDays_j$ = Duration of journey j in days
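To make the functional form of Equation 1 concrete, the following minimal sketch evaluates the logistic response for a single journey with one FIC type and one CIC type. All coefficient values, the dictionary keys and the numeric coding of segment and device are hypothetical placeholders chosen for illustration; they are not estimates from this study.

```python
import math

def booking_probability(beta, seg, fic, cic, first_dev, last_dev, journey_days):
    """Evaluate the logistic response for one journey (one FIC and one CIC type)."""
    linear = (
        beta["intercept"]
        + beta["seg_x_fic"] * seg * fic      # moderation term Seg_i * FIC_f
        + beta["fic"] * fic
        + beta["cic"] * cic
        + beta["first_dev"] * first_dev
        + beta["last_dev"] * last_dev
        + beta["journey_days"] * journey_days
    )
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical coefficients, for illustration only.
beta = {"intercept": -3.0, "seg_x_fic": 0.2, "fic": 0.3, "cic": 0.05,
        "first_dev": 0.1, "last_dev": 0.4, "journey_days": -0.01}

# Same journey with and without two firm-initiated contacts.
p_without_fic = booking_probability(beta, seg=1, fic=0, cic=10,
                                    first_dev=1, last_dev=1, journey_days=14)
p_with_fic = booking_probability(beta, seg=1, fic=2, cic=10,
                                 first_dev=1, last_dev=1, journey_days=14)
```

With positive hypothetical coefficients on the FIC terms, `p_with_fic` exceeds `p_without_fic`; the sign and size of the real effects are what the hypothesis tests assess.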

Equation 2:

$$P_{comp,j} = \frac{1}{1 + \exp\left(-\left(\beta_0 + \beta_1 (Seg_i \times FIC_f) + (CIC_a \times FIC_f) + \beta_2 FIC_f + \beta_3 CIC_a + \beta_4 FirDev_j + \beta_5 LastDev_j + \beta_6 JournDays_j\right)\right)}$$

Where

$P_{comp,j}$ = Probability of a journey j making a booking from the competitor

$Seg_i \times FIC_f$ = Moderating effect of $FIC_f$ on $Seg_i$, f=1…5

$CIC_a \times FIC_f$ = Moderating effect of $FIC_f$ on $CIC_a$, f=1…5

$FIC_f$ = Number of firm-initiated contacts f on journey j, f=1…5

$CIC_a$ = Number of customer-initiated contacts a on journey j, a=1…20

$FirDev_j$ = First device used for the journey j

$LastDev_j$ = Last device used for the journey j

$JournDays_j$ = Duration of journey j in days
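The reading of a negative coefficient as a shift in considerations follows from the log-odds form of the model. As a short worked step, with $\eta_j$ denoting the linear predictor inside the exponential of Equation 2 and $x_k$ a regressor that enters without an interaction term (both symbols are introduced here for illustration only):

```latex
\ln\frac{P_{comp,j}}{1 - P_{comp,j}} = \eta_j,
\qquad
\frac{\mathrm{odds}(x_k + 1)}{\mathrm{odds}(x_k)} = e^{\beta_k},
\qquad
\beta_k < 0 \;\Rightarrow\; e^{\beta_k} < 1 .
```

A negative coefficient therefore multiplies the odds of booking at the competitor by a factor below one for each additional unit of the regressor. For regressors that also appear in interaction terms, the effect additionally depends on the values of the interacting variables.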

3.3 Plan of analysis

Initially, the dataset will be prepared for the analysis. This procedure entails data cleaning, data visualization, and checks for outliers and missing values. If any of these are found, they will be treated according to the situation.

In the next step, new variables will be created where needed and considered in the models. Before proceeding with segmentation, some descriptive analysis will be performed, accompanied by graphs. A hierarchical (Ward’s) clustering algorithm will be computed, and the analysis will continue with k-means segmentation. The resulting segments will be described and incorporated into the dataset for later use in the analysis.

Outlier checks will be done before computing the two logit models. This will be followed by the calibration of several model options which will be discussed and the ones with the best fit (one for any booking and one for competitor) will be chosen for the hypothesis testing.

The regressions’ results will be assessed and interpreted accordingly. The conclusions will be drawn upon the formed hypothesis and the discussed literature. Finally, managerial implications and recommendations for further research will be presented.

4. Results

4.1 Data manipulation


were missing observations of 1,603 users, apart from the education variable, for which 2,031 users had missing information. Since not all the demographic variables are of interest for this study, the predictive mean matching (PMM) imputation technique was used to account for the missing values. For each missing value, this technique identifies a set of observed cases whose regression-predicted means are closest to the predicted mean of the missing case, and imputes the missing value by a random draw from the observed values in that set. PMM ensures that imputed values are plausible because they are restricted to observed values, and it fits categorical data as well (Heitjan & Little 1991; Schenker & Taylor 1996).
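The matching idea behind PMM can be sketched as follows. This is a simplified illustration with a single numeric predictor and a fixed donor-pool size `k`; both choices, the function name and the example data are assumptions. Full implementations in imputation software typically add further randomness to the regression step, which is left out here.

```python
import random

def pmm_impute(x, y, k=3, seed=0):
    """Impute None entries of y by predictive mean matching on one predictor x."""
    rng = random.Random(seed)
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    n = len(observed)
    mean_x = sum(xi for xi, _ in observed) / n
    mean_y = sum(yi for _, yi in observed) / n
    # Ordinary least squares fit y = a + b * x on the complete cases.
    sxx = sum((xi - mean_x) ** 2 for xi, _ in observed)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in observed)
    b = sxy / sxx
    a = mean_y - b * mean_x
    imputed = []
    for xi, yi in zip(x, y):
        if yi is not None:
            imputed.append(yi)
            continue
        target = a + b * xi
        # Donors: the k observed cases whose predicted value is closest to the target.
        donors = sorted(observed, key=lambda d: abs((a + b * d[0]) - target))[:k]
        # The imputed value is an actually observed value, drawn at random.
        imputed.append(rng.choice(donors)[1])
    return imputed

# Hypothetical example: y is missing for the third and sixth case.
x = [1, 2, 3, 4, 5, 6]
y = [10, 20, None, 40, 50, None]
completed = pmm_impute(x, y)
```

Because every imputed value is copied from a donor, the completed variable can never contain values outside the observed range, which is the plausibility property mentioned above.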

A predictor matrix was calculated to indicate which variables will be used as predictors for imputation of each incomplete variable. More specifically area of residence can be predicted by municipality size, while the variables household size, work position and social class were used to predict gender, age, education level and income.

The dataset contains 43,069 firm-initiated and 2,413,345 customer-initiated contacts. The exploratory analysis has shown that there are a total of 141,065 touchpoints without a specified duration: all 43,069 FICs and 97,996 CICs. These missing values are treated so that the variables can be used in the later analysis.

Initially, the average duration of all CICs is calculated without taking the missing values into account, and it is substituted for the unspecified duration of these touchpoints. Since firm-initiated contacts are closely related to advertising exposure, missing FIC durations are filled with the average duration of all CICs that are related to website/search.
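The two substitution rules can be sketched as below. The record layout, the field names and the set of touchpoint types treated as "website/search" are hypothetical and serve only to illustrate the logic.

```python
def fill_missing_durations(touchpoints, web_search_types):
    """Fill missing CIC durations with the CIC mean and missing FIC durations
    with the mean of website/search CICs."""
    cic_known = [t["duration"] for t in touchpoints
                 if t["kind"] == "CIC" and t["duration"] is not None]
    web_known = [t["duration"] for t in touchpoints
                 if t["kind"] == "CIC" and t["type"] in web_search_types
                 and t["duration"] is not None]
    cic_mean = sum(cic_known) / len(cic_known)
    web_mean = sum(web_known) / len(web_known)
    filled = []
    for t in touchpoints:
        if t["duration"] is None:
            # Copy the record so the input list is left unchanged.
            t = {**t, "duration": cic_mean if t["kind"] == "CIC" else web_mean}
        filled.append(t)
    return filled

# Hypothetical touchpoint records with durations in seconds.
touchpoints = [
    {"kind": "CIC", "type": "website", "duration": 30.0},
    {"kind": "CIC", "type": "search", "duration": 10.0},
    {"kind": "CIC", "type": "app", "duration": 50.0},
    {"kind": "CIC", "type": "app", "duration": None},
    {"kind": "FIC", "type": "banner", "duration": None},
]
filled = fill_missing_durations(touchpoints, web_search_types={"website", "search"})
```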

After treating for all the missing values, new variables were created, which reflect the behavioral characteristics of the users. These variables are: number of touchpoints visited per user, number of journeys per user, total duration of all journeys made by a unique user, average journey duration per user and the average time each user has spent on touchpoints.

4.2 Aggregation Level


purchase behavior drivers. Consequently, user ID would seem the next most logical option to allow for complete information on all individual customers. This aggregation level is used for the segmentation analysis.

However, when it comes to the hypothesis testing this could bring the issue that a unique user can have multiple journeys which could not be reflected on the individual aggregation level.

Aggregating by journey (PurchaseID) seems to be the most suitable option, since purchase behavior is captured for each journey and more valuable information about the conversions is taken into account in the analysis. Furthermore, user characteristics are taken into account through the segmentation analysis that is performed.

4.3 Segmentation

4.3.1 Treating for outliers and extreme values

As discussed, a hierarchical approach is used first, followed by k-means clustering. Before proceeding to the segmentation, however, outliers are detected and removed in order to make the clustering more reliable. Hawkins (1980) defines an outlier as an observation that deviates so much from other observations as to arouse suspicion that it was generated by a different mechanism. Many researchers pay special attention to outliers to ensure the robustness of the variables used. The main reason for removing outliers in this study is that the k-means clustering algorithm is very sensitive to them, as it uses the mean of the cluster data points to find the cluster center (Chawla & Gionis, 2013).

as outliers. These refer to 25 users who have visited more than 4,000 touchpoints and made 1-8 journeys. These values are removed as well, so that they do not interfere with the clustering algorithm’s results.

The analysis continues by examining the average journey duration per user (Appendix A2). The box plot and scatterplots show that one user has spent on average 271,575 seconds (about 3 days) in one journey, which is clearly an outlier. Further investigation shows 6 users with over 100,000 seconds (about 28 hours) per journey, across 1-3 journeys in total; these are also removed from the dataset as extreme values.

Users with a negligible total journey duration are examined next. 84 users have a total journey duration of less than 3 seconds. It is assumed that this is the outcome of accidental exposure, which can interfere with the clustering algorithm’s outcomes, so these users are excluded from the dataset. This results in 9,558 users that are used for the clustering analysis.
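The three exclusion rules mentioned above (more than 4,000 touchpoints, more than 100,000 seconds per journey, less than 3 seconds in total) can be sketched as a simple filter. The field names and the four example users are hypothetical and only illustrate the thresholds.

```python
# Hypothetical user-level records; the field names are assumptions.
users = [
    {"id": "u1", "touchpoints": 150, "avg_journey_s": 2600.0, "total_s": 6500.0},
    {"id": "u2", "touchpoints": 4200, "avg_journey_s": 3100.0, "total_s": 9300.0},
    {"id": "u3", "touchpoints": 12, "avg_journey_s": 271575.0, "total_s": 271575.0},
    {"id": "u4", "touchpoints": 1, "avg_journey_s": 2.0, "total_s": 2.0},
]

def keep_user(user):
    """Return True when the user passes all three exclusion rules."""
    return (
        user["touchpoints"] <= 4000          # not an extreme touchpoint count
        and user["avg_journey_s"] <= 100000  # not an extreme journey duration
        and user["total_s"] >= 3             # not an accidental exposure
    )

kept_ids = [u["id"] for u in users if keep_user(u)]
```

Only the first user passes all three rules in this toy example.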

4.3.2 Clustering analysis

Before proceeding with the analysis, a correlation matrix is computed for the variables that will be used, in order to test whether there are linear associations between them (Appendix B). These variables are region, gender, age, income, education, number of touchpoints visited per user, journeys per user, average journey duration per user and average time spent on each touchpoint. The variable touchpoints per user is found to be correlated with journeys per user (0.578) and average journey duration (0.878); therefore the latter two variables are excluded before proceeding with the clustering analysis. No other correlations are observed in the segmentation dataset; therefore, the clustering analyses follow.
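A correlation screen of this kind can be sketched as below. The data are invented, and the cut-off of 0.5 is an assumption made for illustration only; no explicit threshold is stated in this research.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def correlated_pairs(columns, threshold=0.5):
    """List variable pairs whose absolute correlation reaches the threshold."""
    names = list(columns)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(columns[first], columns[second])
            if abs(r) >= threshold:
                flagged.append((first, second, round(r, 3)))
    return flagged

# Hypothetical user-level variables.
columns = {
    "touchpoints": [10, 20, 30, 40, 50],
    "journeys": [1, 2, 2, 4, 5],
    "age": [40, 25, 60, 33, 51],
}
flagged = correlated_pairs(columns)
```

One variable from each flagged pair would then be dropped before clustering.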

adding one additional cluster does not improve the WSS. In this analysis k-means identified four clusters (Appendix C) based on the Elbow method. The size of the resulting clusters is 2388, 2383, 2396, 2391, and these form segments 1 - 4 respectively.
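The Elbow method can be illustrated with a one-dimensional toy example. This is only a sketch under simplifying assumptions: the segmentation in this research is multivariate, while the code below clusters a single hypothetical variable and uses a deterministic start (evenly spaced values of the sorted data) so that the result is reproducible.

```python
def kmeans_1d(values, k, iterations=100):
    """Lloyd's algorithm on one variable; returns (centers, WSS)."""
    data = sorted(values)
    n = len(data)
    # Deterministic initial centers: evenly spaced points of the sorted data.
    centers = [data[int((i + 0.5) * n / k)] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda j: abs(x - centers[j]))
            clusters[nearest].append(x)
        # Each center moves to the mean of its cluster (kept if the cluster is empty).
        new_centers = [
            sum(c) / len(c) if c else centers[j] for j, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    wss = sum((x - centers[j]) ** 2 for j, c in enumerate(clusters) for x in c)
    return centers, wss

# Hypothetical data with three visible groups.
values = [1, 2, 3, 10, 11, 12, 20, 21, 22]
wss_by_k = {k: kmeans_1d(values, k)[1] for k in range(1, 5)}
```

In this toy data the within-cluster sum of squares drops sharply up to three clusters and barely improves with a fourth, which is the "elbow" that guides the choice of k.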

These four clusters are tested with ANOVA in order to identify whether they differ from one another. The analysis shows that the clusters differ significantly with respect to the number of touchpoints visited, age and education, with p-values very close to zero (p < 2e-16). Income also differs significantly among clusters, with a p-value of 0.0421. However, the average time that a user has spent on a touchpoint is found insignificant. The ANOVA tables can be seen in Appendix D.

An ANOVA test demonstrates whether the results are significant overall, but it does not indicate exactly where the differences lie. To account for this, Tukey’s HSD test is run on the significant results after the ANOVA, to find out which specific cluster means differ from each other. The test compares all possible pairs of means.
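As a small worked check, the F value of a one-way ANOVA is the ratio of the between-cluster mean square to the residual mean square. The sketch below recomputes it from the sums of squares reported for age in Appendix D and enumerates the pairs that Tukey’s HSD compares for four clusters; the function name is hypothetical.

```python
from itertools import combinations

def f_statistic(ss_between, df_between, ss_within, df_within):
    """One-way ANOVA F value from sums of squares and degrees of freedom."""
    mean_sq_between = ss_between / df_between
    mean_sq_within = ss_within / df_within
    return mean_sq_between / mean_sq_within

# Age table in Appendix D: cluster (Df 3, Sum Sq 6868),
# residuals (Df 9554, Sum Sq 2220069).
f_age = f_statistic(6868, 3, 2220069, 9554)

# Tukey's HSD compares every pair of cluster means: k * (k - 1) / 2 pairs.
cluster_pairs = list(combinations([1, 2, 3, 4], 2))
```

With four clusters there are six pairs, which matches the six rows of Table 2.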

Table 2

Variable significance (p adjusted)

Cluster pairs  Touchpoints visited  Age      Gender   Income   Education
2-1            0                    0.01061  0        0.99995  0
3-1            0                    0.46889  0.0001   0.00024  0.0010498
4-1            0                    0.29192  1.3E-06  1.5E-05  0.0180826
3-2            0.9456627            3.3E-05  0        0.00019  0.000009
4-2            0.017334             7.2E-06  0        1.2E-05  0.0000001
4-3            0.0026554            0.98926  0.8089   0.92996  0.8471435

means with the income variable, only the pairs 2-1 and 4-3 do not differ. Finally, when it comes to education, all pair combinations are significantly different except for pair 4-3.

The four segments are analyzed in the following section.

4.3.3 Segment description

The four clusters’ characteristics are depicted in table 3 and the segments are analyzed as follows. The first cluster describes 2,388 users, more of whom are female (1,630) than male (758), with an average age of 51 years. Most people belong to the €40,000 - €67,000 income class and live in the West, followed by the East and South of the Netherlands. Finally, the majority of the users have completed MBO education. Regarding the active characteristics, users of this cluster engage on average in 143 touchpoints and make 2.5 journeys, while the average stay on each touchpoint is 72.34 seconds and the average journey lasts 43 minutes.

The second segment has 1,143 males and 1,240 females, making a total of 2,383 users. The average age is 50 years, while the geographic area of residence is mostly the West, followed by the South. The majority of the users have an income of €40,000 - €67,000, with the second income class being €33,500 - €40,000. Most users of this group have completed MBO education, while the second most common completed education stage is HBO. This segment’s browsing behavior is described by visiting 239.6 touchpoints on average, with 3.6 journeys per user. These users spend almost an hour on each journey and stay on each touchpoint 68.66 seconds on average.

Table 3

Segment  Size                     Age range  Income              Education  Touchpoints visited (avg)  Journeys per user (avg)
2        M: 1143 F: 1240 T: 2383  18-91      €40.000 - €67.000   MBO/ HBO   239.6                      3.6                      1.75 - 68272
3        M: 905 F: 1491 T: 2396   17-89      €12.900 - €67.000   MBO/ HBO   232.9                      3.4                      2.5 - 85676
4        M: 933 F: 1458 T: 2391   17-90      €<12.900 - €67.000  MBO        275                        3.5                      2.5 - 81819

The third cluster has a total of 2,396 users, 905 males and 1,491 females. The average age is 52 years and users of this cluster mostly reside in the West, South and East of the Netherlands. People of this segment have an income spread of €12,900 to €67,000 and have completed MBO and HBO studies. These users visit on average 232.9 touchpoints and have 3.4 journeys per user, while the average time spent on a touchpoint is 69.16 seconds and the average journey duration is 67 minutes.

Finally, in the fourth cluster there are 933 males and 1,458 females, giving a total of 2,391 users with a mean age of 52.15 years. These users are almost equally divided among the geographical areas and income classes, and most have completed MBO education. On average, users of this segment visit 275 touchpoints, stay 69.88 seconds on each and have a mean of 3.5 journeys, with an average duration of 73 minutes per journey.

4.4 Preliminary analysis

days. Short orientations are examined next: there are 3,169 journeys with 1 touchpoint visit and a mean duration of 119.7 seconds (about 2 minutes). Of these, 88 resulted in a booking; therefore these journeys are not considered outliers, since they may reflect either previous journeys or offline behavior.

4.5 Descriptive statistics

In the data set, 12.5% of journeys have led to a booking; more specifically, 11.9% correspond to the competitors and only 0.6% to the focal company. Regarding segment 1, 12.2% of the journeys made by this segment have led to any booking, while for segment 2 the percentage is 12.8%. Segment 3 has the highest percentage of all, with 13.1% of the journeys resulting in bookings. In contrast, segment 4 has the lowest percentage, with 11.8% of its journeys resulting in bookings.

It is worth mentioning that 98.5% of the touchpoints are customer-initiated, while only 1.5% are firm-initiated. On average in each journey 73.9 touchpoints are visited while each journey lasts on average 1 hour and 9 minutes. The day range is quite broad with the journeys lasting up to 519 days.

4.6 Checking logistic regression assumptions

To begin with, one of the main assumptions of logistic regression is the appropriate form of the outcome variable. Binary logistic regression requires the dependent variable to be binary, meaning that it should have the form of 0-1. This is the case for both logit models that will be tested in this research, with 0 reflecting no conversion and 1 reflecting conversion.

The absence of multicollinearity is probably one of the most important assumptions of the logit models: independent variables should not be too highly correlated with each other. To test this assumption, VIF scores are used; a VIF value above 5 indicates multicollinearity, in which case the assumption is violated. First, VIF scores were computed for the independent variables of both models (Appendix E), with segment 1 taken as the reference category. Extremely high VIF values are observed for the FIC variables BAN, EML, PR and RTG, with values of 37.7, 28.7, 187.7 and 94.2 respectively, as well as for their interaction terms. A high VIF score is a common phenomenon when interaction terms are used. A remedy for multicollinearity is mean-centering the problematic variables. However, after mean-centering the problematic variables and checking the VIF scores again, they remain exactly the same. Another explanation is therefore explored: according to Frøslie et al. (2010), high VIF scores can arise when categorical variables are dummy-coded in the regression model. Indeed, when the reference category is changed from segment 1 to segment 4, the extreme values drop substantially. The new scores exceed the VIF cut-off point of 5 only for banner advertising (6.06) and for the total journey duration variable (5.21). These variables are dropped, as well as the interaction effects of banner advertising.

In the second model, the variables AFF, BAN, Duration and interaction terms WB*RTG, IWB*RTG, and WBF*RTG are found correlated with VIF values of 5.22, 5.17, 5.58, 5.54, 5.17 and 5.35 respectively. These are removed before proceeding with the analysis.

Finally, logistic regressions require a large sample size. As a rule of thumb, at least 10 observations per independent variable are needed. The sample contains 27,994 observations for 37 independent variables (interaction terms included); thus, this assumption is not violated.
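Two of the checks above reduce to simple arithmetic. A VIF score relates to the R-squared obtained when one predictor is regressed on all the others through VIF = 1 / (1 - R²), so the cut-off of 5 corresponds to an R² of 0.8. The sketch below, with hypothetical function names, shows this relation together with the observations-per-variable rule of thumb for the sample reported here.

```python
def vif_from_r2(r_squared):
    """VIF of a predictor given the R-squared of regressing it on the others."""
    return 1.0 / (1.0 - r_squared)

def r2_from_vif(vif):
    """Share of a predictor's variance explained by the other predictors."""
    return 1.0 - 1.0 / vif

# The cut-off of 5 means 80% of the variance is explained by other predictors.
r2_at_cutoff = r2_from_vif(5.0)

# The highest reported score (187.7) implies an R-squared above 0.99.
r2_highest = r2_from_vif(187.7)

# Rule of thumb: at least 10 observations per independent variable.
observations_per_variable = 27994 / 37
sample_size_ok = observations_per_variable >= 10
```

With roughly 757 observations per independent variable, the sample is far above the minimum.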

4.7 Model selection

based on the following criteria: (1) AIC (Akaike Information Criterion), (2) confusion matrix, (3) McFadden’s pseudo R2 and (4) TDL (Top Decile Lift).

Information criteria measure a model’s quality by penalizing more complex models for the additional variables used. There are several such criteria; this research uses the most common one, Akaike’s. The model with the lowest AIC value has the better fit. The confusion matrix is very common for classification problems, since it gives the prediction accuracy, that is, the percentage of observations predicted correctly. Other outputs of this matrix are the percentages of correctly predicted cases for 0 and 1, that is, no bookings and bookings. Pseudo R2 measures look like the common R2, ranging from 0 to 1. Although some never reach 0 or 1, higher values indicate better model fit, but they cannot be interpreted as the OLS R-squared. Top-decile lift is a metric that expresses how the 10% of customers with the highest model predictions compare to the overall dataset. A top-decile lift of 1 is expected for a random model, and the higher the value, the better the model. Tables 4 and 5 give the model fit for any booking and for booking from the competitor.
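The fit criteria can be sketched from their usual definitions: AIC = 2k - 2·LL, McFadden’s pseudo R2 = 1 - LL(model) / LL(null), and top-decile lift as the booking rate in the top 10% of predictions divided by the overall booking rate. The outcomes and predicted probabilities below are invented for illustration, and the parameter count of 3 is an assumption.

```python
from math import log

def log_likelihood(y, p):
    """Bernoulli log-likelihood of outcomes y under predicted probabilities p."""
    return sum(yi * log(pi) + (1 - yi) * log(1 - pi) for yi, pi in zip(y, p))

def aic(ll, n_params):
    """Akaike Information Criterion: lower values indicate a better fit."""
    return 2 * n_params - 2 * ll

def mcfadden_r2(ll_model, ll_null):
    """McFadden's pseudo R2 relative to an intercept-only model."""
    return 1 - ll_model / ll_null

def top_decile_lift(y, p):
    """Booking rate among the top 10% of predictions over the overall rate."""
    ranked = sorted(zip(p, y), reverse=True)
    top_n = max(1, len(y) // 10)
    top_rate = sum(yi for _, yi in ranked[:top_n]) / top_n
    return top_rate / (sum(y) / len(y))

# Hypothetical journeys: 1 = booking, 0 = no booking.
y = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
p = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.3, 0.2, 0.2,
     0.2, 0.2, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]

base_rate = sum(y) / len(y)
ll_model = log_likelihood(y, p)
ll_null = log_likelihood(y, [base_rate] * len(y))

model_aic = aic(ll_model, n_params=3)
pseudo_r2 = mcfadden_r2(ll_model, ll_null)
tdl = top_decile_lift(y, p)
```

In this toy example both journeys in the top decile are bookings while the overall booking rate is 20%, so the lift is well above the value of 1 expected for a random model.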

Table 4 – Model comparison booking any

Models Accuracy Zero -

This research uses a step-wise approach, starting from a simple model and moving to a more complex one. All the model outputs can be seen in Appendix F.

Initially, the models for booking any are discussed. Model 1 contains only the independent variables (touchpoints and segments), without interactions and without control variables. In model 2 the interaction effects of segments and firm-initiated contacts are added and, as can be seen, the AIC rises. In models 3 and 4 the first device and the journey duration in days are added respectively. Insignificant variables are removed from model 4, resulting in model 5. Finally, in model 6 the control variable of the last device of the journey is added.

Comparing the fit of the 6 models, it can be seen that model 6 has the highest accuracy and the highest percentage of correctly predicted ones. Although the pseudo-R2 values of models 4 and 5 are the highest, model 6’s AIC value is considerably lower. Considering that AIC penalizes the model for each variable added, model 6 is preferred for the hypothesis testing. An ANOVA analysis and a likelihood ratio test are performed in order to test whether the chosen model performs significantly better than a null one (a model with only the intercept). Both give a significant value of p < 2.2e-16; therefore the model is better than the null and the hypotheses will be tested on it (Appendix F1).

Next, the model fit for the competitor’s booking is discussed. The same approach is followed, starting with a simple model and adding variables to arrive at a more complex one.

As in the previous case, model 1 includes only the independent variables, while in model 2 the interaction terms FICi*Segi are added. In model 3 the interaction terms CICi*FICi are added. In models 4 and 5 the control variables are added, while insignificant variables are removed, resulting in model 6.

From table 5 it can be seen that model 6 has the highest accuracy of all models with 88.06%, correctly predicting 88.72% of the non-bookings and 47.87% of the bookings. Compared to model 5 it has a lower TDL and pseudo-R2; however, it has the lowest AIC value of all models. As in the previous case, the lowest AIC value leads to choosing model 6 for the hypothesis testing. The ANOVA and likelihood ratio test showed that this model differs significantly from a null one; thus this choice is confirmed.

Table 5 – Model comparison booking competitor

Models   Accuracy  Zero predicted values  One predicted values  McFadden (pseudo R2)  TDL    AIC
Model 1  0.879     0.88587                0.42353               0.07416449            3.267  15575
Model 2  0.879     0.88601                0.42529               0.07492714            3.282  15580
Model 3  0.8792    0.8863                 0.4333                0.07655871            3.253  15579
Model 4  0.8783    0.88593                0.40659               0.08502179            3.282  15441
Model 5  0.8803    0.88710                0.46809               0.08997662            3.765  15360
Model 6  0.8806    0.88728                0.47872               0.08908668            3.736  15331

4.8 Hypothesis H1, H2, H3a Assessment

Before proceeding to the interpretation of results, odds ratios and marginal effects are calculated. An odds ratio greater than 1 describes a positive relationship. In this model the odds ratio of email advertising is 1.052; since email is measured as a number of touchpoints, this means that each additional email touchpoint increases the odds of making a booking by 5.2%. The interpretation changes slightly for variables with reference categories. For example, for the variable first device used, the reference category is fixed device and the odds ratio has a value less than 1 (negative relationship). This odds ratio can be interpreted as follows: the odds of making a booking are 24% lower when a fixed device is used rather than a mobile device.
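The conversion from a logit coefficient to an odds ratio is OR = exp(β), and the percentage change in the odds is (exp(β) - 1) · 100. The sketch below applies this to two coefficients reported in Table 7 (email and the interaction of retargeting with segment 3). The marginal-effect function uses the common approximation β · p · (1 - p) at a given probability; this is an assumption for illustration, since the exact method used to obtain the reported marginal effects is not stated, and the probability of 0.10 is hypothetical.

```python
from math import exp

def odds_ratio(beta):
    """Odds ratio implied by a logit coefficient."""
    return exp(beta)

def pct_change_in_odds(beta):
    """Percentage change in the odds for a one-unit increase."""
    return (exp(beta) - 1) * 100

def marginal_effect_at(beta, probability):
    """Approximate change in probability for a one-unit increase,
    evaluated at a given baseline probability (logit derivative)."""
    return beta * probability * (1 - probability)

# Coefficients taken from Table 7 (competitor booking model).
or_email = odds_ratio(0.0372)
pct_rtg_seg3 = pct_change_in_odds(-0.005149)

# Hypothetical baseline booking probability of 10%.
me_email = marginal_effect_at(0.0372, 0.10)
```

A coefficient of 0.0372 corresponds to an odds ratio of about 1.04, and a coefficient of -0.005149 to a decrease in the odds of about 0.5%, in line with the interpretation given in section 4.9.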

Table 6 – Booking any

Estimate  Std. Error  z value  Pr(>|z|)  Sig  Odds ratio  Marginal effects

The resulting model confirms the first hypothesis: the effect of customer-initiated contacts on booking conversions is positive. Of the CICs included in the model, only the flight tickets app is found insignificant. All website-related touchpoints are found significant with p < 2e-16. More specifically,

Firm-initiated contacts are found significant, apart from pre-roll advertisements. One could assume that this is an outcome of the interaction effects; however, all the interaction terms of pre-rolls with segments are also insignificant. The other two firm-initiated contacts, email and retargeting, have a significant and positive effect on booking conversions. This finding confirms the second hypothesis.

Regarding segments, only segment 1 is significant; however, the interaction effects used in the model may interfere with the other segments. The β coefficient of segment 1 is negative, indicating a negative effect on any booking conversions: the probability of purchasing a trip decreases by 0.020713 for segment 1 compared to segment 4.

Looking at the absolute values of the β coefficients, the strongest impact on booking conversion comes from the first device used (β = 1.1270138); according to the marginal effects, the probability of making a booking increases by 0.084734 when the first device used is fixed rather than mobile. Segment 1 follows (β = -0.2197113); according to the marginal effects, the probability of making any booking decreases by 0.020713 for segment 1 compared to segment 4.

The control variable of journey duration in days has a negative effect on booking conversions; with respect to marginal effects, the probability of making a booking decreases by 0.026627 for each additional day the journey lasts. It is worth noting that the control variable of last device used has a negative effect on conversions when it is fixed rather than mobile. According to the marginal effects, the change in the probability of making a booking when using a fixed device compared to mobile is rather small (-0.00039371). It is assumed that a booking has been placed right after the last journey day, and the variable last device is the device used on the last journey day.

advertising on segment 2 has a negative effect on booking conversions compared to segment 4, and the probability of making a conversion decreases by 0.012837 for each additional touchpoint. Segment 3 also has a negative effect on purchases when interacting with email touchpoints (β = -0.0440652, ME = -0.0044265) compared to segment 4. These findings partially confirm hypothesis H3a: at the total market level, segments respond differently to advertising. It should be noted, though, that the advertising information is only available for the focal firm.

4.9 Hypothesis H3b, H4 Assessment

Similarly, the results of the model which considers the probability of booking at the competitor are interpreted based on the coefficients, odds ratios and marginal effects. The model results are shown in table 7. This model also includes interaction terms of FICs on CICs.

As expected from the previously discussed findings at the market level, all the customer-initiated contacts are found significant except for competitor’s search. This is quite odd; however, it is counterbalanced by the significant and positive effect of the touchpoint competitor website visit (β = 0.003562, OR = 1.00356864, ME = 0.00033537). It is worth mentioning that, as one can logically assume, a website visit of the focal brand has a negative and significant effect on the competitor’s bookings (β = -0.002215); according to the marginal effects, the probability of making a purchase at the competitor decreases by 0.00020851 when one more visit to the focal brand’s website is made. Visits to an accommodation website as well as to accommodation apps have a positive effect on bookings, increasing the probabilities by ME = 0.00062392 and ME = 0.00012424 respectively. Visits to price comparison sites increase the probability of purchasing a trip from the competitor by ME = 0.0074316. In contrast, customers typing generic terms are associated with a 0.00059108 decrease in the probability of booking at the competitor.

Table 7 – Booking competitor

Variable     Estimate    Std. Error  z value  Pr(>|z|)   Sign  Odds ratio  Marginal effects
(Intercept)  -2.959      7.92E-02    -7.373   < 2e-16    ***   0.05184601
APA          0.00132     5.74E-04    2.3      0.021443   *     1.00132051  0.00012424
SA           0.02053     1.00E-02    2.051    0.040291   *     1.02074176  0.0019327
ICS          0.07894     2.79E-02    2.825    0.004729   **    1.0821376   0.0074316
WBC          0.003562    2.90E-04    12.307   < 2e-16    ***   1.00356864  0.00033537
SC           0.02292     2.51E-02    0.915    0.360154         1.02318565  0.0021579
WBF          -0.002215   5.85E-04    -3.79    0.000151   ***   0.99778763  -0.00020851
FTW          0.003217    8.64E-04    3.723    0.000197   ***   1.00322238  0.00030288
GNR          -0.006279   2.81E-03    -2.236   0.025363   *     0.99374114  -0.00059108
EML          0.0372      9.27E-03    4.011    6.05E-05   ***   1.04E+00    0.0035018
PR           -0.005203   4.06E-02    -0.128   0.898034         0.99481021  -0.00048986
RTG          0.00635     2.36E-03    2.697    0.007003   **    1.00636977  0.00059777
Seg1         -0.2255     7.75E-02    -2.908   0.003641   **    0.79814132  -0.01986
Seg2         0.03916     5.66E-02    0.692    0.488984         1.03993246  0.0037109
Seg3         -0.01224    5.62E-02    -0.218   0.827578         0.98782979  -0.0011505
FirstDevice  1.106       1.01E-01    10.948   < 2e-16    ***   3.0233763   0.078932
LastDevice   -0.2984     7.33E-02    -4.069   4.73E-05   ***   7.42E-01    -0.029904
Journeydays  -0.003901   4.51E-04    -8.644   0.0000473  ***   0.99610631  -0.00036728
RTG*Seg1     0.00316     5.97E-03    0.529    5.97E-01         1.00E+00    0.00029746
RTG*Seg2     0.003335    4.33E-03    0.771    0.440736         1.00334029  0.00031394
RTG*Seg3     -0.005149   2.93E-03    -1.759   0.078661   .     9.95E-01    -0.00048476
WBA*EML      -.0001576   4.83E-05    -3.265   0.001095   **    0.99984239  -.000014839
WBC*PR       -0.00146    5.15E-04    -2.833   0.004616   **    9.99E-01    -0.00013741
WBF*PR       0.001228    5.85E-04    2.101    0.035632   *     1.00122888  0.00011562
GNR*RTG      -0.003321   2.63E-03    -1.261   0.20726          0.99678763  -0.00030719
SC*EML       -3.66E-02   8.55E-02    -0.428   0.668416         0.96793121  -0.3591764

marginal effects for email and retargeting, the probability of converting at the competitor increases by 0.0035018 and 0.00059777 respectively.

All three control variables have the same effects as previously discussed for the total market behavior: a fixed first device (β = 1.106, p < 2e-16, OR = 3.0233763, ME = 0.078932) has a positive effect compared to mobile, while last device used and journey days have negative effects on booking conversions (β = -0.2984, p = 0.0000473, OR = 0.7420308, ME = -0.029904 and β = -0.003901, p < 2e-16, OR = 0.99610631, ME = -0.00036728 respectively).

Next, the effects of segments on the competitor’s booking are discussed. Segments 1 and 2 remain insignificant when they come across retargeting ads (p > 0.1). Segment 3 is negative (-0.005149) and significant (p < 0.1) when moderated by retargeting, compared to segment 4. The odds of segment 3 versus segment 4 making a conversion decrease by 0.5% for each additional retargeting touchpoint, while the probability decreases by 0.00048476.

This partially rejects the hypothesis (H3b) that segments behave differently when they are exposed to advertising at the company level, since only segment 3 compared to segment 4 has been found significant.

The interpretation of the results continues with the moderating effects of FICs on CICs. The moderating term of emails on accommodation websites has a significant and negative effect on booking conversions (β = -0.0001576). In terms of marginal effects, for each additional email touchpoint, the effect of the accommodation website decreases the probability of booking by 0.000014839. Accommodation websites could fall under generic search, since they are platforms that gather all travel companies (e.g. Booking.com, hotelkamerveiling.nl) without comparing their prices.

This confirms the hypothesis (H4h) of the moderating effect of email on generic search.

(β=0.001228) effect of pre-rolls on focal brand’s website, which increases booking conversion probability from the competitor by 0.00011562.

This results in the rejection of hypothesis H4e, since the interaction has a positive effect on the competitor’s booking conversions, although both touchpoints belong to the focal brand.

In addition, the moderation of retargeting on generic search is insignificant (p > 0.1). The same holds for the interaction of email advertising with competitor’s search (p > 0.1). However, it is worth noting that both interaction effects are negative, although insignificant. All the remaining interaction effects of FICs on CICs were found insignificant. Therefore, the remaining hypotheses are rejected.

The interaction terms will be discussed with graphical representations in the following section.

4.10 Graphical representation of the moderating effects

5. Discussion

This study tried to shed light on four main hypotheses by considering both market-level and company-level behavior. The first and second hypotheses referred to the effects of customer-initiated and firm-initiated contacts on booking conversions, while the third discussed the effect of advertising on the consumer segments. Finally, the fourth hypothesis and its sub-hypotheses were employed to test whether shifts in considerations can be interpreted as an outcome of advertising. Furthermore, the study tried to segment consumers based not only on demographics but also on journey characteristics.

5.1 Effect of CICs and FICs on booking conversions

Most customer-initiated contacts were found to have a positive effect on booking conversions both at market and at company level, confirming the findings of previous research (Wiesel et al., 2011; De Haan et al., 2016; Keaveney & Parthasarathy, 2001; etc.). In detail, with β_a denoting the coefficient in the any-booking model and β_c the coefficient in the competitor-booking model, accommodation website (β_a = 0.0065725 and β_c = 0.006627), accommodation app (β_a = 0.0025938 and β_c = 0.00132), information/comparison search (β_a = 0.0609949 and β_c = 0.07894), flight ticket website (β_a = 0.0028325 and β_c = 0.003217) and competitor’s website (β_a = 0.0024923 and β_c = 0.003562) had significant and positive effects on booking conversions both at market and at company level. The focal brand’s website, as expected, had a positive effect on any booking, though a negative effect on the competitor’s booking (β_a = 0.002832 and β_c = -0.002215).

App search was found significant only at company level with β = 0.02053, confirming the finding of Murphy et al. (2016) that consumers use their mobile devices to search for more information. Lastly, generic search had a negative effect on the competitor’s purchases (β = -0.006279), confirming that when consumers are in the consideration stage, they type specific branded keywords in order to evaluate alternatives (Yang & Ghose, 2010). The rest of the customer-initiated touchpoints had insignificant effects and were excluded from the model analysis.

The variables were not considered in the analyses. Email advertising and retargeting had significant and positive effects on bookings in both models. Both email advertising and retargeting take past behavior into account: the user provides the firm with an email address and, in the case of retargeting, ads are shown to users who have already visited the website (www.google.com). The effect of pre-rolls on booking conversions was found insignificant; however, this variable was used for moderating effects, which could be an explanation. Indeed, in the first calibrated models for both cases, which did not account for moderation, pre-rolls were found significant, with negative effects (β_a = -0.1005 and β_c = -0.1448152) on booking conversions. The interaction effects are discussed later in this chapter.

5.2 Moderating effects of firm-initiated contacts on segments

Few interaction effects are found significant in either model. With respect to the probability of making any booking, the moderating effects on segments 2 and 3 were significant, although with a negative effect on the probability of booking conversions (ME = -0.012837 and ME = -0.0044265, respectively) compared to the reference category of segment 4. Segment 2 consists of almost equal numbers of males and females of around 50 years of age, with higher educational levels (MBO/HBO), visiting 239.6 touchpoints on average. Segment 3 consists mostly of females, with a mean age of 52 years, visiting 232.9 touchpoints on average. These two are compared to the reference category of segment 4, whose members are mostly female, with an average age just above 52 years, visiting 275 touchpoints on average. Email advertising is not bringing conversions in these two segments; future research should therefore invest more in understanding the motives of these segments, and firms should probably not address them with this type of advertising. It is interesting, though, that segment 2 reacts positively to retargeting advertisements compared to segment 4, with an increase in booking conversion probability of 0.0010295. Firms that invest in retargeting ads can research this segment’s behavior and target it more. The other segments do not have significant results on booking conversions.

considerations of this segment can be shifted. Future research on this segment could identify if it can develop loyalty patterns for the focal brand leveraging advertising towards it.

A table with the hypothesis support overview can be found below.

5.3 Moderating effects of firm-initiated on customer-initiated contacts

Shifting considerations are studied through the moderating effect of firm-initiated on customer-initiated contacts. The hypotheses were drawn from a careful literature review; however, it should be noted that defining the decision stage that customers are in is difficult. This study approached this by taking as the dependent variable the purchase probability on a competitor’s website, given that firm-initiated contacts are available only for the focal firm. The interaction effect of email with a visit to an accommodation website is found significant, with a negative effect on the purchase probability on the competitor’s website. Accommodation websites aggregate all the alternatives in a single platform where the user can easily choose among them. They differ from information/comparison sites in the sense that the prices of the alternatives remain the same across platforms and are not compared with other listings. It can be inferred that when journeys include such CICs, customers are actively evaluating which option is best for them, and email advertising seems to influence their decisions, steering them away from purchasing from the competitors. As discussed, email advertising is an already targeted advertising medium, because consumers have to willingly provide their email addresses to the firm. Leveraging this, firms can achieve value by entering the consideration sets of consumers and even changing their opinion.

5.5 Implications and future research

This research contributes to the consumer journey literature by providing insights into consideration change. It discusses the effects of firm-initiated contacts both on customer-initiated contacts and on consumer segments. It also confirms previous studies’ findings about the effects of FICs and CICs on booking conversions in the travel sector. In addition, this study provides a segmentation proposal for consumers based not only on socio-demographics but also on active journey behavior. This is addressed by considering interaction terms for the effect of FICs on segments at both market level and company level. It also implements interaction effects of FICs on CICs at company level in order to study the shifts in consideration sets. Managers should leverage retargeting towards segments 2 and 3, but also be careful when reaching these segments with email advertising. Furthermore, in order to achieve greater booking conversions, they should take into account the changes in consumers’ considerations throughout their purchase journeys. That means utilizing email advertising and pre-rolls at the right time, for the right customers (www.thinkwithgoogle.com).

5.6 Limitations and future research

This section discusses the study’s limitations and provides suggestions for future research. First, not all customer-initiated and firm-initiated contacts are studied, since many of them were found insignificant. Further interactions could be studied as well if a larger dataset were employed. The vast majority of the interaction effects are found insignificant; only 3 out of 8 hypotheses regarding these interaction terms could be interpreted. Therefore, further research is needed to establish whether the interaction terms actually contribute to booking probabilities. Regarding segmentation, future research could consider more active consumer characteristics, since here only one was not found to be correlated and could be used for the clustering analysis.

In addition, more investigation could be done separately addressing the effect of banners and affiliates on consumer segments as well as on customer-initiated contacts.

Appendix

A. Outlier treatment

1. Touchpoints per user


B. Correlation plots


D. ANOVA tests

Touchpoints per user
           Df    Sum Sq    Mean Sq  F value  Pr(>F)    Sign
cluster    3     2.27E+07  7574742  43.63    <2e-16    ***
Residuals  9554  1.66E+09  173606

Age
           Df    Sum Sq    Mean Sq  F value  Pr(>F)    Sign
cluster    3     6868      2289.4   9.852    1.75E-06  ***
Residuals  9554  2220069   232.4

Gender
           Df    Sum Sq    Mean Sq  F value  Pr(>F)    Sign
cluster    3     32.1      10.693   45.52    <2e-16    ***
Residuals  9554  2244.3    0.235

Income
           Df    Sum Sq    Mean Sq  F value  Pr(>F)    Sign
cluster    3     184       61.35    13.21    1.32E-08  ***
Residuals  9554  44362     4.64
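The columns of the ANOVA tables above are linked by simple arithmetic: each mean square is the sum of squares divided by its degrees of freedom, and the F value is the ratio of the cluster mean square to the residual mean square. The following Python sketch recomputes the F values from the reported sums of squares for Age and Income. It is only an illustration of that relationship, not the code used in the thesis; the function name `anova_f` is hypothetical.

```python
def anova_f(ss_between, df_between, ss_within, df_within):
    """Return (mean square between, mean square within, F value)
    for a one-way ANOVA, given sums of squares and degrees of freedom."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Sums of squares and degrees of freedom copied from the Age and Income tables.
age = anova_f(6868, 3, 2220069, 9554)
income = anova_f(184, 3, 44362, 9554)

print("Age:    Mean Sq = %.1f, %.1f, F = %.3f" % age)
print("Income: Mean Sq = %.2f, %.2f, F = %.2f" % income)
```

Up to rounding of the printed sums of squares, the recomputed values agree with the reported F values of 9.852 for Age and 13.21 for Income.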

Education Df Sum Sq Mean Sq F value Pr(>F)


E. Multicollinearity checks

Income
           Df    Sum Sq    Mean Sq  F value  Pr(>F)    Sign
cluster    3     16        5.22     2.734    0.0421    *
Residuals  9554  18243     1.909

Variables  VIF scores  Variables    VIF scores  Variables  VIF scores
AWB        2.231848    RTG          3.540885    PR*Seg3    1.247306
APA        1.073579    Seg 2        1.471627    RTG*Seg1   1.284991
SA         1.194793    Seg 3        1.479187    RTG*Seg2   1.582154
ICW        1.408203    Seg1         1.256076    RTG*Seg3   3.38927
ICA        1.14846     FirstDevice  1.763677    WB*EML     2.282521
ICS        1.186706    LastDevice   1.860147    WB*RTG     5.540188
WBC        2.623962    TotalDays    1.985355    IWB*EML    2.648099
APC        1.025976    Duration     5.584231    IWB*RTG    5.167192
SC         1.29967     AFF*Se1      2.316017    WBC*EML    6.69879
WBF        1.947219    AFF*Seg2     1.686089    WBC*RTG    3.143806
SF         1.184992    AFF*Seg3     3.379321    SRC*EML    2.934514
FTW        1.327276    BAN*Se1      1.055191    SRC*RTG    1.966429
FTA        1.085232    BAN*Seg2     1.504754    WBF*PR     2.542299
FTS        1.261986    BAN*Seg3     4.678336    WBF*RTG    5.351276
GNR        1.65077     EML*Se1      1.245647    SRF*EML    1.561245
AFF        5.227376    EML*Seg2     1.46833     SRF*RTG    1.75821
BAN        5.170663    EML*Seg3     2.657616    GNR*EML    2.713688
EML        2.516792    PR*Seg1      1.140537    GNR*RTG    2.46757
PR         1.547914    PR*Seg2      1.242861    WB*PR      2.407196
                                                WBC*PR     3.048059
                                                GNR*PR     1.567795

Variables  Ref. cat.: Segment 1  Ref. cat.: Segment 4  Variables            Ref. cat.: Segment 1  Ref. cat.: Segment 4
WB         2.089268              2.089268              Seg 3                2.739728              1.484984
AP         1.073512              1.073512              Seg 4 / Seg1         2.686582              1.265256
SR         1.190165              1.190165              FirstDevice          1.763411              1.763411
IWB        1.34554               1.34554               LastDevice           1.855789              1.855789
IAP        1.147985              1.147985              TotalDays            1.958636              1.958636
ISR        1.185134              1.185134              Duration             5.21442               5.21442
WBC        2.329169              2.329169              AFF*Se4 / T18*Se1    1.89465               2.531505
APC        1.023933              1.023933              AFF*Seg2             1.55662               1.622949
SRC        1.261159              1.261159              AFF*Seg3             2.80154               3.029434
WBF        1.392089              1.392089              BAN*Se4 / T19*Se1    7.492599              1.188285
SRF        1.097262              1.097262              BAN*Seg2             4.987835              1.670402
FLW        1.320299              1.320299              BAN*Seg3             27.42197              5.278571
FLA        1.082216              1.082216              EML*Se4 / T20*Se1    9.626806              1.123504
FLS        1.261167              1.261167              EML*Seg2             4.450174              1.537084
GNR        1.575                 1.575                 EML*Seg3             16.807741             2.855723
AFF        4.398728              4.962375              PR*Seg4 / PR*Seg4    77.712966             1.224501
BAN        37.724883             6.050536              PR*Seg2              47.147143             1.650334
EML        28.663749             3.378478              PR*Seg3              65.776319             1.929884
PR         187.723405            2.551427              RTG*Seg4 / RTG*Seg1  34.024435             1.043909
RTG        94.230275             2.961886              RTG*Seg2             11.970362             1.474823
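For readers unfamiliar with the statistic reported in this appendix, the variance inflation factor of predictor j is VIF_j = 1 / (1 - R_j^2), where R_j^2 is obtained by regressing predictor j on all other predictors. Equivalently, the VIFs are the diagonal elements of the inverse of the predictors' correlation matrix. The Python sketch below illustrates this second formulation on a small toy dataset. It is a minimal illustration under these standard definitions, not the procedure used in the thesis; the function names and the toy values are hypothetical.

```python
from math import sqrt


def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long numeric columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [sqrt(sum(v * v for v in c)) for c in centered]
    k = len(columns)
    return [
        [
            sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
            for j in range(k)
        ]
        for i in range(k)
    ]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    aug = [
        list(row) + [1.0 if i == j else 0.0 for j in range(k)]
        for i, row in enumerate(matrix)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def vif_scores(columns):
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Toy predictors (hypothetical values) with a pairwise correlation of 0.8.
a = [1, 2, 3, 4, 5]
b = [2, 1, 4, 3, 5]
print(vif_scores([a, b]))
```

With two predictors, the VIF reduces to 1 / (1 - r^2), so a correlation of 0.8 gives a score of about 2.78 for both. Scores that grow as large as those reported for some interaction terms under reference category Segment 1 correspond to R_j^2 values very close to 1.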
