
Segmenting customers on their geodemographic characteristics.

How do these segments influence customer-initiated touchpoints and purchase decisions?

Delina Post


University of Groningen

Faculty of Economics & Business

Master Thesis Marketing Intelligence

12th of January, 2020

Jozef Israelsstraat 83, 9718 GG Groningen

0611649325

d.post.8@student.rug.nl

Student number: S2937034

Supervisor: P. S. van Eck


Management Summary

Online shopping has increased considerably over the last few years. Innovations offer customers many new opportunities to encounter a shop online and eventually decide whether or not to make a purchase. These opportunities can be referred to as touchpoints. When the customer decides which route to take towards a purchase, they can be referred to as customer-initiated touchpoints. The way in which these touchpoints influence the purchase decision is expected to differ between touchpoints, and this difference may be visible, or even magnified, between customer segments. Evidence exists for these effects for certain demographics, whereas geography has been ignored. It has even been stated that, due to the increase in online behaviour, geography is becoming less relevant. This research paper investigates whether this is really the case, or whether companies should not be too quick to ignore geographic characteristics in their competitive strategies. This research therefore investigates: ‘Whether geographic segments can be identified within the customer base of the travel agency and if these segments differ in the way they are influenced by customer-initiated touchpoints in making the decision to make a purchase or not’.

The dataset used consists of survey data from 9678 customers. It contains information about their demographics and their customer journey. A latent class cluster analysis was performed on this data to identify and profile several distinct segments, so that these could be used as a moderator in a negative binomial regression analysis in which the type of touchpoint was the independent variable and the total number of purchases the dependent count variable.

Three segments within a customer base from the Netherlands have been created, where the customers from segment 1 are the big families in the north, segment 2 the small families from the east and south, and segment 3 the childless singles and partners from the west. It could be concluded that geographic characteristics are indeed a good clustering concept within the


the way they influence the purchase decision of the individual customers and that this difference in influence is also visible between the three segments. It could therefore be concluded that the way a customer reacts to a certain customer-initiated touchpoint depends on their residence, also referred to as geodemographic characteristics. The residence could mainly be explained by the region they live in and their family composition.

Keywords: Latent Class Cluster Analysis; Customer-initiated touchpoint; Purchase Stage;


Acknowledgments

By completing this thesis, I am completing the last step in achieving my master’s degree in Marketing at the University of Groningen. While writing this research paper, I got to use the theoretical and analytical knowledge I gained during the last 1.5 years. It was interesting to see how this knowledge comes together and can be used to gain insights within this theme. I enjoyed working on this specific theme since it relates to my personal interest in consumer behaviour. It is interesting to see how customers differ from each other, but also relate to each other through personal characteristics or by showing the same behaviour.

I would like to take this opportunity to thank my supervisor Peter van Eck. I enjoyed working with him and tapping into his knowledge. He showed his interest in the subject and answered every question in clear and effective meetings. Offering help with the steps after graduating shows his engagement with his students. For the opportunities offered with this subject, the help and the data, I would like to give him many thanks.

Furthermore, I would also like to thank my fellow students that helped by answering questions and through talking about our experiences. Last, but not least, I would like to thank my parents, my brother and friends for their support during these last few busy months.

I hope you will enjoy reading this research paper and find some interesting insights.

Kind regards, Delina Post


Table of Contents

Management Summary
Acknowledgements
Table of Contents
Chapter 1. Introduction
Chapter 2. Literature Review
    Customer Journey
    Purchase Decision
    Channel Behaviour
    Customer-initiated Touchpoints
    Segments as Moderators
    Geographic characteristics
    Demographic characteristics
    Conceptual Model
Chapter 3. Method
    Data collection
    Variables
    Analysis
Chapter 4. Results
    Descriptives & Data preparation
Appendices
    Appendix 1. Boxplots for Outliers
    Appendix 2. Averages for Segments
    Appendix 3. Normal Distribution
    Appendix 4. Dispersion Test
    Appendix 5. Vif Scores
    Appendix 6. R script

List of Tables

Table 1. Types of Customer-initiated Touchpoints
Table 2. Segmentation Criteria
Table 3. Model Fit Cluster Analysis
Table 4. Segment Profiles
Table 5. Negative Binomial Regression Analysis
Table 6. Negative Binomial Regression Analysis Segmentation Moderation

List of Figures

Figure 1. Conceptual Model


Chapter 1. Introduction

Current online and offline stores offer many opportunities to help customers in their search for the perfect products or services (Lemon and Verhoef, 2016). Modern advertising platforms allow marketers to target individual customers (Zantedeschi, Feit & Bradlow, 2017) or customer groups, which indicates that targeting strategies can potentially simplify the customer journey and improve the customer experience. One of these opportunities is the existence of different channels that a customer can use before reaching the actual company, for example through its website. Customers can visit these websites more than once and via different types of channels before they consider a purchase (Li and Kannan, 2014). All these new opportunities complicate the customer journey and challenge companies to manage the environment effectively, but they also offer researchers the possibility of starting new research themes that address these challenges (Neslin et al., 2006).


be explained and defined in the next chapter. It is expected that gathered information related to customer behaviour about their use of customer-initiated touchpoints could help in forming a targeting strategy, which could potentially be the ultimate tool for a company in offering customer services that are relevant for that unique individual.

In line with this expectation is the assumption that no single marketing strategy fits all customers. Every customer has a unique customer journey, but most customers share certain characteristics. Customer segments can be created based on such shared characteristics, like age or gender. Previous literature on customer purchases shows that the characteristics of a customer can influence their purchase decisions and journeys (Neslin and Shankar, 2009; Inman, Shankar & Ferraro, 2004). Neslin and Shankar (2009) argue that looking at segments enables the firm to paint a more informed picture of the multichannel customer. Therefore, it is relevant to shed new light on the influence of channel behaviour within different customer segments. This research paper offers a unique perspective by focusing mainly on geographic characteristics as segmentation criteria. This could add an extra dimension in creating segments that are as homogeneous as possible internally, but as heterogeneous as possible compared to the other segments. Light will be shed on the concept of geodemographically different segments and their relevance for customers and their purchase decisions in the travel industry.

The research question, formulated on the basis of a literature review, is as follows: What is the influence of the type of customer-initiated touchpoints on the purchase decision of customers across different segments?

The goal of the research paper is to find out, with the help of quantitative research, whether certain touchpoints differ in their influence on the purchase decisions of customers and their segments. Knowing in which channel to invest for an effective marketing campaign can save companies a lot of money. The structure of the research paper is as follows. The following chapter reviews the existing literature about the related concepts. After that, the data management process is discussed in a Method chapter. Next, the results are examined in the Results chapter, after which they are discussed and compared to the existing literature in a Discussion chapter. A conclusion will end the


Chapter 2. Literature Review

Customer journey. Being a customer means going through different stages of contact with a company. Together, these stages form a customer journey. Li and Kannan (2014) identify three stages, referred to as the purchase decision hierarchy: (1) the consideration stage, (2) the visit stage and (3) the purchase stage. However, more distinctions exist within the literature. Wolny and Charoensuksai (2014) also refer to the five-stage consumer decision-making process as a relevant basis for mapping the customer journey. The five stages a customer is expected to go through are need recognition, information search, alternative evaluation, purchase and post-purchase. Other models are, for example, the AIDA model or the Howard and Sheth buying behaviour model (Wolny and Charoensuksai, 2014). When looking at the different models, it becomes clear that the purchase stage is the one stage they all have in common. The other stages can be seen as pre- and post-purchase stages. Customers are expected to seek different benefits at the pre-purchase stage than during or after the purchase (Wolny and Charoensuksai, 2014), which shows the relevance of looking at the influence of channels within every stage of the customer journey. This research paper focuses only on choice behaviour at the purchase stage, measuring the relationship between the buying stage and channel usage intention, which Frambach, Roest and Krishnan (2007) and Wolny and Charoensuksai (2014) argue is important. Focusing on which touchpoints actually lead to a purchase, instead of just the intention, helps to bridge the intention-behaviour gap (Fennis and Stroebe, 2016). The purchase stage is the stage in which a customer actually decides whether or not to make a purchase, and it is the one stage that actually leads to a profit for the company.

Purchase decision. When a customer decides to make a purchase, some research


This is only one of the articles that relate the choice of channels to purchase decisions. Other articles, like the one from Neslin and Shankar (2009), also highlight the importance of researching the different roles of various channels in a customer’s purchase.

Channel behaviour. Li and Kannan (2014) emphasize the popularity of online marketing interventions to draw traffic to firms’ websites, but they also emphasize the events where customers visit websites on their own initiative. Li and Kannan (2014, p. 40) refer to channels as “Customers’ responses - such as clicking on display ads, e-mail links, or firm’s paid search ads or choosing any other sources on their own (e.g., typing in the website URL, clicking on organic search links or referral links) - that cause these communications, interventions, or links to become the channels through which customers visit and convert at the firm’s website”. Olbrich and Schultz (2014) show that aggregated results from different studies provide evidence that “in the majority of studies, online advertising, including search advertising, had a greater impact on offline than on online sales”. This subject has been researched many times before, which shows the relevance of and interest in customers and their behaviour regarding channel inferences. Questions arise about why these relationships are visible and whether the same patterns in purchase behaviour are visible within different channels. In addition, Petersen et al. (2018, p. 823) expect that when the customer “is exposed to marketing activities, they will be more likely to be noticed, processed and assimilated if they come from a firm for which the consumer holds a favorable evaluation, ultimately leading to a higher likelihood to respond to that stimuli in a manner consistent with the attitude held”. Encountering these channels in contact points with online or offline stores is seen as encountering certain touchpoints.

Lemon and Verhoef (2016) identify four categories of customer experience touchpoints: brand-owned, partner-owned, customer-owned, and social/external/independent touchpoints. When looking at the customer journey, they see it as interesting to consider the effects of


is most effective in a segment, but also if certain combinations of touchpoints are effective as well. However, the first step in adding to the existing research, which will be performed in this research paper, is looking at the separate influence of certain touchpoints, especially within certain segments. Based on previous research, it is expected that touchpoints, separate or combined, have an influence on the purchase decision of a customer.

Customer-initiated touchpoints. Even though multiple categories of touchpoints exist, this research paper focuses on only one of them to keep a narrow and targeted focus. The research is focused on unique customer characteristics through segments. Customer-initiated touchpoints are related to individuals and their personal behaviour. For most company-initiated touchpoints, it is not completely certain which individuals encountered them. Therefore, customer-initiated touchpoints, due to their closer connection to individuals and action steps, are the focus of this research paper and not the other categories. Lemon and Verhoef (2016, p. 78) define customer-initiated touchpoints as “customer actions that are part of the overall customer experience but that the firm, its partners, or others do not influence or control”. Verhoef, Kannan and Inman (2015, p. 175) refer to customer-initiated touchpoints as “an episode of direct or indirect contact with a brand or firm”.

Li and Kannan (2014) refer to several alternatives to reach a firm’s website, which will be referred to as touchpoints in this research paper. The channel alternatives can be categorized as entering the URL of the firm’s website, search, referral, sites, e-mails and display banner ads; customers may also choose not to click through a display impression. However, as said before, the way in which the customer encounters the touchpoints can be initiated by different parties. This research paper uses this division into categories and focuses only on the customer-initiated category. When the customer decides to enter the URL of the firm’s website, visit other sites, follow referrals or search for information themselves, this can be seen as a customer-initiated touchpoint. The other previously mentioned touchpoints from Li and Kannan (2014), display banner ads and e-mails, belong according to them to the firm-initiated category and are therefore not relevant for this study. “The website is a vital component of the omni-channel shopping environment where customers can engage with retail brands through a multitude of channels: website, physical retail stores, social media, apps, direct marketing” (Connell,


emphasize that coming across certain touchpoints leads to conversion, whereas coming across other touchpoints will not lead to a conversion, referred to as a purchase in this research paper. Therefore, it can be expected that there are differences between the different types of touchpoints in the way they influence the purchase decisions of customers.

Hypothesis 1: When a customer encounters a customer-initiated touchpoint he or she is more likely to make a purchase.

Segments as moderators. The contribution of this research paper lies in adding segmentation to the equation. Konus, Verhoef & Neslin (2008) give an overview of the existing literature about segmentation in the multichannel setting. The overview shows that few researchers have studied the influence of segmentation and that those who did used different variables. Thus, even though multiple researchers use segments in their research on multichannel marketing, the way in which they use them differs. Lemon and Verhoef (2016) use usage patterns as criteria, whereas Weaver & McCleary (1984) use household characteristics and gender as segmentation criteria.

Kushwaha and Shankar (2008) find a significant positive influence of household status on multichannel shopping. However, the authors did not research geographic characteristics. Inman, Shankar & Ferraro (2004) find significant influences of household status and urbanicity on channel choice that are neither positive nor negative.

Geographic characteristics. Average characteristics can differ between regions within a country (Acemoglu and Dell, 2010). As an example, they refer to differences in income between regions. It could also be concluded that the way customers use or have access to certain channels differs between regions. Verhoef et al. (2015) emphasize that, due to new (online) channels, the natural borders between channels, like geography, begin to disappear. This research helps to test whether geography is indeed becoming less relevant to consider in the competitive strategies of companies. Neslin et al. (2006) provide evidence, by referring to several other articles, for the expectation that region influences channel choice.

Demographic characteristics According to Swart and Roodt (2015) have several


demographics such as gender, age, education, income and family size. The research is mainly focused on the type of residence of the customers, based on certain geodemographic characteristics, and on whether differences in the purchase decision exist between these types of residence. Sleutjes, De Valk and Ooijevaar (2018) found that socio-economic inequalities exist within certain districts in the Netherlands. These inequalities are, for example, linked to certain districts having many low-income households, whereas others have mainly high-income households. Differences in income are visible between provinces, but between cities or municipalities as well (CBS, 2004). The number of children could also be used to create segments, since the CBS (2017) shows that the number of children differs between municipalities, measured via the fertility of women. In addition, the CBS (2017) emphasizes that on average more singles live in (large) cities, and thus it is expected that household size differs between regions as well.

By using data from the travel industry covering a long period of time and by offering possibly relevant managerial implications about targeting certain customers, this research tries to fill the research gap. In addition, existing literature, like the studies discussed above, shows that research on multichannel marketing and customer-initiated touchpoints is mainly focused on products, whereas most purchases in the travel industry are related to services. According to Lemon and Verhoef (2016, p. 80), “customers differ in their preference and usage of channels across different purchase phases, and specific multichannel segment can be identified that differ in terms of consumer characteristics”. Konus et al. (2008, p. 398) expect that “at one extreme, consumers might behave homogeneously, using all channels for the same reasons and at the other extreme, specific segments might align with specific channels”. This indicates that, in relation to channel choices, different segments can be formed that will probably have different preferences, and that the type of touchpoint with the most influence can differ across segments. A hypothesis could be formed for every separate segment. However, it is not clear beforehand how many segments will be formed.

Hypothesis 2: The influence of the type of customer-initiated touchpoint on purchase decisions differs across segments.


related to purchase behaviour within segments. However, since the variables have multiple levels, such as multiple segments and multiple types of touchpoints, a more complicated conceptual model arises. Based on the previously discussed literature, like the research from Montaguti et al. (2016), a framework can be created. The framework illustrates how the availability of multiple channels induces different customer behaviours. This behaviour will be related to the purchase decision of a customer. The framework also shows that this behaviour is aggregated from the individual level to differences between segments.


Chapter 3. Method

Data collection. To test the hypotheses, secondary data with information about the customer purchase journey within a travel agency is used. The data is online, event-based data made available through the GFK database. A quantitative research approach was followed.

Having access to this large amount of data yields more reliable results than a smaller amount of data that the researcher would have gathered herself. A large dataset with information gathered from a large number of respondents creates a picture that is more likely to be representative of Dutch society. The data was gathered by GFK in a structured way with the help of a large number of respondents. Therefore, it could be concluded that, by using the secondary data provided by GFK, the data is relevant and the results will be more reliable and a good representation of the customers within the travel industry.

Variables. The purchase decision is the dependent variable in this research paper. In the dataset, two variables can be used to represent the purchase decision of the respondents. The purchase decision can be measured specifically at the focal agency or at the general level, which includes purchase decisions at competitors as well. The variable that represents all purchases in the dataset was chosen, since more purchases have been made than those at the focal firm alone. This offers the opportunity to measure a larger amount of data. The variable is a binary categorical variable, where 0 represents no purchase and 1 represents a purchase. However, after aggregating the dataset to user-level data, the variable becomes a count variable that represents the total number of purchases a customer makes. By using a count variable, the results show not only whether a customer makes a purchase, but also how much the total number of purchases can grow with a certain touchpoint, which shows the separate influences of the types of touchpoints more clearly.
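The aggregation step described above can be illustrated with a minimal sketch. It assumes event-level rows of the form (user id, purchase flag); the function name, the row layout and the example data are hypothetical and are not taken from the GFK dataset or from the R script in Appendix 6.

```python
from collections import Counter


def aggregate_purchases(events):
    """Sum the 0/1 purchase flag per user over event-level rows.

    `events` is an iterable of (user_id, purchased) pairs, where
    `purchased` is 1 if the event was a purchase and 0 otherwise.
    """
    counts = Counter()
    for user_id, purchased in events:
        # Adding 0 still registers the user, so non-buyers get a count of 0.
        counts[user_id] += purchased
    return dict(counts)


# Hypothetical event-level data: one row per touchpoint event.
events = [("u1", 0), ("u1", 1), ("u2", 0), ("u1", 1), ("u3", 1), ("u2", 0)]
user_counts = aggregate_purchases(events)
```

Keeping users with zero purchases in the result matters here, because the count model is estimated on all users and not only on the buyers.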

The type of customer-initiated touchpoint is seen as the independent variable in this research paper. The variable includes 15 types of customer-initiated touchpoints. When


Table 1. Types of Customer-initiated Touchpoints

Accommodations | Website, App & Search
Information or Comparison | Website, App & Search
Tour operator or Travel agent | Website competitor and focus brand, Search competitor and focus brand & App competitor
Flight Tickets | Website, App & Search
Generic | Search

The moderator is based on segments, which are formed with the help of five variables that together represent the type of residence of the customers. The type of residence refers to geographic and household characteristics. Three of these clustering variables are categorical, whereas the other two are continuous. Based on a literature review, region and size of municipality are used as geographic characteristics to create the segments. The demographic variables that will be used are income,


Table 2. Segmentation Criteria

Segmentation Criteria | Subcategories
Region | 1. Amsterdam, Rotterdam & Den Haag; 2. West (excluding 1); 3. North; 4. East; 5. South
Size of Municipality | 1. <5.000; 2. 5.000 – 10.000; 3. 10.000 – 20.000; 4. 20.000 – 50.000; 5. 50.000 – 100.000; 6. >100.000; 7. Den Haag (>500.000); 8. Rotterdam (>600.000); 9. Amsterdam (>800.000)
Income | 1. <€12.900 (minimum); 2. €12.900 – 27.000 (below average); 3. €27.000 – 33.500 (almost average); 4. €33.500 – 40.000 (average); 5. €40.000 – 67.000 (between 1 & 2 x average); 6. €67.000 – 79.000 (2 x average); 7. >€79.000 (above 2 x average); 8. Don’t know / don’t want to say
Household Size | Number of persons in a household
Average Number of Children | Number of children in a household


The clusters will be created based on the probability that a customer belongs to a certain cluster, given the segmentation criteria. Vermunt and Magidson (2002, p. 89) define exploratory latent class (LC) analysis as “an analysis in which objects are assumed to belong to one of a set of K latent classes, with the number of classes and their sizes not known a priori”. A traditional cluster analysis forms clusters based on the nearest-distance concept, whereas in latent class cluster analysis (LCCA) the clusters are formed based on the probability of belonging to a certain cluster (Vermunt & Magidson, 2002). Using the probability concept means that a statistical model is used, which offers the opportunity for model selection and thus for assessing goodness of fit. One important reason for using the McLust package with LCCA is that it offers the opportunity to include variables with different measurement levels (Haughton, Legrand & Woolford, 2009). The clusters will be formed based on three categorical and two continuous variables, which is not an issue when using the McLust package. The process of the LCCA consists of three important steps (Jurowski & Reich, 2000). The first step is deciding how many clusters or segments can be formed within the dataset. The second step assigns the respondents to the number of clusters decided on. The size of the clusters is then determined, and information about what the clusters look like can be gathered. This leads to the third step, which uses that information to profile and interpret the clusters and to draw conclusions about them. The LCCA has several assumptions that should be tested before it is used. It is important that the clusters are mutually exclusive, which means that a customer cannot belong to more than one cluster at the same time (Haughton et al., 2009). The clusters can be formed based on unobserved characteristics, revealing underlying ‘latent’ constructs. Selecting the best number of clusters and judging the model fit can be done with, for example, the AIC and BIC values (Kaplan & Keller, 2011).
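Two of the ideas above, assignment by membership probability and model selection with AIC and BIC, can be sketched as follows. This is an illustration under stated assumptions and not the McLust procedure used in this research: it assumes categorical indicators that are independent within a class, and all class sizes, conditional probabilities, log-likelihoods and parameter counts are hypothetical.

```python
import math


def posterior_membership(priors, cond_probs, observation):
    """Posterior class probabilities for one respondent via Bayes' rule.

    priors[c]            prior size (proportion) of class c
    cond_probs[c][j][v]  P(variable j takes value v | class c)
    observation[j]       observed category of variable j
    """
    joint = []
    for prior, class_probs in zip(priors, cond_probs):
        likelihood = prior
        for j, value in enumerate(observation):
            # Assumes the indicators are independent within a class.
            likelihood *= class_probs[j][value]
        joint.append(likelihood)
    total = sum(joint)
    return [p / total for p in joint]


def aic(log_likelihood, n_params):
    """Akaike information criterion; lower is better."""
    return -2.0 * log_likelihood + 2.0 * n_params


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; lower is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


# Hypothetical two-class model with two categorical indicators.
priors = [0.6, 0.4]
cond_probs = [
    [{"north": 0.7, "west": 0.3}, {"family": 0.8, "single": 0.2}],
    [{"north": 0.2, "west": 0.8}, {"family": 0.3, "single": 0.7}],
]
post = posterior_membership(priors, cond_probs, ["west", "single"])
assigned = max(range(len(post)), key=post.__getitem__)  # modal class index

# Hypothetical fits: number of classes -> (log-likelihood, free parameters).
fits = {1: (-5200.0, 4), 2: (-5050.0, 9), 3: (-5040.0, 14)}
best_k = min(fits, key=lambda k: bic(fits[k][0], fits[k][1], n_obs=1000))
```

Assigning each respondent to the class with the highest posterior probability is what keeps the resulting segments mutually exclusive, even though the model itself works with probabilities. BIC penalizes extra parameters more heavily than AIC for samples of this size, so the two criteria do not always select the same number of clusters.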

To test whether the purchase decision of a customer can be (partially) explained through contact with customer-initiated touchpoints, a Poisson regression analysis will be performed. A benefit of this kind of regression analysis is that it is appropriate for a dependent count variable, here the number of purchases of a customer (Leeflang et al., 2015). A Poisson analysis also offers the opportunity to use the latent classes formed in the LCCA. Besides, the plots in Appendix 3 show that the data is not normally distributed, which is not a


the period of time the data represents should not be too short. This assumption is not violated, since the data represents a period of 1 year and 4 months. The parameters will be estimated by maximum likelihood, which means searching for those parameter values that give the highest likelihood of observing the data (Leeflang et al., 2015). Model selection can be done with a likelihood ratio test based on the Chi-square distribution, and the model fit can be tested by comparing the AIC or BIC values with those of extended or restricted models. A Poisson model can be challenged or violated in several ways. The model assumes that the variance is equal to the mean. When this assumption is violated, this is called over- or underdispersion. In case of overdispersion, meaning that the variance is larger than the mean, a negative binomial regression should be used instead (Leeflang et al., 2015). When the zero events cannot be observed, a truncated model could be used, and when there are more zeros than expected, a zero-inflated model is advised (Leeflang et al., 2015).
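The equal mean and variance assumption can be checked roughly before any model is fitted. The sketch below compares the sample variance with the sample mean of the raw counts. It is only a heuristic: a formal dispersion test, such as the one reported in Appendix 4, works on a fitted model. The tolerance value and the example counts are hypothetical.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean; close to 1 under a Poisson model."""
    return variance(counts) / mean(counts)


def suggest_count_model(counts, tolerance=0.2):
    """Crude model suggestion; `tolerance` is an arbitrary illustrative band."""
    ratio = dispersion_ratio(counts)
    if ratio > 1 + tolerance:
        return "negative binomial"  # variance clearly exceeds the mean
    if ratio < 1 - tolerance:
        return "underdispersed"     # variance clearly below the mean
    return "poisson"


# Hypothetical purchase counts: many zeros and a few heavy buyers.
purchases = [0, 0, 0, 0, 1, 0, 2, 0, 0, 15, 0, 1, 0, 0, 30]
suggestion = suggest_count_model(purchases)
```

Count data with many zeros and a few very active customers, as in the hypothetical example, typically produces a ratio far above 1.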

The final model is portrayed mathematically as follows:

$$P_{kn} = \alpha + \beta_{1,kn} T_{kn} + \varepsilon_{kn}, \qquad k = 1, 2, \ldots, K, \quad n = 1, 2, \ldots, N$$

where

$P_{kn}$ = number of purchases for customer $k$ in segment $n$,
$\alpha$ = an unknown constant (intercept),
$\beta_{1,kn}$ = unknown slope (effect) parameter for customer $k$ in segment $n$,
$T_{kn}$ = type of customer-initiated touchpoint for customer $k$ in segment $n$,
$\varepsilon_{kn}$ = an error term,
$K$ = the number of customers, and
$N$ = the number of segments.
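Poisson and negative binomial regressions are commonly estimated with a log link, so the linear predictor describes the logarithm of the expected count and not the count itself. The link function is not stated in the specification above, so the following restatement is an assumption. The symbols $\mu_{kn}$ (expected number of purchases) and $\theta$ (dispersion parameter) are introduced here for illustration only.

```latex
\log \mu_{kn} = \alpha + \beta_{1,kn} T_{kn},
\qquad \mu_{kn} = \mathbb{E}[P_{kn}],
\qquad \operatorname{Var}(P_{kn}) = \mu_{kn} + \frac{\mu_{kn}^{2}}{\theta}
```

Under this form the variance exceeds the mean whenever $\theta$ is finite, which is how the negative binomial model accommodates overdispersion. As $\theta \to \infty$ the variance approaches $\mu_{kn}$ and the Poisson case is recovered.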


Chapter 4. Results

Descriptives & Data preparation. Before the data was analyzed to test the hypotheses, it went through a data cleaning process. 1603 respondents did not provide any demographic information, which is necessary for forming clusters. These 1603 respondents were therefore removed from the dataset. No other relevant missing values or outliers were found. The boxplots used to search for outliers can be found in Appendix 1. After removing these respondents and aggregating the dataset to user-level data, information about 7647 users remained. These users were used as input for the cluster and regression analyses discussed below. The total number of purchases, made by only 2473 of the users, is 726.132. Customers used the website for accommodations and the website of the travel agent’s competitor the most. The app for flight tickets and the app of the competing travel agent were used the least, next to search commands for the travel agent of the focus brand.


Table 3. Model Fit Cluster Analysis

Figure 2. Entropy Plot

Table 4. Segment Profiles

Cluster | Region | Size Municipality | Household Size | Number of Children | Income | Size | % of purchases
1 | North | 20.000 - 100.000 inhabitants | 3.87 ~ Big | 1.7 | Between the average and 2 times the average | 2026 respondents (26,5%) | 27,1
2 | East & South | 20.000 - 100.000 inhabitants | 2.10 ~ Small | 0 | Between the average and 2 times the average | 2641 respondents (34,5%) | 34,3
3 | West, Amsterdam, Rotterdam & Den Haag | 50.000 - > 100.000 inhabitants | 1.97 ~ Small ~ most singles | 0 | Between the average and 2 times the average | 2980 respondents (39%) | 38,6

The averages for the segmentation criteria are shown in Appendix 2 and summarized above in Table 4. Segment 1 can be profiled as the big family households. The second segment could be referred to as the childless customers. Segment 3 consists of respondents who can be characterized as the childless customers from the big western cities. It is relevant to notice that the segments do not really differ in average income level, but that there are some important differences in region and family composition. Segment 3 is the biggest segment and is also responsible for the biggest percentage of the total number of purchases. It could be concluded that the customers could indeed be grouped together based on their type of residence.


Regression Analysis. A model with all types of touchpoints included as independent variables was created to perform a Poisson regression for testing hypothesis 1. Hypothesis 1, as formed in Chapter 2, assumes that when a customer encounters a customer-initiated touchpoint, he or she is more likely to make a purchase. The dependent variable is the count variable number of purchases, and the independent variables are based on the types of touchpoints. A test for multicollinearity shows collinearity for two types of touchpoints, with values just above 4. These two types, 6 and 16, are excluded from the final model to get more reliable results. A primary assumption of the Poisson regression is that the variance equals the mean. However, a dispersion test shows that this assumption is violated for the final model. Therefore, a negative binomial regression was used instead of the Poisson regression. The results of the negative binomial regression on the final model are shown in Table 5.
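The switch from a Poisson to a negative binomial model can be sketched as follows. This is a minimal illustration on simulated data, not the thesis dataset: the variables `visits` and `purchases` and all simulation parameters are invented, and the Pearson dispersion statistic is used here as a dependency-free stand-in for the `dispersiontest()` call from the AER package that appears in Appendix 6.

```r
library(MASS)  # provides glm.nb()

set.seed(1)
n <- 500
visits <- rpois(n, lambda = 20)                 # hypothetical touchpoint counts
mu <- exp(1 + 0.03 * visits)                    # assumed true mean structure
purchases <- rnbinom(n, size = 0.8, mu = mu)    # overdispersed count outcome

# Poisson fit: assumes variance equal to the mean
pois_fit <- glm(purchases ~ visits, family = poisson)

# Pearson dispersion statistic; values well above 1 point to overdispersion
dispersion <- sum(residuals(pois_fit, type = "pearson")^2) / df.residual(pois_fit)

# Negative binomial fit: adds a parameter that lets the variance exceed the mean
nb_fit <- glm.nb(purchases ~ visits)

aic_table <- c(poisson = AIC(pois_fit), negbin = AIC(nb_fit))
print(dispersion)
print(aic_table)
```

When the dispersion statistic is far above 1, the Poisson standard errors are too small, which is the reason for preferring the negative binomial model in that situation.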

The results in Table 5 show that a customer is influenced by the type of touchpoint he or she encounters in the decision to make a purchase or not. Therefore, it could be concluded that hypothesis 1 is supported. The type of touchpoint with the biggest influence on the expected number of purchases is a customer-initiated search for the travel agent of the focus brand. If a customer comes in contact with this type of touchpoint from the focus brand (keeping all the other variables in the model constant), the expected number of purchases increases by 5.8%. The type of touchpoint that is the least effective in getting a customer to make a purchase is the initiated search for the travel agent of a competitor. From Table 5, it could be concluded that if a customer comes in contact with this type of touchpoint on the side of the competitor (keeping all the other variables in the model constant), the expected number of purchases even decreases by 0.54% compared to the influence of the other touchpoints.
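The two percentages quoted above follow from exponentiating the reported coefficients. As a sketch, assuming the usual log link of a negative binomial regression (the index $j$ over touchpoint types is introduced here only for illustration), one additional touchpoint of type $j$ multiplies the expected number of purchases by $e^{\beta_j}$:

```latex
\log E[P \mid T] = \alpha + \sum_j \beta_j T_j
\quad\Longrightarrow\quad
\frac{E[P \mid T_j + 1]}{E[P \mid T_j]} = e^{\beta_j}

\text{Search Focus Brand:}\quad e^{0.05682} \approx 1.0585
\;\Rightarrow\; (1.0585 - 1) \times 100\% \approx +5.8\%

\text{Search Competitor:}\quad e^{-0.00537} \approx 0.9946
\;\Rightarrow\; (0.9946 - 1) \times 100\% \approx -0.54\%
```

The exponentiated values correspond to the Exp column reported next to each coefficient in the negative binomial regression table.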


could be concluded that the model that includes all the types of touchpoints, except for the ones that show multicollinearity, fits the data well enough. Therefore, it is appropriate to use the final model for estimating the parameters.

Table 5. Negative Binomial Regression Analysis

| Category | Touchpoint | B | Exp(B) |
|---|---|---|---|
| Intercept | | 4.090e+00*** | 59.752454 |
| Accommodations | Website | 5.175e-04*** | 1.0005176 |
| | App | 8.566e-04*** | 1.0008569 |
| | Search | 1.242e-02*** | 1.0124938 |
| Information or Comparison | Website | 1.000e-03*** | 1.0010009 |
| | App | 5.956e-04*** | 1.0005957 |
| | Search | X | X |
| Tour operator or Travel Agent | Website Competitor | 7.103e-04*** | 1.0007105 |
| | App Competitor | 1.117e-04* | 1.0001117 |
| | Search Competitor | -5.370e-03*** | 0.9946446 |
| | Website Focus Brand | 3.381e-04*** | 1.0003382 |
| | Search Focus Brand | 5.682e-02*** | 1.0584611 |
| Flight Tickets | Website | 3.585e-05*** | 1.0000359 |
| | App | -3.855e-04*** | 0.9996146 |
| | Search | 1.175e-02*** | 1.0118188 |
| Generic search | | X | X |

| Fit measure | Final model | Null model |
|---|---|---|
| AIC | 1911337 | 41592.26 |
| BIC | 1911441 | 41606.14 |


model and therefore, negative binomial models have been created instead. The results of the negative binomial regressions are shown in Table 6. These are the models that have, based on the BIC, AIC and chi-square test, the best model fit. The BIC and AIC values are shown in Table 6, whereas the results of the chi-square test are shown in Appendix 4.

Table 6. Negative Binomial Regression Analysis Segmentation Moderation

| Category | Touchpoint | Segment 1 B | Segment 1 Exp(B) | Segment 2 B | Segment 2 Exp(B) | Segment 3 B | Segment 3 Exp(B) |
|---|---|---|---|---|---|---|---|
| Intercept | | 3.783e+00*** | 43.957535 | 3.865e+00*** | 47.71884 | 4.218e+00*** | 67.0397099 |
| Accommodations | Website | 1.301e-03*** | 1.0013015 | 1.027e-03*** | 1.001028 | 5.230e-04*** | 1.0005245 |
| | App | 2.228e-04*** | 1.0002228 | X | X | 6.881e-04*** | 1.0006417 |
| | Search | -1.088e-02*** | 0.9891766 | 1.350e-02*** | 1.013587 | 4.272e-02*** | 1.0431635 |
| Information or Comparison | Website | 1.325e-03*** | 1.0013260 | 1.129e-03*** | 1.001130 | X | X |
| | App | 1.447e-03*** | 1.0014478 | X | X | -1.814e-04*** | 0.9999831 |
| | Search | 2.347e-03* | 1.0023497 | 1.628e-02*** | 1.016412 | X | X |
| Tour operator or Travel Agent | Website Competitor | X | X | 1.054e-03*** | 1.001055 | X | X |
| | App Competitor | 3.697e-03*** | 1.0037037 | X | X | -1.952e-03*** | 0.9980389 |
| | Search Competitor | 5.182e-02*** | 1.0531822 | X | X | X | X |
| | Website Focus Brand | 7.231e-04*** | 1.0007234 | 9.405e-04*** | 1.000941 | 4.479e-04*** | 1.0004487 |
| | Search Focus Brand | 6.394e-02*** | 1.0660310 | 1.854e-01*** | 1.203673 | X | X |
| Flight Tickets | Website | 3.506e-03*** | 1.0035119 | X | X | 2.871e-05*** | 1.0003844 |
| | App | 4.415e-03*** | 1.0044251 | X | X | 2.854e-03*** | 1.0025825 |
| | Search | 2.746e-02*** | 1.0278367 | X | X | 1.378e-02*** | 1.0131123 |
| Generic Search | | X | X | X | X | X | X |
| AIC | | 391124 | | 536475 | | 894333.7 | |


The results in Table 6 show that the way a customer is influenced by the type of touchpoint he or she encounters in the decision to make a purchase or not differs between the three segments. Every type of touchpoint included in a model shows a significant influence on the dependent variable, number of purchases. Therefore, it could be concluded that hypothesis 2 is supported as well. Besides, it is noticeable that the way in which the types of touchpoints correlate with each other differs between the three segments, emphasizing that the customers in the three segments also react to or initiate the touchpoints differently.

In segment 1, the top three touchpoints that make it more likely for a customer to make a purchase are searching for a travel agent from the focus brand, searching for a travel agent from the competitor and searching for flight tickets. Searching for accommodations, however, is the touchpoint least likely to increase the expected number of purchases. It could be concluded that, to get a customer from segment 1 to make a purchase, it is helpful to get them to search for a tour operator or travel agent from the focus brand, since this increases the expected number of purchases by 6.6%.

A customer from segment 2 who initiates a search for a travel agent of the focus brand, a search for information or comparison, or a search for accommodations is more likely to make a purchase. However, actually looking at the website of the tour operator or travel agent from the focus brand is the least effective. Searching for the travel agent increases the expected number of purchases by 20.4%, whereas actually visiting the website of the travel agent only increases the expected number of purchases by 0.09%.

For targeting customers in segment 3, it is useful to focus on a customer's search for accommodations, the search for flight tickets or the use of a flight tickets app, and to discourage customers from looking at the app of the competing travel agent. When a customer from segment 3 initiates a search for accommodations, his or her expected number of purchases increases by 4.3%.
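Fitting a separate negative binomial model per segment and comparing the resulting percentage effects can be sketched as below. The data are simulated and every name and parameter value (`segment`, `search_focus`, `true_beta`, the sample size) is hypothetical; the sketch only mirrors the general approach of one model per segment and is not a reproduction of the thesis results.

```r
library(MASS)  # provides glm.nb()

set.seed(42)
n <- 3000
dat <- data.frame(
  segment      = rep(1:3, each = n / 3),
  search_focus = rpois(n, lambda = 3)     # hypothetical count of focus-brand searches
)

# Assumed segment-specific effects, chosen only to make the segments differ
true_beta <- c(0.06, 0.19, 0.00)
dat$purchase <- rnbinom(
  n, size = 1.5,
  mu = exp(1 + true_beta[dat$segment] * dat$search_focus)
)

# One negative binomial model per segment
fits <- lapply(split(dat, dat$segment), function(d) {
  glm.nb(purchase ~ search_focus, data = d)
})

# Percentage change in expected purchases per additional search, by segment
pct_change <- sapply(fits, function(f) 100 * (exp(coef(f)[["search_focus"]]) - 1))
print(round(pct_change, 1))
```

Comparing `pct_change` across the three fits is the same kind of comparison the segment-level results make when they contrast, for example, the effect of a focus-brand search in one segment with that in another.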


Chapter 5. Discussion

This research paper attempts to identify which type of touchpoint is most effective in getting customers to make a purchase. The influence of those touchpoints is measured across three different segments, identified through a segmentation study.

Customer-initiated Touchpoints Hypothesis 1 referred to the relationship between types of customer-initiated touchpoints and the decision to make a purchase. The results section showed that hypothesis 1 is supported, given the mostly positive and significant parameter values of the types of touchpoints. This relationship implies that when a customer initiates contact with the organization via one of the existing touchpoints, his or her expected total number of purchases increases. Interacting with one of the touchpoints considered in the dataset is thus associated with more purchases. However, 2 negative and significant parameter values were found as well. It is important to note that this does not necessarily indicate that a customer who initiates these 2 touchpoints will decrease their total number of purchases. The negative values do indicate that these 2 touchpoints are, compared to the other 13 touchpoints, less likely to increase the total number of purchases. The results support the expectation from the literature that customer-initiated touchpoints lead to more purchases. Li and Kannan (2014) emphasized that the influence differs between types of touchpoints and that encountering some of them will not always lead to a conversion. The degree to which the touchpoints influence the conversions of the company does differ, supporting the statement from Li and Kannan (2014). The results for the travel agency show that, in this case, every one of the 15 types has a significant influence, and all but the 2 types noted above increase the expected number of purchases. However, not every customer who initiated a touchpoint made a purchase.


The three segments have been characterized by their biggest differences on the segmentation criteria. Segment 1 consists of the bigger families living in the north. The customers within segment 2 have no children and live in the east or south. Segment 3 consists of customers of whom many are single and some have a partner, but none of whom have children, and who are located in the west. The descriptions show that the segments differ mostly in their family composition and in their geographic location. It could be concluded that the influence of the type of touchpoint depends mostly on these factors, contradicting the statement from Verhoef et al. (2015) that geography becomes less relevant due to the increase of online channels. It still seems relevant for companies to consider geography when creating a competitive strategy.

There is no clear difference between the segments in their average income level. The influence of income due to socio-economic inequalities within districts, as argued by Sleutjes et al. (2018), can therefore not be supported by this research paper. The numbers from CBS (2017) created the expectation that more singles live in cities, which makes it relevant to distinguish between geographic locations. The biggest cities of the Netherlands are mostly located in the west, which is referred to as the ‘Randstad’. The segment descriptions show that most singles are indeed located in the west, again supporting expectations from the literature. Having segments that are in line with expectations from the literature supports the reliability of the formed segments.

The biggest contribution of the research paper is to show how these segments relate to purchase decisions and customer-initiated touchpoints. The three segments all differ in their average geographic locations, but the household size is about the same in segments 2 and 3. However, the results do show that the influence of the type of touchpoint differs between these two segments. This distinction suggests that the geographic location is more relevant than the household size in showing which type of touchpoint, when initiated, most often leads to an increase in the total number of purchases.


The research question from Chapter 1 was formulated as follows: What is the influence of the type of customer-initiated touchpoints on the purchase decision of customers across different segments? The results make clear that a customer-initiated touchpoint positively influences the purchase decisions of a customer by increasing the customer's total number of purchases. This influence is visible for nearly every type of touchpoint considered within this research paper and within every segment. However, which type of touchpoint leads to the highest increase in the total number of purchases of a customer does differ between the three segments. The residence of a customer, based on certain


Chapter 6. Conclusion

Conclusion Based on existing theories about the relationship between (customer-initiated) touchpoints, purchase decisions and segments, certain expectations were formulated and represented in two hypotheses. With the help of a latent class segmentation study and a negative binomial regression analysis, the research question could, to a certain extent, be answered. It could be concluded that the way in which a customer initiates contact with a company influences whether they decide to make a purchase or to increase their total number of purchases.

There is a visible link between the channel behaviour of a customer and their personal characteristics. Earlier research papers, such as those by Kushwaha and Shankar (2008) and Inman et al. (2004), have shown how certain customers, based on their personal characteristics, react to certain touchpoints or channels. This research paper adds to that research by focusing on the residences of the customers and especially by relating them to geography, which is an increasingly ignored concept within this kind of research. Showing the relevance of geography contradicts the statement of Verhoef et al. (2015) that geographic lines fade due to an increase in online shopping.


Limitations The geographic information about the customers is based on only two variables, which refer to the size of the municipality and the region within the Netherlands. It would be interesting to dive deeper into the geographic information and, for example, also collect information about the province customers live in. This could lead to more reliable and more specific segments. The latent class cluster analysis offers the possibility of including more segmentation criteria and of adding other indicator variables. In addition, the dataset offered the possibility to gain more insight into the types of touchpoints. However, the analysis limited itself to the general influence of the types of touchpoints by only focusing on their separate influences and their relationship to segments and purchase decisions. More reliable clusters could have been formed if more variables and some control variables had been included in the analysis.

Future Research The findings also offer some interesting insights into the more specific influence of the types of touchpoints. An assumption that could be made is that initiating a search for the travel agency or for its competitor is not the touchpoint that leads to a purchase most often. Besides, it would be an addition to see if customers are


Bibliography

Research papers

Acemoglu, D., & Dell, M. (2010). Productivity differences between and within countries. American Economic Journal: Macroeconomics, 2(1), 169-188.

CBS. (2004). Grote regionale inkomensverschillen in de afgelopen halve eeuw. Retrieved from: https://www.cbs.nl/nl-nl/nieuws/2004/06/grote-regionale-inkomensverschillen-in-de-afgelopen-halve-eeuw

CBS. (2017). PBL/CBS Regionale bevolkings- en huishoudensprognose 2016–2040: analyse van regionale verschillen in vruchtbaarheid. Retrieved from: file:///C:/Users/delin/Downloads/regionale-bevolkings-en-huishoudensprognose.pdf

Connell, C., Marciniak, R., Carey, L., & McColl, J. (2019). Customer engagement with websites: A transactional retail perspective. European Journal of Marketing, 53(9), 1882-1904.

Ehrenberg, A. (1959). The pattern of consumer purchases. Journal of the Royal Statistical Society. Series C (Applied Statistics), 8(1), 26-41.

Frambach, R., Roest, H., & Krishnan, T. (2007). The impact of consumer internet experience on channel preference and usage intentions across the different stages of the buying process. Journal of Interactive Marketing, 21(2), 26-41.

Haughton, D., Legrand, P., & Woolford, S. (2009). Review of three latent class cluster analysis packages: Latent gold, polca, and mclust. The American Statistician, 63(1), 81-91.

Inman, J., Shankar, V., & Ferraro, R. (2004). The roles of channel-category associations and geodemographics in channel patronage. Journal of Marketing, 68(2), 51-71.

Jurowski, C., & Reich, A. (2000). An explanation and illustration of cluster analysis for identifying hospitality market segments. Journal of Hospitality and Tourism Research, 24(1), 67-91.

Kaplan, D., & Keller, B. (2011). A note on cluster effects in latent class analysis. Structural Equation Modeling: A Multidisciplinary Journal, 18(4), 525-536.


Kushwaha, T. L., & Shankar, V. (2008). Single channel vs. multichannel retail customers: Correlates and consequences. Working paper, Texas A&M University, College Station, TX 77845.

Lemon, K., & Verhoef, P. (2016). Understanding customer experience throughout the customer journey. Journal of Marketing, 80(6), 69-96.

Li, H., & Kannan, P. (2014). Attributing conversions in a multichannel online marketing environment: An empirical model and a field experiment. Journal of Marketing Research, 51(1), 40-56.

Montaguti, E., Neslin, S., & Valentini, S. (2016). Can marketing campaigns induce multichannel buying and more profitable customers? A field experiment. Marketing Science, 35(2), 201-217.

Neslin, S., Grewal, D., Leghorn, R., Shankar, V., Teerling, M., Thomas, J., & Verhoef, P. (2006). Challenges and opportunities in multichannel customer management. Journal of Service Research, 9(2), 95-112.

Neslin, S., & Shankar, V. (2009). Key issues in multichannel customer management: Current knowledge and future directions. Journal of Interactive Marketing, 23(1), 70-81.

Olbrich, R., & D. Schultz, C. (2014). Multichannel advertising: Does print advertising affect search engine advertising? European Journal of Marketing, 48(9/10), 1731-1756.

Petersen, J., Kumar, V., Polo, Y., & Sese, F. (2018). Unlocking the power of marketing: Understanding the links between customer mindset metrics, behavior, and profitability. Journal of the Academy of Marketing Science: Official Publication of the Academy of Marketing Science, 46(5), 813-836.

Sleutjes, B., De Valk, H., & Ooijevaar, J. (2018). The measurement of ethnic segregation in the Netherlands: Differences between administrative and individualized neighbourhoods. European Journal of Population, 34(2), 195-224.

Swart, M. P., & Roodt, G. (2015). Market segmentation variables as moderators in the predictions of business tourist retention. Service Business, 9(3), 491-513.


Verhoef, P., Kannan, P., & Inman, J. (2015). From multi-channel retailing to omni-channel retailing: Introduction to the special issue on multi-channel retailing. Journal of Retailing, 91(2), 174-181.

Zantedeschi, D., Feit, E. M. & Bradlow, E. T. (2017). Measuring Multichannel Advertising Response. Management Science, 63(8), 2706-2728.

Books

Fennis, B.M., & Stroebe, W. (2016). The psychology of advertising (2nd Ed.). New York, NY: Routledge.


Appendices

Appendix 1 Boxplots for Outliers

Boxplot Regions Boxplot Size Municipality

Boxplot Household Size Boxplot Gross Annual Income

Boxplot Amount of children Boxplot Touchpoints

Appendix 2 Averages for segments

Cluster Region Size


Appendix 3 Normal Distribution

Quantile-quantile plot Density plot of purchases

Appendix 4 Dispersion test

Dispersion test for the equal variance and mean assumption of a Poisson regression

| Model | P-value | Dispersion |
|---|---|---|
| General | 0.1586 | 4212269 |
| Segment 1 | <2.2e-16 | 291.924 |
| Segment 2 | 1.748e-08 | 297.5288 |
| Segment 3 | 0.1587 | 1350742327 |

Dispersion test including trafo=2 to confirm the use of the negative binomial regression

| Model | P-value | Dispersion |
|---|---|---|
| General | <2.2e-16 | 1 |
| Segment 1 | 7.739e-09 | 0.4065462 |
| Segment 2 | 5.696e-07 | 0.538103 |
| Segment 3 | <2.2e-16 | 1 |

Appendix 5 Vif Scores

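| Category | Touchpoint | General | Segment 1 | Segment 2 |
|---|---|---|---|---|
| Accommodations | Website | 1.569 | 3.086 | 2.786 |
| | App | 3.042 | 1.220 | 21.058 |
| | Search | 1.737 | 3.040 | 2.575 |
| Information / Comparison | Website | 3.150 | 3.629 | 1.451 |
| | App | 2.372 | 1.172 | 12.019 |
| | Search | 4.919 | 3.434 | 2.340 |
| Tour operator / Travel agent | Website Competitor | 3.056 | 4.503 | 1.290 |
| | App Competitor | 1.468 | 1.338 | 5.200 |
| | Search Competitor | 7.923 | 1.536 | 7.242 |
| | Website Focus Brand | 1.924 | 3.930 | 1.142 |
| | Search Focus Brand | 2.569 | 2.188 | 1.200 |
| Flight tickets | Website | 1.500 | 1.618 | 13.776 |
| | App | 2.299 | 1.380 | 7.331 |
| | Search | 2.905 | 1.988 | 12.943 |
| Generic Search | | 5.554 | 4.152 | 5.00 |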
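To make the meaning of these scores concrete, the sketch below computes variance inflation factors by hand in base R on simulated predictors. It uses the textbook linear-model definition, VIF = 1 / (1 - R²), where R² comes from regressing each predictor on the others. This is a simplification: the thesis script calls `car::vif()` on fitted GLMs, which may give somewhat different values, and the variables `x1`, `x2`, `x3` are invented for the example.

```r
set.seed(7)
n <- 1000
x1 <- rnorm(n)
x2 <- rnorm(n)                    # unrelated to the other predictors
x3 <- x1 + rnorm(n, sd = 0.3)     # strongly collinear with x1
X <- data.frame(x1, x2, x3)

# Regress each predictor on all the others and convert R-squared to a VIF
vif_manual <- sapply(names(X), function(v) {
  others <- setdiff(names(X), v)
  r2 <- summary(lm(reformulate(others, response = v), data = X))$r.squared
  1 / (1 - r2)
})

print(round(vif_manual, 2))
```

A predictor that is largely explained by the others, like `x3` here, gets a high score, which is the situation that leads to a touchpoint type being excluded from a model.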


Appendix 6 R Script

rm(list=ls()) #clear workspace
setwd("C:/Users/delin/OneDrive/Documenten/Marketing Thesis")
TravelData <- read.csv("TravelData.csv", header=TRUE, sep=",")
TravelDataDemos <- read.csv("TravelDataDemos.csv", header = TRUE, sep = ",")

#Libraries
library("foreign")
library("haven")
library("Hmisc")
library("survival")
library("survminer")
library("tibble")
library("MASS")
library("pscl")
library("AER")
library("VGAM")
library("tourr")
library("rrcov")
library("gclus")
library("dplyr")
library("mclust")
summary(TravelData)
summary(TravelDataDemos)
sum(TravelData$purchase_own)
View(TravelData)

#data preparation replacing NA's
is.na(TravelData)
colSums(is.na(TravelData))
colSums(is.na(TravelDataDemos))

#Outliers

# Class of each variable


sapply(TravelData, class)

#Shows all variables through boxplots
OutVals = boxplot(TravelDataDemos$SPSS_Regio5)$out
OutVals = boxplot(TravelDataDemos$RESP_GEM_GROOTTE)$out
OutVals = boxplot(TravelDataDemos$BAS_huishoudgrootte)$out
OutVals = boxplot(TravelDataDemos$BAS_bruto_jaarinkomen)$out
OutVals = boxplot(TravelDataDemos$afg_kinderen_huishouden)$out
which(TravelDataDemos %in% OutVals)
OutVals = boxplot(TravelData$type_touch)$out
OutVals = boxplot(TravelData$purchase_any)$out

# remove na in r - remove rows - na.omit function / option
TravelDataDemo <- na.omit(TravelDataDemos)
colSums(is.na(TravelDataDemo))

##dataframe for creating clusters
segments <- subset(TravelDataDemo, select = c(UserID, SPSS_Regio5, RESP_GEM_GROOTTE, BAS_bruto_jaarinkomen, BAS_huishoudgrootte, afg_kinderen_huishouden))

##compare number of clusters
Model1 <- Mclust(segments)
summary(Model1)
Model2 <- Mclust(segments, G=2)
summary(Model2)
Model3 <- Mclust(segments, G=3)
summary(Model3)
Model4 <- Mclust(segments, G=4)
summary(Model4)
Model5 <- Mclust(segments, G=5)
summary(Model5)
Model6 <- Mclust(segments, G=6)
summary(Model6)
wss <- (nrow(segments)-1)*sum(apply(segments,2,var))
for (i in 2:15) wss[i] <- sum(kmeans(segments, centers=i)$withinss)


ylab="Within groups sum of squares")

BIC <- mclustBIC(segments) plot(BIC)

summary(BIC)

##plot if used model 3

plot(Model3, what = "classification")

# run Mclust to get the MclustOutput

outputMClust <- clustCombi(data = segments, modelNames = "VII")

entPlot(outputMClust$MclustOutput$z, outputMClust$combiM, reg = c(2,3))
# legend: in red, the single-change-point piecewise linear regression;
# in blue, the two-change-point piecewise linear regression.

# added code to extract entropy values from the plot
combiM <- outputMClust$combiM
Kmax <- ncol(outputMClust$MclustOutput$z)
z0 <- outputMClust$MclustOutput$z
ent <- numeric()
for (K in Kmax:1) {
  z0 <- t(combiM[[K]] %*% t(z0))
  ent[K] <- -sum(mclust:::xlog(z0))
}
data.frame(`Number of clusters` = 1:Kmax, `Entropy` = round(ent, 3))

##create cluster variable in data.frame
segments$Clusters3 <- Model3$classification
table(segments$Clusters3, Model3$classification)

##Merge datasets
TravelMerge=merge(TravelData, segments, by="UserID")
colSums(is.na(TravelMerge))

group_by(Clusters3) %>% summarize(mean_typetouch = mean(type_touch))
TravelMerge %>% group_by(Clusters3) %>% summarize(mean_regio = mean(SPSS_Regio5))
TravelMerge %>% group_by(Clusters3) %>% summarize(mean_income = mean(BAS_bruto_jaarinkomen))
TravelMerge %>% group_by(Clusters3) %>% summarize(mean_size = mean(RESP_GEM_GROOTTE))
TravelMerge %>% group_by(Clusters3) %>% summarize(mean_huishoud = mean(BAS_huishoudgrootte))
TravelMerge %>% group_by(Clusters3) %>% summarize(mean_children = mean(afg_kinderen_huishouden))
sum(datasegment3$afg_kinderen_huishouden)
sum(datasegment3$BAS_huishoudgrootte == 1)
sum(datasegment2$BAS_huishoudgrootte == 1)

type10 = sum(visits[type_touch==10]),
type12 = sum(visits[type_touch==12]),
type13 = sum(visits[type_touch==13]),
type14 = sum(visits[type_touch==14]),
type15 = sum(visits[type_touch==15]),
type16 = sum(visits[type_touch==16]),
purchase = sum(purchases[purchase_any]))

TravelFinal=merge(TravelCustomer, segments, by='UserID')
write.csv(TravelFinal, "travelfinal.csv")

##Normal distribution

ggdensity(TravelFinal$purchase, main = "Density plot of purchases", xlab = "purchases")
ggqqplot(TravelFinal$purchase)

##create subsets for separate segments
datasegment1=subset(TravelFinal, subset = (Clusters3==1))
datasegment2=subset(TravelFinal, subset = (Clusters3==2))
datasegment3=subset(TravelFinal, subset = (Clusters3==3))

##poisson regression NULL model
outputNull <- glm(formula = purchase ~ 1, data = TravelFinal, family = poisson)
print(summary(outputNull))

##regression IV~DV
lm_type <- glm(formula = purchase_any ~ type_touch, data = TravelMerge, family = poisson)
summary(lm_type)
lm_types <- glm(formula = purchase ~ type1 + type2 + type3 + type4 + type5 + type6 + type7 + type8 + type9 + type10 + type12 + type13 + type14 + type15 + type16, data = TravelFinal, family = poisson)
print(summary(lm_types))

##test for multicollinearity
library("car")


##regression IV~DV excluding type 6 and type 16
lm_types2 <- glm(formula = purchase ~ type1 + type2 + type3 + type4 + type5 + type7 + type8 + type9 + type10 + type12 + type13 + type14 + type15, data = TravelFinal, family = poisson)
print(summary(lm_types2))
car::vif(lm_types2)

##dispersion test
dispersiontest(lm_types2, trafo=2)
dispersiontest(lm_types2)
mean(TravelFinal$purchase)
var(TravelFinal$purchase)

##negative binomial NULL model
outputNULLNB <- glm.nb(formula = purchase ~ 1, data = TravelFinal)
print(summary(outputNULLNB))

##negative binomial final model
NBmodel <- glm.nb(purchase ~ type1 + type2 + type3 + type4 + type5 + type7 + type8 + type9 + type10 + type12 + type13 + type14 + type15, data = TravelFinal)
print(summary(NBmodel))
exp(coef(NBmodel)) #final

NBmodel2 <- glm.nb(purchase ~ type1 + type2 + type3 + type4 + type5 + type7 + type9 + type10 + type12 + type13 + type14 + type15, data = TravelFinal)
print(summary(NBmodel2))

NBmodel3 <- glm.nb(purchase ~ type1 + type2 + type3 + type4 + type5 + type7 + type10 + type12 + type13 + type15, data = TravelFinal)
print(summary(NBmodel3))

NBmodel4 <- glm.nb(purchase ~ type1 + type2 + type3 + type4 + type5 + type7 + type10 + type15, data = TravelFinal)
print(summary(NBmodel4))

##model fit

pchisq(NBmodel$deviance, NBmodel$df.residual)
AIC(outputNULLNB)
AIC(NBmodel2)
AIC(NBmodel)
AIC(NBmodel3)
AIC(NBmodel4)
BIC(outputNULLNB)
BIC(NBmodel2)
BIC(NBmodel)
BIC(NBmodel3)
BIC(NBmodel4)

##poisson regression with segments as moderator
#segment 1
lm_segment1 <- glm(formula = purchase ~ type1 + type2 + type3 + type4 + type5 + type6 + type7 + type8 + type9 + type10 + type12 + type13 + type14 + type15 + type16, data = datasegment1, family = poisson)
print(summary(lm_segment1))
car::vif(lm_segment1) #type 7 and 16

lm_segment1final <- glm(formula = purchase ~ type1 + type2 + type3 + type4 + type5 + type6 + type8 + type9 + type10 + type12 + type13 + type14 + type15, data = datasegment1, family = poisson)
print(summary(lm_segment1final))
dispersiontest(lm_segment1final)
dispersiontest(lm_segment1final, trafo=2)
mean(datasegment1$purchase)
var(datasegment1$purchase)

lm_segment1NB <- glm.nb(formula = purchase ~ type1 + type2 + type3 + type4 + type5 + type6 + type8 + type9 + type10 + type12 + type13 + type14 + type15, data = datasegment1)
print(summary(lm_segment1NB)) ##final


print(summary(lm_seg1NB2))

seg1_NullNB <- glm.nb(formula = purchase ~ 1, data = datasegment1)
print(summary(seg1_NullNB))
AIC(seg1_NullNB)
AIC(lm_segment1NB)
AIC(lm_seg1NB2)
BIC(seg1_NullNB)
BIC(lm_segment1NB)
BIC(lm_seg1NB2)
exp(coef(lm_segment1NB))
anova(seg1_NullNB, lm_segment1NB, lm_seg1NB2, test = "Chisq")
pchisq(lm_segment1NB$deviance, lm_segment1NB$df.residual)

#segment 2
lm_segment2 <- glm(formula = purchase ~ type1 + type2 + type3 + type4 + type5 + type6 + type7 + type8 + type9 + type10 + type12 + type13 + type14 + type15 + type16, data = datasegment2, family = poisson)
print(summary(lm_segment2))
car::vif(lm_segment2) #type 5, type 8, type 9, type 13, type 14, type 15, type 16

lm_segment2final <- glm(formula = purchase ~ type1 + type2 + type3 + type4 + type6 + type7 + type10 + type12, data = datasegment2, family = poisson)
print(summary(lm_segment2final))
dispersiontest(lm_segment2final)
dispersiontest(lm_segment2final, trafo=2)
mean(datasegment2$purchase)
var(datasegment2$purchase) # variance of segment 2, to compare with its mean

lm_segment2NB <- glm.nb(formula = purchase ~ type1 + type3 + type4 + type6 + type7 + type10 + type12, data = datasegment2)


seg2_NullNB <- glm.nb(formula = purchase ~ 1, data = datasegment2)
print(summary(seg2_NullNB))
AIC(seg2_NullNB)
AIC(lm_segment2NB)
BIC(seg2_NullNB)
BIC(lm_segment2NB)
exp(coef(lm_segment2NB))
anova(seg2_NullNB, lm_segment2NB, test = "Chisq")
pchisq(lm_segment2NB$deviance, lm_segment2NB$df.residual)

#segment 3
lm_segment3 <- glm(formula = purchase ~ type1 + type2 + type3 + type4 + type5 + type6 + type7 + type8 + type9 + type10 + type12 + type13 + type14 + type15 + type16, data = datasegment3, family = poisson)
print(summary(lm_segment3))
car::vif(lm_segment3) #type 4, type 6, type 7, type 9, type 12, type 16

lm_segment3final <- glm(formula = purchase ~ type1 + type2 + type3 + type5 + type8 + type10 + type13 + type14 + type15, data = datasegment3, family = poisson)
print(summary(lm_segment3final))
dispersiontest(lm_segment3final)
dispersiontest(lm_segment3final, trafo=2)
mean(datasegment3$purchase)
var(datasegment3$purchase)

lm_segment3NB <- glm.nb(formula = purchase ~ type1 + type2 + type3 + type5 + type8 + type10 + type13 + type14 + type15, data = datasegment3)
print(summary(lm_segment3NB))

AIC(seg3_NullNB)
AIC(lm_segment3NB)
BIC(seg3_NullNB)
BIC(lm_segment3NB)
exp(coef(lm_segment3final)) # exponentiates the Poisson fit; segments 1 and 2 use the negative binomial fit here
anova(seg3_NullNB, lm_segment3NB, test = "Chisq")
pchisq(lm_segment3NB$deviance, lm_segment3NB$df.residual)

#residual deviance values
with(lm_segment1NB, cbind(res.deviance = deviance, df = df.residual, p = pchisq(deviance, df.residual, lower.tail=FALSE)))
with(lm_segment2NB, cbind(res.deviance = deviance, df = df.residual, p = pchisq(deviance, df.residual, lower.tail=FALSE)))
with(lm_segment3NB, cbind(res.deviance = deviance, df = df.residual, p = pchisq(deviance, df.residual, lower.tail=FALSE)))
