
How consumer segments influence the effect of online and offline advertisement on sales


Academic year: 2021


How consumer segments influence the effect of online and offline

advertisement on sales

Maaike Kort



Master Thesis

Maaike Kort
S2354101
Stadhouderslaan 56a
9717 AK Groningen
maaikekort@hotmail.com

University of Groningen
Faculty of Economics and Business
MSc Marketing
Management & Intelligence tracks

First supervisor: Natasha Walk
Second supervisor: Prof. dr. J.E. Wieringa


Management Summary

Several different channels, where advertisements can be placed, are available, both online and offline. People use different media channels and they react to them in different ways. In this study, specifically the exposure to video advertisement is observed. Regarding offline, this involves television advertisement and online, the exposure of video advertisement on YouTube and RTL Online is measured. In research, it is relatively unique to use advertisement exposure instead of advertisement expenditure. The study is performed on a large consumer panel dataset of Coca-Cola, where data was collected from 10,703 households over a period of 90 days by GfK.

This study investigates how consumer segments influence the effect of online and/or offline advertisement on sales. First, the effects of online advertisement, offline advertisement and multichannel advertisement on sales are measured using a standard linear regression. Multichannel advertisement occurs when a household has been exposed to the advertisement both online and offline. Here, the desired outcome is a synergy effect, where the combined effect of multiple media channels is higher than the sum of the separate effects of online and offline advertisement. The results show that both online and offline advertisements increase sales. There is no synergy effect on sales when a household has been exposed to both online and offline advertisements. The effect of online advertisements is substantially larger than that of offline advertisements. Every time a household watches an additional online advertisement, sales increase by €0.16, whereas an additional offline advertisement increases sales by only €0.02. Furthermore, different segmentation approaches are used. First, a supervised classification procedure is used to evaluate the moderating effects of the variables for which a priori knowledge is available, namely loyalty, age and income. Second, unsupervised cluster techniques, namely the TwoStep and K-means cluster procedures, are used to define clusters and to evaluate the effect these clusters have on the relation between online and/or offline advertisement and sales.

The unsupervised segmentation resulted in a three-cluster solution. Both the TwoStep and the K-means cluster technique revealed that one specific segment significantly moderates the effect of online and offline advertisement on sales. This segment consists of families aged 25 to 45 that have a slightly higher income and children.

In general, it can be concluded that online advertisements have a larger effect on sales than offline advertisements. When the advertisement is targeted to specific consumer groups, the effect of either online and/or offline advertisement on sales becomes larger. Moreover, the supervised segmentation resulted in a larger moderating effect than the unsupervised segmentation. Consequently, with an advertisement campaign a company should target consumer groups based on one specific criterion.


Preface

As a Marketing Management and Intelligence student, this thesis challenged me to combine the best of both worlds. It was my first time working with a large consumer dataset. Though this posed some challenges, I enjoyed the fact that this was real-life data. When I was sometimes lost in the data, turning to the practical implications helped me to keep an eye on the end result. I really felt a connection with the subject of my thesis, which was a great motivation and a real driver.

I want to thank my supervisor Natasha Walk for the time, energy and enthusiasm she put into it together with me. The valuable feedback and guidance in difficult times were extremely helpful. Furthermore, I would like to thank my fellow students for their help and feedback. Moreover, I would like to thank my family, friends and especially roommates for their support during the stressful moments.

Maaike Kort

ABSTRACT

With many new advertisement channels present, this study investigates how consumer segments moderate the effect of online and/or offline advertisement on sales. The study is performed on a large panel dataset of Coca-Cola. The results show that both online advertisement (YouTube and RTL online video advertisement) and offline advertisement (television advertisement) increase sales. There is no synergy effect on sales when a household has been exposed to both online and offline advertisements. However, the supervised segmentation shows that loyal customers negatively moderate the relation between multichannel advertisement and sales. Furthermore, consumers with a relatively low income are more reactive to online advertisement, while consumers with a moderately high income are more reactive to offline advertisement than consumers with a low income. Finally, unsupervised cluster techniques reveal that families aged 25 to 45 that have a slightly higher income and children react more strongly to both online and offline advertisements.


1. Introduction

In many firms, marketers have a difficult time justifying their expenditures in terms of direct return on investment (Verhoef & Leeflang 2009). Some marketing consultants believe that most marketing programs do not earn an acceptable return on investment (Clancy & Shulman 1994). It is said that only half of the advertising expenditures generate economic benefits to the firm (Abraham & Lodish 1990). Many marketers do not measure their marketing actions or have difficulty finding the right metrics (Verhoef & Leeflang 2009). In this paper, online and offline marketing will be made accountable.

Firms must manage multiple forms of online and offline advertising simultaneously, and make strategic budget allocation decisions across advertising forms (Dekimpe & Hanssens 2007; Lehmann 2004; De Haan, Wiesel & Pauwels 2016). In practice, it has become a general rule that one allocates marketing budget between different media channels based on their return on investment (ROI) (Powell 2003; Venkatesan et al. 2014). Firms allocate their advertising across several media channels to increase reach, brand awareness, sales and profitability (Shrihari et al. 2016).

Advertisement management, particularly media channel selection, has never been more challenging for marketing managers (Danaher & Dagger 2013). The number of channels to keep under consideration when designing a media plan (when, to whom and with which media channels the media content is conveyed) has grown dramatically, making the advertisement environment more complex and cluttered (Abedi 2017; Danaher & Rossiter 2011; De Mooij & Hofstede 2010). This enables the marketer to use a range of less costly advertisement channels, but if one wants to utilize the full range of available advertisement channels, it will become more costly (De Mooij & Hofstede 2010). These different media channels can be divided into online and offline media channels. Lately, advertising and marketing efforts have increasingly been directed at online content (Chen, An and Yang 2015).

Coca-Cola company Marcos de Quito said at a conference of Beverage Digest, “Digital is important, of course, but the effectiveness of TV is still very, very critical for our business, it still offers the best ROI across media channels” (Coca-Cola 2017). However, viewers have been moving away from watching traditional broadcast channels towards online video consumption to gain more control over their media consumption (Schweidel & Moe 2016). Marcos de Quito mentioned that for Coca-Cola in 2016, digital marketing accounted for 20 percent of the global marketing budget (Coca-Cola 2017).

To make advertising effective, an important component is the quality of the match between advertisers and the intended recipients of ads (Athey & Gans 2010). Firms should recognize that people have different kinds of relationships with brands (Avery et al. 2014). Marketers can utilize more media types to accomplish specific communication objectives (Batra & Keller 2016). The media channels in a typical marketing mix differ from each other in their level of targetability, effectiveness, and cost (Abedi 2017). Television ads are used to reach a large target audience and make them aware of the brand and its offerings, but are highly expensive (Batra & Keller 2016). With online advertising, consumers can be targeted directly, for example with local ad campaigns based on IP addresses (Athey & Gans 2010).

Overall, many new types of media have arisen in the last decade, which can be divided into online and offline channels (Abedi 2017). Consumers have different media usage patterns and utilize different media channels. This article will investigate which consumers use which media channels and how that affects sales. Firms can use this information to more efficiently and effectively allocate their marketing advertisement budget. This leads to the following research question and sub-questions:

How does the effect of online and offline advertisement on sales vary for different customer segments?

- What are the best segmentation criteria for the classification of consumer preferences regarding channel advertisement?

- How do specific segments react to online and/or offline advertisement?

- How should the marketing budget be distributed based on the given information?

First, the effect of online and offline advertisement on sales needs to be determined within the Coca-Cola dataset. Afterwards, the focus of this paper is first on finding the best segmentation criteria. Based on previous research and the results of the segmentation study, the paper will show how specific consumer segments react differently to offline and online advertisement. These questions are important for the research field, as a specific segmentation for online and offline advertisement on sales has, to the best of my knowledge, not been researched before. The combination of multiple segmentation approaches, both supervised and unsupervised, will provide more depth to the segmentation literature.

Another contribution of this paper to the existing literature is that advertisement exposure is used. In most papers, advertisement expenditure is used as a measure for online and offline advertisement. In this paper, advertisement exposure will be used, which is relatively unique in research. Advertisement exposure measures how many times a household has seen the advertisement. Besides, studying and modelling individual behaviour has become more and more the focus in marketing research (Leeflang et al. 2015). In this paper, the perspective of the individual consumer will provide valuable insights for the segmentation.

synergies, with a combination of both. This will help them to spend their money more effectively.

In the remainder of this paper, first, the literature will be reviewed on the effects of advertisement on sales, online and offline media channels, synergy effects and finally, segmentation variables. Afterwards, empirical research will show how online, offline and multichannel advertisements influence the sales of Coca-Cola. Moreover, different segmentation approaches will be used. First, a supervised classification procedure will be used to evaluate the moderating effects of the variables where a priori knowledge is available, namely loyalty, age and income. Furthermore, unsupervised cluster techniques are used to define clusters and evaluate the effect these clusters have on the relation between online and/or offline advertisement on sales.

2. Theoretical Framework

2.1 The effects of advertisement on sales

Advertising is defined as any form of paid communication by an identified sponsor aimed to inform and/or persuade target audiences about an organization, product, service or idea (Belch & Belch 2004; Tellis 2004; Yeshin 2006). Without advertising, we would not be aware that certain products or services exist (Fennis & Stroebe 2016).

In early research the relationship between advertising and sales has been established (Erickson 1992; Porter 1974). Comanor & Wilson (1967) find that advertising has a significant and positive effect on profit rates. Hirschey (1982) found that advertising leads to a significant positive profitability effect in the consumer goods industries. Dekimpe & Hanssens (1999) found that advertising has a positive and enduring effect on base sales. Also in recent papers, for example Chen & Waters (2017), a positive relationship between advertising expenditures and profits is found.

broader market (Kumar, Sharma & Gupta 2017). In saturated industries, advertising may be used to maintain customer loyalty and/or redistribute market shares (Elliott 2001).

Regarding the length of the effect, the returns to advertising diminish fast, typically after the third exposure (Tellis 2003). The first exposure has the most influence (Tellis 2003). If advertising has no immediate sales impact, it will have no sales impact at all (Lodish et al. 1995). The positive effect of offline advertisement on sales has been extensively tested and confirmed in the literature. As a result, no hypothesis will be formulated for it in this research. However, the effect of offline advertisement will be tested.

2.2 Offline media channels

Offline advertising can be defined by the media spending on television, newspapers, magazines, radio, and direct mail (Naik & Peters 2008). Thus, offline media consists of mass media and individually targeted media (Naik & Peters 2008). In this paper, television, print (including newspapers and magazines), radio, display and direct mail will be included in the definition of offline media channels, also called traditional media.

Even though many new media have appeared, television remains the most trusted source of news and information by a wide margin (Danaher and Rossiter 2011; eMarketer 2012). The typical consumer is exposed to about 29 minutes of paid television advertising per day (Joo et al. 2014). The cost of a 30-second advertisement on TV at prime time in the US was about 500,000 US dollars in 2012 (Chen et al. 2015). The majority of viewing still occurs at home, on TVs (GfK 2016). Kumar et al. (2017) confirm earlier findings that television advertising leads to an increase in brand sales. In this paper, television, more specifically the exposure to a Coca-Cola advertisement on television, will be used. The effects of television advertisement on sales have been established in the literature and will therefore not be hypothesized again.

2.3 Online media channels

television together account for $63.4 billion in 2015. For Consumer Packaged Goods firms, which include Coca-Cola, the amount of money spent on Internet advertisement represents 6% in FY 2015 (IAB 2016).

Online media channels have been defined differently within the existing literature. Batra & Keller (2016) highlight six ways of online communication: search, display, website, e-mail, social media and mobile. Danaher & Dagger (2013) consider display, sponsored search and social media as forms of online advertising. Goldfarb (2014) classifies three areas of online marketing: search advertising (paid search results), classified advertising (online version of newspaper ads) and display advertising (banner ads, pop-ups, video ads, ads on social media). Sridhar et al. (2016) define online advertising as all Internet-based media, including search and display advertising. In this paper, online media channels are defined as the media that can be used to reach the intended audience online. These online media channels are categorized into search, display, e-mail, social media and digital video.

Srinivasan et al. (2015) found that online metrics explain about 15% of the variance in sales, where owned media accounts for 10%, earned media for 3% and paid media for 2%. They also found that TV advertisement accounts for 5% of the variance in sales. Kumar et al. (2017) also found that online advertisement, namely on social media, has a larger effect on sales than television advertising. Agarwal et al. (2011) found that for sponsored search advertisement, an online ad leads to higher conversion for an online retailer.

H1: Online video advertisement has a positive effect on sales.

2.4 Synergy effects

Next, the literature on synergistic effects of different media channels is reviewed. When using multiple media, it is important to develop a well-integrated marketing communication (IMC) plan for a brand. Integrated marketing communications research has shown that delivering a consistent message through multiple consumer touch points is more effective than managing disparate, medium-specific campaigns (Joo et al. 2014). The use of advertising on multiple media channels can lead to synergy effects. Media synergy is defined by Naik and Raman (2003) as “the combined effect of multiple (media) activities exceeds the sum of the individual effects.” Another reason to use multiple channels is that people frequently use several media simultaneously (Lin et al. 2010). Users have multiple devices and devices have multiple users (GfK 2016). Nielsen (2014) estimates that 84% of US tablet and smartphone users engage in media multitasking while watching television.

Recent research has looked at the integration of traditional media and online media. Chang and Thorson (2004) found that exposure to both television and online advertisements leads to higher customer attention and purchase intention than repeated exposure to television advertisement. Naik and Peters (2009) found synergies between spending in online and offline media. They found that online advertising amplifies the effectiveness and synergies of offline media and increases the number of visits to the website. Joo et al. (2014) show that advertising on television can influence online behaviours such as online search and online shopping. In this paper, the synergy effect on sales will be tested. However, not all synergies hold: Kumar et al. (2017) found no significant synergy effect on sales between social media and television advertisement. Therefore, it is interesting to discover whether the synergy between television advertisement and YouTube and RTL XL video advertisement will hold. This leads to the following hypothesis:

H2: The effect of offline and online advertisement on sales together is stronger than their separate effects.
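
One way to make H2 concrete, assuming the standard linear regression described in the methodology and a multiplicative interaction between the two exposure variables, is sketched below. The subscripts (household i, period t) and coefficient symbols are illustrative notation and are not taken from the thesis itself.

```latex
\text{Sales}_{it} = \beta_0
  + \beta_1\,\text{Online}_{it}
  + \beta_2\,\text{Offline}_{it}
  + \beta_3\,\bigl(\text{Online}_{it} \times \text{Offline}_{it}\bigr)
  + \varepsilon_{it}
```

Under this specification, H1 corresponds to a positive and significant \beta_1, while H2 would be supported by a positive and significant \beta_3, since the interaction term captures any effect beyond the sum of the separate online and offline effects.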

customers the ability to view rich visual representations of the information being conveyed (Alba et al. 1997). When marketers use several media types, it is advised to keep the specific communication objectives in mind for each media channel. Mass-media channels that effectively build brand awareness can enhance the effects of channels used to communicate more detailed information (Prins & Verhoef 2007). Synergy effects based on communication objectives are beyond the scope of this paper. The objective of both types of media channels (television advertisements and YouTube video advertisements) will be similar. Both are mass-media channels with one-way communication, aimed primarily at creating and maintaining brand awareness. They can only be tailored to a certain extent, namely through the type of television or YouTube channels on which they appear and the time and frequency of broadcasting. As Coca-Cola is advertising on multiple media online and offline, it will be interesting to see if these synergies hold. In the next section, the segmentation variables will be reviewed. As not every consumer has seen or noticed advertisement on all media, it will be interesting to investigate whether the effect on sales differs between consumer groups.

2.5 Segmentation

Segmentation divides the market into distinct groups of homogeneous consumers who have similar needs and behaviour (Keller 2013). There are several types of customer segmentation: demographic, behavioural, or values-based (Avery et al. 2014). Braun et al. (2007) mention even more segmentation strategies, namely: demographic, perceived benefits, expected behaviour, loyalty status, lifestyle and psychographic categories of potential consumers. Socio-demographic variables are more readily available and can be applied to segmentation problems with relative ease, thus have also been researched more often (Myers 1996). Several criteria for segmentation will be discussed below.

Thus, segmentation is the overarching term used in this paper, while clustering refers to the unsupervised procedures and classification to the supervised one.

2.5.1 Supervised segmentation

One of these variables, loyalty, is a behavioural variable; the other variables are socio-demographic variables. Loyalty, age and income will each be tested separately to see whether they influence the relation between online and/or offline advertisement and sales.

Loyalty

Loyalty is defined as the repeat purchasing behaviour of shoppers towards one type of channel (Jones & Sasser 1995). As early as 1980, Raj argued that differences in advertising effects on consumers with different purchase characteristics are interesting to examine. He investigated consumers who were either loyal or non-loyal to the advertised brand and found that increased advertising has different effects for these groups. For loyal customers, the advertising effects last longer and have a carryover effect of 3 to 6 months. Nayga et al. (2005) also made segments beforehand based on consumer loyalty, categorizing consumers as either a non-consumer, a consumer who bought a product once or a consumer who bought a product multiple times. They found that when consumers received more information about a product, non-loyal consumers indicated they were more willing to buy the product. Loyal consumers are more habitual buyers and respond less to price and promotions (Mela et al. 1997).

H3a: Loyal consumers have a positive effect on the effect of I) online advertisement, II) offline advertisement, III) the combination of online and offline advertisement, on sales.

Age

Regarding the effect of age on the selection of media channels, there is a gap in the literature. Several small online studies and news sites report different outcomes. Nielsen (2016) found in their Total Audience Report of Q1 2016 that the average adult in the US devotes 10 hours and 39 minutes each day to consuming media. Of this total, radio (240 min) and TV (226 min) account for the most time, followed by online media such as smartphones (191 min) and Internet (162 min). Younger individuals are more likely to be active on social media (Pew Research 2015). Furthermore, young people are spending more time playing and socialising online than watching television programmes, according to an annual media monitoring report based on a sample of more than 2,000 5- to 16-year-olds (BBC 2016). Research by Thinkbox from 2014 found that in the UK, young people between the ages of 16 and 24 still spend 65% of their total viewing time watching television (3.5 hours per day) (Campaign 2015). This is still relatively low compared to the average of 81% for all individuals (Campaign 2015). Other research showed that young people in the UK aged between 16 and 24 spent more than 27 hours a week on the internet in 2014, or about 3.8 hours per day (Telegraph 2015). In Sweden, men between the ages of 16 and 25 spend the most time on the Internet, about 42.5 hours per week (Statista 2016).

Older consumers will be more inclined to use offline media channels. Therefore, when offline media advertisement is targeted at older people, sales will be higher than when it is targeted at younger people. On the other hand, younger consumers are more likely to use online media channels. When online media advertisement is targeted at younger people, sales will be higher than when targeted at older people. This leads to the following hypothesis:

H3b: The age of different consumer groups has a moderating effect on the effect of I) online advertisement, II) offline advertisement, III) the combination of online and offline advertisement, on sales.

Income

industry, thus buy A-brands. Research shows that income is related to buying branded products, like Coca-Cola. However, divergent results are found.

Besides, the literature has limited research on income effects on channel selection or advertisement. Mitchell (1994) found that higher income earners were more technologically savvy than lower income earners. Based on limited knowledge, this hypothesis will be exploratory.

H3c: Income differences have a moderating effect on the effect of I) online advertisement, II) offline advertisement, III) the combination of online and offline advertisement, on sales.

Gender

Males and females have different patterns of thinking and behaving. This also influences how they perceive advertisement. Males tend to have more positive attitudes toward advertising than females (Kempf et al. 1997), while female respondents generally evaluate ads as more memorable than male respondents do (Kam et al. 2017). Women respond more strongly to negative images and movies, and their emotional reaction lasts longer compared with men (Kring & Gordon 1998; Rottenberg et al. 2007).

Males and females also react differently to the new technologies and types of channels that have arisen. Males are more curious than females and therefore more willing to try new products, including new technology (Lu et al. 2006). Females are more socially oriented and therefore prefer technology that enables them to connect to others (Goh & Sun 2014). Garbarino & Strahilevitz (2004) show that perceived ease of use has an equal impact on purchase intention for females and males. Thus, the findings on how males and females react to the new online channels in general are divergent.

research does not include gender, as purchases are made by and for the entire household. Therefore, the role of gender will be excluded from this study.

2.5.2 Unsupervised segmentation

Within the current literature, no insights are provided regarding other socio-demographic variables. Therefore, other socio-demographic variables like household size, education and in what size of town people live will be included only in the unsupervised segmentation. Then, an algorithm that detects patterns in the data determines clusters. These clusters will be used to see if specific consumer groups enhance the effect of online and/or offline advertisement on sales.

H4: Natural clusters defined by data will enhance the relation between online and/or offline advertisement and sales.
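
To illustrate what "an algorithm that detects patterns in the data" can look like, here is a minimal sketch of K-means (Lloyd's algorithm), one of the two unsupervised techniques named in this study. The two-dimensional toy points, the two starting centroids and the function name are hypothetical. The thesis itself clusters households on several socio-demographic variables using statistical software, so this shows the principle and is not the procedure actually used.

```python
import math

def kmeans(points, initial_centroids, max_iterations=100):
    """Lloyd's algorithm: alternate between assigning each point to its
    nearest centroid and moving each centroid to the mean of its members."""
    centroids = list(initial_centroids)  # copy so the caller's list is untouched
    assignment = None
    for _ in range(max_iterations):
        # Assignment step: index of the nearest centroid for every point.
        new_assignment = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # assignments are stable, so the algorithm has converged
        assignment = new_assignment
        # Update step: move each centroid to the mean of its cluster.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignment, centroids

# Hypothetical, already standardised household features (e.g. age and income).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centers = kmeans(points, initial_centroids=[points[0], points[3]])
```

In practice the variables would need to be put on a comparable scale first, and the number of clusters would be chosen from the data rather than fixed in advance as it is in this toy example.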

2.6 Conceptual model


Figure 1: Conceptual model

3. Methodology

In the following section, a detailed look will be taken at the original data provided by GfK for this case. After inspection for outliers and missing values, the data is transformed into a dataset fitted to this specific research. Furthermore, the type of analysis is described. A standard linear regression is first used to test hypotheses 1 and 2. In order to test hypothesis 3, a supervised classification approach will be used. Afterwards, two unsupervised cluster techniques, namely TwoStep and K-means, will be used to test hypothesis 4.

3.1 Data description

Relatively unique in the research field is that the exposure of individual consumers to advertisement is measured. A more common approach in the marketing literature is to look at the advertising spending by a company. Exposure to advertisement is measured in three ways: with an online panel, a TV panel or an opportunity to see (OTS) TV survey panel. In the online panel, 9,380 households participated. They were asked to fill in an online survey that included, among others, the question whether they had seen the advertisement on YouTube or RTL XL. In the TV panel, 1,443 households participated. They had a device at home next to their television, which registers through sound when and which channels have been watched. Based on that, it is determined how often the household has seen the Coca-Cola commercial. In the OTS panel, 6,913 households were asked several questions about their television behaviour. Based on this, the chance that the household has seen the advertisement is estimated. Respondents were either in the TV panel or in the OTS TV panel, not in both. In the following section, it will be evaluated whether these data can be combined.

3.2 Data manipulation

3.2.1 Missing data and outliers

After inspection, some missing data was found in the socio-demographic dataset, as thirteen households did not provide demographic information. The missing values for these households are imputed with the most frequent category per variable, while making sure the data makes sense. For example, the variable District is imputed with the most similar one, as there is a direct link between the postal code and the district.
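
The most-frequent-category rule described above can be sketched as follows. The record layout, the column name `education` and the helper name are hypothetical, and the sketch does not reproduce the additional plausibility check on District.

```python
from collections import Counter

def impute_most_frequent(rows, column, missing=None):
    """Return a copy of rows in which missing entries of one categorical
    column are replaced by that column's most frequent observed value."""
    observed = [r[column] for r in rows if r[column] is not missing]
    if not observed:
        return list(rows)  # no observed values to learn the mode from
    mode = Counter(observed).most_common(1)[0][0]
    return [
        dict(r, **{column: mode}) if r[column] is missing else r
        for r in rows
    ]

# Hypothetical toy records standing in for the socio-demographic dataset.
households = [
    {"id": 1, "education": "high"},
    {"id": 2, "education": "middle"},
    {"id": 3, "education": None},
    {"id": 4, "education": "middle"},
]
imputed = impute_most_frequent(households, "education")
```

Mode imputation is simple but shrinks the variation in the imputed variable, which is presumably acceptable here because only thirteen households are affected.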

Another variable that contains missing data is the HML-value, which describes the level of loyalty as either high, medium, low or not loyal. This value is given to consumers based on a continuous panel. However, the value 99 means that no level is assigned, thus it is missing. This holds for 887 households. Here, the predictive mean matching method of hot-deck imputation is applied using MICE in R for which a predictor matrix is created first (Van Buuren & Groothuis-Oudshoorn 2011). The variable HML-value will interchangeably be referred to as loyalty and HML-value.

the two datasets are merged into one dataset, labelled ‘total clean’, which contains information from 10,698 households over 90 days.

3.2.2 Variable computation

In order to perform the empirical analysis, three new variables need to be computed: online advertisement, offline advertisement, and multichannel advertisement. To evaluate whether the variables Aantal_tv_contact_tv_panel and Aantal_OTS_tv_contact can be combined, their averages are compared first. In table 1, the descriptive statistics are provided for the variables measuring the number of TV advertisements seen. From the minimum and maximum values, it can be seen that a similar measuring scale is used. However, only 1,443 households participated in the TV panel, while 6,913 households participated in the OTS panel. Therefore, to calculate the mean of each variable, only the households that participated in the corresponding panel are taken into account.

Table 1: Descriptive statistics from variables measuring TV advertisement.

To judge whether the two variables can be combined, the corrected means should not be significantly different. A chi-square test is reported to show that the means are not significantly different (reported as p<0.001, although a p-value this small would normally indicate a significant difference). Consequently, the means of the TV panel and the OTS panel are treated as the same. A new variable, Offline_Advertisement, is computed in which the two variables are combined. This variable contains data from 8,354 households, indicating how many times they might have seen the Coca-Cola advertisement on television. On average, a household has seen this TV ad 6.5 times within 90 days.
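
As a rough cross-check of this comparison, the summary statistics reported in table 1 can be fed into a two-sample test for a difference in means. The sketch below uses Welch's t-test with a normal approximation for the p-value. This is an assumption made for illustration and is not the chi-square test reported in the text. The function name is hypothetical, and the inputs are the N, mean and SD values listed in table 1.

```python
import math

def welch_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic from summary statistics, with a two-sided
    p-value from the normal approximation (reasonable for large samples)."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    t = (mean1 - mean2) / standard_error
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|t|)).
    p = math.erfc(abs(t) / math.sqrt(2))
    return t, p

# Table 1: TV panel (N=1443) versus OTS panel (N=6911).
t_stat, p_value = welch_from_summary(6.575, 6.082, 1443, 6.453, 5.448, 6911)
```

With these inputs the p-value lies well above 0.05, which is in line with the conclusion that the two panel means do not differ significantly.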

The variable Online_Advertisment is a summation of the variables YT_contacts and

RTL_contacts. These variables are measured in the same online panel and have an equal scale,

therefore, they can be combined. Furthermore, the variable Multichannel_Advertisement is computed after the aggregation on the week level. This variable enables the synergy effect.

Variable name N Min. Max. Mean SD

Aantal_tv_contact_tv_panel 1443 0.00 43.00 6.575 6.082

Aantal_OTS_tv_contact 6911 0.00 40.56 6.453 5.448

(23)

23 Therefore, this variable has the condition that it only has a value when a household has seen the Coca-Cola advertisement both online and offline. Once this condition is met, the two variables are multiplied, as it is an interaction effect between online and offline. There are 527 cases where households have viewed both at least one online and one offline advertisement of Coca-Cola.
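The construction of the online and multichannel variables described above can be sketched as follows. The household rows are hypothetical and not taken from the panel, and treating "no value" as zero is an assumption; under that assumption the condition is equivalent to a plain product of the two exposure counts.

```python
def multichannel(online, offline):
    """Interaction term: online * offline, non-zero only if both were seen."""
    if online > 0 and offline > 0:
        return online * offline
    return 0  # assumption: households without both exposures get zero


# Hypothetical weekly exposure counts for three households.
weekly_rows = [
    {"household": "A", "yt_contacts": 1, "rtl_contacts": 2, "offline": 4},
    {"household": "B", "yt_contacts": 0, "rtl_contacts": 0, "offline": 7},
    {"household": "C", "yt_contacts": 2, "rtl_contacts": 0, "offline": 0},
]

for row in weekly_rows:
    # Online exposure is the sum of the YouTube and RTL Online contacts.
    row["online"] = row["yt_contacts"] + row["rtl_contacts"]
    row["multichannel"] = multichannel(row["online"], row["offline"])

for row in weekly_rows:
    print(row["household"], row["online"], row["offline"], row["multichannel"])
```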

3.2.3 Aggregation

In order to interpret the effects fairly, a new dataset is created where only households are included who participated in both the online and offline panel. In this dataset, 7,741 households are included with information over 90 days. Besides, the inpanel dataset is aggregated in multiple ways. The overview of the aggregation steps can be found in figure 2 below.

Figure 2: overview of aggregation steps

[…] 03-02-2014. From this, it can be concluded that an online campaign for YouTube and RTL Online started on 03-02-2014. Secondly, the RTL Online campaign probably lasted 46 days, until 21-03-2014, as the exposure becomes zero again from that point. Third, it can be seen that TV is a much more popular channel for Coca-Cola, as the advertisements on TV have been seen many more times than online. Fourth, from this graph, we can detect two periods with no advertisement exposure: from 10-02-2014 till 23-02-2014 (week 7 and 8) and after 18-03-2014. From this, it can be inferred that Coca-Cola did not broadcast during these periods. Finally, there are only three weeks where both the online and offline campaign have run at the same time. This results in only 111 times where consumers have actually seen the Coca-Cola advertisement on both channels.

Graph 1: The number of advertisements seen over a 90-day period.

Graph 2 displays the number of times an advertisement of Coca-Cola is seen and how many times the product is bought over time. The right y-axis shows the number of times the advertisement is seen online, while the left y-axis shows the number of offline advertisements seen as well as the number of purchases. The number of purchases (left y-axis) in particular shows a very constant pattern based on the day of the week. The day of the week effect of sales can be found in Appendix B. In the period where the commercial was not broadcast on TV, the product was still bought very regularly.

To conclude, based on the day of the week effect present for Coca-Cola, it is decided that the data will be aggregated per week. This will result in 7,741 households * 13 weeks = 100,633 rows with data. In order to be able to measure the online effect better, the first five weeks will be excluded from the data. This will result in a dataset of 7,741 households * 8 weeks = 61,928 rows with data.
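The weekly aggregation and the resulting row counts can be sketched as below. The helper `aggregate_weekly`, the daily rows and the start date are hypothetical; only the household and week counts come from the text.

```python
from collections import defaultdict
from datetime import date


def aggregate_weekly(daily_rows, start):
    """Sum daily values per (household, week), with week 1 starting at `start`."""
    totals = defaultdict(float)
    for household, day, value in daily_rows:
        week = (day - start).days // 7 + 1
        totals[(household, week)] += value
    return dict(totals)


# Hypothetical daily purchase values; the start date is an assumption.
start = date(2014, 1, 1)
daily_rows = [
    ("hh1", date(2014, 1, 1), 1.5),
    ("hh1", date(2014, 1, 7), 2.0),
    ("hh1", date(2014, 1, 8), 0.5),
    ("hh2", date(2014, 1, 3), 1.0),
]
weekly = aggregate_weekly(daily_rows, start)

# Row counts of the household-by-week panel as described in the text.
households, weeks_total, weeks_dropped = 7741, 13, 5
rows_all = households * weeks_total
rows_kept = households * (weeks_total - weeks_dropped)
print(weekly, rows_all, rows_kept)
```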

[Graph: "Coca-Cola purchases and advertisement". Series: Coca-Cola purchases, offline advertisements, online advertisement. X-axis: date. Left y-axis: number of times. Right y-axis: number of advertisements seen online.]

Graph 2: link between Coca-Cola purchases and advertisements of Coca-Cola seen

3.2.4 Variable re-coding

The socio-demographic variables will be re-coded into numeric values for the segmentation analyses. The variable Age is divided into four categories instead of the original eleven categories. This is based on CBS categorization (CBS 2007). Regarding the variable Income, the original classes have been split evenly into four groups, as it is assumed the GfK categories follow a distribution based on the national distribution. All the socio-demographic variables and their categories are presented in table 2.

Number  HML-value (Loyalty)  Age    Income     HH composition                   Children     HH size  Education  Size of town      District
0       Not loyal            -      -          -                                No children  -        -          -                 -
1       Low                  <24    <900       2+, no children                  1 child      1        Basic      <20.000           3 big cities
2       Medium               25-45  1500-2500  Alone                            2 children   2        MAVO       20.000 – 50.000   North
3       High                 45-65  2500-3500  Family with young children       3 children   3        HAVO/VWO   50.000 – 100.000  East
4       -                    >65    >3500      Family with middle age children  >4 children  >4       LBO        >100.000          Rest west
5       -                    -      -          Family with teenage children     -            -        MBO        -                 South
6       -                    -      -          -                                -            -        HBO        -                 -
7       -                    -      -          -                                -            -        WO         -                 -

Table 2: socio-demographic variables and their categories

3.3 Type of analysis

3.3.1 Regression

A standard linear regression of sales on online advertisement, offline advertisement and multichannel advertisement is performed. The original variable Value, meaning the amount of money spent on Coca-Cola products, will be used as the dependent variable and is renamed Sales. Sales is used because it provides more information than the number of purchases alone, as multiple units of Coca-Cola can be bought within one purchase.

Standard additive linear regression

$$S_{i,t} = \alpha + \beta_1 N_{i,t} + \beta_2 F_{i,t} + \beta_3 (N \ast F)_{i,t} + \varepsilon_{i,t}, \qquad i = 1, \dots, 7741, \quad t = 1, \dots, 13 \tag{1}$$

$S_{i,t}$ = Sales of Coca-Cola in week t for consumer i, measured in euros

$\alpha$ = intercept

$N_{i,t}$ = Online advertisement exposure, number of times consumer i has seen the advertisement on RTL Online or YouTube in week t

$F_{i,t}$ = Offline advertisement exposure, number of times consumer i has seen the advertisement on TV in week t

$(N \ast F)_{i,t}$ = Multichannel advertisement exposure, number of times consumer i has seen both an online and an offline advertisement in week t

$\varepsilon_{i,t}$ = error term
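To make the structure of equation 1 concrete, the sketch below estimates an intercept, two main effects and their product term by ordinary least squares on a tiny, noise-free, hypothetical dataset. The helpers `solve` and `ols`, the exposure pairs and the coefficient values are illustrative assumptions; the study itself reports R-style regression output.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical (online, offline) exposures per household-week.
exposures = [(0, 0), (1, 0), (0, 2), (2, 1), (1, 3), (3, 2), (0, 5), (2, 4)]
# Hypothetical coefficients: alpha, beta_1 (N), beta_2 (F), beta_3 (N*F).
true_beta = [40.0, 15.0, 2.0, -7.0]

# Design matrix columns: intercept, N, F, N*F (the multichannel term).
design = [[1.0, n, f, n * f] for n, f in exposures]
sales = [sum(b * x for b, x in zip(true_beta, row)) for row in design]

beta_hat = ols(design, sales)
print([round(b, 4) for b in beta_hat])
```

Because the toy sales are generated without an error term, the estimated coefficients reproduce the assumed ones up to floating-point precision.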

3.3.2 Supervised segmentation

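$$\begin{aligned} S_{i,t} = {} & \alpha + \beta_1 N_{i,t} + \beta_2 F_{i,t} + \beta_3 M_{i,t} + \beta_4 A2_{i,t} + \beta_5 A3_{i,t} + \beta_6 A4_{i,t} \\ & + \beta_7 (N \ast A2)_{i,t} + \beta_8 (N \ast A3)_{i,t} + \beta_9 (N \ast A4)_{i,t} \\ & + \beta_{10} (F \ast A2)_{i,t} + \beta_{11} (F \ast A3)_{i,t} + \beta_{12} (F \ast A4)_{i,t} \\ & + \beta_{13} (M \ast A2)_{i,t} + \beta_{14} (M \ast A3)_{i,t} + \beta_{15} (M \ast A4)_{i,t} + \varepsilon_{i,t} \end{aligned} \tag{2.1}$$

$$\begin{aligned} S_{i,t} = {} & \alpha + \beta_1 N_{i,t} + \beta_2 F_{i,t} + \beta_3 M_{i,t} + \beta_4 L1_{i,t} + \beta_5 L2_{i,t} + \beta_6 L3_{i,t} \\ & + \beta_7 (N \ast L1)_{i,t} + \beta_8 (N \ast L2)_{i,t} + \beta_9 (N \ast L3)_{i,t} \\ & + \beta_{10} (F \ast L1)_{i,t} + \beta_{11} (F \ast L2)_{i,t} + \beta_{12} (F \ast L3)_{i,t} \\ & + \beta_{13} (M \ast L1)_{i,t} + \beta_{14} (M \ast L2)_{i,t} + \beta_{15} (M \ast L3)_{i,t} + \varepsilon_{i,t} \end{aligned} \tag{2.2}$$

$$\begin{aligned} S_{i,t} = {} & \alpha + \beta_1 N_{i,t} + \beta_2 F_{i,t} + \beta_3 M_{i,t} + \beta_4 I2_{i,t} + \beta_5 I3_{i,t} + \beta_6 I4_{i,t} \\ & + \beta_7 (N \ast I2)_{i,t} + \beta_8 (N \ast I3)_{i,t} + \beta_9 (N \ast I4)_{i,t} \\ & + \beta_{10} (F \ast I2)_{i,t} + \beta_{11} (F \ast I3)_{i,t} + \beta_{12} (F \ast I4)_{i,t} \\ & + \beta_{13} (M \ast I2)_{i,t} + \beta_{14} (M \ast I3)_{i,t} + \beta_{15} (M \ast I4)_{i,t} + \varepsilon_{i,t} \end{aligned} \tag{2.3}$$

3.3.3 Unsupervised segmentation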

Furthermore, in order to test hypothesis 4, the households are grouped into homogeneous groups using both a TwoStep cluster technique and a k-means cluster approach. These cluster analyses are preferred to a hierarchical approach, since the latter method computes all cluster combinations of all sizes and, hence, is not appropriate for large samples such as the one used in this research (Nasir 2017). Besides, the TwoStep cluster procedure performs well on categorical variables. In the cluster analyses, all nine socio-demographic variables will be included as described in table 2. The essence of both analyses is to place the households into groups on the basis of the similarity of their profiles on the combination of the variables.

The TwoStep Cluster Analysis procedure is an exploratory method designed to reveal natural groupings (or clusters) within a dataset that would otherwise not be apparent. As a distance measure, the log-likelihood measure is used, as categorical variables are included. A distance measure calculates how different clusters are from each other. This type of analysis automatically determines the optimal number of clusters.

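The cluster solutions will be added as a moderator to the original regression. Equation 3.1 shows the cluster solutions of the TwoStep cluster analysis as a moderator on the relation between online, offline and multichannel advertisement on sales when assuming a 3-cluster solution. Equation 3.2 shows the same for the cluster solutions of the K-Means cluster analysis when assuming a 3-cluster solution.

$$\begin{aligned} S_{i,t} = {} & \alpha + \beta_1 N_{i,t} + \beta_2 F_{i,t} + \beta_3 M_{i,t} + \beta_4 T2_{i,t} + \beta_5 T3_{i,t} \\ & + \beta_6 (N \ast T2)_{i,t} + \beta_7 (N \ast T3)_{i,t} + \beta_8 (F \ast T2)_{i,t} + \beta_9 (F \ast T3)_{i,t} \\ & + \beta_{10} (M \ast T2)_{i,t} + \beta_{11} (M \ast T3)_{i,t} + \varepsilon_{i,t} \end{aligned} \tag{3.1}$$

$$\begin{aligned} S_{i,t} = {} & \alpha + \beta_1 N_{i,t} + \beta_2 F_{i,t} + \beta_3 M_{i,t} + \beta_4 K2_{i,t} + \beta_5 K3_{i,t} \\ & + \beta_6 (N \ast K2)_{i,t} + \beta_7 (N \ast K3)_{i,t} + \beta_8 (F \ast K2)_{i,t} + \beta_9 (F \ast K3)_{i,t} \\ & + \beta_{10} (M \ast K2)_{i,t} + \beta_{11} (M \ast K3)_{i,t} + \varepsilon_{i,t} \end{aligned} \tag{3.2}$$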

4. Results

4.1 Preliminary analysis

To obtain some preliminary insights, frequencies are presented in table 3. This table shows, among others, the number of rows in which a sale has been made, both as a count and as a percentage of the total of 61,928 rows of data. From this table it can be concluded that many zeros are present in the dataset.
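The percentages reported in table 3 follow directly from the frequencies and the total number of household-week rows. A minimal sketch of that arithmetic, using the counts from table 3 (the variable names are illustrative):

```python
total_rows = 61928  # household-week rows after excluding the first five weeks

# Number of rows with a non-zero value, as reported in table 3.
frequencies = {
    "Sales": 1764,
    "Online advertisement": 1016,
    "Offline advertisement": 15391,
    "Multichannel advertisement": 105,
}

# Share of non-zero rows per variable, in percent.
shares = {
    name: round(100 * count / total_rows, 2)
    for name, count in frequencies.items()
}
for name, share in shares.items():
    print(f"{name}: {share}% non-zero, {round(100 - share, 2)}% zero")
```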

Table 3: Descriptive statistics of the dependent and independent variables

Variable                    Frequency  Percentage
Sales                       1764       2.85%
Online advertisement        1016       1.64%
Offline advertisement       15391      24.85%
Multichannel advertisement  105        0.17%

Furthermore, Pearson’s correlation matrix is created for the dependent and independent variables. From the correlation matrix in table 4, several significant correlations are found. Strikingly, of the three independent variables, only online advertisement is significantly correlated with the sales of Coca-Cola.

Variable          Sales   Offline_adv  Online_adv  Multichannel_adv
Sales             1       .006         .007*       .000
Offline_adv       .006    1            -.009**     .073**
Online_adv        .007*   -.009**      1           .468**
Multichannel_adv  .000    .073**       .468**      1

Table 4: Pearson correlation matrix of the dependent and independent variables

4.2 Regression

In order to test hypotheses 1 and 2, a linear regression is performed. The dependent variable is Sales and the independent variables are Online advertisement, Offline advertisement and Multichannel advertisement. In model 1, the regression will first be performed on the inpanel dataset that is aggregated per week. In model 2, the same regression is performed, but then with the first five weeks excluded, as there was no online campaign present during those weeks. The results of model 1 can be found in table 5, the results of model 2 are presented in table 6.

Variable                    Estimate  Std. error  t value  Pr(>|t|)
Intercept                   39.2061   0.6991      56.077   <2e-16   ***
Online advertisement        14.1222   5.9030      2.392    0.0167   *
Offline advertisement       1.4472    0.7392      1.958    0.0503   .
Multichannel advertisement  -7.0832   6.3706      -1.112   0.2662

Residual standard error: 187.4 on 100629 degrees of freedom
Multiple R-squared: 9.084e-05, Adjusted R-squared: 6.103e-05
F-statistic: 3.047 on 3 and 100629 DF, p-value: 0.02746

Table 5: linear regression of model 1

Variable                    Estimate  Std. error  t value  Pr(>|t|)
Intercept                   37.1459   0.8139      45.638   <2e-16   ***
Online advertisement        15.8689   5.6319      2.818    0.00484  **
Offline advertisement       1.7427    0.9615      1.813    0.06991  .
Multichannel advertisement  -7.3263   6.0833      -1.204   0.22846

Residual standard error: 178.4 on 61924 degrees of freedom
Multiple R-squared: 0.0001749, Adjusted R-squared: 0.0001265
F-statistic: 3.611 on 3 and 61924 DF, p-value: 0.01266

Table 6: linear regression when online exposure is present
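The adjusted R-squared values in tables 5 and 6 can be reproduced from the reported R-squared, the number of rows and the number of predictors with the standard adjustment formula. The sketch below is only a consistency check on the reported output; the function name is illustrative.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Model 1: all 13 weeks (100,633 rows); model 2: last 8 weeks (61,928 rows).
model_1 = adjusted_r2(9.084e-05, 100633, 3)
model_2 = adjusted_r2(0.0001749, 61928, 3)
print(f"model 1: {model_1:.4e}, model 2: {model_2:.4e}")
```

Both values agree with the reported output up to rounding, and the second model has the higher adjusted R-squared.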

First, it is important to notice that the overall model is significant for both models. As can be seen in table 5, the F statistic of model 1 (3.047, p = 0.027) is significant at the 5% level. Regarding model 2, the F statistic of 3.611 (p = 0.013) is significant at the 5% level as well. However, the adjusted R-squared of the second model, with the first five weeks excluded, is higher; thus, this model performs better.

Looking at model 2, there is a positive and significant effect for online advertisement (b = 15.8689, p = 0.00484) at a 1% significance level. Compared to model 1, this effect has become stronger. Hypothesis 1 is confirmed: the more times a household has been exposed to the advertisement of Coca-Cola online, the higher the sales of Coca-Cola will be for this household. When the online advertisement exposure increases by one unit, the sales increase by 15.87 eurocents. Noteworthy is that when the variable Online advertisement is split into RTL Online and YouTube video advertisement, only YouTube is significant (b = 21.2680, p<0.01); these results can be found in appendix C.

Offline advertisement also has a positive and significant effect on sales in model 2 (b = 1.7427, p = 0.0699) at a 10% significance level. As this relation has been extensively tested in literature, no hypothesis was presented. Still, the model confirms earlier findings in literature that television advertisement has a positive effect on sales. For Coca-Cola, when the offline advertisement exposure increases by one unit, the sales increase by 1.74 eurocents. Thus, when a household watches the television advertisement an additional time, it will buy more Coca-Cola for a value of €0.02. Regarding the multichannel advertisement, hypothesis 2 is not confirmed as the variable is not significant. This means that there is no synergy effect: having seen the advertisement both online and offline has no additional effect on the sales of Coca-Cola.
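The conversion from the model 2 coefficients to the euro amounts quoted above can be sketched as follows. It assumes, following the interpretation in the text, that the coefficients are expressed in eurocents per additional exposure; the function names are illustrative, and the prediction uses point estimates only, including the non-significant multichannel term.

```python
# Point estimates of model 2 (table 6), interpreted as eurocents.
INTERCEPT, B_ONLINE, B_OFFLINE, B_MULTI = 37.1459, 15.8689, 1.7427, -7.3263


def eurocents_to_euros(coefficient):
    """Convert a coefficient in eurocents to euros, rounded to whole cents."""
    return round(coefficient / 100, 2)


def predicted_sales(online, offline):
    """Fitted weekly sales in eurocents for given exposure counts."""
    return (INTERCEPT + B_ONLINE * online + B_OFFLINE * offline
            + B_MULTI * online * offline)


online_effect = eurocents_to_euros(B_ONLINE)
offline_effect = eurocents_to_euros(B_OFFLINE)
print(online_effect, offline_effect)
print(predicted_sales(1, 0), predicted_sales(0, 1))
```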

Moreover, a third model is estimated, where all nine socio-demographic variables are included besides the independent variables. As these variables are categorical, they will be included as dummy variables. This regression is performed on the dataset where the first 5 weeks are excluded, as it was concluded this model performs best. The output of model 3 can be found in table 7. Most striking is that offline advertisement is no longer significant and the effect of online advertisement has become less strong. The variables Loyalty, Age and the […] advertisement on sales. Besides, the model is also tested on multicollinearity using the variance inflation factor (VIF). The VIF should not exceed the threshold of 5 or 10. The test reveals that the VIF values do not exceed the threshold of 10 (Appendix D); thus, multicollinearity is not an issue in this model.

Coefficients: (2 not defined because of singularities)

Variable                         Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)                      -19.9178  10.6230     -1.875   0.060802  .
Online_advertisement             9.6655    5.4208      1.783    0.074587  .
Offline_advertisement            0.9697    0.9266      1.046    0.295340
Multichannel_advertisement       -8.3117   5.8314      -1.425   0.154067
HML-value Low                    8.7228    1.7475      4.992    6.00e-07  ***
HML-value Medium                 34.4633   2.0146      17.106   <2e-16    ***
HML-value High                   151.5718  2.3218      65.283   <2e-16    ***
Age 25-45                        13.9261   7.6209      1.827    0.067651  .
Age 45-65                        12.8342   7.5661      1.696    0.089838  .
Age >65                          14.6388   7.6489      1.914    0.055645  .
Income 1500-2500                 2.8356    2.1320      1.330    0.183501
Income 2500-3500                 7.0019    2.4783      2.825    0.004725  **
Income >3500                     21.0932   3.0117      7.004    2.52e-12  ***
Alone                            4.2427    3.6926      1.149    0.250575
Family with young children       -23.0226  7.1670      -3.212   0.001317  **
Family with middle age children  -32.0674  7.2557      -4.420   9.90e-06  ***
Family with teenage children     -36.3649  7.7905      -4.668   3.05e-06  ***
1 child                          20.3860   7.1441      2.854    0.004325  **
2 children                       30.6346   6.5717      4.662    3.14e-06  ***
3 children                       20.1175   7.0755      2.843    0.004467  **
>4 children                      NA        NA          NA       NA
HHsize: 2 people                 7.7170    3.1515      2.449    0.014342  *
HHsize: 3 people                 -1.7052   3.0612      -0.557   0.577505
HHsize: 4 people                 NA        NA          NA       NA
Education: HaHAVO/VWO            10.7122   6.8410      1.566    0.117383
Education: HbHBO/WOkandidaats    13.0827   6.2700      2.087    0.036934  *
Education: HwWOdoktoraal         7.4050    6.7635      1.095    0.273588
Education: LbLBO                 4.0920    6.3597      0.643    0.519956
Education: MaMAVO                9.8381    6.4139      1.534    0.125070
Education: MbMBO                 11.4759   6.2362      1.840    0.065742  .
Size_of_town20000-50000inw       -3.4190   1.8238      -1.875   0.060847  .
Size_of_town50000-100000inw      -0.3544   2.0385      -0.174   0.861974
Size_of_townTot20000inw          -4.9930   2.4224      -2.061   0.039285  *
District: Noord                  -8.1213   3.0005      -2.707   0.006798  **
District: Oost                   -8.8090   2.5812      -3.413   0.000643  ***
District: Restwest               -5.0282   2.4549      -2.048   0.040539  *
District: Zuid                   1.4106    2.5278      0.558    0.576819

Residual standard error: 171 on 61894 degrees of freedom
Multiple R-squared: 0.08204, Adjusted R-squared: 0.08155
F-statistic: 167.6 on 33 and 61894 DF, p-value: < 2.2e-16

Table 7: regression with all the socio-demographic variables
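The VIF of a predictor equals 1 / (1 - R²), where R² comes from regressing that predictor on the other predictors. As a rough illustration only, the sketch below applies this to a hypothetical model that contains just online and multichannel advertisement, using their pairwise correlation of .468 from table 4. The VIF values of the full model are in Appendix D and are not reproduced here.

```python
def vif_from_r2(r2):
    """Variance inflation factor for a predictor with auxiliary R-squared r2."""
    return 1.0 / (1.0 - r2)


# Pairwise correlation between online and multichannel advertisement (table 4).
r_online_multichannel = 0.468

# With only two predictors, the auxiliary R-squared is the squared correlation.
vif_two_predictors = vif_from_r2(r_online_multichannel ** 2)
print(round(vif_two_predictors, 2))
```

Even for this most strongly correlated pair of advertisement variables, the illustrative value stays far below the thresholds of 5 and 10.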

4.3 Supervised segmentation

Literature has shown that loyalty, age and income most likely have an effect on the type of media channel used by households as well as on their purchasing behaviour. Therefore, it is likely that these variables will enhance the effect of online and offline advertisement on the sales of Coca-Cola. Each of the three variables is added to the linear regression as a moderator, one at a time, in order to see the individual effects. With the supervised classification procedure hypotheses 3a, b and c will be tested. The results of these analyses can be found in tables 8, 9 and 10. The overall model is significant for all three models, where model 4 has the highest adjusted R2 of 7.78%.
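Adding a categorical moderator to the regression amounts to creating dummy variables for all levels except a reference level and multiplying each dummy with each advertisement variable. The sketch below builds one such design-matrix row for loyalty; the function `moderated_row` and the example values are hypothetical, and the column order follows equation 2.2.

```python
def moderated_row(online, offline, multichannel, level, levels):
    """Intercept, main effects, dummies and exposure-by-dummy interactions."""
    # The first level is the reference category and gets no dummy.
    dummies = [1.0 if level == other else 0.0 for other in levels[1:]]
    row = [1.0, online, offline, multichannel] + dummies
    for exposure in (online, offline, multichannel):
        row += [exposure * d for d in dummies]
    return row


# Loyalty levels, with "Not loyal" as the reference category.
loyalty_levels = ["Not loyal", "Low", "Medium", "High"]

# Hypothetical household-week: 2 online and 3 offline exposures, high loyalty.
row = moderated_row(2, 3, 2 * 3, "High", loyalty_levels)
print(len(row), row)
```

The row has 16 entries, an intercept plus 15 predictors, which corresponds to the 15 model degrees of freedom reported for the moderator regressions.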

Variable                            Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)                         7.0832    1.4266      4.965    6.88e-07  ***
Online advertisement                -2.1228   12.9516     -0.164   0.869806
Offline advertisement               -0.0281   1.7121      -0.016   0.986908
Multichannel advertisement          5.6687    15.4876     0.366    0.714357
Low                                 7.3245    1.9443      3.767    0.000165  ***
Medium                              32.5761   2.2238      14.649   <2e-16    ***
High                                148.5587  2.5323      58.665   <2e-16    ***
Online advertisement: Low           3.6066    15.8917     0.227    0.820463
Online advertisement: Medium        -6.3709   16.9282     -0.376   0.706658
Online advertisement: High          34.8400   17.3105     2.013    0.044155  *
Offline advertisement: Low          0.4288    2.3335      0.184    0.854204
Offline advertisement: Medium       1.4857    2.6082      0.570    0.568928
Offline advertisement: High         1.6592    2.9436      0.564    0.572980
Multichannel advertisement: Low     -7.8258   17.6370     -0.444   0.657253
Multichannel advertisement: Medium  -8.2561   21.0594     -0.392   0.695032
Multichannel advertisement: High    -39.5678  20.1243     -1.966   0.049283  *

Residual standard error: 171.3 on 61912 degrees of freedom
Multiple R-squared: 0.07808, Adjusted R-squared: 0.07785
F-statistic: 349.5 on 15 and 61912 DF, p-value: < 2.2e-16

Table 8: model 4 - regression with loyalty as moderator

[…] consumers that are not loyal. For that group, when a consumer sees the advertisement online an additional time, the sales will increase with €0.07. Consequently, for every time the online advertisement is seen by very loyal consumers, the sales increase with €0.35 compared to consumers that are not loyal. Thus, the total effect on sales for the interaction effect of high loyalty and online advertisement is €0.42.

The moderating effect of a high level of loyalty with multichannel advertisement is significant (b = -39.5678, p<0.05). However, this effect is negative: when a highly loyal consumer has seen the advertisement both online and offline an additional time, the sales decrease by €0.40 compared to a consumer who is not loyal. As a positive relationship was predicted, hypothesis 3a III is only partially confirmed. Hypothesis 3a II is rejected, as loyalty has no significant effect on the relation between offline advertisement and sales.

Variable                           Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)                        18.559    8.629       2.151    0.0315  *
Online_advertisement               3.065     45.771      0.067    0.9466
Offline_advertisement              -3.421    15.813      -0.216   0.8287
Multichannel_advertisement         -22.798   117.575     -0.194   0.8463
25-45                              23.672    8.746       2.707    0.0068  **
45-65                              19.741    8.711       2.266    0.0234  *
>65                                7.701     8.819       0.873    0.3825
Online_advertisement: 25-45        2.441     46.456      0.053    0.9581
Online_advertisement: 45-65        24.264    46.609      0.521    0.6027
Online_advertisement: >65          9.768     51.574      0.189    0.8498
Offline_advertisement: 25-45       9.482     15.930      0.595    0.5517
Offline_advertisement: 45-65       3.085     15.869      0.194    0.8459
Offline_advertisement: >65         6.322     15.946      0.396    0.6918
Multichannel_advertisement: 25-45  11.508    118.100     0.097    0.9224
Multichannel_advertisement: 45-65  16.143    117.828     0.137    0.8910
Multichannel_advertisement: >65    2.233     120.179     0.019    0.9852

Residual standard error: 178.3 on 61912 degrees of freedom
Multiple R-squared: 0.001582, Adjusted R-squared: 0.00134
F-statistic: 6.54 on 15 and 61912 DF, p-value: 3.084e-14

Table 9: model 5 - regression with age as moderator.

Variable                               Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)                            25.0140   1.9286      12.970   <2e-16    ***
Online advertisement                   -9.1451   12.8982     -0.709   0.47831
Offline advertisement                  -0.4281   2.0096      -0.213   0.83129
Multichannel advertisement             0.5417    10.2201     0.053    0.95773
1500-2500                              5.7242    2.3035      2.485    0.01296   *
2500-3500                              18.0931   2.4683      7.330    2.33e-13  ***
>3500                                  35.0482   2.9861      11.737   <2e-16    ***
Online advertisement: 1500-2500        42.2668   15.4845     2.730    0.00634   **
Online advertisement: 2500-3500        21.4079   16.6653     1.285    0.19895
Online advertisement: >3500            8.9236    23.3233     0.383    0.70201
Offline advertisement: 1500-2500       2.6066    2.4935      1.045    0.29586
Offline advertisement: 2500-3500       6.1764    2.7716      2.228    0.02585   *
Offline advertisement: >3500           0.4555    3.7103      0.123    0.90229
Multichannel advertisement: 1500-2500  -3.0949   15.1652     -0.204   0.83829
Multichannel advertisement: 2500-3500  -17.6561  15.2005     -1.162   0.24543
Multichannel advertisement: >3500      -2.7788   29.3468     -0.095   0.92456

Residual standard error: 178 on 61912 degrees of freedom
Multiple R-squared: 0.004355, Adjusted R-squared: 0.004113
F-statistic: 18.05 on 15 and 61912 DF, p-value: < 2.2e-16

Table 10: model 6 - regression with income as moderator

Income has a significant moderating impact (b = 42.2668, p<0.01) on the relation between online advertisement and sales when a consumer earns between 1,500-2,500 euro per month compared to people with an income below 1,500. Furthermore, the interaction of offline advertisement with people with an income between 2,500 and 3,500 euro is significant (b = 6.1764, p<0.05). So, people that have an income between 1,500-2,500 per month are more reactive to online advertisement, while people with an income between 2,500 and 3,500 are more reactive to offline advertisement compared to people who have an income below 1,500 euro per month. Hypothesis 3c I and II are hereby confirmed, while hypothesis 3c III is rejected, as there is no significant interaction effect between income and multichannel advertisement.

[…] is not significant means that online advertisement does not have an effect when loyalty is zero, meaning a consumer is not loyal.

4.4 Unsupervised segmentation

In this section two unsupervised cluster analyses will be performed, namely the TwoStep and the K-means cluster procedures. These segments are made without any prior knowledge, with the goal of forming consumer groups with a similar socio-demographic profile. Therefore, all nine socio-demographic variables are included in these analyses. A consumer profile will be sketched for every cluster. Moreover, in order to test hypothesis 4, these segments will be used in the regression as a moderator to see if they have an effect on the relation between the independent and dependent variables.

4.4.1 TwoStep clustering

The TwoStep clustering procedure resulted in a three-cluster solution. This cluster solution has a fair (between 0.2 and 0.5) cluster quality according to the silhouette measure of cohesion and separation when using the BIC as a cluster criterion as well as when using the AIC as a cluster criterion. According to Norusis (2011), a score above 0.0 is valid, as this would ensure that the within-cluster distance and between-cluster distance is valid among the different variables. Another validation step is that all the variables have a significant contribution to the cluster groups. Furthermore, according to the input predictor importance, variables with a low ranking (<0.2) should be interpreted carefully. The variables district, town size and education have similar responses across clusters. This can also be seen from the absolute distributions. Finally, the sample is randomly split in two and the analysis is performed again. If the same number of clusters is found in both the final and split solutions, and the characteristics and significance variables of the solutions are similar, then validation is confirmed (Hair et al. 2006). The results of the split can be found in appendix E and are compared to the final solution. Indeed, the results of the split solutions are almost identical to the final solution on the complete dataset. The only noteworthy difference is that the HML-value is less important (<0.2 on predictor importance). Based on the above-mentioned […]

Profile

Cluster 1, called “Pairs”, is the largest group with 46.4% of all the households included. These households consist of only 2 people with no children. The income of this group consists of the higher income classes.

Cluster 2, called “Family”, has 30% of the households. This group consists of families with children, thus, the households with more than three people. They mainly include people aged 25-45 with a slightly higher income.

Cluster 3, the “Singles”, consists of 23.6% of the households. Within this segment, all age groups are represented. However, everyone lives alone and thus, has a household size of one. The average income of a household is low, even though the level of education is spread evenly. From this group, 43.6% is not a loyal customer.

Variable                                       Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)                                    27.367    1.657       16.518   <2e-16    ***
Online advertisement                           9.413     13.348      0.705    0.4807
Offline advertisement                          -1.233    1.897       -0.650   0.5157
Multichannel advertisement                     -3.548    10.302      -0.344   0.7305
TwoStep cluster 2                              12.864    2.233       5.760    8.44e-09  ***
TwoStep cluster 3                              12.680    2.044       6.205    5.52e-10  ***
Online advertisement: TwoStep cluster 2        7.361     15.313      0.481    0.6307
Online advertisement: TwoStep cluster 3        4.444     17.648      0.252    0.8012
Offline advertisement: TwoStep cluster 2       5.205     2.753       1.890    0.0587    .
Offline advertisement: TwoStep cluster 3       3.466     2.326       1.490    0.1363
Multichannel advertisement: TwoStep cluster 2  0.658     14.839      0.044    0.9646
Multichannel advertisement: TwoStep cluster 3  -10.257   15.105      -0.679   0.4971

Residual standard error: 178.3 on 61916 degrees of freedom
Multiple R-squared: 0.001438, Adjusted R-squared: 0.00126
F-statistic: 8.105 on 11 and 61916 DF, p-value: 2.505e-14

Table 11: TwoStep cluster as moderator

[…] online advertisement and sales. When households with this profile see the commercial online an additional time, the sales of Coca-Cola increase by €0.17. Furthermore, offline advertisement is also significant (b = 3.972, p<0.1), though only at a 10% significance level. This means that families with children, aged 25-45 and with a slightly higher income, are influenced in their shopping behaviour for Coca-Cola by their exposure to online and offline advertisement.

Variable                    Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)                 40.231    1.553       25.913   <2e-16  ***
Online advertisement        16.774    7.778       2.157    0.0311  *
Offline advertisement       3.972     2.068       1.920    0.0548  .
Multichannel advertisement  -2.890    11.071      -0.261   0.7940

Residual standard error: 184.8 on 18556 degrees of freedom
Multiple R-squared: 0.0004853, Adjusted R-squared: 0.0003237
F-statistic: 3.003 on 3 and 18556 DF, p-value: 0.02918

Table 12: regression for consumer group 2

4.4.2 K-means

Based on the TwoStep cluster procedure, the k-means cluster method is performed with three clusters. Convergence was achieved in nine iterations, with a minimum distance between the initial cluster centers of 7.746. Across all variables, the differences among segments are statistically significant. The household composition and the number of children provide the greatest separation between clusters, indicated by a very high F statistic in the ANOVA table, found in Appendix G. Instead of relative group information, only absolute numbers are provided, meaning the numbers are rounded off to the largest group within the variable category. Table 1 in Appendix G displays the final cluster centers that represent the characteristics of the typical case for each cluster.
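The iterative assignment and re-centering behind a k-means solution can be sketched with Lloyd's algorithm, shown below on a handful of hypothetical households described by household size and number of children. The data, the initial centers and the function `kmeans` are illustrative assumptions; this is not the exact procedure or output of the statistical package used in the study.

```python
def kmeans(points, centers, max_iter=100):
    """Lloyd's algorithm; returns (centers, labels, iterations used)."""
    labels = []
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)),
                key=lambda c: sum((p - q) ** 2
                                  for p, q in zip(point, centers[c])))
            for point in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    [sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(list(centers[c]))  # keep an empty cluster
        if new_centers == centers:  # converged: centers no longer move
            return centers, labels, iteration
        centers = new_centers
    return centers, labels, max_iter


# Hypothetical households: [household size, number of children].
points = [[2, 0], [2, 0], [2, 0], [4, 2], [5, 3], [4, 2], [1, 0], [1, 0]]
initial_centers = [[2, 0], [4, 2], [1, 0]]  # assumed starting centers

centers, labels, iterations = kmeans(points, initial_centers)
print(labels, centers, iterations)
```

In this toy example the three groups correspond to pairs, families and singles; with real data the result depends on the variable coding and on the initial centers.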

Profile

Cluster 2, “Families”, has 29.2% of the households. In this cluster, the majority of people has an age between 25-45 and an income of 2,500-3,500 euro per month. They are mostly families with 2 children.

Cluster 3, “Low educated pairs”, has 24.6% of the households. This segment is very similar to the first group. The only variable on which this group differs is education: the majority of these households followed HAVO/VWO.

All clusters have the same group of people for the variables HML-value, Size of town and District. Most likely this means that all the households are evenly distributed over the segments based on these variables. A regression is performed with the cluster membership variable added as a moderator for the types of advertisement; the results can be found in table 13. This shows that cluster 2 and cluster 3 have a significantly different effect on the sales than cluster 1.

Variable                                       Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)                                    36.982    1.189       31.101   <2e-16  ***
Online advertisement                           14.953    9.924       1.507    0.1319
Offline advertisement                          2.286     1.450       1.577    0.1147
Multichannel advertisement                     -9.548    8.544       -1.117   0.2638
K-means cluster 2                              3.503     1.929       1.816    0.0694  .
K-means cluster 3                              -4.045    2.037       -1.986   0.0470  *
Online advertisement: K-means cluster 2        2.069     12.469      0.166    0.8682
Online advertisement: K-means cluster 3        -20.074   20.318      -0.988   0.3232
Offline advertisement: K-means cluster 2       2.212     2.494       0.887    0.3751
Offline advertisement: K-means cluster 3       -2.297    2.212       -1.038   0.2992
Multichannel advertisement: K-means cluster 2  5.926     13.701      0.433    0.6654
Multichannel advertisement: K-means cluster 3  6.470     18.004      0.359    0.7193

Residual standard error: 178.3 on 61916 degrees of freedom
Multiple R-squared: 0.0006464, Adjusted R-squared: 0.0004689
F-statistic: 3.641 on 11 and 61916 DF, p-value: 3.524e-05

Table 13: regression with K-means clusters as moderators

Cluster 1 “Pairs”    Cluster 2 “Families”    Cluster 3 “Low educated pairs”

Table 14: regression k-means clusters

5. Discussion and conclusion

Literature has shown that advertisement contributes positively to the sales of a product. In the last few years, many new channels have emerged where an advertisement can be displayed. People might react differently to these different channels. In this study, it has been investigated how consumer segments influence the relation between online and/or offline advertisement and sales. For this study, the well-known brand Coca-Cola has been taken as the object of research. An overview of the hypotheses and the results of this paper is presented in table 15.

Hypothesis                                                       Confirmation              Comments
Offline video advertisement leads to higher sales                Supported
H1: Online video advertisement has a positive effect on sales.   Supported
H2: The effect of offline and online advertisement on sales      Not supported
    together is stronger than their separate effects.
H3a: Loyal consumers have a positive effect on the effect of     Partially supported       For 3a III, a negative
     I) online advertisement,                                    I) Supported              effect was found
     II) offline advertisement,                                  II) Not supported
     III) the combination of online and offline advertisement    III) Partially supported
     on sales.
H3b: The age of different consumer groups has a moderating       Not supported
     effect on the effect of
     I) online advertisement,
     II) offline advertisement,
     III) the combination of online and offline advertisement
     on sales.
H3c: Income differences have a moderating effect on the          Partially supported
     effect of
     I) online advertisement,                                    I) Supported
     II) offline advertisement,                                  II) Supported
     III) the combination of online and offline advertisement    III) Not supported
     on sales.
H4: Natural clusters defined by data will enhance the relation   Supported
    between online and/or offline advertisement and sales

Table 15: Hypotheses overview

5.1 Conclusions

The results confirm, first of all, a positive relation between television advertisement and sales. This effect is not strong, however, and when socio-demographic characteristics are included, the effect of television advertisement is no longer present. A possible explanation is that the base sales of Coca-Cola are so strong that television advertisement does not contribute to the sales. Also, demographic characteristics, like income and the district where a household lives, have a larger impact on the sales of Coca-Cola. This indicates that the segmentation might lead to interesting results. Furthermore, the results show that hypothesis 1 is confirmed: when the video commercial is shown on online channels such as YouTube and RTL Online, it has a positive effect on sales. This effect is substantially stronger than the effect of the commercial shown on television. This could be explained by the fact that online media have become more popular over the years. Also, there are more devices on which online advertisements can be displayed, from mobile phones to tablets and laptops.
