AN INVESTIGATION INTO THE DETERMINANTS OF ADOPTION TIMING AND POST-ADOPTION USAGE OF A MOBILE APPLICATION

Academic year: 2021

A study in the context of mobile news applications

University of Groningen

Faculty of Economics and Business

MSc Marketing – Marketing Management & Marketing Intelligence

Master Thesis

Completed on 11-01-2016

1st Supervisor: dr. H. Risselada

2nd Supervisor: dr. ir. M.J. Gijsenberg

This master thesis is a product of Wesley J.W. Klabbers Postal Address: 1e Weerdsweg 118B

7412WX Deventer The Netherlands Phone Number: + 31 6 31 56 55 90 Email Address: wjw.klabbers@gmail.com

Management Summary

This thesis investigates the influence of several variable sets on the adoption timing and subsequent post-adoption usage of a news application. The first variable set refers to relationship characteristics between the subscriber and provider of the newspaper. The second variable set refers to the user-generated review characteristics of the news application. In addition, the research controls for several aggregate-level demographic variables.

The research uses two distinct modelling techniques. The effect of the predictors on adoption timing is assessed with a Cox Proportional Hazard Model with Time-Dependent Covariates, while post-adoption usage is modelled with a Multiple Linear Regression Model.

The results of this study indicate that subscribers who have had long-lasting relationships with the newspaper in question and who displayed high prior usage levels on earlier platforms, such as the website and ePaper, adopt more quickly than new subscribers with relatively little prior news consumption. Moreover, subscribers living in a neighbourhood with an average income are also quicker to adopt, while subscribers in neighbourhoods with a large percentage of 15-24 year olds show the highest increase in the rate of adoption. This study also finds that long-lasting relationships and heavy prior usage have positive effects on the post-adoption usage of the news app. Furthermore, consumers with high mobility levels display higher news app usage levels, as do subscribers who live in a neighbourhood with primarily average incomes.

All findings derived in this study have been subject to robustness checks, and all model validity assumptions have been fulfilled. Newspaper companies, as well as companies specializing in similar products (e.g. eBooks or magazines), can therefore use these findings in practice, either by constructing marketing programs based on the consumer profile established above to achieve a relatively quick diffusion of mobile applications, or by stimulating post-adoption usage, which is ultimately beneficial to any company because usage brings about revenue.

This thesis is subject to several limitations with regard to the data used and the analysis conducted. The data is subject to potential bias towards those subscribers who had a subscription prior to the start of the research period. The analysis does not account for potential endogeneity among the predictor variables, because the models are estimated separately. Lastly, the regression model has low explanatory power in terms of R², indicating that important predictors remain unidentified.

Table of Contents

Management Summary
Table of Contents
1. INTRODUCTION
2. THEORETICAL FRAMEWORK
2.1. The Core: Innovation Diffusion Theory
2.2. Applying Context to the Core: Technology Acceptance Model
2.3. From Context to Specifics: The Mobile Application Domain
2.3.1. Mobile Applications and Relationship Characteristics
2.3.2. Mobile Applications and User Generated Characteristics
2.3.3. Mobile Applications and Demographics
2.4. Extension of the Literature: Post-Adoption Usage
2.5. Going Concrete: Conceptual Model and Hypotheses
2.5.1. The Effect of Adoption Timing on Post-Adoption Usage
2.5.2. The Effects of Relational Variables
2.5.3. The Effects of User Generated Review Variables
2.5.4. The Effects of Demographic Characteristics
3. RESEARCH DESIGN
3.1. Data Sources
3.1.1. Newspaper Subscriber Database
3.1.2. Google Analytics Database
3.1.3. Bisnode Geopostcode Database
3.1.4. Review Database
3.1.5. Final Data and Exclusion of Subscribers
3.2. Data Descriptives
3.3. Research Method
4. SURVIVAL MODELS FOR ADOPTION TIMING
4.1. Estimation and Validation
4.2. Results of the Finalized Survival Model
5. REGRESSION MODELS FOR POST-ADOPTION USAGE
5.1. Estimation and Validation
5.2. Results of the Finalized Regression Model
6. DISCUSSION OF THE FINDINGS
7. MANAGERIAL IMPLICATIONS
References
Appendices
Appendix A – Data Descriptives
Appendix B – Cox Proportional Hazard Model
Appendix C – Linear Regression Model

List of Figures and Tables

Figure 1. Conceptual Model.
Figure 2. Survival Function of Model SM2E.
Figure 3. Q-Q plot for Model RM1B.
Figure 4. Residual versus Predicted for Model RM1B.
Figure 5. Q-Q plot for Model RM1C.
Figure 6. Residuals versus Predicted for Model RM1C.
Figure 7. Q-Q Plot for Model RM1D.
Figure 8. Residual versus Predicted for Model RM1D.
Figure 9. Q-Q Plot for Model RM1E.
Figure 10. Residual versus Predicted for Model RM1E.
Figure 11. Distribution of Post-Adoption Usage.
Figure 12. Distribution of Relationship Depth.
Figure 13. Distribution of Adoption Timing.
Figure 14. Distribution of Lag Number of Reviews.
Figure 15. Distribution of Lag Overall Rating.
Figure 16. Several Dfbeta plots for Model SM2E.
Figure 17. Martingale Residual versus Interaction P_15_24:Stop for Model SM2E.
Figure 18. Residual versus Predictor (RL, RD, CM) plots for Model RM1B.
Figure 19. Residual versus Predictor (CM, LN_RD, RL) plots for Model RM1C.
Figure 20. Residual versus Predictor (LN_RD, RL, CM) plots for Model RM1E.

Table 1. Operationalization of Research Variables.
Table 2. Cox Proportional Hazard Model SM1A (Ev. = 1170 / Cens. = 5832).
Table 3. Cox Proportional Hazard Model SM2E (Ev. = 1180 / Cens. = 5832).
Table 4. Regression Model RM1A (N = 1107 / R² = .037).
Table 5. Normality and Homoscedasticity Test for Model RM1B.
Table 6. Normality and Homoscedasticity Test for Model RM1C.
Table 7. Normality and Homoscedasticity Tests for Model RM1D.
Table 8. Regression Model RM1E (N = 1081 / R² = .061) / Dependent variable is log transformed / Ex. Outliers.
Table 9. Normality and Homoscedasticity Tests for Model RM1E.
Table 10. Research Outcomes in terms of Hypotheses.
Table 11. Data Descriptives of Variables.
Table 12. Cox Proportional Hazard Model SM1B (Ev. = 1170 / Cens. = 5832).
Table 13. Cox Proportional Hazard Model SM1C (Ev. = 1181 / Cens. = 5832).
Table 14. Cox Proportional Hazard Model SM2B (Ev. = 1181 / Cens. = 5832).
Table 15. Cox Proportional Hazard Model SM2C (Ev. = 1181 / Cens. = 5832).
Table 16. Cox Proportional Hazard Model SM2D (Ev. = 1180 / Cens. = 5832).
Table 17. Model Fit Statistics for Models SM2B, SM2C, SM2D and SM2E.
Table 18. Regression Model RM1B (N = 1107 / R² = .035).
Table 19. Regression Model RM1C (N = 1107 / R² = .065) / Dependent variable is log transformed.

1. INTRODUCTION

Over the past few decades, advances in information and communications technology (ICT) have led to changes in many environments. For instance, in just a few years’ time the mobile phone has gone through a whole range of changes, and nearly every new phone brought to market has innovative features and gives users more capabilities. In turn, mobile providers are upgrading their networks at a rapid pace (e.g. 2G, 3G, 4G), which allows users to download more content at an ever faster pace. These developments allow users to access whatever content they wish, anywhere, at any time and in whatever manner. Moreover, the focus on mobile devices and consumer mobility has led to more and more things being digitized in the form of mobile applications, or apps, as they are commonly called. Such apps can be accessed on tablets, mobile phones, and laptops. There are apps for shopping, social media, product comparisons, magazines, online games and other entertainment. Whatever you think of doing today, there is probably an app to help you do it. Research in this area is highly interesting because the development and popularity of mobile applications are advancing at such a pace that some of their traditional physical counterparts cannot keep up and are in decline. One particular example of this phenomenon is studied in this paper: the mobile news app, which provides the user with up-to-date news. The advancements in ICT and the rise of mobile news apps have profound effects on the newspaper industry.

In order to set the stage, it is important to identify exactly what a newspaper entails. Merriam-Webster (2015) defines a newspaper as “a set of large sheets of paper that have news stories, information about local events, advertisements, etc., and that are folded together and sold every day or every week”. However, newspapers are more than just a set of large sheets of paper, as they have an essential role in supplying information to the general population (Shaker, 2014). Newspapers fulfil the watchdog function, one of the oldest principles in journalism, which refers to a newspaper’s ability to scrutinize government bodies and report on potential problems or illegal actions. The watchdog function ensures that people are well-informed when it comes to their government and its actions (Jeffres & Kumar, 2014), which is why it is unfortunate that traditional daily newspapers are in decline (Johnson, Goidel, & Climek, 2014; Pew Research Center, 2015). This decline is due to the rise of online news sources, which are more accessible and more frequently updated. Some refer to these developments as a state of crisis in traditional newspaper journalism (Fortunati, Taipale, & Farinosi, 2015).

[...] adoption of such a mobile news app. There are many studies that look at general aspects of adoption, such as drivers and inhibitors (Rogers, 1995), but also at the adoption process. There is also literature on the adoption of mobile apps in particular (De Marez, Vyncke, Berte, Schuurman, & De Moor, 2007; Shen, 2015), but empirical research on the adoption of news apps is rather scarce due to its specificity. A news app is a rather static provider of information, as news items are posted in a text format which may occasionally be supported by images or videos. News apps can be considered a form of mass media, which is characterized by the absence of direct contact between sender and receiver, or between different receivers. Moreover, the content shown is identical for all users, and receivers of information cannot actively use the content or influence it in any way. That is quite different from other apps, where users are often required to interact with the app content and the app content in turn reacts to the user’s input. This may even occur in a joint setting where multiple users on different mobile devices interact with the same app content as well as with each other, which makes such apps far more complex (e.g. games or group apps). The fact that news apps are so focused on one-way information provision may be a reason why they are hardly mentioned in the existing research on mobile apps. Nonetheless, many newspapers are considering or have already committed to the use of mobile apps in order to create a lasting online presence. This is illustrated by a quick search in the Google Play store, where quite a few large Dutch newspapers are present with over 1 million downloads (e.g. Telegraaf, Volkskrant, AD), alongside a growing number of regional newspapers with over 10,000 downloads (e.g. Tubantia, de Stentor, Noordhollands Dagblad).

So whilst news apps are becoming more common, little research has been performed on why people adopt and, if so, how fast they tend to adopt, which indicates a gap in the literature. However, one should not only focus on the adoption itself, as that is only one side of the coin. The other side of the coin is post-adoption usage, which is an under-researched aspect in the adoption literature (Prins, Verhoef, & Franses, 2009). Insight into post-adoption usage is crucial when ascertaining the sustainability of any app and could indicate which types of users make the most use of the app.

Therefore, this thesis will focus on answering the following problem statement: “What drives the adoption timing of mobile news applications as well as the post-adoption usage of such applications?”

[...] the newspaper in question will be referred to as “Newspaper A”. In order to answer the research question, two separate models will be estimated. The first is a Cox Proportional Hazard Model, which models adoption timing as a function of independent variables while taking right-censored observations into account, i.e. those subscribers who do not adopt within the set timeframe. The second is a Linear Regression Model, which models post-adoption usage by means of several independent variables that are known to be significant according to the literature to date. In doing so, this paper aims to shed light on the drivers of adoption timing of a mobile news app as well as the effects of those drivers on the subsequent post-adoption usage. This will contribute to the state-of-the-art literature on adoption, as research on the adoption of mobile news apps is scarce. Moreover, the link between adoption timing and post-adoption usage is also known to be under-researched, and this study will examine it from the perspective of the news app.
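As an illustration of how a Cox model handles right-censored subscribers, the standard textbook form of the model can be written out as below. This is a sketch only: the notation (the symbols for the baseline hazard, the covariate vector, the censoring indicator and the risk set) is an assumption introduced here and is not taken from this thesis, and ties in adoption times are ignored for simplicity.

```latex
% Hazard of adoption for subscriber i at time t, given (possibly
% time-dependent) covariates x_i(t); h_0(t) is an unspecified baseline hazard.
h_i(t) = h_0(t)\,\exp\!\bigl(\boldsymbol{\beta}^{\top}\mathbf{x}_i(t)\bigr)

% Partial likelihood. \delta_i = 1 if subscriber i adopts at time t_i and
% \delta_i = 0 if the observation is right-censored; R(t_i) is the set of
% subscribers who have not yet adopted (are still "at risk") at time t_i.
L(\boldsymbol{\beta}) = \prod_{i=1}^{n}
  \left[
    \frac{\exp\!\bigl(\boldsymbol{\beta}^{\top}\mathbf{x}_i(t_i)\bigr)}
         {\sum_{j \in R(t_i)} \exp\!\bigl(\boldsymbol{\beta}^{\top}\mathbf{x}_j(t_i)\bigr)}
  \right]^{\delta_i}
```

In this form, subscribers who never adopt within the observation window contribute no factor of their own to the likelihood, but they still appear in the risk sets of those who do adopt, which is how their information is retained.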

This study has a wide range of managerial implications as the results may be indicative for other industries with similar apps for consumer use (e.g. magazines, eBooks), but also for companies launching new products or services as a complement/substitute for their current offering. For instance, companies could ascertain which characteristics of adopters are key in adoption timing of their app or service, while also being able to determine how both the characteristics and timing may influence the post-adoption usage.

Consequently, the findings of this study will help newspapers, for instance, in constructing more effective news app diffusion programs by targeting those subscribers with long relationship lengths, heavy prior usage and in average-income neighbourhoods with young people.

By doing so, this study also has social relevance in the sense that it will support newspapers in increasing their reach. This would translate to more people reading news items which makes them more knowledgeable and more aware of what is going on in the world around them, thus enhancing the critical role newspapers have as watchdogs and providers of information.

2. THEORETICAL FRAMEWORK

In order to get a complete picture of the state-of-the-art literature on adoption, this literature review first discusses the core papers on adoption in general. Next, the adoption concept is narrowed by considering technological innovations. This, in turn, sets the stage for zooming in on the mobile application domain, which is the specific setting in which the study takes place. Afterwards, this knowledge core is extended by incorporating post-adoption usage, the other side of the coin, which is often not considered in adoption research. The chapter concludes with a conceptual model and hypotheses based on the discussed literature.

2.1. The Core: Innovation Diffusion Theory

There are two terms in the existing literature on adoption that need to be thoroughly defined as they are used rather frequently, namely diffusion and adoption. Diffusion can be defined as “the process by which an innovation is communicated through certain channels over time among the members of a social system” (Rogers, 2003, p. 5). Adoption, in the most classic of definitions, refers to “making full use of a new idea as the best course of action available” (Rogers & Shoemaker, 1971, as cited in Eveland, 1979, p. 2). When considering innovations, adoption can be considered as the acquisition of a new product, service or behaviour.

The most fundamental theory within the adoption domain is the Innovation Diffusion Theory (IDT) of Rogers (1962), which has been the basis for many studies to date. The IDT deals with the rate of adoption of innovations which represents “the relative speed by which people within a social system adopt a particular innovation” (Rogers, 1995, p. 37). The product characteristics that are relevant for the rate of adoption are identified by Rogers (1995) as relative advantage, compatibility, complexity, trialability and observability. These characteristics have different relationships with the rate of adoption as increases in relative advantage, compatibility, trialability, and observability lead to an increased rate of adoption, while an increase in complexity leads to a decrease in the rate of adoption.

[...] adopter classification follows a classic normal distribution in which each group corresponds to a given percentage of the population within the social system. Using the mean time of adoption and the standard deviation, five adopter classes are defined: innovators, early adopters, early majority, late majority and laggards (Rogers, 1962). This classification has been subject to scrutiny due to Rogers’ assumption that the classes are a set percentage and do not differ across products (Bass, 1969; Mahajan, Muller, & Srivastava, 1990). Nonetheless, this classification, like others that have been created over the past few decades, is widely used throughout adoption studies, as different classifications are known to include consumers with different demographic and psychological characteristics (Bohlen & Beal, 1957; Kavak & Demirsoy, 2009; Rogers, 2003).
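The classification by mean and standard deviation can be illustrated with a minimal sketch. It assumes the conventional cut-offs of Rogers' scheme at one and two standard deviations around the mean adoption time; the function name `classify_adopters` and the sample adoption times are hypothetical and are not data from this thesis.

```python
from statistics import mean, pstdev


def classify_adopters(adoption_times):
    """Assign each adoption time to one of Rogers' five adopter classes.

    Cut-offs (assumed here, following the conventional scheme):
      innovator       : t <  mean - 2*sd
      early adopter   : mean - 2*sd <= t < mean - sd
      early majority  : mean - sd   <= t < mean
      late majority   : mean        <= t < mean + sd
      laggard         : t >= mean + sd
    """
    mu = mean(adoption_times)
    sd = pstdev(adoption_times)  # population standard deviation
    labels = []
    for t in adoption_times:
        if t < mu - 2 * sd:
            labels.append("innovator")
        elif t < mu - sd:
            labels.append("early adopter")
        elif t < mu:
            labels.append("early majority")
        elif t < mu + sd:
            labels.append("late majority")
        else:
            labels.append("laggard")
    return labels


# Hypothetical adoption times in days since launch (illustrative only).
days_to_adoption = [1, 15, 28, 30, 30, 31, 33, 35, 45, 52]
classes = classify_adopters(days_to_adoption)
for day, label in zip(days_to_adoption, classes):
    print(day, label)
```

Because the cut-offs are fixed multiples of the standard deviation, the share of each class is set by the normality assumption rather than by the product, which is the point of criticism raised by Bass (1969) and Mahajan et al. (1990).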

While the work of, among others, Rogers (1995) and Bass (1969) is very comprehensive, it concerns all types of innovations and does not allow for specific types of innovations within certain contexts. It does, however, provide a fundamental insight into the adoption literature and the concepts that are important.

2.2. Applying Context to the Core: Technology Acceptance Model

As illustrated, the literature on adoption can be extremely broad and encompasses many scenarios dealing with different types of items one can adopt (e.g. technologies, products, services, and information systems), but also different facets of adoption (e.g. rate of adoption, timing of adoption, and intention to use). In light of the subject at hand, which concerns the adoption of a news app for mobile phones and tablets, it is therefore crucial to zoom in on the specific area concerning the adoption of mobile apps.

2.3. From Context to Specifics: The Mobile Application Domain

De Marez, Vyncke, Berte, Schuurman and De Moor (2007) highlight the fact that a mobile revolution has taken place in recent years which has led to ICT and media becoming more mobile. This notion is supported by Shen (2015) who states that in just a few years’ time the smartphone market has overtaken the traditional phone market and the app services in the market provide companies with vast opportunities. An important statement that should be considered in light of mobile apps is that apps are different from one another and should not be treated as completely equal in any research (Verkasalo, López-Nicolás, Molina-Castillo, & Bouwman, 2010). This is also emphasized by Liu and Brandyberry (2014) who state that prevalent theories in adoption literature, such as the previously mentioned IDT and TAM, require adaptation in order to account for the unique issues that exist within the adoption of mobile apps. These issues may consist of a lack of time on the users’ end, inexperience in evaluating products, the type of app in question, and there may even be other factors in play that have not come to light in the research to date.

The current literature on mobile apps is still in its infancy, and much of it is based either on health-related mobile applications or on M-commerce studies, which focus on the purchase and sale of products through mobile devices, including, but not limited to, the acquisition of mobile phone apps.

[...] to demographic variables. Given that demographics are known to be of influence in the adoption literature, the proposed conceptual model will also include demographic variables as control variables (Chong, Chan, & Ooi, 2012; Rogers, 2003). These three variable sets are discussed in the following sub-sections.

2.3.1. Mobile Applications and Relationship Characteristics

An interesting, but rather difficult, aspect to consider in the adoption of mobile apps is the relationship an individual has with the brand or distributor of an app. This does not make sense for all apps as some apps are created by individuals, not companies. However, when it comes to mobile apps that fulfil a complementary or substituting function to other products or services provided by the same brand or distributor, it makes perfect sense. It is only logical that an individual’s previous association with a certain brand influences the choice of whether to adopt a new complementary or substituting product or service from that same brand.

The relationship a customer has with a firm or brand has three important facets, as described by Bolton, Lemon and Verhoef (2004). The first facet is the length or duration of the relationship, which can be considered as customer retention, defined as “the probability that a customer continues (or ends) the relationship with the organisation” (Bolton, Lemon, & Verhoef, 2004, p. 3). The second facet is the depth of a relationship, which refers to “the frequency of service usage over time” (Bolton, Lemon, & Verhoef, 2004, p. 3). The third facet is the breadth of a relationship, which is “the number of additional (different) products or services purchased from a company over time” (Bolton, Lemon, & Verhoef, 2004, p. 3). A study within the telecommunications domain by Prins and Verhoef (2007) reveals that relationship length, or age, has a significant non-linear effect on the adoption probability. In turn, when usage levels are high, customers are less likely to adopt, whereas customers with low usage levels are more likely to adopt (Prins & Verhoef, 2007). However, among customers who do adopt, those with high usage levels adopt fastest (Prins & Verhoef, 2007). These findings are not entirely consistent with theory on the subject, which makes them interesting for further research.

2.3.2. Mobile Applications and User Generated Characteristics

[...] et al., 2014, p. 218). eWOM can also be found in mobile applications, as every major app store has incorporated a review system, just like those found when looking for products online on Amazon or Bol.com. Such review systems enable consumers to give their opinion on applications, which may vary from a simple rating to a full-fledged review, and are among the most powerful options to introduce and/or enhance eWOM (Duan, Gu, & Whinston, 2008). Sometimes these reviews may be extremely large in number (e.g. 10,000 reviews for a laptop). In order to deal with this information overload, as one consumer cannot possibly read all those reviews, most environments report the number of reviews and an aggregate or overall rating. There has been considerable research into the effectiveness of these eWOM measures, and the findings are known to be significant, yet somewhat inconsistent. This may be indicative of varying effects across different types of media (You, Vadakkepatt, & Joshi, 2015). Some studies find that review volume has a positive effect on performance outcomes (Duan, Gu, & Whinston, 2008), while there is also support for the effects of review valence or rating (Chintagunta, Gopinath, & Venkataraman, 2010), as well as interactions between both facets (Floyd, Freling, Alhoqail, Cho, & Freling, 2014).

2.3.3. Mobile Applications and Demographics

Other important variables concern demographics (Alafeef, Singh, & Ahmad, 2011), because they differ across adopter classifications (Rogers, 2003). Chong, Chan and Ooi (2011) conduct a study on M-commerce and conclude that males are more likely to use M-commerce in the case of banking or location-based services (LBS), which can be considered more utilitarian in nature. Females, on the other hand, are more likely to use applications that are hedonic in nature (Chong, Chan, & Ooi, 2012). This is in line with findings from Taylor, Voelker and Pentina (2011), who find that males are more likely to use applications which have a more informative nature.

According to Chong et al. (2012) older consumers are more likely to commit to M-commerce, which contrasts with many other studies as younger people are usually the ones considered to be more open-minded towards new technologies, products, and services. A potential explanation for this contradictory finding by Chong et al. (2012) is that older consumers may have more spending power.

The potential influence of spending power on M-commerce commitment would suggest that income is another demographic variable one should take into account when studying adoption. A study by Yang (2013) indeed confirms that personal income is positively related to mobile application use.

Another demographic variable that is less frequently used but also interesting in adoption research is the region of the individual. Taylor, Voelker and Pentina (2011) control for small town and suburban residential areas and find that those living in small towns are less likely to adopt gaming applications, while those in urban settings are more likely to adopt applications that involve digital images and videos.

2.4. Extension of the Literature: Post-Adoption Usage

[...] several forms, but this study will focus specifically on the usage component of post-adoption behaviour, because for some products, and many services, adoption timing is just one side of the coin, while post-adoption usage is the other. For instance, when consumers or businesses purchase complex products, such as ERP systems, the purchase is generally accompanied by very high acquisition costs. If such a product remains underutilized, or is barely used after purchase, those consumers and businesses are likely to underperform and it will be hard to achieve a return on the investment (Bagayogo, Lapointe, & Bassellier, 2014). Similarly, firms that specialize in services such as telecommunications, which is a rather competitive and investment-heavy industry (Lee, Shin, & Lee, 2009), require their customers to use the service on a regular basis in order to obtain (sufficient) revenue. This is especially so when revenue is contingent on usage, as is often the case for telecommunication customers (Lee, Shin, & Lee, 2009). Such post-adoption usage is also required to expand and grow the business (Prins, Verhoef, & Franses, 2009).

There are numerous studies that investigate the antecedents of post-adoption usage. Some of these studies focus on elements of TAM to explain usage (Li & Liu, 2014), while others use organisational and environmental characteristics (Zhu & Kraemer, 2005) or supplier-side variables such as information or system quality (Lee, Shin, & Lee, 2009; Chen, Meservy, & Gillenson, 2012).

Of particular interest in this study is the effect of adoption timing on post-adoption usage, due to the fact there are inherent differences between adopter classifications (Rogers, 2003), which may reflect in their post-adoption usage. For instance, Prins et al. (2009) find that early adopters have lower initial usage levels than individuals who adopt later, but the usage of early adopters increases over time, while that of late adopters decreases. Studies prior to Prins et al. (2009) which investigated this relationship found that early adopters actually show a higher usage (Mahajan, Muller, & Srivastava, 1990; Morgan Jr., 1979) or were inconclusive (Ram & Jung, 1994). These inconsistent findings may indicate inherent differences for the types of technology researched as the research by Prins et al. (2009) takes place in the telecommunication domain, while that of Ram and Jung (1994) focuses on household technologies, and the study by Mahajan et al. (1990) looks at home PC-usage.

2.5. Going Concrete: Conceptual Model and Hypotheses

[...] a conceptual model in which relational variables and user generated content (UGC) variables are used as predictors for adoption timing and post-adoption usage, while demographics are incorporated as controls for both variables. Moreover, adoption timing is also used as a predictor for post-adoption usage (see Figure 1).

Figure 1. Conceptual Model.

2.5.1. The Effect of Adoption Timing on Post-Adoption Usage

As mentioned in the previous section, there has been research on the relationship between adoption timing and post-adoption usage, but findings have been mixed. In some studies early adopters tend to display a higher usage level than later adopters (Morgan Jr., 1979; Mahajan, Muller, & Srivastava, 1990), whereas the reverse holds in other papers (Prins, Verhoef, & Franses, 2009). Theoretical arguments can be made to support either direction. Early adopters are often considered to be more innovative and skilful, which would logically lead to a higher usage (Rogers, 2003). Late adopters may also exhibit higher usage due to higher expectations of the service at the time of adoption (Prins, Verhoef, & Franses, 2009).

The difference between the findings of these studies could suggest that the direction of this relationship is driven by characteristics of the innovation being researched, which is why one must consider the specifics of the innovation in question when hypothesizing the direction of the relationship.

[...] that a mobile news application is brought to market as soon as possible, given the established need for an online presence now that print newspaper circulation is in decline (Pew Research Center, 2015). Such a quick release is likely to lead to less-than-optimal functionality, because features such as push messages, location-based services and social media interactivity will not have the developer’s priority. This would suggest that a consumer cannot use the application optimally, or is not motivated enough to do so, in the early stages of the product life cycle. Moreover, research by Boulding, Kalra, Staelin and Zeithaml (1993) indicates that customer expectations are important in determining the overall service quality of an innovation, which in turn is likely to influence usage. Early adopters mostly make their adoption decision based on external sources, such as campaigns (Rogers, 2003), so their expectations of the news app remain quite superficial. Combined with less-than-optimal app functionality, this makes it likely that early adopters will display low usage levels. Later adopters are subject to influences from previous adopters as well as media campaigns. Moreover, the news app is likely to be enhanced over time (e.g. more stable app, additional features, personalization), which makes it plausible that later adopters will display a higher usage level. Due to the nature of a mobile news application and the relative ease with which one can navigate through such applications, it is unlikely that prior experience with news applications, personal innovativeness, or skillsets will significantly affect this relationship.

Hypothesis 1 (+): the adoption timing has a positive effect on the post-adoption usage level

2.5.2. The Effects of Relational Variables

A mobile news application is considered to be a complementary or substituting product as print newspapers, websites, and digital ePapers often already exist for that same newspaper (Chan-Olmsted, Rim, & Zerba, 2013). Therefore, consumers using a news app are likely to have a previous affiliation or contract with the news app brand. There is little literature on relationships within the mobile application domain, but plenty within other domains, such as in telecommunications (Prins & Verhoef, 2007). There are 3 important facets of relationships, which can be described as relationship length, depth and breadth (Bolton, Lemon, & Verhoef, 2004).

[...] likelihood of purchasing or acquiring additional services, such as a mobile news application. This is partly supported by Prins and Verhoef (2007), who find a significant relationship between relationship age and adoption probability, although the effect is non-linear: adoption probability increases with relationship length for customers who have been with the provider for up to 3 years, and declines for customers who have been with the provider longer than 3 years. However, Prins and Verhoef (2007) find no significant main effect of relationship length on the timing of adoption; relationship length only comes into play in interaction with service advertising and competitive service advertising. One could argue that this result concerning relationship length and time of adoption is related to their subject matter, as they study a new e-service in the telecommunications industry. This industry is quite competitive and different providers may offer similar services, so relationship length may only come into play when the new service is advertised, or when prices of competing providers are significantly more attractive to customers than those of current providers. This would not be the case for regional newspapers, like Newspaper A, as they often have distinct and unique local news items that cannot be found in other papers because they are of no relevance to the readers of other newspapers, which means competitive pricing and promotions are likely to be of much less relevance.

[...] may be logical to assume that most of the loyal readers, meaning those who have been subscribers for years, stick to their current habits, and that especially newer readers are willing to try out the new app. Working on this assumption, it is likely that subscribers with long relationship lengths will take longer to adopt, as it is harder for them to break their habit of reading the print paper, web paper or ePaper, whereas new subscribers have not yet developed such habits, or only to a lesser extent. This leads to the following hypotheses:

Hypothesis 2a (+): the relationship length has a positive effect on the adoption timing

Hypothesis 2b (-): the relationship length has a negative effect on the post-adoption usage

Another facet of relationships is the relationship depth, which refers to the usage over time (Bolton, Lemon, & Verhoef, 2004). Research has indicated that prior news consumption is positively related to adoption of mobile news (Chan-Olmsted, Rim, & Zerba, 2013). Moreover, prior news consumption, especially the use of media news sites, is positively related to mobile news usage (Chan-Olmsted, Rim, & Zerba, 2013). The theoretical background behind these positive relationships is that news consumption is a selective and habitual behaviour. This means that users that have a previous affiliation with news consumption are likely to continue this consumption in whatever manner possible. Chan-Olmsted et al. (2013) mention that all types of prior consumption (e.g. print, website, other services) can influence the adoption of a new news channel.

The findings of Chan-Olmsted et al. (2013) are in line with results from other research domains, as Steenkamp and Gielens (2003) find a positive relationship between usage intensity and trial probability. Meanwhile, Prins and Verhoef (2007) find that users with heavy consumption tend to adopt at a faster pace than users who have less consumption. Prins et al. (2009) also find that high levels of category usage are related to both earlier adoption and increased new service usage. Lastly, Risselada, Verhoef & Bijmolt (2014) conclude that greater usage levels lead to a higher propensity to adopt. This leads to the following hypotheses:

Hypothesis 3a (-): the relationship depth has a negative effect on the adoption timing

Hypothesis 3b (+): the relationship depth has a positive effect on the post-adoption usage

[...] platforms offered overlap in subscription options, which makes it impossible to determine the exact number of products that a subscriber has access to. Moreover, the use of multiple news platforms is already partially covered by the relationship depth, which entails prior news consumption of both the website and ePaper.

2.5.3. The Effects of User Generated Review Variables

Liu and Brandyberry (2014) find that overall rating has a significant positive effect on compatibility, one of the key characteristics of IDT, which in turn influences attitude towards an application. This is in line with research by Shen (2015), who shows that overall rating positively influences attitude towards an application and that this effect is even more pronounced for utilitarian applications. In turn, an enhanced attitude towards an application is known to positively influence the intention to use such an application, as illustrated by many studies utilizing the TAM (Davis, 1986; Davis, 1989). A news application’s main function is to provide the consumer with newsworthy information using text, and in some cases imagery or video material. In that sense, a mobile news application is rather utilitarian. Overall user ratings are frequently used by consumers to sort products and they have a certain persuasive effect (Duan, Gu, & Whinston, 2008). This persuasive effect is likely to be stronger when the overall rating is higher (e.g. a 4-star rating instead of a 3-star rating), which would lead to a higher propensity and urge to adopt. Moreover, most app stores incorporate overall rating as a component of their most popular or trending app lists, which would make an app more visible to consumers. This increases the persuasive effect and would lead individuals to adopt the news app more quickly. In addition, an increased overall rating would also feed consumers’ expectations, which, as explained in section 2.5.1., would lead to higher post-adoption usage levels. Therefore, the following hypotheses can be constructed:

Hypothesis 4a (-): the overall user rating has a negative effect on the adoption timing

Hypothesis 4b (+): the overall user rating has a positive effect on the post-adoption usage

[...] of word of mouth. Much like the argument for the increase in overall user rating, an increase in the volume of reviews will also lead to a quicker diffusion of the innovation due to imitation behaviour amongst people in a social system (Rogers, 2003; Liu & Brandyberry, 2014). Naturally, the number of reviews also feeds back into the most popular and trending category lists of app stores. Since an increase in the number of reviews also reflects intensity of use, one can logically expect that it would lead to a higher post-adoption usage. Therefore, the following hypotheses are created:

Hypothesis 5a (-): the volume of reviews has a negative effect on the adoption timing

Hypothesis 5b (+): the volume of reviews has a positive effect on the post-adoption usage

2.5.4. The Effects of Demographic Characteristics

Customer demographics are typically included as control variables. Including them as control variables is especially sound here because the demographic data are measured at the zip-code level, so they cannot be tied directly to individual subscribers.

This study will control for gender, age, income, region and consumer mobility, of which the first 4 are relatively well-researched in the literature, and the last is a rather new but highly relevant concept when considering a mobile app.

Many studies find similar results for gender, namely that males are generally more open to new technology (Prins, Verhoef, & Franses, 2009) and that they are more likely to use, download, or purchase applications that have a utilitarian purpose, whereas women are more likely to opt for applications that provide entertainment and are thus of a hedonic nature (Chong, Chan, & Ooi, 2012; Taylor, Voelker, & Pentina, 2011). News applications are utilitarian in nature and thus it is expected that males will adopt a news app quicker than females and that they will display higher post-adoption usage levels.

The demographic variable of region may be of importance in the setting of news applications as they are likely to be tailored to a specific region. Those living in rural areas are more inclined to read local news in order to acquire information and to help create a feeling of community (Paek, Yoon, & Shah, 2005). However, living in an urban area would expose a person to more marketing stimuli and word-of-mouth. As such, no clear direction is expected.

Income can be taken as a proxy for economic status, which was one of the original demographic variables included in the study by Bohlen and Beal (1957) on which the adopter classification is based. Income has been found to be of significant importance in multiple studies, including the study of Yang (2013) where personal income is significant and positively related to mobile applications usage. It is reasonable to assume that those subscribers with higher incomes will show shorter adoption times and higher post-adoption usage levels.

3. RESEARCH DESIGN

3.1. Data Sources

In order to analyse the problem statement with all the variables that are deemed to be of importance, several datasets will be used that are joined through common identifiers. The datasets will be discussed separately below.

3.1.1. Newspaper Subscriber Database

The Newspaper Subscriber Database is provided by News Inc. and includes 269,506 subscribers of 5 regional newspapers. Of these 269,506 subscribers, 121,047 have no valid Customer IDs that can be linked to other databases.

For the purpose of this study the newspaper with the highest number of identifiable subscribers will be selected. This newspaper is called Newspaper A. There are a total of 79,999 subscribers for Newspaper A. There are several variables available for each subscriber in this dataset:

- Customer ID

- Dummy variable indicating whether the customer has a running contract or not

- Type of Subscription, in which 6 days refers to Monday through Saturday

  o Complete: 6 days’ print & 6 days’ digital

  o Digital Only: 6 days’ digital

  o Digital Print: Saturday print & 6 days’ digital

  o Print Only: 6 days / Saturday / Saturday + Monday / Saturday + Wednesday

  o Webweekend: Saturday print & Premium Digital, but no ePaper

- Acquisition Channel (Action, Receipt, Door-to-door Sales, Leaflet, Recovery, Internet, Third Party, Mailing, Spontaneous, Telemarketing, Free Sample, Street Sales)

- Zip code

- Beginning and (if applicable) end dates of subscription

- Date of Birth

- City of Residence

- Type of Subscriber (Male, Female, Firm or Unknown)

- Dates of birth before 05-05-1905 should not exist, as the oldest living Dutch person was born after 05-05-1905. All birthdates prior to this date were recoded to missing values. This amounted to 6 subscriber records being changed.

- Dates of birth after 19-01-2001 suggest that people who are not yet of age (16) have a valid subscription. As indicated by News Inc. this should not be possible, as these people cannot take out a newspaper subscription. All dates of birth past this date were recoded to missing values. This amounted to 26 subscriber records being changed.

- Small differences in acquisition channel names were corrected (e.g. Leaflet and Leaf Let)
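The date-of-birth recoding described above can be illustrated with a minimal sketch. The two cut-off dates come from the text; the function name, constant names and the example records are hypothetical and only serve as an illustration, not as the procedure actually used.

```python
from datetime import date
from typing import Optional

# Plausibility bounds taken from the text; the names are hypothetical.
EARLIEST_DOB = date(1905, 5, 5)
LATEST_DOB = date(2001, 1, 19)


def clean_date_of_birth(dob: Optional[date]) -> Optional[date]:
    """Return the date unchanged if it is plausible, otherwise None (missing)."""
    if dob is None:
        return None
    if dob < EARLIEST_DOB or dob > LATEST_DOB:
        return None
    return dob


# Hypothetical records: one too early, one plausible, one too late.
records = [date(1899, 12, 31), date(1960, 7, 14), date(2004, 3, 2)]
cleaned = [clean_date_of_birth(d) for d in records]
```

Recoding to a missing value rather than deleting the record keeps the subscriber available for analyses that do not rely on age.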

Several subscribers were excluded from further analyses for the following reasons:

- Duplicate Customer IDs were removed (68 cases)

- Zip codes did not match the city given for the subscriber, which may indicate an error in data collection for this subscriber (11 cases)

- Subscribers which are companies were removed due to the inclusion of usage variables. A firm with all of its employees having access to the newspaper website may bias the results by exhibiting excessive usage figures (1527 cases)

3.1.2. Google Analytics Database

The Google Analytics Database was obtained from Google Analytics and includes data on all users that make use of any one of the Websites, ePapers or News Apps for any title distributed by News Inc. The data is constrained to a time window which runs from 01-01-2015 to 30-11-2015. The data includes multiple IDs, time variables, geographic variables, usage variables (e.g. sessions, session durations, hits), device variables, source and medium variables, page URL variables, and Subscriber ID variables. The Google Analytics data was directly imported from the server and has not been manually altered in any way.

3.1.3. Bisnode Geopostcode Database

The Bisnode Geopostcode Database was provided by News Inc. and includes sociographic, demographic, economic and geographic data on a zip code level. The most important variables to be used in this thesis are:

- Zip-Code

- Income measured in 7 dummy variables taking 1 for true and 0 for false: High, Above Average, Average, Low, Minimal, Diverse, and Unknown on a zip-code level

- Urbanisation measured in 9 dummy variables taking 1 for true and 0 for false: More than 500,000 inhabitants, 250,000 to 500,000 inhabitants, 100,000 to 250,000 inhabitants, 50,000 to 100,000 inhabitants, 20,000 to 50,000 inhabitants, 10,000 to 20,000 inhabitants, 5,000 to 10,000 inhabitants, less than 5,000 inhabitants or Unknown on a city-wide level

- Gender measured in 2 categories reflecting the percentage of males and females within the zip-code

- Age measured in 5 categories reflecting the percentage of that category within the zip-code: 0-14, 15-24, 25-44, 45-64, 65+

3.1.4. Review Database

The review database was constructed by the author of this thesis. It consists of all the user reviews for the news app of Newspaper A and was acquired through data mining of the reviews in the Google Play Store. This was cross-referenced with AppAnnie, an online application monitoring tool, in order to ensure its accuracy and validity. No errors were found in this dataset, which contains the following variables:

- Date of review

- Number of stars of the review (1 to 5)

- Reviewer name

- Review title

- Review message

3.1.5. Final Data and Exclusion of Subscribers

The previously mentioned datasets are blended together using Alteryx, a data blending and advanced analytics software program which utilizes drag and drop tools and workflows. The Subscriber Database is combined with the Google Analytics Database by means of Unique Subscriber IDs. This has led to the exclusion of 57,264 subscribers who had no records in the Google Analytics database because they do not use any of Newspaper A’s online products, which leaves 21,129 subscribers. A few more subscribers are excluded for the following reasons:

- 59 subscribers who have a subscription to Newspaper A, but have only accessed digital content from Newspapers B ~ E, which may indicate a data collection error

- 46 subscribers who had zip codes that did not match the Bisnode Geopostcode database and thus had no demographic variables at all

- 535 subscribers who had unknown values for gender, age category, income or urbanisation

This leaves a total set of 20,489 subscribers of Newspaper A for further analysis. The resulting database is then combined with the Bisnode Geopostcode Database by means of zip codes. After computing an adoption date for those subscribers who have adopted the news application, the database is combined with the Review Database by means of dates. In the combined dataset all variables that will be used in this study were derived or constructed using the operationalisations listed in Table 1.
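The subscriber counts reported in Sections 3.1.1 and 3.1.5 can be reconciled with a short calculation. The sketch below assumes that the three cleaning exclusions of Section 3.1.1 were applied before the merge with the Google Analytics Database; the text does not state this order explicitly, but the reported totals are consistent with it. All variable names are hypothetical.

```python
# Counts as reported in the text; variable names are hypothetical.
newspaper_a_subscribers = 79_999

cleaning_exclusions = {
    "duplicate customer IDs": 68,
    "zip code does not match city": 11,
    "firms": 1_527,
}
after_cleaning = newspaper_a_subscribers - sum(cleaning_exclusions.values())

# Subscribers without any record in the Google Analytics Database.
no_analytics_record = 57_264
after_analytics_merge = after_cleaning - no_analytics_record

final_exclusions = {
    "only accessed content of Newspapers B to E": 59,
    "zip code not in Bisnode Geopostcode Database": 46,
    "unknown gender, age category, income or urbanisation": 535,
}
final_sample = after_analytics_merge - sum(final_exclusions.values())

print(after_cleaning, after_analytics_merge, final_sample)
```

The intermediate and final totals reproduce the 21,129 and 20,489 subscribers mentioned in the text.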

Table 1. Operationalization of Research Variables.

| Variable | VAR | Proposed Operationalization |
|---|---|---|
| Post-Adoption Usage | PAUNA | The average session duration in seconds for the News App in the 60 days post-adoption for a subscriber |
| Relationship Length | RL | The number of months a subscriber has had a subscription with the newspaper at the introduction date of the News App (31-03-2015) |
| Relationship Depth | RD | The average session duration in seconds for Website and ePaper of the subscribed newspaper in the 60 days prior to the moment of adoption for a subscriber |
| Lag Overall Rating | LOR | The average user rating score for the News App measured on a scale of 1 to 5 with 1 decimal point at the day prior to adoption for a subscriber |
| Lag Number of Reviews | LNOR | [...] |
| Adoption Timing | AT | The number of months between the News App introduction (31-03-2015) and adoption date for a subscriber |
| Consumer Mobility | CM | The average daily percentage of times that a subscriber has accessed digital content from a city that does not match his or her residential city |
| Gender | G | Percentage of male individuals in zip code so that the reference category is female |
| Income | INC | Bisnode Zip Code Class Dummies (High, Above Average, Average, Low, Minimal, and Diverse) in which the reference category is Average |
| Region | URB | Bisnode Zip Code Class Dummies for Inhabitants (500K, 250K, 100K, 50K, 20K, 10K, 5K, and < 5K) in which the reference category is < 5K |
| Age | AGE | Bisnode Zip Code Class Percentages (0-14, 15-24, 25-44, 45-64, and 65+) in which the reference category is 65+ |

3.2. Data Descriptives

In order to get a grasp of the data it is recommended to check the descriptives for ‘odd’ or ‘impossible’ values and extreme outliers, and to see what kind of distribution the data has. This is useful as it may uncover improbable or impossible values that need to be corrected or accounted for in any subsequent analysis (see Table 11, Appendix A). Investigating the descriptives reveals several interesting facts. First, the dummies for cities with more than 500,000 inhabitants, between 250,000 and 500,000 inhabitants, and between 100,000 and 250,000 inhabitants are always 0, indicating that there are no cities of such size in the dataset. These dummies are therefore excluded from now on.

Second, post-adoption usage and relationship depth are skewed to the right with several extreme outliers (see Figure 11 and Figure 12, Appendix A). It should not matter whether data are skewed or not, but depending on the results one can implement transformations in order to pull in some of the outliers and make the data follow a more normal – or bell-shaped – distribution. The fact that there is a high frequency of low-value post-adoption usage values does give cause for concern. Subscribers who display only a few seconds of post-adoption usage cannot really be considered users of the news app, as using it would entail consuming a news article, which cannot be done in a few seconds. It is therefore prudent to investigate the conceptual model only on the subscribers who display a post-adoption usage of at least 60 seconds. In this manner one can account for those subscribers who do not actually use the news app, but rather download, try, and delete it in a matter of seconds.
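The 60-second threshold can be illustrated with a minimal sketch. The subscriber IDs and usage values are hypothetical, and the natural logarithm is only one possible transformation for pulling in right-skewed values; the text does not specify which transformation, if any, is applied.

```python
import math

# Threshold taken from the text; the name is hypothetical.
MIN_USAGE_SECONDS = 60

# Hypothetical average session durations (seconds) per subscriber.
post_adoption_usage = {"s1": 4.0, "s2": 75.5, "s3": 60.0, "s4": 1830.0}

# Keep only subscribers with at least 60 seconds of post-adoption usage.
active_users = {
    subscriber: seconds
    for subscriber, seconds in post_adoption_usage.items()
    if seconds >= MIN_USAGE_SECONDS
}

# Assumed transformation: a log compresses the long right tail.
log_usage = {subscriber: math.log(seconds) for subscriber, seconds in active_users.items()}
```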

[...] of reviews, which is due to the high proportion of non-adopters within the sample (see Figure 13 and Figure 14, Appendix A). This should not be an issue for the Cox Proportional Hazard Model, as it accounts for right-censoring (i.e. subscribers who do not adopt within the timeframe of the research), nor for the Regression Model, as non-adopters do not have a post-adoption usage.

Fourth, the lag overall rating has few distinct values and a very high occurrence of the value 2.8 (see Figure 15, Appendix A). This is also likely to be caused by the high proportion of non-adopters in the sample and should not pose any issues for the Regression Model. One should be careful when implementing the Cox Proportional Hazard Model, however, as it may cause problems there. URB500K, URB250K, and URB100K are constant throughout the dataset and can be left out of any further analysis (see Table 11, Appendix A).

3.3. Research Method

The problem statement will be tackled by estimating 2 separate models for 2 dependent variables. The first model is a Survival Model. This model is intended to model the timing or duration of a certain behaviour displayed by consumers, which in the case at hand is the adoption of the news app. A specific class of survival models, namely the Cox Proportional Hazard model, will be used in this thesis as it specifically allows for right-censored data, so that it includes the customers who do not adopt at all within the given timeframe. The Cox Proportional Hazard Model will model the baseline hazard, which is merely a function of time, as well as any effect on the baseline hazard caused by a set of independent variables. The model specification can be found in Equation 1.

Equation 1. Specification of the Cox Proportional Hazard Model SM1A.

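$$h_i(t) = h_0(t)\,\exp\big(\beta_1 \mathit{RL}_i + \beta_2 \mathit{RD}_i + \beta_3 \mathit{LNOR}_{t_i^{*}-1} + \beta_4 \mathit{LOR}_{t_i^{*}-1} + \beta_5 \mathit{CM}_i + \beta_6 \mathit{G}_i + \beta_7 \mathit{INC}_i + \beta_8 \mathit{URB}_i + \beta_9 \mathit{AGE}_i\big)$$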

In which:

- h(t) is the hazard rate, which is the conditional probability that the event occurs at time t, given that it has not occurred until time t

- h0(t) is the baseline hazard, which is a function of time t only

- Subscript t*i-1 denotes that the variable in question is measured at the period prior to the moment of adoption t of subscriber i

The explanation of the other variables in Equation 1 is similar to those discussed previously (see Table 1).
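The proportionality implied by Equation 1 can be illustrated with a minimal sketch: because the baseline hazard enters multiplicatively, the ratio of the hazards of two subscribers does not depend on time. The coefficients, covariate values and baseline hazard values below are hypothetical and are not estimates from this study.

```python
import math


def hazard(baseline_hazard: float, coefficients: dict, covariates: dict) -> float:
    """Cox-type hazard: baseline hazard times exp of the linear predictor."""
    linear_predictor = sum(coefficients[name] * covariates[name] for name in coefficients)
    return baseline_hazard * math.exp(linear_predictor)


# Hypothetical coefficients and two hypothetical subscribers.
beta = {"RL": 0.0007, "RD": 0.0003}
subscriber_a = {"RL": 120, "RD": 300}
subscriber_b = {"RL": 12, "RD": 30}

# Baseline hazard at three arbitrary points in time (hypothetical values).
ratios = []
for h0 in (0.01, 0.05, 0.20):
    ratios.append(hazard(h0, beta, subscriber_a) / hazard(h0, beta, subscriber_b))

print(ratios)  # the same ratio at every point in time
```

This constant ratio is what the proportional hazards (PH) test checks for each covariate: a significant test suggests that the effect of that covariate changes over time.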

The second model is a Multiple Linear Regression Model which allows for the modelling of a continuous dependent variable with multiple explanatory variables. The multiple linear regression is subject to several assumptions which will be discussed in detail after the estimation in order to validate the model. The model specification can be found in Equation 2.

Equation 2. Specification of the Multiple Linear Regression Model RM1A.

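$$\mathit{PAUNA}_i = \alpha + \beta_1 \mathit{RL}_i + \beta_2 \mathit{RD}_i + \beta_3 \mathit{LNOR}_{t_i^{*}-1} + \beta_4 \mathit{LOR}_{t_i^{*}-1} + \beta_5 \mathit{AT}_i + \beta_6 \mathit{CM}_i + \beta_7 \mathit{G}_i + \beta_8 \mathit{INC}_i + \beta_9 \mathit{URB}_i + \beta_{10} \mathit{AGE}_i + \varepsilon_i$$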

In which:

- Subscript t*i-1 denotes that the variable in question is measured at the period prior to the moment of adoption t of subscriber i

- ε is the disturbance term

4. SURVIVAL MODELS FOR ADOPTION TIMING

4.1. Estimation and Validation

As mentioned previously, when modelling adoption timing one should take into account that not all customers will adopt the innovation within a given timeframe. This is why a Cox Proportional Hazard Model is used in this study.

The first model, which will be referred to as Model SM1A, is a Cox Proportional Hazard Model with all the variables as they are operationalized (see Table 2).

Table 2. Cox Proportional Hazard Model SM1A (Ev. = 1170 / Cens. = 5832).

| Variable | B | Exp(B) | Low 95% CI | Up 95% CI | PH Test |
|---|---|---|---|---|---|
| Relationship Length | 0.0004 . | 1.0004 | 1.0000 | 1.0007 | 0.073 |
| Relationship Depth | 0.0001 | 1.0001 | 0.9999 | 1.0002 | 0.170 |
| Lag Number of Reviews | (0.2400) *** | 0.7867 | 0.7772 | 0.7962 | 0.000 *** |
| Lag Overall Rating | 3.7980 *** | 44.6019 | 25.8108 | 77.0734 | 0.000 *** |
| Consumer Mobility | 0.0008 | 1.0008 | 0.9989 | 1.0027 | 0.536 |
| Gender: Male++++ | 0.1992 | 1.2204 | 0.4184 | 3.5593 | 0.836 |
| Age: 0-14++ | 0.1557 | 1.168 | 0.5134 | 2.6596 | 0.901 |
| Age: 15-24++ | 0.3988 | 1.4900 | 0.5985 | 3.7095 | 0.110 |
| Age: 25-44++ | (0.1260) | 1.0127 | 0.5010 | 2.0467 | 0.805 |
| Age: 45-64++ | (0.1228) | 0.8844 | 0.4624 | 1.6915 | 0.944 |
| Income: High+ | (0.1438) | 0.8661 | 0.6651 | 1.1278 | 0.605 |
| Income: Above Average+ | (0.0360) | 0.9646 | 0.8343 | 1.1154 | 0.705 |
| Income: Low+ | 0.0809 | 1.0842 | 0.8606 | 1.3660 | 0.713 |
| Income: Minimal+ | 0.2352 | 1.2652 | 0.7586 | 2.1099 | 0.450 |
| Income: Diverse+ | (0.0252) | 0.9751 | 0.8027 | 1.1845 | 0.122 |
| Urbanisation: 50-100K+++ | (0.0509) | 0.9504 | 0.7749 | 1.1656 | 0.366 |
| Urbanisation: 20-50K+++ | (0.0364) | 0.9643 | 0.7825 | 1.1883 | 0.481 |
| Urbanisation: 10-20K+++ | 0.0225 | 1.0227 | 0.8433 | 1.2402 | 0.771 |
| Urbanisation: 5-10K+++ | (0.0340) | 0.9666 | 0.8037 | 1.1625 | 0.176 |
| GLOBAL MODEL PH TEST | | | | | 0.0000 *** |

In which: . = p-value < 0.1; * = p-value < 0.05; ** = p-value < 0.01; and *** = p-value < 0.001

+ = reference category is Income: Average

++ = reference category is Age: 65+

Judging from the output in Table 2, there are 2 highly significant variables: the lag number of reviews and the lag overall rating. However, there seems to be an issue with both of these variables. The B coefficient of the lag overall rating is 3.7980, which means that a one-unit increase in the lag overall rating would bring about an increase of about 4361% in the hazard rate. Since the hazard rate to adopt is modelled, this would imply a drastically shorter time to adoption. While it may seem fortunate that such a significant effect is found, it seems rather implausible. A potential explanation for this ‘odd’ result may be that the lag overall rating is constrained to values between 2.3 and 3.8, of which 2.8 occurs most frequently (see Figure 15, Appendix A). A one-unit change in the predictor barely fits within this range, so it may not be entirely realistic. Moreover, it is unlikely that an aggregate rating would suddenly change by a full point, as it is averaged over all reviews. This suggests that an interaction effect between the lag number of reviews and the lag overall rating may improve the model, especially when considering the fact that the proportional hazard assumption is violated for both the lag number of reviews and the lag overall rating (see Table 2).
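The percentage interpretation of a Cox coefficient follows from Exp(B). As a minimal sketch, the B coefficient of the lag overall rating in Table 2 can be converted as follows; the function names are hypothetical, and the 0.1-point change is added only to show what a change within the observed rating range would imply.

```python
import math


def hazard_ratio(b: float, delta: float = 1.0) -> float:
    """Multiplicative change in the hazard for a change of `delta` in the predictor."""
    return math.exp(b * delta)


def percent_change_in_hazard(b: float, delta: float = 1.0) -> float:
    """Percentage change in the hazard for a change of `delta` in the predictor."""
    return (hazard_ratio(b, delta) - 1.0) * 100.0


b_lag_overall_rating = 3.7980  # B coefficient reported in Table 2

ratio_full_point = hazard_ratio(b_lag_overall_rating)                     # about 44.6
change_full_point = percent_change_in_hazard(b_lag_overall_rating)        # about 4361 %
change_tenth_point = percent_change_in_hazard(b_lag_overall_rating, 0.1)  # about 46 %
```

Even a change of 0.1 rating points would correspond to a hazard increase of roughly 46%, which underlines how large the estimated coefficient is.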

Therefore, the model specification is altered in Equation 3 and a new model (SM1B) is estimated which incorporates the interaction effect between the lag number of reviews and the lag overall rating.

Equation 3. Cox Proportional Hazard Model SM1B.

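$$h_i(t) = h_0(t)\,\exp\big(\beta_1 \mathit{RL}_i + \beta_2 \mathit{RD}_i + \beta_3 \mathit{LNOR}_{t_i^{*}-1} + \beta_4 \mathit{LOR}_{t_i^{*}-1} + \beta_5 \mathit{CM}_i + \beta_6 \mathit{G}_i + \beta_7 \mathit{INC}_i + \beta_8 \mathit{URB}_i + \beta_9 \mathit{AGE}_i + \beta_{10}\,(\mathit{LNOR}_{t_i^{*}-1} \times \mathit{LOR}_{t_i^{*}-1})\big)$$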

The model output can be found in Table 12, Appendix B. Both lag variables are still highly significant, and so is the interaction. However, while the interaction effect has a B coefficient of (0.5194), the main effect of the lag overall rating exhibits an even higher B of about 8.604, which seems even more improbable as this would lead to an increase in the hazard rate of about 545,242%. In addition, the lag number of reviews, the lag overall rating and the interaction term all violate the proportional hazard assumption, so the interaction clearly did not solve the issue, but made it far worse. It is clear that the range and distribution of the lag overall rating are not sufficient for a Cox Proportional Hazard Model.

One could account for this by creating dummy variables in which the lag overall rating is recoded into 3 categories, one being lower than 3, the second equal to 3, and the last higher than 3, to incorporate potential non-linearities. This leads to the re-specification in Equation 4.

Equation 4. Cox Proportional Hazard Model SM1C.

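$$h_i(t) = h_0(t)\,\exp\big(\beta_1 \mathit{RL}_i + \beta_2 \mathit{RD}_i + \beta_3 \mathit{LNOR}_{t_i^{*}-1} + \beta_4 \mathit{LORCAT}_{t_i^{*}-1} + \beta_5 \mathit{CM}_i + \beta_6 \mathit{G}_i + \beta_7 \mathit{INC}_i + \beta_8 \mathit{URB}_i + \beta_9 \mathit{AGE}_i\big)$$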
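The recoding of the lag overall rating into three categories can be sketched as follows. The threshold of 3 comes from the text; the function names and dummy names are hypothetical, and the neutral category (a rating of exactly 3) is assumed to be the reference category.

```python
def categorise_rating(lag_overall_rating: float) -> str:
    """Recode a rating (1 decimal) as negative (<3), neutral (=3) or positive (>3)."""
    rating = round(lag_overall_rating, 1)
    if rating < 3.0:
        return "negative"
    if rating > 3.0:
        return "positive"
    return "neutral"


def rating_dummies(lag_overall_rating: float) -> dict:
    """Two dummies; the neutral category is the omitted reference category."""
    category = categorise_rating(lag_overall_rating)
    return {
        "LOR_negative": int(category == "negative"),
        "LOR_positive": int(category == "positive"),
    }


examples = {rating: rating_dummies(rating) for rating in (2.8, 3.0, 3.8)}
```

With such a coding, each B coefficient is read as the effect of belonging to that category relative to a neutral rating.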

While the lag number of reviews as well as both categories of the lag overall rating are significant, the results of this model lead to similar issues. This is indicated by the high B coefficients of 3.697 and 4.228 for the negative and positive categories respectively (see Table 13, Appendix B). These values raise doubt not only because of their magnitude, but also because belonging to the negative overall rating category, as compared to the neutral reference category, would have a positive effect on the hazard rate to adopt. One would logically expect the opposite direction, as negative ratings usually have more negative effects than neutral ratings. This leads one to conclude that neither a continuous nor a categorical operationalization of the lag overall rating can be included in a Cox Proportional Hazard Model.

Alternatively, the dataset has both the number of reviews and the overall rating measured per date, so these covariates can be allowed to vary over time. This would once again call for a re-specification of the model, which can be found in Equation 5.

Equation 5. Cox Proportional Hazard Model SM2A w/ Time-Dependent Covariates LNOR and LOR.

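$$h_i(t, \mathit{LNOR}, \mathit{LOR}) = h_0(t)\,\exp\big(\beta_1 \mathit{RL}_i + \beta_2 \mathit{RD}_i + \beta_3 \mathit{LNOR}_{t_i^{*}-1} + \beta_4 \mathit{LOR}_{t_i^{*}-1} + \beta_5 \mathit{CM}_i + \beta_6 \mathit{G}_i + \beta_7 \mathit{INC}_i + \beta_8 \mathit{URB}_i + \beta_9 \mathit{AGE}_i\big)$$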
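Time-dependent covariates are commonly supplied to a Cox model in a start/stop ("counting process") layout, in which each subscriber contributes one row per interval. The sketch below assumes daily intervals and uses the value of the previous day as the lagged covariate; the function name, the interval length and all review values are hypothetical and do not describe the exact data structure used in this study.

```python
from datetime import date, timedelta


def to_counting_process(intro: date, last_day: date, daily_covariates: dict, adopted: bool) -> list:
    """Split follow-up into one-day (start, stop] rows with lagged covariates.

    `last_day` is the adoption date for adopters, or the end of the
    observation window for right-censored subscribers.
    """
    rows = []
    day = intro
    while day < last_day:
        next_day = day + timedelta(days=1)
        rows.append({
            "start": (day - intro).days,
            "stop": (next_day - intro).days,
            # The event is only flagged in the interval that ends at adoption.
            "event": int(adopted and next_day == last_day),
            # Lagged values: measured on the day before the interval ends.
            "LNOR": daily_covariates[day]["LNOR"],
            "LOR": daily_covariates[day]["LOR"],
        })
        day = next_day
    return rows


intro = date(2015, 3, 31)  # introduction date of the News App
reviews = {  # hypothetical daily review statistics
    date(2015, 3, 31): {"LNOR": 3, "LOR": 2.7},
    date(2015, 4, 1): {"LNOR": 5, "LOR": 2.8},
    date(2015, 4, 2): {"LNOR": 6, "LOR": 2.8},
}

adopter_rows = to_counting_process(intro, date(2015, 4, 3), reviews, adopted=True)
censored_rows = to_counting_process(intro, date(2015, 4, 3), reviews, adopted=False)
```

In this layout an adopter has exactly one row with an event, while a right-censored subscriber has none.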

When estimating the model an error occurred that the matrix was singular for both the lag number of reviews and the lag overall rating. This means that both the lag overall rating and the number of reviews are not meaningful at all, even though they showed as highly significant in previous models. This high significance was due to the fact that one of their values perfectly predicts an outcome, which is a problem known as perfect classification. Therefore, both variables will be dropped from further survival model analyses.

Equation 6. Cox Proportional Hazard Model SM2B.

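$$h_i(t) = h_0(t)\,\exp\big(\beta_1 \mathit{RL}_i + \beta_2 \mathit{RD}_i + \beta_5 \mathit{CM}_i + \beta_6 \mathit{G}_i + \beta_7 \mathit{INC}_i + \beta_8 \mathit{URB}_i + \beta_9 \mathit{AGE}_i\big)$$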

The output of Model SM2B can be found in Table 14 (Appendix B). In this model, no ‘odd’ B coefficients are found and several other variables show significance levels that were previously not observed. Both relationship variables are found to be highly significant. Relationship length and relationship depth have positive B’s, which indicates that an increase of one unit in either one leads to an increase in the hazard rate of 0.07% and 0.03% respectively, and thus to earlier adoption. Moreover, the control variables for the percentage of 15-24 year olds and the high income dummy show significant effects. A one-unit increase in the percentage of 15-24 year olds within a zip code, as compared to the reference category of 65+, leads to an increase in the hazard rate of about 242.6%. Lastly, a one-unit increase in the dummy variable of high income, as compared to the reference category of average income, leads to an increase in the hazard rate of about 46.6%.

However, the model still includes variables that are not at all significant, and there seems to be an issue with the proportional hazards assumption for percentage of 15-24 year olds, which may indicate a potential interaction with another predictor variable or over time. The consumer mobility variable shows this same violation, but is not even close to being significant, while the relationship depth variable shows a very low significance level which may also indicate a potential violation of the proportional hazards assumption.

In a new model specification (see Equation 7) the insignificant variables are left out and the model results are shown in Table 15 (Appendix B).

Equation 7. Cox Proportional Hazard Model SM2C.

$$h_i(t) = h_0(t)\,\exp\big(\beta_1 \mathit{RL}_i + \beta_2 \mathit{RD}_i + \beta_3 \mathit{INC}_i + \beta_4 \mathit{AGE}_i\big)$$

[...] hazard assumption is violated over time. It may be that the effect of the percentage of 15-24 year olds on the hazard rate differs over time. In order to create such an interaction, one can use the structure created in Model SM2A and introduce a new predictor which relates the percentage of 15-24 year olds to the stop variable, which is part of the time-dependent covariate structure. This leads to Equation 8. The output of the new model, SM2D, is presented in Table 16, Appendix B.

Equation 8. Cox Proportional Hazard Model SM2D w/ Interaction between Age Group 15-24 and a Time-Component

$$h_i(t, \mathit{P\_AGE\_14\_25}) = h_0(t)\,\exp\left(\beta_1 RL_i + \beta_2 RD_i + \beta_3 INC_i + \beta_4 AGE_i + \beta_5\,(\mathit{P\_AGE\_14\_25}_i : \mathit{time})\right)$$
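As a rough illustration of how the interaction term in Equation 8 can be built, the sketch below multiplies the age covariate by the stop value of each interval in a start/stop (counting-process) data layout. The rows, the column names other than P_AGE_14_25, and the helper function are hypothetical; the scale of the percentage (0-1 versus 0-100) and the actual software used for estimation are not specified here and are assumed.

```python
# ASSUMPTION: hypothetical start/stop rows, one row per subscriber per month at risk.
# P_AGE_14_25 is constant per subscriber; "stop" is the end of the interval in months.
rows = [
    {"id": 1, "start": 0, "stop": 1, "event": 0, "P_AGE_14_25": 0.12},
    {"id": 1, "start": 1, "stop": 2, "event": 1, "P_AGE_14_25": 0.12},
    {"id": 2, "start": 0, "stop": 1, "event": 0, "P_AGE_14_25": 0.30},
    {"id": 2, "start": 1, "stop": 2, "event": 0, "P_AGE_14_25": 0.30},
    {"id": 2, "start": 2, "stop": 3, "event": 1, "P_AGE_14_25": 0.30},
]


def add_time_interaction(rows, covariate, time_col="stop"):
    """Return copies of the rows with an added covariate-by-time product column."""
    new_col = f"{covariate}_x_time"
    return [{**row, new_col: row[covariate] * row[time_col]} for row in rows]


rows_with_interaction = add_time_interaction(rows, "P_AGE_14_25")

for row in rows_with_interaction:
    print(row["id"], row["stop"], round(row["P_AGE_14_25_x_time"], 2))
```

Because the product changes from one interval to the next while the age covariate itself stays fixed, the new column behaves as a time-dependent covariate, which is what allows its coefficient to capture a changing effect over time.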

In this model one can clearly observe a significant interaction effect between the percentage of 15-24 year olds in a zip code and time on the hazard rate. The B coefficient of the percentage of 15-24 year olds is negative, whereas the B coefficient of its interaction with time is positive. This means that the percentage of 15-24 year olds has a negative partial effect on the hazard rate of approximately 30.7%, but the effect turns positive and grows over time at a rate of approximately 112.4% per month, so that the negative partial effect dissipates after about half a month (as the B coefficient of the interaction is positive and, in absolute terms, about twice as large as the B coefficient of the percentage of 15-24 year olds). Nonetheless, there still seems to be a violation of the proportional hazards assumption for relationship depth, as the corresponding test is significant (see Table 16, Appendix B).
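The "half a month" figure for the age interaction can be checked with a short calculation. On the log-hazard scale the combined effect of the percentage of 15-24 year olds at time t is B_main + B_interaction × t, which is zero at t = −B_main / B_interaction. The sketch below is an illustration only: the two coefficients are back-calculated from the quoted percentage changes (−30.7% and +112.4% per month) and are not read from Table 16, and the variable names are hypothetical.

```python
import math

# Percentage changes in the hazard rate as quoted in the text for Model SM2D.
pct_main = -30.7          # partial effect of the percentage of 15-24 year olds
pct_interaction = 112.4   # change per month through the interaction with time

# ASSUMPTION: coefficients implied by the quoted percentages, not taken from Table 16.
b_main = math.log(1.0 + pct_main / 100.0)
b_interaction = math.log(1.0 + pct_interaction / 100.0)


def combined_log_hazard_effect(t_months):
    """Combined effect on the log hazard of a one-unit increase at time t (in months)."""
    return b_main + b_interaction * t_months


# Time at which the negative main effect is exactly offset by the interaction.
break_even_months = -b_main / b_interaction

print(f"implied B (main) = {b_main:.3f}")
print(f"implied B (interaction) = {b_interaction:.3f}")
print(f"break-even after {break_even_months:.2f} months")
```

Before the break-even point the combined effect lowers the hazard rate; after it, a higher percentage of 15-24 year olds raises the hazard rate, and increasingly so with each additional month.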

After several estimations one can conclude that no interaction between relationship depth and any other predictor variable or time resolves this violation, as none of these interactions is significant. Since the relationship depth variable is operationalized as the average session duration for website and ePaper combined (see Table 1), this may be what drives the violation. One could account for this by incorporating a dummy variable which is 1 for those who have adopted the ePaper and 0 for those who have not and thus only use the website. This leads to Equation 9 for Model SM2E.

Equation 9. Cox Proportional Hazard Model SM2E w/ Interaction between Age Group 15-24 and a Time-Component.

$$h_i(t, \mathit{P\_AGE\_14\_25}) = h_0(t)\,\exp\left(\beta_1 RL_i + \beta_2 RD_i + \beta_3 INC_i + \beta_4 AGE_i + \beta_5\,ePD_i + \dots\right)$$
