
The effects of app pricing structures on product evaluations

Author: Mark Wolkenfelt (M.R.J. Wolkenfelt)
Student number: 10671269
Class: 2016/2017
Amsterdam, June 23, 2017


Statement of Originality

This document is written by Mark Wolkenfelt, who declares to take full responsibility for the contents of this document. I declare that the text and the work presented in this document are original and that no sources other than those mentioned in the text and its references have been used in creating it. The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.


Table of Contents

Abstract

1. Introduction

2. Literature review and hypotheses
2.1 Microtransactions
2.2 Distinctiveness
2.3 Consumer satisfaction
2.4 Mental Accounting
2.5 Information availability
2.6 Endowment effect
2.7 Evaluation sentiment
2.8 Pricing impact
2.9 Evaluation assertiveness
2.9.1 Price fairness
2.9.2 Betrayal

3. Data and method
3.1 Methodology
3.2 Data collection
3.3 The research sample
3.4 Operationalisation of constructs
3.4.1 Pricing Models
3.4.2 Evaluation Sentiment
3.4.3 Evaluation assertiveness
3.4.4 Pricing & impact
3.4.5 Additional variables
3.5 Machine Learning – Topic Identification
3.6 Machine Learning – Sentiment & Assertiveness measurement
3.7 Statistical analyses

4. Results
4.1 Descriptive statistics
4.2 Pricing models and overall evaluation sentiment (H1)
4.3 Pricing models and pricing evaluation sentiment (H2)
4.4 Relative pricing topic impact (H3 & H4)
4.5 Pricing models and overall evaluation assertiveness (H5)
4.6 Pricing structures and pricing evaluation assertiveness (H6)
4.7 Evaluation sentiment and mediating effect (H7 & H8)

5. Discussion
5.1 General results discussion
5.2 Managerial implications
5.3 Discussion points
5.3.1 Strengths & limitations
5.3.2 Future research

6. Conclusion

7. References

8. Appendices
8.1 Appendix I – Training Sample
8.2 Appendix II – Pricing topic classifier
8.3 Appendix III – Node.js crawler
8.3.1 License & download
8.3.2 Installation
8.3.3 Package dependencies
8.3.4 Crawler library
8.3.5 Source codes


Abstract

Purpose – The purpose of this paper is to contribute to the marketing literature and practice by examining the relationship between app pricing structures and product evaluations. The developed framework and machine learning tools form a basis for new research methods.

Design/methodology/approach – A set of hypotheses was developed that states the amount and type of impact different pricing structures have on buyers' product evaluations. Data extracted from the Google Play Store is used to test the hypotheses.

Findings – App pricing models affect consumer review sentiment, assertiveness and topics. Removing upfront payment obligations positively affects the overall and pricing-specific consumer sentiment and reduces assertiveness. Furthermore, evaluation assertiveness is directly affected by the evaluation sentiment.

Research limitations/implications – The results reveal new effects of pricing models on the consumer experience and form a basis for future research. The study was conducted in the gaming category of the Google Play Store, and the generalizability of the findings to other app segments or marketplaces should be tested further.

Practical implications – This study explains the effects of traditional and new app pricing structures on consumer product evaluations. This information can support firms in choosing a pricing strategy for their apps.

Originality/value – The paper uses new data collection and machine learning tools in the context of the Google Play Store. The tools developed for this paper can form a foundation for future research in a digital context by introducing efficient collection and analysis methods for large amounts of qualitative and quantitative data.


1. Introduction

Despite the relatively short existence of mobile applications (apps), more than 300 billion apps have already been downloaded by consumers, and apps have proved capable of creating million-dollar revenue streams for companies (Lunden, 2015). This business potential and the low entry barriers of both the Apple App Store and the Google Play Store have attracted many new entrants to these marketplaces and increased the rivalry between app developers. Since Google launched its app marketplace (now the Play Store) in October 2008, the number of apps available has grown from 200 to 2.2 million. As a result, the market became more saturated, making it harder for developers to achieve top positions in the charts and become profitable. Estimates indicate that at the end of 2014, 2% of the apps were responsible for more than 50% of all revenues made.

During the expansion of the app stores and due to increasing levels of rivalry, developers started using new ways of attracting payments from consumers (Fields, 2014). The initial pricing model available for developers was premium, which required an upfront payment in order to use the app. Driven by the need for differentiation, Apple was the first to integrate the concept of in-app purchases in its marketplace, with Google following shortly after. This new form of in-app payment allows developers to integrate microtransactions for advanced features or digital goods (Chen, 2009; Átila et al., 2014). Developers quickly started using these new mechanisms, and the in-app payment became a popular monetisation method. The growing commercial importance of these new pricing structures is also supported by reports from Gartner (2013), which predict that in-app purchases will represent 48% of app store revenue in 2017, up from only 11% in 2012.

Due to rapid developments in information technology, new and more effective research methods are emerging that have the potential to disrupt traditional research methods (Kemper, 2016). Artificial intelligence programs provide the ability to collect and analyse large amounts of quantitative and qualitative data with relatively limited resources. The core contribution of this paper, besides the theoretical implications, is forming a foundation for the development of new digital research tools with the potential to benefit future science in general. The foundation of these new techniques created for this paper is applied in the context of the global app industry.

The fast growth of the app industry and the introduction of new, successful pricing structures for apps have increased the importance of understanding the effects of these new models on consumers. Developers have little knowledge about the potential effects of in-app purchases on the consumer experience, and most research related to this topic uses basic quantitative methods and lacks proper context. In this paper, theories of traditional pricing models are applied to the new structures and the impact on buyers' product evaluations is analysed. Linking new pricing models with existing theories could generate new insights into the effects of new pricing structures on customer satisfaction as expressed in online reviews. This information could support app developers in understanding the effects of pricing on the customer experience and form a basis for improving the product experience for users.


2. Literature review and hypotheses

2.1 Microtransactions

Jansen and Bloemendal (2013) defined app stores as "online curated marketplaces that allow developers to sell and distribute their products to actors within one or more multi-sided software ecosystems". The digital distribution of apps does not require physical carriers, and the current marketplaces for apps allow for unlimited shelf space, reducing the marginal production costs for companies in this industry. With these features now available to all developers, the drivers of competitive advantage within this industry are no longer derived from these operational factors. Rietveld (2016) indicates that for app developers, competitive advantage is no longer established through their competence in strategic factor markets, but primarily through their capability to attract payments from consumers. The introduction of microtransactions by Google and Apple created new ways for companies to differentiate and attract payments from consumers. As a result, firms in numerous markets, including video games, mobile apps and music, started to integrate in-app purchases in their digital products.

Currently, the three main pricing structures present in the app stores are premium, freemium and premium+. The initial pricing model available for developers was premium, which requires an upfront payment in order to use the app but does not offer additional payment options within the application. Freemium and premium+ both use in-app purchases to attract payments from consumers. Products with a freemium pricing model offer basic features for free and allow developers to monetize users for complementary features or continued use through microtransactions (McGrath, 2010). Premium+ apps also contain in-app purchases, but require an additional upfront payment in order to download the app, which makes this model a hybrid of the premium and freemium structures.

2.2 Distinctiveness

Integrating microtransactions within applications is more than just changing the revenue model, because it adds distinct elements to the functionality of the final product. These new elements create more involved interactions with consumers and therefore influence the overall user experience (Hienerth et al., 2011). Two categories of in-app purchases are currently integrated in mobile applications: the first group consists of virtual currencies that help a user proceed in the application, and the second category contains purchases that unlock extra features or content. The potential value of these methods is promising; the two most successful free games using both categories of in-app purchases, Clash of Clans and Hay Day, generate more than 2.4 million dollars on a daily basis (Strauss, 2013). Another popular app, Angry Birds 2, integrated a progression-based in-app purchasing model. Without using microtransactions, players get four attempts to complete a certain level within a limited amount of time. If players use all four attempts or exceed the time limit, they can restart the level by paying with an in-game currency, which can be accumulated by successfully completing challenges in the app. However, the amount of free in-game currency rewarded quickly becomes inadequate when players advance to increasingly difficult levels. App users then have to choose between waiting a certain amount of time for new currency or buying coins to continue playing the game immediately. The integration of microtransactions in applications is, according to Seufert (2014), one of the primary product-shaping processes in the development of the final product.

The effects and impact of these new pricing structures on the consumer experience have not been examined on a general industry level. Previous studies primarily investigated the effects of the premium pricing structure, which requires customers to pay before consumption. The applicability of these theories to the new online industry has not yet been examined. Examining these effects in the context of the online application market and its new pricing structures could therefore fill the current literature gap. In order to investigate this particular relationship in the context of the app store, the following research question is asked: What are the effects of the different pricing structures on product evaluations?

2.3 Consumer satisfaction

Researchers agree that consumer satisfaction is one of the most important topics within the field of marketing research and that it requires the presence of a goal that the consumer wants to achieve (Jamal, 2004). There is, however, no clear consensus about the description and measurement of consumer satisfaction (Szymanski and Henard, 2001). The concept of consumer satisfaction was introduced within the marketing literature by Cardozo (1965), followed by Howard and Sheth (1969), who were the first to create a clear definition of consumer satisfaction and described it as "a related psychological state to appraise the reasonableness between what a consumer actually gets and gives".

Researchers have also investigated the influence of the pricing of a service or product on consumer satisfaction (Herrmann et al., 2007). A general consensus has been established indicating that price is a product or service attribute considered relevant for consumer satisfaction (Anderson, Fornell and Lehmann, 1994). Zeithaml and Bitner (1996) demonstrated that the level of consumer satisfaction is affected by pricing, product quality, service quality, and personal and situational factors. When consumers evaluate a product or service, they usually consider pricing an important factor in expressing their overall satisfaction in product evaluations. These evaluations are intertwined with an assessment of value and are derived from consumer expectations and perceived product benefits (Bowman & Ambrosini, 2000; Priem, 2007). Churchill and Surprenant (1982) explain that this expression of satisfaction is influenced by the coupling and comparison of the rewards and actual costs of a purchase.

2.4 Mental Accounting

Prelec and Loewenstein (1998) describe that, in the case of the traditional pricing structure, which requires an upfront payment, customers couple the costs and benefits of a product or service at the time of consumption. Thaler (1985) created a comprehensive framework based on this comparison of costs and benefits associated with consumer transactions, which could also be relevant for apps using the recently introduced microtransactions. Results from this study indicate that consumers open a mental account when they start a transaction (making a payment) and close that account when they start to consume the product and the transaction is completed. During the closure of that mental account, a psychological link between the costs and benefits of the transaction is created by the consumer. The link between these elements is strong when it is evident which specific payment is financing a particular consumption (Prelec and Loewenstein, 1998). Monroe (2004) also describes this coupling process and shows that "buyers' perceptions of value are mental trade-offs based on what they gain from a purchase compared with the sacrifice they make by paying the price". The framework presented by Thaler (1985) includes the sunk cost effect in this consumer coupling process. Between the start and completion of a transaction, consumers keep the cost of this transaction at full negative hedonic value, implying that the sunk cost impact of the transaction is consistent during the process of coupling. This is important in the context of app pricing structures, because Garland and Newport (1991) indicate that upfront payments create non-recoverable costs and explain that these costs influence consumer decisions regarding their subsequent behaviour related to that transaction. According to the results, a negative coupling balance reduces the overall perceived value for consumers. Multiple papers indicate that this perceived value is the best and most relevant antecedent of satisfaction (Day & Crask, 2000; McDougall & Levesque, 2000; Oliver & Swan, 1989). The awareness of gain-loss ratios can therefore influence purchase evaluations, particularly in the financial context.

According to Prelec and Loewenstein (1998), the coupling for premium apps with a traditional pricing model occurs after the payment has been completed, at the time of product consumption, and creates non-recoverable costs. In contrast, freemium apps do not require an upfront payment, and consumers could therefore, in theory, prevent a negative coupling balance by simply not paying for the in-app purchases.

2.5 Information availability

Oh (2013) categorizes the concept of satisfaction into two elements: satisfaction with the buying process (e.g. interactions with employees and the number of payment options) and satisfaction with the outcome of the purchase. The results from this study and several other papers (Bitner and Hubbert, 1994; Shankar et al., 2003) indicate that these concepts are both distinct and correlated. Bechwati and Xia (2003) have supported this theory in the context of online decision support and demonstrated that consumers' perception of the amount of effort online recommendation systems put into finding relevant services or products influences the overall consumer satisfaction. This effect was independent of the quality or type of product or service that was recommended.

Spreng et al. (1996) indicate that besides the effort made during the purchasing process, for example by employees or recommendation systems, the amount of product information available in the pre-purchase phase also positively influences the overall consumer satisfaction.

This specific concept of information availability and its relationship with consumer satisfaction could be relevant for comparing the effects of the different pricing structures within the app industry on consumers. Previous research from Herrmann et al. (2007) has revealed that satisfaction with the process of collecting information about the product or service before the final purchasing decision is likely to transfer to the satisfaction with the purchase. Bowman & Ambrosini (2000) support this and indicate that a freemium business model gives consumers the option to evaluate a product more accurately before paying, compared to the traditional premium pricing model.

Consumers can only evaluate premium and premium+ products to a certain degree before consumption. Consumers interested in premium products need to establish assumptions about the potential benefits and the quality of a product based on the visible attributes (Boatwright, Kalra & Zhang, 2008). Pre-purchase product valuations are particularly difficult for consumers to create for premium products and services in entertainment markets, including app stores (Priem, 2007; Zeithaml, 1988). To create a pre-purchase evaluation of these goods and services, users depend on external sources such as expert and consumer reviews (Wijnberg & Gemser, 2000). These indicators of quality can contribute to the consumers' process of value assessment in the pre-purchase phase, but they are independent sources and therefore harder for companies to control in case of negative information. Identifying the effects of choosing a specific pricing model on consumer product evaluations can therefore benefit firms that want to influence the content of consumer reviews and thereby increase the positive impact of these quality indicators on new potential customers.

In contrast with premium products that require consumers to pay before consumption, apps that are free to download allow consumers to establish a more accurate and personal evaluation. Apps that offer a diverse range of in-game microtransactions and require no upfront payment therefore allow consumers to act on their willingness to pay and reduce the amount of unused consumer surplus derived from differences between costs and willingness to pay.

2.6 Endowment effect

The endowment effect introduced by Kahneman et al. (1990) could also be relevant in identifying the effects of different pricing structures on consumers and the relationship with overall satisfaction. The description of the effect states that "measures of willingness to accept greatly exceed measures of willingness to pay", which indicates that when a product becomes part of a person's endowment, the perception of its value increases. This suggests that consumers value products they own more highly than products they do not own. Harris (2013) relates this benefit to the freemium model by indicating that apps using this pricing structure do not offer payment options before a consumer has downloaded and installed the app. In the freemium pricing structure, consumers could therefore theoretically value the app higher at the time of payment, because they can already use the app and create a sense of ownership before making a purchase.

2.7 Evaluation sentiment

The endowment effect, in combination with the positive effects of the increased availability of product information for freemium apps, could suggest an increase in overall satisfaction for freemium apps compared to premium and premium+ apps. Furthermore, the possibility to avoid a negative mental coupling balance for freemium apps could also reduce the number of negative experiences and increase the average satisfaction for this model. Finally, adding microtransactions while keeping the upfront payment requirement could remove these satisfaction benefits: the microtransactions in premium+ apps only add another payment layer and in theory lower the coupling balance for consumers, resulting in lower evaluation sentiment. In order to test this in the context of pricing structures, the sentiment of the overall evaluations and of the evaluations about pricing is analysed and compared using the following hypotheses:

Overall evaluation sentiment:

H1a: For freemium apps, the evaluation sentiment is more positive compared to premium apps.

H1b: For freemium apps, the evaluation sentiment is more positive compared to premium+ apps.

H1c: For premium apps, the evaluation sentiment is more positive compared to premium+ apps.

Pricing evaluation sentiment:

H2a: For freemium apps, the pricing evaluation sentiment is more positive compared to premium apps.

H2b: For freemium apps, the pricing evaluation sentiment is more positive compared to premium+ apps.

H2c: For premium apps, the pricing evaluation sentiment is more positive compared to premium+ apps.

2.8 Pricing impact

Previous studies from Prelec and Loewenstein (1998) and Thaler (1985) indicate that the sunk cost effect in this consumer coupling process is a dominant determinant of customer satisfaction in the traditional premium pricing model. Sunk costs are less relevant for freemium apps, which would imply that pricing has a lower impact on the overall satisfaction expressed by consumers in their product evaluations. However, according to Hienerth et al. (2011), microtransactions in apps create more involved pricing-related interactions with consumers, because they add distinct elements to the functionality of the final product. These functions could therefore have a greater impact on the overall consumer experience and satisfaction. Reviews for apps using microtransactions could, according to these theories, contain more pricing-related judgements, because pricing is a greater part of the experience. In order to test these theories and measure the overall importance of pricing in consumer evaluations, the following hypotheses were formulated:

H3(a): For freemium apps, pricing has a larger impact on product evaluations compared to premium apps.

H3(b): For freemium apps, pricing has a larger impact on product evaluations compared to premium+ apps.

H3(c): For premium+ apps, pricing has a larger impact on product evaluations compared to premium apps.

As discussed previously, several papers indicate that price is a product or service attribute considered relevant for consumer satisfaction (Anderson, Fornell and Lehmann, 1994). Zeithaml and Bitner (1996) demonstrated that when consumers evaluate a product, they consider pricing an important factor in expressing their overall satisfaction in product evaluations. In order to test these theories and measure the correlation of pricing sentiment with the overall sentiment of consumer evaluations, the following hypothesis was formulated:

H4: The pricing evaluation sentiment is positively related to the overall evaluation sentiment.

2.9 Evaluation assertiveness

2.9.1 Price fairness

Herrmann et al. (2007) demonstrate that pricing not only directly affects consumer satisfaction, but also indirectly influences satisfaction through price fairness perceptions. Bolton, Warlop and Alba (2003) define fairness as a judgment by an individual about whether a result, and/or the process towards that result, is acceptable, understandable or just. The perception that a product has a lower value than a financially similar product can cause consumers to perceive a price as unfair (Martins and Marielza, 1995). Unfair price perceptions ultimately lead to dissatisfaction (Oliver and Swan, 1989), and research from Storm and Storm (1987) indicates that dissatisfaction is a negative experience associated with feelings of anger.

Early research by Homans (1961) indicates that a proportion of the fairness construct is affected by a distributive element. He describes that this part of fairness is related to the judgements of consumers about the distribution of the rewards from an exchange. An unequal distribution of rewards between participants involved in an exchange relationship could create feelings of unfairness for participants who experience a disadvantage. Other papers have also identified relevant factors that affect a consumer's perception of price unfairness and the consequences of these effects. Voss et al. (1998) examined the effects of perceived price unfairness and indicate that this could be a key determinant of consumer satisfaction. Their findings suggest that experiencing an unfair, inconsistent outcome can have a strong negative effect on overall satisfaction. Research from Oliver and DeSarbo (1988) and Oliver et al. (1989) found similar results within this field of study, and these papers argue that the strength of the effects found can be explained by the subjective nature of price fairness and the buyer's perspective that these studies primarily focus on. Consumers tend to focus on their self-interest and try to maximize their own outcome while creating an overall judgement (Oliver and Swan, 1989). Therefore, the feelings related to disadvantaged and advantaged price inequality are not similar: when the perceived price unfairness is to the advantage of the customer, the feeling of perceived unfairness is reduced significantly.

This can be important for measuring the assertiveness of evaluations between different pricing models, because it indicates that the extremity of feelings related to a disadvantage in price unfairness for certain apps cannot be compensated by a buyer who experiences an advantage in price unfairness; the latter experiences a relatively less intense emotion. Payment models without an upfront payment barrier could in theory reduce the experienced disadvantageous unfairness and should therefore show less assertiveness within evaluations related to this topic. Freemium users can prevent a disadvantage by evaluating the product before paying for it.

2.9.2 Betrayal

Grégoire (2008) indicates that perceived unfairness can create a feeling of betrayal. This feeling is, according to Xia et al. (2004), an important motivational driver for retaliatory behaviour, and research from Tripp (2011) reveals that this results in online complaints and negative word-of-mouth. Wirtz and Kimes (2007) found similar results, which demonstrate that customers respond negatively towards the seller when they believe the firm's practice is unfair. Consumers then try to actively damage the reputation of the seller with negative responses such as writing negative online reviews about the service or product. These customer efforts to punish and cause harm to firms are described as customer retaliation (Grégoire and Fisher, 2008). Walster (1975) describes that retaliation is driven by the customers' need to "bring down" or harm the seller in any possible way, without focusing on improving their own situation by, for example, receiving compensation for the dissatisfaction. Consumers engage in reparative behaviour when trying to receive compensation, which makes it in essence a corrective response, in contrast with the disciplinary nature of retaliating behaviour.

Concluding, negative emotions of anger, dissatisfaction and betrayal could in theory occur more often if products or services require an upfront payment. Consumers using freemium apps can more easily prevent these negative feelings, which would result in less assertive behaviour such as retaliation. For apps using upfront payments, adding microtransactions could in theory increase the chance of perceived unfairness and increase assertiveness. The predicted effects of the different pricing models on the evaluation assertiveness of the overall and pricing-specific evaluations are stated in the following hypotheses:

Overall assertiveness:

H5a: For freemium apps, evaluations are less assertive compared to premium apps.

H5b: For freemium apps, evaluations are less assertive compared to premium+ apps.

H5c: For premium apps, evaluations are less assertive compared to premium+ apps.

Pricing topic assertiveness:

H6a: For freemium apps, pricing evaluations are less assertive compared to premium apps.

H6b: For freemium apps, pricing evaluations are less assertive compared to premium+ apps.

H6c: For premium apps, pricing evaluations are less assertive compared to premium+ apps.

It is important to analyse the effect of evaluation sentiment on evaluation assertiveness across pricing models. Findings from Xia et al. (2004) and Grégoire and Fisher (2008) justify this and indicate that perceived negative emotions such as unfairness and betrayal are an important motivational driver for assertive consumer behaviour. These emotions therefore result in negative consumer evaluation sentiment with a higher amount of assertiveness. If these more extreme negative emotions increase the amount of assertiveness, this would indicate that sentiment predicts evaluation assertiveness. The effect of evaluation sentiment on evaluation assertiveness is stated in the following hypothesis:

H7: Evaluation sentiment predicts evaluation assertiveness.

Finally, it is important to examine whether the relationship between pricing models and assertiveness is mediated by evaluation sentiment. This question remains unanswered in the literature. However, the relationship between the removal of payment barriers and evaluation sentiment is expected to be positive, and sentiment is expected to predict assertiveness. Therefore, pricing structures could predict the assertiveness of evaluations through evaluation sentiment, and it is thus hypothesized that sentiment positively mediates the relationship between pricing structures and assertiveness:

H8: Evaluation sentiment mediates the relationship between pricing structures and evaluation assertiveness.


3. Data and method

3.1 Methodology

Recent advancements in information technology have stimulated the development of new and more effective methods for research (Kemper, 2016). Several of these improvements can be placed in the category "machine learning" and are related to concepts such as neural networks and artificial intelligence. Researchers used to find it challenging to collect enough reliable data for their analyses; today, however, the volume of data available is often harder to manage. The increasing availability of big data is driven by the swift digitalisation of modern society, the rise of the internet, advancements in computing power, diminishing storage costs for data and the increasing popularity of open data initiatives (Kemper, 2016). These developments made it possible to collect quantitative app characteristics and qualitative product evaluations for all apps in the Google Play Store.

The increase of available data has led to the introduction of the term Big Data. Wu et al. (2014) describe that Big Data starts with "heterogeneous, autonomous, large-volume sources with distributed and decentralized control, and seeks to explore complex and evolving relationships among data." According to reports from IBM (2012), 2.5 quintillion bytes of data are created daily and 90% of the data worldwide has been produced in the last two years, indicating its exponential growth. The need for new technologies that are able to capture, manage and process this data is growing, due to the potential insights that this amount of information can provide. A core benefit of using new data collection techniques is the potential to create relatively large datasets with fewer resources compared to traditional methods such as questionnaires. These techniques provide an economical and efficient solution, because once the source code is configured, the data extraction is an automatic process and the volume of data extracted is only limited by computing power and the crawler-protective mechanisms active within the source.

These large amounts of data cannot be collected with traditional research tools, and in order to make them available for future research, reliable methods have to be developed that give researchers access to these large datasets. For this paper, a custom crawler was built which can be adjusted and used to collect millions of data points automatically in different online contexts. Due to the digital nature of the Google Play Store and the new possibilities to collect online data with custom crawlers, it was possible to conduct large-scale quantitative and qualitative research without a research budget. The Google Play Store is a suitable environment to test the different hypotheses in this paper, because all three pricing models are present in this online platform. The marketplace is also one of the biggest in the world and therefore offers a lot of usable data. Additionally, it allows for the identification of the effects of choosing a specific model on consumer sentiment, assertiveness and evaluation topics.

The collected dataset is used for a deductive research approach (Saunders et al., 2007): based on the literature review, hypotheses were formed which can be tested with the extracted dataset. This study used a longitudinal design, because the data collected consisted of consumer reviews published over a period of five years, from February 1, 2012 until February 1, 2017. The data and theoretical insights in this study can be relevant for all companies active within the app gaming industry and will potentially identify aspects of the relationship between pricing models and user evaluations. For academics, the developed research methods will form an additional resource next to the new theoretical knowledge. The source code for the Node.js crawler will be available online to support future research and give the opportunity to verify and replicate the study.


First, an extraction method based on Node.js, the open-source, cross-platform JavaScript runtime environment for executing JavaScript code server-side, was developed and will be discussed. Second, multiple core elements for conducting machine learning analysis will be introduced and explained in the context of the data collected from the Google Play Store. Multiple machine learning techniques are used in this study in order to generate a comprehensive analysis of the collected data and to test the robustness of the methods. Finally, the research sample and the operationalisation of constructs will be discussed.

3.2 Data collection

To investigate the relationship between pricing structures and consumer product evaluations, quantitative and qualitative data were collected with a Node.js crawler from the Google Play Store website (play.google.com/store/apps). Initially, software and extraction services from "Apifier" and "Kimono Labs" were used to collect the lists of the most popular gaming apps with their attached data points, but commercial applications tend to continuously change their pricing and services, and therefore a more sustainable method had to be created. Furthermore, the Google Play Store integrated protective mechanisms against non-human (software) visitors in order to prevent abuse of the website, which blocked these commercial systems from collecting certain parts of the data. For example, reviews were a challenging data object to collect, because this information was wrapped in a JavaScript module and hidden from crawlers. The purchase button of apps, which contains the pricing information, was also hidden from computer programs, presumably to prevent these tools from automatically downloading the app and artificially improving its popularity ranking. Furthermore, the list of popular apps was limited to 60 and required a manual click on a "read more" button in order to scroll down and see the remaining parts of the list. The usage of a custom extraction method made it possible to bypass these protective systems without abusing the source website.

These mechanisms are a consequence of the growing problems related to the abuse of websites, and crawlers can contribute to this problem because they can be used as a foundation for distributed denial-of-service (DDoS) attacks (Mirkovic & Reiher, 2004). These attacks form a threat to the accessibility of the internet in general, and many well-known websites have been victims of these practices in the past. The crawler built for this paper has configuration lines built in for the end-user which can be edited to adjust the impact of the tool on the target website, in order to prevent abuse of and harm to the source. In case of uncertainty, it is best to contact the owners of the source and ask about their policies regarding crawling the website or app.
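To make this configurable-politeness idea concrete, the sketch below shows a minimal, hypothetical throttle with a fixed delay between requests; the constant names and values are illustrative and are not taken from the crawler source in Appendix III.

```javascript
// politeness.js - minimal sketch of a configurable request throttle (illustrative, not the thesis crawler).
// The delay and page cap below are placeholders that an end-user could edit to reduce
// the load the crawler puts on the source website.

const REQUEST_DELAY_MS = 2000;   // wait two seconds between requests
const MAX_PAGES = 100;           // hard cap on the number of pages fetched per run

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function politeCrawl(urls, fetchPage) {
  const results = [];
  for (const url of urls.slice(0, MAX_PAGES)) {
    results.push(await fetchPage(url)); // fetchPage is whatever page-extraction function is used
    await sleep(REQUEST_DELAY_MS);      // throttle so the target site is not overloaded
  }
  return results;
}

module.exports = { politeCrawl };
```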

The crawler created for this paper was based on the Node.js (Node) framework, which is focused on developing high-performance, concurrent programs that do not rely on traditional multithreading technology but instead use asynchronous input/output (I/O) with an event-driven programming model (Tilkov & Vinoski, 2010). This event-driven programming model is the core feature that makes Node a suitable programming framework for collecting large amounts of data online, because functions are designed to be non-blocking. Traditionally, in a multithreading environment, most functions block until completion, which means that commands only execute after previous commands have been completed. In a non-blocking environment, commands execute in parallel and use "callbacks" to signal completion or failure. This structure allows Node to operate on a single thread, supporting thousands of concurrent connections without the cost of thread context switching, making it a suitable environment to crawl a large number of webpages simultaneously and extract the data for this paper efficiently. Node combines the ease of a scripting language (JavaScript) with the benefits of fast, non-blocking network I/O. Although Node was introduced by Ryan Dahl in 2009, it is still considered a young technology and is not yet recognized as "stable" in comparison with older languages such as PHP. However, Node is used for this study because it is expected to develop into a more mainstream platform and to become a favourable choice for the development of data-intensive web applications (Lei, Ma & Tan, 2014).
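The contrast between blocking and non-blocking execution can be illustrated with a short sketch using Node's built-in https module; the URLs and function names are illustrative assumptions rather than excerpts from the thesis crawler.

```javascript
// nonblocking-demo.js - illustrates the event-driven, callback-based style described above.
// Both requests are started immediately; neither waits for the other to finish.

const https = require('https');

function fetchStatus(url, callback) {
  https.get(url, (res) => {
    res.resume();                          // drain the response body; only the status code matters here
    callback(null, res.statusCode);
  }).on('error', (err) => callback(err));
}

// Both calls return immediately; the callbacks fire whenever the responses arrive,
// so the single Node thread can keep many such requests in flight at once.
fetchStatus('https://play.google.com/store/apps', (err, status) => {
  console.log('apps page:', err ? err.message : status);
});
fetchStatus('https://www.google.com', (err, status) => {
  console.log('google home:', err ? err.message : status);
});
```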

The link to the core files of the Node crawler can be found in Appendix III, including several examples of code that can be adjusted for usage in a different online context. Node can be controlled from the terminal, and the required libraries from the Node Package Manager (npm), which are premade packages of reusable code, can be installed automatically with basic command lines. Once installed, several methods are available to retrieve the full details of an application, retrieve the reviews for an application and retrieve a list of applications from one of the collections at Google Play. In practice, the data related to these sources are extracted from https://play.google.com/store/apps/collection/${opts.collection}, returning a list of apps within a certain collection. In this paper, the free and paid gaming collections were extracted in order to collect apps using the freemium, premium and premium+ models.
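A minimal sketch of what such retrieval methods look like in practice is shown below. It assumes the open-source google-play-scraper npm package (or a comparable library); the exact package, method and option names are assumptions for illustration and may differ from the code in Appendix III.

```javascript
// collect.js - illustrative sketch of phase one of the data collection, assuming the
// google-play-scraper npm package (npm install google-play-scraper). Depending on the
// package version, an ESM import may be required instead of require(), and option names
// may differ; nothing here is copied from the thesis crawler in Appendix III.

const gplay = require('google-play-scraper');

async function collectTopGames() {
  // Retrieve a list of apps from a Google Play collection (top paid games).
  const topPaidGames = await gplay.list({
    collection: gplay.collection.TOP_PAID,
    category: gplay.category.GAME,
    num: 120
  });

  // Retrieve the full details page for the first app in the list.
  const details = await gplay.app({ appId: topPaidGames[0].appId });

  // Retrieve a page of reviews for the same app.
  const reviews = await gplay.reviews({ appId: topPaidGames[0].appId, sort: gplay.sort.NEWEST });

  const reviewCount = Array.isArray(reviews) ? reviews.length : reviews.data.length;
  console.log(details.title, details.price, reviewCount);
}

collectTopGames().catch(console.error);
```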

The data collection was separated into two phases, because the ranking of apps in the Google Play Store is based on the real-time number of downloads and active installations and therefore changes continuously. First, the top 120 lists for the paid and free gaming categories were extracted, including app details such as the game title, ranking, price, developer name, ratings and pricing model. Fixing the sample in this first phase allowed the reviews to be collected afterwards, a process that took several weeks, without interruption from changing rankings.


3.3 The research sample

In order to further investigate the impact of pricing, the reviews were extracted from a randomised subset of apps, consisting of 40 apps per category. Machine learning techniques are required for the analysis of the reviews of each app in this database and, due to limitations in the computing power available for these processes, this sample of 120 apps from the top gaming lists is used. Each app had on average nearly 200,000 reviews, and the final dataset comprised 23,830,300 reviews. The data was collected in May 2017, and the reviews collected were written between February 1, 2012 and February 1, 2017 in order to create comparable data groups. For the assertiveness measurement, a review subset was created due to budget limitations on computing power, consisting of the most recent reviews (a maximum of 5,000 per app) before February 1, 2017, resulting in a total of 393,000 reviews.

3.4 Operationalisation of constructs

The content variables of the gaming apps used in this research need to be operationalised. All variables are operationalised using the constructs discussed in the literature review.

3.4.1 Pricing Models

The Google Play Store offers free and paid top charts in the gaming apps category. For both categories, free and paid, two more sub-categories are present. Games in the paid category, which require an upfront payment, were divided into "premium" and "premium+" apps. Premium games require an upfront payment but offer no additional in-game purchase options, in contrast to premium+ apps, which do offer in-game purchases. In the free category, another separation was made between apps with and without in-game purchasing options, creating the freemium (with in-game purchases) and free gaming apps. Free apps do not contain any form of payment options, and their evaluations do not contain judgments about pricing. Therefore, this category was not usable and was excluded from the final dataset.

3.4.2 Evaluation Sentiment

For this paper, publicly available reviews were analysed to measure customer sentiment and investigate the evaluations related to the topic of pricing. Consumer reviews contain a short written judgment about the app and were obtained from the app details page in the Google Play Store. Reviews also contain scores, which represent sentiment expressed on a one-to-five scale. Algorithms were used to measure the sentiment of reviews and categorise them based on negative, neutral and positive elements. A sentiment score between 0 (negative) and 100 (positive) was added, indicating how pleased or displeased the user was. Machine learning tools trained on topic analysis created a subset of pricing-related reviews in order to compare the sentiment for this specific topic with the overall sentiment.

3.4.3 Evaluation assertiveness

The amount of active, aggressive and confident reaction can be expressed by the assertiveness of the evaluations; low assertiveness would therefore indicate a passive, non-active consumer evaluation. Algorithms assigned an assertiveness score between 0 (passive) and 100 (active) to each evaluation, indicating how actively or passively consumers expressed themselves. Machine learning tools trained on topic analysis created a subset of pricing-related reviews in order to compare the assertiveness for this specific topic with the general assertiveness.

3.4.4 Pricing & impact

For each gaming app, various pricing elements were collected. Freemium apps do not have a purchasing price; therefore, the price for these games was set at $0 (free). The pricing range of in-game purchases was extracted for each app. A new variable was coded based on these data points and labelled "average in-game price", indicating the average price of the microtransactions available in the app. Finally, the evaluations were scanned to determine whether the review contained a judgment related to the topic of pricing. The impact of pricing predicted in the third set of hypotheses was indicated by the percentage of reviews containing judgements about pricing. This was necessary to measure the sentiment and assertiveness of pricing-related reviews only and to identify the effect of pricing on the overall review database.
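As an illustration, the two derived measures described above could be computed as in the sketch below; the field names are hypothetical placeholders rather than the actual variable names in the dataset.

```javascript
// pricing-variables.js - illustrative computation of "average in-game price" and of the
// share of pricing-related reviews per app. Field names (priceRangeUsd, isPricingReview)
// are hypothetical placeholders.

function averageInGamePrice(priceRangeUsd) {
  // priceRangeUsd is e.g. { min: 0.99, max: 99.99 } as listed on the app details page
  return (priceRangeUsd.min + priceRangeUsd.max) / 2;
}

function pricingReviewShare(reviews) {
  // reviews is an array of { text, isPricingReview }, where isPricingReview comes
  // from the topic classifier described in section 3.5
  const pricingCount = reviews.filter((r) => r.isPricingReview).length;
  return (pricingCount / reviews.length) * 100; // percentage of pricing-related reviews
}

module.exports = { averageInGamePrice, pricingReviewShare };
```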

3.4.5 Additional variables

A few control variables were collected while analysing the details pages of each app. The "Pan European Game Information" (PEGI) rating was collected, which indicates the age classification of each app. The number of installations was also collected and is given as an estimated range (e.g. 10,000-50,000); the average of both numbers was used to create the variable "Average Installs". Furthermore, the genre and the ranking position in the top charts were identified. Lastly, the date on which the corresponding app was most recently updated was added to the dataset.
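The sketch below illustrates, under the assumption that the estimate is available as the range string shown on the store page, how such a range can be converted into the "Average Installs" midpoint.

```javascript
// average-installs.js - illustrative conversion of an install-range estimate such as
// "10,000 - 50,000" into its midpoint, as used for the "Average Installs" variable.

function averageInstalls(rangeText) {
  // strip thousands separators and split on the dash
  const [low, high] = rangeText
    .replace(/[.,]/g, '')
    .split('-')
    .map((part) => parseInt(part.trim(), 10));
  return (low + high) / 2;
}

console.log(averageInstalls('10,000 - 50,000')); // -> 30000
```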

3.5 Machine Learning – Topic Identification

Machine learning tools from MonkeyLearn, RapidMiner and Appbot were used for measuring the evaluation sentiment, assertiveness and topics. Various machine learning tools with similar functionality exist and can be trained with sample datasets. Analysing the number and percentage of reviews per app that cover the topic of pricing is required to measure the impact of pricing on consumer reviews. In order to identify the number of reviews which contain the topic of pricing, a sample word cloud was created and a test sample of pricing reviews was manually labelled and uploaded to the system in order to train the algorithm. In general, a sample file of over 200 reviews is recommended in order to get a reliable classification system. For this algorithm, a multi-language classifier was used in order to include reviews from users around the world. The normalisation of weights for the classifier was enabled, in order to prevent categories with more samples from having an increased prior probability, which relates to the prior probability in Bayesian statistical inference. Stemming indicates whether the training process should transform words into their root form; it is not used in this classifier in order to increase the accuracy and usability of the machine learning techniques for sentiment and assertiveness. Furthermore, in order to increase efficiency, stop words were filtered from the reviews, because they do not contribute as classification features. For all classifiers, a bigram or trigram model was used, as such models can capture more complex expressions formed by compositions of more than one word. Terms in these models are composed of two or three words (n-gram size = 2 or 3). These models are more effective because their classification depends not only on the frequency of words, but also on how they are combined. The algorithm for this classifier was based on Support Vector Machines, but a Multinomial Naive Bayes algorithm would also work in this context for future research (Rennie, Shih et al., 2003). The source code of a classifier algorithm can be complex, but many online services allow access to such a module for free or for a low fee, for example the Google Cloud Machine Learning Engine. After setting up the classifier configuration and training the algorithm by uploading a sample dataset, it is possible to run tests and identify the effectiveness of the process by scoring the outcomes. For this process, a user manually labels the reviews in the dataset and compares them with the outcomes of the algorithm in order to identify the concurrence score. A concurrence score above 75% is preferred in order to generate valid results, and the final version of the algorithm used in this paper had a score of 86%.
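The actual topic classification was performed with a hosted SVM classifier; purely to illustrate the same bag-of-n-grams approach, the sketch below trains a Naive Bayes classifier (which, as noted above, would also work) using the natural npm package. The package choice, the tiny training sample and the stop-word list are illustrative assumptions.

```javascript
// pricing-classifier.js - illustrative Naive Bayes topic classifier with bigram features,
// built with the "natural" npm package (npm install natural). This is a sketch of the
// approach described above, not the hosted SVM classifier actually used in the study.

const natural = require('natural');

const tokenizer = new natural.WordTokenizer();
const STOP_WORDS = new Set(['the', 'a', 'an', 'is', 'it', 'this', 'to', 'and']); // illustrative list

// Turn a review into bigram features, with stop words removed and no stemming applied.
function bigramFeatures(text) {
  const tokens = tokenizer
    .tokenize(text.toLowerCase())
    .filter((token) => !STOP_WORDS.has(token));
  return natural.NGrams.bigrams(tokens).map((pair) => pair.join(' '));
}

const classifier = new natural.BayesClassifier();

// Tiny, made-up training sample; the real training set contained over 200 manually labelled reviews.
classifier.addDocument(bigramFeatures('way too expensive, the in app purchases ruin it'), 'pricing');
classifier.addDocument(bigramFeatures('great value for money, fair price'), 'pricing');
classifier.addDocument(bigramFeatures('fun levels and smooth controls'), 'other');
classifier.addDocument(bigramFeatures('crashes on my phone after the update'), 'other');
classifier.train();

console.log(classifier.classify(bigramFeatures('the price of coins is unfair'))); // expected: 'pricing'
```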


3.6 Machine Learning – Sentiment & Assertiveness measurement

After determining the percentage of reviews per app about the pricing topic, it is possible to measure the sentiment of that topic for each app in order to understand what kind of impact it has on customer satisfaction. These pricing-specific evaluation results can also be compared with the sentiment of the overall evaluations, to check for notable differences. Comparing results between the different pricing structures helps to create a more qualitative and in-depth answer to the main research question, gives additional context to the findings of the first analysis and provides more information about the relationships found there. Creating a large training set for sentiment was relatively easy, because all reviews have a star rating attached, indicating sentiment in a quantitative way. For this paper, a one- or two-star rating was categorised as negative, a three-star rating as neutral and a four- or five-star rating as positive. The reason for using machine learning for sentiment analysis, even though the reviews already contain a star rating, is the accuracy of the outcomes: the sentiment outcomes are presented on a scale from 0 (extremely negative) to 100 (extremely positive) and are therefore more precise and usable for the regression analysis and for comparison with assertiveness, which was also measured on a 0 to 100 scale.
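A minimal sketch of the star-rating-to-label mapping used to construct the sentiment training set is shown below; the function names and the assumed review shape are illustrative.

```javascript
// sentiment-labels.js - illustrative mapping from Google Play star ratings to the
// sentiment labels used to train the classifier (1-2 stars negative, 3 neutral, 4-5 positive).

function sentimentLabel(starRating) {
  if (starRating <= 2) return 'negative';
  if (starRating === 3) return 'neutral';
  return 'positive';
}

// Build (text, label) training pairs from collected reviews; review objects are assumed
// to look like { text: '...', score: 1..5 }.
function toTrainingPairs(reviews) {
  return reviews.map((review) => ({ text: review.text, label: sentimentLabel(review.score) }));
}

module.exports = { sentimentLabel, toTrainingPairs };
```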

Assertiveness is a new classification in the machine learning industry, and within the timeframe of this paper it was not possible to develop a custom, effective machine learning tool for this variable. Therefore, an application programming interface (API) from appbot.co is used to measure the assertiveness of the included reviews. A subset of reviews per app was used in this analysis, because the API allowed a maximum of 5,000 reviews per app to be processed for free. For the included 120 apps, the most recent reviews and the most recent pricing reviews (a maximum of 5,000) published before February 1, 2017 were analysed in order to measure assertiveness and sentiment independently with the API. This review subset is therefore part of the full review dataset, which was also analysed for sentiment with the initial custom machine learning tool. After collecting the quantitative data and classifying the reviews, it is possible to compare the three pricing structures based on the hypotheses derived from theory and find potential differences between the groups.

3.7 Statistical analyses

The final analyses were conducted with IBM SPSS version 23 (2016). The results of these analyses will possibly indicate relationships between: pricing models and evaluation sentiment (H1 & H2), pricing models and evaluation topics (H3), pricing sentiment and overall sentiment (H4), pricing models and evaluation assertiveness (H5 & H6), evaluation sentiment and assertiveness (H7) and, finally, the mediating effect of sentiment in the relationship between pricing models and evaluation assertiveness (H8).


4. Results

4.1 Descriptive statistics

In order to test the hypotheses and generate insights from the data, the influence and relationships of the collected variables were investigated within and between the three monetisation models. The data for all three monetisation models were first compared with a one-way ANOVA analysis to generate descriptive results. For each of the three pricing models available in the Google Play Store, the data from the top 40 gaming apps were collected, resulting in a dataset of 120 apps. Based on the content of the reviews and an investigation of the history of one of the included apps, it appeared that this app had changed its pricing strategy from premium+ to freemium shortly before the data collection for this paper started. This recent change implied that the collected reviews were relevant for the premium+ category instead of the freemium category. This affected the number of apps per category, changing the number of freemium apps to 39 and the number of premium+ apps to 41. The reviews published after the pricing model change for this app were excluded from the dataset. The final dataset therefore consisted of 39 freemium, 40 premium and 41 premium+ apps.

The average number of reviews available for each app varied per pricing model, resulting in an average of 566,569 reviews per app in the freemium category, 10,853 in the premium category and 31,706 in the premium+ category. This is related to the popularity of apps for each monetisation model, as freemium games were the most downloaded apps, with an average number of active installations of 130.28 million. For the pricing categories which require an upfront payment, the average number of installs was 1.52 million for premium+ apps and 685,575 for premium apps. Apps offering in-game purchases ranked relatively high within both the free and paid popularity charts.


Table 4.1 – Descriptive statistics (averages per pricing model)

                            Freemium        Premium     Premium+
Installations per app       130,285,385     686,875     1,522,682
Written reviews per app     566,569         10,853      31,706
% pricing reviews           0.61%           5.64%       5.81%
% positive reviews          79.97%          74.72%      70.68%
App price                   $0.00           $3.23       $3.16
Popularity ranking          26              66          33

4.2 Pricing models and overall evaluation sentiment (H1)

The first set of hypotheses described in the theoretical framework concerned the impact of the three monetisation models on the overall sentiment of the product evaluations. H1(a) stated that for freemium apps, the evaluation sentiment is more positive than for premium apps. H1(b) predicted that for freemium apps, the evaluation sentiment is more positive than for premium+ apps. H1(c) stated that for premium apps, the evaluation sentiment is more positive than for premium+ apps. In order to analyse the average evaluation sentiment of the three pricing models, a one-way ANOVA test was conducted. According to the results, the average sentiment between the groups was not equal (p = 0.016). According to the Tukey HSD (Honestly Significant Difference) results, as predicted, freemium apps had significantly (p < 0.001) more positive sentiment than premium apps, confirming H1(a). Furthermore, freemium apps had significantly (p < 0.001) more positive sentiment than premium+ apps, confirming H1(b). Finally, premium apps did not appear to have significantly (p = 0.057) more positive sentiment than premium+ apps, rejecting H1(c).

4.3 Pricing models and pricing evaluation sentiment (H2)

The second set of hypotheses described in the theoretical framework concerned the impact of the three monetisation models on the sentiment of the product evaluations specifically about the topic of pricing. H2(a) stated that for freemium apps, the pricing evaluation sentiment is more positive than for premium apps. H2(b) predicted that for freemium apps, the pricing evaluation sentiment is more positive than for premium+ apps. H2(c) stated that for premium apps, the pricing evaluation sentiment is more positive than for premium+ apps. Using a one-way ANOVA (F-test), the pricing sentiment of the three groups was compared. According to the ANOVA results, the average pricing sentiment between the groups was not equal (p = 0.022). According to the Tukey HSD test results, as predicted, freemium apps had significantly (p < 0.001) more positive pricing sentiment than premium apps, confirming H2(a). Furthermore, freemium apps had significantly (p < 0.001) more positive pricing sentiment than premium+ apps, confirming H2(b). Finally, premium apps did not appear to have significantly (p = 0.865) more positive pricing sentiment than premium+ apps, rejecting H2(c).

4.4 Relative pricing topic impact (H3&H4)

After investigating the link between the pricing models and sentiment, the impact of pricing models on the evaluation topics was measured in order to provide context for the previous results. The third set of hypotheses described in the theoretical framework concerned the impact of pricing on the topic of the product evaluations in relation to the three different monetisation models. H3(a) stated that for freemium apps, pricing has a larger impact on product evaluations compared to premium apps. H3(b) stated that for freemium apps, pricing has a larger impact on product evaluations compared to premium+ apps. H3(c) stated that for premium+ apps, pricing has a larger impact on product evaluations compared to premium apps. The percentage of reviews containing the topic pricing was measured with machine learning for each pricing model. The results indicated an average of 0.61% pricing reviews in the freemium category, 5.63% in the premium category and 5.81% in the premium+ category. Using a one-way ANOVA, the impact of pricing on the evaluation topics was compared between the three groups. The results indicated that the average share of pricing evaluations was not equal between the groups (p < 0.001). According to the Tukey HSD test results, the freemium model had a significantly (p < 0.001) lower share of reviews about pricing than the premium model, rejecting H3(a). Furthermore, the results also indicated that the freemium model had a significantly (p < 0.001) lower share of reviews about pricing than the premium+ model, rejecting H3(b). Finally, premium+ apps did not appear to have a significantly (p = 0.948) higher share of reviews about pricing compared to premium apps, rejecting H3(c).

Furthermore, a linear regression was performed, both overall and for each pricing model, in order to identify the relationship between pricing evaluation sentiment and the overall evaluation sentiment. The fourth hypothesis described in the literature review stated that pricing evaluation sentiment is positively related to overall evaluation sentiment. As predicted, the results indicated that pricing evaluation sentiment had a strong positive effect (β = 0.833, p < 0.001, R2 = 0.695) on the overall evaluation sentiment, confirming H4. The effects for each pricing model were also identified. For freemium apps, the results indicated that pricing evaluation sentiment had a strong positive effect (β = 0.624, p < 0.001, R2 = 0.389) on the overall evaluation sentiment. For premium apps, the results indicated that pricing evaluation sentiment had a strong positive effect (β = 0.834, p < 0.001, R2 = 0.696) on the overall evaluation sentiment. For premium+ apps, the results indicated that pricing evaluation sentiment had a strong positive effect (β = 0.922, p < 0.001, R2 = 0.847) on the overall evaluation sentiment. These results fit with the average percentage of evaluations containing the topic pricing: pricing models with a higher share of pricing reviews showed a stronger relationship between pricing sentiment and overall sentiment.
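A regression of this form can be sketched as follows, again assuming the hypothetical apps.csv per-app data with columns pricing_sentiment, overall_sentiment and pricing_model; the names are illustrative. If the reported coefficients are standardised betas, both variables would additionally be z-scored before fitting.

    # Minimal regression sketch (assumed data layout and column names).
    import pandas as pd
    import statsmodels.formula.api as smf

    apps = pd.read_csv("apps.csv")  # hypothetical per-app averages

    # Pooled model across all apps
    pooled = smf.ols("overall_sentiment ~ pricing_sentiment", data=apps).fit()
    print(pooled.params["pricing_sentiment"], pooled.rsquared)

    # Separate model per pricing structure
    for name, group in apps.groupby("pricing_model"):
        fit = smf.ols("overall_sentiment ~ pricing_sentiment", data=group).fit()
        print(name, fit.params["pricing_sentiment"], fit.rsquared)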

4.5 Pricing models and overall evaluation assertiveness (H5)

The fifth set of hypotheses described in the theoretical framework concerned the impact of the pricing models on the assertiveness of the product evaluations. H5(a) stated that for freemium apps, the evaluation assertiveness is lower than for premium apps. H5(b) predicted that for freemium apps, the evaluation assertiveness is lower than for premium+ apps. H5(c) stated that for premium apps, the evaluation assertiveness is lower than for premium+ apps.

Using a one-way ANOVA, the impact of the pricing models on the evaluation assertiveness was compared between the three groups. According to the ANOVA results, the average assertiveness between the groups was not equal (p < 0.001). Confirming H5(a), the results indicated that the freemium model had significantly (p < 0.001) less assertive evaluations than the premium model. Furthermore, the results also indicated that the freemium model had significantly (p < 0.001) less assertive evaluations than the premium+ model, confirming H5(b). Finally, premium apps did not appear to have significantly (p = 0.561) less assertive evaluations than premium+ apps, rejecting H5(c).

4.6 Pricing structures and pricing evaluation assertiveness (H6)

The sixth set of hypotheses described in the theoretical framework concerned the impact of pricing on the assertiveness of the product evaluations, specifically for evaluations containing the topic pricing. H6(a) stated that for freemium apps, the pricing evaluation assertiveness is lower than for premium apps. H6(b) predicted that for freemium apps, the pricing evaluation assertiveness is lower than for premium+ apps. H6(c) stated that for premium apps, the pricing evaluation assertiveness is lower than for premium+ apps.

Using a one-way ANOVA, the impact of the pricing models on the pricing evaluation assertiveness was compared between the three groups. According to the ANOVA results, the average assertiveness between the groups was not equal (p = 0.03). Confirming H6(a), the results indicated that the freemium model had significantly (p = 0.045) less assertive pricing evaluations than the premium model. Furthermore, the results indicated that the freemium model had significantly (p = 0.037) less assertive pricing evaluations than the premium+ model, confirming H6(b). Finally, premium apps did not appear to have significantly (p = 0.979) less assertive pricing evaluations than premium+ apps, rejecting H6(c).

4.7 Evaluation sentiment and mediating effect (H7&H8)

The seventh hypothesis described in the literature review stated that sentiment predicts assertiveness. Using a quadratic curve fit estimation regression model, a significant correlation (p < 0.001) with an R2 of 0.939 was found, which indicated a strong relationship and confirmed H7.
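A curve-fit estimation of this kind can be approximated as an ordinary least squares model with a squared term. The sketch below again assumes the hypothetical apps.csv per-app data with sentiment and assertiveness columns; it is an illustration under these assumptions, not the exact specification used in the analysis.

    # Minimal sketch of a quadratic fit of assertiveness on sentiment.
    import pandas as pd
    import statsmodels.formula.api as smf

    apps = pd.read_csv("apps.csv")  # hypothetical per-app averages
    quad = smf.ols("assertiveness ~ sentiment + I(sentiment ** 2)", data=apps).fit()
    print(quad.rsquared, quad.pvalues)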

The eighth hypothesis stated that sentiment positively mediates the relationship of the pricing models with assertiveness. In order to examine this mediating effect of sentiment, two dummy variables were created. Dummy variables were used to include the nominal pricing structures in the linear regression analysis; these variables do not have a fixed unit of measurement, and it would therefore be incorrect to assume a linear relation between them (Berg, 2013). The reference payment model used in the analysis was freemium, because it formed a distinct reference for the premium and premium+ pricing models. To analyse the data, four regression models were generated. In model one, a regression between the two dummy variables and sentiment was conducted. This model, with an explained variance of 16.7 percent (R2 = 0.167), indicated that both the premium (β = -0.319, p = 0.001) and premium+ (β = -0.465, p < 0.001) variables had a significantly lower average sentiment compared to freemium apps. In model two, a regression between the two dummy variables and assertiveness was conducted. This model, with an explained variance of 24.7 percent (R2 = 0.247), indicated that both premium (β = 0.445, p < 0.001) and premium+ (β = 0.543, p < 0.001) had a significantly higher average evaluation assertiveness compared to freemium apps. In model three, a regression between sentiment and assertiveness was conducted. This model, with an explained variance of 33.5 percent (R2 = 0.335), indicated that sentiment had a negative effect on assertiveness (β = -0.578, p < 0.001). The results revealed that apps with a more positive evaluation sentiment had less assertive consumer reactions. The last model described the mediation between the three variables. The eighth hypothesis predicted that pricing models have a positive effect on assertiveness which is partially mediated through sentiment. Model four, with an explained variance of 46.7 percent (R2 = 0.467), indicated that the effects of premium (β = 0.359, p < 0.001) and premium+ (β = 0.397, p < 0.001) still significantly affected the evaluation assertiveness, while the effect of sentiment remained significant (β = -0.486, p < 0.001). Sentiment thus positively mediated the relationship of the pricing models with assertiveness, confirming H8.
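The four regression models can be sketched as follows, assuming the same hypothetical apps.csv per-app data as in the earlier sketches, with freemium as the omitted reference category; the dummy coding and column names are illustrative and not the exact variables from the analysis.

    # Minimal sketch of the four mediation regressions (freemium as reference).
    import pandas as pd
    import statsmodels.formula.api as smf

    apps = pd.read_csv("apps.csv")  # hypothetical per-app averages
    dummies = pd.get_dummies(apps["pricing_model"]).astype(int)
    apps["premium_d"] = dummies["premium"]        # 1 if premium, else 0
    apps["premiumplus_d"] = dummies["premium+"]   # 1 if premium+, else 0

    m1 = smf.ols("sentiment ~ premium_d + premiumplus_d", data=apps).fit()
    m2 = smf.ols("assertiveness ~ premium_d + premiumplus_d", data=apps).fit()
    m3 = smf.ols("assertiveness ~ sentiment", data=apps).fit()
    m4 = smf.ols("assertiveness ~ premium_d + premiumplus_d + sentiment",
                 data=apps).fit()
    for model in (m1, m2, m3, m4):
        print(model.params, model.rsquared)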


Conceptual model - H8

[Figure: path diagram of the H8 mediation model. Pricing A → Sentiment: βa = -.319***; Pricing B → Sentiment: βb = -.465***; Sentiment → Assertiveness: β = -.578*** (β = -.486***); Pricing A → Assertiveness: βa = .445*** (βa = .359***); Pricing B → Assertiveness: βb = .543*** (βb = .397***).]

Numbers in brackets are regression weights after the mediator has been controlled for. Pricing A is the dummy variable premium and Pricing B is the dummy variable premium+. * p < .05; ** p < .01; *** p < .001.

4.8 Results overview

Hypothesis  Prediction                                                                                  Result
H1a         Freemium sentiment is higher than Premium sentiment                                         Supported***
H1b         Freemium sentiment is higher than Premium+ sentiment                                        Supported***
H1c         Premium sentiment is higher than Premium+ sentiment                                         Not supported
H2a         Freemium pricing sentiment is higher than Premium pricing sentiment                         Supported***
H2b         Freemium pricing sentiment is higher than Premium+ pricing sentiment                        Supported***
H2c         Premium pricing sentiment is higher than Premium+ pricing sentiment                         Not supported
H3a         Freemium % pricing reviews is higher than Premium % pricing reviews                         Not supported
H3b         Freemium % pricing reviews is higher than Premium+ % pricing reviews                        Not supported
H3c         Premium+ % pricing reviews is higher than Premium % pricing reviews                         Not supported
H4          Pricing evaluation sentiment is positively related to overall evaluation sentiment          Supported***
H5a         Freemium evaluation assertiveness is lower than Premium evaluation assertiveness            Supported***
H5b         Freemium evaluation assertiveness is lower than Premium+ evaluation assertiveness           Supported***
H5c         Premium evaluation assertiveness is lower than Premium+ evaluation assertiveness            Not supported
H6a         Freemium pricing evaluation assertiveness < Premium pricing evaluation assertiveness        Supported**
H6b         Freemium pricing evaluation assertiveness < Premium+ pricing evaluation assertiveness       Supported**
H6c         Premium pricing evaluation assertiveness < Premium+ pricing evaluation assertiveness        Not supported
H7          Evaluation sentiment predicts evaluation assertiveness                                      Supported***
H8          Sentiment positively mediates the relationship of pricing structures with assertiveness     Supported***


5. Discussion

5.1 General results discussion

The main purpose of this paper is to examine the effects of the different app pricing structures on product evaluations. In particular, the effects of pricing structures on the sentiment and assertiveness expressed in consumer evaluations and the impact of pricing on the evaluation topics were examined. Several hypotheses were created to answer the main research question of this study: to what degree are consumer evaluations affected by the monetisation model of an app?

Based on the endowment effect introduced by Kahneman (1990) and the customer satisfaction models from Spreng et al. (1996) and Bowman & Ambrosini (2000), our first and second hypotheses stated that freemium apps would have a higher average overall and pricing evaluation sentiment compared to premium and premium+ apps. The results of our study confirm these differences. The overall and pricing sentiment of premium apps was higher than that of premium+ apps, but this difference was not significant. Therefore, adding microtransactions while keeping the upfront payment requirement appeared to have no significant effect on consumer evaluations in this study.

Previous studies indicate that, based on the sunk cost effect, pricing would have a larger impact on product evaluations for apps requiring an upfront payment (Thaler, 1985). More recent theories (Hienerth et al., 2011) suggest that microtransactions create more involved pricing-related interaction with consumers and could therefore have a larger impact on product evaluations. The results of this study contradict the latter and indicate that microtransactions do not have a larger impact on evaluations than upfront payments. Freemium apps have a lower share of reviews about pricing than premium and premium+ apps. Premium+ apps have a higher percentage of pricing reviews per app than premium apps, but this difference was not significant.

Several papers indicate that price is a product or service attribute considered relevant for consumer satisfaction (Anderson, Fornell and Lehmann, 1994). When consumers are evaluating a product, they consider pricing an important factor in expressing their overall satisfaction in product evaluations (Bitner and Hubbert, 1994). The results of this study confirm these theories and indicate that pricing evaluation sentiment had a strong positive effect on the overall evaluation sentiment. Higher percentages of pricing reviews per app corresponded to a stronger relationship between pricing sentiment and overall sentiment.

Based on studies by Grégoire (2007), Prelec and Loewenstein (1998) and Tripp (2011), it was hypothesised that adding an upfront payment requirement would increase the overall and pricing evaluation assertiveness. The results of our study confirm this and indicate that the freemium model has less assertive evaluations than premium and premium+ apps. Premium apps did not appear to have less assertive evaluations than premium+ apps. Furthermore, the results show that perceived negative emotions such as unfairness and betrayal are an important motivational driver for assertive consumer behaviour. Evaluation sentiment predicts evaluation assertiveness, and apps with a higher average sentiment have less assertive evaluations. Oliver and Swan (1989) explain that feelings related to disadvantaged and advantaged price inequality are not similar. When the perceived price unfairness is to the advantage of the customer, the assertiveness of the consumer response is reduced significantly, explaining why positive feelings regarding price are expressed less assertively.

The final hypothesis examined how the relationship between the pricing models and assertiveness is mediated by sentiment. This relationship was unanswered in the literature, and the results show that sentiment positively mediates the relationship of pricing structures with assertiveness.
