
Online reviews in the mid-digital age

The differences between mobile-written and PC-written reviews

and its effects on perceived helpfulness

Author: Sanne van den Brandt

Student number: 11853050

Date of submission: 20 June 2018

Version submitted: Final version

Qualification: MSc. in Business Administration – Digital Business Track

Institution: Amsterdam Business School, University of Amsterdam

First supervisor: Ms. Shan Chen


Statement of originality

This document is written by Sanne van den Brandt who declares to take full responsibility for the contents of this document.

I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it.

The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.

Table of contents

Statement of originality
Table of contents
Abstract
Chapter 1 Introduction
Chapter 2 Literature review
  2.1 User-generated content and online product reviews
  2.2 The effects of mobile phone usage on UGC
  2.3 The characteristics of online reviews
    Basic characteristics
    Stylistic characteristics
  2.4 The effects on the perceived helpfulness of mobile reviews
  2.5 Conceptual framework
Chapter 3 – Methodology
  3.1 Research design
    Methodological philosophy
    Research approach
    Research strategy
    Research purpose
  3.2 Operationalization of concepts
    Product category
    Device type
    Basic characteristics
    Stylistic characteristics
    Perceived helpfulness
  3.3 Sampling, data collection and data analysis
    Data collection
    Data analysis
Chapter 4 – Results
  4.1 General descriptives
    Device type
    Product category
    Basic characteristics
    Stylistic characteristics
    Perceived helpfulness
  4.2 The differences between mobile-written and PC-written reviews
    Basic characteristics
    Stylistic characteristics
  4.3 The effects on perceived helpfulness
    Differences between mobile- and PC-written reviews in helpfulness
    The effects of the differences in basic and stylistic characteristics
  4.4 Summary of the results
Chapter 5 – Discussion
  5.1 General discussion
    Research question 1
    Research question 2
  5.2 Implications
    Theoretical implications
    Managerial implications
  5.3 Limitations and suggestions for further research
Chapter 6 – Conclusion
References

List of tables and figures

Table 1 – List of studies on characteristics of online reviews
Figure 1 – Conceptual framework
Figure 2 – Number of reviews per device type
Table 2 – Number of reviews per product category and product
Table 3 – Results comparison mobile- and PC-written reviews on ranking
Table 4 – Results comparison mobile- and PC-written reviews on ranking extremeness
Table 5 – Results comparison mobile- and PC-written reviews on extremeness of emotional text
Table 6 – Results comparison mobile- and PC-written reviews on level of detail
Table 7 – Results comparison mobile- and PC-written reviews on recency of consumption
Table 8 – Results comparison mobile- and PC-written reviews on several length metrics
Table 9 – Results comparison mobile- and PC-written reviews on helpfulness metrics
Table 10 – Results regression analyses, influence of independent variables on helpfulness
Figure 3 – Conceptual framework with outcomes of tests marked

Appendix A – Figures in text
Appendix B – Codebook
Appendix C – Output IBM SPSS Statistics 25 / FZT Computator
  Appendix C1: Boxplots on all continuous variables
  Appendix C2: SPSS Output on the general descriptives
  Appendix C3: SPSS Output on basic characteristics
  Appendix C4: SPSS Output on stylistic characteristics
  Appendix C5: SPSS Output on helpfulness
  Appendix C6: SPSS Output / FZT Computator Output on regression analyses

Abstract

Over the past years, smartphone usage has grown enormously. Browsing behavior on smartphones has been found to differ considerably from browsing behavior on PCs, which might have implications for user-generated content. However, little research has been done on this subject yet, which is why this study aims to gain insight into how writing online product reviews on a certain device affects the characteristics of those reviews, and how these effects impact the reviews' perceived helpfulness. To investigate this, existing data from the Indian website MouthShut.com were used, concerning product reviews in the categories 'Automotive', 'Books', 'Electronics', and 'Home and Appliances'. Across all categories together, mobile-written product reviews were indeed found to differ from PC-written reviews on characteristics concerning level of detail, indications of recency of consumption, and length, although certain differences appear to exist between the product categories. Additionally, the results of this study indicate that different factors affect the perceived helpfulness of mobile- and PC-written online product reviews, though PC-written reviews were generally perceived as more helpful. The findings of this study have important implications, which mainly concern the design of the interface in which online product reviews are written.


Chapter 1 Introduction

In 2016, the British newspaper The Telegraph published the following headline: "Mobile web usage overtakes desktop for first time". The corresponding article reported that, for the first time in history, more web pages were loaded on mobile devices than on PCs (personal computers) (The Telegraph, 2016). This illustrates the rising importance of mobile devices, especially mobile phones, in our daily lives.

GSMA Intelligence (an organization that collects data from all operator groups, networks and Mobile Virtual Network Operators worldwide) states that more than five billion people in the world are now mobile subscribers (GSMA Intelligence, 2017a; 2017b). A unique mobile subscriber is "an individual person that can account for multiple 'mobile connections'" (GSMA Intelligence, 2017b). In 2001, the number of mobile subscribers was still below one billion. Mobile phone usage has thus grown enormously worldwide within a relatively short period of time. A large part of this growth is accounted for by smartphones (Cisco, 2016; GSMA Intelligence, 2017a). A smartphone is "a pocket-sized communication device with PC-like capabilities" (Carroll & Heiser, 2010, p. 1). One of its most important features is that, like a PC, it provides internet access. Of all people who own a mobile phone, 75 percent currently use their phone to access the internet (GSMA Intelligence, 2017a). However, accessing the internet from a PC is still common as well: in 2016, 48.74 percent of web pages were still loaded on a PC (The Telegraph, 2016).

Although Carroll and Heiser (2010) stated that smartphones have PC-like capabilities, many differences exist between smartphones and PCs. Browsing behavior on a smartphone, for instance, differs considerably from browsing behavior on a PC. Raphaeli, Goldstein and Fink (2017) found that mobile browsing behavior is more task-oriented, whereas PC browsing behavior is more exploratory. Moreover, Ghose, Goldfarb and Han (2013) argue that the smaller screen of a mobile phone results in higher search costs, as opposed to a PC, which has a relatively larger screen. Also, mobile phones are portable, which means that, unlike PCs, they can easily be used at any location. The latter means that mobile users have access to timely information, whereas PC users do not have this advantage (Ghose et al., 2013). All these differences in browsing behavior might have major marketing implications, but little empirical research has been done on them. In addition, the research that has been done is mainly focused on the implications for advertisements (Burtch & Hong, 2014). However, besides advertisements, there are more areas that are likely to be influenced by the differences in browsing behavior per device type, such as the customer purchase journey, customer equity, customer loyalty and user-generated content (UGC) (Kannan & Li, 2016).

Some studies concerning UGC show that differences between the browsing behavior of mobile and PC users exist, for example for microblogs (Ghose et al., 2013) and online service reviews (Burtch & Hong, 2014). However, research on these differences is still lacking in the area of online reviews (Kannan & Li, 2016), specifically in the area of online product reviews. Online product reviews are considered one of the most influential sources that consumers use for decision-making (Zhang, Wu, & Mattila, 2016) and can have a significant impact on product awareness and product sales (Chevalier & Mayzlin, 2004). Little is yet known about the effects of device type on this influential type of UGC, though a certain impact is expected, as effects have been found for other types of UGC as well.

Therefore, this research addresses the following research question: “How do online product reviews that are written on a mobile phone differ from those that are written on a PC?”

A topic that is considered very important within the field of online product reviews is perceived helpfulness, as this concept is generally used to assess the usefulness of reviews (Singh, Irani, Rana, Dwivedi, Saumya, & Kumar, 2017). Many customers experience an information overload, caused by the large number of online reviews (Singh et al., 2017). Hence, in many cases an indication of the helpfulness of reviews is required in order to simplify customers' decision-making. Therefore, most online review websites ask customers to indicate the helpfulness of reviews (see Appendix A, Figure 1 for an example). Subsequently, the websites rank reviews according to their helpfulness, assisting customers in their decision-making processes (Cao, Duan, & Gan, 2011). Because of the importance of this topic within the field of online reviews, this study also addresses a second research question: "How do the differences between mobile-written and PC-written online product reviews affect the perceived helpfulness of the reviews?"

The research questions are answered by studying data from the website MouthShut.com, India's leading online review website (MouthShut.com, n.d.). This website provides information on both device usage and helpfulness, and was therefore considered useful for conducting this research. MouthShut.com has about 12 million monthly users (BGR, 2017) and more than 800,000 products and services to be reviewed (MouthShut.com, n.d.). A sample of 1435 reviews was analyzed using the programs LIWC (Linguistic Inquiry and Word Count), IBM SPSS Statistics 25, and the FZT Computator.

This study has generated new insights concerning the impact of device type on the characteristics of online product reviews and on their helpfulness. It contributes to existing literature by indicating that mobile- and PC-written product reviews indeed differ from each other on a number of characteristics. Moreover, these differences were found to affect helpfulness in certain ways. These insights might be used to improve the interface designs and algorithms used for online product reviews.

In this report, first an overview of the literature relevant to the research questions is presented, resulting in hypotheses and a conceptual framework. Thereafter, information on the research method is provided, followed by the results of this research. Finally, the results are discussed and conclusions are drawn.


Chapter 2 Literature review

In this chapter, an extensive literature review is presented, leading to the development of several hypotheses that have provided guidance throughout the research process. Firstly, the concepts of UGC and online product reviews are discussed. Secondly, the effects of mobile phone usage on UGC in general are examined. Thirdly, an overview is given of the various characteristics that online reviews can have, leading to the development of the hypotheses concerning the first research question. Fourthly, factors affecting the helpfulness of online reviews are examined, and the hypotheses concerning the second research question are formulated. Finally, a conceptual framework is presented, giving insight into how the concepts of this study are approached.

2.1 User-generated content and online product reviews

UGC is a subject that has been studied extensively before, especially within the context of marketing research. Several definitions of the concept exist. Daugherty, Eastin and Bright (2008, p. 16) argue that UGC is "media content created or produced by the general public rather than by paid professionals and primarily distributed on the Internet". Fader and Winer (2012) provide a somewhat more extensive definition, describing UGC as interactions between consumers and between consumers and firms to "influence consumer purchasing and company decision making" (Fader & Winer, 2012, p. 369). They state that these interactions can take several forms, such as "product reviews, descriptions of product usage, homemade advertising, blogs, and other consumer-initiated contributions" (Fader & Winer, 2012, p. 369). Both of these definitions stress the importance of the role of the consumers. Tellis and Tirunillai (2012, p. 198) provide the most complete definition, defining UGC as "the body of information, generated by consumers going beyond their role as passive seekers of […] such as communities, blogs, product reviews, and wikis". It is notable that the definitions of both Fader and Winer (2012) and Tellis and Tirunillai (2012) explicitly mention product reviews as an example of UGC. UGC can thus be considered a broader concept than product reviews alone.

The subject of online reviews has, just like UGC, been studied extensively in marketing research. Many researchers simply describe online reviews as the digital version of, or a substitute for, traditional offline word-of-mouth (e.g. Chevalier & Mayzlin, 2004; Singh et al., 2017). This is also the reason that it is often called electronic word-of-mouth, or eWOM. However, the broader concept of UGC is also often described as eWOM (Kannan & Li, 2016). As the concept of eWOM can be used interchangeably with both UGC and online reviews, only the concepts UGC and online reviews are used in this research, in order to make a clear distinction between all concepts.

Kannan and Li (2016, p. 27) acknowledge that online reviews have major similarities with traditional offline word-of-mouth, as both consist of "customers' knowledge about the products, their usage, experience, recommendations, and complaints". However, they also state that there are some important differences. Online reviews are likely to be longer and to contain richer content than offline word-of-mouth. Moreover, online reviews are more accessible and can be shared more easily, because they are digital (Kannan & Li, 2016).

2.2 The effects of mobile phone usage on UGC

Previous studies have already indicated that the device type used to write UGC influences certain characteristics of UGC. Daugherty et al. (2008) argue that, because of the evolution of the digital information society, continuing to identify key motivational sources of UGC is […]. On top of that, Ghose et al. (2013) found that for microblogging (e.g. Twitter), mobile ranking effects are higher, resulting in higher search costs on mobile. Moreover, within microblogs, "the benefit of browsing for geographically close matches is higher on mobile phones" (Ghose et al., 2013, p. 613). In that particular study, this meant that stores located closer to the user were more likely to be clicked on (Ghose et al., 2013). Lamberton and Stephen (2016) argued that this might be due to a more task-oriented focus of mobile internet users; non-mobile internet users would be more oriented towards network and relationship building than mobile users. This has been corroborated by Raphaeli et al. (2017).

Another study on the effects of mobile phone usage on UGC was conducted by Burtch and Hong (2014). Though it was never published in a journal, their paper was presented at the Thirty Fifth International Conference on Information Systems in Auckland, in 2014. Burtch and Hong (2014) examined the differences between online reviews written on mobile and non-mobile devices. They found that mobile-written reviews tend to be shorter, more extreme, more emotional and more concrete. Moreover, Burtch and Hong (2014) discovered that mobile-written reviews tend to be more helpful to consumers, probably due to effects of indications of consumption recency (Burtch & Hong, 2014; Chen & Lurie, 2013).

At first glance, the study of Burtch and Hong (2014) shows many similarities to this research. However, there are also some important differences. Firstly, Burtch and Hong (2014) used data on online reviews from the platform TripAdvisor: "a website that hosts online reviews for the service industry, with a focus on restaurants and hotels" (Burtch and Hong, 2014, p. 3). This study focuses on product reviews instead of service reviews. The distinction between products and services is important, as shown by several researchers (e.g. Anderson, Fornell & Rust, 1997; Edvardsson, 1997; Edvardsson, Johnson, Gustafsson & Strandvik, 2000). Compared to products, services involve more human resources and ask more involvement from the customers (Edvardsson, 1997; Edvardsson et al., 2000). Also, the physical environment in which services are performed plays an important role (Edvardsson, 1997). Moreover, differences exist in the relation between perceived quality or satisfaction and the productivity of products and services (Anderson et al., 1997). Finally, "Separability of production and consumption" (Parasuraman, Zeithaml, & Berry, 1985, p. 33) is what distinguishes services from products: goods are not produced, sold and consumed simultaneously, while services are (Parasuraman et al., 1985). That might have implications for the moments at which products and services are evaluated, and thus for hypotheses 1a, 1b, 2, 3 and 4 (see paragraph 2.3), which are based on the assumption that, in general, mobile-written reviews are written more immediately than PC-written reviews. Some products have separate production, selling and consumption, whereas the services Burtch and Hong (2014) studied do not. This might imply that even mobile-written product reviews are not necessarily written immediately, as opposed to reviews of services, where this is almost always the case (Burtch & Hong, 2014). To test for which products this held, a distinction per product category was made for each hypothesis in the analysis.

Secondly, Burtch and Hong (2014) state that they are not sure whether the effects they found were caused by mobile device use or by mobile access. They argue that "if mobile device use causes the differences in content, it is possible that the effects might grow weaker or stronger within time, as users learn and gain experience on the mobile device. However, if the observed differences were simply enabled by mobile access, then we would expect them to persist into the future and perhaps grow stronger as users become more accustomed to using the mobile channel" (Burtch & Hong, 2014, p. 14). The research of Burtch and Hong was conducted in 2014, whereas this research was conducted in 2018. It is likely that within those four years, people have gained more experience with mobile devices. With the results of this study, it is thus possible to see whether the effects found by Burtch and Hong (2014) have changed.

Lastly, Burtch and Hong (2014) did not address which specific review characteristics caused the higher helpfulness scores for mobile-written reviews. This research does, by assessing the influence of each of the different review characteristics on perceived helpfulness separately. For these reasons, this research contributes to the work of Burtch and Hong (2014).

2.3 The characteristics of online reviews

Within the context of online reviews, there are numerous characteristics that can be examined. Knowing these characteristics makes it possible to develop hypotheses on the first research question (see Chapter 1). Table 1 (below) presents an overview of four studies that focused on the characteristics of online reviews.

Table 1 – List of studies on characteristics of online reviews

Cao et al. (2011)
Purpose: "Examining the impact of the various characteristics of online user reviews on the number of helpfulness votes those reviews receive" (p. 511).
Basic characteristics:
- Whether the reviewer wrote about "pros";
- Whether the reviewer wrote about "cons";
- Whether the reviewer wrote anything in the summary;
- How many days since the posting date;
- The "extremeness" level of the review.
Stylistic characteristics:
- Number of words in the review;
- Number of sentences in the review;
- Average characters per word;
- Average words per sentence;
- Number of words in pros;
- Number of words in cons;
- Number of words in summary;
- Number of words in title;
- Number of 1-letter words in the review;
- Number of 2 to 9-letter words in the review;
- Number of 10 or more-letter words in the review.
Semantic characteristics:
- […]

Ghose & Ipeirotis (2011)
Purpose: "Re-examining the impact of reviews on economic outcomes like product sales and seeing how different factors affect social outcomes" (p. 1).
Readability:
- The length of the review in characters;
- The length of the review in words;
- The length of the review in sentences;
- The number of spelling errors in the review;
- The Automated Readability Index (ARI) for the review;
- The Gunning–Fog index for the review;
- The Coleman–Liau index for the review;
- The Flesch Reading Ease score for the review;
- The Flesch–Kincaid Grade Level for the review;
- The SMOG (Simple Measure of Gobbledygook) score for the review.
Subjectivity:
- The average probability of a sentence in the review being subjective;
- The standard deviation of the subjectivity probability.

Goes & Lin (2012)
Purpose: Examining how "the interaction among online users influences their review-writing behavior, including the frequency of writing, the opinions that they express, and how they express them" (p. 3).
Variables measured:
- Volume, or the number of reviews;
- Valence, or the mean of ratings;
- Variance of ratings;
- Textual features of the reviews that users generate online:
  - Positive sentiment;
  - Negative sentiment;
  - Readability of reviews.

Burtch & Hong (2014)
Purpose: "Improving the understanding of device-dependent user behavior by examining differences in content generated on mobile and non-mobile devices, in the context of eWOM" (p. 1).
Variables measured:
- Valence;
- Length of review;
- Indications of consumption recency;
- Level of detailedness;
- Helpfulness.
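Several of the count-based metrics in Table 1 can be computed directly from a review's text. As an illustration only (the thesis itself used LIWC and SPSS; the naive tokenization below is an assumption of this sketch, not the authors' procedure), here is how the length metrics and the Automated Readability Index might be derived:

```python
import re

def review_metrics(review: str) -> dict[str, float]:
    """Naive length metrics plus the Automated Readability Index (ARI)."""
    words = re.findall(r"[A-Za-z']+", review)
    # Crude sentence split on terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", review) if s.strip()]
    chars = sum(len(w) for w in words)  # letters only, punctuation ignored
    n_w, n_s = len(words), max(len(sentences), 1)
    return {
        "length_chars": chars,
        "length_words": n_w,
        "length_sentences": len(sentences),
        # ARI = 4.71*(characters/words) + 0.5*(words/sentences) - 21.43
        "ari": 4.71 * (chars / max(n_w, 1)) + 0.5 * (n_w / n_s) - 21.43,
    }
```

For example, `review_metrics("Great phone. Battery lasts two days!")` counts 6 words over 2 sentences. Real readability tooling uses more careful sentence and word segmentation than this sketch.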

Table 1 (above) provides an extensive overview of characteristics of online reviews that can be measured. As all the characteristics and variables measured in the studies above can be divided into the categories provided by Cao et al. (2011), this categorization is used for this study. The only category of Cao et al. (2011) that was not perceived to be useful for this research is 'Semantic characteristics', as this involves finding out the exact meaning of text, which is considered "extremely difficult and often subjective" (Cao et al., 2011). Therefore, this category is not included. However, which characteristics within the two remaining categories are most likely to differ per device, and subsequently, which characteristics are most likely to affect helpfulness? In sum, which characteristics should be included in this research?

Basic characteristics

As opposed to PCs, mobile phones are portable, allowing access to timely information. For online reviews, this means that they can be written immediately after, or even while, trying out a product. Writing a review on a PC is often less immediate (Ghose et al., 2013). This might influence the valence of reviews. Valence is defined by Tellis and Tirunillai (2012, p. 202) as "whether the overall review is positive or negative". Kujala and Miron-Shatz (2013) found that in the early stages of product use, consumers tend to focus more on positive emotions. This would imply that the valence of reviews written on a mobile phone tends to be more positive than the valence of reviews written on a PC, as mobile-written reviews are generally assumed to be written more immediately. However, contradictory evidence was found by Burtch and Hong (2014), who state that over time, negative emotions are more easily forgotten because of the Fading Affect Bias (FAB). The FAB implies that "the intensity of affect associated with negative autobiographical memories fades faster than affect associated with positive autobiographical memories" (Walker & Skowronski, 2009, p. 1122). It is a common concept in the field of psychology, and the effect has been found in several studies, using several methods and within several populations (Walker & Skowronski, 2009). Therefore, the FAB is the most widely supported theory, and it is used to formulate hypothesis 1a as:

H1a: Online product reviews will have a more negative valence when they are written on a mobile phone, when compared to reviews that are written on a PC.

Moreover, Thomas and Diener (1990) argue that people tend to overestimate their emotions concerning a past experience. This might imply that online reviews written on a PC contain more extreme valence scores. However, contradictory evidence was found by Kujala and Miron-Shatz (2013), who found that over time people tend to become more objective and less subjective, which would mean that reviews written on a PC contain less extreme valence scores than mobile-written reviews. Moreover, Aaker, Drolet and Griffin (2008) argue that mixed emotions are more difficult to recall and generally underreported; this effect grows stronger over time, which is not the case for more unipolar (one-sided) emotions. As both Aaker et al. (2008) and Kujala and Miron-Shatz (2013) found evidence that emotions become less extreme over time, and as both of these studies were conducted more recently than the study by Thomas and Diener (1990), hypothesis 1b was formulated as:

H1b: Online product reviews will contain a more extreme valence score when they are written on a mobile phone, when compared to reviews that are written on a PC.
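H1b concerns how far a review's valence deviates from neutral. The thesis's own operationalization is described in Chapter 3; purely as an illustration (and as an assumption of this sketch, not the measure actually used), one common proxy is the distance of the star rating from the scale midpoint:

```python
def rating_extremeness(rating: float, scale_min: float = 1.0,
                       scale_max: float = 5.0) -> float:
    """Distance of a rating from the scale midpoint.

    On a 1-5 star scale the midpoint is 3, so both a 1-star and a
    5-star review receive the maximum extremeness of 2.0.
    """
    midpoint = (scale_min + scale_max) / 2
    return abs(rating - midpoint)
```

Under H1b, mobile-written reviews would show a higher average value of such an extremeness measure than PC-written reviews.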

Stylistic characteristics

Stylistic characteristics, as described by Cao et al. (2011, p. 515), "represent key features of reviewers' writing style that cannot be easily derived by simply browsing the review texts".

For the same reasons that mobile-written reviews are expected to contain more extreme valence scores (H1b), it can also be suggested that mobile-written reviews are likely to contain more emotional text. Both Kujala and Miron-Shatz (2013) and Burtch and Hong (2014) found evidence for this statement. Therefore, the second hypothesis is:

H2: Online product reviews will contain more emotional text when they are written on a mobile phone, when compared to reviews that are written on a PC.

It was already argued that mixed emotions are generally underreported when people have to recall them. Additionally, recalling emotions often leads to more generalized reports containing fewer details (Robinson & Clore, 2002), as details fade from memory over time (Burtch & Hong, 2014). As mobile phone usage is perceived as more immediate than PC usage (Ghose et al., 2013), hypothesis 3 was formulated as:

H3: Online product reviews will contain more details when they are written on a mobile phone, when compared to reviews that are written on a PC.

Burtch and Hong (2014) found that product reviews written on a mobile phone contain more references to the recency of the consumption (e.g. 'yesterday' or 'today'), because of the easy access to timely information that mobile phone users have. This might affect the perceived helpfulness of reviews (see the next paragraph). To test whether the findings of Burtch and Hong (2014) also apply in this setting, their hypothesis was largely adopted:

H4: Online product reviews will contain more references to recency of consumption when they are written on a mobile phone, when compared to reviews that are written on a PC.

Besides the differences in portability and access to timely information, there is one more feature that separates mobile phones from PCs: mobile screens are smaller, resulting in higher search costs (Ghose et al., 2013). It was already shown in 1999 that smaller screens have a negative effect on both effectiveness and efficiency when completing certain tasks (Jones, Marsden, Mohd-Nasir, Boone, & Buchanan, 1999; Maguire & Tang, 2014). Although research has been done on how to improve both effectiveness and efficiency on mobile phones (e.g. Garcia-Lopez, Garcia-Cabot, Manresa-Yee, & Pages-Arevalo, 2017), there is still room for innovation (Maguire & Tang, 2014). As mobile phones are thus less efficient than PCs, it is likely that online reviews written on a mobile phone are shorter. This corresponds with the findings of Burtch and Hong (2014). Hypothesis 5 was thus formulated as:

H5: Online product reviews will be shorter when they are written on a mobile phone, when compared to reviews that are written on a PC.

2.4 The effects on the perceived helpfulness of mobile reviews

As already argued, the perceived helpfulness of reviews is considered important, since customers experience an information overload as a result of the large number of reviews (Singh et al., 2017). Many websites therefore offer a review platform with a so-called 'helpfulness vote': people who read the reviews are asked "Was this review helpful to you?" (also see Appendix A, Figure 1). Readers answer this question, and these answers help e-commerce websites rank reviews according to their helpfulness. This has some major advantages. Firstly, it makes the decision-making process more efficient for the consumer. Secondly, it improves the reputation of the e-commerce website providing the reviews (Cao et al., 2011).
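The vote-based ranking described above can be sketched in a few lines. This is an illustrative simplification: real review sites use more elaborate, often proprietary, ranking algorithms, and the plain vote ratio below is an assumption of this sketch rather than any site's actual formula.

```python
from dataclasses import dataclass

@dataclass
class Review:
    text: str
    helpful_votes: int  # "yes" answers to "Was this review helpful to you?"
    total_votes: int    # all answers, helpful or not

def helpfulness_score(r: Review) -> float:
    # Fraction of readers who found the review helpful;
    # reviews without any votes default to 0.
    return r.helpful_votes / r.total_votes if r.total_votes else 0.0

def rank_reviews(reviews: list[Review]) -> list[Review]:
    # Most helpful reviews first, mirroring the ranking described above.
    return sorted(reviews, key=helpfulness_score, reverse=True)
```

A plain ratio over-rewards reviews with very few votes, which is why production systems typically smooth the score (e.g. with a vote-count prior) before ranking.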

Burtch and Hong (2014) found that for services (they studied reviews on TripAdvisor.com), the perceived helpfulness of mobile-written reviews is higher than that of PC-written reviews. In order to answer the second research question, the helpfulness hypothesis of Burtch and Hong (2014) was adopted. By doing so, it was possible to test whether mobile-written reviews are also more helpful than PC-written reviews for products.

H6a: Online product reviews will have a higher perceived helpfulness when they are written on a mobile phone, when compared to reviews that are written on a PC.

Burtch and Hong (2014) did not investigate which factors cause the higher helpfulness scores for mobile-written reviews. That is exactly where this research makes a contribution: this study tests which basic and stylistic characteristics of mobile- and PC-written reviews cause high helpfulness scores.

Several studies have examined the factors that influence the perceived helpfulness of online reviews. Cao et al. (2011) investigated the influence of various basic, stylistic and semantic features on the perceived helpfulness of online reviews and found that a combination of these three types of features predicted perceived helpfulness best, with stylistic features being the least important. This indicates that in this research, the differences within the basic features (concerning valence) would cause the biggest difference in helpfulness between mobile- and PC-written reviews, as the semantic features are excluded from this study (for good reasons, see paragraph 2.3). This is in accordance with the findings of Singh et al. (2017), who found that readability, polarity, subjectivity, entropy, and the average review rating are important predictors of helpfulness, whereas wrong words, stop words, length, and the number of one-letter words are less important predictors. Especially polarity, subjectivity and the average review rating are part of the concept of valence, which Cao et al. (2011) found to be a suitable predictor of perceived helpfulness.


Cao et al. (2011) also found that more words in the con (negative) part of a review lead to more helpfulness votes for that particular review. They argue that this is because of the known effect of 'negativity bias', which means that people tend to focus more on the negative parts of reviews. This is corroborated by Yin, Bond and Zhang (2014). According to hypothesis 1a concerning valence, mobile-written reviews have a relatively more negative valence. Hypothesis 6b can thus be formulated as:

H6b: For mobile-written product reviews, valence will have a bigger influence on helpfulness as opposed to the other characteristics, when compared to PC-written product reviews.

Yin et al. (2014) also found that reviews with more discrete emotions were perceived as more helpful by consumers. This would indicate that reviews with less extremeness, and thus with a less extreme valence, would be considered more helpful. By contrast, Cao et al. (2011) found that more extreme reviews get more attention from consumers, and that these reviews were therefore seen as more helpful. This holds for positive as well as negative reviews. Schlosser (2011) supports this view, stating that reviews with more extreme reviewer ratings are perceived as more helpful. As hypothesis 1b indicates that online reviews will contain a more extreme valence score when they are written on a mobile phone, hypothesis 6c can be formulated as:

H6c: For mobile-written product reviews, extremeness of valence will have a bigger influence on helpfulness as opposed to the other characteristics, when compared to PC-written product reviews.

The stylistic characteristics of reviews can be considered to impact the perceived helpfulness, though it can be argued that the biggest difference between the perceived helpfulness of mobile-written reviews and PC-written reviews is likely to be caused by the basic characteristics.


2.5 Conceptual framework

Based on the literature review and the hypotheses, a conceptual framework was developed. This framework displays the scope of this study.


Chapter 3 - Methodology

Within this chapter, a justification of the research design will be provided, giving insights into the methodological philosophy, the research approach, the research strategy and the research purpose. Furthermore, the hypotheses are operationalized into measurements, showing how all concepts in this study are approached. Moreover, the sampling technique and data collection method that were used are explained and justified. Finally, the way the data were analyzed is described.

3.1 Research design

In this first paragraph of chapter 3, a justification will be given for the choices that were made concerning the methodological philosophy, the research approach, the research strategy and the research purpose.

Methodological philosophy

The methodological philosophical (epistemological) point of view taken in this research is positivistic. Positivism "advocates the application of the methods of the natural sciences to the study of social reality" (Bryman & Bell, 2015, p. 28). An important principle of positivism is that hypotheses are tested in order to assess theories.

Furthermore, the research should be conducted in an objective way (Bryman & Bell, 2015). In this study, already existing data (in the form of online product reviews) were used to test the hypotheses formulated in chapter 2. Moreover, the point of view is completely objective, as the researcher is not part of the organization (or in this case, the website) that was studied. Therefore, this study can be perceived as a positivistic study.


The ontological position taken in this research is objectivism, implying that "social phenomena and their meanings have an existence that is independent of social actors" (Bryman & Bell, 2015, p. 32). In this research, it is assumed that all the concepts that were studied can be regarded as definitive, and are not subject to change by any social actors. That is why the ontological position of this research is objectivist.

Research approach

This research started with a review of the existing literature concerning online reviews and perceived helpfulness, which indicates that a deductive approach was taken (Bryman & Bell, 2015). Based on this literature, hypotheses were formulated and subsequently tested in order to confirm or reject them. This also indicates that the relationship between theory and research is mainly deductive (Bryman & Bell, 2015).

Research strategy

As a logical result of this research being positivistic, objectivistic and deductive, the strategy used to conduct this research is quantitative. A quantitative research strategy “emphasizes quantification in the collection and analysis of data” (Bryman & Bell, 2015, p. 37). The formulated hypotheses in this study were tested in an objective, and mainly numerical way.

Research purpose

In order to answer research question 1 ("How do online product reviews that are written on a mobile phone differ from those that are written on a PC?") and to test hypotheses 1 to 5, a descriptive study was conducted. "The object of descriptive research is 'to portray an accurate profile of persons, events or situations'" (Robson, 2002 in Saunders, Lewis, & Thornhill, 2009, p. 140). To be able to answer the research questions, descriptive statistics (such as means) were obtained in order to assess the differences between mobile- and PC-written reviews.

In order to answer research question 2 ("How do the differences between mobile-written- and PC-written online product reviews affect the perceived helpfulness of the reviews?"), and to test hypotheses 6a, 6b and 6c, explanatory research was done. Explanatory research "establishes causal relationships between variables" (Saunders, Lewis, & Thornhill, 2009, p. 140). Research question 2 has a causal nature. The purpose of this question was to find out which of the differences found by answering research question 1 had the strongest effect on perceived helpfulness.

A study like this, in which a descriptive part is followed by an explanatory part, is also known as a descripto-explanatory study (Saunders, Lewis, & Thornhill, 2009).

3.2 Operationalization of concepts

The concepts measured in this research are product category, device type, basic characteristics, stylistic characteristics and perceived helpfulness. This paragraph explains how these concepts were operationalized.

Product category

Although MouthShut.com contains products in eight different product categories, for convenience it was chosen to focus on four product categories: 'Automotive', 'Books', 'Electronics' and 'Home and Appliances' (see paragraph 3.3, sampling). The reviews that were analyzed were selected from these four categories and were coded as such. The aim was to collect reviews of at least three different products per product category.

Device type

On MouthShut.com, an indication is given when a review is written on a mobile device, via the iOS application, or via the Android application. Together, these categories represent the mobile-written reviews. There are also many reviews that do not contain such an indication; it was assumed that these reviews were written on a PC. Thus, a nominal scale containing four categories (PC, Mobile, Android application, and iOS application) was used to define the device type used to write the review. As the categories Mobile, iOS application and Android application all represent mobile-written reviews, a second device variable was used, consisting of two device types: PC and mobile.

Basic characteristics

The first variable that was examined was valence, defined as "whether the overall review is positive or negative" (Tellis & Tirunillai, 2012, p. 202). On MouthShut.com, people can give products ratings ranging from 1 (very negative) to 5 (very positive). These ratings were used to assess valence (an ordinal variable), as was also done by Burtch and Hong (2014). The means of the ratings for mobile- and PC-written reviews were compared (Burtch & Hong, 2014; Goes & Lin, 2012) in order to test hypothesis 1a.

In order to test hypothesis 1b (concerning the extremeness of valence), the variable for ratings was recoded into a new variable (ordinal). A rating of 3 got an extremeness score of 1 (not extreme), ratings of 2 and 4 got an extremeness score of 2 (extreme), and finally, ratings of 1 and 5 got an extremeness score of 3 (very extreme).
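The recoding described above can be sketched as follows (a minimal illustration; the actual recoding was done in IBM SPSS Statistics 25):

```python
def extremeness(rating):
    """Recode a 1-5 star rating into an extremeness score (1-3):
    3 -> 1 (not extreme), 2 and 4 -> 2 (extreme), 1 and 5 -> 3 (very extreme)."""
    return {1: 3, 2: 2, 3: 1, 4: 2, 5: 3}[rating]

print([extremeness(r) for r in [1, 2, 3, 4, 5]])  # -> [3, 2, 1, 2, 3]
```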

Stylistic characteristics

The next variable examined was extremeness of emotional text (hypothesis 2), measured using the affect score of LIWC (Linguistic Inquiry and Word Count). LIWC "reads a given text and counts the percentage of words that reflect different emotions, thinking styles, social concerns, and even parts of speech" (Pennebaker Conglomerates, Inc., n.d.). The operator's manual of LIWC 2015 states that the affect score is based on the presence of words like 'happy' or 'cried' and another 1,391 words (Pennebaker, Booth, Boyd, & Francis, 2015). Using LIWC to measure the expression of emotions in text was found to be a valid method (Kahn, Tobin, Massey, & Anderson, 2007), and it was also used by Burtch and Hong (2014). The affect scores of mobile-written and PC-written reviews were compared in order to test hypothesis 2.
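LIWC's dictionary-based scoring can be illustrated with a minimal sketch: the score is the percentage of words in a text that match a category dictionary. The two-word dictionary below is a toy stand-in for LIWC's real affect word list, not the actual LIWC implementation:

```python
AFFECT_WORDS = {"happy", "cried"}  # toy stand-in for LIWC's affect dictionary

def affect_score(text):
    """Percentage of words in `text` that appear in the category dictionary."""
    words = text.lower().split()
    if not words:
        return 0.0
    return 100.0 * sum(w in AFFECT_WORDS for w in words) / len(words)

print(round(affect_score("I cried but now I am happy"), 2))  # 2 of 7 words match
```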

A third variable examined was the difference in level of detail between mobile-written and PC-written reviews (hypothesis 3). Burtch and Hong (2014) measured this with the variable 'perceptual process', delivered by LIWC. Examples of words that count as perceptual processes are 'look', 'heard' and 'feeling', along with 433 other words (Pennebaker, Booth, Boyd, & Francis, 2015). This approach was used in this research as well.

Another variable that was assessed was recency of consumption. Burtch and Hong (2014) and Chen and Lurie (2013) measured this by looking for words and phrases like 'today', 'this morning', 'just got back', 'tonight' and 'this evening'. In this research, LIWC was used to analyze all text data. Therefore, to measure recency of consumption, LIWC's past-focus and present-focus scores were used. The past-focus measure contains words like 'ago', 'did' and 'talked', whereas the present-focus measure contains words like 'today', 'is' and 'now' (Pennebaker, Boyd, Jordan, & Blackburn, 2015). The differences in past- and present-focus scores between mobile-written and PC-written reviews were compared, which made it possible to test hypothesis 4.

The last independent variable examined was length. This was assessed per review by looking at the number of words (Cao et al., 2011), the average number of words per sentence, and the number of words containing more than six characters. LIWC provided all these measurements. Again, mobile- and PC-written reviews were compared in order to test hypothesis 5.
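These length measures can be approximated with a small sketch (illustrative only; the thesis used LIWC's own word counts, which tokenize differently):

```python
import re

def length_measures(text):
    """Approximate the three length measures: word count, average words
    per sentence, and percentage of words longer than six letters."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "words": len(words),
        "words_per_sentence": len(words) / max(len(sentences), 1),
        "sixltr_pct": 100.0 * sum(len(w) > 6 for w in words) / max(len(words), 1),
    }

print(length_measures("Excellent product. Delivery arrived quickly today."))
```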

Perceived helpfulness

Perceived helpfulness of reviews is normally assessed by looking at the number of helpfulness votes a review received (e.g. Burtch & Hong, 2014; Cao et al., 2011; Ghose & Ipeirotis, 2011; Schlosser, 2011; Singh et al., 2017). MouthShut.com also contains a system in which people can give helpfulness votes to reviews (see Appendix A, Figure 1). However, after examining the helpfulness votes, it was concluded that many reviews on MouthShut.com do not contain a single helpfulness vote. This is a generally acknowledged problem when studying perceived helpfulness (Cao et al., 2011). Cao et al. (2011) excluded the reviews without any helpfulness votes from their study. This was also done in this study. Additionally, only reviews with two or more helpfulness votes were included.

On MouthShut.com, helpfulness votes can range from 1 (not useful) to 3 (very useful). This results in a breakdown of all helpfulness votes into groups 1, 2 and 3, which is displayed as percentages of the total. These percentages were used to calculate the number of helpfulness votes that fell into groups 1, 2 and 3 respectively.
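Converting the displayed percentages back into vote counts is simple arithmetic; a sketch (the function name and the example split are hypothetical):

```python
def votes_per_group(total_votes, percentages):
    """Convert a percentage breakdown of helpfulness votes into counts.
    `percentages` holds the shares for groups 1-3 (not useful ... very useful)."""
    return [round(total_votes * p / 100) for p in percentages]

print(votes_per_group(10, [20, 50, 30]))  # -> [2, 5, 3]
```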


The influence of the variables on the perceived helpfulness was examined separately for mobile- and PC-written reviews by means of regression, in order to test hypotheses 6b and 6c. All of the factors that were assessed for testing hypotheses 1 to 5 were entered into multiple regression models in order to test which factors have the biggest influence on perceived helpfulness. This was done separately for mobile-written and PC-written reviews.
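The regression equation itself did not survive in this text; a plausible form of the model, assuming the predictors described in this paragraph (the exact specification may differ), is:

```latex
\text{Helpfulness}_i = \beta_0 + \beta_1\,\text{Valence}_i + \beta_2\,\text{Extremeness}_i
  + \beta_3\,\text{Affect}_i + \beta_4\,\text{Detail}_i
  + \beta_5\,\text{PastFocus}_i + \beta_6\,\text{PresentFocus}_i
  + \beta_7\,\text{Length}_i + \varepsilon_i
```

estimated separately for the mobile-written and the PC-written subsample.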

For an overview of the codebook as used in IBM SPSS Statistics 25, see Appendix B.

3.3 Sampling, data collection and data analysis

This paragraph provides insights into the sampling method and the data collection method of this research. Moreover, the way the data were analyzed will be explained.

Sampling

As previously mentioned, the data were taken from MouthShut.com, India's leading online review website (MouthShut.com, n.d.). This was the only website found on which information on valence, helpfulness, and the device type used to write the review is available, all of which was critical for conducting this research.


MouthShut.com has 12,000,000 monthly users (BGR, 2017) and more than 800,000 products and services to be reviewed (MouthShut.com, n.d.). It was impossible to take all these reviews into account; therefore, only a sample was taken. Firstly, all service reviews were excluded from the sample, as this study only focuses on product reviews. On MouthShut.com, the product categories are: 'Automotive', 'Books', 'Electronics', 'Computers', 'Health and Beauty', 'Home and Appliances', 'Mobile and Internet', and 'Movies, Music and Sitcom'. For convenience, four of these categories that differ substantially from each other were chosen: 'Automotive', 'Books', 'Electronics' and 'Home and Appliances'. Per category, the goal was to collect 300-400 reviews.

Whether product reviews could be included in the sample depended on a few factors. Firstly, the filter option 'most reviewed' was used, which makes the most-reviewed products appear first in the list. This was considered very important, as the more often a product was reviewed, the higher the chance of finding many reviews containing two or more helpfulness votes. Secondly, only products that were reviewed between 200 and 1,000 times were sampled. In that range, it is more likely that a substantial number of reviews contain enough helpfulness votes to be included in the sample, without there being too many eligible reviews, as the aim was to collect reviews of at least three different products per product category. Thirdly, the products should differ from each other: for example, no two books by the same writer and no two cars of the same brand were selected. This was done to collect reviews that were as diverse as possible. Fourthly, brands are also reviewed on MouthShut.com. These reviews were not included either, as this research focuses specifically on product reviews, not brand reviews. Moreover, only reviews written in the years 2014-2017 were used, because the usage of smartphones only recently started to rise extensively; the years the reviews were written in should thus be as recent as possible. Furthermore, as the data were collected on several days in April and May 2018, it was decided to leave 2018 out. Based on these requirements, a sample of products was randomly selected until 300-400 reviews per product category were collected. However, all these requirements may have caused some bias in the sample, which is further discussed in paragraph 5.3.

A downside of using MouthShut.com is that it is primarily used by the Indian population, not by a worldwide population. Consumers in different countries, with different ethnic origins, use the internet for different purposes, which also leads to different impressions of the same websites (Chau, Cole, Massey, Montoya-Weiss, & O'Keefe, 2002). Moreover, people of different cultures are likely to have different perceptions of emotions (Matsumoto, 1983), which are considered quite important for the valence and the content of online reviews. On top of that, India has a relatively large percentage of people (70 percent) who access the internet only via a mobile device, as they do not have internet access through a PC (Business Insider, 2017). This might also have influenced the results. For these reasons, caution should be taken when generalizing the results of this research beyond the Indian population. Moreover, the results should not be generalized from products to services, as products and services differ substantially from each other (Anderson et al., 1997; Edvardsson, 1997; Edvardsson et al., 2000; Parasuraman et al., 1985; also see paragraphs 2.2 and 5.3).

Data collection

Initially, the plan was to collect the data from MouthShut.com by means of data scraping. Data scraping is "a technique by which a computer program extracts data from human-readable output coming from a website" (Singh et al., 2017, p. 349). The HTML (the language in which MouthShut.com is written) would be transformed into structured data that could be used for analysis (Vargiu & Urru, 2012). However, the legality of data scraping is nowadays questionable, and debates about it are rising as people become more concerned about their privacy. With these ethical considerations in mind, it was decided not to use data scraping, but to extract the data from MouthShut.com manually.

Data analysis

When all the data were obtained, the total dataset was analyzed using two different methods. In order to test hypotheses 2 (extremeness of emotional text), 3 (level of detail), and 4 (recency of consumption), LIWC (Linguistic Inquiry and Word Count) was used. LIWC "reads a given text and counts the percentage of words that reflect different emotions, thinking styles, social concerns, and even parts of speech" (Pennebaker Conglomerates, Inc., n.d.). Using LIWC as a method to measure the expression of emotions in text was found to be valid (Kahn, Tobin, Massey, & Anderson, 2007). Moreover, it was also used by Burtch and Hong (2014) to assess both extremeness of emotional text and level of detail.

All other hypotheses were tested using the statistical analysis software IBM SPSS Statistics 25, which made it possible to compute, among other things, checks for assumptions of statistical tests such as normality, homogeneity of variance, linearity and independence. Furthermore, it allowed the calculation of standard statistical measures such as means and standard deviations. Moreover, IBM SPSS Statistics 25 can execute a multiple regression analysis (Field, 2013), which was crucial for testing hypotheses 6a, 6b and 6c. Finally, in order to interpret the results of the multiple regression correctly, the software program FZT Computator was used.


Chapter 4 – Results

In this chapter, the results of the analysis will be presented. Firstly, the general descriptive statistics of the variables (valence, extremeness of valence, extremeness of emotional text, level of detail, recency of consumption, length and perceived helpfulness) will be examined. Secondly, the results concerning the differences between mobile- and PC-written reviews will be assessed. Finally, the effects of these differences on the perceived helpfulness of reviews will be examined.

4.1 General descriptives

Before analyzing the data, it was first tested whether the data contained outliers and whether several assumptions were met. As all the data were manually taken over from MouthShut.com, it was first checked (with the help of boxplots) whether the interval and ratio variables contained outliers. Based on these boxplots, one case was deleted (case 464, see Appendix C1, Figure 9). The other outliers that were found were relatively close to the whiskers. Additionally, these outliers were checked and found to be genuine observations (see Appendix C1, Figures 1 to 17). Therefore, they were not deleted.

Device type

Eventually, data concerning a total of 1,436 different product reviews were collected, of which 1,435 were usable for analysis. Of these reviews, 50.9 percent were PC-written and the remaining 49.1 percent were mobile-written. This is displayed in Figure 2, with mobile-written reviews divided into the categories 'Mobile' (general), 'Android App' and 'iOS App' (see Appendix C2, Tables 1 and 2).


Figure 2 – Number of reviews per device type

Reviews were collected that were written between 1 January 2014 and 31 December 2017. From January 2016 onwards, more reviews were written on a mobile device than on a PC (see Appendix C2, Figure 18). This corresponds exactly with the article of The Telegraph (see introduction), which was also written in January 2016.

Product category

Data concerning products within four different product categories were collected. These categories were ‘Automotive’, ‘Books’, ‘Electronics’ and ‘Home and Appliances’. Table 2 (below) displays how many reviews per product category and product were collected (see Appendix C2, Tables 3 and 4).

Table 2 – Number of reviews per product category and product

Automotive: 390 reviews (27.2 percent of total)
▪ Bajaj Dominar 400 (motor bike): 62
▪ Suzuki Gixxer (motor bike): 174
▪ Tata Nano (car): 154

Books: 397 reviews (27.7 percent of total)
▪ Gitanjali: 6
▪ Half Girlfriend: 245
▪ I Too Had A Love Story: 62
▪ The Alchemist: 50
▪ You Can Win: 34

Electronics: 321 reviews (22.4 percent of total)
▪ Apple iPod Touch 6th Generation: 18
▪ Mi Band (fitness band): 165
▪ Nikon D40 (camera): 57
▪ Philips SHP 1900 (headphone): 25
▪ Sony BRAVIA KLV-40W562D Full HD LED TV: 15
▪ Symphony Winter (air cooler): 41

Home and Appliances: 327 reviews (22.8 percent of total)
▪ All Out Machine (pest control): 59
▪ Britannia Good Day Biscuits: 110
▪ Pitanjali Honey: 133
▪ Pureit Classic Water Purifier: 25

Total: 1435 reviews (100 percent)

Basic characteristics

The basic characteristics consist of the rating and the extremeness of rating. The mean rating of all product reviews was 3.66 (1.343). The mean extremeness score of all product reviews on a scale of 1 to 3 was 2.33 (.684) (see Appendix C2, Table 5).

Stylistic characteristics

The first stylistic characteristic that was examined was the presence of emotional text. With the help of LIWC, an emotional affect score was assigned to each review. The mean of this affect score was 6.90 (3.073) with a minimum score of .00 and a maximum score of 23.08 (see Appendix C2, Table 6).

The second stylistic characteristic researched was the level of detail of a review. This was examined using the detail perception score, also assigned with the help of LIWC. For all reviews together, the mean detail perception score was 3.40 (3.006), with a minimum of .00 and a maximum of 17.24 (see Appendix C2, Table 6).

Another review characteristic that was assessed was the presence of indications of recency of consumption. This characteristic was measured using two scores assigned to the reviews by LIWC. The mean recency past score (which examines the presence of words and sentences that link to a past experience) of all reviews was 2.35 (2.106), with a minimum of .00 and a maximum of 12.22. The mean recency present score (which examines the presence of words and sentences that link to a current experience) of all reviews was 10.15 (3.228), with a minimum of .00 and a maximum of 24.42 (see Appendix C2, Table 6).

The last stylistic characteristic was length. This characteristic was evaluated with the help of five different measurements. Firstly, the words in the title were counted per review, resulting in a mean of 4.16 words (2.049), with a minimum of .00 words (no recognized words) and a maximum of 11 words. Secondly, the words in the review text were counted per review, resulting in a mean of 152.27 words (139.162), with a minimum of 20 words and a maximum of 1,554 words. Thirdly, the number of words in the total review (title + text) was counted; this measurement had a mean of 156.43 (139.578), with a minimum of 25 words and a maximum of 1,563 words. Fourthly, the average number of words per sentence was calculated for each total review, resulting in an average of 20.39 (13.331), with a minimum of 4.74 and a maximum of 116.08. Finally, a score was assigned for the number of words with more than six letters in the total review, which resulted in an average score of 15.88 (4.770), with a minimum score of 1.79 and a maximum score of 39.39 (see Appendix C2, Table 7).

Perceived helpfulness

In order to assess the perceived helpfulness of each review, the number of helpfulness votes was counted first. Reviews with 0 or 1 helpfulness votes were excluded from collection and further analysis. The mean number of helpfulness votes across all reviews was 7.63 (6.972). The minimum number of helpfulness votes was 2; the maximum was 42 (see Appendix C2, Table 8).

In total, 10,947 helpfulness votes were given on the sampled reviews. Of these votes, 40.4 percent were marked as very helpful, 58.3 percent as helpful, and the remaining 1.3 percent as not helpful (see Appendix C2, Table 9).


4.2 The differences between mobile-written and PC-written reviews

In order to answer the first research question ("How do online product reviews that are written on a mobile phone differ from those that are written on a PC?"), several hypotheses were formulated concerning both basic and stylistic characteristics.

Basic characteristics

The first hypothesis was H1a: Online product reviews will have a more negative valence when they are written on a mobile phone, when compared to reviews that are written on a PC. Standard texts generally state that ordinal data should be analyzed with non-parametric tests (Jamieson, 2004). As the review rating was measured on an ordinal scale, it was decided to use the non-parametric Mann-Whitney test to test this hypothesis.
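As an illustration of the test statistic involved (the actual analysis was run in IBM SPSS Statistics 25), a minimal pure-Python computation of the Mann-Whitney U statistic, applied to made-up toy ratings:

```python
def mann_whitney_u(x, y):
    """Mann-Whitney U statistic for sample x vs sample y (rank-sum form).
    Ties receive average ranks."""
    n1 = len(x)
    pooled = x + y
    order = sorted(range(len(pooled)), key=lambda i: pooled[i])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(order):
        j = i
        # group tied values so they share an average rank
        while j + 1 < len(order) and pooled[order[j + 1]] == pooled[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    r1 = sum(ranks[:n1])  # rank sum of the first sample
    return r1 - n1 * (n1 + 1) / 2

mobile = [5, 3, 4, 2, 5]  # toy ratings, not the thesis data
pc = [4, 4, 5, 3, 5]
print(mann_whitney_u(mobile, pc))
```

In practice the U statistic is converted to a z-score and p-value (as reported below); this sketch only shows how U itself arises from the pooled ranks.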

The review rating of all mobile-written reviews (M = 3.62, SE = .051) did not significantly differ from the review rating of all PC-written reviews (M = 3.70, SE = .049), U = 248,792.500, z = -1.13, p = .26, r = -.03 (see Appendix C3, Table 10, and Figures 19 and 20). The test was also carried out by product category, in order to see whether there was a difference in separation of buying, consumption and evaluation (see Table 3; see Appendix C3, Tables 11 to 14, and Figures 21 to 28). In all product categories except 'Electronics' (where the difference was .03), the mean for mobile-written reviews was lower than the mean for PC-written reviews. However, here too, no significant differences were detected. This means that no support for H1a was found.


Table 3 – Results of the comparison of mobile- and PC-written reviews on rating (divided per product category)

                      Mobile-written reviews    PC-written reviews    Test statistic
                      N     M     SE            N     M     SE        U
Automotive            200   3.52  .094          190   3.73  .088      17,528.500
Books                 106   3.57  .137          291   3.70  .083      14,370.500
Electronics           247   3.81  .081          74    3.78  .131      9,652.000
Home and Appliances   152   3.51  .118          175   3.65  .106      12,622.500

Note: *** p < 0.001, ** p < 0.01, * p < 0.05

Another hypothesis concerning valence is H1b: Online product reviews will contain a more extreme valence score when they are written on a mobile phone, when compared to reviews that are written on a PC. To test this hypothesis, two tests were used: the one-way ANOVA and the Mann-Whitney test.

The results of the one-way ANOVA on the normal rating data (scale 1-5) indicated that there was no significant difference between the extremeness of mobile-written reviews and PC-written reviews (F = 1.229, p = .268) (see Appendix C3, Table 15). Moreover, the Mann-Whitney test on the rating extremeness data (scale 1-3) also indicated no significant difference between the extremeness of mobile-written reviews (M = 2.32, SE = .026) and PC-written reviews (M = 2.34, SE = .025), U = 254,495.500, z = -.396, p = .692, r = -.01 (see Appendix C3, Table 16 and Figures 29 and 30). Again, the test was also executed for each product category, to test for separability of production and consumption (see Table 4; see Appendix C3, Tables 17 to 20, and Figures 31 to 38). This provided mixed results. For the categories 'Automotive' and 'Electronics', the mean for mobile-written reviews was higher than the mean for PC-written reviews, whereas for the categories 'Books' and 'Home and Appliances' it was lower. However, here too, no significant differences were detected, although for 'Electronics' the difference was almost significant (U = 10,267.000, z = 1.782, p = .075). This means that no support for H1b was found.

Table 4 – Results of the comparison of mobile- and PC-written reviews on rating extremeness (divided per product category)

                      Mobile-written reviews    PC-written reviews    Test statistic
                      N     M     SE            N     M     SE        U
Automotive            200   2.23  .051          190   2.22  .051      19,215.000
Books                 106   2.34  .068          291   2.42  .041      14,483.000
Electronics           247   2.36  .042          74    2.22  .073      10,267.000
Home and Appliances   152   2.39  .054          175   2.40  .048      13,289.500

Note: *** p < 0.001, ** p < 0.01, * p < 0.05

Stylistic characteristics

The first hypothesis formulated concerning stylistic characteristics was H2: Online product reviews will contain more emotional text when they are written on a mobile phone, when compared to reviews that are written on a PC. Before this hypothesis could be tested, however, it had to be checked whether there were outliers within the sample and whether certain assumptions were met, as continuous data were used for this hypothesis. Appendix C1, Figure 1, shows that the sample contains a few outliers on emotional text; however, these outliers were not considered so extreme or isolated that they should be deleted.

Moreover, tests were done to check whether the data were normally distributed. The problem with online reviews is that they are often not normally distributed. In particular, the effects of self-selection bias on the extremeness of valence have been studied extensively (e.g. Bhole & Hanna, 2017; Hu, Pavlou, & Zhang, 2017), and Hu, Pavlou and Zhang (2017) stated that there might be an effect on text comments as well. This could cause problems concerning normality, even though the sample of this research is fairly large. Therefore, it was decided not to rely on the central limit theorem without first testing for normality. Although for both samples (mobile-written and PC-written reviews) the histograms, P-P plots and Q-Q plots of the emotional affect score indicated that the data were fairly normal (also see Appendix C4, Figures 39 to 44), the values of skewness and kurtosis and the Kolmogorov-Smirnov and Shapiro-Wilk tests indicated that the scores deviated significantly from normality: for mobile-written reviews, D(705) = .063, p < .001, and W(705) = .964, p < .001; for PC-written reviews, D(730) = .073, p < .001, and W(730) = .959, p < .001 (see Appendix C4, Tables 21 and 22).
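The skewness and kurtosis values mentioned above were taken from the SPSS output. As an illustration only, the underlying moment-based definitions can be sketched in Python; note this uses population moments, whereas SPSS reports bias-corrected estimates, so values differ slightly for small samples.

```python
def skew_and_excess_kurtosis(data):
    """Moment-based skewness and excess kurtosis (both 0 for a perfect normal)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment (variance)
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3  # minus 3 so a normal distribution scores 0
    return skewness, excess_kurtosis
```

A symmetric sample, for instance, yields skewness of exactly zero, which is why skewness values far from zero flag non-normality in the checks described above.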

Field (2013) states that for large samples, the Kolmogorov-Smirnov and Shapiro-Wilk statistics are often significant even when the data deviate only slightly from normality, and that they should therefore always be interpreted in conjunction with, for example, the histogram and the P-P and Q-Q plots. As these indicated a fairly normal sample, it was decided to use the parametric independent-samples t-test. The test statistics for Levene's test for equality of variances were taken into account when interpreting the results.
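Levene's test was likewise run in SPSS. As a sketch of what it computes: the statistic is a one-way ANOVA on the absolute deviations of each observation from its group mean, which for the two device groups reduces to the function below (illustration only; the F-distribution p-value lookup that SPSS performs is omitted).

```python
def levene_statistic(x, y):
    """Levene's W for two groups: ANOVA on absolute deviations from group means."""
    groups = [list(x), list(y)]
    devs = [[abs(v - sum(g) / len(g)) for v in g] for g in groups]
    n = sum(len(g) for g in groups)
    k = len(groups)
    group_means = [sum(d) / len(d) for d in devs]
    grand_mean = sum(sum(d) for d in devs) / n
    between = sum(len(d) * (gm - grand_mean) ** 2
                  for d, gm in zip(devs, group_means))
    within = sum(sum((v - gm) ** 2 for v in d)
                 for d, gm in zip(devs, group_means))
    return (n - k) / (k - 1) * between / within
```

A large W (significant against F with k-1 and n-k degrees of freedom) signals unequal variances, in which case the Welch version of the t-test is the appropriate one to report.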

On average, mobile-written reviews had a slightly higher emotional affect score (M = 6.92, SE = .122) than PC-written reviews (M = 6.88, SE = .107). However, this difference, -.04, was not significant, t(1403.17) = -.27, p = .79, and it represented only a very small effect, d = .01 (see Appendix C4, Tables 23 and 24). Hence, no support was found for hypothesis 2.
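The fractional degrees of freedom (1403.17) indicate that the "equal variances not assumed" (Welch) version of the t-test was used. As an illustrative sketch only, the Welch statistic, its Welch-Satterthwaite degrees of freedom, and Cohen's d (computed here with a pooled standard deviation, one common convention) can be written as:

```python
from math import sqrt

def welch_t_and_cohens_d(x, y):
    """Welch t-statistic, Welch-Satterthwaite df, and Cohen's d for two samples."""
    n1, n2 = len(x), len(y)
    m1, m2 = sum(x) / n1, sum(y) / n2
    v1 = sum((a - m1) ** 2 for a in x) / (n1 - 1)  # unbiased sample variances
    v2 = sum((b - m2) ** 2 for b in y) / (n2 - 1)
    se = sqrt(v1 / n1 + v2 / n2)
    t = (m1 - m2) / se
    # Welch-Satterthwaite degrees of freedom (fractional, as in the SPSS output)
    df = (v1 / n1 + v2 / n2) ** 2 / (
        (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    pooled_sd = sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    d = (m1 - m2) / pooled_sd  # Cohen's d
    return t, df, d
```

A d of .01, as found above, means the group means differ by only about one hundredth of a standard deviation, which is why the difference is described as a very small effect.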

A distinction per product category was also made. As SPSS cannot compute histograms, P-P plots and Q-Q plots for a group within a group (e.g. 'Automotive' within mobile-written reviews), only descriptive statistics and normality tests were executed for both mobile- and PC-written reviews (see Appendix C4, Tables 25 to 28). As for several groups in both device categories the data were not normally distributed, Mann-Whitney tests were carried out (see Table 5 below; see Appendix C4, Figures 45 to 52, and Tables 29 to 32). This again provided mixed results. For the category 'Automotive' the mean for mobile-written reviews was significantly higher than the mean for PC-written reviews, supporting H2 for this product category. For the other categories, no significant differences were found.

Table 5 – Results comparison of mobile- and PC-written reviews on emotional text (divided per product category)

                      Mobile-written reviews     PC-written reviews      Test statistic
                      N     M      SE            N     M      SE         U
Automotive            200   7.25   .230          190   6.41   .191       22,162.500**
Books                 106   7.26   .310          291   7.08   .162       15,701.500
Electronics           247   6.42   .201           74   6.53   .337        8,862.500
Home and Appliances   152   7.07   .270          175   7.20   .251       13,259.500
Note: *** p < 0.001, ** p < 0.01, * p < 0.05

The second hypothesis formulated concerning stylistic characteristics was H3: Online product reviews will contain more details when they are written on a mobile phone, when compared to reviews that are written on a PC. Again, it was checked whether there were outliers within the sample and whether certain assumptions were met. Appendix C1, Figure 2, shows that the sample contains a few outliers on the detail perception score; however, again, these outliers were not considered so extreme that they should be deleted. Tests for normality were also carried out again (see Appendix C4, Figures 53 to 58 and Tables 33 and 34). All tests clearly indicated that the data were heavily skewed. Therefore, the non-parametric Mann-Whitney test was carried out.

On average, mobile-written reviews had a higher detail perception score (M = 3.76, SE = .123) than PC-written reviews (M = 3.07, SE = .100). This difference was significant, U = 285,662.500, z = 3.613, p < .001, which supports H3, although it represented only a small effect, r = .10 (also see Appendix C4, Figures 59 and 60, and Table 35). Again, a distinction per product category was made. Descriptive statistics and normality tests were executed for both mobile- and PC-written reviews (see Appendix C4, Tables 36 to 39). As for several groups in both device categories the data were not normally distributed, Mann-Whitney tests were carried out (see Table 6 below; see Appendix C4, Figures 61 to 68, and Tables 40 to 43). This again provided mixed results. For the category 'Electronics' the mean for mobile-written reviews was significantly lower than the mean for PC-written reviews, contradicting H3 for this product category. The means for 'Automotive' and 'Books' were also lower for mobile-written reviews, although these differences were not significant. For the category 'Home and Appliances', however, the mean for mobile-written reviews was significantly higher than the mean for PC-written reviews, providing support for H3. Hence, overall, these results provide mixed evidence for H3.

Table 6 – Results comparison of mobile- and PC-written reviews on level of detail (divided per product category)

                      Mobile-written reviews     PC-written reviews      Test statistic
                      N     M      SE            N     M      SE         U
Automotive            200   2.71   .162          190   2.78   .147       18,102.000
Books                 106   1.74   .144          291   1.76   .080       15,139.500
Electronics           247   3.88   .181           74   4.70   .344        7,464.000*
Home and Appliances   152   6.33   .336          175   4.86   .260       16,017.500**
Note: *** p < 0.001, ** p < 0.01, * p < 0.05

In order to be able to test H4: Online product reviews will contain more references to recency of consumption when they are written on a mobile phone, when compared to reviews that are written on a PC, two measurements were used. Firstly, each review was assigned a recency
