Online Review Diagnosticity and the Moderating Influence of Product Price

Philipp Ahrens (13359420)
Master's Thesis
Graduate School of Communication
Master's programme Communication Science
Young-shin Lim


Abstract

Online reviews are increasingly important as an information source in customers' decision-making during product purchases. However, research has indicated that online reviews for search and experience products suffer from misalignment between the review rating and the review text, and it has been assumed that this phenomenon reduces online review diagnosticity, also known as online review helpfulness. The present thesis examines the impact of review misalignment and review depth on the review helpfulness rating, moderated by product price. An automated content analysis using SentiStrength was conducted to detect review misalignment and then to investigate these determinants, filling the knowledge gap regarding their possible impact on customers' review helpfulness evaluations. In contrast to review depth, the results suggest that review misalignment has no significant influence. Accordingly, product price moderates only the relationship between review depth and review helpfulness, not that between review misalignment and review helpfulness. It can be concluded that review depth is an important factor for the diagnosticity of online reviews, while review misalignment and product price need further examination.

Keywords: Online Reviews, Review Helpfulness, Review Misalignment, Review Depth


Introduction

With the expansion of online retail platforms over the past decade, the amount of user-generated content about products has increased drastically. Electronic word-of-mouth (eWOM) has become more and more essential for retailers, producers, and customers (Park, Lee, & Han, 2007). In the form of online reviews in particular, eWOM has risen to an important source for gathering information for purchase decisions and for learning about other customers' opinions. Customers perceive online reviews as a trustworthy source because they are created by other users in order to share their experiences and beliefs about products (Pan & Chiou, 2011). They are independent of the producing company and believed to deliver an unbiased and credible picture of the product or service. Research has supported their importance for customers by pointing out that online reviews are one of the most important sources for product evaluations, next to consulting friends and relatives (Quambusch, 2015).

As there is a great variety of products, purchasing situations, and customer types, and as theoretically anybody can write and publish product evaluations, the quality of online reviews is not always consistent and can differ significantly between reviews (Darley, Blankson, & Luethge, 2010). To help judge the usefulness of online reviews, many retailers introduced the "helpfulness vote", with which customers can rate a specific online review as either helpful or not helpful. Useful online reviews are thus peer-reviewed by other customers and displayed at the top of the product page, while low-quality online reviews are relegated to the bottom and are less visible. In research, the usefulness of online reviews is conceptualised as online review diagnosticity. Filieri (2015) refers to the concept of online review diagnosticity as one "which is determined by the positive correlation between information available to a consumer and the decision-making process and is often conceptualised as the degree of helpfulness of information" (p. 1262). This helpfulness is determined mainly by two components of online reviews: the review rating and the review text.


Several studies have indicated a potential misalignment between review ratings and the opinions in review texts, meaning for instance that the review rating reflects an extremely positive or negative opinion while the review text is only moderately formulated (Fazzolari, Cozza, Petrocchi, & Spognardi, 2017; Lak & Turetken, 2014). The problem of misalignment becomes obvious when considering dual-processing theory (O'Keefe, 2008). O'Keefe theorised that individuals process information via either a central or a peripheral route. This matters in particular because customers may lack the motivation or cognitive capacity to read review texts thoroughly and thus process the information only partially. In consequence, they tend to rely on peripheral cues such as the review rating. Misalignment between review rating and review text therefore harbours the possibility of wrong conclusions or confusion during the individual's decision-making process.

Review misalignment depends largely on the product type the online review is written for. For instance, some products rely heavily on sensory experiences, which makes them especially difficult to capture in a single rating. According to the concept of search and experience products (Wan, Nakayama, & Sutcliffe, 2012), experience products are subject to very subjective feelings and the individual's personal taste. When a customer has strong feelings towards a particular product, the likelihood of giving an extreme rating seems higher than for a product that is judged mostly on objective criteria (Geetha, Singha, & Sinha, 2017; Willemsen, Neijens, Bronner, & de Ridder, 2011). Logically, misalignment should also occur for online reviews with great review depth. Such online reviews consist of a very long review text and discuss many different product aspects. The more comprehensive and ambiguous the review text, the more difficult it should be to reflect the opinion correctly in a single rating (Li & Hitt, 2010).

Related to the motivation to process central message cues, the product price is expected to moderate the relationships of review misalignment and review depth with review helpfulness. Customers who are purchasing a high-priced product usually engage in higher involvement during the information search than they do for low-priced products (Dholakia, 2001). As different product prices entail different levels of involvement, customers should have varying expectations about useful information and would judge the helpfulness of an online review differently.

The present thesis aims to examine the potential misalignment as well as the review depth in order to explain their influence on online review helpfulness. Although a fair body of research on online review helpfulness exists, a knowledge gap regarding the influence of misalignment on the review helpfulness rating has to be acknowledged. Furthermore, the role of product price in the relationships of review misalignment and review depth with review helpfulness lacks examination, although an influence on the perception of useful online reviews would be plausible.

Given this research gap, the study aims to answer the following research questions: "How is the helpfulness rating of online reviews for search and experience products affected by review depth and review misalignment? To what extent are the relationships of review depth and review misalignment with the review helpfulness rating moderated by the product price?"

To this end, the thesis conducts an automated content analysis of existing online review data scraped from Amazon.com.

Theoretical Framework

The following section lays out the framework for the thesis. First, the concepts of product type and online review diagnosticity are discussed to build the theoretical foundation for the independent and dependent variables. Second, review misalignment and review depth are specified, followed by their mediating effects on online review diagnosticity. Finally, product price is introduced to account for a possible moderating influence on the relationships of review misalignment and review depth with review diagnosticity.


Independent Variable: Product Type

Generally, products can be distinguished as search or experience products. According to Philip Nelson's schema, this product type differentiation is based on attributes that determine the ability to obtain useful information about product quality before trial or purchase (Nelson, 1970).

Search products are characterised by features that are relatively easy to evaluate before the actual purchase. Customers can easily acquire the information necessary to evaluate and compare different products in order to draw conclusions about quality and fit for purpose. Product evaluations are predominantly based on objective aspects and key figures, such as price, technical data, or physical properties. For instance, technical products like smartphones or cameras can be compared according to their memory capacity, display brightness, size, battery life, or the megapixel count of their camera lenses.

In contrast, experience products are difficult to measure in numerical terms, as the customer experience for this product type weighs more heavily than performance data. Because such products are judged very subjectively and depend highly on personal taste, they are difficult to summarise and compare pre-purchase. Often customers' senses or affective evaluative cues, such as aesthetic aspects, are required to judge the product as a whole. Due to these sensory experiences, customers are more likely to develop some kind of personal relationship with the product than with search products. For potential buyers, it is therefore difficult to assess the quality without using or experiencing the product beforehand. This includes conventional experience products like books, movies, or music CDs, but also technological products such as cars, virtual reality devices, or software.

Dependent Variable: Online Review Diagnosticity

To account for the factors influencing online review informativeness, Chua and Banerjee developed the model of "online review diagnosticity". Diagnosticity is rooted in psychological terminology and refers to the value of information gained from an interaction, feedback, or event by a person seeking self-knowledge. In this case, diagnosticity means the value of online reviews in providing information and helping individuals gain knowledge about relevant products. Usually, online reviews have several components, of which two are relevant for the customer to evaluate the product: the review rating and the review text. Studies found support that online review diagnosticity is influenced through the interaction of some of these components, in particular review rating, review depth, review readability, the profile of the review author, and the product type (Chua & Banerjee, 2014). The components relevant to the current study are explained in the following sections. For simplicity, online review diagnosticity will be referred to as online review helpfulness throughout the remainder of the thesis.

Review Misalignment

Review misalignment arises when the sentiment of the review text and the review rating are not reconciled with each other and indicate different opinions. The review text and the review rating are the two components customers use to express their opinion.

Review texts are the main component of online reviews and give customers the possibility to describe their product evaluation or experience as they prefer. The usefulness of review texts is mainly explained through review readability and review depth, both major determinants of the review helpfulness rating (Chua & Banerjee, 2014). While the former measures the comprehensibility of the review text, the latter captures the length of the review text and is discussed in more detail later on. The sentiment strength and emotional direction of a review text can be calculated and aggregated into a sentiment score.

The review rating is mostly implemented as a scale, which basically means that customers can disclose their opinion post-purchase on a numerical scale, with five being the best possible rating (Metzger, Flanagin, & Medders, 2010). For instance, they may indicate extreme dissatisfaction or disapproval using a low rating such as 1 or 2, while approval or extreme satisfaction may be indicated by high ratings such as 4 or 5. Similar to five-point Likert scales, 3 functions as a midpoint to account for moderate or neutral opinions. Such ratings either reflect an indifferent position or a review text that summarises positive and negative comments balancing out to a neutral opinion.

Research has pointed out that agreement between review rating and review text is not guaranteed in online reviews (Lak & Turetken, 2014). While review ratings can be skewed towards the extremes, sentiment scores were found to be mostly normally distributed around middle values. This indicates that online reviews tend to contain misalignment between the two indicators, since they are distributed fundamentally differently. Research confirmed these results by using different sentiment analysis techniques to point out mismatches between text scores and rating scores in hotel reviews (Fazzolari et al., 2017). With regard to product types, studies indicated that experience products might attract significantly more online reviews with misalignment than search products. Furthermore, greater misalignment seems to occur for online reviews with very high review ratings, which suggests that especially extremely positive review ratings might not be a reliable indicator of the actual customer opinion (Mudambi, Schuff, & Zhang, 2014). These findings align with the initial assumption about the subjectivity and ambiguity of experience products, which suggests that capturing them in a numerical rating is difficult. Although online reviews for search products are less ambiguous and less prone to misalignment, their review ratings and review texts still seem to differ significantly.

Thus, we can assume that online reviews for experience products are more likely to contain misalignment than those for search products.

H1: "Online reviews for experience products show greater misalignment between review text and review rating than online reviews for search products."


Review Depth

The characteristic review depth refers to the number of arguments about the product discussed by the customer. Simply put, it refers to the overall length of an online review and is an important measure of its quality (Chua & Banerjee, 2015). It can be assumed that greater review depth delivers a more extensive evaluation of product aspects and, subsequently, a better justification of the customer's opinion. Following the assumptions discussed for Nelson's product type schema, it can be assumed that customers tend to write longer online review texts to account for the subjectivity and ambiguity of experience products. In order to discuss all aspects sufficiently, online reviews for experience products should generally have a greater review depth than online reviews for search products. The latter can mostly be summarised by their key figures, since they usually do not involve strong subjective feelings and are less likely to be discussed in an extensive manner.

H2: “Online reviews for experience products have a greater review depth than online reviews for search products.”

Mediating Influence of Review Misalignment

Misalignment between the review text and the review rating presents deviating opinions about the product and should be perceived negatively by customers.

Considering the ELM, argument quality is an important factor in the persuasiveness of a message (O'Keefe, 2008). Argument quality is defined as the audience's perception of the argument and is, as in dual-processing theory, a central cue of persuasion. Argument quality has been shown to influence the believability of online reviews and is expected to elicit positive responses when it is strong (Cheung, Sia, & Kuan, 2012). When online reviews provide customers with well-discussed product evaluations and contain appropriate justifications, they should be perceived as more helpful than online reviews with low argument quality.


Information consistency is a heuristic cue and plays an important role in the adoption of knowledge in online communities. It describes the extent to which processed information aligns with the customer's prior knowledge (Zhang & Watts, 2003). When processed information is not consistent with prior knowledge, the individual is likely to perceive the message as untrue or invalid. With regard to online reviews, research indicated that this effect also occurs when online review information is inconsistent or deviates from similar messages (Cheung et al., 2012). Thus, when online reviews are not aligned with most other online reviews, they are likely to be considered less credible. Logically, information consistency should also apply within a single online review, supposing customers actively process both the review text and the review rating. When a customer reads a moderate review text but the review rating is extremely positive or negative, the given information appears mutually inconsistent.

Taking both factors into account, it seems plausible that review misalignment reduces the usefulness of online reviews for customers. Since review rating and review text are both meant to reflect the product evaluation, customers should perceive misalignment as information inconsistency, because the online review contains contradictory or deviating viewpoints. Additionally, the argument quality of the online review should be negatively affected, since inconsistent information does not reflect a strong and well-thought-out argumentation about the product. Overall, the believability of such an online review would decrease, and consequently the review becomes less helpful for the customer.

Therefore, when online reviews are aligned in their review rating and review text, they are more likely to be perceived as helpful and credible. Customers aim to find a useful source for their decision-making and vote online reviews as helpful when they meet their expectations. In contrast, when they find a product evaluation inconsistent or contradictory, they are unlikely to do so. Logically, the stronger the misalignment, the stronger the decrease in online review helpfulness should be, since the online review obviously contains more ambiguous information about the product evaluation.

H3: “Misalignment between review text and review rating is negatively related to the helpfulness rating.”

Mediating Influence of Review Depth

In contrast to review misalignment, greater review depth is positively associated with perceived online review helpfulness, for both experience and search products (Chua & Banerjee, 2014). Online reviews with great depth are assumed to provide customers with more comprehensive and better-argued product evaluations than review texts of low depth. Consequently, they are more likely to receive higher numbers of helpfulness votes (Metzger, 2007). However, it has been shown that for experience products greater depth increases helpfulness voting less strongly than it does for search products of the same depth (Mudambi & Schuff, 2010). Online reviews of a certain depth are perceived as more competent and adequate than overly short ones. Still, there is a threshold for both search and experience products beyond which online reviews with very high review depth are perceived as less helpful. This might be because such online reviews are perceived as too time-consuming to read, or because the quality of argumentation decreases at some point (Chua & Banerjee, 2015).

For customers seeking to gain as much knowledge as possible, this means an online review with great depth would be a more valuable source of information for the decision-making process than an online review with low depth.

H4: “There is a positive relationship between the review depth and the review helpfulness rating.”


Product Price as Moderator for Review Misalignment

The relationships of review misalignment and review depth with the helpfulness rating are expected to be moderated by the product price, as price may shape expectations about useful information.

Information search can be distinguished by the amount of involvement a customer puts into the product purchase decision (Petty & Cacioppo, 1983; Zaichkowsky, 1986). Under high involvement, the individual is likely to pay more attention to details and central message cues, while low-involvement decisions are predominantly based on peripheral cues that are processed less deliberately. The latter can be formed from message aspects such as source expertise or aggregated opinions such as the review rating.

The involvement level varies with the individual's internal motivation, which is influenced by product aspects that entail a higher risk of failure (Solomon, Russell-Bennett, & Previte, 2012). For instance, product price has been shown to relate to the individual's involvement level across many purchase situations. Low-involvement products are inexpensive and likely to be purchased frequently or routinely, while high-involvement products are characterised by less impulsive buying and deliberately planned purchases (Peter & Olson, 1990). Due to their complexity or relatively high cost, customers face higher risks should the purchase fail, which means they preventively put more effort into the information search. Purchase decisions for high-involvement products are not routine for the individual.

Taking the existing body of research into account, evidence indicates that product prices can influence the importance of online review aspects during the purchase decision. Thus, these aspects also have varying impact on online review helpfulness when the product price changes (Baek, Ahn, & Choi, 2012). For instance, the online review helpfulness rating for search and experience products was shown to be more strongly influenced by central cues when the product is high-priced than when it is low-priced. As customers aim to reduce the likelihood of negative consequences, they are highly involved and collect as much information as possible, in this case by reading product evaluations. They thoroughly consider the review text, as it usually contains high-quality arguments. Since customers examine these online reviews more carefully, misalignment should logically be more obvious and leave them with the impression of contradictory viewpoints within the online review. Therefore, for high-priced products online review helpfulness should be more strongly influenced by review misalignment than for low-priced products.

In contrast, peripheral cues are more important for customers with low involvement in the purchase process. They are not inclined to engage strongly in the information search and therefore focus on review ratings rather than spending effort on reading review texts carefully. Thus, misalignment in online reviews should be less obvious to them, and the online review helpfulness for low-priced products should be less strongly influenced.

H5: "The influence of review misalignment on the helpfulness rating is stronger when the product is high-priced than when it is low-priced."

Product Price as Moderator for Review Depth

Logically, the high- and low-involvement argument also applies to the relationship between review depth and online review helpfulness. As customers of high-priced products are highly involved, they would consider online reviews that provide more information as more useful. Since great review depth is associated with a higher number of arguments and a more well-thought-out discussion, online reviews with greater depth would be perceived as more helpful than online reviews with lower depth. Consequently, the review helpfulness rating for high-priced products would be more strongly influenced by online reviews with great depth than by online reviews with low depth. In contrast, as customers of low-priced products avoid spending much time reading long review texts and focus on peripheral cues as well as on online reviews with only a few words, the review helpfulness rating for those products would be more strongly influenced by online reviews with less depth.

H6: "The effect of review depth on the helpfulness rating is stronger when the product is high-priced than when it is low-priced."

The conceptual model of the variables is shown in Figure 1.

[Figure 1 – Conceptual Model: Product Type relates to Misalignment (Rating/Text) (H1) and Review Depth (H2), which in turn relate to the Helpfulness Rating (H3, H4); Product Price moderates the latter two relationships (H5, H6).]

Methods

In order to answer the research questions, the present thesis conducts an automated content analysis of online reviews that contain a review helpfulness vote. The scraping and content analysis are performed using a Python script, while the statistical analysis is done with R and SPSS.

As the source of online reviews, this study relied on Amazon.com, which is known as the largest online retailer and assumed to provide a relatively representative sample due to its high number of users. Online reviews on Amazon.com are publicly available and contain a helpfulness rating that can be used to determine online review diagnosticity. Data collection was done using a Python script configured to scrape the relevant data from each Amazon product page. The collection covered all English-language online reviews from each product's release date on Amazon.com until the time of collection during the first week of June.
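The thesis does not reproduce the script itself. The following is a minimal sketch of one such scraping step, assuming the requests and BeautifulSoup libraries; the URL pattern and the data-hook selectors are illustrative assumptions about Amazon's markup, which changes frequently and rate-limits automated access.

import requests
from bs4 import BeautifulSoup

HEADERS = {"User-Agent": "Mozilla/5.0"}  # plain requests are often blocked

def scrape_review_page(product_id: str, page: int) -> list[dict]:
    """Collect the review fields used in the thesis from one review page.
    The URL pattern and CSS selectors are assumptions, not taken from the thesis."""
    url = f"https://www.amazon.com/product-reviews/{product_id}?pageNumber={page}"
    html = requests.get(url, headers=HEADERS, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for node in soup.select("div[data-hook='review']"):  # assumed selector
        votes = node.select_one("span[data-hook='helpful-vote-statement']")
        first = votes.get_text().split()[0].replace(",", "") if votes else "0"
        records.append({
            # "5.0 out of 5 stars" -> 5.0
            "rating": float(node.select_one(
                "i[data-hook='review-star-rating']").get_text().split()[0]),
            "text": node.select_one(
                "span[data-hook='review-body']").get_text(strip=True),
            # "98 people found this helpful" -> 98; Amazon spells out "One"
            "helpful_votes": int(first) if first.isdigit() else 1,
        })
    return records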

To determine the relevant products, a three-step approach was applied to create an appropriate sample of search and experience products. Firstly, 12 product categories for experience and search products were collected and matched with the most comparable best-selling product categories on Amazon.com. In accordance with Willemsen et al. (2011), Chua and Banerjee (2016), Nelson (1970), and Weathers et al. (2007), the categories laser printers, point & shoot digital cameras, cell phones, toasters, laptops, and USB flash drives were considered as search products. The experience product categories covered espresso machines, sunscreen, skin care, men's wrist watches, televisions, and chairs, based on earlier research (Mudambi & Schuff, 2010; Mudambi et al., 2014; Baek et al., 2012). In total, 960 items covering the top 80 products of each category were collected. Afterwards, products without a price tag were excluded, so the following sample was retained: cell phones (79), chairs (75), laser printers (66), men's wrist watches (75), point & shoot digital cameras (80), sunscreen (78), skin care (79), televisions (63), toasters (78), and USB flash drives (78).

Secondly, based on the mean price, standard deviation, and range of each product category, the price levels "0 to 28 USD", "29 to 85 USD", and "86 USD and above" were determined to be most representative of low-priced, average-priced, and high-priced products. Across the product categories, most prices fell within these levels, so it can be assumed that this categorisation reflects the concept of low- and high-involvement products. For each price level, one search and one experience product category that matched closest in mean price were chosen. In this way, a sample of six product categories of search and experience products distributed across the price levels was created. In the end, the product categories USB flash drives, skin care, toasters, watches, digital cameras, and televisions were retained, from which three products each were randomly selected for the final sample (see Table 1). In this way, the pool was shortlisted to a total of 18 product items, nine experience and nine search products. Random selection was necessary to obtain an ideally unbiased sample for each price level.

Table 1 - Sample of search and experience products

Low-priced (0 to 28 USD)
  Search (USB flash drives): SanDisk Ultra Flair 32GB 3.0 ($11.99); SanDisk 16GB 2.0 Flash Cruzer ($6.99); SanDisk Ultra Fit 64GB USB 3.0 ($18.99)
  Experience (skin care): H2Ocean 4oz Piercing Aftercare ($12.00); Eye Cream Moisturizer (1oz) 94% ($17.99); NYX Professional Setting Spray ($5.99)

Average-priced (29 to 85 USD)
  Search (toasters): Cuisinart CPT-180BKS Metal ($66.73); Oster Long Slot 4-Slice Toaster ($52.50); Cuisinart CPT-435 Countdown ($79.96)
  Experience (watches): Invicta Men's 8926OB Pro Diver ($83.06); Seiko Men's SNK805 ($64.99); Seiko Men's SNK807 ($59.39)

High-priced (86 USD and above)
  Search (digital cameras): Canon PowerShot SX620 ($229.00); Sony DSCHX80/B High Zoom ($368.00); Canon PowerShot SX720 HS ($330.95)
  Experience (televisions): TCL 32D100 720p LED TV ($179.99); Samsung Electronics UN32J4001 ($209.00); Sony KD43X720E 43-Inch 4K ($599.00)

Lastly, during data collection in the first week of June, all available online reviews for each product were scraped. In total, 30,210 online reviews were collected, consisting of 9,523 USB flash drive reviews, 8,268 skin care reviews, 5,740 toaster reviews, 5,648 watch reviews, 661 digital camera reviews, and 481 television reviews. After applying automated language detection to the review texts, 2,314 non-English reviews and 24 duplicates were excluded from the sample, leaving a total of 27,982 online reviews (8,377 USB flash drive reviews, 7,787 skin care reviews, 5,404 toaster reviews, 5,348 watch reviews, 633 digital camera reviews, and 433 television reviews). Each record included the following variables: review text, date of posting, review rating, review author, review title, review helpfulness rating, verified purchase, and product price.
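The language filter is not specified further in the thesis; the following is a minimal sketch of the cleaning step, assuming the langdetect library, pandas, and hypothetical column names.

import pandas as pd
from langdetect import detect, LangDetectException  # assumed library choice

def clean_reviews(df: pd.DataFrame) -> pd.DataFrame:
    # Drop exact duplicates, then keep only reviews detected as English.
    # The duplicate key and column names are illustrative assumptions.
    df = df.drop_duplicates(subset=["review_author", "review_text"])
    def is_english(text: str) -> bool:
        try:
            return detect(text) == "en"
        except LangDetectException:  # empty or undetectable text
            return False
    return df[df["review_text"].map(is_english)]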

Measurement

The independent variable product type is measured at the categorical level and contains the categories "search product" and "experience product". Additional variables had to be calculated from the scraped data, namely the sentiment score, the misalignment score, and the review depth.

The first mediating variable, misalignment between review rating and review text, is calculated as the discrepancy between the rating and the sentiment score. While the former is incorporated as the given numerical scale indicating the customer's satisfaction from 1 ("I hate it") to 5 ("I love it"), the sentiment score is calculated by means of an automated sentiment analysis tool. The opinion-mining tool SentiStrength has been applied in multiple research projects (Garas, Garcia, Skowron, & Schweitzer, 2012; Kucuktunc, Cambazoglu, Weber, & Ferhatosmanoglu, 2012; Pfitzner, Garas, & Schweitzer, 2012; Thelwall, Buckley, & Paltoglou, 2011) and has consistently been evaluated favourably for the analysis of short online texts and social media content (Thelwall, Buckley, Paltoglou, Cai, & Kappas, 2010; Thelwall, Buckley, & Paltoglou, 2012).

Firstly, based on the occurrence of negative and positive words, SentiStrength determines one score on a scale from -1 to -5 for negative emotions and one score from 1 to 5 for positive emotions. Both indicate the strength of the particular emotion present in the review text. For instance, a negative score of -5 means a very high share of negative emotions, while a positive score of 5 means a very high share of positive emotions. Secondly, the two scores were offset against each other to create a single sentiment score. In contrast to the five-point review rating, the summed sentiment score lies on a nine-point scale from -4 to 4. In order to compare it to the review rating, the scale was shifted to 1 to 9 and then standardised to 1 to 5 using a linear transformation. The resulting values were rounded to the closest integer. In this way, comparable scales for both review rating and review text were obtained. Lastly, the sentiment score is subtracted from the review rating and saved as an absolute value to create the variable "misalignment score", the first mediator. Thus, the distance between review text and review rating can be measured at the interval level.
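Expressed in code, the transformation chain looks as follows. This is a sketch that assumes the two SentiStrength scores are already available; half-up rounding is an assumption, since the thesis only states that values were rounded to the closest integer.

import math

def misalignment_score(rating: int, pos: int, neg: int) -> int:
    # pos is SentiStrength's positive strength (1..5),
    # neg its negative strength (-1..-5)
    summed = pos + neg                   # nine-point scale, -4 .. 4
    shifted = summed + 5                 # shift to 1 .. 9
    sentiment = math.floor(1 + (shifted - 1) / 2 + 0.5)  # map 1..9 to 1..5, round half-up
    return abs(rating - sentiment)

# Example: a 5-star rating over a mildly positive text (pos=2, neg=-1)
# gives summed=1, shifted=6, sentiment=4, misalignment=1.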

The second mediator, review depth, is determined as the text length based on a word count for each specific online review and is measured at the ratio level. The word count is calculated using a Python function.
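The thesis does not specify the tokenisation; a minimal sketch assuming simple whitespace splitting:

def review_depth(text: str) -> int:
    # whitespace tokenisation is an assumption; the thesis only states
    # that the word count is computed with a Python function
    return len(text.split())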

The moderator product price is measured in the products' original prices and categorised as "low-priced", "average-priced", or "high-priced" depending on the price level (see Table 1).
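A sketch of this categorisation, with the bin edges taken from Table 1:

def price_level(price_usd: float) -> str:
    if price_usd <= 28:
        return "low-priced"
    if price_usd <= 85:
        return "average-priced"
    return "high-priced"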

Finally, the dependent variable online review helpfulness rating is measured as a ratio variable determined by the total number of helpfulness votes for each specific online review. Amazon.com users can indicate helpfulness by answering the question "Was this review helpful to you?" via the "Yes" button displayed at the bottom of the online review text. It has to be acknowledged that the review helpfulness rating might be biased due to the frequency count that Amazon reports for each online review. For instance, very popular online reviews are accompanied by information such as "98 people found this helpful" and are more likely to be displayed above other online reviews. Thus, they might be more likely to be seen and to influence users to find the review helpful as well. Online reviews that were not voted as helpful are displayed without any information about the helpfulness vote frequency.


In order to test the hypotheses, different statistical tests were performed. First, independent-samples t-tests examined whether there are significant differences in review depth and in misalignment scores between search and experience products. Second, a multiple linear regression was performed to predict the effect of review misalignment and review depth on the review helpfulness rating. Third, using model 14 of the PROCESS macro for SPSS (Hayes, 2018), the moderation of product price on the relationship between review misalignment and the review helpfulness rating was tested; the same model also tested the moderation of the relationship between review depth and the review helpfulness rating.
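The thesis ran these tests in R and SPSS; the following is a rough equivalent in Python with scipy and statsmodels, assuming a pandas DataFrame df with hypothetical columns product_type, misalignment, depth, price, and helpfulness. Note that plain interaction terms only approximate PROCESS model 14.

import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

def run_tests(df: pd.DataFrame):
    exp = df[df["product_type"] == "experience"]
    sea = df[df["product_type"] == "search"]

    # H1/H2: independent-samples t-tests between product types
    t_mis = stats.ttest_ind(exp["misalignment"], sea["misalignment"])
    t_dep = stats.ttest_ind(exp["depth"], sea["depth"])

    # H3/H4: linear regression of helpfulness on misalignment and depth
    ols = smf.ols("helpfulness ~ misalignment + depth", data=df).fit()

    # H5/H6: price moderation approximated via interaction terms
    mod = smf.ols("helpfulness ~ misalignment * price + depth * price",
                  data=df).fit()
    return t_mis, t_dep, ols, mod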

The relevant descriptive statistics for the variables are summarised in Table 2. For the moderator product price, the following numbers of online reviews were obtained: 16,164 (7,787 experience products, 8,377 search products) for low-priced, 10,752 (5,348 experience products, 5,404 search products) for average-priced, and 1,066 (433 experience products, 633 search products) for high-priced products.

Table 2 - Descriptive statistics for variables

                     Full sample         Search products     Experience products
                     (N = 27,982)        (n = 14,414)        (n = 13,568)
Variable             Mean      SD        Mean      SD        Mean      SD
Review Rating        4.10      1.37      3.99      1.45      4.21      1.26
Sentiment Score      3.24      0.74      3.18      0.75      3.31      0.72
Misalignment Score   1.28      1.28      1.27      1.27      1.28      1.28
Review Depth         44.07     62.19     37.57     56.33     51.00     67.18
Helpfulness Rating   1.58      18.46     1.46      15.45     1.70      21.20

Results


H1 assumed that online reviews for experience products have greater misalignment than online reviews for search products. An independent-samples t-test on the misalignment score showed that misalignment for experience products (M = 1.28, SD = 1.28, n = 13,568) is significantly greater than for search products (M = 1.27, SD = 1.27, n = 14,414), t(27980) = 2.03, p < .05, d = .03, 95% CI [0.00, 0.04]. The effect size is small. Two paired t-tests were additionally performed to check whether there is a significant difference between the review rating and the sentiment score for each product type. For experience products, the mean review rating (M = 4.21, SD = 1.26, n = 13,568) was significantly greater than the mean sentiment score (M = 3.31, SD = 0.72, n = 13,568), t(13567) = 89.725, p < .001, 95% CI [0.88, 0.92]. For search products, the mean review rating (M = 3.99, SD = 1.45, n = 14,414) was significantly greater than the mean sentiment score (M = 3.18, SD = 0.75, n = 14,414), t(14413) = 78.375, p < .001, 95% CI [0.78, 0.83]. After establishing that review rating and review text differ significantly, the misalignment score capturing the difference between review rating and sentiment score was used to examine whether this difference varies between product types.

H2 assumed that online reviews for experience products have a greater review depth than online reviews for search products. The word count of online reviews was significantly greater for experience goods (M = 50.99, SD = 67.18, n = 13,568) than for search goods (M = 37.57, SD = 56.33, n = 14,414), t(26529) = 18.057, p < .001, d = .22, 95% CI [0.19, 0.24]. The hypothesis was supported, and the effect size is small to moderate.

H3 assumed a negative relationship between review misalignment and the review helpfulness rating. The linear regression found no significant relationship between misalignment and the helpfulness rating of online reviews, F(1, 27981) = 373.639, p = .084, R² = .026; the hypothesis was therefore not supported. The slope coefficient for misalignment was -.252, meaning the helpfulness rating would decrease by .252 units per unit of misalignment, but the R² of .026 indicates a poor model fit.

H4 assumed that review depth predicts the online review helpfulness rating. This assumption was supported by the results. There was a significant relationship between review depth and the helpfulness rating of online reviews, F(1, 27981) = 373.639, p < .001, R² = .026. The predicted helpfulness rating is equal to -0.844 + 0.048 × word count; the helpfulness rating increases by 0.048 for each additional word in the review text.

Additionally, a simple linear regression was used to test the relationship specifically for experience products. There was a significant relationship between review depth and the helpfulness rating of online reviews for experience products, F(1, 13566) = 269.1, p < .001, R² = .02. The predicted helpfulness rating is equal to -0.544 + 0.044 × word count; the helpfulness rating increases by 0.044 for each additional word in the review text. The regression graph indicates a peak and subsequent decline of the helpfulness rating when the review text exceeds 750 words. However, review depth explains only 2% of the total variance in the review helpfulness rating.

Complementarily, a simple linear regression was used to test the relationship for search products. There was a significant relationship between review depth and the helpfulness rating of online reviews for search products, F(1, 14412) = 573.5, p < .001, R² = .04. The predicted helpfulness rating is equal to -0.574 + 0.053 × word count; the helpfulness rating increases by 0.053 for each additional word in the review text. The regression graph indicates a peak and subsequent decline of the helpfulness rating when the review text exceeds 511 words. However, review depth explains only 4% of the variance in the review helpfulness rating. The results point out that there is a stronger relationship between review depth and the review helpfulness rating for search products than for experience products.

In H5, a moderation was assumed that influences the relationship between review misalignment and the review helpfulness rating. Additionally, in H6 a moderation was assumed to influence the relationship between review depth and the review helpfulness rating. Regression analyses with product price as moderator were performed using model 14 of the PROCESS macro for SPSS to test H5 and H6. The analysis revealed that the moderation of the relationship with the review helpfulness rating was not significant for review misalignment, but significant for review depth. It was assumed that the influence of misalignment on the review helpfulness rating would be stronger when the product is high-priced than when it is low-priced. Results indicated that the relationship between misalignment and the review helpfulness rating is not significantly moderated by the product price (b = -.001, SE = .003, p = .687). Furthermore, it was assumed that the influence of review depth on the review helpfulness rating would be stronger when the product is high-priced than when it is low-priced. Results indicated that this relationship is significantly moderated by the product price (b = -.001, SE = .001, p < .05). Simple slopes were calculated at 1 SD below (5.99 Euro) and 1 SD above (100.78 Euro) the mean product price to examine whether the interaction is aligned with the hypothesised direction. As expected, the slope of the relationship between review depth and the review helpfulness rating was slightly weaker for low-priced products (b = .043, SE = .002, p < .001) than for high-priced products (b = .048, SE = .002, p < .001). At the highest price of 498.00 Euro, the influence was strongest (b = .067, SE = .008, p < .001).

Discussion

Taking the results into account, three key findings stand out. Firstly, it can be concluded that misalignment between review rating and review text exists and, furthermore, that it is greater for experience products than for search products. These findings support previous research indicating that there is a gap between review ratings and review texts (Fazzolari et al., 2017; Geetha et al., 2017; Lak & Turetken, 2014). This could possibly be caused by the difficulty of narrowing ambivalent product evaluations down to a scale from one to five, especially for highly subjective products. In general, customers seem to evaluate both product types more favourably in review ratings than in review texts, which could mean that ratings draw a biased picture of product evaluations. However, the results indicate that review misalignment does not influence the usefulness of online reviews. Since the review helpfulness rating is not affected by misalignment, customers are either unaware of or indifferent to differing opinions within an online review. Possibly, customers do not notice misalignment, or if they do, it might not be important enough to be irritating or disturbing. Thus, the initial assumption that misalignment negatively influences the perceived helpfulness of online reviews has to be rejected.

Secondly, and in contrast to misalignment, review depth seems to be an important factor for the helpfulness of online reviews. In accordance with previous research, the number of words an online review contains has a significant influence on the helpfulness votes it receives; more precisely, the more words the text contains, the more helpful it is perceived to be (Chua & Banerjee, 2014; Filieri, 2015; Lee & Choeh, 2016). Conceivably, such online reviews discuss a higher number of product aspects than reviews with low depth and therefore contain more useful information for the purchase-decision process. When consumers are moderately involved in the decision-making process and inclined to actively read the review text, they might be more likely to consider those online reviews a useful basis for their decision and evaluate them as helpful. However, plotting the regression graph indicates that both product types have a certain threshold up to which review depth has a positive influence on the review helpfulness rating. Online reviews for search products are probably perceived as most helpful around 511 words, after which helpfulness tends to decline with each additional word. Online reviews for experience products appear to have this peak around 750 words. It might be that customers perceive very long online reviews as too long to read and are unwilling to spend the cognitive or time resources to read the review texts. The finding that the two product types seem to have different turning points aligns with the findings for H2, which indicate that online reviews for experience products generally have a higher word count than those for search products. As with misalignment, this can be explained by the more subjective product evaluations compared to search products, which might incline consumers to expend more resources and effort in the decision process.

Thirdly, the results suggest that product price moderates the relationship between review depth and the review helpfulness rating, but not the relationship between misalignment and the review helpfulness rating. Online reviews for high-priced products show a stronger correlation between review depth and helpfulness rating, which indicates that for more expensive products lengthier online reviews are considered more helpful. However, the difference between low-priced and high-priced products was only marginal. In theory, as high-priced products mainly require relatively high involvement, customers tend to perceive reviews with adequate depth as more helpful than online reviews that might lack important information. In contrast, online reviews for low-priced products seem to be perceived as helpful even with less review depth. This effect could not be found for misalignment, which is plausible given that no significant relationship was visible beforehand. The results merely confirm that there are no concealed relationships within the particular price levels. Thus, it can be assumed that misalignment between review rating and review text does not play a relevant role for helpfulness, while the importance of other characteristics is confirmed.

The present study has several theoretical and practical implications, in light of which the results have to be considered.

With regard to online review diagnosticity, the thesis aimed to provide support for the model and to gain knowledge about potentially influencing factors for online review helpfulness. The results support the influence of product type and review depth and generally confirm the model. There was an indication of a slight moderating influence of product price on review depth, which had not previously been considered but could be relevant as well. However, this does not seem to apply to misalignment in online reviews. Further research should focus on product price, to either confirm or reject its influence on the review helpfulness rating, and examine other variables within the online review diagnosticity model. Additionally, review misalignment should be examined using different sentiment analysis tools to confirm the results.

From a practical point of view, the results provide implications for customers and retailers. In accordance with previous research, it could be shown that different factors of online reviews matter for their usefulness as an information source for customers, and also that review misalignment is an existing phenomenon (Mudambi & Schuff, 2010; Mudambi et al., 2014). Especially when using online reviews as an information source for experience products, customers are well advised not to rely solely on review ratings as indicators of product evaluations. Although review misalignment does not have a significant influence on the review helpfulness rating, it could still lead to confusion or wrong conclusions during the purchase process.

For retailers, review misalignment could mean that review ratings measuring the product experience on only one scale might not be as accurate as multi-dimensional scales. This was already indicated in previous research, which confirmed that uni-dimensional ratings might be less suitable for reducing uncertainty among customers (Tunc, Cavusoglu, & Raghunathan, 2017). Retailers might need to improve existing online review systems to account for differences in product types and offer suitable possibilities for product evaluations.

The study points out that online reviews for search and experience products on Amazon.com differ in their review misalignment as well as in their review depth. For experience products, review misalignment and review depth are significantly greater than for search products. However, while for neither product type do the differences in review misalignment have an impact on the review helpfulness rating, review depth positively influences the review helpfulness rating up to a certain point. This effect was marginally moderated by product price for the relationship between review depth and review helpfulness rating, but not for review misalignment and review helpfulness rating. Thus, the higher the product price, the more helpful online reviews with greater depth are perceived to be. These results imply that customers may find different online review aspects important when using online reviews as an information source, but that they do not tend to find review misalignment irritating. However, review misalignment exists, and retailers as well as customers need to be aware of it.

Limitations & Implications

Several limitations have to be mentioned when considering the results for further research.

Foremost, with regard to the sample, it has to be acknowledged that the data are only representative of Amazon.com and therefore should not be generalised to other markets or retail platforms. Additionally, the sample distribution was skewed towards low-priced products, as they generally receive more online reviews than high-priced products. Online reviews for high-priced products were under-represented, which could be because those products are bought less frequently than low-priced products, or because customers tend to purchase and evaluate them not on Amazon.com but on specialised websites or in physical stores.

Secondly, the sentiment analysis was conducted at the document level, which means that, in contrast to aspect-level analysis, it does not consider specific entities mentioned within the sentences; it only classifies the whole text as positive, negative, or neutral. As consumers tend to make comparisons with other models or products from other brands, the sentiment analysis cannot distinguish between those situations and would still treat them as opinions about the initial product. In addition, since only one sentiment analysis tool was used to determine the sentiment tone of the review texts, there could have been mistakes or patterns of misinterpretation that were not identified and that biased the results. To minimise this threat, using an additional sentiment analysis tool for verification would have been advisable, but this was not feasible within the scope of the study.


References

Baek, H., Ahn, J., & Choi, Y. (2012). Helpfulness of online consumer reviews: Readers' objectives and review cues. International Journal of Electronic Commerce, 17(2), 99-126. doi:10.2753/JEC1086-4415170204

Cheung, C. M., Sia, C., & Kuan, K. K. (2012). Is this review believable? A study of factors affecting the credibility of online consumer reviews from an ELM perspective. Journal of the Association for Information Systems, 13(8), 618.

Chua, A. Y. K., & Banerjee, S. (2014). Developing a theory of diagnosticity for online reviews.

Chua, A. Y. K., & Banerjee, S. (2015). Understanding review helpfulness as a function of reviewer reputation, review rating, and review depth. Journal of the Association for Information Science and Technology, 66(2), 354-362. doi:10.1002/asi.23180

Darley, W. K., Blankson, C., & Luethge, D. J. (2010). Toward an integrated framework for online consumer behavior and decision making process: A review. Psychology and Marketing, 27(2), 94-116. doi:10.1002/mar.20322

Dholakia, U. M. (2001). A motivational process model of product involvement and consumer risk perception. European Journal of Marketing, 35(11), 1340-1362.


Fazzolari, M., Cozza, V., Petrocchi, M., & Spognardi, A. (2017). A study on text-score disagreement in online reviews. Cognitive Computation, 9(5), 689-701. doi:10.1007/s12559-017-9496-y

Filieri, R. (2015). What makes online reviews helpful? A diagnosticity-adoption framework to explain informational and normative influences in e-WOM. Journal of Business Research, 68(6), 1261-1270. doi:10.1016/j.jbusres.2014.11.006

Garas, A., Garcia, D., Skowron, M., & Schweitzer, F. (2012). Emotional persistence in online chatting communities. Scientific Reports, 2, 402.

Geetha, M., Singha, P., & Sinha, S. (2017). Relationship between customer sentiment and online customer ratings for hotels: An empirical analysis. Tourism Management, 61, 43-54.

Hayes, A. F. (2018). Introduction to mediation, moderation, and conditional process analysis: A regression-based approach (2nd ed.). New York: The Guilford Press.

Kucuktunc, O., Cambazoglu, B. B., Weber, I., & Ferhatosmanoglu, H. (2012). A large-scale sentiment analysis for Yahoo! Answers. Proceedings of the Fifth ACM International Conference on Web Search and Data Mining, 633-642.

Lak, P., & Turetken, O. (2014). Star ratings versus sentiment analysis: A comparison of explicit and implicit measures of opinions. doi:10.1109/HICSS.2014.106

Lee, S., & Choeh, J. Y. (2016). The determinants of helpfulness of online reviews. Behaviour & Information Technology, 35(10), 853-863. doi:10.1080/0144929X.2016.1173099


Li, X. X., & Hitt, L. (2010). Price effects in online product reviews: An analytical model and empirical analysis.

Metzger, M. J. (2007). Making sense of credibility on the web: Models for evaluating online information and recommendations for future research. Journal of the American Society for Information Science and Technology, 58(13), 2078.

Metzger, M. J., Flanagin, A. J., & Medders, R. B. (2010). Social and heuristic approaches to credibility evaluation online. Journal of Communication, 60(3), 413-439. doi:10.1111/j.1460-2466.2010.01488.x

Mudambi, S. M., & Schuff, D. (2010). Research note: What makes a helpful online review? A study of customer reviews on Amazon.com. MIS Quarterly, 34(1), 185-200.

Mudambi, S. M., Schuff, D., & Zhang, Z. (2014). Why aren't the stars aligned? An analysis of online review content and star ratings. doi:10.1109/HICSS.2014.389

Nelson, P. (1970). Information and consumer behavior. Journal of Political Economy, 78(2), 311-329.

O'Keefe, D. J. (2008). Elaboration likelihood model. The International Encyclopedia of Communication.

Pan, L., & Chiou, J. (2011). How much can you trust online information? Cues for perceived trustworthiness of consumer-generated online information. Journal of Interactive Marketing. doi:10.1016/j.intmar.2011.01.002


Park, D., Lee, J., & Han, I. (2007). The effect of on-line consumer reviews on consumer purchasing intention: The moderating role of involvement. International Journal of Electronic Commerce, 11(4), 125-148. doi:10.2753/JEC1086-4415110405

Peter, J. P., & Olson, J. C. (1990). Consumer behaviour and marketing strategy. Homewood, IL: Irwin.

Petty, R., & Cacioppo, J. (1983). Source factors and the elaboration likelihood model of persuasion. Advances in Consumer Research, 11, 668.

Pfitzner, R., Garas, A., & Schweitzer, F. (2012). Emotional divergence influences information spreading in Twitter. ICWSM, 12, 2-5.

Quambusch, N. (2015). Online customer reviews and their perceived trustworthiness by consumers in relation to various influencing factors.

Solomon, M., Russell-Bennett, R., & Previte, J. (2012). Consumer behaviour. Pearson Higher Education AU.

Thelwall, M., Buckley, K., & Paltoglou, G. (2011). Sentiment in Twitter events. Journal of the American Society for Information Science and Technology, 62(2), 406-418.

Thelwall, M., Buckley, K., & Paltoglou, G. (2012). Sentiment strength detection for the social web. Journal of the American Society for Information Science and Technology, 63(1), 163-173. doi:10.1002/asi.21662

Thelwall, M., Buckley, K., Paltoglou, G., Cai, D., & Kappas, A. (2010). Sentiment strength detection in short informal text. Journal of the American Society for Information Science and Technology, 61(12), 2544-2558. doi:10.1002/asi.21416


Tunc, M. M., Cavusoglu, H., & Raghunathan, S. (2017). Single-dimensional versus multi-dimensional product ratings in online marketplaces.

Wan, Y., Nakayama, M., & Sutcliffe, N. (2012). The impact of age and shopping experiences on the classification of search, experience, and credence goods in online shopping. Information Systems and E-Business Management, 10(1), 135-148. doi:10.1007/s10257-010-0156-y

Willemsen, L., Neijens, P. C., Bronner, F., & de Ridder, J. (2011). "Highly recommended!" The content characteristics and perceived usefulness of online consumer reviews. Journal of Computer-Mediated Communication. doi:10.1111/j.1083-6101.2011.01551.x

Zaichkowsky, J. (1986). Conceptualizing involvement. Journal of Advertising, 15(2), 4-34. doi:10.1080/00913367.1986.10672999

Zhang, W., & Watts, S. (2003). Knowledge adoption in online communities of practice. ICIS 2003 Proceedings, 9.
