Relationship of positive emotions and emotional tone in online reviews to perceived review helpfulness


Title Page Bachelor’s Thesis

Relationship of Positive Emotions and Emotional Tone in Online Reviews to Perceived Review Helpfulness

Paul Schmid 11892110

Universiteit van Amsterdam
Faculty of Economics and Business

BSc Business Administration – Management and Leadership in the Digital Age
Bachelor's Thesis and Thesis Seminar Management in the Digital Age - 6013B0510Y

Topic 6: Exploring Online Consumer Behavior from Online Reviews
Supervisor: Frederik Situmeang

10th of July, 2020


Statement of Originality

This document is written by Student Paul Schmid (11892110) who declares to take full responsibility for the contents of this document.

I declare that the text and the work presented in this document are original and that no sources other than those mentioned in the text and its references have been used in creating it.

The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.


Table of Contents

Title Page Bachelor’s Thesis
Statement of Originality
Table of Contents
Abstract
1. Introduction
2. Theoretical Framework
3. Method
    Dependent Variable
    Independent Variables
    Control Variables
    Analytical Model
4. Results
    Descriptives
    Assumptions
    Linear Regression Model
    Hypothesis 1
    Hypothesis 2
    Further Results
5. Discussion
    Limitations
    Future Research
    Practical Implications
6. Conclusion
References
Appendix
    Linear Regression
    Residuals Plots
    Linear Regression Assumption Checks
    Collinearity Statistics
    Q-Q Plot
    Correlation Matrix
    Correlation Matrix Plot
    Descriptive Plots


Abstract:

Customer reviews are becoming increasingly important in consumers’ deliberations before buying a product, and it is, therefore, increasingly important to understand which factors make consumers perceive a review as helpful. To see whether positive emotional content in a review is related to perceived review helpfulness, this study examined the relationship between these factors via linear regression. This paper proposes that the two are positively related and that a review with positive emotional content is likely to be perceived as more helpful than a review without positive emotional content. Moreover, it proposes that reviews with positive emotions are perceived to be more helpful than reviews with overtly negative emotional content. These hypotheses were tested with a sample of 297,010 video game user reviews obtained from Metacritic.com. However, the hypotheses were not supported by this data set, as no significant relationship between either discreetly or overtly positive emotions and review helpfulness was found, suggesting that more research is needed to fully understand the relationship between these factors.

1. Introduction

As an increasing number of people consult customer reviews before making a purchase decision, the effect that online reviews have on other consumers is becoming more important. According to Trustpilot & Canvas8 ("The critical role of reviews in Internet trust", 2020), 89% of global consumers read reviews as part of their online purchasing decisions, and 49% consider positive reviews to be among the top three influences on their purchasing decisions.

Especially for experience goods, such as video games, electronic word of mouth (eWOM) plays a significant role in the purchasing decisions consumers make (See-To & Ho, 2014), and through the rise of reviews, the accessibility of eWOM has vastly increased. Experience goods used to be more limited in their reach, as it was harder for consumers to judge their quality without the extensive information that online reviews now make available. For such goods, a single review with an interesting description of one user’s experience with the game might carry more weight and informational value for a consumer than an average rating. Furthermore, the number of people reading online reviews is likely to increase due to the COVID-19 crisis and the increased importance of online shopping.

Consumer reviews suffer from a bad reputation because of widespread concerns about their trustworthiness, as companies continuously try to find new ways to make sure that the majority of reviews of their products are positive. This can be achieved by offering free product tests in exchange for a review or by actively searching for satisfied customers and asking them to write a review, knowing it will be positive. As a result, an information asymmetry arises for consumers, and it has become more difficult for them to distinguish genuine reviews from fake ones. This positive skew in ratings can help companies influence consumers’ purchasing decisions, and the effect is likely to grow now that an increasing number of consumers are shopping online. It is, therefore, becoming increasingly important to gain a better understanding of what makes a review helpful and what influences other consumers’ perceptions of it, in order to make reviews more informative and helpful for consumers.

Even though video games make up a considerable part of the entertainment industry, with a projected revenue of 92,633 million US dollars in 2020 ("Video Games - worldwide | Statista Market Forecast", 2020), certain sectors having more than 1.5 billion individual users and an average revenue per user of up to $35.17, research on video game reviews is often neglected and scientific information may be hard to come across. Furthermore, the video game industry is an interesting field to study, as it offers a wide variety of games to a broad and diverse population, ranging across all ages, with the largest share of users in the age range of 25-34 (36.36% in 2019). The industry is also growing rapidly, and advances in technology and graphics as well as rising consumer expectations are making games more and more expensive to develop, giving reviews a potentially large influence on how games are received and how much revenue they generate.

In the field of online consumer reviews, a lot of research has been conducted into how negative reviews affect perceived helpfulness for consumers. This is due to the “negativity bias”, which many papers were able to confirm (Cao et al., 2011; Chevalier & Mayzlin, 2006). The theory of the “negativity bias” states that consumers are more receptive to negative cues, as they perceive positive cues to be the norm and can get more information out of negative feedback (Sen & Lerman, 2007). Therefore, negative reviews tend to be perceived as more helpful by customers than positive reviews. However, little research has been conducted into how positive reviews influence perceived review helpfulness. This information can be valuable to both companies and the scientific literature, as it sheds more light on what makes a review helpful, gives a better impression of how customers feel about certain review contents and shows how companies can interpret their customers’ reviews.

This paper will build on the current literature by examining the extent to which discreet and overtly positive emotions and a positive emotional tone relate to perceived review helpfulness in the form of upvotes on a review. This can provide further insights into what constitutes a helpful review and what consumers value in reviews, especially relating to video games.

The research question, which will, therefore, be addressed in this paper, is:

To what extent are discreet and overtly positive emotions and emotional tone related to perceived helpfulness in online reviews?

This will contribute to our general understanding of the importance of reviews in customers’ purchasing decisions, help companies and researchers gain a better understanding of the factors that influence perceived review helpfulness, and provide more information on the other side of the negativity bias.

Firstly, this paper will provide a theoretical framework on the subject, summarizing and discussing the relevant literature and emphasizing the key takeaways and findings. It will then give an introduction to the methodology used in this study and focus on the different variables and how the data was collected and analyzed. The results of the multiple linear regression model will then be presented and analyzed, leading to a discussion of the results, with their limitations and implications for future research.

2. Theoretical Framework

In the literature surrounding online reviews and eWOM, many articles focus on the effects that negative emotions in reviews have on readers and how different types of negative emotions affect them.


In their exploration of what determines an online review’s helpfulness, Cao et al. (2011) and Chevalier and Mayzlin (2006) were able to confirm a “negativity bias” effect in online reviews, meaning that negative reviews and experiences are perceived to be more helpful and are more likely to influence a consumer’s decision than a particularly positive review. This might be because consumers perceive positive cues to be the norm, causing negative content to stand out more (Sen & Lerman, 2007). This concept has also been examined extensively, and studies were able to show that negative information is more attention-grabbing (Baumeister et al., 2001; Fiske, 1980), influences people’s impression of something to a higher degree (Fiske, 1980) and is more likely to be remembered (Pratto & John, 1991). Cao et al. (2011) further provided evidence that reviews expressing more extreme opinions are also more likely to receive helpfulness votes than reviews with less extreme opinions and contents. However, the results of Mudambi and Schuff’s (2010) analysis of Amazon reviews revealed that this might not be true for experience goods and that reviews with moderate, rather than extreme, content are considered to be more helpful if the product in question is an experience good.

Another factor influencing the perceived helpfulness of a review is whether it includes an explicit endorsement of the product, as consumers are more likely to think that a review was written by an expert if it contains an explicit recommendation (Packard & Berger, 2017). However, Packard and Berger were able to show that experts are less likely to explicitly recommend something, as they are likely more aware of gaps in their knowledge. Novices, on the other hand, are significantly more likely to explicitly endorse a product and since both experts and novices perceive explicit [...] perception of a product and the perceived helpfulness of reviews. This also applies to Metacritic.com, where reviews written by critics can be compared to reviews written by users, as critics are probably less likely to explicitly recommend a game. Furthermore, critics only review a game around the time of its release, while users keep writing reviews for a longer period of time, accumulating a larger number of reviews. On Metacritic.com, the website analyzed in this study, the critics are the experts: industry experts and journalists who publish their reviews of games through various channels.

Stephen (2016) and Tang et al. (2014) were able to provide evidence that reviews containing both pros and cons of products, and positive reviews in general, can have a positive effect on customers’ purchase decisions and can positively influence a business’s performance. This can be attributed to the idea that positive word of mouth might add to the popularity of a particular product, attracting more potential customers, who might be less sensitive to prices and more loyal in general, improving business performance (Casaló, Flavián & Guinalíu, 2008). It can, therefore, be concluded that positive review content does influence purchase decisions to some extent, and it can be argued that these reviews are then also considered helpful by customers.

Therefore, Hypothesis 1 is split up into two parts, with one focusing on discreetly positive emotions and one focusing on overtly positive emotions, and it states the following:

H1a: Emotional tone and discreet positive emotions in an online review have a significant positive relationship to perceived review helpfulness.


H1b: Emotional tone and overtly positive emotions in an online review have a significant positive relationship to perceived review helpfulness.

Yin, Bond and Zhang (2014) investigated how different negative emotions expressed in online reviews affect perceived review helpfulness. They focused on anger and anxiety in particular and concluded that there is a significant difference in perceived review helpfulness between the two, with reviews containing forms of anxiety being perceived as more helpful than reviews containing forms of anger.

The influence that the overall rating of a game has on people’s perception and rating of it was shown by Livingston et al. (2011), who provided evidence that people who read positive reviews of a game gave it significantly better ratings after playing it than people who read negative reviews. However, they also observed that there was no difference in ratings between participants who read a positive text and those who did not read a text at all. It can be argued that this also points to the existence of a negativity bias and is further evidence that negative information has a more significant effect on people than positive or no information.

However, reviews that are overtly negative and do not align with the general consensus might be perceived as too emotionally difficult by consumers, which might lead them to avoid the review altogether, as it puts them in a situation of emotional trade-off (Yi & Baumgartner, 2004): they have to decide between buying the product, risking the same reaction as the one described in the overtly negative review, and not buying the product, losing out on its potential benefits. According to Barnett White (2005), consumers would prefer a “benevolent provider” in a situation like that. This might be because the information gained from that source might be perceived as more positive and less ambiguous, leaving the consumer with an easier decision.

Therefore, Hypothesis 2 is also split into two parts, with one focusing on discreetly positive emotions and one focusing on overtly positive emotions, and it states the following:

H2a: Emotional tone and discreet positive emotions in an online review have a more significant positive relationship to perceived review helpfulness than reviews with overtly negative content.

H2b: Emotional tone and overtly positive emotions in an online review have a more significant positive relationship to perceived review helpfulness than reviews with overtly negative content.

Furthermore, Kim and Gupta (2012) were able to show that negative expressions in a single review lowered the review’s informational value to the consumer, whereas positive expressions in a single review did not change much about the consumer’s perception of the product, even though they created a positive association. Only when there were multiple reviews did the different emotional expressions provide informative value to consumers, as they then had a way to compare the different sentiments and reach a conclusion. The number of reviews consumers have at their disposal is, therefore, important when judging helpfulness.


Malik and Hussain (2017) drew on the broad surrounding literature and deep learning to develop a predictive model that estimates review helpfulness based on 8 variables, among which are discrete positive and negative emotions. This model achieved a predictive F-measure of over 89%, showing how significant the progress in this field of literature is.

3. Method

The analysis in this paper will be empirical in nature and focus solely on the video game industry. This paper will attempt to answer the research question by analyzing 478,727 video game reviews that were collected from Metacritic.com using Python and provided by the University of Amsterdam. The data include, among other things, the users’ ratings, the words used in the reviews, whether other users perceived the reviews to be useful, the name of the game and the platform on which it is offered. The data was obtained in 2019, and only reviews published before the time of data collection will be taken into account for this analysis. Alteryx was utilized to calculate additional variables from the data and create a suitable data set (Alteryx-Workflow, Appendix).

Metacritic.com is a website offering reviews and ratings for a wide variety of media, ranging from games to music, TV shows and movies. These are split into critic reviews and user reviews, with a weighted average rating for each category: the Metascore is the score given by critics, ranging from 0 to 100, and the user score is the score given by users, ranging from 0 to 10. If a video game gets a score above 90 and has been reviewed by at least 15 publications, it will receive a “Must-Play” award ("How We Create the Metascore Magic - Metacritic", 2020). These games are considered to be among the best and are highly acclaimed by a broad range of critics, making the award important to the video game industry and to how games are received. Critic reviews are written by industry experts and renowned critics who publish their reviews in a wide variety of publications and newspapers ("Frequently Asked Questions - Metacritic", 2020). A list of the publications taken into consideration can be viewed on the website; it is re-evaluated and updated each year in order to stay up to date and give as good an overview of critics’ opinions as possible. For a game to get a Metascore, at least 4 critic reviews have to be written.

Metacritic is considered to be independent and has a strong standing in the industry, as it is a benchmark for reviews by both critics and users. Linguistic Inquiry and Word Count (LIWC), developed by Pennebaker et al. (2007), will be used to analyze the emotions and word groups used in each review, in order to determine whether emotional tone and positive emotions are related to the perceived usefulness of online reviews. For this purpose, only reviews with some usefulness rating will be used. Of the 478,727 reviews, 181,717 did not have any upvotes and were thus not considered useful, leaving a data set of 297,010 reviews that received some kind of usefulness rating. The sample size of this study is, therefore, n = 297,010.

Dependent Variable

Review Helpfulness is a continuous variable calculated by subtracting the negative ratings on a review from the total helpfulness ratings the review received, leaving a measure that accounts both for people who perceive a review as helpful and for people who do not. Therefore, Review Helpfulness = totalThumbs – totalDowns.
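As a minimal sketch of this definition, the snippet below derives the score from two vote counts and keeps only reviews with a positive score. The field names `total_thumbs` and `total_downs` mirror the formula above but are assumptions about the data layout, the filter is only one possible reading of "some usefulness rating", and the sample rows are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Review:
    total_thumbs: int  # all helpfulness ratings a review received (assumed field name)
    total_downs: int   # negative ratings among them (assumed field name)


def review_helpfulness(review: Review) -> int:
    """Review Helpfulness = totalThumbs - totalDowns."""
    return review.total_thumbs - review.total_downs


def keep_rated(reviews: list[Review]) -> list[Review]:
    """Keep reviews with a positive helpfulness score (assumed filter rule)."""
    return [r for r in reviews if review_helpfulness(r) > 0]


# Invented example rows: only the first one survives the filter.
sample = [Review(10, 3), Review(0, 0), Review(5, 5)]
scores = [review_helpfulness(r) for r in keep_rated(sample)]
print(scores)
```

The positive-score filter is consistent with the minimum helpfulness value of 1 reported in the descriptives, but the exact rule used in the Alteryx workflow is not stated here.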


Independent Variables

Emotional tone is a continuous variable ranging from 0 to 100, where a value around 50 suggests either no display of emotions or some form of ambivalence, a higher value generally indicates more positive and upbeat content, and a lower value indicates higher levels of anxiety or hostility (Pennebaker et al., 2015a).

Positive and negative emotions are continuous variables ranging from 0 to 100, where 0 is a very low display of positive/negative emotion and 100 is high. Examples are “love, nice & sweet” for positive emotions (posemo) and “hurt, ugly & nasty” for negative emotions (negemo) (Pennebaker et al., 2015b). In total, 620 words are accounted for in posemo and 744 in negemo. These variables will be split up into 5 dummy variables, with averages and standard deviations being calculated for each game individually, as these categories will make the analysis more nuanced and help with interpretation:

1. overtly positive: posemo > avg. posemo + 1 std. deviation
2. discreetly positive: avg. posemo – 1 std. deviation < posemo < avg. posemo + 1 std. deviation
3. neutral: posemo < avg. posemo – 1 std. deviation AND negemo < avg. negemo – 1 std. deviation
4. discreetly negative: avg. negemo – 1 std. deviation < negemo < avg. negemo + 1 std. deviation
5. overtly negative: negemo > avg. negemo + 1 std. deviation
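The categorization rules can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Alteryx workflow used in the study: the function names are hypothetical, the sample standard deviation is assumed (the text does not say whether the sample or population formula was used), the rule for the overtly negative dummy is assumed by symmetry with the overtly positive one, and the scores are invented.

```python
from statistics import mean, stdev


def emotion_dummies(posemo, negemo, pos_avg, pos_sd, neg_avg, neg_sd):
    """Return the five 0/1 dummies for one review, given its game's statistics."""
    return {
        "overtly_positive": int(posemo > pos_avg + pos_sd),
        "discreetly_positive": int(pos_avg - pos_sd < posemo < pos_avg + pos_sd),
        "neutral": int(posemo < pos_avg - pos_sd and negemo < neg_avg - neg_sd),
        "discreetly_negative": int(neg_avg - neg_sd < negemo < neg_avg + neg_sd),
        # Assumed by symmetry with "overtly positive".
        "overtly_negative": int(negemo > neg_avg + neg_sd),
    }


def dummies_for_game(reviews):
    """reviews: list of (posemo, negemo) pairs that all belong to ONE game."""
    pos = [p for p, _ in reviews]
    neg = [n for _, n in reviews]
    pos_avg, pos_sd = mean(pos), stdev(pos)  # per-game statistics
    neg_avg, neg_sd = mean(neg), stdev(neg)
    return [emotion_dummies(p, n, pos_avg, pos_sd, neg_avg, neg_sd)
            for p, n in reviews]


# Invented LIWC scores for four reviews of a single game.
game = [(10.0, 1.0), (5.0, 4.0), (5.0, 4.0), (0.0, 7.0)]
rows = dummies_for_game(game)
for row in rows:
    print(row)
```

Note that the categories are not mutually exclusive: in this example the two middle reviews are flagged as both discreetly positive and discreetly negative.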


Control Variables

The Word Count in each review will also be introduced as a control variable, as longer reviews are more likely to be considered helpful because they often provide more information to users, giving them more on which to base their decision (Yin et al., 2014).

Furthermore, the Average User Score for each game will be introduced as a control variable, as it reflects the users’ consensus opinion on each game as a quantitative value. There might be a correlation between which reviews are perceived as helpful and the general rating the users gave the game, where negative reviews might be perceived as more helpful for a game with a negative rating and vice versa. User ratings, rather than critics’ ratings, will be analyzed in this study because critics’ reviews often differ heavily from the general consensus of consumers on Metacritic.com, and since this study’s focus is on user reviews, using their ratings as a control variable is more appropriate.

Analytical Model

Multiple linear regression will be utilized with review helpfulness in the form of upvotes as the dependent variable and emotional tone and the dummy variables for positive and negative emotions as the independent variables, with average user score and word count as control variables in a separate model, to determine the relationship between the independent variables and review helpfulness. The result will help indicate whether the formulated hypotheses can be supported or not.


The underlying formula for this statistical model is $Y_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \dots + b_n X_{ni} + \varepsilon_i$, where $Y_i$ is the outcome variable, $b_0$ is the intercept, $b_1$ is the first predictor’s ($X_{1i}$) coefficient, $b_2$ is the second predictor’s ($X_{2i}$) coefficient, $b_n$ is the nth predictor’s ($X_{ni}$) coefficient and $\varepsilon_i$ is the error term. This formula allows for multiple predictors to be added to the model and fits a linear model with independent variables to the data in order to predict an outcome variable (Field, 2018).
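To make the model concrete, the following sketch estimates the coefficients b0, b1, ..., bn by ordinary least squares using the normal equations. It is a self-contained illustration and not the Jamovi procedure used in this study; the helper names `solve` and `ols` are hypothetical, and the data are invented and noise-free so that the true coefficients are known in advance.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def ols(xs, ys):
    """Least-squares coefficients [b0, b1, ..., bn] for predictor rows xs and outcomes ys."""
    design = [[1.0] + list(row) for row in xs]  # the leading 1 carries the intercept b0
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)


# Invented data generated from Y = 2 + 3*X1 - 1*X2 with no error term.
xs = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
ys = [2 + 3 * x1 - 1 * x2 for x1, x2 in xs]
coefs = ols(xs, ys)
print(coefs)
```

Because the example data contain no error term, the estimates recover the generating coefficients up to floating-point precision; with real data the error term would absorb the unexplained variation.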

Even though this will help show whether there is a relationship between the variables, it cannot be used to draw conclusions about the causality of this relationship, which is why this study will not make claims about causality, but solely about whether a relationship can be observed.

The analysis will be conducted in the statistical analysis tool Jamovi (The Jamovi Project, 2019), as it provides many analytical tools and models and can be used flexibly. The relationship between one of the independent variables and perceived review helpfulness will be considered as significant if the calculated p-value is below 0.05.

4. Results

Descriptives

Table 1 contains the means, standard deviations and correlations of all variables. While all correlations are statistically significant with p<.001, they are mostly small to moderate in degree and no particularly strong correlations can be found. The comparatively strongest correlations can be found between Emotional Tone and Positive Emotion with a correlation coefficient of .602 and between Emotional Tone and Negative Emotion with a correlation coefficient of -.468.

Table 1. Means, Standard Deviations, and Correlations.

Variables                 M      SD     1.       2.       3.       4.       5.    6.
1. Review Helpfulness^a   7.35   14.3   -
2. Positive Emotion       6.01   5.5    -.063**  -
3. Negative Emotion       2.48   3.1    .054**   -.108**  -
4. Emotional Tone         62.3   36.6   -.083**  .602**   -.468**  -
5. Word Count             163    186    .010**   -.204**  -.025**  -.091**  -

Note. N=297,010
^a Review Helpfulness = totalThumbs - totalDowns
*p < .05, **p < .001

Review Helpfulness has a mean of 7.35 with a standard deviation of 14.3, and the data points used ranged from 1 to 806. The ratings for both positive and negative emotions ranged from 0 to 100, with an average of 6.01 (SD=5.5) for Positive Emotions and 2.48 (SD=3.1) for Negative Emotions. An average rating of 62.3 (SD=36.6) was observed for Emotional Tone, with ratings ranging from 0 to 99. The Word Count in each review was 163 on average, with a standard deviation of 186 and values ranging from 0 to 4586. The Average User Scores per game in the sample had a mean of 6.67, with a standard deviation of 2.51 and ratings ranging from 0 to 10 on Metacritic’s rating scale.


Assumptions

Since the analysis of data in this study is based on multiple linear regression, several essential assumptions have to hold in order for the analysis to be performed. Due to the central limit theorem and the large sample size, normality and homoscedasticity can be assumed. Furthermore, there is a linear relationship between the dependent variable and the independent variables (Residuals Plots, Appendix), and the samples are independent. Lastly, a multicollinearity test revealed that there is no or very little collinearity present, with VIF values ranging from 1.1 to 2.91 (Collinearity Statistics, Appendix). It can, therefore, be concluded that all assumptions hold and the linear regression model can be utilized for this analysis.
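For orientation, the variance inflation factor of predictor j is commonly defined as VIF_j = 1 / (1 - R_j²), where R_j² is obtained by regressing that predictor on all other predictors. The short sketch below only converts between the two quantities for the reported range; the function names are hypothetical, and the underlying R_j² values themselves are not reported in this study.

```python
def vif_from_r2(r2_j: float) -> float:
    """VIF_j = 1 / (1 - R_j^2) for a predictor with auxiliary R_j^2 < 1."""
    return 1.0 / (1.0 - r2_j)


def r2_from_vif(vif: float) -> float:
    """Invert the definition: share of a predictor's variance explained by the others."""
    return 1.0 - 1.0 / vif


lowest = r2_from_vif(1.1)    # smallest reported VIF
highest = r2_from_vif(2.91)  # largest reported VIF
print(round(lowest, 4), round(highest, 4))
```

Read this way, the largest reported VIF of 2.91 corresponds to roughly two thirds of that predictor's variance being shared with the remaining predictors.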

Linear Regression Model

To test both hypotheses 1 and 2, linear regression was utilized. The linear regression model with the control variables Word Count and Average User Score (Model 1, Table 2) has an adjusted R² of 0.0444 and is, therefore, able to explain 4.44% of the variation in the dependent variable. The introduction of the independent variables in the second model (Table 2) changes the adjusted R² to 0.0447, explaining 4.47% of the variation in the dependent variable. The change in adjusted R² between the two models is 2.49e-4, raising the second model’s ability to explain variation in the outcome variable by 0.0249 percentage points. Overall, the linear regression is significant with p<0.001 (Table 2), and the variables introduced in addition to the control variables add significant explanatory power to the model with p<0.001.
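The adjusted R² penalizes the ordinary R² for the number of predictors, using adj. R² = 1 - (1 - R²)(n - 1)/(n - p - 1). The sketch below is only an illustration of why the penalty is negligible at this sample size: it treats the reported value of 0.0447 as an unadjusted R² and assumes p = 8 predictors for the second model (two controls, five dummies and emotional tone), both of which are assumptions.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 for a model with p predictors fitted on n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


N = 297_010                       # sample size of this study
adj_model2 = adjusted_r2(0.0447, N, 8)
gap = 0.0447 - adj_model2         # size of the penalty for 8 predictors
delta_points = 2.49e-4 * 100      # reported change, in percentage points
print(adj_model2, gap, delta_points)
```

With almost 300,000 observations the penalty is on the order of 1e-5, so R² and adjusted R² agree to the reported four decimals.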


Table 2. Multiple Linear Regression Table.

Variable              Estimate    [95% CI]             SE        t          p
Intercept             14.54936    [14.278, 14.821]     .1386     104.962    <.001**
Word Count            7.89e-4     [5.09e-4, .00107]    1.43e-4   5.515      <.001**
Average User Score    -1.06261    [-1.083, -1.043]     .0103     -103.114   <.001**
Overtly Positive      .03939      [-.2116, .2904]      .1281     .308       .758
Discreetly Positive   -.17579     [-.3637, .0122]      .0959     -1.833     .067
Neutral               -1.37748    [-1.7860, -.9690]    .2084     -6.610     <.001**
Discreetly Negative   -.18431     [-.3817, .0131]      .1007     -1.830     .067
Overtly Negative      -.33639     [-.5894, -.0834]     .1291     -2.606     .009*
Emotional Tone        -.00528     [-.0072, -.0034]     9.82e-4   -5.382     <.001**

Note. Model 1: Word Count & Average User Score; R² = .0444
Model 2: All Variables; R² = .0447
Model 1 – Model 2: ΔR² = 2.49e-4
*significant with p < .05, **significant with p < .001


Hypothesis 1

Hypothesis 1a, that emotional tone and discreet positive emotions in an online review have a significant positive relationship to perceived review helpfulness, and hypothesis 1b, that emotional tone and overtly positive emotions in an online review have a significant positive relationship to perceived review helpfulness, could both not be supported, since the p-value for the independent variable discreetly positive is 0.067>0.05 and the p-value for overtly positive is 0.758>0.05, making both results insignificant. Furthermore, results for the independent variable emotional tone showed a negative coefficient estimate of -0.00528 (standardized estimate = -0.01362), which is significant with a p-value<0.001. This means that, generally, if emotional tone goes up by one standard deviation, the dependent variable review helpfulness goes down by 0.01362 standard deviations.
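The interpretation in standard deviations follows from rescaling the raw coefficient by the ratio of the two standard deviations, beta = b · SD(X) / SD(Y). The sketch below applies this common rescaling to the rounded values from Tables 1 and 2; the function name is hypothetical, and because the inputs are rounded the result only approximates the value of -0.01362 quoted above.

```python
def standardized(b: float, sd_x: float, sd_y: float) -> float:
    """Standardized coefficient: change in Y (in SDs) per one-SD change in X."""
    return b * sd_x / sd_y


# b for Emotional Tone (Table 2); SDs of Emotional Tone and Review Helpfulness (Table 1).
beta_tone = standardized(-0.00528, 36.6, 14.3)
print(round(beta_tone, 5))
```

The result is about -0.0135, which agrees with the reported figure to within rounding of the inputs.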

Hypothesis 2

Hypothesis 2a, that emotional tone and discreet positive emotions in an online review have a more significant positive relationship to perceived review helpfulness than reviews with overtly negative content, and hypothesis 2b, that emotional tone and overtly positive emotions in an online review have a more significant positive relationship to perceived review helpfulness than reviews with overtly negative content, could also both not be supported, as the results for the independent variables discreetly positive and overtly positive were insignificant with a p-value of 0.067>0.05 for the former and a p-value of 0.758>0.05 for the latter.


Analysis for the independent variable overtly negative showed a coefficient estimate of -0.33639 (standardized estimate = -0.00797), which is significant with p=0.009<0.05. This means that, generally, when overtly negative emotions are displayed in a review, the dependent variable review helpfulness is lower by 0.33639.

Emotional tone showed significant negative results, as stated for hypothesis 1, but since no significant results for discreetly positive and overtly positive could be found, no comparison to the results for overtly negative content could be made, meaning that the hypotheses cannot be supported.

Further Results

No significant relationship between discreetly negative emotions and review helpfulness could be found, as the results were insignificant with a p-value of 0.067>0.05. Furthermore, the analysis showed a significant negative relationship between emotionally neutral reviews and their helpfulness rating, with a p-value<0.001 and a coefficient estimate of -1.37748 (SE = .2084). This means that if a review is considered to be emotionally neutral, the outcome variable of perceived review helpfulness is likely to be lower by 1.37748 than if it is not considered to be emotionally neutral.

As for the control variables, this study found that word count has a significant positive relationship to review helpfulness, with a p-value<0.001 and a coefficient estimate of 7.89e-4 (SE = 1.43e-4). This means that, generally, for each word that is added to a review, the review’s helpfulness rating is likely to be higher by 7.89e-4. A significant negative relationship could be observed between the average user score of a game and a review’s helpfulness rating, with a p-value<0.001 and a coefficient estimate of -1.06261 (SE = .0103). This can be generally interpreted as a review’s helpfulness rating being likely to be lower by 1.06261 if the average rating of a game goes up by 1.
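These interpretations can be illustrated by plugging values into the fitted equation. The sketch below uses the coefficient estimates from Table 2; the dictionary keys and the `predict` helper are hypothetical names, and given the low explained variance the outputs are best read as illustrations of the coefficients rather than as reliable predictions for individual reviews.

```python
# Coefficient estimates of Model 2 as reported in Table 2.
COEF = {
    "intercept": 14.54936,
    "word_count": 7.89e-4,
    "avg_user_score": -1.06261,
    "overtly_positive": 0.03939,
    "discreetly_positive": -0.17579,
    "neutral": -1.37748,
    "discreetly_negative": -0.18431,
    "overtly_negative": -0.33639,
    "emotional_tone": -0.00528,
}


def predict(**values: float) -> float:
    """Fitted helpfulness for the given predictor values; omitted predictors count as 0."""
    return COEF["intercept"] + sum(COEF[name] * v for name, v in values.items())


# A review at the sample means of the continuous predictors, with all dummies at 0.
base = predict(word_count=163, avg_user_score=6.67, emotional_tone=62.3)
# The same review for a game whose average user score is one point higher.
higher_score = predict(word_count=163, avg_user_score=7.67, emotional_tone=62.3)
print(round(base, 2), round(higher_score - base, 5))
```

At the sample means the fitted value is about 7.26, close to the observed mean helpfulness of 7.35, and raising the average user score by one point lowers the fitted value by 1.06261, as described above.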

5. Discussion

Findings

This study examined whether there is a positive relationship between emotional tone and discreetly and overtly positive emotions in consumer reviews and how helpful the reviews are perceived to be. It also examined whether this positive relationship was stronger than the relationship to more negative content, as the literature currently points to negative reviews generally being perceived as more helpful due to a “negativity bias” (Cao et al., 2011; Chevalier & Mayzlin, 2006), where consumers perceive positive cues to be the norm and negative cues therefore stand out more (Sen & Lerman, 2007).

The first hypothesis, split up into 1a and 1b, that there is a positive relationship between emotional tone and overtly/discreetly positive emotions and perceived review helpfulness, could not be supported by the data examined in this study, as the observed relationship was found to be insignificant. However, there still was a difference between discreetly and overtly positive emotions, as discreetly positive emotions were considerably closer to being significant than overtly positive emotions, which gives some indication that there might be a relationship between the independent variable discreetly positive emotions and the dependent variable review helpfulness. Nevertheless, this analysis was not able to conclusively confirm this relationship.

However, a significant negative relationship between emotional tone and review helpfulness could be observed. This result is in line with the scientific consensus about a “negativity bias”, as a higher value of emotional tone indicates a more positive tone, and the more positive the emotional tone is, the less helpful reviews are perceived to be, and vice versa. However, this relationship is not very strong, and the change in helpfulness associated with a change in tone is small.
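The limited strength of this relationship can be illustrated with a back-of-the-envelope calculation based on the tone coefficient and the descriptive statistics in the Appendix. The sketch below is only an approximation: the descriptive statistics for tone refer to all 478,727 reviews, whereas the regression uses the 297,010 reviews with a helpfulness rating, so the standardized value is not expected to match the reported standardized estimate (-0.01362) exactly.

```python
# Values taken from the Appendix (Model 2 coefficients and descriptives).
TONE_COEF = -0.00528            # unstandardized estimate for emotional tone
TONE_MIN, TONE_MAX = 0.0, 99.0  # observed range of emotional tone
TONE_SD = 36.6                  # standard deviation of emotional tone
HELPFULNESS_SD = 14.3           # standard deviation of the helpfulness rating

# Predicted difference between the most negative and most positive tone.
full_range_change = TONE_COEF * (TONE_MAX - TONE_MIN)

# Predicted difference for one standard deviation of tone, in rating points
# and (approximately) in standard deviations of the helpfulness rating.
one_sd_change = TONE_COEF * TONE_SD
approx_standardized = one_sd_change / HELPFULNESS_SD

print(f"full tone range: {full_range_change:.3f} rating points")
print(f"one SD of tone:  {one_sd_change:.3f} rating points")
print(f"approx. standardized effect: {approx_standardized:.4f}")
```

Under these assumptions, moving across the entire observed range of emotional tone corresponds to a predicted difference of only about half a rating point, compared with a standard deviation of 14.3 for the helpfulness rating.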

The second hypothesis, split up into 2a and 2b, that there is a stronger positive relationship between emotional tone and discreetly/overtly positive emotions and perceived review helpfulness than between overtly negative emotions and review helpfulness, could also not be supported by the data used in this analysis, as, similarly to hypothesis 1, no significant relationship of the independent variables discreetly positive emotions and overtly positive emotions to review helpfulness could be found.

However, a significant negative relationship for overtly negative emotions displayed in reviews to perceived review helpfulness could be observed. This can be interpreted as being in line with the current literature, because a study conducted by Yi and Baumgartner (2004) showed that overtly negative content in a review might be too emotionally difficult for consumers and lead to them avoiding the review altogether. This might also lead to customers not perceiving a review as helpful, because they perceive the content as too negative.

However, similarly to emotional tone, this relationship is not very strong, and the independent variable only explains a very small part of the difference in review helpfulness.

Another interesting observation this study could make was that there is a significant negative relationship between neutral reviews and review helpfulness, and it can be concluded that emotionally neutral reviews are likely to be perceived as less helpful than reviews which are not emotionally neutral.

Furthermore, the analysis concluded that word count is positively related to review helpfulness, which is in line with expectations and confirms that longer reviews are likely to be more helpful to consumers. As for the relationship between the average user score of a game and review helpfulness, the results indicate that the two variables are negatively related, meaning that the higher the average user rating of a game is, the less helpful reviews are perceived to be. This relationship goes against the expectations of this study.

The results of this study are, however, not completely in line with the scientific consensus on the “negativity bias”: the expected relationship could only be observed for emotional tone, whereas reviews classified as discreetly negative did not show a significant positive relationship to review helpfulness. The data did, therefore, not completely support this concept.

Limitations

There are also some limitations to this study. First of all, the analysis only controls for two variables, namely word count and average user rating. The analysis could be more informative and robust if more control variables were introduced. Examples of possible control variables would be the genre of the game associated with a review, as there might be similarities in review preferences among certain groups of games, and a game’s age restriction, as the target audience for the different age groups might also make a difference in what reviews are perceived as helpful. These variables were, however, not introduced in this analysis, because the information was not available for all reviews and, therefore, introducing the variables would have drastically lowered the sample size.

Another limitation is that this study only examines the video game industry, which makes it harder to draw conclusions that are high in external validity and to generalize to other industries. However, an analysis of more than one industry would have been beyond the scope of this study.

Furthermore, the studies discussed in this study’s theoretical framework were not conducted on the video gaming industry, but on a wide range of other products and sectors, which is why it is unclear whether the same concepts found in other sectors also apply to the video gaming industry. However, since not a lot of research into these topics has been conducted in this industry, this study still introduced articles from other industries in order to build a theoretical reference frame and also to see whether some of the results can be replicated with the data in this analysis.

This type of analysis is also only able to show an association between the independent variables and the outcome variable, which is why no conclusions about causality can be made, and the possibility remains that there are other feasible explanations for the observed relationships.

This study also examined only one website providing review content, so there might be a difference in user base and user preferences compared to other websites, which could also result in different conclusions about the relationships between review content and review helpfulness. However, analyzing more than one website would be beyond the scope of this study, and the concept of Metacritic.com is also to provide information on opinions on games that is as comprehensive as possible. Due to the website’s standing in the industry, it could also be assumed that the user base is a close representation of the general population with a certain degree of interest in video game reviews, but in order to make a conclusive statement, more analysis would have to be conducted.

Future Research

This study also leaves some interesting topics open for future research. First of all, further research should be conducted into how neutral reviews influence review helpfulness, as the analysis conducted in this study concluded that there seems to be a significant negative relationship between neutral reviews and review helpfulness, and it would be interesting to find out more about this relationship.

Furthermore, more research should also be conducted into how discreetly positive review content influences the helpfulness of a review, because the results of this analysis did not show a significant relationship, but one that was almost significant. It would be interesting to see how this relationship would change if more control variables were introduced, for example, the genre of a game or a game’s age restriction, as this may have some influence on the outcome due to differences in user base among the different categories.

Future research should also be conducted into more diverse industries, as the current literature still leaves out a few industries and more research could help gain a better overall understanding of consumer reviews and eWOM, as it is likely that there are considerable differences between different industries and industry sectors.

The same reasoning can be applied to video games: more research should be conducted with more diverse sources of data, as this study only used reviews from Metacritic.com, which may skew the results in a certain direction, and more diverse data would help to gain a better overall understanding.

Since the independent variables also only added a very small amount of explanatory power to the model, more research should be conducted on which factors have a stronger relationship to perceived review helpfulness on Metacritic.com and in the video gaming industry as a whole, as this can help the scientific community and companies gain a better understanding of what makes a particularly helpful review in this field.

Practical Implications

Parts of the results related to the hypotheses in this study were insignificant, namely the relationship between overtly and discreetly positive emotions and perceived review helpfulness. Therefore, no statistical conclusions can be drawn about the relationship between these variables and it is unclear whether they are positively or negatively related. This may be because of the sampling used in this study, which only uses one primary data source, Metacritic.com. This implies that there may or may not be a relationship, which has not yet been shown, and it will require more empirical studies to conclusively confirm, as this result may not be generalizable to other industries or data sources. Professionals in this field need to be aware of this, keep in mind that there may be a relationship between these factors, and adjust their considerations and actions accordingly. Furthermore, there may be other significant factors at play that are related to what reviews are perceived as helpful, and these factors need to be studied further in order to get a comprehensive understanding of the topic.

6. Conclusion

Previous research on the relationship between displayed emotions in consumer reviews and eWOM suggested that more emotionally negative reviews are more likely to be perceived as helpful by other consumers than more emotionally positive reviews, and studies, in general, were more focused on the effect of negative reviews. Therefore, as online reviews and a thorough understanding of what kind of reviews are perceived as helpful are getting more and more important, this study examined how emotionally more positive reviews are related to perceived review helpfulness, using the example of reviews for video games. However, results did not show a significant relationship between discreetly and overtly positive emotions and perceived review helpfulness. Results were only able to show that a more negative emotional tone is likely to be perceived as more helpful, therefore confirming the current literature on the relationship of negative reviews to perceived review helpfulness and a negativity bias. This observed relationship was not very strong, but it was, nevertheless, still observable. The analysis also showed that reviews with overtly negative emotional content were likely to be perceived as less helpful by users, but this observed relationship was also not very strong.

The research question which this paper addressed, to what extent discreetly and overtly positive emotions and emotional tone are related to perceived helpfulness in online reviews, can, therefore, only partly be answered. The extent to which discreetly and overtly positive emotions displayed in reviews are related to perceived review helpfulness is still unclear, as no significant relationships between the two independent variables and the outcome variable could be found. However, a small negative relationship between emotional tone and perceived review helpfulness could be observed, answering the second part of the research question.

In conclusion, the more positive the emotional tone of a review gets, the less helpful it is likely to be perceived, meaning that reviews with a more negative tone are perceived as more helpful. However, this relationship is not very strong, and if the content of a review is overtly negative, the review is also likely to be perceived as less helpful by a small margin.

References:

Barnett White, T. (2005). Consumer trust and advice acceptance: The moderating roles of benevolence, expertise, and negative emotions. Journal of Consumer Psychology, 15(2), 141-148.

Baumeister, R. F., Bratslavsky, E., Finkenauer, C., & Vohs, K. D. (2001). Bad is Stronger than Good. Review of General Psychology, 5(4), 323–370. https://doi.org/10.1037/1089-2680.5.4.323

Cao, Q., Duan, W., & Gan, Q. (2011). Exploring determinants of voting for the “helpfulness” of online user reviews: A text mining approach. Decision Support Systems, 50(2), 511-521. doi: 10.1016/j.dss.2010.11.009

Casaló, L. V., Flavián, C., & Guinalíu, M. (2008). The role of satisfaction and website usability in developing customer loyalty and positive word-of-mouth in the e-banking services. International Journal of Bank Marketing, 26(6), 399–417.

Chevalier, J., & Mayzlin, D. (2006). The Effect of Word of Mouth on Sales: Online Book Reviews. Journal Of Marketing Research, 43(3), 345-354. doi: 10.1509/jmkr.43.3.345

Field, A. (2018). Discovering statistics using IBM SPSS statistics (5th ed.). SAGE Publications.

Fiske, S. T. (1980). Attention and weight in person perception: The impact of negative and extreme behavior. Journal of Personality and Social Psychology, 38(6), 889.

Fox, J., & Weisberg, S. (2018). car: Companion to Applied Regression. [R package]. Retrieved from https://cran.r-project.org/package=car.

Frequently Asked Questions - Metacritic. (2020). Retrieved 7 July 2020, from https://www.metacritic.com/faq

How We Create the Metascore Magic - Metacritic. (2020). Retrieved 7 July 2020, from https://www.metacritic.com/about-metascores

Pennebaker, J. W., Booth, R. J., & Francis, M. E. (2007). Linguistic Inquiry and Word Count (LIWC2007). Austin, TX: LIWC (http://www.liwc.net).

Pennebaker, J. W., Booth, R. J., Boyd, R. L., & Francis, M. E. (2015a). Linguistic Inquiry and Word Count: LIWC2015. Austin, TX: Pennebaker Conglomerates (www.LIWC.net).

Pennebaker, J. W., Boyd, R. L., Jordan, K., & Blackburn, K. (2015b). The development and psychometric properties of LIWC2015. Austin, TX: University of Texas at Austin.

Pratto, F., & John, O. P. (1991). Automatic vigilance: The attention-grabbing power of negative social information. Journal of Personality and Social Psychology, 61(3), 380.

Kim, J., & Gupta, P. (2012). Emotional expressions in online user reviews: How they influence consumers' product evaluations. Journal Of Business Research, 65(7), 985-992. doi: 10.1016/j.jbusres.2011.04.013

Livingston, I. J., Nacke, L. E., & Mandryk, R. L. (2011, October). Influencing experience: the effects of reading game reviews on player experience. In International Conference on Entertainment Computing (pp. 89-100). Springer, Berlin, Heidelberg.

Malik, M., & Hussain, A. (2017). Helpfulness of product reviews as a function of discrete positive and negative emotions. Computers In Human Behavior, 73, 290-302. doi: 10.1016/j.chb.2017.03.053

Mudambi, S. M., & Schuff, D. (2010). Research note: What makes a helpful online review? A study of customer reviews on Amazon.com. MIS Quarterly, 185-200.

Packard, G., & Berger, J. (2017). How Language Shapes Word of Mouth’s Impact. Journal of Marketing Research, 54(4), 572–588. https://doi.org/10.1509/jmr.15.0248

R Core Team (2018). R: A Language and environment for statistical computing. [Computer software]. Retrieved from https://cran.r-project.org/.

See-To, E. W., & Ho, K. K. (2014). Value co-creation and purchase intention in social network sites: The role of electronic Word-of-Mouth and trust–A theoretical analysis. Computers in Human Behavior, 31, 182-189.

Sen, S., & Lerman, D. (2007). Why are you telling me this? An examination into negative consumer reviews on the Web. Journal Of Interactive Marketing, 21(4), 76-94. doi: 10.1002/dir.20090

Stephen, A. T. (2016). The role of digital and social media marketing in consumer behavior. Current Opinion in Psychology, 10, 17-21.

Tang, T. (Ya), Fang, E. (Er), & Wang, F. (2014). Is Neutral Really Neutral? The Effects of Neutral User-Generated Content on Product Sales. Journal of Marketing, 78(4), 41– 58. https://doi.org/10.1509/jm.13.0301

The critical role of reviews in Internet trust. Trustpilot. (2020). Retrieved 30 June 2020, from https://business.trustpilot.com/guides-reports/build-trusted-brand/the-critical-role-of-reviews-in-internet-trust.

The jamovi project (2019). jamovi. (Version 1.1) [Computer Software]. Retrieved from https://www.jamovi.org.

Video Games - worldwide | Statista Market Forecast. Statista. (2020). Retrieved 29

Yi, S., & Baumgartner, H. (2004). Coping with negative emotions in purchase‐related situations. Journal of Consumer Psychology, 14(3), 303-317.

Yin, D., Bond, S., & Zhang, H. (2014). Anxious or Angry? Effects of Discrete Emotions on the Perceived Helpfulness of Online Reviews. MIS Quarterly, 38(2), 539-560. doi: 10.25300/misq/2014/38.2.1

Appendix:

Linear Regression

Model Fit Measures

                                   Overall Model Test
Model  R      R²      Adjusted R²  F     df1  df2     p
1      0.211  0.0444  0.0444       6906  2    297007  < .001
2      0.211  0.0447  0.0447       1737  8    297001  < .001

Model Comparisons

Comparison (Model - Model)  ΔR²      F     df1  df2     p
1 - 2                       2.49e-4  12.9  6    297001  < .001
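The model comparison can be checked with the standard F test for the change in R² between nested linear models. The sketch below is an illustration under that assumption and uses only the rounded values reported above; the function name is hypothetical, and small deviations from the reported F value are to be expected because of rounding.

```python
def f_change(delta_r2: float, r2_full: float, df1: int, df2: int) -> float:
    """F statistic for the increase in R² when `df1` predictors are added
    to a nested linear model; `df2` is the residual df of the full model."""
    return (delta_r2 / df1) / ((1.0 - r2_full) / df2)


# Rounded values from the Model Fit Measures and Model Comparisons tables.
DELTA_R2 = 2.49e-4   # R² gained by adding the six emotion variables
R2_MODEL_2 = 0.0447  # R² of the full model (Model 2)
DF1 = 6              # number of predictors added in Model 2
DF2 = 297001         # residual degrees of freedom of Model 2

f_value = f_change(DELTA_R2, R2_MODEL_2, DF1, DF2)
print(f"F change = {f_value:.1f}")
```

With these inputs the statistic comes out at roughly 12.9, in line with the reported comparison: the increase in R² is statistically significant mainly because of the very large sample, while the additional variance explained remains very small.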

Model Specific Results

Model 1

Model Coefficients - usefulness rating

                                    95% Confidence Interval
Predictor        Estimate  SE       Lower  Upper     t        p       Stand. Estimate
Intercept        14.02     0.06765  13.89  14.15641  207.31   < .001
Avg_metascore_w  -1.09     0.00924  -1.10  -1.06707  -117.38  < .001  -0.2106

Assumption Checks

Collinearity Statistics

                 VIF   Tolerance
WC               1.00  1.000
Avg_metascore_w  1.00  1.000

Q-Q Plot

Model 2

Model Coefficients - usefulness rating

                                        95% Confidence Interval
Predictor            Estimate  SE       Lower     Upper     t         p       Stand. Estimate
Intercept            14.54936  0.1386   14.27768  14.82104  104.962   < .001
WC                   7.89e-4   1.43e-4  5.09e-4   0.00107   5.515     < .001  0.01039
Avg_metascore_w      -1.06261  0.0103   -1.08281  -1.04241  -103.114  < .001  -0.20617
Overtly positive     0.03939   0.1281   -0.21161  0.29039   0.308     0.758   9.35e-4
discreetly Positive  -0.17579  0.0959   -0.36373  0.01215   -1.833    0.067   -0.00535
Neutral              -1.37748  0.2084   -1.78593  -0.96903  -6.610    < .001  -0.01463
discreetly negative  -0.18431  0.1007   -0.38172  0.01311   -1.830    0.067   -0.00544
Overtly negative     -0.33639  0.1291   -0.58940  -0.08338  -2.606    0.009   -0.00797
Tone                 -0.00528  9.82e-4  -0.00721  -0.00336  -5.382    < .001  -0.01362

Linear Regression Assumption Checks

Collinearity Statistics

                     VIF   Tolerance
WC                   1.10  0.906
Avg_metascore_w      1.24  0.805
Overtly positive     2.88  0.348
discreetly Positive  2.65  0.378
Neutral              1.52  0.656
discreetly negative  2.75  0.364
Overtly negative     2.91  0.343
Tone                 1.99  0.502

Q-Q Plot


Correlation Matrix

                                usefulness rating  posemo      negemo      Tone        WC          Avg_metascore_w
usefulness rating  Pearson's r  —
                   p-value      —
posemo             Pearson's r  -0.063 ***         —
                   p-value      < .001             —
negemo             Pearson's r  0.054 ***          -0.108 ***  —
                   p-value      < .001             < .001      —
Tone               Pearson's r  -0.083 ***         0.602 ***   -0.468 ***  —
                   p-value      < .001             < .001      < .001      —
WC                 Pearson's r  0.010 ***          -0.204 ***  -0.025 ***  -0.091 ***  —
                   p-value      < .001             < .001      < .001      < .001      —
Avg_metascore_w    Pearson's r  -0.211 ***         0.229 ***   -0.211 ***  0.324 ***   -0.026 ***  —
                   p-value      < .001             < .001      < .001      < .001      < .001      —

Note. * p < .05, ** p < .01, *** p < .001

Descriptives

                    usefulness rating  posemo  negemo  Tone    WC      Avg_metascore_w
N                   297010             478727  478727  478727  478727  478727
Missing             181717             0       0       0       0       0
Mean                7.35               6.01    2.48    62.3    163     6.67
Standard deviation  14.3               5.50    3.10    36.6    186     2.51
Minimum             1                  0.00    0.00    0.00    0       0.00
Maximum             806                100     100     99.0    4586    10.0

Descriptive Plots
