Improving restaurant satisfaction through mining latent dimensions and attributes from online reviews

Academic year: 2021

Amsterdam Business School, University of Amsterdam

Master Thesis in Business Administration -- Digital Business track

Improving restaurant satisfaction through mining latent dimensions and attributes from online reviews

(2016-2017)

Supervisor: Dr. Frederik Situmeang

Zihou Zhang

Student number: 11204079

Statement of originality

This document is written by Student Zihou Zhang who declares to take full responsibility for the contents of this document.

I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it.

The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.

Table of Contents

Statement of originality

Lists of Tables and Figures

Abstract

1. Introduction

2. Literature Review

2.1. Web 2.0 and online reviews

2.2. Customer satisfaction and online ratings

2.3. Innovation and newness

2.4. Product attributes and latent dimensions

2.5. Research hypothesis and conceptual model

3. Method

3.1. Sampling

3.2. Latent dimension extraction

3.2.1. Text pre-processing

3.2.2. Dimension extraction

3.3. Key attributes identification and related sentiment and newness analysis

3.3.1. Key attributes identification

3.3.2. Sentiment and newness analysis

3.4. Variables description

3.4.1. Independent variables

3.4.2. Dependent variable

3.4.3. Moderating variables

4. Results

4.1. Dimensions of customer satisfaction

4.2. Description statistics and correlation

4.3. Test of Hypothesis

5. Discussion and Implication

6. Limitation and Further Research

7. References

Lists of Tables and Figures

Table 1. Newness word list

Table 2. Topic identification example

Table 3. Classification of extracted dimensions

Table 4. LDA analysis and rater validation

Table 5. Means, Standard Deviation, Correlation

Table 6. Estimate of Fixed Effects

Table 7. Estimates of Covariance Parameters

Figure 1. Conceptual Model

Figure 2. Framework for extracting latent dimensions and sentiment and newness analysis

Figure 3. LDA conceptual model

Figure 4. Word cloud

Figure 5. Top 5 most frequent words

Figure 6. Extracted dimensions

Figure 7. Customer satisfaction affected by food and sentiment

Figure 8. Customer satisfaction affected by order and sentiment

Figure 9. Customer satisfaction affected by pizza and sentiment

Figure 10. Customer satisfaction affected by order and newness

Abstract

The purpose of this study is to contribute to the marketing literature and practice by extracting latent dimensions of customer satisfaction in the restaurant industry and examining the relationship between restaurant attributes and customer satisfaction. Previous research in this field has largely relied on qualitative surveys or on quantitative online ratings alone. However, more advanced text mining techniques now offer the opportunity to extract meaning from customers’ online reviews. By analyzing 51,110 online reviews for 1,610 restaurants across 7 US states via LDA, this study uncovers 30 latent dimensions that are determinants of customer satisfaction. This study also defines an original list of words that customers use to express their perception of newness and assigns scores to these words to reflect their degree of newness. By applying sentiment and newness analysis, this paper finds that the key restaurant attributes food and order both have a negative effect on customer satisfaction, while place and pizza both have a positive effect, and that sentiment and newness about these attributes also have a positive effect on customer satisfaction.

Keywords: Online reviews, Latent Dirichlet Allocation, Customer satisfaction, Restaurant

1. Introduction

Online reviews play a vital role in the customer decision-making process when buying a product. Through online reviews, customers gather product information, compare product features, and search for convenient channels. With the extensive diffusion of Web 2.0, the Internet has fostered a rapid rise in online reviews. As a result, the Web has become an outstanding place to collect customer opinions. The role of online reviews is expected to become increasingly essential because of the prosperity of e-commerce (Lu Yue et al. 2010). Amazon, eBay, and TripAdvisor are examples of well-known sources containing such reviews. In the hospitality industry, websites like Yelp provide an excellent platform for information seekers to search reviews and online ratings and to share their own “voice” with others about the different aspects of service at restaurants and hotels (Gupta et al. 2015).

Previous research in marketing and customer management has defined customer satisfaction as the extent to which products and services offered by a company meet or surpass customer expectations (Farris et al. 2010). Attributes reflect the key aspects of the product and can determine the extent of customer satisfaction (Stuart J. Barnes 2017). Customer satisfaction is very important for repeat purchases, brand loyalty, and word-of-mouth (WOM) publicity (Cronin, Taylor 1992; Hui, David 2007).

As discussed above, online reviews contain multi-dimensional constructs (e.g. price, location, facilities, and quality) and can be regarded as an important source for the formation of satisfaction. Researchers can use text mining techniques to extract these factors instead of focusing only on star ratings and qualitative surveys. Businesses can find out at no cost which factors are most important to customers and make improvements, thus improving profitability (Anderson et al. 1994).

for the first time (Pullman et al. 2004). Different degrees of newness can be described as different levels of innovation, ranging from minor to significant (Johannessen et al. 2001). Previous studies distinguish innovations by their degree of newness from both the firms’ and the customers’ perspective (Damanpour et al. 1991). According to Kim et al. (2012) and Zhang et al. (2015), both the newness of a technology and its newness to customers can improve customer satisfaction. Szymanski et al. (2007) indicate that there is a linear relationship between newness and performance. Thus, newness can bring benefits to both customers and firms.

However, previous studies rely heavily on traditional qualitative or quantitative research (e.g. designing questionnaires and collecting customer feedback) to identify the determinants of satisfaction (e.g., Law & Hsu, 2005; Barsky, 1992; Cheung, 2005; Hyun, 2010). Although some researchers in the hospitality industry have attempted to discover the relationship between attributes and customer satisfaction (Li et al. 2013; Berezina et al. 2016; Lu Gao et al. 2016), they neglect the newness embodied in the reviews, which can be regarded as a dependent or moderating variable and can influence customer satisfaction. Moreover, because such research is costly and time-consuming, it often fails to use sufficiently large samples. In addition, guided by their own knowledge and research direction, many researchers focus only on the industries they are familiar with, so insufficient samples and measurements occur very frequently in previous studies (Danaher et al. 1996). Their methods also depend on customers’ willingness to engage, so bias and inaccuracy arise when only a limited number of customers participate.

To solve these problems, this study focuses on topic-based opinion mining, which aims to extract latent dimensions from customer reviews and treats sentiment and newness as moderating variables. Through advanced techniques developed in natural language processing (NLP) and text mining, multiple dimensions in the restaurant industry can be found, and these dimensions are key determinants of consumer satisfaction. Firstly, this approach extracts dimensions from customer reviews and examines them via expert validation. Secondly, it defines an original list of words that customers use in their reviews to describe something as new and assigns scores to these words according to their degree of newness. In this way, these words can be seen as identifiers of different levels of newness in customer reviews. Finally, it uses sentiment analysis and newness analysis to identify the relationship between attributes and customer satisfaction.

To achieve the above objectives, the following research questions need to be answered:

1. What are the key dimensions of customer satisfaction expressed in online reviews in the restaurant industry?

2. Do product attributes affect customer satisfaction?

3. What is the effect of sentiment and newness on the relationship between attributes and customer satisfaction?

This paper has two parts. The first aims to extract latent dimensions of customer satisfaction, while the second explores the relationship between restaurant attributes and customer satisfaction and the role that sentiment and newness play in this relationship. To achieve these goals, the next two sections present the literature review and the methodology. The results are then explained, and the paper concludes by discussing both the theoretical and the managerial implications.

2. Literature Review

2.1. Web 2.0 and online reviews

The concept of Web 2.0 began with a conference between O’Reilly and MediaLive International in 2005 (O’Reilly 2005). Evans (2010) stated that Web 2.0 makes it possible to create content, exchange content, and share comments through social media. Berthon et al. (2012) agreed with Evans (2010) and described Web 2.0 as a set of technological innovations that enable inexpensive content creation and put users at the center stage of design, community, and collaboration. Forms of Web 2.0 include social media and social networking (e.g., Facebook, Twitter, Instagram), blogs, wikis, forums, social bookmarking, video sharing sites (e.g., YouTube), Web applications, and collaborative platforms (Herring 2011). When comparing Web 1.0 and Web 2.0, however, the major difference lies not only in technological innovation but also in sociological effects. Web 2.0 focuses on consumers, communities, networks, and participation, while Web 1.0 is about companies, individuals, nodes, and publishing (Berthon et al. 2012). Against this background, people have more opportunities for interaction through the Internet.

User-generated content (UGC) can be defined as any form of blogs, wikis, posts, tweets, reviews, discussion forums, images, audio files, videos, advertisements, and other forms of media generated by users, often made available through social media websites (Kaplan et al. 2010). The prosperity of Web 2.0 has promoted a host of new services and possibilities on the Internet, and users can generate content that can be easily accessed, viewed, and downloaded by others without limitations of space and time (George 2007). As part of UGC, online reviews enable customers to filter out unrelated and unclear product information and to rely directly on first-hand usage judgments from other users, so they are important for businesses (e.g. hotels). According to Dina (2006), 89% of customers read

are as reliable as personal recommendations. Besides, a business with “excellent” reviews can have 31% more revenue than those without “excellent” reviews. Cui, Lui & Guo (2012) revealed that the number of reviews has a significant effect on new product sales in the early periods, although this effect decreases over time. Moreover, online reviews have become a new form of word of mouth and can function as free advertising (Jansen et al. 2009). The reviews are not just a passing comment to a friend or two; they may remain online for years, affecting countless potential customers. In addition, a business can find constructive criticism and suggestions from customers, thus improving product or service quality and building a closer relationship with customers at the same time. Therefore, many companies actively encourage their customers to share personal experiences. For example, some firms offer products to customers for free as long as the customers share their experience on social media (e.g. YouTube, Facebook); such customers are also called influencers. This works because customers are more willing to believe real users than commercial advertisements (Nielsen 2011). Existing studies indicate that online reviews can be regarded as a rich pool for businesses to measure customer engagement with a brand and thereby predict customers’ purchasing behavior (Malthouse, Edward C., et al. 2016; Baka 2016).

2.2. Customer satisfaction and online ratings

The concept of customer satisfaction occupies a central position in business practice. Marketing researchers and social psychologists have studied customer satisfaction extensively. According to the World Tourism Organization (1985), customer satisfaction is a psychological concept involving the sentiment that results from customers’ expectations of a product. It has been defined as a function of customer expectation and actual product performance (Fornell, 1994). Oliver (1980) offered a similar definition in proposing expectancy disconfirmation theory, which indicates that customers purchase products

product, they will compare the outcomes with expectations. If the outcomes match or surpass their expectations, customers will be satisfied. On the other hand, if the outcomes fall short of expectations, customers will be dissatisfied.

Customer satisfaction is a key performance indicator (Chan et al. 2004). This is because customer satisfaction is a leading factor in customer purchase intentions and loyalty, and it can also indicate how well firms are currently performing (Farris et al. 2010). If customers are satisfied with products, they are more likely to purchase again and to recommend the products to others (Bowen et al. 2001). In this way, customer lifetime value is also expected to increase (Berger et al. 1998). Besides, higher satisfaction can increase customer retention and reduce customer churn. According to the Accenture global customer satisfaction report (2008), customer satisfaction, rather than product price, is the main driver of customer churn.

Traditionally, there have been two kinds of methods to measure customer satisfaction: direct and indirect. Among direct methods, the most popular is to collect customer feedback directly through a survey, which can be conducted online or offline. This method reflects the “voice” of customers with high accuracy, but it is also very costly and requires long preparation, such as questionnaire design and content formulation (Lee & Feick, 2001). Among indirect methods, customer loyalty and Net Promoter Score (based on how likely a customer is to recommend a product or service to others, rated on a scale from 0 to 10; the higher the score, the more likely the customer is to refer others, and hence the higher the customer satisfaction) can be regarded as effective measurements (Keiningham et al. 2007). Although indirect methods can reflect customer satisfaction to some extent, they still consume both time and money. This study adopts a more convenient and accurate method, discussed below, that measures customer satisfaction via secondary data acquired from Yelp.

Some recent empirical studies (e.g., Tobias & Patrick & Michael, 2015; Yue Guo et al. 2016) have found that product ratings reflect customer satisfaction. Tobias et al. (2015) proposed that pre-purchase expectations and post-purchase product performance both have a positive impact on customer ratings, which means that the relation between the two reflects customer satisfaction. To test this hypothesis, the authors examined a total of 7,834,166 Amazon online reviews from 1996 to 2014. The results indicated that both factors have a significant influence on customer online ratings, supporting the finding that customer satisfaction is the antecedent of online ratings and that different online rating scores can be regarded as degrees of customer satisfaction rather than of product quality. This study adopts this finding and argues that online ratings represent customers’ satisfaction with the product (Engler et al. 2015; Yue Guo et al. 2016) and can be regarded as quantified customer satisfaction. For example, if a customer leaves a one-star rating for a restaurant (ratings range from one to five stars), the customer is not satisfied with the restaurant; a five-star rating means the customer is satisfied.

These studies attempted to identify meaningful insights directly from online reviews through text mining, instead of spending enormous time running a survey and waiting for customers’ responses. However, few studies have investigated the relationship between attributes and customer satisfaction (identified by online ratings) in the restaurant industry. This thesis therefore fills this gap and, as a novel step, treats sentiment and newness as moderating variables, which are discussed in more depth in the following sections.

2.3. Innovation and newness

fields (e.g., Drazin & Schoonhoven, 1996; Kanter, 1985), on innovation as a way to create and sustain superior competitive advantages. Innovation is regarded as a fundamental element of entrepreneurship and a crucial component of business success (Ries, 2011). However, due to the lack of good measures of innovation, there is a long-running debate about its definition (Kotabe & Scott Swan, 1995). Therefore, in order to clarify the definition and measurement of innovation, three questions should be addressed: what is new, how new, and new to whom? (Lumpkin, 2001)

Several researchers have answered the question “what is new”. Johannessen, Olsen & Lumpkin (2001) indicated that innovation implies newness, and that newness is the essence of innovativeness. Damanpour (1991) defined innovation as “the generation, development, and adoption of novel ideas on the part of the firm”. Nohria and Gulati (1996) defined innovation to include any policy, structure, method or process, or any product or market opportunity that the relevant customer perceives to be new. Similarly, Zaltman et al. (1973) defined innovation as any idea, practice, or material artifact perceived to be new by the relevant customers and markets. Slappendel (1996) argued that the perception of newness is important to the concept of innovation because it helps to separate innovation from change. Therefore, the words and phrases that indicate newness in customer reviews reflect newness as perceived by customers. If a customer says “The new window is really beautiful”, the word “new” can indicate innovation. This thesis follows these widely accepted definitions and holds that what is new is a range of innovative activities as perceived by customers and markets.

Generally, to answer the question “how new”, researchers have categorized innovation into incremental innovation and radical innovation (e.g., Brentani, 2001; De Jong & Vermeulen, 2003). Incremental innovation attempts to satisfy the needs of existing customers or markets at a rate consistent with the current technological trajectory (Adler et al. 2009). Therefore, this kind of innovation normally presents slight variations of current products or services (Damanpour, 1991). Conversely, radical innovation involves a bigger change than incremental innovation, as it aims to meet the needs of emerging customers or markets (Jansen et al. 2006). Because of this long-term development strategy, firms intend to disrupt the existing technological trajectory and develop new prototypes and new technologies for new markets (Gatignon et al. 2002). Radical innovation therefore aims to replace existing technologies by transforming old knowledge into new knowledge, thus generating fundamental changes and large variations in current products or services (Subramaniam & Youndt, 2005).

To answer the question “new to whom”, some research takes the view of firms or relevant units of adoption in advanced technology-driven industries (e.g., Zaltman 1973), while research in the hospitality industry pays more attention to the view of customers and markets (e.g., Gatignon et al. 1991; Victorino et al. 2005). With regard to this classification, this study adopts the perspective of customers. In addition, this study defines an original list of words (such as innovative, new, modern, and invention) that customers use to express their perception of newness in online reviews, and assigns different scores to these words based on the degree of newness embedded in them, in order to apply newness analysis and find the effect of newness on customer satisfaction.

However, prior research relies mainly on experiments, investigations, and surveys (Kleinschmidt et al. 1991; Steenkamp et al. 2003; Gielens et al. 2007); no research has attempted to describe the degree of innovation through the words expressed in customer reviews. This study has three features that distinguish it from prior literature. Firstly, this study stands the

degree of newness and explores the effect of newness on the relation between attributes and customer satisfaction. Secondly, compared to previous studies, this study considers online reviews and ratings together and regards online ratings as quantified customer satisfaction. More effort has been devoted to analyzing a large-scale dataset with the aim of finding the latent attributes and dimensions that determine customer satisfaction. Finally, by applying Latent Dirichlet Allocation (LDA), an advanced natural language processing method, this study can extract more realistic meanings from customer reviews and illustrates that LDA is applicable to a large dataset and helps achieve more reliable generalization than previous research.

2.4. Product attributes and latent dimensions

An attribute is a characteristic that defines a particular product and will affect a customer’s purchase decision (Keller et al. 1993). According to Myers & Chris A (2003), product attributes can be tangible or intangible. Tangible attributes can include size, color, weight, volume, quantity, or material composition, while intangible attributes consist of characteristics such as price, quality, aesthetics, and reliability. Both tangible and intangible attributes can affect customers’ purchase choice (Salem Khalifa et al. 2004). For example, if a customer intends to buy a car, tangible attributes such as color, size, and physical composition are very important, but intangible attributes such as quality, price, and after-sales service also play a key role.

Latent dimensions are variables that customers may not explicitly express but that represent a number of attributes, often inferred indirectly from other indicators (Yue Guo et al. 2016). For example, it may be difficult to identify a person’s social status, but we can infer this information from the person’s income and profession. In this case, income and profession are latent dimensions that determine social status. The same idea can be extended to customer satisfaction: many latent dimensions determine overall customer satisfaction. This study aims to find those latent determinants of customer satisfaction in the restaurant field. With the boom in social media, it is very easy to acquire a huge amount of online content. This is particularly the case for the restaurant industry, since customers can leave their reviews and share their experience online. Therefore, UGC can serve as a rich source from which to extract customer satisfaction dimensions; these dimensions represent the “voice of customer” (Coulter et al. 1995) and can be regarded as key factors of customer satisfaction. In the past decades, quite a few studies have examined the phenomenon of UGC with the aim of identifying latent dimensions such as reviewers’ social identity and their preferences (e.g. William et al. 2013; Floyd, Kristopher, et al. 2014). Currently, an emerging stream of research in marketing has managed to extract the dimensions of products from online reviews. Netzer et al. (2012) used semantic analysis to examine the influence of product reviews on customer decision making. Li et al. (2013) explored the latent attributes that determine customer satisfaction in the hospitality field by analyzing customer reviews. Lv & Shao Hua et al. (2011) applied LDA to restaurant reviews to acquire useful topics and calculate the scores of these topics, in order to predict the rank of a restaurant based on new reviews that contain these topics. This study proposes that online reviews provide a rich dataset for finding the latent dimensions of customer satisfaction in the restaurant industry, thus helping people to understand customer satisfaction and providing managerial implications for businesses.

2.5. Research hypothesis and conceptual model

In the restaurant industry, complex attributes affect customer behavior and perception (Johns et al. 2002; Koo et al. 1999). Hyun, Sunghyup Sean (2010) analyzed 233 e-mail surveys and indicated that attributes influence loyalty formation and customer satisfaction. Byrne et al. (1994) found that safety, freshness, nutrition, and a less detrimental environmental impact are key attributes for customers deciding whether to buy organic food. Thompson et al. (1998) found that store location had a significant impact on product purchase. These attributes are focal points for customers’ evaluation of the product. In customer reviews, the more frequently an attribute occurs, the more important that attribute is (Tirunillai & Tellis, 2014). This study applied a simple word count to find restaurant attributes in a large online review dataset and ranked them by frequency. The five most important attributes (food, place, order, service, and pizza) were selected as representative restaurant attributes and used for further exploration. Based on previous research, this paper holds that there is a relationship between product attributes and customer satisfaction:

H1. There is a relationship between attributes and customer satisfaction (online rating).
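The word-count step mentioned above can be sketched as follows. This is only an illustration under assumptions: the function name `top_attributes` is hypothetical, the input is assumed to be already tokenized reviews, and the sketch counts every token, whereas picking out attribute words from the ranked list would still require a manual or part-of-speech-based selection that is not shown here.

```python
from collections import Counter


def top_attributes(token_lists, k=5):
    """Rank words by how often they occur across all tokenized reviews.

    token_lists: iterable of lists of tokens, one list per review.
    Returns the k most frequent (word, count) pairs.
    """
    counts = Counter(token for tokens in token_lists for token in tokens)
    return counts.most_common(k)


if __name__ == "__main__":
    # Toy, made-up reviews used purely to exercise the function.
    reviews = [
        ["food", "place", "food", "nice"],
        ["order", "food", "place"],
    ]
    for word, count in top_attributes(reviews, k=3):
        print(word, count)
```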

2.0, sentiment analysis has become increasingly popular and is widely applied to voice-of-customer sources such as Twitter posts and online reviews. This kind of analysis aims to identify the attitude of a writer towards certain topics, or the overall contextual polarity or emotional reaction to a piece of content, an interaction, or an activity (Kowcika et al. 2013). Some studies in marketing research and consumer behavior suggest that sentiment polarity is an important moderating factor influencing the relationship between product attributes and customer satisfaction (Salehan et al. 2016; Silveira Chaves & Laurel, 2014). A positive sentiment means a positive emotion. When positive emotion is embodied in the product attributes, the link between attributes and post-purchase evaluation may be stronger, and customers are more likely to be satisfied (Yuksel et al. 2010). Hence, if customer emotion towards product attributes is very positive, this normally leads to higher satisfaction (Fornell, C. 1992).

H2: Sentiment moderates the relation between product attributes and consumer satisfaction, and a higher positive sentiment will lead to a higher consumer satisfaction.

According to Talke & Katrin et al. (2009), newness and innovation can have an effect on product performance. Innovation is a vital activity that almost all firms need to adopt in order to create and maintain a competitive advantage (Chesbrough, 2006), as it can increase productivity by creating and executing new processes. In the manufacturing industry, innovation can help firms reduce costs and increase production speed (Rothwell, 1994). For example, with a new coffee machine in place, coffee production can be cheaper and faster. In the hospitality industry, innovation can improve product or service quality (Davenport, 2013). As discussed before, this paper distinguishes innovation by the degree of newness. Thus, when customers perceive something as new, they tend to be more satisfied. As with sentiment, when theorizing and extending these ideas, one may argue that when newness is embodied in the product attributes, the link between attributes and post-purchase evaluation may be stronger and customers are more likely to be satisfied.

H3: Newness moderates the relation between product attributes and consumer satisfaction, and a higher degree of newness will lead to a higher consumer satisfaction.
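One common way to express moderation hypotheses of this kind is a regression with interaction terms. The equation below is a generic sketch and not necessarily the specification estimated in this study; the symbols are assumptions introduced here for illustration: A_i is the measure of an attribute in review i, S_i its sentiment score, N_i its newness score, and Rating_i the star rating.

```latex
\[
\mathrm{Rating}_{i} = \beta_0 + \beta_1 A_{i} + \beta_2 S_{i} + \beta_3 N_{i}
  + \beta_4 \,(A_{i} \times S_{i}) + \beta_5 \,(A_{i} \times N_{i}) + \varepsilon_{i}
\]
```

Under this sketch, H1 concerns the attribute coefficient (β1), while the moderation claims in H2 and H3 concern the interaction coefficients (β4 and β5, respectively).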

In sum, this study provides five unique contributions to the marketing literature. First, this research developed the dimensions of satisfaction in the restaurant industry based on natural language processing. Second, through the application of textual and numerical data, this study analyzed the relationship between key restaurant attributes and star ratings, thus identifying what role attributes play in customer satisfaction. Third, this study proposed an original list of words to describe newness and assigned scores to these words by their degree of newness, which can be used in other research. Fourth, this study clarified the effect of sentiment and newness on the relation between attributes and customer satisfaction in the restaurant industry. Finally, this study presented a powerful method that both academics and businesses can use to measure the effect of product attributes on their customers’ satisfaction via online ratings and reviews.

For managerial practice, this study presents latent dimensions of customer satisfaction for managers and product owners in the restaurant industry. It also indicates what role sentiment and newness play in customer satisfaction, so firms can pay more attention to these attributes and related dimensions and optimize their innovation. These attributes and related dimensions stem from the actual “voice of customer” (Abbie 1993), which can be described as:

“Collective insight into customer needs, wants, perceptions, and preferences gained through direct and indirect questioning. These discoveries are translated into meaningful objectives

(BusinessDictionary).

3. Method

This paper proposes a framework to extract latent dimensions from UGC. As shown in Figure 2, this framework summarizes all procedures in this study and can be used as a guideline for researchers and businesses to identify latent dimensions from multidimensional constructs and to explore the relationship between key product attributes and customer satisfaction, while considering the sentiment and newness of these attributes as moderators.

3.1. Sampling

This study targets the restaurant sector, and the empirical setting is a customer review website. Yelp, one of the biggest review websites in the world, was founded in 2004 and publishes crowd-sourced reviews of local businesses such as restaurants, bars, and local services. For a more detailed description, see Yelp.com. Yelp has opened the “Yelp Dataset Challenge” for academics and released its real business dataset for research. This large dataset includes business, review, user, and check-in data in the form of separate JSON objects. A business object holds the type of business, location, rating, categories, business name, and unique business id. A review object holds the rating and review text, together with the associated business id and user id. This paper focuses on these two JSON files. The author first used JSON to CSV Converter (see json-csv.com) to convert the two JSON files into CSV format, then loaded the two CSV datasets into Microsoft Access and wrote a query to select only the businesses within the “restaurant” categories, together with their location, ratings, and reviews. The resulting sample contains 1,610 restaurants with 51,100 reviews across 7 states in America.
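The selection step was carried out in Microsoft Access, but the same filter can be sketched directly on the JSON records. The snippet below is only an illustration: the field names (`business_id`, `categories`, `stars`, `text`) and the two sample records are assumptions made for the example, not the exact schema or data used in this study.

```python
import json


def select_restaurants(business_lines, review_lines, keyword="restaurant"):
    """Keep businesses whose categories mention `keyword`, plus their reviews."""
    restaurants = {}
    for line in business_lines:
        biz = json.loads(line)
        categories = biz.get("categories") or []
        # Assumption: categories may arrive as a list or as a comma-separated string.
        if isinstance(categories, str):
            categories = [c.strip() for c in categories.split(",")]
        if any(keyword in c.lower() for c in categories):
            restaurants[biz["business_id"]] = biz

    # Keep only the reviews that point at a selected business.
    reviews = []
    for line in review_lines:
        rev = json.loads(line)
        if rev["business_id"] in restaurants:
            reviews.append(rev)
    return restaurants, reviews


# Hypothetical records standing in for the business and review files.
business_lines = [
    '{"business_id": "b1", "name": "Pasta Place", "state": "AZ", "stars": 4.0, '
    '"categories": ["Restaurants", "Italian"]}',
    '{"business_id": "b2", "name": "Quick Cuts", "state": "NV", "stars": 3.5, '
    '"categories": ["Hair Salons"]}',
]
review_lines = [
    '{"business_id": "b1", "user_id": "u1", "stars": 5, "text": "Great pasta."}',
    '{"business_id": "b2", "user_id": "u2", "stars": 2, "text": "Long wait."}',
]

restaurants, reviews = select_restaurants(business_lines, review_lines)
print(len(restaurants), len(reviews))
```

Joining on the business id in this way mirrors the relational query described above: each retained review carries the id of a restaurant in the sample.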

3.2. Latent dimension extraction

3.2.1. Data pre-processing

The pre-processing step was similar to those adopted in previous research (Yang & Lin, 2011; Zhao et al., 2016). It involved tokenizing, lower-casing each word, stemming, and removing low-frequency words and common stopwords such as “the” and “we”. For example, a raw review reads:

“I like the food at this local place, but it was crowded and I have to wait the food for an hour to come out of the kitchen. Food was good but I am not sure why there was a long wait. But the environment is very nice.”

After pre-processing, the review appears like:

“like food, this local place but crowded have to wait food hour come out kitchen food good not sure why there long wait but environment very nice”
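A minimal sketch of such a pipeline is given below. It covers tokenizing, lower-casing, stopword removal, and low-frequency filtering; stemming is left out because it would need an external stemmer. The stopword set and the `min_count` threshold are illustrative assumptions, not the settings used in this study, so the output differs from the example above.

```python
import re
from collections import Counter

# Illustrative stopword subset; a real list would be much longer.
STOPWORDS = {"the", "we", "i", "a", "an", "and", "at", "it",
             "was", "is", "am", "for", "of", "to"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def preprocess(reviews, min_count=2):
    """Tokenize, drop stopwords, then drop words that are rare in the corpus."""
    tokenized = [[t for t in tokenize(r) if t not in STOPWORDS] for r in reviews]
    counts = Counter(t for doc in tokenized for t in doc)
    return [[t for t in doc if counts[t] >= min_count] for doc in tokenized]


reviews = [
    "I like the food at this local place, but it was crowded.",
    "Food was good but I am not sure why there was a long wait.",
]
print(preprocess(reviews))
# → [['food', 'but'], ['food', 'but']]
```

With only two reviews the frequency filter is very aggressive; on a corpus of tens of thousands of reviews the same threshold removes only genuinely rare words.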

3.2.2. Dimension extraction

One of the main contributions of this study is the effective extraction of latent dimensions affecting customer satisfaction by mining online reviews. To answer the first research question, this study uses Latent Dirichlet Allocation (LDA), which is the most common topic modeling method in natural language processing (NLP) (Poria & Gelbukh, 2016). LDA assumes that each review is a mixture of topics in certain proportions. For example, a review may be 80% topic A and 20% topic B. Each topic, in turn, is a mixture of words. For example, the topic “food” may contain the words “pasta”, “sauce”, and “meat”. Topic modeling via LDA can not only identify a cluster of topics (e.g., aspects that affect customer satisfaction) from massive volumes of documents (reviews) but also support labeling these aspects to form latent dimensions. In this case, a “dimension” is defined as a latent construct distributed over the vocabulary of words that customers use to express their restaurant experience, also referred to as a “topic” in LDA-related research (Guo & Stuart et al., 2017).

LDA has several advantages. First, as an unsupervised machine learning approach, LDA efficiently translates large-scale data into interpretable topics and gives the probability distribution of words within each topic. Second, LDA allows the frequency of the extracted dimensions to be compared on the basis of customer experience. Customers use their own words to describe their opinions of restaurant aspects such as food and environment, and the resulting topics reflect how important those aspects are to their satisfaction with the dining experience. This paper uses the Stanford Topic Modeling Toolbox to extract latent dimensions via LDA (Ramage & Rosen, 2009).

As shown in Figure 3, the LDA model has three layers. The inner layer, the word level, contains the variables z and w, which are sampled once for each word in each document. In the middle layer, θ is a document-level latent variable sampled once per document. The outer layer contains the parameters α and β, which are sampled once for the corpus.

LDA assumes that a sequence of N words forms a review, which is a “document” in the model, written $\mathbf{w} = (w_1, w_2, \dots, w_N)$, and that M reviews form a corpus $D = \{\mathbf{w}_1, \mathbf{w}_2, \dots, \mathbf{w}_M\}$. Below are the generative steps for each review in a corpus (Blei, Ng, & Jordan, 2003):

1. Choose $N \sim \mathrm{Poisson}(\xi)$.

2. Choose $\theta \sim \mathrm{Dir}(\alpha)$.

3. For each of the N words $w_n$:

a. Choose a topic $z_n \sim \mathrm{Multinomial}(\theta)$, and

b. Choose a word $w_n$ from $p(w_n \mid z_n, \beta)$, a multinomial probability conditioned on the topic $z_n$.
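The generative steps above can be simulated in a few lines. The sketch below is illustrative only: the vocabulary, the two topics, and the values of α, β, and the Poisson rate are made-up assumptions, and the code generates a synthetic review rather than fitting a model to data.

```python
import math
import random


def sample_poisson(rng, rate):
    """Draw from Poisson(rate) with Knuth's method (adequate for small rates)."""
    threshold = math.exp(-rate)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_dirichlet(rng, alpha):
    """Draw from Dir(alpha) by normalizing independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_review(rng, alpha, beta, vocab, rate):
    """Generate one synthetic review following steps 1-3."""
    n_words = sample_poisson(rng, rate)           # step 1: review length N
    theta = sample_dirichlet(rng, alpha)          # step 2: topic mixture
    topics, words = [], []
    for _ in range(n_words):                      # step 3: one pass per word
        z = rng.choices(range(len(alpha)), weights=theta)[0]  # 3a: topic
        w = rng.choices(vocab, weights=beta[z])[0]            # 3b: word
        topics.append(z)
        words.append(w)
    return theta, topics, words


# Hypothetical two-topic setup: topic 0 is "food", topic 1 is "service".
vocab = ["pasta", "sauce", "meat", "wait", "staff", "friendly"]
beta = [
    [0.4, 0.3, 0.3, 0.0, 0.0, 0.0],  # word probabilities under topic 0
    [0.0, 0.0, 0.0, 0.4, 0.3, 0.3],  # word probabilities under topic 1
]
rng = random.Random(0)
theta, topics, words = generate_review(rng, [0.5, 0.5], beta, vocab, rate=8)
print([round(t, 2) for t in theta], words)
```

Fitting LDA runs this process in reverse: given only the observed words, it infers the topic mixtures and topic-word probabilities that most plausibly generated them.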

In step 1, N is the length of the review, drawn from a Poisson distribution. In step 2, α is the parameter of the Dirichlet prior on the per-review topic distributions. The probability that a given dimension (topic) occurs shows the relative importance of this dimension, and the topic mixture can be represented as a k-dimensional Dirichlet random variable θ. For a given review, its probability density is expressed by the following function:

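$$p(\theta \mid \alpha) = \frac{\Gamma\left(\sum_{i=1}^{k} \alpha_i\right)}{\prod_{i=1}^{k} \Gamma(\alpha_i)} \prod_{i=1}^{k} \theta_i^{\alpha_i - 1} \qquad (1)$$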

In this function, the parameter α is a k-vector with components $\alpha_i > 0$, and $\Gamma(x)$ is the Gamma function. In step 3, β is a k × V matrix of word probabilities, with one row per topic, where V is the vocabulary size. Given α and β, the joint distribution of a topic mixture θ, a set of N topics $\mathbf{z}$, and a set of N words $\mathbf{w}$ is given by the following function:

$$p(\theta, \mathbf{z}, \mathbf{w} \mid \alpha, \beta) = p(\theta \mid \alpha) \prod_{n=1}^{N} p(z_n \mid \theta)\, p(w_n \mid z_n, \beta) \qquad (2)$$

Here $p(z_n \mid \theta)$ is simply $\theta_i$ for the unique $i$ such that $z_n^i = 1$. Integrating over θ and summing over the topics then gives the marginal distribution of a document:

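$$p(\mathbf{w} \mid \alpha, \beta) = \int p(\theta \mid \alpha) \left( \prod_{n=1}^{N} \sum_{z_n} p(z_n \mid \theta)\, p(w_n \mid z_n, \beta) \right) d\theta \qquad (3)$$

Finally, the probability of a corpus is obtained by taking the product of the marginal probabilities of the single reviews:

$$p(D \mid \alpha, \beta) = \prod_{d=1}^{M} \int p(\theta_d \mid \alpha) \left( \prod_{n=1}^{N_d} \sum_{z_{dn}} p(z_{dn} \mid \theta_d)\, p(w_{dn} \mid z_{dn}, \beta) \right) d\theta_d \qquad (4)$$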

3.3. Key attributes identification and related sentiment and newness analysis

3.3.1. Key attributes identification

This part used the processed text data (online reviews) from Section 3.2.1 and counted word frequencies. This study assumed that word frequency in customer reviews represents the importance of restaurant attributes: the more often an attribute occurs in the reviews, the more important it is (Tirunillai, 2014). Food, place, order, service, and pizza were found to be the most mentioned attributes in the dataset (see Figure 4), then

3.3.2. Sentiment and newness analysis

After identifying the most important attributes, this paper focuses on the sentiment and newness associated with these attributes in each review.

Sentiment analysis is a process for finding the emotional tone behind a series of words, aiming to understand the attitudes, opinions, and emotions expressed in UGC (Pang & Lee, 2008). It is widely used in social media monitoring since it enables people to understand public opinion on certain topics. For example, the Obama administration applied sentiment analysis to capture public attitudes towards policy announcements and the election campaign before the 2012 presidential election (Sides & Vavreck, 2014). This paper applied SentiStrength to analyze the cleaned review dataset. SentiStrength estimates the strength of positive and negative sentiment in texts by using a predefined sentiment word list. For example, a raw text is:

“I love you but hate the current political climate”

After analysis, the output is:

“I love [3] you but hate [-4] the current political climate. [sentence: 3, -4]”

SentiStrength reports two strengths per text, each on a 1-5 scale. Here “3” means that the sentence has a positive strength of 3, and “-4” means that it has a negative strength of 4. The overall sentiment of this sentence is therefore 3 + (-4) = -1.
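The arithmetic of this example can be reproduced with a small lexicon lookup. The sketch below is not SentiStrength itself: the two-word lexicon is a stand-in for its word list, and taking the strongest positive and strongest negative word per sentence, with 1 and -1 as the neutral defaults, is an assumption about its behavior made for illustration.

```python
import re

# Hypothetical mini-lexicon; the real tool ships a much larger word list.
LEXICON = {"love": 3, "hate": -4}


def score_sentence(text, lexicon=LEXICON):
    """Return (positive, negative, overall) sentiment strength for one sentence."""
    tokens = re.findall(r"[a-z'-]+", text.lower())
    scores = [lexicon[t] for t in tokens if t in lexicon]
    # Assumption: the strongest word on each side sets the sentence strength,
    # and 1 / -1 stand for "no positive" / "no negative" sentiment.
    positive = max([s for s in scores if s > 0], default=1)
    negative = min([s for s in scores if s < 0], default=-1)
    return positive, negative, positive + negative


print(score_sentence("I love you but hate the current political climate"))
# → (3, -4, -1)
```

Under these assumptions a sentence without any sentiment word scores (1, -1, 0), so a neutral review contributes an overall sentiment of zero.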

Based on this principle, this study ran SentiStrength to analyze the sentiment of each attribute in each review and stored the results in an Excel sheet for further analysis.

The newness analysis consists of four steps. In step 1, starting from the two words “newness” and “innovation”, the author used The Synonym Finder (Rodale, 1978) to develop an exhaustive list of the words that customers use in their reviews to describe something as new. The list includes variants and synonyms of these two words. For example, the word newness has the variant new and the synonym novelty. In this process, the author generated 96 words from The Synonym Finder.

Step 2 was the validation of the newness words. Two experts from two different Chinese universities, each with at least ten years of academic and business experience in restaurant management, were invited to validate the newness word list. Each expert randomly selected 200 reviews from the dataset and read every review carefully, adding words that are used to express newness and deleting words that make no sense in practice. Of the 96 words generated by The Synonym Finder, 28 were deleted and 5 new words were added. In total, 73 words were selected.

In step 3, after the list of words that customers use to describe something as new had been formed, the two experts assigned scores to these words based on their degree of newness. The score assignment is similar to sentiment analysis and is based on a 1-5 scale: 1 means that a word conveys very little newness, and 5 means that it conveys a very high degree of newness. For example, the word “brand-new” was given 5 as it shows a large extent of newness. Table 1 presents the final newness word list. One of the academic contributions of this study is that it defines an original list of words customers use to describe their perception of newness and assigns scores to these words to show the degree of newness. To help further research, this study lists all the newness words and their scores in Appendix 1.

In the last step, the author ran the newness analysis by replacing the sentiment word list in SentiStrength with the newness word list, which yielded the newness score of the key attributes in each review.
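As a rough illustration of this last step, the sketch below scores a review against a scored word list. Only the score of 5 for “brand-new” comes from the text; the other two scores are placeholders, and taking the maximum word score per review (0 when no newness word occurs) is an assumption rather than the documented behavior of SentiStrength with a custom list.

```python
import re

# Only "brand-new": 5 is taken from the text; the other scores are placeholders.
NEWNESS_SCORES = {"brand-new": 5, "new": 3, "modern": 2}


def newness_score(review, scores=NEWNESS_SCORES):
    """Return the highest newness score found in the review, or 0 if none."""
    # Keep hyphenated words such as "brand-new" together as one token.
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", review.lower())
    return max((scores[t] for t in tokens if t in scores), default=0)


print(newness_score("The brand-new menu is great"), newness_score("Food was fine"))
# → 5 0
```

Multi-word entries from the list, such as “cutting edge”, would need phrase matching, which this single-token sketch does not attempt.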

3.4. Variables description

As mentioned before, the second aim of this study is to test the relationship between key attributes and customer satisfaction, and the effects that sentiment and newness have on this relationship. To this end, this study applied the linear mixed model in SPSS to explore the relationships between the variables.

Table 1. Newness word list

advanced, avant-garde, brand-new, breakthrough, contemporary, creation, creative, creativeness, creativity, cutting edge, developed, development, different, distinct, evolution, futuristic, imaginative, improved, improvement, innovation, innovative, innovatory, invention, inventive, inventiveness, leading edge, metamorphosis, mint condition, modern, modernism, modernistic, modernity, modernization, modernized, modification, mutation, neoteric, new, new-wrinkle, newest, newfangled, newfashioned, newness, novel, novelty, original, originality, origination, originative, progressive, radical, radical change, radically, rebuilt, recast, reconstructed, recreated, reformation, regenerated, remodeled, renascent, renewed, renovation, restyle, restyling, revolution, revolutionary, transformation, ultramodern, unhackneyed, unprecedented, up-to-date, way-out

3.4.1. Independent variables

The independent variables are the number of times each of the five key attributes is mentioned in each review. By counting word frequency across the 51,100 reviews, the five most important attributes were found to be food, place, order, service, and pizza. The occurrences of these five attributes were then counted in each review. For example, if a review says: “the food is super amazing, but the service is bad, I only recommend the food here.”, food is counted as 2, service as 1, and place, order, and pizza as 0.
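The per-review attribute counts can be sketched as follows. The exact-match rule is an assumption: the sketch counts only the literal attribute words, so an inflected form such as “ordered” is counted only if stemming has already mapped it to “order”.

```python
import re
from collections import Counter

ATTRIBUTES = ("food", "place", "order", "service", "pizza")


def count_attributes(review, attributes=ATTRIBUTES):
    """Count how often each key attribute word occurs in one review."""
    tokens = Counter(re.findall(r"[a-z]+", review.lower()))
    return {a: tokens[a] for a in attributes}


review = ("the food is super amazing, but the service is bad, "
          "I only recommend the food here.")
print(count_attributes(review))
# → {'food': 2, 'place': 0, 'order': 0, 'service': 1, 'pizza': 0}
```

Each review thus yields one row of five counts, which serve as the independent variables of the model.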

3.4.2. Dependent variable

The dependent variable is customer satisfaction, measured by the online rating of each review. In the Yelp dataset, the online rating is expressed on a 1-5 scale, where 1 is the lowest rating and 5 the highest. The higher the rating, the higher the satisfaction (Engler et al., 2015; Yue Guo et al., 2016).

3.4.3. Moderating variables

In this study, the moderating variables are the sentiment and newness of the five key attributes in each review. Sentiment can vary from -5 to 5. Newness can vary from 0 to 5, where a higher score indicates a higher degree of newness of these attributes.

4. Results

This section first presents the extracted latent dimensions of customer satisfaction and then examines these dimensions through expert validation. Subsequently, it presents the descriptive statistics and correlations among the variables, followed by the statistical results of the mixed model.

4.1. Dimensions of customer satisfaction

By extracting and labeling the dimensions of customer satisfaction from all the collected reviews, this study generated 30 topics, each with its top 20 words and their relative weights. The naming process was as follows: the author first checked each topic, aiming to find the logical relationship between the most frequent words within it. For instance, as shown in Table 2, the topic name “location of restaurant” was based on the word “center”, weighted 11.10%, the word “near”, weighted 4.90%, and the word “street”, weighted 3.14%. The author then checked whether this name also made sense for the rest of the words. If a word did not fit the topic name, the labeling step for this topic was restarted until the new name fitted all 20 words.

Table 2. Topic identification example

| Topic 1: Location of restaurant | Relative weight | % | Topic 2: Return visits | Relative weight | % |
|---|---|---|---|---|---|
| center | 17415.06905 | 11.10% | back | 7224.328888 | 11.02% |
| near | 7689.525599 | 4.90% | will | 6816.536457 | 10.40% |
| street | 4930.915713 | 3.14% | time | 4104.57924 | 6.26% |
| walked | 4199.470919 | 2.68% | definitely | 2725.661719 | 4.16% |
| downtown | 3996.062224 | 2.55% | again | 2561.162446 | 3.91% |
| closed | 3724.996414 | 2.37% | first | 2154.07242 | 3.29% |
| time | 3133.846717 | 2.00% | try | 1776.764219 | 2.71% |
| minutes | 3015.815286 | 1.92% | come | 1769.481252 | 2.70% |
| bus | 2822.231512 | 1.80% | next | 1326.044388 | 2.02% |
| building | 2489.885736 | 1.59% | visit | 1318.799477 | 2.01% |
| park | 2428.56044 | 1.55% | wait | 965.1179432 | 1.47% |
| next | 2336.336979 | 1.49% | coming | 845.5947993 | 1.29% |
| across | 2008.512956 | 1.28% | soon | 828.7691615 | 1.26% |
| away | 1830.419936 | 1.17% | went | 773.3002577 | 1.18% |
| stop | 1596.946975 | 1.02% | amazing | 761.8073586 | 1.16% |
| far | 1586.912572 | 1.01% | return | 721.6755455 | 1.10% |
| busy | 1369.52656 | 0.87% | delicious | 713.9246118 | 1.09% |
| long | 1254.499204 | 0.80% | sure | 711.6708932 | 1.09% |
| wait | 1213.57637 | 0.77% | going | 693.2839421 | 1.06% |
| side | 1182.09852 | 0.75% | excellent | 664.3228997 | 1.01% |

Figure 6 shows the 30 most important latent dimensions (topics) extracted from 51,100 reviews of 1,610 restaurants across 7 states in America. Two of the dimensions capture the overall perception of the restaurant experience: amazing experience and satisficing. Amazing experience means that the whole experience is excellent and pleasant, while satisficing is a cognitive state in which customers' expectations are just met. Four dimensions represent the degree of satisfaction or dissatisfaction: regular customers, recommendation, poor reviews, and return visits. The remaining 24 dimensions represent aspects of restaurant quality (e.g., long waiting time, location of restaurant, and music). Previous researchers have identified five common categories in restaurants: food quality, price, service quality, location, and environment (Ribeiro et al., 2002; Sun, Lou-Hon et al., 1995). This study builds on that classification and organizes the 24 latent customer satisfaction dimensions into these five categories (see Table 3). The five categories are thereby enriched, which provides a practical contribution to the restaurant industry. For restaurant operation, operators and managers who want to improve customer satisfaction can check and improve the listed dimensions in order to achieve better performance. For planning, designing, and building a new restaurant, owners and investors may want to take all of the above dimensions into account as guidance for achieving high customer satisfaction. For example, they can shorten waiting time and add a take-away ordering service to satisfy their customers.

Table 3. Classification of extracted dimensions

| Food (10) | Service (8) | Price (1) | Environment (4) | Location (1) |
|---|---|---|---|---|
| Dessert | Long waiting time | Poor value for money | Style and decoration | Location of restaurant |
| Poor flavor | Waitress service | | Dinner atmosphere | |
| Food ingredients | Menu and order | | Restaurant environment | |
| Pasta recipes | Suitable for party | | Dining view | |
| Meat | Attentive staff | | | |
| Pizza | Order take-away | | | |
| Taste authenticity | Business hours | | | |
| Wine | Music | | | |
| Main food | | | | |
| Breakfast | | | | |

To examine the validity of the above analysis, this study compared the extracted dimensions with those derived from human analysis. Two researchers from the same university, both with specialized experience in NLP and text mining, were invited to read the collected reviews manually and pick out the latent dimensions expressed in them. Each researcher randomly selected 200 reviews, so 400 reviews were chosen in total. A t-test was applied to examine the difference between these 400 reviews and the whole sample. The result shows no significant difference in terms of star rating (t = 0.41, p > 0.1).

The Jaccard coefficient was then used to examine the overlap between the dimensions extracted by LDA and those from human analysis. $N(Dim_{lda})$ represents the number of dimensions derived from LDA, and $N(Dim_{ex})$ represents the number of dimensions identified by the two researchers:

$$J(Dim_{lda}, Dim_{ex}) = \frac{N(Dim_{lda} \cap Dim_{ex})}{N(Dim_{lda} \cup Dim_{ex})} \qquad (5)$$

The Jaccard coefficient is 0.74 and 0.78 for the two researchers respectively. Considering the nature of the task and the level of ambiguity the researchers faced when identifying dimensions from reviews, these results indicate that the LDA method is feasible and reliable for extracting latent dimensions.
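The two coefficients can be recomputed from the ticks and crosses in Table 4. The sketch below encodes each rater's dimensions as a set; the set contents are transcribed from Table 4, and the function and variable names are illustrative.

```python
def jaccard(a, b):
    """Jaccard coefficient: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


# The 24 dimensions extracted by LDA (all ticked in the LDA column of Table 4).
lda = {
    "Location of the restaurant", "Long waiting time", "Style and decoration",
    "Waitress service", "Menu and order", "Poor value for money", "Dessert",
    "Poor flavor", "Food ingredients", "Dinner atmosphere", "Suitable for party",
    "Pasta recipes", "Attentive staff", "Meat", "Restaurant environment",
    "Pizza", "Dining view", "Order take-away", "Business hours",
    "Taste authenticity", "Wine", "Music", "Main food", "Breakfast",
}

# Each researcher: LDA dimensions minus those marked with a cross,
# plus the extra dimensions that only the human raters identified.
researcher_a = (lda - {"Poor value for money", "Pasta recipes",
                       "Business hours", "Taste authenticity"}) \
    | {"Freshness", "Cleanliness", "Noisy"}
researcher_b = (lda - {"Suitable for party", "Order take-away", "Music"}) \
    | {"Cleanliness", "Location in restaurant", "Noisy"}

print(round(jaccard(lda, researcher_a), 2), round(jaccard(lda, researcher_b), 2))
# → 0.74 0.78
```

For Researcher A the intersection has 20 dimensions and the union 27, giving 20/27; for Researcher B the values are 21 and 27, giving 21/27.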

Table 4. LDA analysis and rater validation

| Dimensions | LDA analysis | Researcher A | Researcher B |
|---|---|---|---|
| Location of the restaurant | √ | √ | √ |
| Long waiting time | √ | √ | √ |
| Style and decoration | √ | √ | √ |
| Waitress service | √ | √ | √ |
| Menu and order | √ | √ | √ |
| Poor value for money | √ | × | √ |
| Dessert | √ | √ | √ |
| Poor flavor | √ | √ | √ |
| Food ingredients | √ | √ | √ |
| Dinner atmosphere | √ | √ | √ |
| Suitable for party | √ | √ | × |
| Pasta recipes | √ | × | √ |
| Attentive staff | √ | √ | √ |
| Meat | √ | √ | √ |
| Restaurant environment | √ | √ | √ |
| Pizza | √ | √ | √ |
| Dining view | √ | √ | √ |
| Order take-away | √ | √ | × |
| Business hours | √ | × | √ |
| Taste authenticity | √ | × | √ |
| Wine | √ | √ | √ |
| Music | √ | √ | × |
| Main food | √ | √ | √ |
| Breakfast | √ | √ | √ |
| Freshness | × | √ | × |
| Cleanliness | × | √ | √ |
| Location in restaurant | × | × | √ |
| Noisy | × | √ | √ |

4.2. Descriptive statistics and correlation

Table 5 presents the descriptive statistics and the correlations between variables. Among the 51,110 collected reviews, the average star rating is 3.72 on a 1-5 scale. The mean sentiment score of these reviews is 1.07, while the mean newness score is 0.48.

As can be seen from Table 5, there is a strong positive relationship between sentiment and customer satisfaction (r = .599, p < .01). Table 5 also shows a positive relation between newness and satisfaction (r = .033, p < .01). In addition, all five attributes show a negative relationship with customer satisfaction (food: r = -.105, p < .01; place: r = -.041, p < .01; order: r = -.192, p < .01; service: r = -.068, p < .01; pizza: r = -.021, p < .01). Overall, the correlations between the independent variables do not exceed the r = .50 level. Therefore, multicollinearity is not a major concern here, and the independence of the constructs is supported.

Table 5. Means, Standard Deviation, Correlation

| Variables | Mean | SD | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|---|
| 1. Satisfaction | 3.72 | 1.34 | | | | | | | |
| 2. Sentiment | 1.07 | 1.38 | .599** | | | | | | |
| 3. Newness | .48 | .83 | .033** | -.003 | | | | | |
| 4. Food | .90 | 1.07 | -.105** | -.089** | .098** | | | | |
| 5. Place | .63 | .95 | -.041** | -.069** | .133** | .181** | | | |
| 6. Order | .52 | 1.04 | -.192** | -.158** | .144** | .181** | .149** | | |
| 7. Service | .48 | .68 | -.068** | -.040** | .053** | .248** | .055** | .100** | |
| 8. Pizza | .45 | 1.15 | -.021** | -.048** | .084** | -.063** | .104** | .150** | -.023** |

**. Correlation is significant at the 0.01 level (2-tailed). *. Correlation is significant at the 0.05 level (2-tailed).

4.3. Test of Hypothesis

A linear mixed model was applied to regress online rating on the variables described above. The linear mixed model considers both fixed and random effects. Fixed effects have levels that are of primary interest, while random effects have levels that are not of primary interest but are thought of as a random selection from a larger set of levels (Barr et al., 2013). Subject effects are normally treated as random effects. This study treated business id as the subject and each individual review under a business id as a repeated measure. Maximum likelihood estimation was conducted with a scaled identity covariance type. Table 6 shows a significant relationship between the five attributes and customer satisfaction, although the relationship between the service attribute and satisfaction is only marginally significant (p < .10). Food and order have a negative effect on customer satisfaction: each one-unit rise in food is associated with a 0.064-unit reduction in customer satisfaction, while a one-unit rise in order is associated with a reduction of 0.151. In contrast, place and pizza have a positive relationship with customer satisfaction. In sum, the results of

Table 6. Estimate of Fixed Effects

| Parameter | Estimate | SE | t | Sig. |
|---|---|---|---|---|
| Constant | 3.101 | .021 | 144.631 | .000 |
| Food | -.064 | .009 | -6.701 | .000 |
| Place | .027 | .011 | 2.562 | .010 |
| Order | -.151 | .009 | -15.938 | .000 |
| Service | -.026 | .014 | -1.827 | .068 |
| Pizza | .022 | .009 | 2.501 | .012 |
| Sentiment | .535 | .008 | 70.237 | .000 |
| Newness | .092 | .010 | 8.927 | .000 |
| Sentiment * Newness | -.017 | .004 | -4.036 | .000 |
| Food * Sentiment | .020 | .003 | 6.094 | .000 |
| Place * Sentiment | -.001 | .004 | -.222 | .825 |
| Order * Sentiment | .009 | .003 | 2.746 | .006 |
| Service * Sentiment | .006 | .005 | 1.171 | .242 |
| Pizza * Sentiment | -.015 | .003 | -4.616 | .000 |
| Food * Newness | .006 | .005 | 1.241 | .215 |
| Place * Newness | -.007 | .005 | -1.438 | .150 |
| Order * Newness | .013 | .004 | 2.805 | .005 |
| Service * Newness | -.017 | .007 | -2.376 | .018 |
| Pizza * Newness | .000 | .004 | -.111 | .912 |
| -2*Log-Likelihood | 148125.73 | | | |

Sentiment has a strong positive relationship with customer satisfaction (p < .001). A one-unit increase in sentiment is associated with a 0.53 increase in customer satisfaction. Newness also shows a positive relationship with customer satisfaction (p < .001). Furthermore, sentiment moderates the effects of food (p < .001), order (p < .05), and pizza (p < .001). However, the study does not find significant coefficients for the moderating effect of sentiment on the effects of place and service. In addition, newness moderates the effects of order (p < .05) and service (p < .05), but its moderating effect on food, place, and pizza is not significant. Thus, the results only partly confirm Hypotheses 2 and 3.

Figures 7, 8, and 9 visualize the effects of food, order, and pizza respectively on customer satisfaction with sentiment as moderator. Figures 7 and 8 show a positive moderating effect of sentiment, but Figure 9 shows a negative moderating effect of sentiment. The results only partly support Hypothesis 2.
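One way to read the interaction terms in Table 6 is as simple slopes: the effect of one additional mention of an attribute at a given sentiment level equals the main effect plus the interaction coefficient times the sentiment score. The sketch below applies this to food and pizza using the estimates from Table 6. It holds newness at zero, which is a simplifying assumption, and the function name is illustrative.

```python
def simple_slope(main_effect, interaction, moderator_value):
    """Effect of one extra attribute mention at a given moderator level."""
    return main_effect + interaction * moderator_value


# Fixed-effect estimates copied from Table 6.
FOOD, FOOD_X_SENTIMENT = -0.064, 0.020
PIZZA, PIZZA_X_SENTIMENT = 0.022, -0.015

for sentiment in (-2, 0, 2):
    food_slope = simple_slope(FOOD, FOOD_X_SENTIMENT, sentiment)
    pizza_slope = simple_slope(PIZZA, PIZZA_X_SENTIMENT, sentiment)
    print(sentiment, round(food_slope, 3), round(pizza_slope, 3))
```

At a sentiment of 2 the food slope shrinks from -0.064 to -0.024, while the pizza slope moves from 0.022 to -0.008, which matches the positive and negative moderating effects reported for the two attributes.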

Figure 7. Customer satisfaction affected by food and sentiment

Figures 10 and 11 visualize the effects of order and service respectively on customer satisfaction with newness as moderator. Figure 10 shows a positive moderating effect of newness, while Figure 11 shows a negative moderating effect of newness. Overall, the results only partly support Hypothesis 3.

Table 7 presents the random effects. The random restaurant-to-restaurant variance is 0.14, while the review-to-review variance is 1.03, so differences between individual reviews have a larger effect than differences between restaurants.

Table 7. Estimates of Covariance Parameters

| Parameter | | Estimate | Std. Error |
|---|---|---|---|
| Repeated Measures | Variance | 1.032468 | 0.006543 |
| Intercept [subject = business_id] | Variance | 0.139857 | 0.009882 |

a. Dependent Variable: Satisfaction
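The two variance estimates in Table 7 can be combined into an intraclass correlation, the share of the unexplained rating variance that lies between restaurants. This quantity is not reported in the study; the sketch below only shows the arithmetic under the usual random-intercept definition.

```python
def intraclass_correlation(between_var, within_var):
    """Share of the total variance that is due to between-group differences."""
    return between_var / (between_var + within_var)


# Variance estimates copied from Table 7.
icc = intraclass_correlation(between_var=0.139857, within_var=1.032468)
print(round(icc, 3))
# → 0.119
```

Roughly 12% of the residual variation in ratings is thus attributable to the restaurant, and the remainder to differences between individual reviews.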

5. Discussion and Implication

This paper proposed a new framework to extract latent customer satisfaction dimensions from online reviews and analyzed the effects of product attributes on customer satisfaction. The author used a rich online dataset for dimension mining and showed that LDA is a feasible and reliable method for topic extraction. The study then defined an original list of words customers use to express their perception of newness and assigned scores to show the degree of newness. A conceptual model was proposed to examine the relationship between key attributes and customer satisfaction and to explore the effects of sentiment and newness on this relationship.

For dimension extraction, the study analyzed customer reviews via LDA and revealed the latent dimensions that determine customer satisfaction. Compared to traditional methods, this approach has several advantages. First, it is more efficient and saves both time and money. Traditional approaches normally have to balance the cost of sample collection against estimation performance and need a long time to carry out, whereas this study acquired a large sample directly online. Second, this study has a larger sample size, which helps to obtain more reliable results. In principle, the more data that is analyzed, the more accurate the results will be (Levey et al., 1999). The sample contains 51,110 reviews across 7 states of America and spans 2010 to 2016, while many traditional approaches fail to collect such a large sample and can hardly achieve such a long time span.

For the statistical analysis, this study found that the key restaurant attributes have an effect on customer satisfaction. Place and pizza have a positive impact on customer satisfaction, and food and order have a negative one. These attributes are central to customers' overall experience and influence whether their pre-purchase expectations are met. Sentiment and newness also play a role in customer satisfaction. A more positive sentiment about food and order can lead to higher customer satisfaction. Besides, compared with not discussing food, discussing food in combination with newness can lead to higher customer satisfaction. In addition, compared with not discussing service, discussing service in combination with newness leads to relatively higher customer satisfaction. Overall, sentiment strongly influences customer satisfaction: the more positive the sentiment, the higher the customer satisfaction. Likewise, the higher the degree of newness customers perceive, the higher their satisfaction.

This study makes meaningful contributions to managerial practice. First, it identified the determinants of customer satisfaction and serves as a guide for restaurant improvement. Building on the five key aspect groups in the restaurant industry (food quality, service, price, environment, and location), this study classified the extracted dimensions into these five groups, which enriches their content. For restaurant operation, managers and operators who want to improve customer satisfaction can use these dimensions to check their current performance and make targeted improvements. For restaurant planning, design, and construction, investors can treat these dimensions as a guideline in order to consider all the important factors comprehensively. In addition, restaurant innovation is important for customer satisfaction, especially for food and service. When it comes to innovation, restaurant management can focus more on these two attributes.

This study contributes to the current literature in several ways. First, it proposed a framework for extracting dimensions and for finding the relationship between product attributes and customer satisfaction. Other researchers can apply this framework to extract dimensions in other hospitality fields (e.g., hotels, tourist spots). Second, this study proposed an original list of words that customers use to express their perception of newness and assigned weights to these words based on their degree of newness. Third, this study has

6. Limitation and Further Research

Inevitably, this study has several limitations. First, it removed rare or infrequent words when pre-processing the data. These words could reflect emerging customer preferences that could be very helpful in developing new restaurant marketing space. Further research can include those infrequent words to explore such emerging preferences.

Second, demographic information about reviewers (e.g., age, gender) was not included in this study. Such information may influence the star rating. Further research can incorporate the reviewers' demographic details when analyzing the data and examine what effects they have. Third, this study considered random effects and applied the linear mixed model to estimate them. Although the variance between restaurants can be found, the reason behind this difference is still not clear. Further research can explore these reasons, such as the location or style of the restaurant.

Fourth, this study identified only five key attributes and analyzed their relationship with customer satisfaction, leaving out other important attributes. This causes some bias, because customer satisfaction (the online rating) is an overall perception, and the sentiment and newness associated with other attributes (e.g., environment, price) also affect this perception. Further research can identify more attributes and use the same method to analyze the effects of sentiment and newness on the relationship between these attributes and customer satisfaction.

7. References

Adler, P. S., Benner, M., Brunner, D. J., MacDuffie, J. P., Osono, E., Staats, B. R., ... & Winter, S. G. (2009). Perspectives on the productivity dilemma. Journal of Operations Management, 27(2), 99-113.

Anderson, Eugene W., Claes Fornell, and Donald R. Lehmann. "Customer satisfaction, market share, and profitability: Findings from Sweden." The Journal of Marketing (1994): 53-66.

Baka, Vasiliki. "The becoming of user-generated reviews: Looking at the past to understand the future of managing reputation in the travel sector." Tourism Management 53 (2016): 148-162.

Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of memory and language, 68(3), 255-278.

Barsky, J. D. (1992). Customer satisfaction in the hotel industry: meaning and measurement. Hospitality Research Journal, 16(1), 51-73.

Berezina, Katerina, et al. "Understanding satisfied and dissatisfied hotel customers: text mining of online hotel reviews." Journal of Hospitality Marketing & Management 25.1 (2016): 1-24.

Berger, Paul D., and Nada I. Nasr. "Customer lifetime value: Marketing models and applications." Journal of interactive marketing 12.1 (1998): 17-30.

Berthon, Pierre R., et al. "Marketing meets Web 2.0, social media, and creative consumers: Implications for international marketing strategy." Business horizons 55.3 (2012): 261-271.

Blei, David M., Andrew Y. Ng, and Michael I. Jordan. "Latent dirichlet allocation." Journal of machine Learning research 3.Jan (2003): 993-1022.

Bowen, John T., and Shiang-Lih Chen. "The relationship between customer loyalty and customer satisfaction." International journal of contemporary hospitality management 13.5 (2001): 213-217.

Brentani, U. (2001). Innovative versus incremental new business services: different keys for achieving success. Journal of Product Innovation Management, 18(3), 169-187.

Byrne, Patrick J., J. Richard Bacon, and Ulrich C. Toensmeyer. "Pesticide residue concerns and shopping location likelihood." Agribusiness 10.6 (1994): 491-501.

Chan, Albert PC, and Ada PL Chan. "Key performance indicators for measuring construction success." Benchmarking: an international journal 11.2 (2004): 203-221.

Christensen, Clayton. The innovator's dilemma: when new technologies cause great firms to fail. Harvard Business Review Press, 2013.

Cronin Jr, J. Joseph, and Steven A. Taylor. "Measuring service quality: a reexamination and extension." The journal of marketing (1992): 55-68.

Cui, G., Lui, H. K., & Guo, X. (2012). The effect of online consumer reviews on new product sales. International Journal of Electronic Commerce, 17(1), 39-58.

Damanpour, Fariborz. "Organizational innovation: A meta-analysis of effects of determinants and moderators." Academy of management journal 34.3 (1991): 555-590.

Danaher, Peter J., and Vanessa Haddrell. "A comparison of question scales used for measuring customer satisfaction." International Journal of Service Industry Management 7.4 (1996): 4-26.

De Jong, J. P., & Vermeulen, P. A. (2003). Organizing successful new service development: a literature review. Management decision, 41(9), 844-858.

Dixon, M., Freeman, K., & Toman, N. (2010). Stop trying to delight your customers. Harvard Business Review, 88(7/8), 116-122.

Drazin, R., & Schoonhoven, C. B. (1996). Community, population, and organization effects on innovation: A multilevel perspective. Academy of management journal, 39(5), 1065-1083.

Engler, Tobias H., Patrick Winter, and Michael Schulz. "Understanding online product ratings: A customer satisfaction model." Journal of Retailing and Consumer Services 27 (2015): 113-120.

Evans, Dave. Social media marketing: the next generation of business engagement. John Wiley & Sons, 2010.

Feldman, Ronen, Jacob Goldenberg, and Oded Netzer. "Mine your own business: Market structure surveillance through text mining." Marketing Science 31 (2010): 521-43.

Floyd, Kristopher, et al. "How online product reviews affect retail sales: A meta-analysis." Journal of Retailing 90.2 (2014): 217-232.

Fornell, C. (1992). A national customer satisfaction barometer: The Swedish experience. The Journal of Marketing, 6-21.

Gao, Lu, Yao Yu, and Wuling Liang. "Public Transit Customer Satisfaction Dimensions Discovery from Online Reviews." Urban Rail Transit (2016): 1-7.

Gatignon, Hubert, and Thomas S. Robertson. "Innovative decision processes." Handbook of consumer behavior 316.8 (1991).

George, Carlisle E., and Jackie Scerri. "Web 2.0 and User-Generated Content: legal challenges in the new frontier." (2007).

Gielens, Katrijn, and Jan-Benedict EM Steenkamp. "Drivers of consumer acceptance of new packaged goods: An investigation across products and countries." International Journal of Research in Marketing 24.2 (2007): 97-111.

Gregan‐Paxton, Jennifer, et al. "“So that's what that is”: Examining the impact of analogy on consumers' knowledge development for really new products." Psychology & Marketing 19.6 (2002): 533-550.

Griffin, Abbie, and John R. Hauser. "The voice of the customer." Marketing science 12.1 (1993): 1-27.

Guo, Yue, Stuart J. Barnes, and Qiong Jia. "Mining meaning from online ratings and reviews: Tourist satisfaction analysis using latent dirichlet allocation." Tourism Management 59 (2017): 467-483.

Gupta, Prakhar, Sandeep Kumar, and Kokil Jaidka. "Summarizing Customer Reviews through Aspects and Contexts." International Conference on Intelligent Text Processing and Computational Linguistics. Springer International Publishing, 2015.

Han, Hyun Jeong, et al. "What guests really think of your hotel: Text analytics of online customer reviews." (2016).

Herring, S. C. (2011). Discourse in Web 2.0: Familiar, reconfigured, and emergent. Georgetown University round table on languages and linguistics, 1-25.

Ho-Dac, Nga N., Stephen J. Carson, and William L. Moore. "The effects of positive and negative online customer reviews: do brand strength and category maturity matter?." Journal of Marketing 77.6 (2013): 37-53.

Hoeffler, Steve. "Measuring preferences for really new products." Journal of marketing research 40.4 (2003): 406-420.

Hu, Minqing, and Bing Liu. "Mining and summarizing customer reviews." Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2004.

Huang, James, Stephanie Rogers, and Eunkwang Joo. "Improving restaurants by extracting subtopics from yelp reviews." iConference 2014 (Social Media Expo) (2014).
