Predicting the spatio-temporal popularity of brands using multimodal social media posts

Academic year: 2021

Gijs Overgoor

Student number: 10213619

Date of final version: August 14, 2016

Master's programme: Econometrics

Specialization: Big Data in Business Analytics

Supervisor: Prof. Dr. M. Worring and dr. M. Mazloom

Second reader: Dr. N.P.A. van Giersbergen

Statement of Originality

This document is written by Student Gijs Overgoor who declares to take full responsibility for the contents of this document. I declare that the text and the work presented in this document are original and that no sources other than those mentioned in the text and its references have been used in creating it. The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.

Contents

1 Introduction

2 Related Works

2.1 Visual Features for Post Popularity Prediction

2.2 Textual Features for Post Popularity Prediction

2.3 Temporal Features for Popularity Prediction

2.4 Temporal patterns in post and brand popularity

3 Method: Spatio-Temporal Brand Representation

3.1 Feature Extraction

3.1.1 Textual Features

3.1.2 Visual Features

3.2 Spatio-Temporal Brand Representation

3.2.1 Importance Parameters

3.3 Prediction

4 Experimental Setting

4.1 Data

4.2 Experiments

5 Results

5.1 Experiment 1: Investigation of temporal patterns in brand popularity

5.2 Experiment 2: Post Popularity Prediction Using Multi Modal Features

5.3 Experiment 3: Brand-Level Popularity Prediction

5.4 Experiment 4: Brand Ranking Prediction

6 Conclusion

1 Introduction

People possess all kinds of tools for capturing events and experiences. Increasingly, people are seen with cameras and smartphones, filming or photographing the things they experience. The individuals of today experience the world through the lenses of their cameras, and with the internet being an integral part of people's lives, thoughts and opinions are expressed through all the available social media platforms. Facebook, Flickr, Instagram, Twitter and YouTube, some of the most popular platforms of today, all have millions of users sharing different types of content with others. YouTube hosts video content, whereas Twitter posts consist of text and Facebook users upload text as well as photos or videos. A quick glance at some statistics: YouTube¹, the video sharing platform, has (according to its website) over a billion users, and every day people watch hundreds of millions of hours of video and generate billions of views. Facebook² (as of December 31st 2015) and Twitter³ have 1.59 billion and 320 million monthly active users respectively. Whether it is through text, photo or video, people are sharing the events they captured with others across multiple platforms. All of the content that is generated by these users and shared on these platforms holds a lot of valuable information. Automated systems have been developed to extract data from the content that is available across the internet. These automated systems have given researchers and businesses the ability to gain insight, on a large scale, into what is posted by certain users.

Social media is reshaping businesses and marketing. Businesses are able to reach thousands of (potential) clients at the same time through social media. Besides the ability to reach so many people at once, it has also opened a line of communication between businesses and consumers. In [20, 21] the authors show the importance of analyzing social media from a business perspective. In [20], Risius and Beck examine overall patterns of social media use across different account types and communication approaches. They demonstrate the value of social media analytics for building brand loyalty by following the analytical approach proposed in [21]. In [25] the authors show how for-profit businesses use social media as a source of information as well as an execution platform for product design and innovation, relationship management and marketing. Social media has become of great importance in business intelligence. According to the authors of [13], social media should be included in today's marketing mix. It is a new channel and a new context in which marketing is done. People express their opinions about advertisements through social media and they engage with a certain brand or product through it. The authors consider social media to be a new hybrid element of the promotion mix. It combines some of the characteristics of traditional integrated marketing contributions with a form of word-of-mouth communication. Word-of-mouth communication in social media comes in such great frequency and volume that it is uncontrollable for marketing managers. Analyzing and effectively using social media holds a mass of opportunities for enhancing customer engagement and brand development.

¹ https://www.youtube.com/yt/press

² "Facebook Climbs To 1.59 Billion Users And Crushes Q4 Estimates With 5.8B dollars Revenue". January 27, 2016. Retrieved January 27, 2016.

Thoughts and opinions expressed through posts are seen and possibly appreciated by other users. One post will become very popular, while another will barely be noticed [17]. Popularity prediction is becoming more and more important. Accurately predicting post popularity helps explain why people are interested in or engaged with a brand or product. There are many different features of posts that provide these explanations when extracted properly. Various machine learning techniques and methods, which will be discussed more thoroughly later on in this thesis, have been developed for accessing and processing this vast availability of data. Posts with text, image or both contain explicit or implicit features that make them more (or less) popular than others. This thesis proposes a novel approach for incorporating the temporal dimension of post popularity into predicting popularity on a brand level. The Instagram accounts of brands and their posts will be examined for the prediction.

Instagram is used for self-expression through images with a description in the form of captions and hashtags. It is used for sharing images by 300 million monthly active users, and these users have shared over 30 billion photos to date⁴, averaging 70 million photos per day. When looking at Instagram for business, according to a study conducted by Forrester⁵, top brands on Instagram are seeing a per-follower engagement rate of 4.21%, which is 58 times higher than on Facebook and 120 times higher than on Twitter. These aspects make Instagram of particular interest in the research of brand popularity. Analyzing the extracted information and being able to predict the popularity of a post from its content gives insight into people's interest in and engagement with a certain brand or product.

Different social media platforms hold different types of posts. Previous works have analyzed the different platforms that focus primarily on either text or image. In [8] Hong et al. show the importance of predicting the popularity of messages on Twitter (i.e. tweets). They shed light on what factors influence the volume of retweets on Twitter and focus on the content of messages, temporal information, metadata and user context. McParlane et al. [18] use the temporal and geographical information from contextual photo tags for making recommendations to users on Flickr.

The analysis of images on social media platforms is less straightforward than the analysis of text. In recent years the analysis of images has been improved by the introduction of various machine learning techniques. Previous works have been able to extract various features from visual content, such as the presence of concepts, color, gradients, visual sentiment and some deep learning features. Capallo et al. [2] have explored various factors of image popularity on social media: popular and unpopular visual senses and low and high engagement measurements. Focusing just on the visual content of their data, they try to predict the popularity of an image. Khosla et al. [10] look at high- and low-level vision features that are likely to be used by humans for visual processing. In their work they focus on five such features, namely gist, texture, color, gradient and deep learning features. They combine these low-level features with a distributional representation of concept presence, with which they are able to accurately predict the popularity of posts on the photo sharing website Flickr.

⁴ https://www.instagram.com/about/us

⁵ http://blogs.forrester.com/nate elliott/14 04 29 Instagram is the king of social

These works focus solely on either text or image. One would assume that combining the best of those methods would lead to even better analysis of the data at hand. Gelli et al. [6] make use of sentiment and contextual features. Specific visual sentiments can positively or negatively influence the eventual popularity of images. The authors extract visual sentiment features, use textual features to make a semantic description of the image, and combine this with the social context of a user. The authors of [15] show how visual and textual features are complementary for predicting the popularity of a post and then go one step further in combining textual and visual data by adding a new layer of engagement parameters. They claim that brand-related posts may be popular due to several cues related to user engagement, focusing on brand popularity through social media rather than post popularity itself. The results the authors obtain are promising, but it is interesting to see whether they hold on a larger data set. This thesis continues predicting the popularity of a brand from a similar perspective, increasing the data set to 190,000 Instagram posts. Where previous works have done rather well in predicting the popularity of posts based on visual and textual information, making use of contextual or user-related information, none of these have incorporated the temporal information available, especially on a brand level.

The temporal patterns in a certain time frame of the popularity of Instagram posts will be an addition to current popularity prediction. The contributions of this thesis will be the following:

• Revealing temporal patterns in the popularity of brands on Instagram.

• Incorporating the temporal dimension for obtaining an accurate brand representation.

• Accurate prediction of the popularity of brands across industries based on the ranking.

The main focus of this research is the prediction of the popularity of brands, which raises the main research question:

• Is it possible to predict the spatio-temporal popularity of brands by examining posts on Instagram?

This main question is split up into three more specific research questions:

• How accurate is the post popularity prediction using only textual features, only image features or a combination?

• Are there noticeable temporal patterns in the popularity of posts and do they aid in the prediction of brand popularity?

• How accurate is the brand ranking prediction when aggregating features extracted from Instagram posts based on temporal importance?

The rest of this thesis is organized as follows. In Section 2 the related works in popularity prediction are examined. The feature extraction as well as the novel approach are proposed in Section 3. Section 4 discusses the data set and the experimental setup. The results of the experiments are detailed in Section 5 before concluding in Section 6.

2 Related Works

In the past few years extensive research has been done in analyzing the content generated on social media. The automated systems for extracting the information contained in this content are becoming more and more advanced. The previous works discussed in this chapter focus on extracting features from the content of different social media platforms. Different platforms generate different types of posts, which leads to different features being of importance. The post popularity metric that is used depends on the platform: number of views for Flickr, number of likes for Instagram or number of retweets for Twitter. The different features covered in this chapter all contribute in some way to the popularity. The researchers have exploited state-of-the-art methods for accurately predicting the popularity of a post by making use of these extracted features. In [6, 10, 15] the post popularity prediction is treated as a ranking problem. The number of likes follows a power law distribution: the majority of the posts receive few or no likes whereas a minority receive a very high volume of likes. The authors make use of support vector regression to learn the importance of these features for prediction.

2.1 Visual Features for Post Popularity Prediction

Flickr and Instagram are the social media platforms that primarily generate visual content. The authors of [6, 10, 15, 17] focus on extracting the visual features from posts of these social media platforms and using the representation of the images for predicting the popularity of an image posted. An important visual representation learning method as proposed by the authors of [12] is to extract the deep learning features of an image. These deep learning features combined with low-level vision, high-level vision and visual sentiment features make an accurate representation of an image and aid in the prediction of the post popularity.

features, low-level computer vision features (i.e. color, gist, gradient, texture and deep learning) and high-level features such as objects in images. For each of the features the authors measure the contribution to popularity. The color and simple image features are mostly used for information on their contribution to popularity. The low-level computer vision features are based on the way humans process visual information. Obvious visual cues like color patches, gradient, gist and texture are noticed and processed by the human brain. The last low-level computer vision features are the deep learning features, for which they use the "ImageNet network" [11] trained on the ImageNet [3] database, a convolutional neural network that is also used for recognizing object presence. This representation is followed by Mazloom et al. [15], who construct a concept vector that contains a probabilistic estimation of the object presence. Khosla et al. [10] combine, by late fusion, the simple, low-level and high-level features with the social cues of the person who is posting an image to predict the popularity. The social cues consist of the social context of the user and user-specific features that remain constant over the posts made by the user.

Capallo et al. [2] investigate popular and unpopular latent senses. Their method selects pairs of images with varying popularity and identifies the latent weights which respond most strongly to each image. It then updates the weights accordingly, punishing the latent sense that responds strongly to the less popular image while encouraging the latent sense that responds most strongly to the more popular image. The latent senses are split into two sets. The weights for the first set are updated according to the method just described. The second set does the opposite: it encourages the latent senses that respond strongly to unpopular images and punishes the senses that respond strongly to popular ones.

McParlane et al. [17] classify based on the visual appearance, the content of an image and the context in which an image is posted. The image context features are a binary or categorical feature representation based on the metadata, time, device used, size and orientation. The time is further discussed in the temporal features section of this chapter. For the content representation the authors classify using four methods, namely scene 1 (i.e. city, party, home, food or sports), scene 2 (i.e. indoor, outdoor, macro or portrait), number of faces and color. These classifications are categorical representations of the scene captured in the image, a count of the number of faces that are present in the image and a coarse representation of the colors within an image. For the two scene classifications a multi-class SVM is used, for which they explore the best performance when using different kernel functions and parameters.

The authors of [6, 15] make use of the Visual Sentiment Ontology (VSO), which was first introduced in [1], to detect sentiment in an image. The ontology consists of a collection of 1,200 Adjective-Noun Pairs (ANPs). They represent each image with a 1,200-dimensional vector in which each dimension gives the probability of the ANP being present in the image. The authors of [1] further investigated the use of ANPs in detecting visual sentiment and they demonstrate novel applications made possible by SentiBank. Their method explores large image data sets along the high-dimensional sentiment concept space using tools such as the emotion wheel and treemap, making a prediction of the sentiment present in an image. A shortcoming of the VSO used by the previous works is that it is solely based on English and Western culture and is therefore limited. Unlike the flat structure of this VSO, [9] organizes an ontology hierarchically by multilingual clusters of visually detectable nouns and subclusters of emotionally biased versions of these nouns. They use the context in which emotions are felt, influenced by culture and language. This multilingual VSO (MVSO) discovery method can easily be extended to different languages, achieving a greater coverage and diversity than the regular VSO used before.

This thesis will investigate Visual Sentiment and ConceptVec-15k for post- and brand-level popularity prediction. The features extracted from posts will be used to construct a similar representation for brands.

2.2 Textual Features for Post Popularity Prediction

Posts on Instagram are images that come with a textual description in the form of a caption and hashtags. This description contains additional information complementary to the image features. It might lead to an image being found more easily and it covers the context in which an image was taken. McParlane et al. [17] extract textual features from the tag description of posts on Flickr. They represent each tag by its tag co-occurrence vector and they represent the number of images that contain a certain tag. The representations come from Term Frequency-Inverse Document Frequency (TF-IDF), an approach initially implemented by the authors of [5]. The TF-IDF representation is computed for a given number of random tags of a post.

Mazloom et al. [15] utilize three different representations based on the hashtag description of Instagram posts: term-based, Word2Vec (W2V) and textual sentiment. The term-based representation is similar to the TF-IDF representation, but represented by a sparse binary vector. W2V, proposed in [19], is a word embedding method in which a deep neural network is trained over a billion Google News documents. From this network a 300-dimensional vector is computed which maps each hashtag or word in the caption of a post onto its W2V representation. Just as there are sentiment features for the visual aspects of posts, there are textual sentiment features. These textual sentiment features are constructed by making use of SentiBank [1] and SentiStrength [22]. First the stop words are removed and stemming is performed. For each emotion, tag frequency analysis for the top 100 tags is used. Then the sentiment value of each tag is computed using SentiStrength to generate the positive (ranging from 1 to 5) and negative (ranging from -1 to -5) sentiment scores.

feature extraction from tweets. It uses W2V and TF-IDF and combines these with contextual features of the user as well as temporal dynamics of retweet chains. It suggests the importance of including a temporal dimension in popularity prediction.

In this thesis, Textual Sentiment and Word2Vec are examined for post- and brand-level prediction. The features are extracted from the captions and hashtags that describe the posts. Different from the works above, the post and brand popularity will be predicted by combining the textual and visual features.

2.3 Temporal Features for Popularity Prediction

The popularity a post receives varies widely not only in volume but also in temporal pattern. A regular post will likely remain barely noticed and unpopular throughout its lifetime, while a popular post receives thousands of likes and stays popular for a certain time frame. The time, day of the week or season at which a post is taken and posted may affect the popularity. The temporal representation can be constructed on the basis of the time stamps of posts. McParlane et al. [17] use time, day and season in their representation. The time at which a post is posted can be classified into time of day (i.e. morning, afternoon, evening, night), whether it is a weekday or a weekend day, and the season in which the post is posted.
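As an illustration of how such a categorical temporal representation could be derived from a time stamp, the following is a minimal sketch. The hour boundaries for the time of day, the meteorological (northern hemisphere) season mapping and the function names are assumptions made for this example; they are not taken from [17].

```python
from datetime import datetime

# Assumed month-to-season mapping (northern hemisphere, meteorological seasons).
SEASONS = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def time_of_day(hour):
    """Map an hour (0-23) to a coarse category; the boundaries are assumptions."""
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 24:
        return "evening"
    return "night"


def temporal_features(posted_at):
    """Derive categorical temporal features from the time stamp of a post."""
    iso = posted_at.isocalendar()
    return {
        "time_of_day": time_of_day(posted_at.hour),
        "is_weekend": posted_at.weekday() >= 5,  # Saturday = 5, Sunday = 6
        "season": SEASONS[posted_at.month],
        "iso_year_week": (iso[0], iso[1]),       # week in which the post was made
    }


if __name__ == "__main__":
    print(temporal_features(datetime(2016, 8, 14, 9, 30)))
```

The ISO year and week are included because a weekly index is what a brand-level time series needs; the other three fields mirror the categories listed above.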

This thesis will use the week in which a post is posted to construct a brand popularity time series. The temporal features will be used as importance parameters to construct a brand-level feature representation.

2.4 Temporal patterns in post and brand popularity

The examined works view likes as cross-sectional data: they only look at the volume of the popularity (i.e. the total number of likes, views or retweets). They have not yet incorporated temporal patterns in a popularity prediction setting. However, research on temporal patterns has been done in sales. The authors of [7] look at the takeoff, a sudden and large increase, in sales across time. Their research looks at the time needed for a new product to take off, the pattern in time this takeoff follows and a prediction of the takeoff point. By defining a "threshold for takeoff" and measuring in what year the level of sales crosses this threshold, they model the takeoff point of a new product using a hazard function. Applying this approach to post popularity holds two difficulties. First, a threshold value for a certain brand needs to be found, which might be based on some kind of historical average. Second, the distribution of likes needs to be determined to be able to find the hour in which the post goes viral (i.e. passes the threshold).

The authors of [24] propose the K-Spectral Centroid (KSC) algorithm for clustering time series, which outperforms the common K-means in finding patterns in social media content. The KSC clustering is robust against scaling and shifting: time series that have a similar shape will be considered similar, even if they differ in volume or spike at different times. The authors test their algorithm on the hourly popularity of phrases in MemeTracker and tags on Twitter. The clustering of time series using the KSC algorithm aids in the analysis of the temporal dynamics in social media. Their study reveals six main temporal patterns of attention in online content.
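To make the scale and shift invariance concrete, the sketch below computes a distance between two equal-length time series that minimizes the relative residual over a scaling factor and an integer time shift. For a fixed shift the best scaling factor is the least-squares solution, so only the shift is searched. This is a simplified illustration of the kind of distance described above rather than the implementation of [24]; the zero-padding of shifted series, the `max_shift` parameter and the function names are assumptions.

```python
import math


def shift(series, q):
    """Shift a series by q steps (positive = later), padding with zeros.

    Assumes abs(q) < len(series).
    """
    n = len(series)
    if q >= 0:
        return [0.0] * q + list(series[: n - q])
    return list(series[-q:]) + [0.0] * (-q)


def scale_shift_distance(x, y, max_shift=None):
    """Minimum over shifts q and scalings a of ||x - a * y_q|| / ||x||."""
    if len(x) != len(y):
        raise ValueError("series must have equal length")
    norm_x = math.sqrt(sum(v * v for v in x))
    if norm_x == 0:
        raise ValueError("x must not be all zeros")
    if max_shift is None:
        max_shift = len(x) - 1
    best = 1.0  # a = 0 always yields a relative residual of exactly 1
    for q in range(-max_shift, max_shift + 1):
        y_q = shift(y, q)
        energy = sum(v * v for v in y_q)
        if energy == 0:
            continue
        # Closed-form least-squares scaling for this shift.
        a = sum(u * v for u, v in zip(x, y_q)) / energy
        residual = math.sqrt(sum((u - a * v) ** 2 for u, v in zip(x, y_q)))
        best = min(best, residual / norm_x)
    return best


if __name__ == "__main__":
    early_small = [0, 1, 4, 1, 0, 0]
    late_large = [0, 0, 0, 2, 8, 2]  # same shape, doubled and two steps later
    print(scale_shift_distance(early_small, late_large))
```

In the usage example the two series have the same peak shape, so the distance is zero even though they differ in volume and timing, which is the behaviour that makes this kind of measure suitable for comparing popularity curves.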

The authors of [23] propose Multi-Scale Temporal Decomposition, in which they use time-sensitive context depending on time-sensitive factors for learning the temporal patterns in the popularity of posts from a Flickr data set. Certain features of images and posts are time-sensitive while others are not. In [23] two vectors are constructed. The first is based on user activity in a certain time period (user-activeness variability). The second is based on the ratio of the popularity of a certain post to the mean popularity of other posts with similar photos, measured at time period t (photo-prevalence variability). The researchers show that there is a negative correlation between user activity and popularity of posts in a time frame of a week.

These methods focus primarily on sales or post popularity, but they do not consider brand popularity. They do offer a solid basis for exploring the temporal patterns of popularity of brand-related posts on Instagram. This thesis will investigate time series of the popularity of brands in the past year, and KSC clustering will be used to examine patterns in brand popularity over time. It will focus on incorporating the temporal dimension into popularity prediction on a brand level.

3 Method: Spatio-Temporal Brand Representation

This thesis proposes a novel approach, shown in Figure 1, where the popularity of the brand is predicted based on an aggregation of the features extracted from the posts of the brand. The prediction is done with a weighted aggregation based on the temporal patterns of popularity on Instagram. The weights used for the weighted aggregation are importance parameters based on the popularity⁶ across weeks and per post within a week. This chapter describes the extraction of the different features per post, followed by the construction of the temporal popularity. It then describes the novel approach with the importance parameters and the aggregated feature representations for each brand.

3.1 Feature Extraction

For the extraction of visual and textual features from the posts this thesis will follow the most recent state of the art methods.

⁶ The popularity metric that is used is based on the post like count, described more

Figure 1: Proposed Method; including the brand popularity time series to create the Spatio-Temporal Brand Representation of the extracted features and using this for the prediction on a brand level

3.1.1 Textual Features

The following textual features will be extracted from the captions of the different posts:

Word2Vec: A trained deep neural network proposed in [19], which maps each tag onto a vector that represents the tag's relation to other tags. For a representation that is especially relevant to Instagram, this neural network is trained on Instagram data only.

Textual Sentiment: Represented by making use of SentiBank [1] and SentiStrength [22]. First the stop words are removed and stemming is performed. For each emotion, a tag frequency analysis for the top 100 tags is used. Then the sentiment value of each tag is computed using SentiStrength to generate the positive (ranging from 1 to 5) and negative (ranging from -1 to -5) sentiment scores.

Figure 2: Concepts and ANP’s that received highest probability scores after extraction. The brands behind the images are: American Express (upper-left), Ford (upper-right), MTV (bottom-left) and Nike (bottom-right).

3.1.2 Visual Features

The visual features extracted from the posts of different brands are represented based on the presence of objects and the sentiment within an image.

ConceptVec-15k: A distributional representation of object presence following [15]. The conceptual representation is similar to the detection of concepts in video events as described in [16] and [14], where the authors research video event detection by learning an event from concept detector scores. This representation describes the scene of an image by recognizing the presence of several thousand concepts. Figure 2 shows the concepts that received the highest scores for different images.

Visual Sentiment: A feature based on the Visual Sentiment Ontology, which was first introduced in [1] to detect sentiment in an image. The ontology consists of a collection of 1,200 Adjective-Noun Pairs (ANPs). The sentiment representation consists of a 1,200-dimensional vector with the probabilities of ANPs being present in the image.

Figure 2 visualizes the visual feature extraction from four posts corresponding to American Express, Ford, MTV and Nike. It shows the ten concepts and ten ANPs that received the highest probabilities of being present within the image. The ANPs show actual probabilities of the sentiment being present, while the concept scores are normalized such that the values for all 15,293 concepts sum to one. The distributional representations for the concepts and ANPs are accurate for some images and inaccurate for others.

Start with the image posted by American Express. The top 10 concepts capture some of the objects in the image: Valley, Mountainside and Lakeside are good descriptors. The ANPs are more arbitrary, with Busy Bridge and Rough Road. Scenic Ocean and Warm Creek are understandable, though incorrect, classifications, whereas Icy River, Quiet River and Smooth Water are indeed present in the image. The image posted by Ford is very accurately described by the concept vector: Racer, Car and Sports Car are very accurate descriptors. The ANPs also capture the sunset in the background and the rally corresponding to the image. The image posted by MTV shows the Miss Universe finalists. The concepts and ANPs describe the image accurately, with high values for Dress, Queen, Crown and Cute Girls. For the image posted by Nike (the man in the image is Neymar Jr., a Brazilian soccer player), the concept vector shows high probabilities for some nationalities other than the actual Brazilian nationality. Street Clothes and Champion are somewhat accurate, but overall the image is not accurately represented by the concepts. The same holds for the ANPs; the only accurate description might be Super Team.

Summarizing: the concept representations for the images of American Express, Ford and MTV are accurate, while the one for Nike is less accurate. Ford and MTV are accurately described by the ANPs, while American Express and Nike contain various ANPs that do not correspond to the actual sentiment present in the images.

3.2 Spatio-Temporal Brand Representation

Recent works follow a cross-sectional approach in which the total number of likes is a measure of the popularity of a post, focusing merely on the volume. The approach of this thesis incorporates the temporal pattern of the popularity of brands across different industries.

Figure 3: Spatio-Temporal Brand Representation

The time stamp of each post and the corresponding like count create a time series of the brand popularity over the course of five years. The frequency and distribution of posts differ within and across brands and therefore result in time series of different lengths and time frames. Moreover, the popularity of Instagram as a social media platform has grown vastly over the past five years, which needs to be accounted for when examining brand popularity over a period longer than a year. To make the brand popularity time series comparable and realistic, a moving average was constructed to obtain a metric for the weekly brand popularity, which resulted in 431 comparable brand popularity time series over the past year.⁷

⁷ The weekly moving average was constructed for the weeks in which the posts were posted, resulting in an irregular time series. To address the problem of unavailable weekly brand popularity metrics, interpolation was used.
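The construction of a regular weekly series from irregularly spaced posts can be sketched as follows. This is an illustrative simplification under stated assumptions: the weekly metric is taken to be the mean like count of the posts in that week (the exact moving-average window is not specified here), missing weeks are filled by linear interpolation between neighbouring observed weeks, and the function names are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def week_start(day):
    """Return the Monday of the week that contains the given date."""
    return day - timedelta(days=day.weekday())


def weekly_popularity(posts):
    """Build a regular weekly popularity series for one brand.

    posts: iterable of (date, likes) pairs.
    Returns a list of (week_start, popularity) sorted by week, where weeks
    without posts are filled by linear interpolation (an assumption).
    """
    buckets = defaultdict(list)
    for day, likes in posts:
        buckets[week_start(day)].append(likes)
    if not buckets:
        return []
    observed = {week: sum(v) / len(v) for week, v in buckets.items()}
    weeks = sorted(observed)

    series = []
    for left, right in zip(weeks, weeks[1:]):
        gap = (right - left).days // 7  # number of weekly steps to the next week
        for k in range(gap):
            value = observed[left] + (k / gap) * (observed[right] - observed[left])
            series.append((left + timedelta(weeks=k), value))
    series.append((weeks[-1], observed[weeks[-1]]))
    return series


if __name__ == "__main__":
    example = [
        (date(2016, 1, 4), 100),
        (date(2016, 1, 6), 200),
        (date(2016, 1, 25), 300),  # no posts in the two weeks in between
    ]
    for week, value in weekly_popularity(example):
        print(week, round(value, 1))
```

Interpolating at the weekly level keeps the series of different brands aligned on the same grid, which is what makes them comparable and usable for clustering.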

3.2.1 Importance Parameters

The visual and textual features are extracted from each post and they form a representation that is used for post popularity prediction. These features are unique representations of the different posts, but they do not represent the brand nor the temporal dimension of the brand popularity. To transform the post features into relevant features per brand, importance parameters will be introduced. The parameters are a measure for the relative importance of a certain week based on the popularity of a brand for that week, $\beta_{bj}$, and the relative importance of a post within a week for a certain brand, $\alpha_i$. The weighted aggregation based on the importance parameters is given by the following equation:

$$F_b = \sum_{j=1}^{W} \beta_{bj} \sum_{i \in I_{bj}} \alpha_i F_i, \quad \forall b \in B \tag{1}$$

with $\sum_{j=1}^{W} \beta_{bj} \sum_{i \in I_{bj}} \alpha_i = 1, \; \forall b$. $I_{bj}$ is the set of posts posted by brand $b$ within week $j$. $W$ is the total number of weeks in the time series. $B$ is the set containing all the brands for which a time series is constructed. $F_b$ is the transformed weighted aggregated feature representation for brand $b$, constructed with $F_i$, an extracted feature corresponding to post $i$. The aggregation based on the importance parameters will be tested against another weighted aggregation based on the within-week importance parameters without using the across-week parameters $\beta_{bj}$. The formulation is defined as follows:

$$F_b = \sum_{i \in I_b} \alpha_i F_i, \quad \forall b \in B \tag{2}$$

The third aggregation uses equal weights for all posts of a brand: average pooling. Comparing the results of the three different aggregation methods will show the importance of emphasizing certain posts based on their (temporal) popularity.
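The three aggregation schemes can be illustrated with a short sketch. It is written under stated assumptions: the importance parameters are taken as given inputs (the text does not spell out how they are computed from the popularity series), the normalisation constraint is enforced by dividing by the sum of the combined weights, and all function and argument names are hypothetical.

```python
def _weighted_sum(vectors, weights):
    """Combine feature vectors with weights rescaled to sum to one."""
    total = sum(weights)
    dim = len(vectors[0])
    out = [0.0] * dim
    for vec, w in zip(vectors, weights):
        for k in range(dim):
            out[k] += (w / total) * vec[k]
    return out


def aggregate_stbr(features, post_weight, week_of, week_weight):
    """Equation 1: weight each post by its week weight times its post weight."""
    ids = list(features)
    weights = [week_weight[week_of[i]] * post_weight[i] for i in ids]
    return _weighted_sum([features[i] for i in ids], weights)


def aggregate_within(features, post_weight):
    """Equation 2: use only the per-post weights."""
    ids = list(features)
    return _weighted_sum([features[i] for i in ids], [post_weight[i] for i in ids])


def average_pool(features):
    """Third scheme: equal weight for every post of the brand."""
    ids = list(features)
    return _weighted_sum([features[i] for i in ids], [1.0] * len(ids))


# Toy brand with three posts and two-dimensional features.
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
post_weight = {"a": 1.0, "b": 1.0, "c": 2.0}
week_of = {"a": 0, "b": 0, "c": 1}
week_weight = {0: 1.0, 1: 3.0}

stbr = aggregate_stbr(features, post_weight, week_of, week_weight)
within = aggregate_within(features, post_weight)
pooled = average_pool(features)
```

In the toy data, post "c" falls in the more popular week and therefore dominates the first representation more strongly than the second, while average pooling ignores popularity altogether.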

Figure 3 shows the process of training the model and using it for brand-level prediction. From the content, i.e. the posts generated by the brands on Instagram, features and importance parameters are extracted. These features, multiplied by the importance parameters per post, form the Spatio-Temporal Brand Representation (STBR). The brand representation is a weighted aggregation that describes a brand based solely on the content posted. The number of likes of the posts per brand forms the popularity. The brand popularity and the STBR are used to train the model that is used for brand-level prediction.

3.3 Prediction

This thesis uses the current state-of-the-art works as a baseline for popularity prediction. L2-regularized L2-loss Support Vector Regression (SVR) from the LIBLINEAR [4] package is used for training and prediction. The log-normalized number of likes is used as the popularity metric. Five-fold cross-validation is performed over C = [0.01; 0.1; 1; 10; 100] to determine the optimal cost parameter. The importance of each feature for popularity prediction is assessed by first using the features separately and then in combination. The prediction is hypothesized to be most accurate when all of these features are combined.

$$\rho = \frac{\sum_i (r_i - \bar{r})(s_i - \bar{s})}{\sqrt{\sum_i (r_i - \bar{r})^2}\,\sqrt{\sum_i (s_i - \bar{s})^2}} \tag{3}$$

Interest lies in the relative popularity of posts, so popularity prediction is treated as a ranking problem, following [6, 10, 15]. Performance is reported with Spearman's rank correlation (Equation 3), which measures the rank correlation between the predicted popularity and the ground truth popularity in the test phase. For the brand-level prediction, support vector regression as well as a multi-class support vector machine are used. The brand ranking alone is rather limited, so brand ranking categories are constructed and the experiment is performed on these categories. Prediction is done with a multi-class support vector machine using a one-versus-all approach: for every category a one-versus-all binary classification is performed, which leads to probabilities of a brand belonging to the different categories. The predicted category in the test phase is the category with the highest probability. The accuracy is simply the percentage of correctly classified brands.

Figure 4: Random posts corresponding to four highly ranked brands.
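Equation 3 can be computed directly once both popularity lists are converted to ranks. The sketch below is illustrative only: it assumes that $r_i$ and $s_i$ are the ranks of the predicted and the ground truth popularity, and that tied values share the average of their rank positions, which the text does not state. The function names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman(predicted, truth):
    """Pearson correlation of the two rank vectors, as in Equation 3."""
    r = average_ranks(predicted)
    s = average_ranks(truth)
    r_bar = sum(r) / len(r)
    s_bar = sum(s) / len(s)
    numerator = sum((ri - r_bar) * (si - s_bar) for ri, si in zip(r, s))
    denominator = sqrt(sum((ri - r_bar) ** 2 for ri in r)) * sqrt(
        sum((si - s_bar) ** 2 for si in s)
    )
    return numerator / denominator


rho = spearman([0.1, 0.4, 0.3, 0.9], [1, 3, 4, 2])
```

The function is undefined (division by zero) when one of the lists is constant, since all ranks then coincide.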

4 Experimental Setting

In this section the experimental setting will be discussed, starting with the process of gathering the data and some description and visualization of the content the brands have generated. After the data follows the description of the experiments: An investigation of temporal patterns in popularity of the brands in the data set, popularity prediction on a post- and brand-level and brand ranking prediction.

4.1 Data

This thesis applies the proposed method to a dataset of posts crawled from Instagram. Instagram is used for self-expression through images with a description consisting of hashtags and a caption. The combination of text, image and self-expression makes Instagram an interesting platform for research into brand popularity. This thesis looks at approximately 700,000 posts from brands in the United States across 27 different industries. Table 1 gives an overview of the Instagram activity of the most popular brands across the different industries. As Table 1 shows, there is great variety in the use and popularity of Instagram accounts in the different industries. In [10, 15] the authors show the importance of the number of followers for the popularity of posted images, which can be seen in Table 1 as well. Spearman's rank correlation between the average number of likes and the number of followers for the brands is 0.656. More followers are associated with higher average numbers of likes and comments. Figure 4 shows random posts corresponding to four highly ranked brands, with the image and the description along with the number of likes these posts received.

Table 1: Most Popular Brands

| Brand | Images | Likes | Comments | Followers | Ranking |
|---|---|---|---|---|---|
| Accenture | 175 | 5 | 0 | 555 | 129 |
| Amazon.com | 739 | 14 | 4 | 25,913 | 4 |
| American Express | 1,023 | 113 | 4 | 11,773 | 38 |
| Bank of America | 71 | 97 | 5 | 11,539 | 116 |
| Chevrolet | 383 | 2 | 0 | 5,366 | 232 |
| Cisco | 583 | 10 | 4 | 62,525 | 40 |
| ComCast | 312 | 14 | 1 | 1,413 | 191 |
| ExxonMobil | 283 | 4 | 0 | 2,113 | 61 |
| Facebook | 291 | 281 | 23 | 1,701,180 | 26 |
| Ford | 1100 | 323 | 18 | 1,085,851 | 49 |
| Google | 30 | 116 | 2 | 8,219 | 2 |
| Harley-Davidson | 153 | 8 | 0 | 1,671 | 261 |
| IBM | 367 | 58 | 3 | 67,938 | 23 |
| Intel | 807 | 8 | 0 | 453,183 | 13 |
| Kentucky Fried Chicken | 447 | 722 | 96 | 728,661 | 143 |
| Kleenex | 100 | 14 | 0 | 3,474 | 629 |
| MasterCard | 431 | 36 | 8 | 53,295 | 50 |
| McDonald’s | 391 | 1,347 | 4 | 106,822 | 18 |
| Microsoft | 236 | 3,44 | 11 | 385,736 | 7 |
| MTV | 6887 | 47,232 | 480 | 5,363,186 | 403 |
| Nike | 824 | 1,086 | 147 | 39,292,653 | 10 |
| Oracle | 457 | 6 | 1 | 47,714 | 67 |
| PayPal | 939 | 128 | 5 | 36,903 | 232 |
| Starbucks | 367 | 35 | 0 | 13,158 | 28 |
| Union Pacific Railroad | 400 | 52 | 0 | 9,823 | 282 |
| UPS | 204 | 99 | 11 | 21,811 | 30 |
| Visa | 180 | 13 | 0 | 36,99 | 24 |

To construct a representative time series of brand popularity, only the brands with a weekly popularity metric for the past year are considered. The posts in 2015-2016 for the 431 relevant brands are used. This narrows the data set down to 160,000 posts that are examined in the experiments. The brand rankings shown in Table 1 are a combination of various global rankings. The rankings chosen for this thesis are global rankings from firms such as Barron’s, Interbrand, Forbes, Fortune and NetBase. To get an accurate representation of the brands’ value as well as their popularity among consumers, the rankings are a combination of top 100s in brand performance, brand popularity and brand value. All brands that appear in at least one of the rankings in the data set are taken into account for prediction. These are top 100s, which means that a brand that does not appear in a particular ranking is ranked lower in that ranking. Brands that appear in more top 100s are therefore perceived to be ranked higher than others. A combination of the brand rankings and the number of rankings a brand appears in leads to the ranking shown in Table 1. All the rankings that are taken into account can be found in the appendix.

The brand rankings are limited and prediction might easily be inaccurate. To address this problem, brand categories based on these rankings are also constructed. The categories represent high, medium, low and no ranking. Table 2 shows the distribution of the 431 brands among the categories.

Table 2: Category Distribution and Cluster Match

| Category | Frequency |
|---|---|
| High | 104 |
| Medium | 115 |
| Low | 73 |
| No ranking | 137 |

4.2 Experiments

Experiment 1: Investigation of temporal patterns in brand popularity: The first experiment investigates the temporal patterns of the different brands. The authors of [24] propose the K-Spectral Centroid (KSC) algorithm for clustering time series, which outperforms the common K-means in finding patterns in social media content. KSC clustering is robust against scaling and shifting: time series with a similar shape are considered similar even when they differ in volume or spike at different times. By clustering the popularity time series of the brands, distinct patterns can be found that reveal how brands are appreciated over the course of a year. The presence of certain patterns would confirm the importance of the temporal dimension in brand-level prediction. Certain posts are more popular in certain time frames, and this difference in popularity might not be explained by the number of followers or the content of the post alone.
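The scale- and shift-invariance of KSC comes from its distance measure, which can be sketched as below. The sketch follows the distance as defined in [24], where one series is shifted and optimally rescaled before the residual is measured relative to the norm of the other. It shows only the distance, not the centroid update of the full clustering algorithm. The zero-padded shift, the `max_shift` argument and the function names are assumptions made for illustration, and both series are assumed to have equal length.

```python
from math import sqrt


def shift(series, q):
    """Shift a series by q steps (right if q > 0), padding with zeros."""
    n = len(series)
    if q >= 0:
        return [0.0] * min(q, n) + list(series[: max(n - q, 0)])
    return list(series[min(-q, n):]) + [0.0] * min(-q, n)


def ksc_distance(x, y, max_shift=None):
    """min over shift q and scale a of ||x - a * y_q|| / ||x||."""
    norm_x = sqrt(sum(v * v for v in x))
    if norm_x == 0:
        raise ValueError("x must not be all zeros")
    if max_shift is None:
        max_shift = len(x) - 1

    best = 1.0  # a scale of zero always gives ||x|| / ||x|| = 1
    for q in range(-max_shift, max_shift + 1):
        y_q = shift(y, q)
        energy = sum(v * v for v in y_q)
        if energy == 0:
            continue
        # Closed-form optimal scale for this shift (least squares).
        scale = sum(a * b for a, b in zip(x, y_q)) / energy
        residual = sqrt(sum((a - scale * b) ** 2 for a, b in zip(x, y_q)))
        best = min(best, residual / norm_x)
    return best


# Same shape, ten times the volume, peaking one week later.
d = ksc_distance([0, 1, 2, 1, 0], [0, 0, 10, 20, 10])
```

Two weekly series with the same shape but different volume and timing therefore end up at distance zero, which is the behaviour the experiment relies on.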

Experiment 2: Post popularity prediction using multimodal features: The popularity prediction accuracy is examined for the different features, separately as well as combined. This shows the importance of the features extracted from the Instagram posts for popularity.

First, only the textual features extracted from the captions of the different posts are used for popularity prediction. These capture the information from the description that comes with an Instagram post. The rank correlation of the predicted popularity with the ground truth popularity is reported for Word2Vec, Textual Sentiment and a Late Fusion of these.

Second, only the visual features extracted from the images of the different posts are used for popularity prediction. The representation of an image is based on the presence of objects in the image and the sentiment that the image holds. The prediction accuracy for ConceptVec-15k, Visual Sentiment and a Late Fusion is examined.

In [15] the authors show that visual and textual features are complementary for predicting popularity. The accuracy of post popularity prediction with multimodal features is also examined. The combination of features gives an overview of the current state-of-the-art prediction methods on the data set at hand. It also provides a comparison with the brand-level prediction that is described next.

Experiment 3: Brand-level popularity prediction. Experiment 2 predicts popularity on the post level. This experiment focuses on the brands that posted the content. The proposed method aggregates the features extracted from the posts of each brand into a brand-level representation. The main idea of this novel approach is to perform support vector regression with the Spatio-Temporal Brand Representation of features. The weighted aggregation of the features, using importance parameters based on the popularity time series, gives a feature representation of the brands. The aggregation gives insight into how all of a brand’s posts combined are represented in features. The experiment performs the brand-level prediction and shows the importance of incorporating the temporal dimension into the model. The number of likes of the posts of each brand is used to construct an average brand popularity. The prediction follows the same approach as Experiment 2. Three different methods for creating the brand-level feature representations are compared: a weighted aggregation based on the predicted popularity, the Spatio-Temporal Brand Representation, and a crude aggregation by average pooling of the features.

Experiment 4: Brand ranking prediction: Experiments 2 and 3 are popularity predictions focusing on Instagram popularity rather than on the brand rankings that represent overall brand value. For part of the data set these brand rankings are available. The same brand-level features are now used for brand ranking prediction. The experiment examines, similarly to Experiment 3, how well the features predict overall brand value. Accurate brand ranking prediction forms a basis for explaining how globally successful and unsuccessful brands use social media, and Instagram in particular, to enhance their brand. Unfortunately, the sparsity of the brand rankings makes prediction with support vector regression highly unstable. Therefore, the brands are categorized based on their rankings. The prediction then becomes a multi-class classification in which the brands are classified into one of the four categories. The accuracy reported is the percentage of correctly classified brands in the test phase. As in Experiment 3, the Spatio-Temporal Brand Representation is tested against average pooling of the features.
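The final decision step and the accuracy metric of Experiment 4 can be illustrated with a minimal sketch. It assumes that the four one-versus-all classifiers have already produced one probability per category for each brand. The category labels, the example probabilities and the function names are hypothetical and not taken from the experiments.

```python
CATEGORIES = ("high", "medium", "low", "no ranking")


def predict_category(probabilities):
    """Pick the category whose one-versus-all classifier is most confident."""
    return max(CATEGORIES, key=lambda category: probabilities[category])


def accuracy(predicted, actual):
    """Percentage of brands whose predicted category matches the true one."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# Illustrative probabilities for three brands (made-up numbers).
brand_probabilities = [
    {"high": 0.6, "medium": 0.2, "low": 0.1, "no ranking": 0.1},
    {"high": 0.1, "medium": 0.3, "low": 0.2, "no ranking": 0.4},
    {"high": 0.2, "medium": 0.5, "low": 0.2, "no ranking": 0.1},
]
predicted = [predict_category(p) for p in brand_probabilities]
score = accuracy(predicted, ["high", "low", "medium"])
```

With four categories of roughly similar size, an accuracy near 25% would correspond to guessing, which is a useful reference point when reading the classification results.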

5 Results

In this section the results of the experiments will be discussed. First an investigation of possible temporal patterns in the data set and second the popularity prediction on a post level followed by the popularity prediction on a brand level. The results for the brand ranking prediction are discussed at the end of this section.

5.1 Experiment 1: Investigation of temporal patterns in brand popularity

Figure 5: Cluster centroids after performing KSC clustering on 431 brands

Figure 5 shows the cluster centroids after KSC clustering of the 431 brands and their weekly popularity on Instagram over the past year. The centroids give insight into the temporal patterns that the brands within each cluster follow. The clustering shows four distinct popularity patterns. There is barely any correlation8 noticeable between the categories and the cluster assignment; however, the highly ranked brands are somewhat more present in the cluster with the downward pattern and the lowly ranked brands are more present in the cluster with the upward pattern. Even though there is very little correlation and no specific pattern for the different categories, the distinct patterns the brands follow do suggest the importance of incorporating the temporal dimension. Because the popularity of brands differs across time frames, this pattern is likely due to more factors than the posts or brands alone and should therefore be taken into account for brand-level prediction.

5.2 Experiment 2: Post Popularity Prediction Using Multi Modal Features

The results of Experiment 2 are presented in Table 3. Spearman’s rank correlation is reported for the prediction accuracy using the different features in support vector regression. Starting with the textual features, Table 3 shows a rank correlation of 0.346 for Word2Vec and 0.026 for textual sentiment. The Word2Vec representation of the caption and hashtags of a post has significantly more predictive power than the textual sentiment. The Word2Vec method is a deep neural network trained on Instagram specifically and accurately captures the nature of the caption and hashtags of the posts. The textual sentiment does not seem to capture this very well. A late fusion9 over the ranks after prediction with the two features leads to a lower rank correlation than Word2Vec alone, so it is better to use Word2Vec alone.

Table 3 shows a rank correlation of 0.254 for the ConceptVec-15k representation. For an optimal and efficient support vector regression, only the top fifteen hundred concepts were selected for the feature representation. The concepts present in images hold information on the popularity of a post. People may respond to concepts in images positively or negatively, and a post that contains popular concepts generally attains more likes. A similar explanation holds for the sentiment within an image. The adjective-noun pairs (ANPs) present in images capture the sentiment of an image in a similar fashion as the ConceptVec-15k. The rank correlation for the visual sentiment features is 0.313. This is higher than for the concept representation and therefore suggests that the sentiment within an image is more important than a particular concept. Late fusion of the concept and the sentiment features leads to a rank correlation of 0.348. The results suggest these features are complementary and demonstrate that concepts and ANPs have a high impact on the popularity of brand posts on Instagram.

8 Support vector regression of rankings on distances to cluster centroids yields a Spearman’s rank correlation of 0.02.

9 Two methods of late fusion were explored. The first is a simple average over the different features and the second is a weighted average. The latter turned out to be most effective for all three combinations.

Table 3: Analysis of different visual and textual features on post popularity prediction.

| Textual: W2V | Textual: Sentiment | Textual: Fusion | Visual: Concept | Visual: Sentiment | Visual: Fusion | Multimodal: Fusion |
|---|---|---|---|---|---|---|
| 0.346 | 0.026 | 0.345 | 0.254 | 0.313 | 0.348 | 0.416 |
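The two late-fusion variants mentioned in footnote 9 can be sketched as follows. This is an illustration under assumptions: fusion is applied to the ranks of the per-feature predictions, the fusion weights are supplied by the caller (how they were chosen is not stated in the text), and the function names and example scores are hypothetical.

```python
def to_ranks(scores):
    """1-based rank of every item, lowest score first (ties broken by position)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def late_fusion(score_lists, weights=None):
    """Fuse per-feature predictions by a (weighted) average of their ranks.

    score_lists: one list of predicted popularity scores per feature type,
    all covering the same posts in the same order.
    weights: optional fusion weights; equal weights give the simple average.
    """
    if weights is None:
        weights = [1.0] * len(score_lists)
    total = sum(weights)
    rank_lists = [to_ranks(scores) for scores in score_lists]
    n_items = len(score_lists[0])
    return [
        sum(w * ranks[i] for w, ranks in zip(weights, rank_lists)) / total
        for i in range(n_items)
    ]


textual = [0.2, 0.9, 0.5]  # e.g. predictions from a textual feature
visual = [0.7, 0.8, 0.1]   # e.g. predictions from a visual feature

simple = late_fusion([textual, visual])
weighted = late_fusion([textual, visual], weights=[3.0, 1.0])
```

The fused values are themselves only meaningful as a ranking, which matches the use of Spearman's rank correlation as the evaluation metric.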

Table 3 shows that, when considering the individual features for popularity prediction, Word2Vec is the most effective. The effectiveness of Word2Vec can be explained by the fine-tuning on an Instagram database. The combination of visual features has an impact of similar size. In [15], Mazloom et al. show that visual and textual features are complementary for post popularity prediction on their Instagram data set. The rank correlation for multimodal fusion in Table 3 confirms this. There is a significant increase in rank correlation when combining the textual features with the visual features. The late fusion of Word2Vec, ConceptVec-15k and Visual Sentiment achieves the best performance: 0.416.

Table 4: Experiment 3: Analysis of visual and textual features on brand popularity prediction.

| Aggregation | Textual: W2V | Textual: Sentiment | Textual: Fusion | Visual: Concept | Visual: Sentiment | Visual: Fusion | Multimodal: Fusion |
|---|---|---|---|---|---|---|---|
| Weighted* | 0.505 | 0.000 | n/a | 0.291 | 0.387 | 0.371 | 0.522 |
| STBR** | 0.539 | 0.000 | n/a | 0.282 | 0.409 | 0.393 | 0.553 |
| Crude | 0.540 | 0.000 | n/a | 0.337 | 0.424 | 0.421 | 0.563 |

*weights based on predicted popularity
**Spatio-Temporal Brand Representation

5.3 Experiment 3: Brand-Level Popularity Prediction

For the brand-level prediction, all posts of the 431 brands are aggregated into a brand-level representation. Table 4 shows the results of brand-level popularity prediction using the brand features. Spearman’s rank correlation is reported for the prediction using different features.

Table 4 shows the results of brand popularity prediction for three different methods of aggregating the feature representation of the brands. The first row shows the results for the aggregation based on the predicted popularity of each post of a brand. The second row shows the results for the aggregation based on the importance parameters. The third row shows the results after average pooling of the features of all posts of a brand.

The textual sentiment features are not a good representation of the brands. The rank correlation between the predicted popularity and the ground truth popularity at test time is very volatile around an average of zero. Word2Vec has a high impact on the brand popularity, with rank correlations of 0.505, 0.539 and 0.540 for the different aggregations. The Word2Vec representation holds a lot of information on a brand’s popularity. No fusion of the textual features is performed, as the textual sentiment features have no impact on the popularity.

The weighted representations of the ConceptVec-15k reach rank correlations of 0.291 and 0.282. The average pooling representation reaches 0.337. The sentiment features have more impact, following the trend in Experiment 2. The Sentiment features lead to rank correlations of 0.387 for the predicted popularity aggregation, 0.409 for the importance parameter aggregation and 0.424 for the average pooling. To test whether these features are complementary, late fusion is performed. The results show that for the visual features, unlike on the post level, this is not the case. The rank correlation after late fusion is lower than for the Visual Sentiment features alone.

The results for the multimodal fusion show that for all three aggregations the textual and visual features are complementary. The combined features reach a rank correlation of up to 0.563. The results show that incorporating the temporal dimension into the weighted aggregation leads to better results, as the rank correlation goes from 0.522 to 0.553 after fusion of textual and visual features. However, when no weights are incorporated and average pooling is performed, the prediction is even more accurate. Comparing post-level with brand-level prediction shows that brand-level prediction achieves better performance. An explanation could be that the features on the brand level are more distinctive and therefore have a higher explanatory power over the popularity.

5.4 Experiment 4: Brand Ranking Prediction

The results of support vector regression on the brand rankings are unstable and insignificant, and therefore only multi-class classification on the brand ranking categories is performed. Only the weighted aggregation based on the importance parameters is used, as it has been shown to be more effective than weighted aggregation based on predicted popularity. Table 5 shows the accuracy of multi-class classification for predicting brand ranking categories.

Table 5: Experiment 4: Analysis of visual and textual features on brand category prediction.

| Aggregation | Textual: W2V | Textual: Sentiment | Textual: Fusion | Visual: Concept | Visual: Sentiment | Visual: Fusion | Multimodal: Fusion |
|---|---|---|---|---|---|---|---|
| STBR | 38.8% | 30.1% | 37.5% | 42.3% | 31.1% | 40.4% | 43.3% |
| Crude | 37.6% | 29.4% | 35.9% | 42.8% | 31.9% | 40.8% | 43.1% |

According to Table 5, classification accuracies of 38.8% and 37.6% are reached with the Word2Vec representation. The Textual Sentiment features correctly predict the brand category 30.1% and 29.4% of the time. The Word2Vec representation of a brand holds information on the brand’s usage of captions, which says something about its ranking compared to the other brands. For the sentiment this is very limited, similar to Experiments 2 and 3. The late fusion of the textual features does not lead to an increase in accuracy, which is due to the limited predictive power the textual sentiment has over brand ranking. The textual features in the STBR lead to a slightly more accurate prediction.

The visual features on the brand level lead to classification accuracies of 42.3% and 42.8% for ConceptVec-15k and 31.1% and 31.9% for Visual Sentiment. The aggregation of concept presence achieves the highest accuracy. A possible explanation is that the aggregated concept presence in a brand’s images distinguishes a brand from other brands, more so than the other features. The concepts present in posts are likely to correspond to the product of a brand. A high probability for a car corresponds to Ford or Chevrolet rather than McDonald’s or Nike. The Visual Sentiment reaches a similar accuracy to the Textual Sentiment, which means that it does not hold particular information on a brand and its ranking. The Visual Sentiment has more predictive power over the popularity of an image than over the ranking of a brand. Late fusion of the visual features does not lead to an increase in accuracy, probably due to the lack of predictive power of the Visual Sentiment features. In contrast to the textual features, the crude aggregation leads to a slight increase in prediction accuracy.

The late fusion of the visual features and the late fusion of the textual features do not improve the classification accuracy. The Sentiment features, textual as well as visual, have very little explanatory power over the brand ranking categories. For the multimodal fusion different combinations were examined, and the highest accuracy was achieved when combining only the Word2Vec and the ConceptVec-15k by late fusion. The accuracy of classification with a late fusion of these textual and visual features is 43.3% for the weighted aggregation and 43.1% for the crude aggregation. For the prediction of brand ranking categories the textual and visual information are complementary as well. The different methods of aggregation show very similar results; the difference of 0.2 percentage points is negligible.

From the results of the different experiments it can be concluded that there are distinct patterns in brand popularity, but they do not reveal a specific temporal composition of brand ranking categories. The results show that Word2Vec is most accurate for popularity prediction on the post level as well as the brand level. The textual and visual features are complementary and a late fusion leads to the highest prediction accuracy. Incorporating the temporal dimension leads to an increase in accuracy when comparing the methods of weighted aggregation. However, average pooling has been shown to be most accurate for brand-level popularity prediction. The brand ranking category classification confirms the complementarity of the visual and textual features as well. The Spatio-Temporal Brand Representation reaches a higher accuracy than average pooling. However, the classification accuracies show that the information extracted from the content generated by brands has very limited predictive power over the brand rankings and categories in the data set.

6 Conclusion

This thesis set out to investigate how visual and textual features impact the popularity of content posted by brands on Instagram. It sought a representation of the features on the brand level and investigated the prediction of brand popularity and overall brand value using the brand-level features, in contrast to previous research, which has focused primarily on post-level prediction from a cross-sectional perspective. The thesis sought an answer to the research question: Is it possible to predict the spatio-temporal popularity of brands by examining posts?

The results in the previous chapter show the prediction accuracy for the different features on the post level and the brand level. Spearman’s rank correlation shows the impact of the different features on the popularity of a post. Word2Vec has the highest impact on the popularity, whereas Textual Sentiment barely has an impact. The Word2Vec feature representation is fine-tuned on an Instagram database and is therefore most effective. Late fusion of Word2Vec, ConceptVec-15k and Visual Sentiment confirms the complementarity of the textual and visual features.

K-Spectral Centroid clustering has shown that there are four distinct patterns of brand popularity over the past year. It shows that posts related to brands are more popular during certain time frames and shows the importance of the temporal dimension of popularity for the Spatio-Temporal Brand Representation. This thesis emphasized brand-level prediction while incorporating the temporal dimension of popularity by constructing importance parameters. The importance parameters capture the popularity of a brand over time and emphasize the feature representations of posts that are more popular across and within weeks. The Spatio-Temporal Brand Representation forms a solid foundation for brand-level popularity and brand ranking prediction. The results show that brand-level popularity prediction leads to an accuracy that is even higher than that of post-level prediction. Prediction using feature representations constructed with the importance parameters, which incorporate the temporal dimension, is more accurate than aggregation based on the predicted popularity of individual posts. However, average pooling of the features turned out to be most accurate for brand-level popularity prediction, which suggests that the novel approach has not yet proven the importance of incorporating the temporal dimension.

Building on the success of post-level and brand-level popularity prediction, prediction of brand rankings was attempted. The accuracy of classification of the brand ranking categories is low, though not insignificant. This suggests that features extracted from the posts on Instagram have limited explanatory power over the brand ranking categories. The ConceptVec-15k was the most accurate individual feature, as it is most distinctive for brands. Complementarity of textual and visual features is confirmed again by the brand ranking category classification, as the late fusion reached the highest accuracy.

The main contribution of this thesis is prediction on the brand level. The approach of this thesis has been shown to be effective in predicting brand-level popularity. The results show that average pooling of the features constructs a representation that is most accurate in predicting brand-level popularity. The temporal composition needs to be investigated further to become more accurate than average pooling. The brand rankings have been constructed by combining top 100s from different sources, which is rather limited. It is important to find accurate rankings that can be explained by a brand’s Instagram or social media activity. For brand ranking prediction to be accurate, more applicable rankings are needed.

Not only has this thesis been shown to predict post-level popularity effectively, it has also shown how brand-level popularity prediction, using a brand-level representation of the features, can be performed even more accurately. The proposed method shows promise and has laid the groundwork for further research into overall brand value and brands’ social media usage.


References

[1] Damian Borth et al. “Large-scale visual sentiment ontology and de-tectors using adjective noun pairs”. In: Proceedings of the 21st ACM international conference on Multimedia. ACM. 2013, pp. 223–232. [2] Spencer Cappallo, Thomas Mensink, and Cees GM Snoek. “Latent

fac-tors of visual popularity prediction”. In: Proceedings of the 5th ACM on International Conference on Multimedia Retrieval. ACM. 2015, pp. 195–202.

[3] Jia Deng et al. “Imagenet: A large-scale hierarchical image database”. In: Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on. IEEE. 2009, pp. 248–255.

[4] Rong-En Fan et al. “LIBLINEAR: A Library for Large Linear Classifi-cation”. In: Journal of Machine Learning Research 9 (2008), pp. 1871– 1874.

[5] Nikhil Garg and Ingmar Weber. “Personalized, interactive tag recom-mendation for flickr”. In: Proceedings of the 2008 ACM conference on Recommender systems. ACM. 2008, pp. 67–74.

[6] Francesco Gelli et al. “Image popularity prediction in social media us-ing sentiment and context features”. In: Proceedus-ings of the 23rd ACM international conference on Multimedia. ACM. 2015, pp. 907–910. [7] Peter N Golder and Gerard J Tellis. “Will it ever fly? Modeling the

takeo↵ of really new consumer durables”. In: Marketing Science 16.3 (1997), pp. 256–270.

[8] Liangjie Hong, Ovidiu Dan, and Brian D Davison. “Predicting pop-ular messages in twitter”. In: Proceedings of the 20th international conference companion on World wide web. ACM. 2011, pp. 57–58.

(38)

[9] Brendan Jou et al. “Visual a↵ect around the world: A large-scale mul-tilingual visual sentiment ontology”. In: Proceedings of the 23rd ACM international conference on Multimedia. ACM. 2015, pp. 159–168. [10] Aditya Khosla, Atish Das Sarma, and Ra↵ay Hamid. “What makes an

image popular?” In: Proceedings of the 23rd international conference on World wide web. ACM. 2014, pp. 867–876.

[11] Alex Krizhevsky, Ilya Sutskever, and Geo↵rey E Hinton. “Imagenet classification with deep convolutional neural networks”. In: Advances in neural information processing systems. 2012, pp. 1097–1105. [12] Yann LeCun and Yoshua Bengio. “Convolutional networks for images,

speech, and time series”. In: The handbook of brain theory and neural networks 3361.10 (1995), p. 1995.

[13] W Glynn Mangold and David J Faulds. “Social media: The new hybrid element of the promotion mix”. In: Business horizons 52.4 (2009), pp. 357–365.

[14] Masoud Mazloom, Xirong Li, and Cees Snoek. “TagBook: A Semantic Video Representation without Supervision for Event Detection”. In: (2015).

[15] Masoud Mazloom et al. “Multimodal Popularity Prediction of Brand-related Social Media Posts”. In: ACM Multimedia Amsterdam, Nether-lands. 2016.

[16] Masoud Mazloom et al. “Searching informative concept banks for video event detection”. In: Proceedings of the 3rd ACM conference on International conference on multimedia retrieval. ACM. 2013, pp. 255– 262.

[17] Philip J McParlane, Yashar Moshfeghi, and Joemon M Jose. “Nobody comes here anymore, it’s too crowded; Predicting Image Popularity on Flickr”. In: Proceedings of International Conference on Multimedia Retrieval. ACM. 2014, p. 385.

[18] Philip J McParlane, Yashar Moshfeghi, and Joemon M Jose. “On contextual photo tag recommendation”. In: Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval. ACM. 2013, pp. 965–968.

[19] Tomas Mikolov et al. “Distributed representations of words and phrases and their compositionality”. In: Advances in neural information processing systems. 2013, pp. 3111–3119.

[20] Marten Risius and Roman Beck. “Effectiveness of corporate social media activities in increasing relational outcomes”. In: Information & Management 52.7 (2015), pp. 824–839.

[21] Stefan Stieglitz, Linh Dang-Xuan, Axel Bruns, and Christoph Neuberger. “Social Media Analytics - An Interdisciplinary Approach and Its Implications for Information Systems”. In: Business & Information Systems Engineering 6.2 (2014), pp. 89–96.

[22] Mike Thelwall et al. “Sentiment strength detection in short informal text”. In: Journal of the American Society for Information Science and Technology 61.12 (2010), pp. 2544–2558.

[23] Bo Wu et al. “Unfolding Temporal Dynamics: Predicting Social Media Popularity Using Multi-scale Temporal Decomposition”. In: Thirtieth AAAI Conference on Artificial Intelligence. 2016.

[24] Jaewon Yang and Jure Leskovec. “Patterns of temporal variation in online media”. In: Proceedings of the fourth ACM international conference on Web search and data mining. ACM. 2011, pp. 177–186.

[25] Daniel Zeng et al. “Social media analytics and intelligence”. In: IEEE Intelligent Systems 25.6 (2010), pp. 13–16.

Table 6: Brand Rankings

Ranking                                       Rank Owner                 Category
Best Global Brands                            Interbrand                 Brand Value
Brand Finance US Top 100                      Brand Finance              Brand Value
BrandZ Top 100 Most Valuable Global Brands    Millward Brown             Brand Value
Fortune Global 500 (100)                      Fortune                    Brand Value
Fortune U.S. 500 (100)                        Fortune                    Brand Performance
Global RepTrak 100                            Reputation Institute       Brand Popularity
Global Reputation Pulse - U.S. Top 100        Reputation Institute       Brand Popularity
Global Top 100 Brand Corporations             European Brand Institute   Brand Value
The World’s Most Valuable Brands              Forbes                     Brand Value
Top 100 Brand Love List                       NetBase                    Social Media
Top Global Meaningful Brands Index            Havas Media                Brand Value
Top U.S. Brand Index Buzz Ranking             YouGov                     Brand Popularity
World’s Most Admired Companies                Fortune                    Brand Popularity
World’s Most Respected Companies              Barron’s                   Brand Performance
