
University of Amsterdam

WHAT MUSICAL FEATURES MAKE POPULAR MUSIC POPULAR?

Albertine Holterman

Supervisor: dr. J.A. Burgoyne

19 July 2016

University of Amsterdam

Faculty of Humanities


Table of contents

Abstract
1. Introduction
2. Background and related work
2.1. Charts and popularity
2.2. Emotion and preference
2.3. Popularity of music based on non-musical features
2.4. MIR studies and hit song science
2.4.1. Popularity of music based on musical features
3. Study: assessing music’s popularity based on musical features
3.1. Method
3.1.1. Song selection
3.1.2. Procedure and materials
3.1.3. Data analysis
3.2. Results
3.2.1. Gini coefficient
3.2.2. Top 100 chart-ranking and popularity
3.3. Discussion
4. General discussion
4.1. Limitations
5. Conclusion
References
Websites
Software
Appendix I


Abstract

This thesis investigates popular music. More specifically, it asks whether there are certain musical features that account for differences in popularity between songs. Based on a framework of existing literature on popular music ‘prediction’ and music information retrieval, a study was conducted using data from the Dutch Top 2000, from which 90 songs were selected, varying in the number of years they had spent in the Top 2000. All selected songs had also been in the annual Dutch top 100 and were released between 1999 and 2008. Musical features were extracted from the melodies of these songs with the FANTASTIC toolbox (Müllensiefen 2009). Linear regression suggested a model (p < .05, R² = .09) including two components that were significant predictors of the number of years a song spent in the Top 2000, and thus of how popular a song was over the years. These components related to a relative complexity in the rhythm (p = .12) and use of pitch (p = .02). In accordance with existing literature, it was also found that the average rank in the annual top 100 had a significant effect on the number of years in the Top 2000 (p < .01, R² = .09), suggesting that a high ranking in the top 100 enhanced a song’s life in the Top 2000 as well. Combining the two musical components and the average top 100 rank resulted in the best model (p < .001, R² = .17).


1. Introduction

What is the difference between ‘Bohemian Rhapsody’ and ‘Bohemian Like You’ (by Queen and the Dandy Warhols, respectively)? The easiest answer: a difference of roughly 1999 chart positions between the two songs in the Top 2000. ‘Bohemian Rhapsody’ was number two in the Dutch Top 2000 of 2015, and has been the most frequent number one song of the Top 2000: it was number one 13 out of a total of 17 times.¹ ‘Bohemian Like You’ is number two thousand.² What accounts for such a big difference in the popularity of songs? Is it the artist, the music, or the other people who like it? Why do we prefer some music over other music? This thesis aims to investigate the musical aspect of popularity.

Since the 1950s, when certain forms of music (popular music) became widely available to the masses, there have been charts to keep track of songs’ popularity. These charts show which songs are ‘hot’, which songs are climbing, and which songs are falling. There are ‘all time’ top charts, charts covering periods such as decades, and weekly, monthly and annual charts such as a top 10, 20, 40, or 100. There are charts for different genres, and at least every Western country has something like a national chart (Bekhuis et al. 2013). The popularity of songs and artists is a subject that keeps the public interested and intrigued. Music is a consumer product and has been subject to commercialization. The popular music industry is (partly) an industry aimed at making profits, and record companies invest a lot of money in artists and artist promotion so that the music will be as popular (and profitable) as possible (Hong 2012: 1105). It is therefore not without reason that many have tried to find the element or combination of elements that makes a song a hit. The fact that several companies sell their ‘hit predicting’ results to record labels shows that the music industry takes (the possibility of) hit prediction seriously (Pachet and Sony 2012: 306).

This study contributes to the investigation of popular music. Besides all the aspects that are hard to measure objectively, such as the emotions that music evokes, what are the intrinsic qualities of the music that make certain songs so popular? The main question of this thesis is therefore as follows:

What, if any, are the specific musical features that account for differences in the popularity of songs?

¹ See: <http://www.nporadio2.nl/top2000>. 13 June 2016.
² See: <http://www.nporadio2.nl/top2000>. 13 June 2016.


The investigation of what makes songs popular is an existing field of study, as will be outlined below. However, the approaches used have been manifold, and only a few studies have focused on intrinsic musical qualities that could account for music’s popularity. In order to answer the research question, a background of studies related to the popularity of and preference for music, as well as studies on music information retrieval (MIR), will be given. This provides a theoretical framework for the method chosen for the study in this thesis, which was conducted to answer the research question. Results from this study will then be compared to existing studies, which will lead to my final conclusion on whether any musical features are significant in predicting or explaining popularity.


2. Background and related work

To find out what makes music popular, it is important to look at the human aspect of music. People listen to music and buy records, but for what reasons? Why do we prefer certain music over other, and thus, why is some music more popular than other? What studies have been done so far focusing on popularity, and how is popularity actually measured? This will be outlined below.

2.1. Charts and popularity

There are multiple approaches to investigating popularity, such as focusing on metadata and chart information. Most popularity studies on music base their popularity criteria on the position of songs in a chart, such as the Billboard Hot 100 (several studies using this chart as a criterion are mentioned in this chapter). The Billboard Hot 100 is a chart that computes its songs’ rankings based on sales, airplay, and, more recently, music streaming (Wang: 2). However, what the Billboard Hot 100 contains in terms of music has changed over the years. The Hot 100 chart was first published by Billboard magazine on 4 August 1958 (Giles 2007: 1878). Various charts preceded it. Between 1955 and 1958 there was already a Top 100 chart by Billboard, alongside several other charts: ‘Best Sellers in Stores’, ‘Most Played by Jockeys’ and ‘Most Played in Jukeboxes’. The ‘Most Played in Jukeboxes’ chart consisted of only 20 positions until 1957, when it was discontinued; it was eventually continued again, with between 20 and 25 positions. The ‘Best Sellers in Stores’ chart varied from a top 25 to a top 50. All these different charts, each comprising a different kind of information, merged into the Hot 100 (Giles: 1878). In addition to the kind of popularity (most played, best sold), there were other criteria for songs as well: until 1998, only singles that were available for purchase were included, so ‘album cuts’, for example, were left out. Since 2005 the Billboard Hot 100 has also included digital downloads, but only paid ones (Giles: 1878).

The way in which the Hot 100 is compiled has thus changed over the years, and therefore the way a number one hit is measured has also changed (best sold, most played by jockeys, most played in jukeboxes, most digital downloads). This means that the way in which the number one song has gained that position has not been consistent throughout the decades, resulting in various measures of popularity. As will be described below, several studies on popularity have used the Hot 100 as an indicator against which to measure popularity. However, in the Netherlands there is a specific chart called the Top 2000, which is compiled solely from people’s votes. People are asked to vote for the songs they like most of all songs (ever made). This chart thus comes closer to a ‘best of all time’ chart. It does not calculate the songs’ positions by means of any of the methods that Billboard uses, but bases the songs’ rankings solely on listeners’ votes.

The Top 2000 is a yearly phenomenon in Dutch culture. Every year, starting on Christmas Day at 9.00 a.m., the chart of the 2000 most popular hits of all time kicks off, and it ends exactly at midnight on New Year’s Eve. The Top 2000 was first introduced in December 1999 and has since been aired every year between Christmas and New Year’s Eve.³ Almost every year, Queen’s single ‘Bohemian Rhapsody’ has been ranked number one. Other stable top 10 songs are The Eagles’ ‘Hotel California’, Boudewijn de Groot’s ‘Avond’, Deep Purple’s ‘Child in Time’, and Led Zeppelin’s ‘Stairway to Heaven’. The chart is generated by the votes of Dutch citizens; the songs that get the most votes get the highest ranking. The Top 2000 is therefore a good chart for representing songs’ popularity over time: all songs in the Top 2000, even those with the lowest rank, can be considered ‘popular’. However, there is a larger spread in positions (1 to 2000) than in a top 100 chart. Moreover, since the Top 2000 is generated every year by counting the number of votes for every single song, it gives an indication of how long-lasting songs’ popularity is, in terms of the total number of years in which they occur in the Top 2000.
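The longevity measure described here, the total number of years in which a song occurs in the Top 2000, can be sketched in a few lines of Python. This is only a minimal illustration under assumptions: the input format (a mapping from year to a list of artist and title pairs), the function name `years_in_chart`, and the toy data are hypothetical and do not reproduce the analysis code or the chart data used in this thesis.

```python
from collections import Counter

def years_in_chart(charts):
    """Count, for every song, the number of yearly charts it appears in.

    `charts` maps a year to the list of (artist, title) entries of that
    year's chart (a hypothetical input format chosen for this sketch).
    """
    counts = Counter()
    for year, entries in charts.items():
        # A set guards against a song being listed twice in one year.
        for song in set(entries):
            counts[song] += 1
    return counts

# Toy data for illustration only, not real chart data.
toy_charts = {
    1999: [("Artist A", "Song 1"), ("Artist B", "Song 2")],
    2000: [("Artist A", "Song 1")],
    2001: [("Artist A", "Song 1"), ("Artist C", "Song 3")],
}
longevity = years_in_chart(toy_charts)
print(longevity[("Artist A", "Song 1")])
```

A count of this kind treats every year of presence equally, regardless of the position a song held in that year.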

The news section on the Top 2000 website provides statistical information about the votes and voters, and shows that all ages and all regions of the Netherlands are represented, as well as both males and females. The website provides, for example, the top 10 of every region; for every region, most songs in the top 10 correspond to the songs in the national top 10. However, for most regions the top 10 also contains a regional artist (for example Daniël Lohues for Drenthe, or Guus Meeuwis for Noord-Brabant); see figure 1 below.⁴

Draaisma et al. (2011) provide some statistics on the Top 2000 as well. They show the distribution of the age of voters in 2007 and 2010, along with a prognosis of voters three years later. The prognosis shows that the chart does not ‘age’; young voters are represented and grow in number in the three-year prognosis. The distribution of voters in 2010 also shows that in all age groups more males than females vote, and that younger voters (the age groups from ≤ 15 to 20) increase in number, whereas there is a decline in the age groups from 21 to 40. In both years, 2007 and 2010, the age groups from 40 to 60 have the most voters, and the number of voters aged 16 to 40 is approximately equal (Draaisma et al.: 33). Regardless of the age differences and the number of voters in each age group, there is a general trend of voting for ‘old’ songs. The top three of every age group shows this, and it is remarkable that the 14 age groups together place only seven different songs in their top three. See table 1 for details.⁵

³ See: <http://www.nporadio2.nl/top2000/faq>. 13 June 2016.

Figure 1. Top 10s of the regions Drenthe (top) and Noord-Brabant (bottom) (2015)

Note that Daniël Lohues, a musician from Drenthe, appears in this top 10 at the 9th position, whereas his song does not appear in any other region’s top 10, nor in the national top 10 (see column ‘in Top 2000’). The same goes for Guus Meeuwis, a musician from Noord-Brabant, who only appears in the top 10 of this region.

Table 1. Top 3 of 14 different age groups (2015)

This table shows the top three songs for every age group, along with the songs’ positions in the Top 2000. It shows that there is high consensus on the best three songs. The largest differences between older and younger voters are in the votes for ‘Piano Man’, popular among younger voters, and ‘Mag Ik Dan Bij Jou’, popular among older voters. Also, the younger age groups prefer ‘Bohemian Rhapsody’ more than the older age groups do.

Song title and artist | Position in Top 2000 | Top 3 of age groups (11-15 16-20 21-25 26-30 31-35 36-40 41-45 46-50 51-55 56-60 61-65 66-70 71-75 76-80)
‘Imagine’, John Lennon | 1 | 3 3 2 1 1 1 1 1 1 1 1 1 2 3
‘Bohemian Rhapsody’, Queen | 2 | 1 1 1 2 3 - - 2 2 - - - - 2
‘Hotel California’, Eagles | 3 | 2 - - - 3 - 3 3 3 -
‘Mag Ik Dan Bij Jou’, Claudia de Breij | 4 | - - - 3 2 3 - 3 2 2 1 1
‘Stairway To Heaven’, Led Zeppelin | 5 | - - - 2 - - - - -
‘Piano Man’, Billy Joel | 6 | - 2 3 3 - - -
‘Black’, Pearl Jam | 12 | - - - - 2 2 3 - - - -

Draaisma et al. also provide statistics on the top 20 most-voted-for artists for male and female voters in 2010. These show that every artist in the male top 10 is at least in the top 20 of the female voters (75). The reverse is true as well, except for one artist: Marco Borsato. The number one and number two artists are the same for male and female voters; for more details see table 2. However, besides the top 20 artists, which largely overlap for males and females, there are also exclusively “male” or “female” artists: artists that are mainly popular among either males or females. For example, there are at least ten artists for whom the chance that a man votes for them is four or five times as high as the chance that a woman votes for them, and the reverse is true as well; see table 3. A similar pattern exists for the most “female” or most “male” songs, with large differences in the chance that voters of each gender vote for those titles (Draaisma et al.: 65-84). The statistics on the Top 2000 website show a similar pattern for 2015: there are differences in the artists and songs that men and women vote for. There seems to be a tendency for men to vote specifically for ‘music with guitars’, and for women to prefer Adele; see figure 2.⁶

Another interesting thing to note is that the Top 2000 keeps renewing itself: it contains not only old songs, or songs from the youth of the largest age group of voters, but also newer songs. From the start of the Top 2000 in 1999 onwards, newer songs (from the 1990s and 2000s) have grown in number in the chart.

Even though the Top 2000 is a Dutch chart compiled by a Dutch public, it contains a lot of ‘foreign’ artists and songs that are not in Dutch (mostly in English). The number of songs with Dutch lyrics increases somewhat from 2002, peaks in 2007, and then declines again. Throughout the years, approximately 25 percent of the songs in the Top 2000 have lyrics in Dutch (Draaisma et al.: 155).

This section shows that the Top 2000 provides a good basis for my research: it has songs with varying degrees of popularity, and the songs are ranked by a public that contains younger and older males and females from every region in the Netherlands. The Top 2000 has now existed for 17 years in a row, and the chart information of every year is available online. But how do people decide which songs to vote for, and what shapes a preference for songs? The next section will discuss non-musical aspects that play a role in preference for music.


Table 2. The top 20 of most voted for artists by males and females (2010)

The top two artists are the same for both males and females. The first half of the top 20 (1-10) consists almost completely of the same artists. The exceptions are Marco Borsato for females, who does not even occur in the male top 20, and Dire Straits for males, although this artist does occur in the female top 20. In the second half of the top 20 (11-20), there are more differences between the artists for males and females.

Males | | | Females | |
Ranking males gave | Artist | Ranking females gave | Ranking females gave | Artist | Ranking males gave
1 | The Beatles | 1 | 1 | The Beatles | 1
2 | Queen | 2 | 2 | Queen | 2
3 | The Rolling Stones | 9 | 3 | Boudewijn de Groot | 8
4 | Pink Floyd | 14 | 4 | ABBA | 10
5 | U2 | 5 | 5 | U2 | 5
6 | Eagles | 6 | 6 | Eagles | 6
7 | Dire Straits | 15 | 7 | Coldplay | 9
8 | Boudewijn de Groot | 3 | 8 | Michael Jackson | 16
9 | Coldplay | 7 | 9 | The Rolling Stones | 3
10 | ABBA | 4 | 10 | Marco Borsato | 35
11 | Bruce Springsteen | 12 | 11 | Anouk | 37
12 | Deep Purple | 36 | 12 | Bruce Springsteen | 11
13 | Led Zeppelin | 47 | 13 | Blof | 31
14 | Golden Earring | 27 | 14 | Pink Floyd | 4
15 | Creedence Clearwater Revival | 38 | 15 | Dire Straits | 7
16 | Michael Jackson | 8 | 16 | Simon & Garfunkel | 26
17 | Metallica | 42 | 17 | De Dijk | 28
18 | Elvis Presley | 22 | 18 | Acda & De Munnik | 43
19 | Bob Dylan | 41 | 19 | Guus Meeuwis | 52
20 | Doors | 44 | 20 | Robbie Williams | 64

Table 3. Most male and most female artists (2010)

These top 10’s of most male/female artists represent artists that are only or mostly voted for by either males or females, as the ratios show.

Most male artists | | Most female artists |
Artist | Ratio votes male/female | Artist | Ratio votes female/male
Kraftwerk | 5.0 | James Morrison | 5.0
ZZ Top | 4.0 | Kelly Clarkson | 5.0
Allman Brothers Band | 4.0 | Spice Girls | 5.0
UK | 4.0 | Take That | 5.0
Spencer Davis Group | 4.0 | Weather Girls | 4.0
Ten Years After | 4.0 | Maria Mena | 4.0
XTC | 4.0 | Paul de Leeuw & Ruth Jacott | 4.0
Canned Heat | 4.0 | Alessandro Safina | 4.0
Alquin | 4.0 | Chaka Khan | 4.0
Propaganda | 4.0 | Carel Kraayenhof | 4.0

2.2. Emotion and preference

Generally speaking, three types of variables are thought to have an impact on musical preference: features of the musical stimulus, characteristics of the listener, and situational or environmental factors (Nunes et al. 2015: 187). The extent to which these three types of variables shape preference is hard to measure, and differs across individuals. Situational and environmental effects of music can be related to autobiographical memories, collective memories, and exposure to music. A lot of music that is ‘meaningful’ in some way is attached to certain emotions and memories (Van Dijck). Van Dijck’s study focuses on the Top 2000 itself: she reviews comments posted by listeners and voters on the Top 2000 forum. Many comments describe how people felt when listening to those songs, and which feelings and emotions came back, such as feelings of nostalgia. Emotion and memory are related to each other, and emotion triggered by music often enhances the memory for music. Schulkind et al. (1999) studied the memorability of songs for older and younger adults and found that emotion played an important role in memory, especially for older participants. They found that the recall of specific memories declined over the years, but that emotionality ratings did not decline. Emotion had a remarkable effect in enhancing the retrieval of information and autobiographical memory related to the song (Schulkind et al.: 952). Feelings of nostalgia might also be associated with preference: there is a tendency for people to prefer music from their youth (also known as the ‘reminiscence bump’), and music from one’s youth might influence one’s musical preference in the future (Draaisma et al.: 17-18, 188-189).

Music is also thought of as having certain emotional characteristics, but there is no specific set of emotions that generally enhances people’s memories of or liking for music. Which emotions enhance preference differs across individuals (Platz et al. 2015: 16). A neurophysiological explanation for musical preference relates the ‘rewarding’ experience of listening to music to the effect this has on our brain. Differences in individual levels of arousal explain the differences in the effects that music has on our brains, which might in turn explain individual differences in preference for music (Greasley et al. 2013: 403).

Comparable to the associations we have with certain smells are the associations, or memories, that are attached to or triggered by certain (musical) sounds. The neural systems that encode these memory-triggering sounds supply feelings of familiarity (Coad 2016: 2). Familiarity has an effect on the liking of music: increased exposure to a stimulus enhances liking (Nunes et al.: 190). Even though there is no consensus on the cause of this exposure effect, one likely explanation is that prior exposure allows a stimulus to be processed more easily when it is subsequently encountered. Repeated exposure thus enhances fluency, and greater fluency leads to greater positive affect, which in turn results in greater liking (Nunes et al.: 190). Effects of exposure and familiarity have been widely studied in music: in a study where students had to evaluate songs on the radio, familiarity turned out to be a driver of music choice (Ward et al. mentioned by Nunes et al.: 190). Also, people turn out to be more familiar with higher chart-ranked music, and people tend to listen most of the time to music they have already heard before (Nunes et al.: 190). However, if familiarity turns into ‘overfamiliarity’, the effect on liking decreases (Greasley et al.: 403). Besides familiarity, other musical characteristics such as tempo and complexity also have an influence on individual preferences for music (Greasley et al.: 403; Lee and Lee 2015: 3).

Rentfrow et al. (2011) studied preference for music, looking at the influence of psychological characteristics of individuals. They investigated this by looking at which genres clustered together on different music-preference factors and which characteristics and social attributes could be attached to these genres. They found a model containing five music-preference factors: Mellow, Unpretentious, Sophisticated, Intense, and Contemporary, in short ‘MUSIC’. These factors could be characterized in terms of musical genres. For example, classical, jazz, or world music genres fall within the ‘Sophisticated’ factor; rock, heavy metal, or punk genres fall within the ‘Intense’ factor. There were, however, also genres that loaded on more than one music-preference factor, indicating that the preference factors do not seem to capture preference for genres alone. They suggested that, because music varies on a range of features such as tempo, instrumentation, and mood, individuals might just as well have a preference for specific musical attributes, such as sad-sounding music. Therefore, they tested whether the five preference factors also reflected preferences for certain attributes. Their assumptions were met: results showed that the five factors indeed have unique musical and psychological features. For example, people with a preference for Sophisticated music like the music to be thoughtful, complex, quiet, relaxing and inspiring (for more details about the music and attributes see Rentfrow et al.). The conclusion that can be drawn from this study is that people indeed seem to have a preference for certain musical characteristics, not necessarily certain genres, even though there are five clearly distinguishable dimensions in music preference.

Other studies exploring the relationship between personality traits and music preference suggested that extraversion is related to a preference for highly arousing music (like heavy metal and rock), whereas lower levels of extraversion lead to a preference for softer forms of music (Greasley et al.: 404). Highly intuitive people reportedly had a greater preference for classical, jazz, soul, and folk music (Greasley et al.: 404). Music preference also tends to be shaped by social influences (Greasley et al.: 405; Lee and Lee: 3). Individuals form an opinion based on what others think. It has been shown that individuals’ liking for songs is influenced by knowing that a song is a hit or is preferred by others (Pachet and Sony 2012: 309). This phenomenon was investigated in a study by Salganik et al. Individuals had to rank songs without any information, independent of anyone’s opinion. Songs that were highly ranked by these individuals were then ranked by a group of people who were presented with the ranking information from the individuals. It turned out that the songs that were highly ranked by the individuals were then also highly ranked by the group. This study demonstrated the strength of the social signal: in similar experiments the results are never replicable, suggesting that which songs become hits really depends on the ‘early arriving individuals’; the individuals that first rank or prefer a song influence ‘followers’ (study by Salganik et al. in Pachet and Sony: 310).

Greasley et al. note that such preference studies focus on fixed genre categories and ask about liking for different genres. However, studies never investigate the music individuals prefer in terms of their own music collections. What shaped their musical taste, and why do they prefer the music they have over other music? Greasley et al. conducted interviews with adults in their home setting, asking about their favorite songs, the meanings and memories attached to those songs, and their music collections. Analyzing all the answers, they found that people generally have an omnivorous musical taste, meaning that most people do not like one particular genre, but multiple, different genres (419). Moreover, preferences for musical characteristics differ even for the same individual between musical styles, and the aspects that shape the preference can be different for different songs and artists (419). This finding implies ‘that any comprehensive explanation of why an individual likes a specific piece of music must break down that piece into its various components to discover exactly what is preferred and why’ (Greasley et al.: 419).

2.3. Popularity of music based on non-musical features

Not only the personal aspect of preference for music has been studied, but also the overall popularity of certain songs (across genres). Bradlow and Fader (2001) explored a model to explain the way songs move up and down in charts, and what kinds of curve shapes can be modeled on the songs’ lives in the charts. They made use of data from the Billboard Hot 100 charts. Their findings suggest that many popular songs leave the charts quickly once they leave the top 40: a drop out of the top 40 leads to a fast drop out of the whole top 100. However, any popularity outside the top 100 is not taken into account. The drop in popularity can be due to the fact that music is subject to fashion (Bradlow and Fader: 377, 379). Bradlow and Fader also found that the lives of songs by popular artists benefit in three separate ways: ‘a faster rise through the chart, a higher peak position, and overall longer life on the chart’ (378).

Another study focusing on the lifespan of songs found that the life of a song increased with its starting position in the charts, suggesting that the initial marketing of an album is a useful tool for gaining popularity (Chon et al. cited in Pachet and Sony: 310). Giles investigated songs’ popularity by analyzing the survival characteristics of songs. He specifically looked at number one hits in the Billboard Hot 100 from 1955 to 2003. Based on this analysis, he found that ‘a number one hit’s ‘life-at-the-top’ was enhanced significantly if it was recorded by a female solo artist, if it was an instrumental piece, or if it was able to ‘bounce back’ for a second spell’ (1883).

However, Hong responded to this study by Giles with an ‘improved’ study (2012). He found that the dataset used by Giles contained a number of errors (such as labeling a band as a male solo artist) (1101-1102). In order to obtain more reliable results, Hong corrected these errors. Based on this new dataset he provided extra information about the survival characteristics of the number one songs (1101). As a result of the corrections, there were several changes in the survival characteristics: the significance of instrumental pieces disappeared, but in addition, a number one song’s life at the top was also enhanced significantly when it was recorded by a black artist, when the PRIME RATE was stable or decreasing, or when the GDP was growing (Hong: 1105). The PRIME RATE and GDP variables indicate that when economic conditions are better, a number one hit benefits in terms of more weeks in which it is ranked number one. This suggests that record companies invest more money in the promotion of a song or album when economic conditions are better, but also that economic stability and growth benefit stability in the chart.


of users in a large online music service system (5917). They looked at the distribution of activity and popularity, and found that both followed a stretched exponential form and can be well fitted by a stretched exponential distribution (SED), and that the distribution of consecutive listens to songs shows a fat-tail feature (5917-5920). This means that, in general, the popularity of a song follows a ‘rise-and-ebb’ process, relating to the song being in and out of fashion (5919). They also found that some songs showed extreme popularity, but that this was due to frequent visits by a few individuals (5918). Their findings are in line with what Salganik et al. (mentioned in Pachet and Sony) found: a song becomes popular through initial ‘innovators’. The innovators ‘dare’ to listen to new songs, independent of what other individuals do. Eventually, they influence their followers, who will then also listen to the new songs, which in turn results in a growing popularity of those songs (5920). So, online popularity depends on the initial innovators who listen to new songs, as well as on the natural ‘rise-and-ebb’ process of popularity.
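For orientation, a stretched exponential distribution is commonly written through its survival function, P(X > x) = exp(-(x / x0)^c), where an exponent 0 < c < 1 gives a tail that decays more slowly than that of a plain exponential. The sketch below assumes this standard parameterization. The symbols x0 and c and the function name are assumptions made for illustration and are not taken from the cited study.

```python
import math

def stretched_exponential_survival(x, x0, c):
    """Return P(X > x) = exp(-(x / x0) ** c) for x >= 0."""
    if x < 0 or x0 <= 0 or c <= 0:
        raise ValueError("require x >= 0, x0 > 0 and c > 0")
    return math.exp(-((x / x0) ** c))

# With c < 1 the tail decays more slowly than a plain exponential (c = 1).
plain = stretched_exponential_survival(10.0, x0=1.0, c=1.0)
stretched = stretched_exponential_survival(10.0, x0=1.0, c=0.5)
print(stretched > plain)
```

In this parameterization, a small number of very popular songs or very active listeners is more likely than under a plain exponential with the same scale.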

A somewhat different popularity study was conducted by Bekhuis et al. They studied the popularity of artists and songs in relation to nationality and globalization. They looked at the top 100 charts of nine Western countries (including Germany, Dutch-speaking Belgium, the Netherlands, the UK, and the US). They found several trends in popularity regarding nationality, such as an initial decline in the popularity of domestic music until the late 1980s or early 1990s, followed by an increase since then (10). Since the early 1990s, domestic music in Western countries has revived. This revival seemed to relate to globalization, in the sense that globalization leads to a higher popularity of domestic artists singing in English, but also to a higher popularity of foreign artists (11). When a country has a strong sentiment of national pride, there is a higher popularity of domestic artists, as well as of artists singing in the country’s language (11).

2.4. MIR studies and hit song science

The studies mentioned above base their conclusions about what leads to a song’s popularity on extra-musical factors, such as the songs’ lives in the charts. ‘Hit song science’ is a field of study that bases its popularity predictions on actual musical features. The extraction and analysis of musical features, which varies across methods and programs, is called music information retrieval (MIR). Studies in MIR have focused on computing models and programs to extract musical features that explain differences in music, as well as on classifying music. Ever since the growth of online musical databases, accurate tools for MIR have become an important topic in computer science. There is a diversity of machine learning algorithms, theories, formats, (symbolic) data, and tool(boxe)s (Corrêa and Rodrigues 2016: 190). To compute information about music, musical content needs to be represented in some way. There are two different types of musical content in MIR: audio-recorded content, which is the actual acoustic representation of music derived from the sampling of the waveform, and symbolic content, which offers a high-level representation of music. Symbolic content represents music in terms of instructions, i.e. information about pitch, duration, key, rhythm, tempo, etc. (Corrêa and Rodrigues: 191). Symbolic data descriptors are often divided into three main groups: pitch, timbre, and rhythm. These descriptors are referred to as high-level descriptors because they ‘take into account a higher-level abstraction of music (the notes) instead of audio samples’ (Corrêa and Rodrigues: 192). Audio content, which is used for classical MIR techniques, is based on low-level signal features, and this representation might be too far removed from the human perception of music, whereas symbolic descriptors would be closer to human perception (Kaminskas and Ricci 2012: 115).

Both types of content are used in MIR studies. The ways in which information about music is computed differ across studies, both in the method or toolbox and in the kind of musical content (symbolic or audio). For example, MIR studies have used symbolic data to investigate whether popular music has changed musically over the decades (see Serrà et al. 2012) and whether music in different genres has different characteristics (see Lowe 2015). One musical dataset that is often used in such studies is the Million Song Dataset. As used by Serrà et al., this set contains 464,411 distinct music recordings from 1955 to 2010. The songs have year annotations and audio descriptors, and span a variety of genres (pop, rock, hip hop, metal, etc.). Symbolic data concerning loudness, pitch, and timbre descriptors are provided in this dataset as well (Serrà et al.: 1); these are the symbolic descriptors used for analysis.

Wolf and Müllensiefen (2011) make use of MIR methods to measure similarity in music. They tested several algorithms and found that, even though they only used monophonic melodies, the algorithms provided enough information to measure similarity between songs. One algorithm in particular, a spatial account called the Earth mover’s distance, was promising; pitch intervals and centralized pitch turned out to be the best features for measuring similarity.
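
To give an impression of how a distance of this kind works, the sketch below computes a one-dimensional Earth mover’s distance between the pitch-interval histograms of two monophonic melodies. It is a simplified illustration and not the algorithm evaluated by Wolf and Müllensiefen: the function names, the choice of interval histograms as input, and the two example melodies (given as MIDI note numbers) are all assumptions made for this example.

```python
from collections import Counter
from itertools import accumulate


def interval_histogram(pitches):
    """Relative frequency of each pitch interval (in semitones) between successive notes."""
    if len(pitches) < 2:
        raise ValueError("a melody needs at least two notes to have an interval")
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    total = len(intervals)
    return {interval: count / total for interval, count in Counter(intervals).items()}


def earth_movers_distance(hist_a, hist_b):
    """1-D Earth mover's distance between two histograms over integer bins.

    For two distributions on a line with unit spacing between bins, the
    distance equals the sum of the absolute differences between their
    cumulative distributions.
    """
    lowest = min(min(hist_a), min(hist_b))
    highest = max(max(hist_a), max(hist_b))
    differences = [hist_a.get(i, 0.0) - hist_b.get(i, 0.0)
                   for i in range(lowest, highest + 1)]
    return sum(abs(cumulative) for cumulative in accumulate(differences))


# Hypothetical example melodies (MIDI note numbers), not taken from any study.
melody_a = [60, 62, 64, 65, 67]  # mostly stepwise motion
melody_b = [60, 64, 67, 72]      # leaps of a third or more
distance = earth_movers_distance(interval_histogram(melody_a),
                                 interval_histogram(melody_b))
```

The distance is zero for two melodies with identical interval distributions and grows as more ‘mass’ has to be moved over larger interval differences, which is what makes it a spatial measure rather than a simple overlap count.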

Van Balen et al. (2013 (a)) use computational methods to investigate structures within songs. They study whether sections in pop songs, specifically the chorus, are musically distinct and can be characterized on the basis of perceptual and audio features (1). To human ears, choruses are easily recognizable: they are more prominent, catchier, and more memorable than other sections (Van Balen et al. (a): 1). However, existing chorus detection systems have always been based on identifying the most-repeated section of a song (Van Balen et al. (a): 1). Therefore, Van Balen et al. based their chorus detection model on a list of robust and interpretable features, and did not include repetition as a feature. They model these features on 649 songs from the Billboard Hot 100, from 1958 to 1991 (1-3), which they divide into a total of 7762 sections (such as verse and chorus). The features they used as descriptors were loudness, sharpness, roughness, MFCCs, chroma variance, pitch salience, pitch centroid, section length, and section position (2-3). These features resulted in a model that can locate choruses and chorus-like sections, although it does not reach the accuracy of existing modelling techniques that include repetition as a measure. Nevertheless, their findings show that MIR techniques are helpful in analyzing music and that certain musical features can help locate choruses (6).

The above-mentioned studies show that MIR techniques are promising for extracting information from music, as well as for providing new information. These studies use different approaches to MIR, and one particularly interesting approach I found is ‘FANTASTIC’, a toolbox developed by Daniel Müllensiefen that uses symbolic descriptors to compute musical features (Müllensiefen 2009). FANTASTIC computes numeric and categorical features of melodies, and also relates them to the context of a corpus of melodies; the method section in the next chapter gives more details about the toolbox. Williamson and Müllensiefen (2012) used the toolbox while it was still in development. Their study investigated the aspects underlying the occurrence of involuntary musical imagery (INMI, or earworms): they wanted to find out what triggers INMI experiences in daily life, but also whether certain musical features seem to evoke INMI (3). They used a total of 58 tunes, on which they computed a binary logistic regression with features of the melody, such as pitch and rhythm, as predictor variables (8). Two variables turned out to be significant predictors: the distribution of intervals in the melody and the median note duration (8). One clear limitation of the FANTASTIC analysis technique is that it can only analyze monophonic melodies (8).

Müllensiefen and Halpern (2014) use the FANTASTIC toolbox as well. As mentioned in the section ‘Emotion and preference’, familiarity with music is an element that generates a certain preference or liking for music. Müllensiefen and Halpern indirectly studied musical features that can explain a feeling of familiarity. Initially, their study was aimed at investigating whether certain musical features are responsible for (enhancing) implicit and explicit memory for music. Participants were presented with 40 unfamiliar, to-be-remembered melodies, and had to rate the familiarity of these fragments. In the test phase, these 40 fragments were presented in random order, mixed with 40 ‘new’ unfamiliar melodies. Participants were asked to rate whether they recognized the fragments (explicit memory task) and how much they liked them (implicit memory task) (Müllensiefen and Halpern: 422). For the analysis, they extracted features with the FANTASTIC toolbox and used partial least squares to see which features loaded on the same components, explaining which features resulted in feelings of old and new, based on the explicit rating data (425).

They found that melodies with unique melodic motives generally enhance both implicit and explicit memory (427). Moreover, melodies with a varied contour (including lots of steep upward and downward movements), unique motives, uncommonly short or long phrases, and high repetition of unusual motives make a song sound more familiar: they gave listeners the idea they had heard the song before, even if this was not the case (428). Melodies with these features thus sounded more familiar, and familiarity influences preference positively. On memory, they found that ‘lower motivic complexity combined with a closer match of relative motive frequencies to the corpus help explicit memory accuracy’, but ‘a simple and familiar contour shape combined with an unusual and complex rhythm’ helps implicit memory (429). They conclude their study with advice to hit-song composers: to make a song sound more pleasurable on repeated listening, it should have a unique usage of motives, along with low repetition of motives, a smaller average interval size, and a simple contour but complex rhythm (432).

Müllensiefen and Halpern found rhythmic complexity to play a role in the pleasurability of music on repeated listening; Lee and Lee focused their whole study on complexity as a feature of music. More specifically, they investigated whether music popularity could be predicted from complexity features extracted from the music. Popularity was defined in terms of information from a top 100 chart: the highest ranking a song reached and the total period of time in which it was ranked (Lee and Lee: 3). They focus on the complexity of a song since complexity has been found to be a significant influence on musical preference: a song should be neither too complex, as this might distract attention, nor too flat, as this might bore the listener (3). Besides complexity, they also looked at whether the initial ranking of songs in the charts affected long-term popularity (3), motivated by previous work showing that ‘early view patterns are effective in predicting long-term popularity of YouTube videos’ (Lee and Lee: 3). Lee and Lee measured complexity based on structural changes of chroma, timbre and rhythm features (4). They found that the two groups of features, complexity features and the early-stage popularity of songs, ‘are effective for different popularity patterns and combining the two types of features can be synergetic’ (Lee and Lee: 6). Studies by Boyle, Hosterman & Ramsey, and McMullen & Arnold have also found that moderately complex rhythms were preferred over rhythms that were perceived as too simple or too complex (studies mentioned in Teo 2003: 5, 12).

McMullen (in Teo) investigated the role of pitch in musical preference and found that melodies with low or medium levels of ‘melodic redundancy’ are preferred over melodies with high levels of redundancy (Teo: 7). Redundancy is the frequency at which notes are repeated in a melody, meaning that melodies with more variation in pitches were preferred.

Serrà et al. found that a certain use of pitch also plays a role in how ‘old’ songs are perceived to be. In their study, aimed at measuring the evolution of popular music, they found that pitch sequences became simpler throughout the decades, accompanied by the usage of novel timbral mixtures and a higher loudness level (5). They state that an old popular song would be perceived as novel-sounding by listeners if it made use of these musical aspects. The other way around, new songs would be perceived as ‘older’ if they had more complex pitch sequences, timbral mixtures and a lower loudness level. However, in contrast to the notions of ‘old’ and ‘new’ used in the study by Müllensiefen and Halpern, the aspects found by Serrà et al. that make a new song sound old, or an old song sound novel, have nothing to do with memory tasks. They relate to the way in which the musical characteristics of popular songs change throughout the years: the notions of old and new refer to the release year of a song, to when a song was actually old or new. In light of the study by Serrà et al., the usage of pitch, timbre, and loudness can make both old and new songs sound more ‘old’, which might suggest that it also makes them sound more ‘familiar’ and thus preferred over ‘newer’ sounding songs.

2.4.1. Popularity of music based on musical features

Like the study by Lee and Lee mentioned above, several studies focus only on musical features to predict or account for music’s popularity. There are several approaches to the prediction of hits. Most studies base their prediction on a binary classification: a song is either a hit or not, and this classification is based on chart position (such as a number one song against a number 100 song). Most relevant for my study is the fact that these hit song prediction studies use only musical features instead of social and psychological features.

There seems to be good reason to assume that there are features that make songs more or less popular. Bird song studies have shown that certain features of the songs male birds produce are important in attracting females. Researchers artificially exaggerated these features, which produced the same preference response from the females, confirming the idea that these specific features play the most important role (Pachet and Sony: 312). Such a finding supports the idea that certain audio features are more important and more effective in shaping the preference for music.

One study aimed at predicting popularity based on audio features is the study by Pachet and Sony. They classified popularity as having three levels: low, medium, and high popularity (314, 316). They used a database with both metadata and information about the popularity of songs (314). Three different approaches, each with its own set of features, were used in order to prevent bias in the experiment. The first approach is the Bag-of-Frames approach, in which audio signals are modeled as the distribution of audio features computed on short segments (315). The set they used consisted of 49 audio features, including characteristics of the spectrum, tempo and harmony. The second feature set was computed by the ‘specific’ approach, which consists in ‘training the same (SVM) classifier with a set of “black-box” acoustic features developed especially for popular music analysis tasks by Sony Corporation’ (315). In contrast to the Bag-of-Frames approach, the specific approach does not compute the features on frames, but on the whole signal. The third approach was a human-generated set of high-level features, in which 632 Boolean labels were used to train the classifiers (316). They used F-measures to test the classifiers and found that the acoustic classifiers were not good at modelling popularity: a comparison with random classifiers showed that popularity could not be learned. In other words, they found no significant patterns concerning popularity using any of their feature sets (317-321).

Kawawa-Beaudan and Garza (2015) used the Million Song Dataset to test how successful machine learning algorithms are in predicting the success of songs. The features they included were genre labels, metadata (such as mode, song duration, and artist ‘hottness’), and twelve-dimensional vectors for pitches, timbres, and loudness (1). They used the Billboard Hot 100 as an indicator of a song’s success: if a song had appeared on the chart, it was labeled as positive (for success); if the song’s artists never made it to the Hot 100, the song was labeled as negative. The prediction of success was thus set up as a binary ‘problem’. They trained several classifiers to recognize successful songs based on the data from the Million Song Dataset. The classifiers were successful: they achieved significantly greater accuracy than random guessing. The best classifier ‘was a Gaussian discriminant model on the metadata features’ (4). However, pitch and timbre data were also successful in classifying, using Gaussian discriminant analysis and decision trees (4). So, a song’s success could be predicted based on metadata alone, which provide no high-level musical information, but pitch and timbre were good predictors as well.

Wang based the analysis of hit song prediction on instrument, beat and melody features, extracted from MIDI files of songs from the Million Song Dataset, and used language models of n-grams ‘to transform raw musical features into word-document frequency matrices’ (3). The classification problem in this study is also a binary one: a song is classified as a hit if it has ever reached a top 10 position on a Billboard weekly ranking; if not, it is not classified as a hit (Wang: 3-4). Wang found that there are characteristics distinguishing popular from unpopular songs, and that the extraction of instrument, melody, and beat features is able to detect these distinguishing characteristics (Wang: 5); the combination of the instrument/melody and beat features is most promising (Wang: 5). However, no clear pattern emerges from these features, suggesting that the positive prediction is most likely due to the large number of features that were combined (5).
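
The kind of transformation Wang describes can be pictured with a small sketch, in which each melody is treated as a ‘document’ and each n-gram of pitch intervals as a ‘word’, giving a matrix of counts. This is an assumed simplification for illustration only: the helper names and the toy melodies are hypothetical, and the sketch is not Wang’s actual feature pipeline.

```python
from collections import Counter


def interval_ngrams(pitches, n=2):
    """All n-grams of successive pitch intervals (in semitones) in one melody."""
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    return [tuple(intervals[i:i + n]) for i in range(len(intervals) - n + 1)]


def term_document_matrix(melodies, n=2):
    """Count how often each interval n-gram ('word') occurs in each melody ('document').

    Returns the sorted vocabulary and one row of counts per melody.
    """
    counts = [Counter(interval_ngrams(melody, n)) for melody in melodies]
    vocabulary = sorted(set().union(*counts))
    matrix = [[melody_counts[term] for term in vocabulary] for melody_counts in counts]
    return vocabulary, matrix


# Hypothetical toy melodies (MIDI note numbers), for illustration only.
toy_melodies = [
    [60, 62, 64, 62, 64],
    [60, 62, 64],
]
vocabulary, matrix = term_document_matrix(toy_melodies)
```

Each row of the resulting matrix can then serve as the feature vector of one song in a classifier, which is where a large number of combined features, as noted above, can drive the prediction without any single pattern standing out.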


Nunes et al. looked at repetition in particular as a feature explaining differences in popularity (and preference). They hypothesized that repetition in lyrics would benefit the processing of a song, making songs with more repetition easier to process than songs with less repetition. This ease of processing is what they call (processing) fluency (188). They pay particular attention to the role of the lyrics (in the chorus) that are repeated. However, they do not leave out the role of the melody in this process, stating that lyrics and music are often entwined: when recalling the words of a song, it is most likely that the melody is also recalled, and the other way around. Also, when lyrics are repeated, this is often accompanied by the same melody and rhythm (Nunes et al.: 188). Based on three experiments, they found evidence in favor of their hypothesis. (Lexical) repetition increased processing fluency, and repetition was also a good predictor of a song’s popularity in the top 100 (191-193). Repetition also has a positive effect on the number of weeks it takes a song to reach a number one position (195). However, if there is too much repetition of words in the song, the effect of repetition is negative (189). This finding is in line with the two-factor theory of ‘wear-in and wear-out’ introduced by Berlyne (in Nunes et al.): ‘a highly repetitive song chorus-wise has a positive effect, but too much word repetition has a countervailing negative effect. In other words, the benefit in terms of entering the charts at an especially high ranking that comes from increasing the repetition of the chorus is moderated by word repetition. The interaction effect reveals that increasing repetition i[s] not a surefire method for increasing success; there is a ceiling effect to the benefit of repetitiveness’ (196).

Another study investigating what makes certain songs more popular than others is the study by Van Balen et al. (2015 (b)). They tested algorithms to find the most ‘catchy’ parts of songs, based on a dataset from a project called ‘Hooked On Music’. This project aims to find the catchiest tune ever, defining the catchiest song as the one that is remembered fastest and thus has the best long-term salience. It tests individuals’ memories for songs by means of a game, in which participants hear music and are asked to respond as quickly as possible if they recognize the song. If they recognize the song, the music fragment is muted for four seconds, during which the participant is instructed to follow the music by singing or humming along. After four seconds, the music is heard again and the participant has to answer whether the music came back at the correct point or not. The idea behind this task is that once a song is recognized, the listener should be able to recall it completely, and that the best-remembered (i.e. the catchiest) song is the most easily and quickly recognized (Burgoyne et al. 2013).

To find out what makes songs so catchy, Van Balen et al. (b) presented a set of audio corpus description features founded on three concepts introduced in their study (see Van Balen et al. (b) for more details). In contrast to the studies mentioned above, this study does not treat popularity as a binary problem (popular or not), but aims at finding the catchiest part of a song as well as the catchiest song ever; this would be comparable to a study aimed at finding what predicts the ‘most popular song’. This study also differs from the above-mentioned popularity studies in that it uses audio features. However, they additionally use a set of symbolic features to see how much insight the novel concepts related to audio features add to existing methods based on symbolic features (229). Based on the audio features, they found 12 components explaining the variance in the audio set, confirming that audio features are meaningful descriptors of the corpus (230). Features relating mainly to conventionality were significantly correlated, suggesting that more recognizable sections have a more typical, ‘conventional’ sound (230). Furthermore, timbral recurrence and vocal prominence (the component with the strongest effect) are two other components relating to more recognizable sections. Timbral recurrence points to the role of repetition (230). In addition to the model based on audio features, Van Balen et al. (b) also built a model on symbolic data only; this model contained only two components. These components suggest that recognizable melodies are repetitive and have more typical motives (232). Overall, their findings suggest that vocal prominence is the best predictor of catchy music (segments). Harmonic and melodic conventionality also enhanced the catchiness of songs.

The MIR studies mentioned in this chapter show that characteristics can be meaningfully retrieved from music. However, different studies have found different elements to be predictive of a song’s popularity. In the next chapter I discuss the choices I made when conducting my research: the method, the popularity classification, as well as the actual study and the results.


3. Study: assessing music’s popularity based on musical features

Music’s popularity is not based on musical features alone. The studies in the previous chapter show that preference and liking for music are influenced by several social and psychological aspects. Emotions, memories, personal as well as musical characteristics, and individual differences all account for variance in musical preference. However, on a more global level, there are songs that are far more popular than other songs, which could imply that not only personal differences but also some intrinsic musical qualities influence why certain songs are more popular than others.

MIR studies provide interesting insights into music on several levels, not just popularity. These studies show that musical features, both audio and symbolic, can be extracted in a meaningful way, and that there are several useful methods. The results from, for example, the studies on INMI and catchiness make intuitive sense and show that musical features can be good predictors of certain musical information.

What sets this study apart from the popularity-predicting studies above is that it focuses on a chart that is actually composed by voters, and that this chart can contain information about songs over a longer period of time: some songs recur in the Top 2000 for multiple years, whereas in the top 100 charts used in previous studies songs most likely occurred for only one year, and these charts (mostly) comprise only songs released that year. It is clear from the section on the Top 2000 that people still like music from past times, as well as new music.

The popularity problem assessed in this study is not a binary one but a continuous one, assessing popularity over a span of seven years. In this study, I chose to look only at melodies. Even though melodies represent only a small part of a musical whole, they make music memorable and distinguishable (Corrêa and Rodrigues: 202). In addition, FANTASTIC provides a fairly easy way to compute features, but this toolbox can only compute features on monophonic melodies. In this chapter, the methodological aspects of the study are described in more detail, as well as the study itself.

3.1. Method

3.1.1. Song selection

To compare ‘popular’ and less ‘popular’ songs in relation to their position and lifespan in the Top 2000, a song selection was made based on a few criteria. First, 194 songs were selected that had been both in the Top 2000 and in the annual Dutch top 100. In addition, the songs had to be comparable over an equal number of years in the Top 2000, to prevent certain songs from receiving a higher total number of years simply because they had had, for example, twice as many years in which to appear in the Top 2000 as other songs. For this reason, songs released later than 2008 were excluded. Songs released from 1999 (the start of the Top 2000 as an annual radio event) until 2008 were included, so that for each song seven years, starting at the release year, could be taken into account when looking at its lifespan and position in the Top 2000. All of these songs had been in the annual top 100 at least once (also in the period 1999-2008); in most cases, songs were in the top 100 for only one year, but this played no further role in the analysis.

Then, each of the 194 songs was assigned to one of four categories based on its total number of years in the Top 2000:

I. Songs that were in the Top 2000 either from the release year onwards or from the first year after the release year, and were still in the Top 2000 in the seventh year;

II. Songs whose lives lasted exactly within this seven-year lifespan, starting in the release year or the first year after the release and having left the Top 2000 by the seventh year, or songs that were otherwise in the Top 2000 for four or five of the seven years;

III. Songs that were in the Top 2000 for one to three years;

IV. And finally, songs that were never in the Top 2000 during the seven-year lifespan starting from the release year, but that eventually made it into the Top 2000 (for example, a song released in 1999, with a position in the top 100 of that year, that only entered the Top 2000 in 2006). This category was included to also have a category of ‘less popular’ songs as opposed to songs that were in the Top 2000 for seven years.

Graphs give a clear view of the different lifespans of the songs; see figure 3 below for the lifespans of four different songs.


3.1.2. Procedure and materials

In order to analyze the songs using the FANTASTIC toolbox, the songs had to be recorded as monophonic melodies, because FANTASTIC can only analyze monophonic melodies (Müllensiefen 2009: 4). There is also a limit to the number of notes that FANTASTIC can analyze as one phrase: 24 notes (Müllensiefen: 11). Therefore, for each song, I selected a short fragment, ranging from six to fifteen seconds (8 to 24 notes). The selected fragments often involved the chorus of the song; if not, the fragment was one that was characteristic of the song (often repeated and recognizable).

Due to time limitations, a sub-selection of 90 songs from the total of 194 songs was made; see Appendix I for a list of the songs. These 90 songs were selected by hand, maintaining an approximately equal number of songs for each category: category I consisted of 24 songs, category II of 23 songs, category III of 21 songs, and category IV of 22 songs. Two trained pianists played the songs after listening to the fragment several times (accessed on Spotify). One pianist played 35 songs; the other pianist played 55 songs. The songs were played on a Roland PC-160 MIDI keyboard controller and were recorded with the program ‘Garage Band’ on an Apple personal computer. The AIFF output from Garage Band was converted to MIDI by the program ‘GB2MIDI’. This was done so that the MIDI files could be converted into MCSV files, the file format that FANTASTIC requires as input. The MIDI files were converted into MCSV files by the conversion program ‘Melconv’ (footnote 7).

7 Retrieved from <http://www.mu-on.org/en/download>.

Figure 2. Lifespan of songs based on their position in the Top 2000

This contour graph shows four songs, with their respective ‘age’ on the x-axis; age represents the seven-year lifespan, with ‘0’ being the year of the song’s release. ‘Top_2000_rang’ on the y-axis represents the position in the Top 2000, which can range between 1 and 2000. When a song had no position in the Top 2000 (for any one of these seven years), the value ‘2500’ was assigned to it, indicating that the song was not ranked highly enough to be in the Top 2000. The graph shows one song for each of the four categories: ‘Anyplace, Anywhere, Anytime’ (by Nena and Kim Wilde) represents category I, ‘Ayo Technology’ (by Milow) represents category II, ‘Apologize’ (by Timbaland & OneRepublic) represents category III, and ‘As’ (by George Michael and Mary J. Blige) represents category IV.
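
The encoding behind this graph can be written down directly. The sketch below builds the seven-year rank vector for one song, using 2500 for years without a Top 2000 position as in the figure, and counts the years the song spent in the chart, which is the popularity measure used in this study. The function names and the chart positions in the example are hypothetical and serve only as illustration.

```python
NOT_RANKED = 2500    # value used in the figure for years outside the Top 2000
LIFESPAN_YEARS = 7   # ages 0 to 6, with age 0 being the release year


def lifespan_ranks(release_year, positions_by_year):
    """Top 2000 position for each of the seven years starting at the release year.

    positions_by_year maps a chart year to the song's position (1-2000);
    years in which the song was not in the Top 2000 are simply absent.
    """
    return [positions_by_year.get(release_year + age, NOT_RANKED)
            for age in range(LIFESPAN_YEARS)]


def years_in_chart(ranks):
    """Number of years within the lifespan in which the song held a position."""
    return sum(1 for rank in ranks if rank != NOT_RANKED)


# Made-up chart history for an imaginary song released in 2003.
example_positions = {2004: 812, 2005: 1040, 2007: 1990, 2011: 1500}
example_ranks = lifespan_ranks(2003, example_positions)
example_years = years_in_chart(example_ranks)
```

In this example the 2011 position falls outside the seven-year window and is therefore ignored, in the same way that a category IV song scores zero years even though it eventually entered the Top 2000.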


3.1.3. Data analysis

This study was set up so that melodic features could be extracted from the fragments using the FANTASTIC toolbox. These features can possibly explain differences in popularity, measured as the total number of years a song spent in the Top 2000.

FANTASTIC has a set of first- and second-order features: numerical and categorical values that describe the musical characteristics of the melodies. First-order features are computed for a specific melody on its own; second-order features are computed for a melody in the context of the whole corpus (in this case, the selected 90 melodies). See the software documentation by Müllensiefen (2009) for more information about the specific computations that lead to the set of first- and second-order features.
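
The distinction between the two kinds of features can be made concrete with a simplified sketch. The first function computes a feature from one melody alone (the Shannon entropy of its pitches); the last function places a melody in the context of a corpus (the average share of corpus melodies that contain each of its interval bigrams). Both are illustrative stand-ins chosen for this example and are not FANTASTIC’s actual feature definitions, which are given in Müllensiefen (2009); the function names and the toy corpus are assumptions.

```python
import math
from collections import Counter


def pitch_entropy(pitches):
    """First-order style feature: Shannon entropy (in bits) of the pitches of one melody."""
    total = len(pitches)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(pitches).values())


def interval_bigrams(pitches):
    """Set of distinct pairs of successive pitch intervals in one melody."""
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    return set(zip(intervals, intervals[1:]))


def mean_document_frequency(melody, corpus):
    """Second-order style feature: how common the melody's bigrams are in the corpus.

    For each distinct bigram in the melody, take the share of corpus melodies
    that contain it, then average these shares.
    """
    bigrams = interval_bigrams(melody)
    if not bigrams:
        return 0.0
    corpus_bigrams = [interval_bigrams(other) for other in corpus]
    shares = [sum(bigram in other for other in corpus_bigrams) / len(corpus)
              for bigram in bigrams]
    return sum(shares) / len(shares)


# Hypothetical toy corpus (MIDI note numbers), for illustration only.
toy_corpus = [
    [60, 62, 64, 62],
    [60, 62, 64, 65],
    [60, 67, 60, 67],
]
first_order_value = pitch_entropy(toy_corpus[1])
second_order_value = mean_document_frequency(toy_corpus[0], toy_corpus)
```

The first value depends only on the melody itself, whereas the second changes whenever the corpus changes, which is why the composition of the corpus matters for the interpretation of second-order features.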

3.2. Results

First- and second-order features were computed for all the fragments with FANTASTIC. Eventually, one song-fragment was dropped because it had missing values, leaving a corpus of 89 melodies.

The package ‘psych’ was used to compute a model for the years a song spent in the Top 2000. A total of 85 first- and second-order feature variables, computed with the fragment corpus, were used. Principal components analysis (PCA) with oblique rotation (oblimin) was conducted on the 89 MIDI items. Parallel analysis suggested nine factors, and the scree plot showed an inflexion that would also justify retaining nine components. These nine components were retained for the final analysis. Table 3 shows the factor loadings after rotation. The items that cluster on the same components suggest that Component 1 represents contour shapes for which the overall shape is common, but that have an unusually steep inclination. Component 2 represents unique use of motives (both in the corpus and in the single melodies), which increases in combination with a longer section length. Component 3 represents a wide variety in pitch use and pitch range. Component 4 represents often repeated motives that are common in the corpus. Component 5 represents use of motives that are unique within a melody, but not unique in the overall corpus. Component 6 represents typical rhythmic patterns that consist of variable durations. Component 7 represents typical range and use of pitch. Component 8 represents common contours with uncommon note transitions. Finally, Component 9 represents shorter note duration and high note density. The components are ordered by the total variance they explain.

Linear regression by means of the all-subsets method was used to determine which predictors should be included to best predict the total number of years in the Top 2000. This method gives all possible models, with the best model, the one with the lowest AICc value, at the top of the list. In table 5, the top five models are presented with their AICc and delta values. The best model has only two components (Component 3 and Component 6), explaining 12.37% of the variance within the set of all predictor variables. The model had an R² value of .09; see table 6 for more details. Component 6 contains several high-loading feature variables using statistical information about common durations of notes (high modal duration, entropy and unequal transitions; common with respect to the corpus). Due to the variation in duration and the relative ‘randomness’ of the durations, the component can be described as having a certain level of rhythmic complexity. Component 3 reflects a set of features relating to the range and the entropy of the pitch; both variation and entropy are high, suggesting that this component, too, reflects a relative complexity, in this case in the use of pitch. Thus, common unequal note durations combined with a wide range and higher entropy in pitch seem to benefit the number of years a song spends in the Top 2000.
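
The ranking criterion can be sketched as follows for least-squares models. The AICc formula itself is standard; everything else is an assumption made for illustration: the residual sums of squares are invented numbers rather than results from this study, the model names are only labels, and counting the intercept and the error variance among the k estimated parameters is a convention that may differ from the one used by the software in this analysis.

```python
import math


def aicc(rss, n, k):
    """Small-sample corrected AIC for a least-squares model.

    rss: residual sum of squares; n: number of observations;
    k: number of estimated parameters (assumed here to include the
    intercept and the error variance).
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n - k - 1 <= 0")
    aic = n * math.log(rss / n) + 2 * k
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def rank_models(models, n):
    """Sort models by AICc, best first, and add the delta to the best model.

    models maps a model name to a (rss, k) pair.
    """
    scored = sorted((aicc(rss, n, k), name) for name, (rss, k) in models.items())
    best_value = scored[0][0]
    return [(name, value, value - best_value) for value, name in scored]


# Invented residual sums of squares for three candidate models on 89 melodies.
candidate_models = {
    "intercept only": (400.0, 2),
    "C3 + C6": (364.0, 4),
    "C3 + C6 + C1": (363.0, 5),
}
ranking = rank_models(candidate_models, n=89)
```

In this invented example the third predictor lowers the residual sum of squares only slightly, so the penalty for the extra parameter outweighs the gain and the two-predictor model stays on top, which is the kind of trade-off the delta values in a model table express.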

In addition to predicting the total number of years a song spent in the Top 2000, I also examined whether certain components were predictive of specific numbers of years, such as zero or one year(s). To this end, models were computed for predicting zero years (in the seven-year time span since the release year of the song) and six or more years in the Top 2000. Based on an exploratory ANOVA, Components 3 and 6 seemed to be predictive for songs that spent zero years in the Top 2000, but a model including these two components did not fit significantly better than a model without them. For the prediction of a total of six or more years in the Top 2000, an exploratory ANOVA suggested that two components were important, Components 3 and 5, but again, including these two components did not lead to a significantly better fit than leaving them out of the model.

A similar approach was used to predict the nationality of the artist of the song: Dutch or non-Dutch. Component 5 turned out to be a significant predictor of the artist's nationality, meaning that songs whose motives are unique with regard to the melody, but not unique with regard to the whole corpus (all Top 2000 songs), are most likely songs by Dutch artists. The model explains 6.44% of the variance within the set of all predictor variables, with a small effect size (R² = .06); see Table 7.

Table 4. Feature loadings as result of principal component analysis (PCA)

Feature name

Oblimin rotated factor loadings

PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 Common Unique Pitch Common, Motives Complex Common Common Density
