The Democratic State of Spotify: An Examination of How Spotify Includes New Artists Within its Network and What That Means for Discoverability


Abstract

At the beginning of April 2018, Spotify Inc. launched its Initial Public Offering on the New York Stock Exchange. While the reception was tepid, the offering was the next logical move in Spotify’s bid to become the most dominant platform in music streaming. Yet Spotify still advertises itself as a “democratic” space for artists, and not as a media conglomerate. It has yet to be analyzed how Spotify’s API may support or undermine this image as an egalitarian space for music. This research dives into the aspect of platform equality as exemplified by diversity. Specifically, it traces how the new concept of “discoverability” is put into action in Spotify’s music recommendation algorithm and, in turn, how artists are clustered or networked together within the myriad catalogs of Spotify’s system. The study finds that even though Spotify recognizes the inherent flaws in the music filtering system it chooses to employ, the networks are still built and behave in a way that brings more disadvantages than advantages to new artists who wish to join Spotify and achieve success. This finding contradicts Spotify’s stated identity as a “democratic” and egalitarian space for music to be accessed and enjoyed.


Abstract

1. Introduction

1.1 Media Concentration and “Democratization”

1.2 Research Question and Direction

2. A Journey Through Music Recommendation

2.1 How Music Recommendation Algorithms Work

2.2 Breaking Down Acoustic Metadata

2.3 Network-Centric Recommendation

2.4 Popularity Analysis

3. A Platform Built on Collaborative Filtering

3.1 Music as a Service

3.2 Spotify as a Profit-making Platform

4. How to Dissect the Spotify Network

4.1 Empirical Methodology

4.2 Revisiting The Spotify Teardown

5. Unlocking the Network

5.1 Genre and Popularity

5.2 Followers vs. Monthly Listeners

5.3 Network Size

5.4 Popularity Distribution

6. Discussion

6.1 The Power of the People

6.2 An Uphill Battle

6.3 Vectors Within Networks

7. Spotify Case Study

7.1 Biyo Through the Eyes of Spotify

7.2 Spotify Through the Eyes of Biyo

8. Conclusion

8.1 Does Democracy Matter?

8.2 Limitations and Further Research

Bibliography


1. Introduction

1.1 Media Concentration and “Democratization”

Ever since musicality first developed among humans, music has been transmitted in varying ways from person to person. According to music historians and educators, the story of musical transmission forms something of a complete circle. It started out as an entirely aural type of transmission from one person to another, through singing songs directly to others. In later years, musical transmission became faster through the use of manuscripts and the invention of the printing press to copy written music. But this, of course, changed the form of music from its natural state into something that is a mere translation of the original musical piece. It wasn’t until the development of sound recording that music could once again spread through aural channels, while still maintaining the wide distribution that printed copies of music offered (Creative Commons 6).

However, even before the invention of the musical record as we know it, control was exercised over the spread or transmission of music media. And, make no mistake, music is one of the more essential forms of media, recognizable across all cultures. Scholars have estimated that although it may alter slightly over time, music is an extremely pervasive and influential form of media (Creative Commons 7). Returning to the concept of how control is practiced over music as a medium, we have to look closer at the history of media concentration. According to the Future of Music Coalition, media ownership as it relates to music is an important issue that first came into the sphere of public knowledge in the 20th century (“Why Media Ownership Matters to Musicians” n.p.). Their report on media concentration highlights the dangers that arise when “a small handful of companies get to decide what information you encounter on any given day” (n.p.). These types of control have changed throughout the years, but the coalition focuses on the dangers of music radio as a vector through which media concentration can be seen.

This is nothing new: the practice of “payola” was part of popular discussions about media ethics in the 1950s. Payola was the practice of charging artists or labels for the playing of their songs on commercial radio as part of the regular broadcast (Cowen 164). This was ethically dubious as it only distributed to radio listeners the artists who could afford to pay the bribes of the radio stations. Radio was and continued to be one of the more popular methods of musical transmission into the late 20th century and even into the early 21st century. It was the most traditional path to mass popularity for aspiring artists, but it continued to be an extremely narrow path according to the Future of Music Coalition. After extensive studies conducted as late as 2006, they found significant song overlap across a wide range of stations (n.p.). This decreased the diversity of the music that the general population was exposed to via the radio. They attribute this recent trend to the merging of media conglomerates that consolidated control over radio stations under parent companies such as iHeartMedia (n.p.).

The internet and streaming services were marketed as the solution to this lack of diversity in popular music. They were meant to break up the media concentration of music that had developed over the past century. This promise is a common thread among entrepreneurs in this market such as Sean Parker of Napster or Tim Westergren of Pandora. These musical platforms offered glimpses into the future of music ownership. They provided different models for how to achieve musical file-sharing on different sides of the law. Napster was the original experiment in free mp3 sharing that launched in 1999. It was the first time that music could be reliably and quickly shared through an internet connection (Lamont n.p.). The founders of Napster were later sued for blatant copyright infringement when artists like Metallica voiced public displeasure with the Napster model. Surprisingly, Napster still lives on today after it was bought and eventually rebranded by another music platform called Rhapsody. Having abandoned its illegal file-sharing model, it currently uses a subscription-based pricing model, but recent layoffs indicate a lack of sustainable growth for Napster (Popper n.p.). On the other end of the spectrum is a program like Pandora. Pandora pivoted to a “digital radio” model that allowed listeners to input information about artists, genres, or songs that they preferred. From those preferences, Pandora would play songs that fit those characteristics. Users were not allowed to cherry-pick the specific song or artist they wanted to play at a specific time. Most revenues were made through advertisements (Disis n.p.). For a time, Pandora was a dominant fixture in the music-streaming market. It still has over 75 million unique listeners per month. But, according to recent reporting, Pandora was late to offer an ad-free subscription model, and the radio model soon fell out of favor with consumers who preferred a more personalized experience (Disis n.p.).

Enter Spotify. No other music-streaming platform has been as successful as Spotify in recent years. With a unique monthly listener count of 159 million users, Spotify has more than double the listener base of Pandora and triple the listener base of Apple Music (Disis n.p.). On top of that, Spotify is particularly adept at converting users from its free ad-based model to its premium model, with over 71 million subscribers. Spotify recently launched its Initial Public Offering on the New York Stock Exchange, achieving a market cap of 27 billion dollars (Owsinski n.p.). Spotify itself has made no secret of its wish to disrupt the established order when it comes to access to music. Spotify’s New Market Leader Bahigh Acuña said, “We arrived to democratize music, and that’s what we’re doing. Our business model is win-win for everyone” (Herrera n.p.). This type of grandiose statement, along with the previously mentioned market dominance, makes Spotify and its systems worthy of intense study. Certainly, it is worth knowing what makes Spotify so successful, but it is perhaps more important to examine what this type of market dominance and concentration means for consumers of music. Just as the dominance of radio affected popular culture, Spotify may have far-reaching effects as well. This research has already been initiated, albeit with more of a focus on the fiscal and economic impacts of Spotify. Multiple news reports in the past few years have debated the effectiveness of Spotify’s “democratization” endeavor as it pertains to artist compensation.

Spotify operates on a streaming music model, which has largely redirected music ownership from individual persons to a rental play option that will be discussed in further detail in the next chapter. What is important to understand about this system is the way Spotify compensates artists for their music. The company has been under fire for the last five years or more for a system that pays artists mere pennies for their streams. In 2014, a co-songwriter of the song “Livin’ on a Prayer” by Bon Jovi said that from 6.5 million streams of the classic ‘80s hit, the songwriting team received only $110 in compensation, which had to be split evenly among its three members. “We could each buy a pizza,” the songwriter Desmond Child said (Gross n.p.). This data could easily be extrapolated and applied to newer or less-established artists than Bon Jovi. If these hitmakers were not able to support themselves through use of Spotify’s system, could anybody? A smaller artist like singer-songwriter Jason Isbell reports that he earned so little from Spotify that it was completely negligible to his bottom line (Gross n.p.).

This type of critical investigation is just the beginning of the scrutiny that the company has been subject to over how it exercises its power. Until recently, Spotify had not been studied by academics in detail. However, this autumn, the first significant large-scale study of Spotify will be published. The Spotify Teardown is a long-term study performed by a group of five esteemed researchers and sponsored by the Swedish Research Council. They take a look at Spotify from many angles, including how the company has collected user data and employed it within its streaming infrastructure. Certainly, part of this was meant to further investigate criticism aimed at Spotify as a company, as well as the way Spotify defends itself against criticism. The Spotify Teardown does indeed examine Spotify’s market impact as it relates to media ownership and concentration. Given that in Sweden, “Almost 90 percent of the population under thirty-six now uses the service on a weekly basis” (Eriksson et al. 21), this type of study is extremely relevant and important.

One of the ways that Spotify has defended itself against such criticism is by arguing that the exposure it gives to artists is a benefit that other formats are less adept at providing. The company touts the advantages of marketing new music through a platform that allows unlimited streaming. Spotify’s own Chief Content Officer stated, “Even if you wanted to explore the world of music from your chair, it was virtually impossible to sample genres that you had never been exposed to and records that you might have heard about but didn't have any way to access to try before you buy,” (Castillo n.p.). This is a reference to Spotify’s ability to propagate discoverability when it comes to new artists. Discoverability originates as an industry term. At its core, it means how easy or difficult it may be to find content in any given system. In this age of easy access to immense catalogs of music, discovery is an essential part of the service that music platforms like Spotify provide. Being able to be found is what makes joining Spotify valuable to artists. Spotify’s philosophy is that even if you aren’t making significant amounts of money through streaming, you could be opening up your catalog to new ears. These new listeners may be converted into subscribers or fans who will buy tickets to concerts or merchandise (Cooper et al. 6). But none of this can be accomplished unless the artist and their music can be found. Therefore, this “discoverability” part of Spotify’s identity is integral to its claim that it represents the “democratization” of music. It is necessary to appropriate the term “discoverability” for the purposes of this research. Doing so makes it possible to explore this term from different perspectives. Discoverability may have many different facets that can affect how easy or difficult it may be to find a particular artist or song. It is necessary to determine all of these different categories and evaluate their efficacy as it relates to music being found or discovered. This type of research is highly significant when it comes to determining the veracity of Spotify’s claims about the egalitarian nature of its platform and its form of music recommendation. Even if its mission statement as it relates to equality has been proven problematic from a compensation perspective, the more important question is how Spotify approaches equality within its API. Can this key tenet of Spotify’s benefit to the music industry withstand scrutiny and assessment?


1.2 Research Question and Direction

Using Spotify’s algorithm, I will endeavor to explore the ways in which Spotify operationalizes its quest to democratize music. However, instead of focusing on the already crowded field of breaking down its compensation structure, I will zoom in on how Spotify uses its artist network to establish connections that people may not readily make using a traditional buying structure attached to music. Using new-to-Spotify artists is extremely valuable in this research as it allows me to track how small artists establish connections within Spotify’s network to larger artists and to trace a pattern of discovery likelihood within these parameters. The research question is therefore: to what extent does Spotify’s Artist Network algorithm offer pathways to discoverability and diversity for new artists?

I will approach this issue using various academic perspectives and methods. First, it is important to understand how the Spotify Artist Network operates. Using literature and research, I will explore how music recommendation systems work. It is important to understand the ways machines analyze music and thus categorize it. There are several different approaches to how computers can classify audio files, both with and without human interference. However, it is still good to keep in mind that Spotify’s current system includes human data. From there, I will move into empirical research studying a sample of new artists on Spotify. Once again, studying artists who only joined Spotify recently will provide me with unique insights into how and why these networks are structured. I will gather all of the network data about these artists, including who they are connected to, what genres they are clustered in, how many followers they have, and what their popularity ranking is within Spotify. This combination of data capture about Spotify is unique because it allows me to analyze almost all of the components that culminate in popularity and playcount on the platform. This can reveal possible pathways to popularity and discoverability for new artists. It can also help determine whether Spotify’s network is a reification machine or a diversity machine as it relates to emergent popularity. Finally, I will conclude this project by presenting a case study of a new Spotify artist. This personal take will add a face to a very technical piece of research. What channels and interfaces do they encounter as they navigate the Spotify platform?

I do not consider this research to be a general look at how Spotify categorizes musicians into certain genres and sub-categories. This could be done solely through a literature review based on the current information available about Spotify’s practices. Instead, I want to dive deeper into how Spotify ranks and orders artists in terms of how they bring added value to the platform. I consider it more of an interrogation of Spotify, conducted not through interviews but through the evidentiary study of the platform’s code. The founders boast that Spotify is on a mission to give people more equal access to all different types of music. It is important to see whether the algorithms in use support that claim. But it is just as vital to look at these algorithms in action. It may turn out that the theoretical outline for how Spotify formulates networks is markedly different from how it functions in practice. This is why looking at specific cases and artist networks is advantageous: it allows the research to fully capture the surprises within the machine. It should also be expected that there may be no definitive black-and-white answer as to whether Spotify is a reification machine or a diversification machine. It may hold characteristics of both. But by zooming in on how the network is visualized as it pertains to individual cases, it is possible to examine very specific patterns and anomalies within the dataset. My hope is to create a well-rounded and holistic view of the Spotify Artist Network from both an interior and an exterior perspective.


2. A Journey Through Music Recommendation

2.1 How Music Recommendation Algorithms Work

It is a little-known fact that before Mark Zuckerberg started Facebook, he developed a music recommendation program called Synapse with his best friend Adam D’Angelo. Zuckerberg completed this program while still in high school and was offered $950,000 from Microsoft to purchase it. Instead, Zuckerberg turned down the offer and chose to attend Harvard (McCracken n.p.). But the fact remains that before Pandora, last.fm, and iTunes Genius, there existed Synapse.

So how did Zuckerberg crack the code on a program that can read your taste in music and recommend similar songs and artists? The truth is that the answer is different every time the question is asked. As Óscar Celma elucidates in his highly detailed volume Music Recommendation and Discovery, there are several ways of building music learning algorithms, and some methods are more effective than others. In order to understand more deeply the variations in contemporary music recommendation software, this chapter will examine Celma’s research point by point.

The first entry point for music recommendation is what Celma describes as the easiest, User-Centric Evaluation. This is often the starting point for common recommendation platforms such as Spotify and Pandora. When a user is just joining the platform, they may do so by logging in through an email or other social media account. The music recommendation system takes the only information it has so far, demographic information, and builds a profile (Celma 19). While it may seem non-specific, Celma explains, “Musical taste and music preferences are affected by several factors, including demographic and personality traits. It seems reasonable to think that music preferences and personal aspects—such as age, gender, origin, occupation, musical education, etc.—can improve music recommendation” (45). Of course, Celma understands that these profiles are not static and require constant feedback to improve their recommendation accuracy (21). Logically, this leads to the next type of algorithmic recommendation.

The type of recommendation that leans most heavily on user interaction and rating is the system that the application Pandora Radio relies upon. This system is called Playlist Generation and requires users to rate songs presented to them on a traditionally binary scale (Celma 32). In the case of Pandora, this is a simple “thumbs up” and “thumbs down” assessment. This sends a simple message to the system to play (or to not play) music similar to what is currently playing. How the system determined what music is similar often came down to artists self-reporting metadata about their musical genre and influences (Celma 55). However, this can cause issues because “tagging spam is a problem for any music recommender system that relies on this type of data to derive artist (or track) similarity” (Celma 59). From this, another method was developed that involved less human interaction with the metadata.

The next frontier for developers of music recommendation systems was content-based analysis using acoustic metadata. The next section will discuss acoustic metadata and how it is retrieved in more detail. For the chronological development of music recommendation algorithms, it is sufficient to focus on what content-based analysis is and how it generally works. The first step, according to Celma, is to use item-based similarity, which “is the most common way to compute and predict the recommendations” (81). Put simply, the computer must establish a baseline of what features can be used to evaluate musicality. Only once the baseline is established can similarity among song features be calculated and a reliable recommendation system be set up (Celma 75). Another digital scholar, David Beer, described this smooth integration of feedback and recommendation as a way our system of music consumption has “become more permeable and networked so this information becomes real-time, locational and, most importantly, increasingly ambient and hidden” (478).
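To make the idea of similarity among song features concrete, the following is a minimal Python sketch. It assumes cosine similarity as the measure, which is one common choice and not necessarily the one Celma or any platform uses. The song names, the three features, and their values are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a vector with no signal is treated as dissimilar
    return dot / (norm_a * norm_b)


def most_similar(target, catalog, k=2):
    """Rank every other song in the catalog by similarity to the target."""
    scores = [
        (name, cosine_similarity(catalog[target], features))
        for name, features in catalog.items()
        if name != target
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]


# Hypothetical baseline features: (scaled tempo, energy, acousticness).
catalog = {
    "song_a": [0.80, 0.90, 0.10],
    "song_b": [0.75, 0.85, 0.15],
    "song_c": [0.20, 0.10, 0.95],
}

print(most_similar("song_a", catalog, k=1))
```

In this toy catalog, the song with the closest feature profile to song_a is ranked first, regardless of how many listeners either song has.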

The “ambient and hidden” qualities of music recommendation are precisely the direction that applications like Spotify wish to take when it comes to seamless integration. A recent study of Spotify’s advertising model posited that the two most important recommendation tools are profiling and content-based analysis, used in combination. By synthesizing these two approaches, Spotify has created a situation where “music has become data, and data, in turn, become contextual material for user profiling at scale” (Mahler and Vonderau 213).

2.2 Breaking Down Acoustic Metadata

In order to better understand the inner workings of content-based analysis and how it serves a program like Spotify, it is necessary to give a quick overview of how a computer “listens” to music. Celma presents concrete algorithmic examples of the type of mathematical formulations it takes to communicate to a computer how to digest musical information. For the purposes of this research, it is only necessary to understand the terminology, intent, and usage of such programs.

At its base level, acoustic metadata decomposes music into two separate types of data, temporal and spectral (Celma 64). Spectral features are described as more “robust to polyphonic and complex textures” (64). This essentially means that this type of music analysis takes a short sample of a song and maps how the sonic layers within the song relate to one another. There are several technical criteria for spectral measurement, including centroid (the average frequency of the spectrum, weighted by amplitude) and flatness (the ratio between the geometric mean and the arithmetic mean of the spectrum magnitude) (Celma 64). While spectral features offer computers the opportunity to analyze music through a mathematical lens, temporal features take a more artistic approach.
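The two spectral measurements defined above can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes the magnitude spectrum of a short audio frame has already been computed, and the function names and sample values are hypothetical.

```python
import math


def spectral_centroid(freqs, mags):
    """Average frequency of the spectrum, weighted by amplitude."""
    total = sum(mags)
    if total == 0:
        return 0.0  # a silent frame has no meaningful centroid
    return sum(f * m for f, m in zip(freqs, mags)) / total


def spectral_flatness(mags):
    """Ratio of the geometric mean to the arithmetic mean of the magnitudes."""
    if not mags or any(m <= 0 for m in mags):
        return 0.0  # the geometric mean is undefined for non-positive values
    log_mean = sum(math.log(m) for m in mags) / len(mags)
    geometric_mean = math.exp(log_mean)
    arithmetic_mean = sum(mags) / len(mags)
    return geometric_mean / arithmetic_mean


# Hypothetical frame: frequency bins in Hz and their magnitudes.
freqs = [100.0, 200.0, 300.0, 400.0]
noise_like = [1.0, 1.0, 1.0, 1.0]      # energy spread evenly across bins
tone_like = [0.001, 1.0, 0.001, 0.001]  # energy concentrated in one bin

print(spectral_centroid(freqs, noise_like))
print(spectral_flatness(noise_like), spectral_flatness(tone_like))
```

A flatness near 1 indicates a noise-like frame with energy spread across the spectrum, while a value near 0 indicates a tonal frame dominated by a few frequencies.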

Temporal features of songs and music can be listed in terms that more general audiences are familiar with, even though content-based analysis presents its own spin on all of this common terminology. For instance, temporal features measure instrumentation by comparing recorded samples of instrument textures or “timbres” (Celma 65). Additionally, they evaluate rhythm through common tempo criteria such as time signature and BPM (beats per minute) (Celma 66). Other temporal features include measurement of harmony in relation to prevalent melodic chords and progressions, structure in terms of A-B themes like chorus and verses, and intensity by cross-comparison with benchmark sound files for a reading of the song’s energy (Celma 66-67).

As mentioned previously, temporal features are easily identified by their connotations with known musical terminology. Some of the features temporal analysis evaluates that I haven’t yet mentioned include genre and mood. It is important to remember that these measurements are not taken in isolation, but work in tandem with one another to create one singular analysis of a piece of music. From there, the artistic measurements of the temporal aspects are married to the mathematical breakdowns that spectral analysis provides. It is both of these measurements that contribute to effective content-based analysis that minimizes human input. These are the foundations upon which Spotify recommends artists and songs to its users. In turn, content-based analysis is also the building block upon which networks are created between similar music. The next section will focus solely on network creation and network-centric evaluation in music recommendation systems.

2.3 Network-Centric Recommendation

This section turns from how content-based analysis is applied to music within a system to how it is used to form networks. Just like with the actual process of musical analysis, there are several different ways in which networks can be formed by the synthesis of temporal and spectral evaluation. But, as Celma is quick to point out, there are many potential pitfalls when it comes to pleasing consumers who utilize the network. It is a careful balance or “trade-off between the desire for novel versus familiar recommendations. A high novelty rate might mean, for a user, that the quality of the recommendation is poor because the user is not able to identify most of the items in the list of recommendations” (Celma 35).

Among the many types of musical networking are similar concepts that I have discussed previously. These include human-expert based networks that utilize human editors to link artists and musical styles and content-based filtering which relies on no feedback from humans and works from content-based analysis (Celma 132). The system that Spotify uses is somewhat of a balance between the two called collaborative filtering (Celma 131). An understanding of how collaborative filtering works will open up the understanding of how a new artist is added to the Spotify network.

Collaborative filtering, according to Celma, is a social system that uses data collected from users (both directly and indirectly) to modify a content-based filtering system (134). This social aspect allows users to play the role of “fact-checker” against the computer’s technical assumptions. For instance, Celma notes that collaborative filtering presents a higher incidence of assortative mixing than either of the other two networking models (135). However, it is very important to pay attention to the fact that human interference also creates the tightest and most exclusive clusters within networks, as Celma explains further: “That means that the most connected artists are prone to be similar to other top connected artists. Neither CB nor EX present indegree–indegree correlation, thus artists are connected independently of their inherent properties” (135). An example of collaborative filtering in action at Spotify is the shift toward “‘entirely personal’ musical experiences,” which “was materialized in the (now slightly rearranged) Discover, Browse, and Follow features. Together, these functionalities allowed users to follow their favorite artists and tastemakers and to check out selected music based on previous listening patterns and editorial decisions” (Eriksson et al. 118). The creation of these different pathways can be seen as various approaches to diversity, but in fact they carry all the same baggage that collaborative filtering can put on any given system.

A clearer breakdown of collaborative filtering comes from a former employee of Spotify, Sander Dieleman. Dieleman describes the process of collaborative filtering as using a particular user’s history as a stream-of-consciousness way of connecting artists to artists and songs to songs (n.p.). By aggregating these historical patterns of usage among large groups of consumers, patterns begin to emerge. Put simply, “if two songs are listened to by the same groups of users, then probably these two songs are of a similar type” (Dieleman n.p.). However, Dieleman admits that there are oversights within the collaborative filtering system that are quite similar to the issues that Celma highlights.
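Dieleman’s rule of thumb can be illustrated with a small Python sketch that scores two songs by the overlap between their listener groups. The Jaccard index used here is an assumption chosen for simplicity, not Spotify’s actual method, and all song names and user identifiers are hypothetical.

```python
def jaccard(users_a, users_b):
    """Share of listeners two songs have in common, from 0.0 to 1.0."""
    union = users_a | users_b
    if not union:
        return 0.0
    return len(users_a & users_b) / len(union)


def similar_songs(song, listeners):
    """Rank all other songs by listener overlap with the given song."""
    ranked = [
        (other, jaccard(listeners[song], users))
        for other, users in listeners.items()
        if other != song
    ]
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Hypothetical listening history: song -> set of users who played it.
listeners = {
    "song_x": {"u1", "u2", "u3", "u4"},
    "song_y": {"u2", "u3", "u4"},
    "song_new": {"u5"},  # a new release with a single listener
}

print(similar_songs("song_x", listeners))
```

Because song_new shares no listeners with the established songs, its overlap score is zero and it would never surface as a neighbor, whatever it sounds like. This is the oversight the following section takes up.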

2.4 Popularity Analysis

The main issue is how to get around the dilemma of popularity (Dieleman n.p.). Just as Celma pointed out that collaborative filtering tends to cluster artists more tightly than other types of networking, Dieleman corroborates by explaining: “Since the collaborative filtering approach is based on usage data, the more popular the song is, the more usage data is related to the song. So it is much easier for popular songs to be recommended, whereas music which is new or unpopular is unlikely to be recommended” (n.p.).

It is highly important that this aspect of collaborative filtering be addressed, as it is the very basis for the research to follow. According to these scholars, popularity itself is a filtering method and presents disadvantages to new and upcoming artists. Celma, whose focus is more technological than Dieleman’s, explains this by noting that the “[Collaborative Filtering] network has a clear correlation (rCF = 0.503); the higher the playcounts of a given artist, the higher the average playcounts of its similar artists” (Celma 139).
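The kind of correlation Celma reports can be sketched as a Pearson coefficient between each artist’s playcount and the average playcount of the artists linked to it. The Python sketch below uses a hypothetical four-artist network with invented playcounts; it is not Celma’s data or procedure, and it will not reproduce his figure.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined when one side never varies
    return cov / math.sqrt(var_x * var_y)


def playcount_neighbor_correlation(playcounts, similar):
    """Correlate each artist's playcount with its neighbors' mean playcount."""
    own, neighbor_mean = [], []
    for artist, neighbors in similar.items():
        if not neighbors:
            continue  # artists without links contribute no data point
        own.append(playcounts[artist])
        neighbor_mean.append(
            sum(playcounts[n] for n in neighbors) / len(neighbors)
        )
    return pearson(own, neighbor_mean)


# Hypothetical network in which popular artists only link to each other.
playcounts = {"big_1": 1000, "big_2": 900, "small_1": 10, "small_2": 20}
similar = {
    "big_1": ["big_2"],
    "big_2": ["big_1"],
    "small_1": ["small_2"],
    "small_2": ["small_1"],
}

print(playcount_neighbor_correlation(playcounts, similar))
```

A coefficient close to 1 means that popular artists are mostly connected to other popular artists, which leaves few paths from a large artist’s page to a small one.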

Celma delves further into the effects of popularity on networks by conducting a series of trials for each of the three main music networking techniques: collaborative filtering, content-based filtering, and human-expert based networks (139). Essentially, by inputting a control list of artists through each recommendation system, he determined how likely each process was to recommend familiar or unknown artists. Oddly enough, both content-based filtering and human-expert networks scored higher in the discovery category than collaborative filtering did (Celma 142). This is particularly remarkable in the case of content-based filtering because one would expect a computing-only approach to yield fairly safe results. However, it appears that it is the human influence on content-based analysis that tends to make it more exclusive.

This is the main point that Celma remarks upon in his discussion of the study results: “the popularity effect derived from the community of users has consequences in the recommendation network. This reveals a somewhat poor discovery ratio when just browsing through the network of similar music artists… This could be related to the existence of positive feedback loops in social-based recommenders” (145). This applies directly to Spotify as well. As The Spotify Teardown points out, “Music recommendation algorithms at Spotify did not really take advantage of the archival infinity of the service” (Eriksson et al. 102), meaning they had a tendency to play the same songs over and over again, resulting in a lack of originality.

Now that it has been made clear what the advantages and disadvantages are of this type of music recommendation network system, a new perspective can be approached. It is worth exploring how a service, program, or platform can operate using this type of network. In order to understand the service vacuum that Spotify came to fill with its own form of collaborative filtering, it is important to understand the technological landscape that allows for such programming to flourish.

3. A Platform Built on Collaborative Filtering

3.1 Music as a Service

Since file sharing amongst music fans became popular in the late 1990s, the music industry and the way it distributes music have been transformed multiple times. These notable shifts have been cataloged in detail by media professor David Hesmondhalgh. All of them were instrumental in how Spotify and its predecessors came into being. First, the development of the MP3 format let people compress files so small that they could be stored on a computer or portable device. Next, the growing availability of high-bandwidth connections made downloads of music files faster and more accessible. After that, computers and electronic devices with larger storage capacities made it possible to store and play more music. Finally, more advanced software created ways for ordinary users to convert audio files on CDs to MP3s (Hesmondhalgh 59). The creation of these new sources or channels for music files forced the music industry to make a monumental shift away from physical record sales in order to stop the bleeding from illegal downloads (60).

One of the directions that distribution focused on was the download-to-own model, as can be seen in early cases like iTunes. The download-to-own model allows users to choose a song off of an album to purchase for a nominal fee. This presented a way for companies and artists to profit from the sales of digital files, where before illegal file-sharing through programs such as Napster was taking a serious toll on profits. However, this line of thinking was only scratching the surface because: “In effect, the primary barrier that the record companies have used to control and dominate the music industry – owning content – offered little resistance to these new technologies. They soon realized that they had clearly underestimated the impact of the internet and had been very sluggish to react to this unprecedented threat. To prevent entry – largely illegal – needed them to be vociferous in the pursuit of protecting copyright” (Lewis et al. 353).

However, according to Doerr and his colleagues at the Munich School of Management, this type of purchasing system also became less viable as time wore on. Instead, record companies started to look past the idea of music as a good and focused on providing music as a service. Doerr et al. explain Content as a Service as, “CaaS, whose technical foundation is referred to as data-streaming and which is analog to the concept of the much-discussed Software as a Service (SaaS). CaaS describes a type of business model which provides content over the internet as a service - without transferring ownership. This differentiates CaaS from ‘rent or buy models’ on the internet” (Doerr et al. 14).

One of the most prominent purveyors of Content as a Service is Netflix. Netflix is the single largest source of Internet traffic in the US, consuming 29.7% of peak downstream traffic (Adhikari et al. 1620). Therefore, its servers and Content Delivery Networks must have enough support to handle all technical challenges. Netflix may have originally been established as a channel through which people could mail-order video rentals, but it soon switched to a streaming platform for movie and television files, so as to avoid being disrupted just as Netflix disrupted Blockbuster (Gomez-Uribe and Hunt 13:2). Whereas Blockbuster catered to a market transitioning from buy-to-own models to a physical renting model, Netflix took it a step further. After the same events that left the music industry scrambling, the film industry faced a similar struggle with piracy. The ability to compress large files in formats beyond the MP3 (such as MKV or MP4) made file sharing just as much of a threat to film studios. Netflix capitalized on this by allowing people to access a large, but not complete, library of movies (Gomez-Uribe and Hunt 13:4). The profit model of Netflix differs slightly from Spotify’s, which will be discussed in more detail in the next chapter. Netflix charges a flat-rate subscription to all customers for unlimited access to its streaming library of films and television. Netflix was one of the first to marry a streaming platform model with the Content as a Service design that would soon become popular across various types of media.

Music as a Service (MaaS) is the more specific term applied to music streaming models. Doerr et al. note that Spotify is not the only actor in the field of providing MaaS. In addition to Spotify, the initial purveyors include streaming platforms like Grooveshark, Deezer, Steereo, and Last.fm (Doerr et al. 16). These companies took advantage of and marketed the full benefits of streaming over owning. These benefits include the addition of the peer-to-peer model, community features, and the ability to use platforms that offer free services without registering (16). However, there were some initial drawbacks that these streaming models had to take into consideration and market around. The biggest disadvantage was that “in contrast to downloads, MaaS users have to be online to receive their music by streaming” (Doerr et al. 17).

As raised in the paragraph above, one of the most original benefits that MaaS offers its consumers is its community and social features. These community features would grow to become the cornerstone of the collaborative filtering model that allowed Spotify to construct its network. Next, a brief history of Spotify must be examined to provide the full context for the research that follows.

3.2 Spotify as a Profit-making Platform

Spotify as we know it was launched in October 2008 in Stockholm, Sweden, where it is still headquartered. Founder Daniel Ek has stated that he wished to build a platform that could create a profitable MaaS model. The company initially focused on creating exposure and growth in the European markets, mostly by pushing the embedding function of its playlists on different websites. Building on that success, Spotify launched in the U.S. in 2011 (Swanson 208). The researchers of The Spotify Teardown trace how Spotify achieved this success, highlighting that the company has a history of equivocating about its role in copyright infringement by not paying artists properly for the files that were shared on the platform (Eriksson et al. 50).

Some might argue that because the addition of content to Spotify is controlled, its status as a traditionally operated “platform” is debatable. In contrast, Tarleton Gillespie argues that “the term ‘platform’ has already been loosened from its strict computational meaning. Through the boom and bust of investment (of both capital and enthusiasm), ‘platform’ could suggest a lot while saying very little… Platforms are platforms not necessarily because they allow code to be written or run, but because they afford an opportunity to communicate, interact or sell” (351). It can surely be posited that Spotify fulfills all three requirements that Gillespie mentions. Communication is achieved through the above-mentioned community features of MaaS. Interaction happens through playlist generation, both by artists and listeners. Finally, the sales aspect of Spotify is achieved through both advertising and subscription profit models. However, the Swedish Research Council group prefers to label Spotify as a mediator because of its role in translating the media that it distributes.

A new term has been attached to this particular type of profit structure. The “Freemium” model allows users to access content for free with the interruption of advertisements, or to pay a monthly subscription to bypass ads (Swanson 208). This type of profit model fits with suggestions by the Future of Music Coalition that digitization opens up different revenue streams, including Streaming Mechanical Royalties, Cloud Storage Payments, YouTube Partner Program, and Ad Revenue (Swanson 220). A Spotify Premium subscription costs $10 a month. Of that amount, Spotify takes a 30 percent cut and leaves the rest for the owners, producers, and contributors of the song to split (221). This freemium model is not necessarily how Ek may have first envisioned the success of Spotify, according to the authors of The Spotify Teardown, but it does support the claim that throughout its history, “Spotify is a shape-shifting service developed by a company that constantly adjusted, if not entirely changed, its main strategies and goals” (Eriksson et al. 31). The platform has experienced so much success worldwide that it was recently announced that the company will be filing for an Initial Public Offering on the New York Stock Exchange, putting it in the company of earlier Content as a Service platforms like Netflix (Roof and Constine n.p.). However, this success may not be all that it appears to be, as Spotify has always depended on financial speculation, purposefully obfuscating the details of its financing rounds, including the identity of investors and the size of their investment (Eriksson et al. 37). Certainly, by deliberately making several aspects of the company so opaque, Spotify invites examination by academic researchers and governments alike.

4. How to Dissect the Spotify Network

4.1 Empirical Methodology

While discussing Spotify’s system of categorizing and organizing music, it can be easy to fall into the trap of looking at each individual artist as an island. But this is not the way that Spotify sees artists on their platform and is therefore not the way that this research will be approached. Spotify filters both add to and alleviate pressure on the network when it comes to classifying musicians and genres.

I have highlighted in detail how collaborative filtering combines user-listening statistics with the ability to extract acoustic metadata about songs and artists. The basis of this research, however, is not categorization but rather how discoverability is achieved through networking within the Spotify platform. My wish is to break down just how new artists are added to each network, how large those networks are, and how prominence is spread across these networks. As Dieleman describes in his document on collaborative filtering at Spotify, the more attention a song gets from users, the more feedback Spotify has when it comes to including it in the recommendation system. It follows that popularity tends to skew the recommendation objectivity of any network (n.p.).

The goal is to find a way of using digital methods to effectively dissect the Spotify artist network as it applies to new musicians who join it. Luckily for the purposes of this research, there are some highly applicable options when it comes to exploring network data on Spotify. This is accomplished in the vein of other studies that have wished to explore networks more deeply. Others have tried to explore networks through the use of Issuecrawler technology, which is used for mapping issue-based controversies online (Rogers 89). Issuecrawler studies hyperlinking between actors in order to visualize networks embedded in the web. It is from this express purpose of the Issuecrawler tool that I can determine that mapping networks and bringing them into an easily-digestible and visual medium can add to any study of digital spaces such as Spotify.

Born and Haworth used this concept to accomplish a not-altogether unrelated research project about musical genres. By combining digital ethnography with hyperlink issue

distinctive eras in the evolution of the Internet as a digital-cultural medium; how the two genres exemplify at the same time changing cultures of Internet use; and how they illuminate—through music—the increasingly reflexive aesthetic and political uses being made of the Internet” (Born and Haworth 83). Not only were they intent on discovering how the internet links two genres, but they put it in context of trends and fads on the internet. They were telling a story simultaneously about music history and about the history of the internet and digital environments as well.

Since the research in this thesis is focused primarily on dissecting networks, it is highly valuable to understand the importance of networks within digital environments. It is essential to build on the foundations laid by new media scholars when it comes to uncovering patterns in how technology creates connections between digital objects of study. In the case of this particular research, the digital objects are catalogs of artist data hosted on Spotify. How do these objects communicate and interact with one another using the network API that Spotify has created? Most importantly, how does data like popularity score, follower count, and genre affect the artist network?

The main digital tool that will be used for this research is Bernhard Rieder’s Spotify Artist Network tool. Rieder is a researcher with the Digital Methods Initiative and has built several “as-is” tools made for use by digital researchers. Using a variety of sources including code accessed from GitHub, API access granted by Spotify, and visualization tools such as Gephi, sigma.js, and chroma.js, Rieder has created an in-browser tool that allows for quick access to all the artists on Spotify and their respective networks (Rieder n.p.). The process works by entering the name of a Spotify artist in the search bar (See Figure 4.1.1). The tool then presents a list of artists who match that name. After the preferred artist is selected, the tool begins retrieving that artist’s network information. The Gephi visualization of the artist network appears after a few moments (See Figure 4.1.2). The settings used to present this network are the Atlas 2 and No-Overlap options presented within the Gephi program. Most important are the settings concerning popularity in the network, as both node size and node color are used to depict the popularity of the individual musicians included in any one network, while the original artist searched appears in black (See Figure 4.1.3). The more popular an artist is, the larger their respective node will be. An added visual aid is the color, which ranges from light blue to dark red. Simply put, the warmer the color, the more popular the artist.
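The visual encoding described above can be sketched as a simple linear mapping from a popularity score to a node size and a color between light blue and dark red. This is only an illustration under stated assumptions, not Rieder's implementation (which, per the text, relies on chroma.js): the function name, the two RGB endpoints, and the size range are all hypothetical choices.

```python
# Hypothetical endpoints of the color scale (assumed, not taken from the tool).
LIGHT_BLUE = (173, 216, 230)
DARK_RED = (139, 0, 0)


def node_style(popularity, min_size=4.0, max_size=40.0):
    """Map a 0-100 popularity score to (node size, hex color).

    Size and color are both interpolated linearly; the real tool may use a
    different scale or color space.
    """
    if not 0 <= popularity <= 100:
        raise ValueError("popularity must be between 0 and 100")
    t = popularity / 100
    size = min_size + t * (max_size - min_size)
    # Interpolate each RGB channel separately between the two endpoints.
    rgb = tuple(round(a + t * (b - a)) for a, b in zip(LIGHT_BLUE, DARK_RED))
    return size, "#{:02x}{:02x}{:02x}".format(*rgb)


print(node_style(0))    # coolest color, smallest node
print(node_style(100))  # warmest color, largest node
```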

Figure 4.1.1. A screenshot of the Spotify Artist Network tool built by Bernhard Rieder. Currently, the tool can be seen searching for the records of an artist named KYLE.

Figure 4.1.2. A completed visualization of the KYLE artist network with the Atlas 2 and No-overlap settings in use.

Figure 4.1.3. A zoomed-in look at KYLE’s artist network. His node is represented in black, while other node colors and sizes appear to indicate popularity.

Popularity can be a very subjective term, particularly when considered from a research perspective. What is popular amongst one group of people may not be popular across all groups or demographics. Establishing a definition of popularity based on a system devised by the researcher could skew the entire dataset and make the research virtually useless. I wish to avoid this conflict of interest before it even arises. Therefore, the only real way to study how popularity manifests itself on Spotify is to go to the source and examine how Spotify itself defines and measures popularity. How does Spotify define popularity and why does it matter? The way popularity is used on Spotify can be seen as a measure of how effective its discovery networking actually is. After all, being discovered regularly means regular new listeners, which increases popularity by its very nature.

Before presenting the findings of this research, several more parameters must be set out so that the effects of popularity on the Spotify network can be discerned most effectively. First, it should be understood how Spotify defines popularity in the popularity score returned when an artist network is requested. Spotify defines popularity as an analytics ranking from 0 to 100. The exact breakdown that Spotify uses to determine this ranking is proprietary, but it appears to be a mix of the total followers a Spotify musician has, the total number of listeners, and the total play count of all the songs in the artist’s discography. For reference, one of the most popular artists on Spotify, Ed Sheeran, enjoys a popularity rating of 97 with over 23 million followers and 39 million unique listeners in the last month. To drive home this point, Sheeran experiences high playcounts on most of his songs, regularly reaching into the hundreds of millions.

It is helpful during the course of this research to further break down these ratings into multiple categories. I have decided that there should be five tiers of popularity based on the ratings Spotify supplies through access to its API. These tiers are the unpopular artists (rated 0-20), the desiring growth artists (rated 21-40), the middling artists (rated 41-59), the popular artists (rated 60-79), and finally the mega-popular artists (rated 80-100). During the course of this study, I will reference these rankings and assignments as I delve into the differences between new artists’ networks on the platform.
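The five tiers can be expressed as a small lookup, sketched below. The function name and table layout are hypothetical; treating a score of 40 as the top of the “desiring growth” tier (so that “middling” starts at 41) is an assumption about where that boundary falls.

```python
# (inclusive upper bound, tier name), in ascending order.
# Assumption: a score of exactly 40 counts as "desiring growth".
TIERS = [
    (20, "unpopular"),
    (40, "desiring growth"),
    (59, "middling"),
    (79, "popular"),
    (100, "mega-popular"),
]


def popularity_tier(score):
    """Return the tier name for a Spotify popularity score (0-100)."""
    if not 0 <= score <= 100:
        raise ValueError("popularity score must be between 0 and 100")
    for upper_bound, name in TIERS:
        if score <= upper_bound:
            return name
    raise AssertionError("unreachable: TIERS covers 0-100")


# Ed Sheeran's rating of 97, as cited in the text.
print(popularity_tier(97))
```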

Particularly when one speaks of new artists on Spotify, it is also significant for this study to further narrow down this rather broad category. The choice to focus on new artists was made in order to better ascertain how Spotify incorporates newer artists into its network. This way, there is a better understanding of the process of connection and networking from the beginning of the Spotify “life-cycle” of an artist. The artists included in this study are all artists who joined and created Spotify profiles from January 2017 onward. The list of these new artists was obtained from the Spotify platform itself, from a promoted list of artists that have joined the platform recently. Of course, it is important to acknowledge that some biases may be introduced when obtaining a source list from the platform that is itself the subject of the study. However, I have personally pared down the list to include artists from multiple genres and popularity rankings. I did this by cross-comparing their names with their personal profiles. My goal was to find as little overlap as possible between the related artists on each of the pages. As such, I was able to find everything from “desiring growth” artists in the Dutch Pop genre to “popular” artists in the Rock/Hip-hop genre. The number of individual artists examined for this study amounts to 50. A search within the Spotify Artist Network tool was performed for every single one of these musicians. Visualized networks were formed and captured. Additionally, the data from each of these searches was placed into a database for further scrutiny and study. The fields included all artist names included in every single network, as well as the respective popularity scores, follower counts, and genres corresponding to the artists. All in all, a database that included over 170,000 individual cells and inputs was created for this study.
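One way to picture the database described above is as a flat table with one row per artist per crawled network. The sketch below is an assumption about layout only: the `ArtistRecord` class, its field names, and the CSV format are hypothetical and are not the structure actually used in this study. The two sample rows use figures reported later in this thesis for Honey Harper and Not3s.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class ArtistRecord:
    seed_artist: str  # the new artist whose network was crawled
    artist: str       # an artist appearing in that network
    popularity: int   # Spotify popularity score, 0-100
    followers: int    # subscribed follower count
    genres: str       # genre labels joined with "; "


def to_csv(records):
    """Serialize records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(ArtistRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


rows = [
    ArtistRecord("Honey Harper", "Honey Harper", 16, 338, "country-pop"),
    ArtistRecord("Not3s", "Not3s", 63, 92240, "grime"),
]
print(to_csv(rows))
```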

In the next section, I will discuss what was found during the process of this research. The patterns found will be dissected and addressed one-by-one. Do any of these artist deep dives reveal points of interest about how Spotify networks its new artists?

4.2 Revisiting The Spotify Teardown

It is critical that this research be set apart from other digital methodologies that have been completed. Most significantly, there has been research funded by the Swedish Research Council into the “black box” that is Spotify. The results of this study are to be published in the upcoming book, The Spotify Teardown by Maria Eriksson, Anna Johansson, Rasmus Fleischer, Pelle Snickars, and Patrick Vonderau. This group of researchers approached Spotify from several different angles. They attempted to situate Spotify in a type of research more common in “autoethnographic and self-reflexive forms of fieldwork, as they are common in social anthropology and ethnology,” (Eriksson et al. 7). To this end, the researchers combined their digital methods research with work that is grounded in traditional interviewing and archival research.

The Spotify Teardown nevertheless presents itself in detail as the result of digital methods work, including an examination of Spotify as a “mediator” rather than as a traditionally-defined “platform.” The authors prefer to look at Spotify through the perspective of Bruno Latour, picturing Spotify as “an actor that transforms, translates, and modifies the meaning of the elements it is supposed to carry,” (Latour 39). By viewing Spotify this particular way, one mitigates arguments that a platform like this is inherently unequal due to the highly varied contributions of actors within the space. This sets up The Spotify Teardown as a piece of work that is informed by new media methods of research and study. The researchers make clear all of the methods they used to obtain their results, pointing out that at some points they may have violated Spotify’s terms of service. These methods are listed as: “1) following the controversies of a social media campaign; 2) establishing a record label for research purposes; 3) intercepting network traffic by way of packet sniffers; 4) conducting a reflexive analysis of methodological decision making; 5) building a digital application as part of an arts performance; and 6) engaging in web scraping of corporate materials” (Eriksson et al. 16).

While this methodology does illuminate the actions that the researchers are performing, it still remains to be seen what the intended purpose of the methods may be. On behalf of the Swedish Research Council, the researchers looked into various topics including the political ramifications Spotify had in Sweden (and globally), the power structures and finances of media distribution, the infrastructure of music streaming, and an examination of user data collection (16). This in-depth look at all the ripple effects that Spotify has had in different spaces is the first of its kind. The sheer breadth of topics covered by the study already sets it apart from the hyper-focused nature of the research completed in this paper.

Specifically, even though the context of this paper depends on situating Spotify briefly in the histories of media concentration and how streaming is constructed on a platform (or “mediator”) space, this is merely an act of world-building as a set-up for the larger question of this research. The Spotify Teardown researchers do not approach their research from the perspective of new artists and how they are integrated into the networks. They discuss the types of networks that precede Spotify and the Twitter networks that were a part of Spotify’s first public relations drives, but they leave the effect of the API on the artist network largely unexamined. The Spotify Teardown, for all that it does, does not address the concepts of discovery and popularity. This research does. It looks at the advantages or disadvantages given to the artist actors specifically. Because of this narrowed field of examination, the concept of discoverability can be defined, analyzed, and re-defined as the need arises. Both studies may look at Spotify’s claims of “democracy” or “fairness” through a lens of skepticism, but only this study focuses on the actors who have the most investment in whether or not this “democratic” style is functioning.

5. Unlocking the Network

There are three partitions when it comes to how Spotify organizes artists and information within its networks. It is my intention to study each of these partitions individually and to illuminate the pathways that Spotify provides to increased discoverability in the system. When withdrawing information from the Spotify Artist Network, these partitions become visible. They are followers, genres, and popularity score. Looking through each of these lenses shows how each presents different opportunities as well as disadvantages for getting discovered through the platform. Perhaps the follower count tells a completely different story of the impact than the popularity score. Perhaps genre has a large effect on popularity and discoverability within any of the networks. Keeping this in mind is important as I look into what can be found when trying to dissect each of these partitions. Ultimately, taking each of them apart makes it possible to construct an overarching narrative as it relates to the “democratic” nature of the Spotify platform.

5.1 Genre and Popularity

When scraping for all 50 artists, there is one trend that stands out in every single data set. This is the effect caused by the inclusion and clustering of the most popular genres on Spotify. Each graph includes two iterations of results, meaning that the tool crawls for both the network of the artist that is queried and the related artists of all those within the first-generation network. This greatly expands the networks of many artists, sometimes including artists that seem either unrelated or only tangentially related to the queried artist’s genre. For instance, one new artist included in the scrape, Clairo, has been identified by Spotify as belonging to the lo-fi beats genre. This genre is characterized by minimal production, mostly focused on the rhythm section. Her direct network includes artists similar to her in genre and style such as Castlebeat and Fox Academy, who both belong to the lo-fi beats and indie psych genres. These correlations to Clairo are fairly easy to make, and it is logical that Spotify groups them together. However, once you open up the network to those connected to Clairo’s first-degree connections, you start to expand the genres to include artists who do not share any common links with Clairo. This begins to skew the graph, particularly because Clairo’s second-degree connections include r&b and pop artists who tend to heavily skew the popularity weight of her network (See Figure 5.1.1).
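The two-iteration crawl described above amounts to a breadth-first search that stops two hops from the queried artist. The sketch below assumes the related-artist links are already available as an in-memory mapping, whereas the actual tool requests them from Spotify's API. The function name is hypothetical, and the second-degree artists and every link other than Clairo's first-degree connections to Castlebeat and Fox Academy are invented placeholders.

```python
from collections import deque


def crawl_network(seed, related, depth=2):
    """Breadth-first crawl of related artists up to `depth` hops from `seed`.

    `related` maps an artist name to a list of related artist names.
    Returns (distance, edges): hop distance per discovered artist and the
    set of directed (source, target) links that were followed.
    """
    distance = {seed: 0}
    edges = set()
    queue = deque([seed])
    while queue:
        artist = queue.popleft()
        if distance[artist] >= depth:
            continue  # do not expand artists on the outermost ring
        for neighbour in related.get(artist, []):
            edges.add((artist, neighbour))
            if neighbour not in distance:
                distance[neighbour] = distance[artist] + 1
                queue.append(neighbour)
    return distance, edges


# Hypothetical link data for illustration only.
related = {
    "Clairo": ["Castlebeat", "Fox Academy"],
    "Castlebeat": ["Fox Academy", "Pop Artist A"],
    "Fox Academy": ["R&B Artist B"],
    "Pop Artist A": ["Mega Star C"],  # three hops away, so never reached
}
distance, edges = crawl_network("Clairo", related)
print(distance)
```

Because second-degree artists are added without any genre check, a single well-connected pop or r&b artist can enter the network of an artist in an unrelated genre.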

Figure 5.1.1. A look at Clairo’s network. She is depicted in red, only tangentially related to the popular black and gray dots representing r&b and pop artists.

As r&b artists such as SZA and 21 Savage carry popularity scores in the high 90s, they tend to dominate the networks of many artists. Clustered tightly with other r&b artists with whom they have mutual connections, they form clusters with many links directed both toward and away from them. Once again, these artists are only tangentially linked to Clairo. This means that people browsing the lo-fi beats genre of music would soon start being directed toward more popular artists and genres such as r&b and pop music. In Figure 5.1.1, all r&b and pop artists are colored in black and gray, while the more similar genres of indie rock and lo-fi are displayed in purple and pink respectively. The clusters of r&b and pop appear to be highly self-referential, linking back and forth to one another frequently.

This is not unique to Clairo. Though the original sample list of new artists spans genres from folk music to French electronica, the results often lead back to the most popular artists on Spotify, more specifically those belonging to the more popular genres of r&b and pop. For example, the artist The Weeknd appears in no fewer than 39 out of the 50 studied networks. In fact, after ordering the most popular artists on Spotify, meaning those with the highest popularity score (See Figure 5.1.2), I found that all 20 of the top artists belong to the pop or r&b genres, including some of the artists that made an appearance in Clairo’s network such as SZA. This brings up interesting leads that will be further explored in forthcoming chapters.
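A count such as “appears in 39 out of the 50 studied networks” can be obtained by tallying, for every artist, the number of crawled networks that contain them. The following is a minimal sketch; the function name and the toy networks are hypothetical and do not reproduce the study's data.

```python
from collections import Counter


def network_appearances(networks):
    """Count in how many networks each artist appears.

    `networks` maps a seed artist to the artist names found in that seed's
    crawled network. Each artist is counted at most once per network, and
    the seed itself is not counted for its own network.
    """
    counts = Counter()
    for seed, members in networks.items():
        counts.update(set(members) - {seed})
    return counts


# Hypothetical toy input: three seed artists and their crawled networks.
networks = {
    "Seed A": ["Seed A", "Hub Artist", "Artist X"],
    "Seed B": ["Hub Artist", "Artist Y", "Hub Artist"],
    "Seed C": ["Artist Z"],
}
counts = network_appearances(networks)
print(counts.most_common(1))
```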

Figure 5.1.2. The artists with the highest popularity scores on Spotify, all of which belong to the pop or r&b genres.

5.2 Followers vs. Monthly Listeners

While the tool allows for the easy scraping of the subscribed follower count of all the artists within the networks, the monthly listener count is a bit more elusive, as it refreshes every day to reflect an accurate overview of how many people have listened to a song in the artist’s discography in the past 30 days. However, this could be by design, as the monthly listener count affects the popularity score more than the follower count. As previously mentioned, the total playcount of the discography also factors into the popularity score, which is subject to change as well. A good example of this is the case of Knox Fortune, a new artist from 2017 who has the highest popularity score, at 68, of all of the 50 original case study artists. This score qualifies Knox Fortune as a “popular” artist according to the previously stated popularity tiers. However, his follower numbers are more difficult to reconcile with his preeminence, since he only has 9,517 subscribed followers (see Figure 5.2.1) and there are seven other artists in the initial dataset who have more. In fact, the artist Kojo Funds boasts a follower count upwards of 70,000 but has a popularity score of 67, one below Knox Fortune.

Figure 5.2.1. Knox Fortune and Kojo Funds’ monthly listener and follower counts.

But it is obvious that the Knox Fortune and Kojo Funds cases have a similar gap when it comes to monthly listener counts. Once again, Kojo Funds outstrips Knox Fortune by a significant margin. This is prevalent amongst many of the artists that were scraped: follower counts and monthly listener counts do not have a one-to-one correlation with the popularity score provided by Spotify. The only missing piece of the popularity score equation is the total playcount of every artist’s discography. This information is not available through scraping the open API. Cumulative discography playcounts are also not listed on the artist detail pages. They can be obtained from song to song, but are subject to rapid change due to the very nature of active streaming. This can lead us to believe that the popularity score weighs some categories of artist statistics more heavily than others.

5.3 Network Size

It would seem logical that the larger an artist’s network, the greater the reach they would obtain in terms of discoverability. Certainly, there are more paths to discovery if there are more nodes in a given network that have the opportunity, whether directly or indirectly, to link back to the central artist. Following this line of thought, one could imagine a feedback loop where the artists with the largest networks would also be the artists with the highest popularity scores. In such a way, the networks would be facilitating a process of solidification that could create a wider disparity between more popular, better-networked artists and those who still struggle to heighten their presence on the Spotify platform. However, analysis of the 50-artist dataset shows that this postulate about the trickle-down effects of network size is much more complex and variable in reality.

The most extreme examples of this within the dataset are the cases of the artists Honey Harper and Not3s. Honey Harper is what I would define on the previously set measure as an unpopular artist, with a popularity score of 16. Not3s, on the other hand, enjoys a healthy popularity score of 63, marking him as a popular artist. Both of their statistics in terms of follower count and monthly listeners support this data. Honey Harper sports 338 followers to Not3s’ 92,240. However, none of this seems to feed into the size and the reach of each of their respective networks (See Figures 5.3.1 and 5.3.2). Honey Harper has a very large network consisting of 2,205 artists ranging from her own genre of country-pop all the way to Brian Eno’s ambient art rock. Not3s, on the other hand, has a network consisting of a grand total of 150 artists, spanning genres from his own, grime, to the afrobeat genre of WizKid, the most popular artist within Not3s’ network.

Figure 5.3.1. A general overview of the size of Honey Harper’s network of 2,205 artists. Honey Harper is represented by the small center black node, while the highlighted artist Brian Eno represents the most popular artist within her network.

Figure 5.3.2. An overview of Not3s’ 150-artist strong network. Not3s is pictured in black, while the most popular artist in his network, WizKid, is only connected to Not3s through a hub node artist called Mazi Chukz.

Once again, it can be seen that in these two cases, popularity score has little direct correlation to the strength or size of a given network. There are many possibilities for why this could be the case. It is worth noting the placement of the artists within their own networks when ordered by popularity score. Out of 150 artists, Not3s has the 17th highest popularity score within his own network. This means that only 16 artists in his network are more popular than him, while 133 have lower popularity scores. This allows him to serve as more of a vector toward less popular artists.

However, Honey Harper’s network behaves in a different way. With her popularity score of only 16, she falls in 1,743rd place in popularity within her own network of over 2,200 artists. This places her at roughly the 21st percentile of her network’s popularity scale (1,743rd of 2,205). She is connected to 1,742 artists who have higher popularity scores than her. Therefore, her discoverability position as it relates to more popular artists is much more favorable than the position Not3s’ own network gives him.
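The placement described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in this study: the function name `popularity_rank`, the artist labels, and the toy scores are all hypothetical, and ties are assumed to share the better rank.

```python
def popularity_rank(scores, artist):
    """Return (rank, more_popular, less_popular) for `artist` in its network.

    `scores` maps every artist in the network (including the central
    artist) to a popularity score from 0 to 100. Rank 1 is the most
    popular artist; artists with equal scores share the better rank.
    """
    own = scores[artist]
    more_popular = sum(1 for s in scores.values() if s > own)
    less_popular = sum(1 for s in scores.values() if s < own)
    return more_popular + 1, more_popular, less_popular


# Hypothetical five-artist network, for illustration only.
toy_network = {
    "central artist": 63,
    "artist A": 80,
    "artist B": 70,
    "artist C": 40,
    "artist D": 10,
}

rank, above, below = popularity_rank(toy_network, "central artist")
print(rank, above, below)  # 3 2 2
```

Applied to a full network, the same count yields figures of the kind reported for Not3s: a rank of 17 corresponds to 16 more popular artists and, in a network of 150 with no ties, 133 less popular ones.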

5.4 Popularity Distribution

It is also important to examine trends that occur across all networks. In order to better determine how Spotify as a whole assigns popularity scores, it is valuable to study the whole dataset and not just the new artists that were initially examined. The entire dataset, including every artist encountered in the network crawl, consists of 23,939 unique Spotify artists. These artists range from the extremely popular, like Justin Bieber with a popularity score of 99, to the virtually unknown, like D Cult with a popularity score of 0. The distribution of popularity scores can be an important factor in determining how easy or difficult it may be for any artist to achieve extreme popularity. Using the popularity bins specified earlier, I can examine how popularity is distributed amongst networked artists (See Figure 5.4.1).

Figure 5.4.1. A histogram (bars) and normal distribution (line) representation of all Spotify artists’ popularity scores in the dataset.

The above graph shows a few interesting patterns in how popularity is distributed across the collection of almost 24,000 artists. The first thing to draw attention is the spike at the very low end of the chart. This means that many of the artists in the dataset have very low popularity scores. In fact, 6,757 had scores of 20 or below, classifying them as unpopular artists. That means that 28 percent of artists within the entire dataset classify as unpopular. This makes sense when compared to the mode of the popularity scores, that is, the most commonly recurring score within the entire set, which is actually 0. The exact count of artists who have popularity scores of 0 is 1,425, or almost 6 percent of the entire catalog of artists. Inevitably, these scores weigh heavily on the distribution, which is why the normal distribution displays such a crest at the low end of the popularity score spectrum.

The next interesting discovery in the histogram is where most artists cluster in terms of popularity. It can be seen plainly in the visualization that the largest numbers of artists have popularity scores between 30 and 45. This point is also supported by the numbers. The mean, or numerical average, of the dataset is 31.23. Further support is given by the median value of 32. This puts the largest share of artists within these networks squarely in the “desiring growth” subset as specified earlier. This breaks down to 9,722 artists who have popularity scores between 21 and 40. The desiring growth artists represent a whopping 40 percent of the total dataset.

The final point to highlight is how dramatically the number of artists drops in each sector as the popularity scores rise. The distribution curve takes a deep dive as the popularity scores approach 50. Artist counts then fall almost exponentially until they bottom out near the highest scores on the spectrum. Only 3,728 artists (15 percent) actually have popularity scores of 50 or higher. This valley only gets steeper as the scores increase. As defined in Methodology, “mega-popular” artists are those with a popularity score between 80 and 100. Artists with scores that high account for less than half a percent of the total dataset, with a mere 114 artists having attained “mega-popular” status. The number of artists with a popularity score of 0 is more than 10 times the number of artists in the entire 80-100 subset. Now that the prominent patterns in how popularity scores are distributed have been identified, they can be further dissected to see how discoverability plays into the creation and updating of these ranks. These patterns in popularity distribution raise significant points that will be discussed in the next chapter.
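The shares quoted in this section follow directly from the reported counts. The short Python sketch below recomputes them as a check on the arithmetic. The counts and the total come from the text above; the variable and function names are assumptions made for illustration, and the percentages are rounded to one decimal place, so they agree with the prose only up to rounding.

```python
TOTAL_ARTISTS = 23_939  # unique artists in the network crawl

# Counts as reported in this section; the labels are illustrative.
reported_counts = {
    "unpopular (0-20)": 6_757,
    "desiring growth (21-40)": 9_722,
    "score of 50 or higher": 3_728,
    "mega-popular (80-100)": 114,
    "score of exactly 0": 1_425,
}


def share(count, total=TOTAL_ARTISTS):
    """Percentage of the dataset, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {label: share(count) for label, count in reported_counts.items()}

# How many zero-score artists exist for every mega-popular artist.
zero_to_mega_ratio = (
    reported_counts["score of exactly 0"]
    / reported_counts["mega-popular (80-100)"]
)

for label, pct in shares.items():
    print(f"{label}: {pct}%")
print(f"zero-score to mega-popular ratio: {zero_to_mega_ratio}")
```

On these figures, artists with a score of exactly 0 outnumber the “mega-popular” group 12.5 to 1.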

6. Discussion

6.1 The Power of the People

The notion of popularity has everything to do with people, and therefore people have everything to do with what is popular at a particular time. That’s why attempting to find meaning in what people like or prefer can seem to some like a fool’s errand. However, I am not the first researcher to take seriously what people like and how that affects interactions with technology. As was mentioned previously in the empirical methodology, Born and Haworth embarked on an ambitious study of how time and technology changed the social landscape of how people create and interact with music, with a particular focus on how five contemporary music genres evolved and thrived with extended reach due to the internet. Although their corpus is completely different from that of this research, Born and Haworth’s exploration of vaporwave and the like can be seen as another lens through which popularity via discoverability can be examined.

The dominance of the very current r&b and pop genres on Spotify is a powerful example of the effects of popularity. These genres connected easily and “metastasized” within the networks of seemingly unrelated artists and genres. They tended to cluster together under every grouping, whether by genre, popularity, or followers. No matter the pathway, artists such as SZA and The Weeknd often formed clusters that outsized the rest of the network. Notably, these “mega-popular” r&b artists made up only a small fraction of any dataset they belonged to, yet they appeared in a majority of the 50 new artist networks. On top of that, the edges performed another service for these “mega-popular” artists. Because they were similar in genre, they all connected back and forth with one another. Even several of the edges coming from less popular artists were directed toward the r&b pop clusters.

This could be seen as an advantage for “desiring growth” or “unpopular” artists to receive a boost in terms of being discovered by listeners who may listen to artists like Justin Bieber. However, more often than not, these clusters are hard to surface within because they are constantly surrounded by better-connected artists who have more edges (or pathways) to listeners than the less popular artists do. It would take several clicks to venture away from the artists in the “mega-popular” clusters, and even then there are more pathways leading