
What Google Considers to be “Good”:

Contributing an Alternative Method for Understanding Google’s Search Engine and the Inscription of Societal Norms

Lin Chun Yen 10842365 10 July 2017

MA New Media and Digital Culture Universiteit van Amsterdam

Supervisor: mw. Dr. Esther Weltevrede Second Reader: dhr. Dr. Jan Simons


Table of Contents

Abstract

Introduction

1. How Google Gets to Decide What Is "Good"

1.1 Why Google Gets to Decide

1.2 The Role of Algorithms

1.3 Algorithmic Neutrality

1.4 Is Google Really the One That Decides?

1.5 The Power Google Is Said to Have

1.6 Power Relationships

1.7 What Happens When Google Gets to Define "Good"

2. How Norms Are Inscribed

2.1 Historical Origins of Algorithms

2.2 Monetizing Algorithms

3. How to Study Norms

3.1 Overview

3.2 Challenges

3.3 Methodology and Objectives

4. Case Study

4.1 Google Search Console

4.2 Search Quality Evaluator Guidelines

4.3 Findings

5. Possibilities for Future Research

6. Summary and Conclusion


Abstract

In an online environment in which any user can easily contribute knowledge and in which most documents are digitized, people need an efficient way to collect relevant information in a short timeframe. Following the incorporation of PageRank into its search algorithms, Google has become one of the most used and trusted search engines among web users, especially for its perceived precision and objectivity, which users have come to expect from search platforms. In fact, users have become so dependent on Google for information and everyday decision-making that it is arguable that Google is now mapping our social reality through indexing and prioritizing information. As a result, any item or webpage evaluated as "good" (or "relevant" or "popular") is promoted on the results page by receiving a higher ranking. This evaluation mechanism has a far-reaching influence in society, because the results provided by Google shape what we know, whom we know, what we discover, what we experience and what we trust. However, the algorithms utilized by Google's search engine are carefully guarded secrets, and the precise ways in which Google evaluates information are unknown (only parts of the algorithm, such as PageRank, have been discussed publicly in any detail). This thesis discusses Google's ability to shape knowledge, and proposes alternative methodologies for measuring and examining "what Google considers good" and for understanding how norms are inscribed in the social sphere through Google's complex platforms.

Introduction

Google has often positioned its Web search as deploying a fundamentally democratic computational logic (i.e. the search algorithm), which mainly promotes the web pages with more "votes," that is, those with a higher degree of preference from the public. However, the algorithm actually takes into account hundreds of additional criteria inserted by the software designers (Gillespie, "The Relevance of Algorithms" 180). Combining these diverse criteria produces the results page displayed to the search engine user, shaping the parameters of his or her knowledge about the subject of the search. One example captures the crucial role that top results play for users: according to research conducted in 2013, 32.5% of search traffic goes to the first page of Google results (by default, the 1st to the 15th results are displayed on the first page), and more than half of users stop their search at the second page of results (Chitika). Additionally, according to research compiled on moz.com, studies have found that the first 10 results presented on Google account for 48-95% of the total click-through rate (Petrescu).1 It could be argued that a presence on the first page of results implies a guarantee of legitimacy (Boutet, Quoniam and Smith 446). Since the ranking of the results plays an important role in what is eventually known by the users, it is particularly important for us to understand what lies behind this ranking mechanism, and how it impacts society's perception of the results.

Another way to build a better understanding of how Google's search algorithms influence perceptions of what there is to know, expressed as Google's "power" by some scholars (e.g. Beer), is well illustrated by Pasquale in the opening of "The Black Box Society," where he presents the lamppost story. The story begins with a man crawling around a lamppost in the dark, looking for his lost keys:

“You lost them here?” asks the cop that passes by. “No, but this is where the light is,” the seeker replies.

To Pasquale, “darkness” is a metaphor for how enigmatic new technologies have become, and how the user (or seeker) is so limited in his or her resources for setting the parameters of the search. According to Beer, it is only when we understand what algorithms are and what they do that we can understand their influence and consequences (3). In this thesis, I will address the need to understand further how search algorithms work and how they impact our lives. In short, this thesis analyzes the social and cultural roles of search algorithms.

Related to an increasing awareness of the social influence wielded by the Google search engine, emerging theories have started to look beyond an instrumental discussion of the search engine and towards exploring the possibility of conceiving the algorithm applied by Google's web search as an actor or agent, as an "institution with effects on individual/collective behavior and social order" (Just and Latzer 242), affecting individual consciousness and shaping social norms. Accordingly, one major characteristic of search algorithms is termed their "governing effect" (Just and Latzer 239), or the "power" of algorithms to "decide what matters and to decide what should be most visible" (Beer 6).

1 The research done on Moz.com takes numerous studies into consideration; the studies on Google (US) can be found in Enquiro, Chitika 2010, Optify, Chitika 2013, Catalyst, and Caphyon. Due to methodological differences, the studies display varying results. Methodologies are explained on Moz.com.


For this reason, it is also important to understand the interaction of human agency and corporations in designing the algorithm, whose mechanisms then shape how the process plays out on the Google search engine, what the desired outcomes are and how "relevance" is defined and practiced (Beer 5). Namely, by exploring the way in which Google ranks search results, it becomes more apparent how Google defines and interprets "relevance" when answering our queries, and understanding that pre-determined interpretation gives us better insight into how Google affects the construction of social reality. In this thesis, I aim to follow Just and Latzer's argument for the importance of looking beyond the functional aspects of Google's search engine and to explore further how it creates social meanings by embedding "norms" in the design and construction of the search algorithm, i.e., how the algorithms set the conditions of possibility for us to know and to trust certain information.

In order to achieve these research aims, this thesis focuses on which attributes matter to Google when it gives a web page a high ranking or prioritizes certain content. In addition, this project proposes a methodology for understanding what types of webpages might be prioritized by Google by reading the documents and practicing with the tools offered by Google itself, with the goal of opening up the discussion of how Google assists in constructing social norms. In contrast to the vast amount of technical research on algorithms, critical research on algorithms across the humanities and social sciences has yet to receive enough attention (Kitchin 26). This project thus aims to inspire further research on the construction of the Google search engine and the role it plays in the complex relationships of power that are negotiated and re-negotiated among various aspects of the market, as well as the consequences it brings to society.

To build a valid argument and to develop my research question, this thesis begins with a section on "How Google Gets to Decide What Is 'Good.'" This section includes several subsections on how Google is perceived, namely its reputation for objectivity and neutrality, followed by several subsections on the realities of power relationships that conflict with this perceived neutrality. In other words, this first section addresses tensions between Google's reputation and its actual power to define the criteria for quality and legitimacy online, including a multi-directional dynamic in which Google, its users, and SEO experts perpetually shift the workings of Google's algorithms. This section contributes to a broad discussion of the inner workings of the search engine's selection and ranking criteria, and the effects of its selection features. In the second section, "How Norms Are Inscribed," I examine the linkage between Google's definitions of "good" and the resulting social norms that are built into the social fabric. By analyzing the historical development of search algorithms, I demonstrate that different kinds of design have developed over time and have resulted in different sets of implicit social norms, leading up to the current model of monetized information and advertising. By showing how Google's rankings relate directly to the norms the search engine inscribes, this section provides the contextual background for my case study of norms, an analysis of the kinds of websites Google's search engine prioritizes.

After this survey of previous studies on Google, its characteristics, its role in society and how it inscribes social norms, the next section on methodology discusses the existing critical studies on Google, the challenges involved in critical algorithmic studies, the conclusions reached, and the studies yet to be done on remaining questions unearthed by the existing studies. The section concludes with a discussion of the current difficulties in conducting critical algorithmic studies, including the problem that much algorithmic information is black-boxed (Pasquale). In light of this conclusion, I propose that new approaches are needed to study Google and the social norms it inscribes. This section explains my approach toward building better insights into "what Google considers good" by looking at what is told to webmasters in Google Search Console and what is told to quality raters in the Search Quality Evaluator Guidelines, both explained in detail in the Case Study section. In Section 4, the core approach of my case study of Google, using the tools and documents offered by Google itself, is inspired by the methodology Bucher proposed in her Ph.D. dissertation on programmed sociality: observing the technology itself and what it can be found to suggest. The design of my methodology responds to the need for new approaches to study the kinds of power that algorithmic systems exercise over us, as noted by Kitchin. At the end of the case study, by providing insights into which technical and contextual specifics Google prefers in pursuit of its goal of ranking the better or more relevant information, conclusions are offered regarding which societal norms could be said to be inscribed through these logics of selection and ranking. In the fifth and sixth sections, I sum up the opportunities and limitations of my chosen method as well as the key findings. In addition, I discuss future developments in research on search algorithms, specifically how the subject poses new challenges and opportunities to apply versions of my methodology in future critical studies of algorithms.

1. How Google Gets to Decide What Is “Good”

Before delving into critical algorithm studies, and more specifically the qualities of algorithms and how their operation impacts society (Kitchin 18-19), it is important to clarify the relationship between Google's search engine and its algorithms. In addition, I explain my use of the term "good." Although the core purpose of my study is to look at the norms embedded in Google Web search, its search algorithm is constantly brought up in critical discussion and comparison, because the search algorithm is the "problem-solving mechanism" deployed by Google in its Web search in order to achieve the technical solution it promises to the public (Just and Latzer 239). Therefore, the design of the search algorithm, as well as the debates and consequences around its functionality and effects, is crucial to my research on the norms that Google Web search inscribes in society. Moreover, Google has also created tools and written documents such as the Google Search Console and the Quality Guidelines, which reveal valuable insights for the critical study of the Google search engine (and will be the subject of the case study in this thesis).

Positioning Google's algorithm as returning the most "relevant" information is hardly an accurate description of what Google does. In fact, "relevant" is a "fluid and loaded judgment" in that there is "no independent metric for what actually are the most relevant search results" (Gillespie, "The Relevance of Algorithms" 175), because relevance is "always in response to an informational need" (Weltevrede 102). As a result, relevance is a concept open to further interpretation around complexes of popularity (see, e.g., Gillespie, "The Relevance of Algorithms," Lovink, Van Dijck) and newsworthiness. According to Brin and Page, what Google aims to calculate is the "probability" that a random surfer entering a query visits a given page. Similarly, Weltevrede and Rieder have both argued that relevance is not simply an objective characteristic but rather a response to a specific informational need for the most authoritative sources (Weltevrede 110). Furthermore, PageRank itself contributes to that authority by ranking sources highly, becoming a "status-authoring device" (110). In research probing the problems of the search algorithm's design by tracing the results given over time, Rogers concluded that Google's search engine does not necessarily display viewpoints from a diversity of voices and that "the displayed sources are often the ones that are 'familiar and established'" ("The Googlization Question" 1).

A website with a desirable number of backlinks that is placed on top in the first place is likely to remain on top, having its status enforced by Google, because most users only check the top dozen results (Weltevrede 110); these are the "rich get richer" dynamics built into PageRank (Rieder, "Democratizing Search?" 7). Therefore, instead of saying that Google search finds the most relevant result for the query, it may be more accurate to say that Google recognizes the dominant voice on the web, pushes that voice to more interested parties, and reinforces the popularity of that dominant voice. This process continues in perpetuity, reinforcing a snowball effect.
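These dynamics can be made concrete with a minimal power-iteration sketch of PageRank. This is a simplification of Brin and Page's published formulation for illustration only, not Google's production system, which combines hundreds of additional signals; the toy graph is a hypothetical example.

```python
# Minimal PageRank power-iteration sketch (illustrative only).
# graph: dict mapping each page to the list of pages it links to.

def pagerank(graph, damping=0.85, iterations=50):
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}  # start from a uniform distribution
    for _ in range(iterations):
        # Every page receives a baseline share of rank...
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, outlinks in graph.items():
            if outlinks:
                # ...plus an equal share of each linking page's rank.
                share = damping * rank[node] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Dangling page: redistribute its rank evenly.
                for target in nodes:
                    new_rank[target] += damping * rank[node] / n
        rank = new_rank
    return rank

# Hypothetical toy web: "A" receives links from B, C and D, so its rank
# accumulates, illustrating the "rich get richer" dynamic discussed above.
toy_web = {
    "A": ["B"],
    "B": ["A"],
    "C": ["A"],
    "D": ["A"],
}
ranks = pagerank(toy_web)
```

Because rank flows along inbound links, the well-linked page "A" ends the iteration with the highest score, while the unlinked pages "C" and "D" retain only the baseline share; a page that starts with many backlinks thus tends to keep its lead.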

In addition, a recent study entitled "2017 Local Search Ranking Factors," conducted by SEO expert Darren Shaw, has revealed parameters of "good" as determined by Google. The research was conducted by inviting SEO experts who have been optimizing their webpages and who have experienced changes in Google's ranking logic. The report shows what SEO specialists believe to be the most important factors among all algorithm signals, namely which attributes of a site make it better recognized and promoted by Google. According to the experts, in the localized organic ranking section, Link Signals are still believed to be the most important factor, accounting for 29% of overall importance.2 Following this factor are On-Page Signals (including keywords in the title and domain authority), Behavioral Signals (click-through rate, mobile clicks-to-call and check-ins), Personalization, Citation Signals, My Business Signals, Review Signals and Social Signals (social media engagements) (Shaw). This survey indicates that by fulfilling the above factors on your website, the content you promote will be much more likely to be found and seen by Google Web Search users.
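The survey's "inverse scoring" aggregation can be sketched briefly. The ballots below are hypothetical examples for illustration, not Shaw's actual data; the assumption, following the survey's description, is that with 20 ranking slots a factor ranked 1st on a ballot earns 20 points and one ranked 20th earns 1 point.

```python
# Hedged sketch of inverse scoring: higher-ranked factors earn more points.

NUM_SLOTS = 20  # experts ranked their top 20 factors

def inverse_scores(ballots):
    """Aggregate expert ballots (each an ordered list, most important first)."""
    totals = {}
    for ballot in ballots:
        for position, factor in enumerate(ballot, start=1):
            # Rank 1 earns NUM_SLOTS points, rank NUM_SLOTS earns 1 point.
            totals[factor] = totals.get(factor, 0) + (NUM_SLOTS - position + 1)
    return totals

# Two hypothetical expert ballots (truncated to three factors each).
ballots = [
    ["Link Signals", "On-Page Signals", "Behavioral Signals"],
    ["Link Signals", "Behavioral Signals", "On-Page Signals"],
]
scores = inverse_scores(ballots)
# "Link Signals" tops both hypothetical ballots, so it scores highest overall.
```

Summing these per-ballot points across all respondents yields the overall importance ordering that the report expresses as percentages.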

To Google's engineers, the aim is in fact to have the search algorithm return results that look "right," "treating quick clicks and no follow-up searches as an approximation" (Gillespie, "The Relevance of Algorithms" 175). In other words, it is not relevance they measure, but satisfaction. From this perspective, the selection procedure may be consequential, based on engineers' choices of what counts as satisfaction, creating opportunities "for propriety, for commercial or institutional self-interest, or for political gain" (191). By using the term "good," I aim to emphasize the deeply subjective nature of relevance, which is simply the logic produced by the criteria embedded in the search algorithms. Furthermore, any given "good" webpage is further authorized by Google Web search itself, which bestows a high ranking on the webpage.

2 In the survey, experts were asked to rank the top 20 individual ranking factors based on what they considered to impact the "localized organic" ranking the most. Results were then calculated with inverse scoring logic, where the number-one-ranked factor received the most "points" for that question, and the lowest-ranked factor received the fewest points.

1.1 Why Google Gets to Decide

In short, Google gets to decide because people believe in the functionality it promises, which is to return the most "relevant" answers to their queries through an automated process. This does not seem to be something that could be done by an individual, because the resources online are vast. In the following subsections, I discuss in more detail why people believe in this functionality and how Google is perceived. Specifically, users put their trust in the Google search engine partly because of the way the notion of algorithms is promoted, and because they understand algorithms to be objective and neutral. In other words, users believe that algorithms are the best tools for returning the best results. Understanding why people put trust in Google and rely on it to return knowledge clarifies why Google has such an impact on users' knowledge construction that norms can be inserted into society based on Google's selection logic.

1.2 The Role of Algorithms

The social power of algorithms is often analyzed through their functional and architectural significance (Just and Latzer), focusing on the concept that algorithms have the ability to select and prioritize information (Beer 6). While this is a major concept in analyzing algorithms, and the consequences of this ability will be discussed below, I invite readers to take a step back and contemplate what the word algorithm means and the sense of assurance this term implies, that is, the power invested in the term. Gillespie suggests a similar approach in his analysis of the term platform. The term platform, which plays a role in how Google and the Google-owned company YouTube position their websites, generally implies neutrality ("The Politics of 'Platforms'" 350). According to Gillespie, however, it is a strategic use of terminology by which Google shapes public discourse, and a careful intervention to establish the criteria by which this technology will be judged in the future ("The Politics of 'Platforms'" 359). The purpose here is to analyze the power of terminology, not to make the conspiratorial claim that corporations intentionally deceive users about their services and functionality. However, it is important to take into account the power of the term, how it might be misinterpreted and how it shapes users' understandings. By discursively framing itself as a platform, Google has successfully given itself the image of an information gatekeeper, distinct from its technical affordances, and has received legislative protections as well as a favorable discourse on its obligations and responsibilities (Gillespie, "The Politics of 'Platforms'" 358).

While the term platform was used by Google to explain the kind of service it offers, the term algorithm is deployed to explain the search engine and its operational logic ("How Search Works"). I argue that by promoting the concept of algorithms, Google has already acquired a desirable discourse that leads to a form of power. I suggest a new angle on the notion of the algorithm in two steps. First, the algorithm as a term is perceived as being objective and as making trustworthy decisions. Since it works as a set of mathematical and computational calculations, it is relied upon to return a fixed result for a given input. And when Google deploys the concept of algorithms on a platform that connects query to result, the result (answer) to any query (question) is then perceived as reliable. In other words, it might be perceived as returning "the answer" on whichever topic the users would like to explore. Second, similar to deploying the term platform for its neutrality, the notion of the "algorithm" is wrapped in mechanical rationality and further convinces the public of its truth-producing function. Put differently, the algorithm is largely trusted for its precision and objectivity, and might be understood by the public as a truth producer (Beer 8). As a result, Google possesses the authority to define truth and exercises the power of controlling the outlet of "truth." While there is no denying that algorithms in their purest form operate through logic outside of human interference, their complexity is often overlooked.

Algorithmic decisions are often perceived as neutral, efficient, objective and trustworthy (Beer 10-11). And while there is certainly a need to investigate Google's algorithms for the social impacts of their working logic, I believe the cultural prominence of the notion of the algorithm itself also reveals important exercises of power through Google's web search.


1.3 Algorithmic Neutrality

As illustrated by the notion of the algorithm, the way Google Web search is perceived can already confer power on Google or allow power to be exercised through it. The way the public perceives the functionality of algorithms, for instance their computational character, creates a lure of objectivity, and trust in that objectivity can lead to a form of control. To Weltevrede, Google's search engine has become our dominant knowledge logic because the public has been positioning PageRank as a neutral device (111). Efforts such as pulling out of China help to reassure users of Google's objectivity (Gillespie, "The Relevance of Algorithms" 180). However, whether Google's search engine is truly neutral remains debatable.

In his article "Thinking critically about and researching algorithms," Kitchin argues that algorithms are "far from being neutral" (19). By their nature, according to Kitchin, algorithms are in fact made to seduce, coerce, discipline, regulate and control, or to "create, maintain or cement norms," as noted in Beer's research (6). The authority of algorithms is also questioned by Beer. As described previously, algorithms have a status-enforcing, information-prioritizing function created to deliver the supposedly desired (or relevant, or popular) result to users. Through the search algorithm and its ranking and profiling systems, knowledge is not simply conveyed to us but is in fact co-produced (van Dijck 575).

However, the understanding that Google hardly subscribes to neutrality and transparency (van Dijck 583) is seldom reached among the public. One possible explanation for this lack of awareness is that users believe in the neutrality of the apparatus, and that most users lack the technical consciousness to be aware of its other mechanisms (van Dijck 586). Since algorithms are written by humans who insert countless decisions into their design and development, forming a human-technological system (van Dijck 575), biases, errors and subjective decisions are likely to persist (Morozov 21, Caplan and Boyd 7). Because "there is no search without bias," as asserted by Rieder and Sire, it is important for us to examine the "Google dilemma": a search engine that is bound to be questioned for its "systematic bias" as well as its "selective bias" and editorial effects (203). Because the search result is data-driven, namely it returns the results that score highest on the signals included by Google, racism may also be embedded when biased user feedback is received by the system (Caplan and Boyd 7).


The reason why neutrality is such an important debate is that the assertion of "algorithmic objectivity" is crucial for Google to function and to maintain its status as the legitimate information broker in our society (Gillespie, "The Relevance of Algorithms" 180). Google's PageRank has commonly been perceived as a neutral device because it is "objective, automated and reuses the associations of others" (Weltevrede 111). However, both the terms objective and automated are subject to suspicion by researchers. As far as the debate over objectivity goes, human intervention in the algorithm is the major evidence against it, as mentioned previously. According to Gillespie, the objectivity promised by the algorithm is by no means the same as the objectivity promised by journalism, and both are overstated. With both journalism and algorithms, the practices for neutrally defining relevance are not clear to the public, and depend on having their accumulated guidelines for the "proceduralization" of information selection trusted by the public (Gillespie, "The Relevance of Algorithms" 181). In fact, Google's algorithmic system is so complicated that no Google engineer fully understands it (Morozov 21). Also, as Rogers observes, unlike news outlets that use generic wording and deploy style guides, Google may show different results, with different sides of an argument or different emphases, based on the keywords entered in the query (82-83). Moreover, Google operates as a socio-epistemological machine because its results page varies when the language of the query or the location setting of the search engine is changed (87). Nonetheless, by insisting on algorithmic objectivity, Google's founders can "carry on with their highly political work without noticing any of the micropolitics involved" (Morozov 21).

As discussed above, it is reasonable to say that Google's absolute objectivity and neutrality, as a machine isolated from human intervention and societal feedback, is only a corporate-designed illusion. Moreover, algorithms are better described as part of broader power dynamics, constructing the parameters of truth and reinforcing norms (Beer 11).

1.4 Is Google Really the One That Decides?

This subsection takes up the question of the extent to which Google truly "decides" the social norms that have been discussed thus far. After weighing the "power" and social effects Google is said to have or not to have, the following subsections further illustrate the complex relationship between the Google search engine and its users. Google does get to define what is prioritized through the design of its search algorithm, in terms of which criteria are considered and preferred, and hence holds the power to shape knowledge, construct reality and inscribe norms. However, the users and the market also influence how Google Web search works and how the algorithms are designed. Indeed, the users and the market have such an ability to influence Google Web search that some say they are able to change the way Google designs its algorithm, though many believe that the algorithm is already self-stabilizing and compromised.

1.5 The Power Google Is Said to Have

Search algorithms, first designed to make academic research easier online, were created as tools for information retrieval (Van Couvering 182). They act as mediators that sort, classify, filter and prioritize information for users. Gradually, given the immense amount of information online, it has become crucial for us to rely on machines to pick the best results from the mass of sources; media and information can only be utilized efficiently via search algorithms (Just and Latzer 242). Therefore, we seem to be hooked on this retrieval tool (Lovink 1) and have come to rely on its functionality. As a result, we depend so much on Google Web search to retrieve information that we have allowed algorithms to shape outcomes and opportunities for us (Beer 5). According to Rogers, the Google search algorithm, framed by the PageRank logic, has the power to authorize information by delivering the most "deserving" web pages ("The Googlization Question" 4). By defining which story is more deserving of public attention, Google thus shapes our knowledge by telling us what to know, or what the "majority" agrees are the right things to know (especially since PageRank values its voting mechanism highly). Through this sorting process, Beer argues, our cultural experiences and social connections may be limited (7).

To probe the social power of Google's search algorithm in its recommendation patterns, I propose to divide the discussion into two of the main dimensions involved in the recommendation system, both widely discussed in software studies: inclusion/exclusion and relevance. Inclusion refers to Google's predetermined factors governing which data gets indexed in the first place. That is to say, some information on the web is never seen because it has never been indexed. And through excluding, penalizing and promoting, Google can further alter the visibility of indexed information, a power said to insert "selective bias" into the user's results page (Rieder and Sire 203). Both the inclusions and exclusions made by Google predetermine what users can and will know about a certain topic, and this predetermining role of database design and management has in fact been largely overlooked (Gillespie, "The Relevance of Algorithms" 171). The other social power said to be embedded in Google's recommendation patterns is the fact that Google retains the final decision-making power over what determines relevance, a term examined in earlier sections of this study.

As a consequence of allowing Google to determine what is most relevant (or what I term "good"), our daily activities are increasingly shaped by it (Just and Latzer 239). As Kitchin puts it, algorithms have disruptive and transformative effects (26). Algorithms are vested with particular powers in "autonomous, automatic and automated ways" as they come into contact with people in their everyday lives (26), and there is little doubt that the public sphere is indeed constructed through network technology that can be easily manipulated (Caplan and Boyd 15). Therefore, in the discussion of the power relationships and social impact of algorithms, an important approach is to look at exactly which players are manipulating algorithms, and who benefits from the norms inscribed (e.g. Rieder and Sire). In other words, it is important to pin down the crucial parties who accumulate capital by taking control of algorithms, and to further unpack the key question of "what values, peoples, and voices should have power" (Caplan and Boyd 15). Essentially, by shaping our knowledge across a wide spectrum, algorithms co-govern or co-determine both our economic and social choices, resulting in a "governing effect" on social behaviour that interplays with norms and eventually contributes to the construction of reality (Just and Latzer 246-247).

Additionally, in his essay "Google's PageRank Algorithm: A Diagram of the Cognitive Capitalism and the Rentier of the Common Intellect," Pasquinelli sees Google as a "global rentier" that has acquired the dominant position in the production of "network value" (10). According to this analysis, Google, playing a very limited role in generating content, focuses mainly on exploiting the "new lands of the internet" and on possessing the most efficient methods to access and measure collective intelligence (10). Capitalism is then inscribed in the ranking and indexing of the collected information. By determining each web page's ranking position and assigning it a value, Google sets a "rank value" for every page. This "rank value" is then (unofficially) recognized as a form of currency for the value that attention will bring to a specific site (7). Moreover, this accumulated "attention value" can be transformed into enormous monetary value, AdWords being one famous example (Pasquinelli 6, Mager). Being able to redistribute value from our common intellect and to direct traffic, then, makes Google one of the most important "rentiers" in the world (Pasquinelli 10).
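Since the "rank value" discussed here originates in the published PageRank recurrence, a minimal sketch may make the mechanism concrete. The following toy implementation uses simple power iteration over an invented three-page link graph; it illustrates the published formula only, not Google's actual (and secret) ranking system, and the page names are hypothetical.

```python
# Illustrative sketch of the original PageRank recurrence,
# PR(p) = (1-d)/N + d * sum(PR(q)/outdegree(q)) over pages q linking to p,
# computed by power iteration. Not Google's actual implementation.

DAMPING = 0.85  # the damping factor d from the published formula

def pagerank(links, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = set(links) | {p for targets in links.values() for p in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1 - DAMPING) / n for p in pages}
        for page, targets in links.items():
            if targets:  # distribute this page's rank over its out-links
                share = rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += DAMPING * share
        rank = new_rank
    return rank

# A toy web: page "a" receives links from both "b" and "c", so it
# accumulates the highest "rank value" -- the attention currency
# Pasquinelli describes.
toy_web = {"a": ["b"], "b": ["a"], "c": ["a"]}
ranks = pagerank(toy_web)
print(max(ranks, key=ranks.get))  # "a" ends up ranked highest
```

The hyperlink-as-vote logic is visible in miniature: the page with the most incoming links collects the most rank, which is precisely the mechanism SEO link schemes later set out to exploit.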

According to Pasquinelli's analysis, Google has created the opportunity to accumulate network value with its PageRank algorithm, and has further transformed that value into actual monetization through AdWords. Without producing its own content, Google profits from linking traffic to advertisers (traffic that Google directs according to the network value it assigns). Specifically, advertisers pay AdWords for their content to appear on top of relevant Google Search result pages and on partner sites. In addition, Google has developed AdSense, which delivers advertisements created on AdWords to partner sites and shares the profit (on a per-click or per-impression basis) with the ad publisher. According to Pasquinelli, Google has conquered a dominant, monopolistic position in the production of network value through PageRank. As a result, Google is the new rentier of the world of the Internet. Used in this sense, "rent" is a parasitic income that property owners can acquire simply by possessing an asset currently occupied by a user or tenant. In the context of Google, the search engine "suggests" a form of differential rent on its dynamic space and indicates which information deserves more attention (Pasquinelli 11).

1.6 Power Relationships

While Google gets to decide what to promote through the design of its search algorithm, Google's priorities are not the only factor that makes Web search the way it is presented now. However, several weaknesses of users, such as lacking sufficient knowledge to criticize the algorithms or relying too heavily on the functionality promised by the Google search engine, may make it difficult to resist the power exercised by Google's platforms. Through an exploration of the three-sided market theory, the role of SEO experts and the weaknesses of users, this section examines the power relationships between Google and its users, how power is negotiated and re-negotiated, and why Google as a result has a norm-inscribing function in society.

To begin with, the three-sided market concept, as explained by Rieder and Sire, demonstrates how advertisers, users and Google Web search depend on one another:

On one side, Internet users query the engine to find information, entertainment, and so on. On a second side, Google indexes "content providers" that want users to reach their websites. On the third side, advertisers are trying to attract visitors beyond the traffic received from "organic" results.

In order to support this multi-sided market built on a single platform, the end-user is in fact a crucial element in keeping the cycle sustainable. It is therefore important for platform owners to decide whom to subsidize and in whom to invest in order to strike a profitable balance. In the case of Google's three-sided market, Google appears to subsidize both the users, through free search, and the content providers, through free indexing and free traffic directed from its result pages, charging only the advertisers to finance the entire platform. However, web provider and user activities also contribute to generating profit for Google (Mager 8), in that user data can be collected for profitable activities (Cheney-Lippold 168) and organic content is valuable in itself, as it builds the database of information for Google to rank (Pasquinelli 2).

Furthermore, the collection of user data is described as a "goldmine" for the search engine because it makes advertising more effective by connecting advertisements to users based on the interests and desires indicated by their online behavior (Mager 8). Mager even argues that Google's largest value no longer resides in the algorithm itself, but in the customer data it collects (8). Seen in this light, Google has created a promising position for itself: if it keeps all three sides of the market satisfied and productive, it will be profitable at maximal capacity. However, in order to keep advertisers and users satisfied, it has to constantly improve its search technologies (Rieder and Sire 200) and to make sure that the public discourse of its technical neutrality and progressive openness is retained (Gillespie, "The Politics of 'Platforms'" 360). Also, while paid advertising is not the main focus of this thesis, which discusses only "organic" results as the product of search algorithms, paid ads do affect organic websites, since they direct away a good amount of user traffic. More specifically, with PageRank's sorting logic causing a "rich-get-richer" effect, many smaller content providers find few ways to obtain organic user traffic, because all of the "eyeballs" move vertically along the left side of the result page, including the ads placed in the top-left corner (see the heat map: Mediative 12). As a result, many organic content providers have come to believe that they too need to obtain more traffic through paid ads or SEO. In this way, organic links are perceived by Google as opportunities for profit (Rieder and Sire 205), and this need for attention and user traffic among content providers has further stabilized Google's algorithm and business model (Mager 6).

The power relationships between Google and its users often operate multi-directionally; this dynamic is especially apparent in the case of search engine optimization, or SEO. SEO specialists emerged at the end of 1999, during the early development of Google (Rimbach et al. 41), and the practice was encouraged especially after the introduction of PageRank, as a means to gain higher visibility and increased traffic (Mager 3). With each new algorithm update, SEO approaches have changed, and in a reverse trend, some algorithm updates were made precisely to combat SEO manipulations (Fiorentini 28). In fact, as Fiorentini argues, "SEO is nothing but reactivity to search engine's measurement of websites" (6). The development of SEO has therefore not only recorded the changes in the algorithms, but has also shaped their modifications. This dynamic constitutes an active and intertwining relationship between webmasters and Google's engineers, a push-and-pull relationship that perpetuates itself with each new algorithmic modification and subsequent response by SEO engineers.

Early SEO strategies mainly focused on the factors of "domain and URL selection, title-tag, meta-tags, header-tags, bold-tags and keyword-density," and much link exchange and link spamming was carried out in order to be preferred by PageRank (Rimbach et al. 41). After topic-sensitive logic, which also takes subject-specific popularity into account, was implemented in Google's search engine in 2000, SEO practitioners began to incorporate hypertext content into their webpages. On 16 November 2003, the well-known Florida Update was implemented, and its topic-sensitive page ranking concept changed Google's ranking massively (44). Seeing the change in the algorithm, SEO strategies were soon adapted, for instance by leveraging automatic inclusion scripts on other websites and posting specific URLs in discussion forums and articles (45). The implementation of these SEO approaches exposed the "weakness of automated algorithm" and led to algorithmic updates to tackle it (45). The implementation of "Big Daddy" and its subsequent updates, for instance, was not intended to enhance the algorithm's index but to tackle canonical problems, implement quality rules and penalize outbound hyperlinks to low-quality sites (46). Since PageRank has been a major feature of Google's platform, hyperlinks are important as votes for a site's popularity and relevance. Therefore, many SEO practitioners have developed complex methods taking advantage of this specific logic: from the simple "reciprocal links" of the early 2000s, to the exchange of triangular links, to closed circuits of link exchange such as the "butterfly" and the "Linkwheel," which group a high number of websites and "create mininets to promote and reference one another in a repetitive way" (Boutet et al. 450-451). As exemplified by the group work of Boutet, Quoniam and Smith, "it is possible for a single individual to position a website on the forefront of search engine results in a sustainable manner without being illegal" (457). Moreover, Caplan and Boyd raise the concern that SEO can be used to game the algorithm, allowing politicians, activists and companies to benefit from higher visibility (8). It also has to be taken into account that, while developing high-quality web pages is positive for users, the interplay of algorithm and SEO development has pushed both players (Google and SEO experts) to promote homogeneous information, excluding structural diversification of content (Rimbach et al. 47). These findings raise the question of whether the resulting homogenization of content is beneficial or harmful to users.

Although it seems possible to influence the way Google designs its algorithms, because advertisers have financial resources, users contribute content and traffic, and SEO specialists hold knowledge and skills in software development, users often remain vulnerable to the decisions made within this complex power structure. As discussed earlier, Google may hold the secret of its algorithms, but users (semi-passively) provide the collective intelligence on which Google's apparatus relies (Pasquinelli). Additionally, within the three-sided market, maintaining user satisfaction and keeping users on the platform is key to sustaining Google's business model (Rieder and Sire). It is therefore apparent that users also play a significant role in stabilizing the system, and that declining user satisfaction is a serious concern for Google's software developers. In other words, the conditions of search can change in response to users' concerns. In fact, according to Gillespie, "algorithms can be easily, instantly, radically and invisibly changed" ("The Relevance of Algorithms" 178), and software developers constantly conduct research to understand "how humans habitually seek, engage with, and digest information" (174). As illustrated by these insights, Google continually runs A/B tests on its search results in order to gather data on user satisfaction, and incorporates subsequent upgrades of the algorithm accordingly.
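As a purely illustrative aside, the logic of such an A/B test can be sketched with a textbook two-proportion z-test comparing click-through rates between two result-page variants. The counts below are invented, and this is of course not Google's actual evaluation pipeline.

```python
# Textbook sketch of scoring an A/B test on search results with a
# two-proportion z-test on click-through rates. All counts are invented;
# this is not Google's actual evaluation method.
import math

def ab_z_score(clicks_a, views_a, clicks_b, views_b):
    """z-score for the difference between two click-through rates."""
    p_a, p_b = clicks_a / views_a, clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    return (p_b - p_a) / se

# Variant B (a tweaked ranking) vs. variant A (the current ranking):
z = ab_z_score(clicks_a=1000, views_a=20000, clicks_b=1150, views_b=20000)
print(round(z, 2))  # |z| > 1.96 would be significant at the 5% level
```

Even this toy version shows what is at stake politically: what counts as an "improvement" is whatever moves a behavioral metric the developers have chosen, not what users have deliberately asked for.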

However, while users play a role in these multifaceted dynamics, their role is much more passive in comparison to the active, deliberate decisions made by the other players described here. Although users may have a voice in influencing the algorithm, they may not know enough about the workings of the platform to play a deliberate, democratic role in changing the algorithms to directly serve their best interests. That is, users may not realize what norms the algorithm is promoting or what financial activities lie behind it, and therefore do not exercise real power to make a difference. In fact, many users' understandings of algorithms are too simplistic and even mistaken (Gillespie, "The Relevance of Algorithms" 185).

Therefore, as proposed by Pasquale, one possible way to balance the power relationship when users interact with black-boxed software is to actively promote public values on the Internet, and to urge regulators to deploy technologically savvy contractors (16). By setting legal boundaries and establishing legal procedures to "detect and deter fraud, abuse and unnecessary treatments" in the algorithm, it is possible to ensure that the search engine remains relatively honest (16). Gillespie likewise argues that legislators have done too little in terms of fair commerce and political discourse, allowing information providers to shield valuable information as trade secrets ("The Relevance of Algorithms" 185) and, in Pasquale's terms, to "cherry-pick" the most positive content for publication (10). Mager suggests that mass media researchers as well as activists should raise more critical debates around this topic within software studies (12). In addition, in order to keep up with the fast-growing, ever more sophisticated coding abilities of developers, social and corporate leaders should popularize software studies and education about information networks, enriching our analytic skills regarding tech companies and their software machines (van Dijck 586). Another way to look at the power relationship is that, perhaps, while Google is trying to steer public discourse, users also give up certain rights intentionally. As explained by Mager, users' ignorance stabilizes the algorithms and their economic logic (9), and some users are willing to "enter alliances with search engines to reach their goal of conveniently finding web information they want" (10), because we fear slower performance and because "serendipity requires a lot of time" (Lovink 6). Seen from the perspective of the three-sided market, how it is formed and sustained, users and Google have formed a virtuous circle wherein "activities mutually sustain each other" (Rieder and Sire 207). As Kitchin puts it, one way to study algorithms is to look at how people engage with, and are conditioned by, algorithms, and how algorithmic systems reshape organizations (26). In particular, research could focus on examining the ways people "resist, subvert and transgress against the work of algorithm," and how users "re-purpose and re-deploy" algorithms "for purposes they were not originally intended" (26). Nonetheless, it is also important to keep in mind that algorithm development and user behavior can be seen as a recursive loop (Gillespie, "The Relevance of Algorithms" 183): while algorithmic updates impact search behavior, user behavior also shapes and rearticulates the functioning of the algorithms. According to Gillespie, we must look at algorithms in their multidimensional entanglement with "the social tactics of users who put them up," which goes through a more "varied, organic and complex process" ("The Relevance of Algorithms" 184).

Moreover, Gillespie believes that, if algorithms fail users' expectations, this failure may result from a fundamental vulnerability we now face in society, one that can never be fully resolved ("The Relevance of Algorithms" 191-192). According to this analysis, we are entering the age of a complex society in which a division of labor is necessary: while "some produce and select information," the rest will "take it for what it's worth" (Gillespie, "The Relevance of Algorithms" 191). This process of production and selection by algorithms faces the same challenges as every public medium of the past: there is always the "distinct possibility of error, bias, manipulation, laziness, commercial or political influence, or systemic failures," because the procedures are all unavoidably selective (191). To be more specific, the fundamental paradox of information selection, or knowledge formation, can be better understood by comparing a more traditional medium (journalism) with the algorithm, both of which claim to resolve the problem of human knowledge (192). Algorithmic selection may be seen as "closer to statistics" than journalistic selection, because human decisions are further expressed and transformed in the ways in which "procedures are imagined, discussed, implemented and managed" (Rieder and Sire 196); algorithmic logic depends on the proceduralized choices of machine logic, which automates "some proxy of human judgment" (Gillespie, "The Relevance of Algorithms" 192). However, despite the different approaches, both mediums remain highly problematic and have deeply struggled with the human means that bring error, bias and manipulation (192). As a result, we can only regard the algorithm as, not necessarily a better, but a "new knowledge logic" (192) constructed socially and institutionally. Similarly, Pasquinelli expresses his concern that, in a context of monopoly capitalism, "PageRank and Google cannot be easily made more democratic" (12).

To summarize, the Google search engine is indeed a tool that not only shapes society but is also shaped by multiple parties such as SEO experts, advertisers and, of course, end users. However, SEO experts and advertisers may not treat social responsibility as their priority, and users may have neither enough knowledge to judge nor enough alternatives to choose from when it comes to search engines. Changes and influence are, moreover, harder to achieve when information about the search algorithm is black-boxed. As a result, users and society at large remain quite vulnerable, with little ability to be conscious of the impact imposed by the Google Web search engine and little ability to combat this social impact or to play a more active role in shaping it. The persistence of this dynamic highlights the importance for researchers of understanding the social impact of the Google search engine.

1.7 What Happens When Google Gets to Define "Good"

The previous sections established the need for further research on the social impact and power Google exercises by executing its designed (and secret) knowledge logic. In this sub-section, I examine the consequences that follow when this knowledge logic is heavily relied upon in larger social spheres: Google gets to practice its business model, shape what we know and construct our reality. Consequently, social norms are inscribed and reinscribed in society.

Tracing back to the beginning of Google's commodification, Van Couvering proposes a strong argument about the current business nature of the search engine: while algorithms were not designed to profit from user data and ranking power, the engines that did not adopt this business model have all become history. Search engines made various attempts but only succeeded by building networks based on the control of Internet traffic rather than the generation of Internet content. In line with Van Couvering's analysis, Pasquinelli also points out that the algorithm produces no content of its own, locating its power instead in its control over the common intellect. Van Couvering thus argues that the algorithm is what it is today because of competition in the market, and because the development of technology is mediated not only through social and political contexts, but through a capitalistic one. Although embedding a business model may not have been the original intention in building a search engine, it is no longer accurate to analyze Google's social impact without acknowledging its priority to profit. As Gillespie puts it, search engine owners and designers are well aware of the need to maintain the promise of objectivity in order to legitimize many of their technical and commercial undertakings and to help "obscure the messier reality of the service it provides" ("The Relevance of Algorithms" 182). This messier reality includes the fact that Google's ultimate goal is to make a profit for the corporation and its investors rather than to offer a public good. And given its business model, the corporation may well have designed its algorithm to capture the value chain of audience attention (Van Couvering 203) and to deliver content that appeals to the largest audience possible (Rogers, "The Googlization Question" 1).

As discussed earlier, many studies have pointed to Google's power: when Google practices this new knowledge logic in our society, it shapes our knowledge and constructs our reality by telling us what to know and whom to believe. As a result, this "automated assignment of relevance to selected pieces of information" has become "a growing source of and factor in social order" (Just and Latzer 253-254). What should be emphasized here is that this growing source is hosted and controlled by a private corporation, designed for the company's own good, and its construction is kept unknown to the public. That is to say, when a private corporation with its own interests and concerns contributes to the construction of our social reality, public interest goals and social responsibilities are weakened, which eventually leads to new social inequalities (Just and Latzer 254). Moreover, the construction of Google's search algorithm is said to incorporate machine-learning technology (Sullivan), which, on the one hand, may help Google understand our queries better, yet on the other hand makes the algorithm less controllable and its social impact less predictable. In addition, the algorithm's accountability becomes even harder to judge.

Google does not create content or generate information on its own. Instead, it prioritizes according to the criteria inserted by its algorithm designers. Therefore, a major reason that Google can take part in altering the social order is its ability to normalize concepts, knowledge and values for us through its search engine, which then affects collective and social behavior. As Beer puts it, the algorithm holds "powerful and convincing sway in how things are done or how they should be done," namely, the ability to insert, reinforce and cement norms (Beer 2). In other words, when Google returns what it considers the "good" answer to our query, it can steer society in a certain direction by normalizing particular concepts and ideas. As exemplified by Just and Latzer, algorithmic selection may lead to an increase in "individualization, commercialization, inequalities, and deterritorialization" (255). It may also decrease "transparency, controllability and predictability" from the perspective of our social order (255).

To conclude this section: I have first explained that by using the term "good" instead of the commonly used "relevance" and "popularity," I aim to emphasize that which term best describes the kind of information prioritized by Google is debatable. When the algorithm of Google Web search is designed, subjectivities, biases and private incentives are also embedded, both intentionally and unintentionally. Evidently, a great deal of expertise, judgment, choice and constraint is exercised in producing algorithms (Kitchin 18). And as Kitchin puts it, algorithms are created for purposes that are often far from neutral: "to create value and capital; to nudge behavior and structure preferences in a certain way; and to identify, sort and classify people" (18). As a result of forming and stabilizing its relationship with users, Google then gets to define what is "good" through its algorithmic sorting process, and further gains the ability to steer public discourse, to shape and condition what we know, and eventually to insert norms into society. And while the powers and governing effects of search algorithms have been interrogated and validated in the past, the norms they inscribe have not received much attention from researchers in the humanities. As highlighted by Beer, more studies are needed that look not only at the algorithmic system itself but also at the social power resulting from these phenomena (11). In response to this need, I present a possible methodology for further analyzing the norms Google inscribes in society, by gaining better insight into what types of web pages and content are made more visible to users.

2. How Norms Are Inscribed

In the previous section, I explained my use of the term "good" to allow for an open interpretation of the kind of information returned to our queries, if not one based on relevance. And because what Google returns as top results is often perceived as trustworthy information, Google further works to authorize the information it considers to be good. Eventually, the status of the top information is enforced by Google, and it remains the most visible answer to users' queries. "Algorithms act," Goffey argues, as "part of a complex of power-knowledge relations," and the behavior they program has consequences (19). Applied to the characteristics of Google's search algorithms: although Google does not produce information itself, defining what is "good" in response to a user query allows it to condition what users know, reinforce dominant voices, map reality and construct a knowledge logic. Consequently, Google normalizes ideas and concepts, cementing norms in society.

The norm-inscribing ability of Google Web search, through its automated algorithmic selection on the Internet, has been highlighted in existing research. Because Google's tools for providing "solutions" are often experienced as processes that "simply and unproblematically 'work,'" Google has to some degree normalized its knowledge logic "as right as its results appear to be" (Gillespie, "The Relevance of Algorithms" 187-188). Furthermore, as this knowledge logic is accommodated and used regularly, it not only shapes what we know but also "lead[s] users to internalize their norms and priorities" (187). Similarly, Just and Latzer explain that because algorithmic selection shapes individuals' realities, it affects the norms and values of societies (246). The term "norm," as defined in the Cambridge Dictionary, is "an accepted standard or a way of behaving or doing things that most people agree with" ("Meaning of Norm"). A norm creates the common logic of how things should be done (Beer 2). By deciding what should be made most visible, the power of norm inscription resides in the ability to feed information and to "shape what [users] know, who they know, what they discover and what they experience" (Beer 6). By this definition, norms are not necessarily the content of knowledge, but its parameters: not the "what," but the "how," the shape that knowledge is allowed to take.

2.1 Historical Origins of Algorithms

To fully understand the current social functioning of algorithms from the political perspective of understanding capitalism, it is important to understand the history of their development. The inscription of societal norms is sometimes seen as an unintended consequence of algorithmic phenomena (Goffey 19). As search engines evolved and were incorporated into business models (Mager), the software was eventually designed to "mobilize money and media for private gain" (Pasquale 10), which consequently began to have real impacts on the ability of algorithms to affect politics and society. However, search engines were not first created for commercial purposes: they were initially designed in the early 1990s to support academic and research institutions in information retrieval. During this period, most search engines were developed by institutions, had no commercial concept, and tested various models to sustain a business. With Yahoo, WebCrawler, Lycos, Excite and Inktomi developed by computer science specialists, many companies had "no business plan," according to the Chief Financial Officer of Lycos (Van Couvering 184). The first revenue plans brought to the industry were advertising, licensing (of search engine technology) and sponsoring (Van Couvering 190). Yahoo, by transitioning to a "professional commercial service" ("Yahoo! forms marketing powerhouse to design and manage its new look"), was reported to have achieved revenue totaling $8,551,000 by 1996 ("Yahoo! Reports Fourth Quarter Profit"). As more companies entered the search engine industry, the market became competitive and many companies were bought, sold or integrated. Vertical integration was especially common in the late 1990s and early 2000s, running from hardware companies to software, browsers, telcos, ISPs, search engines and, eventually, destination websites (Van Couvering 187). During this period, while some companies still sold content or software services to users and companies, many focused on selling audiences (traffic) to advertisers and provided channels featuring content from advertisers. The added value of user attention was explored in the late 1990s, as companies started trying to control and calculate what users were seeing. Determining the definition of "good" was not yet a main concern; companies only wanted to maximize their user base in order to increase advertising revenues.

Throughout this period, many companies also explored other possibilities to stabilize their financial situations, but most of them failed. Many corporations attempted vertical and horizontal integrations with search engines to achieve scale and market domination; notable examples include NBC's purchase of Snap!, CMGI's of AltaVista, @Home's of Excite and Terra Network's of Lycos (Van Couvering 193). However, none of them really succeeded, and some either went bankrupt or were sold again. According to Van Couvering, this may have been part of the larger issue of the "dot-com crash" (193), or the "dot-com bubble." Between 1995 and 2000, society faced rapid technological advancement, and the commercialization of the Internet in particular showed great performance in terms of capital growth (Investopedia). Many investors were therefore pouring money into Internet-based companies, and some had "abandoned cautious approach for fear of not being able to cash" in on the promising tech wonderland (Investopedia). The market, however, started to crash in 2000: within one month, from March to April, the combined value of stocks on the NASDAQ dropped by nearly a trillion dollars (Geier). According to an article in Time, while seventeen dot-com companies had paid for ad spots during the 2000 Super Bowl, the number declined to three the following year (Geier). Nonetheless, not having a concrete business model was certainly a key problem. "We really couldn't figure out the business model," recalled Michael Moritz, a major Google investor, in the book The Search (Oremus). He admitted in the book that "things were looking pretty bleak" for Google for a while.

2.2 Monetizing Algorithms

Starting in the early 2000s, many tech companies achieved strong financial performance with the new traffic-selling model. From the last quarter of 2001 to the second quarter of 2005, US quarterly online advertising revenue almost doubled, from around 1.5 billion to nearly three billion dollars, according to the Internet Advertising Bureau (Van Couvering 197). In fact, toward the very end of the 1990s, Gross had taken Inktomi's search engine and introduced the new concept of paid search. At this time, search companies including Google and Inktomi were struggling with spam in their search results, and Gross started to invite websites to pay for top placements, believing that if a company pays to reach its potential clients, its content must be relevant, and that this would also be more cost-efficient than other types of advertisement (Oremus). In other words, the "good" results were the ones that were purchased. In 2001, the company changed its name from GoTo.com to Overture and focused exclusively on this paid network model; it was later sold to Yahoo in 2003 to help Yahoo compete with its major rivals Google and Microsoft in Web search advertising (Olsen). Advertising in this period had three key characteristics: pricing on a cost-per-click basis, being contextual (focused on search terms or linked to page content), and being syndicated to other websites and providers of the paid service (Van Couvering 197). In 2002, after refusing Overture's proposed partnership to place its ads alongside Google's result pages, Google launched its own pay-per-click, auction-based search-advertising product, called AdWords Select (Oremus). And in 2003, Google's AdSense allowed the syndication of cost-per-click ads to partner websites, monetizing traffic without the need for ownership (Van Couvering 198).

This analysis of historical developments in the tech industry shows that algorithms were not originally created to profit from search traffic: companies made many attempts to sustain themselves, and many kinds of integration were tested. However, the companies that did not adopt the business model of profiting from search traffic and results, and instead tried other approaches, have all become history. Through the different attempts made in the past, norms became embedded, because different types of sites were promoted. For instance, GoTo.com promoted the sites that were willing to pay for top placement, while Google keeps its organic search results and profits from traffic by selling ads. As a result, the development of technology is highly mediated not only by social and political contexts but also by a capitalist context (Van Couvering 180), all of which leads to the inscription of norms in society.

The strong relationship between Google's ability to select information and its power to prescribe norms has been noted, and sometimes thoroughly analyzed, in the literature. However, norm inscription through embedded knowledge logics is often easier to theorize than to document (Gillespie, "The Relevance of Algorithms" 187). In response to this difficulty, the present study seeks to define and discuss the knowledge logic of Google Web search, specifically what information is prioritized and the resulting social consequences.

However, as illustrated previously, it is through this very knowledge logic that Google gets to map our reality and directly insert norms. In the case study, while seeking better insight into what is "good" to Google, the kinds of norms embedded can be inferred at the same time, since "what we know (is good)" is often determined by "what Google decides is good."

3. How to Study Norms

3.1 Overview

As the studies on algorithmic complexity and social power discussed above make clear, there is a need for research into how the algorithm works and how it inscribes norms in society. To study what knowledge is normalized through Google's search engine, one of the most direct ways is to understand which information is prioritized and therefore most visited by users. Undoubtedly, the selection process could in principle be understood by reading the algorithm itself. However, the current version of Google's algorithm examines over two hundred signals for every query, and retrieving and examining them all is fundamentally difficult because they are trade secrets (see, e.g., Gillespie, "The Relevance of Algorithms"; Fiorentini; van Dijck; Pasquale). Yet it remains important for researchers to keep raising critical awareness of the algorithm and to uncover its complexity and invisibility, since the less we know, the more control Google holds over us (van Dijck 587). Therefore, in this case study I demonstrate an alternative method for gaining insight into what Google prioritizes without having to read the code, since these selection criteria directly contribute to the norms inserted into society. My methodology, however, only allows a small part of Google's ranking priorities to be demonstrated. I hope that by pointing out the importance of studying norms and by demonstrating a possible methodology, more researchers will be enabled to contribute studies on the norms inscribed by Google Web search and other technologies with significant effects in society.

3.2 Challenges

Studying Google Web search remains exceedingly difficult, not only because Google's search algorithm is black-boxed, but also because a global definition of relevance is becoming more challenging, perhaps impossible, to find. As Kitchin sums up, the main challenges of researching algorithms lie not only in their access being black-boxed, but also in the algorithms themselves being heterogeneous, ontogenetic, and constantly unfolding (20-21). Nevertheless, a number of methodologies have been applied that do not require deconstructing the code: inferring technical changes by reading patents (Weltevrede 118), capturing the engine's social and cultural role by examining how Google "captures, formats, and recommends" (Weltevrede 130), repurposing search as research (Rogers, "Foundations of Digital Methods"), or conducting ethnographies of how people engage with and are conditioned by algorithmic systems (Kitchin 26).

The challenge of studying a "black-boxed" algorithm is well illustrated in Pasquale's book The Black Box Society. As its lamppost story (mentioned in the Introduction above) shows, the consequence of not fully understanding the algorithm is being left in the dark, able to see only what the lamppost illuminates while the rest remains obscured in shadow. In other words, by keeping its algorithm a secret, Google gets to operate as a "one-way mirror" (Pasquale 9), disclosing what it wants us to think (of its objectivity and neutrality, for instance) while keeping the rest as trade secrets to avoid scrutiny (3) and to keep its conflicts with public powers from being revealed (9-10). One thing that is certain about the black box, in Pasquale's account, is that it smooths and simplifies ordinary transactions by rendering them more efficient. In exchange for such efficiency, users have accepted black-boxed algorithms and allowed themselves to live in a climate of algorithmic secrecy, where "bad information is likely to endure as good, and to result in unfair and even disastrous predictions" (Pasquale 216). This Black Box Society, as Pasquale defines it, "has become dangerously unstable, unfair, and unproductive" (218).

The fluctuation of information on Google's search result page, and thus Google's instability in authorizing information, is demonstrated in Rogers' research. From this research he concluded that Google's search engine does not necessarily display viewpoints from a diversity of voices and that "the displayed sources are often the ones that are familiar and established" ("The Googlization Question" 1). When he first searched for the query "9/11," the webpage 911truth.org ranked in the top ten, higher than sources from The New York Times and the New York City government, even though 911truth.org is often referred to as a conspiracy site (9).3 However, ten days after the event, 911truth.org dropped to a ranking around 200, and two weeks after that it was back among the top results. This raises concerns about the stability of the stories Google tells us. Moreover, the main reason 911truth.org ranked highly in the first place was argued to be the excessive reciprocal links it received from its franchise sites. A website with enough backlinks to be placed on top in the first place is likely to remain on top, its status reinforced by Google, because most users only check the top dozen results on the result page (Weltevrede): the "rich-get-richer" dynamics built into PageRank. However, finding evidence for this, or further improving Google's stability in mapping our reality, is difficult, because the design of the search algorithm is kept secret from the public. For Beer, given that algorithms are encoded with rules that map our social reality and direct our life decisions, uncertainty about them may lead us to misjudge their power (3).
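The "rich-get-richer" dynamic mentioned above can be made concrete with a minimal sketch of PageRank's core iteration. The toy link graph below is hypothetical (not data from Rogers' study, and a drastic simplification of Google's actual, secret algorithm); it merely shows how a page that already attracts many links accumulates score and therefore visibility.

```python
# Minimal PageRank sketch on a hypothetical toy link graph,
# illustrating the "rich-get-richer" dynamic: a well-linked "hub"
# accumulates rank, which in a results list wins it more visibility.

def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small "teleport" share regardless of links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page passes its rank on, split over its out-links.
                share = rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += damping * share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Toy web: several small pages link (reciprocally or not) to one hub.
toy_web = {
    "hub": ["a"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
    "d": ["hub"],
}
scores = pagerank(toy_web)
print(sorted(scores, key=scores.get, reverse=True))  # "hub" comes first
```

Even this crude model reproduces the dynamic in the text: the hub, having accumulated the most inbound links, dominates the ranking, and any ranking-driven attention would then reinforce its position.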

Taking into consideration the limitations of existing research, Rogers provides useful guidelines for research methodology. Specifically, the aim is to observe the societal impact of Google's search engine through data retrieved from the device itself. The 9/11 research described above, for instance, was conducted by collecting the rankings of different web pages across different time periods ("The Googlization Question"). Moreover, search as research can also be used to study trends, dominant voices, commitment and concern through the tracking of keywords and rankings ("Foundations of Digital Methods"). Studies have analyzed how Google understands query input by testing it with different keywords, in different languages, at different locations, in different timeframes, and with different domains. As exemplified by

3 See the Delicious tags of '911truth.org' from the Delicious Tags Scraper Sample project at: https://digitalmethods.net/Dmi/ToolDeliciousTagsPerUrlSampleProject
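The rank-tracking step of this search-as-research approach amounts to simple bookkeeping: record the position of a target site in each day's result list. The sketch below assumes the result lists have already been collected (actually retrieving them, by scraping or otherwise, is a separate problem); the dates and URL lists are hypothetical stand-ins that mimic the fluctuation Rogers observed for 911truth.org, not his actual data.

```python
# Sketch of rank tracking over time, with hypothetical snapshot data.
from datetime import date

def rank_of(target, results):
    """Return the 1-based rank of `target` in an ordered result list,
    or None if it does not appear."""
    for position, url in enumerate(results, start=1):
        if target in url:
            return position
    return None

# Hypothetical snapshots of top results for the query "9/11".
snapshots = {
    date(2007, 9, 1):  ["nytimes.com/911", "911truth.org", "nyc.gov/911"],
    date(2007, 9, 11): ["nytimes.com/911", "nyc.gov/911", "wikipedia.org/911"],
    date(2007, 9, 25): ["911truth.org", "nytimes.com/911", "nyc.gov/911"],
}

history = {day: rank_of("911truth.org", urls) for day, urls in snapshots.items()}
# history records rank 2, then absence (None), then rank 1 across the snapshots,
# making the instability of the result page directly visible as a time series.
```

Plotting or tabulating such a `history` per query and per source is exactly the kind of device-level evidence the methodology relies on, without any access to the algorithm's code.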
