Twitter as a sensitive City sensor

Academic year: 2021

Abstract

The way in which society communicates has evolved. Digital technologies have enabled new ways of recording and visualising social life, creating a new wave of digital sources that have infiltrated our daily lives (Fielding et al). Harnessing these new data sources and repurposing them as research tools can provide a unique perspective into a city. More research is needed into the positive capabilities that location-based social networking platforms can have for the urban environment (Kling & Pozdnoukhov; Bimbo et al; Cranshaw et al; Noulas et al; Saghl et al; Quercia et al). This thesis explores previous research and the perspectives of various scholars from a digital methods, new media and urban studies perspective. Specifically, it explores how Twitter can be used as a sensitive city sensor and as a tool to assess a city’s urban dynamics. The research scraped over 55,000 tweets across two major European cities, Amsterdam and Manchester. Using only the voluntarily geo-tagged tweets, it applied a close reading approach to distinguish between tourists and locals and to determine the functions of the urban space. It produced a variety of Twitter data-driven maps, which exhibit various profiles of Manchester and Amsterdam, showing the fine-grained capabilities of Twitter as a tool and contributing further research to a relatively underexplored field of study.

University of Amsterdam
New Media & Digital Culture
Master’s Thesis
Name: Max Cantellow
Email: maxcantellow@icloud.com
Student Number: 11103620
Date of completion: 24th June 2016
Supervisor: Sabine Niederer
Second Reader: Natalia Sánchez-Querubín
Word count (excluding bibliography): 22,756

Acknowledgements: Thank you Sabine Niederer, Alexander Daniel, Anissa Jousset & Pepijn Bierzhuizen for your support throughout the entire process.

Keywords: Twitter, Digital Methods, Urban Studies, Manchester, Amsterdam

Contents Page

1. Introduction
1.1. Digital Research & Digitization
1.2. What this research entails?
1.3. The thesis map
2. Understanding the City
2.1. Over population, the problem
2.2. New techniques of looking at the problem
2.3. Why we need to define a city?
2.4. City and urban | a city’s boundaries
2.5. 3 Distinguishable characteristics of “urban”
2.6. Progress of a city and our working city definition
2.7. City assessing themselves | census data
2.8. Advantage / Limitations of using census data
2.9. Alternative Ways of measuring a city: Happiness Index
2.10. Shift from traditional methods to new digital methods
2.11. But, what is Twitter?
2.12. Twitter as a closed system
2.13. Objectifying Twitter
2.14. Research into sentiment analysis using Twitter
2.15. Research into urban dynamics and mobility using Twitter
3. Methodology
3.1 Research Question
3.2 Twitter as an object of study & research tool?
3.3 City Selection
3.3.1 Amsterdam
3.3.2 Manchester
3.3.3 Similarities between the two cities
3.4 Data Collection
3.5 Noise Reduction
3.6 Data Categorization
3.7 Data Visualisation
3.8 Google Translate
3.9 Automation of Processes
4. Findings
4.1 Findings 1 | Break Down of Twitter & the Functions of Space
4.1.1 Manchester | 25th March
4.1.2 Manchester | 4th April
4.1.3 Manchester Similarities
4.2.1 Amsterdam | 27th April
4.2.2 Amsterdam | 4th April
4.2.3 Amsterdam Similarities
4.3 Findings 2 | Tourist Vs. Local
4.3.1 Manchester
4.3.2 Amsterdam
5. Discussion
5.1. Collection of Data
5.2. Automation of the process
5.3. What kind of city does Twitter put forward?
5.4. City research and how Twitter can be used as a city sensor?
5.5. Twitter as a platform – do previous scholarly views still hold true?
5.6. Application
6. Conclusion
6.1. Limitation
6.2. Future Research
7. Bibliography

1. INTRODUCTION

The rise of digital technologies has become a catalyst for a new wave of practices in the recording, analysis and visualization of social life (Fielding et al 2008). Data has become increasingly available as smart hand-held devices have proliferated and infiltrated our daily lives. This new era of digital sources is altering the landscape of sociological research, which is at risk of losing some established, traditional forms of research, such as cluster analysis and textual analysis, which can now be done using digital methods (Savage and Burrows, 2007). The abundance of data and the increasing digitization¹ of social life have allowed areas of social research to be redistributed towards more digital methods. Noortje Marres identifies three societal elements that have moved social research into a more digital methods context. The first is the rise in ‘new devices, genres, and formats for the documentation of social life’. These new devices have enabled and encouraged users to broadcast their everyday lives, producing a surge of online content in which users share their daily activities, routines, thoughts and feelings. Digitizing one’s social life has become an integral part of the 21st century. It is embedded in everyday practices, from ‘live streaming’ events via Facebook to tweeting one’s daily routine, or even Instagramming one’s breakfast (Marres). This gives rise to the second feature of digitization, in which the “routine generation of data about social life, is part of social life” (Marres; Fielding et al 2008). Underpinning these two features is the third: the availability of online social platforms and third-party applications that offer analyses of digital social data (Marres). Internet users are given an eclectic range of platforms and third-party applications to engage with freely.
Digitization is allowing for unprecedented breadth, depth and scale in computational social research, both in collecting and in analyzing data (Lazer et al). Taking advantage of the large quantities of data that are becoming increasingly available can provide us with a unique perspective into a city.

1 Digitization is the process by which text, pictures or sound are converted into a digital format and become readable by a computer (Oxford).

This research aims to contribute to the debates surrounding digital technologies, specifically location-based service networks, and their uses from a new media, digital methods and urban studies perspective. The thesis argues that online social networking platforms can offer value to various stakeholders of a city, by exploring the capabilities of Twitter as a city sensor for urban space. It repurposes Twitter (Rogers) as something more than a micro-blogging social networking platform, exploring how it can be used to assess a city’s urban dynamics. A city’s urban dynamics can be characterised by the movement of its inhabitants; mapping and understanding their vigorous patterns of activity within the urban environment can provide a perceptive account of the city (Fransen). While exploring Twitter’s capabilities, it will be interesting to see whether Twitter can offer a more fine-grained approach to city research, and whether previous negative scholarly views regarding the content of Twitter still hold true (Rogers; Marwick & Boyd; Miller; Java et al). The thesis opens with a discussion of a global phenomenon: why cities matter, and how the lines between the physical and the digital world are becoming increasingly entangled. It then explores what it means to be a city and how this relates to the concept of the urban. Once the city is understood as a concept, the thesis looks at how cities are assessed and at some critiques of traditional city assessment methods. Some of the limitations of the more traditional methods could be reduced by adopting new digital methods, such as using Twitter as a city sensor. Twitter is then discussed at length, with a brief history and description of the platform, followed by previous scholarly views and research efforts that investigated Twitter’s usability for sentiment analysis, patterns of mobility, user behavior and urban dynamics.
Once the research is grounded in previous literature, it will shift its focus to two major European cities, Manchester in the United Kingdom and Amsterdam in the Netherlands, using a quali-quantitative approach.

A quali-quantitative approach uses complementary qualitative² and quantitative³ research techniques as a way of engaging with large quantities of data. The qualitative part applies a close reading to voluntarily geo-tagged tweets, while more quantitative methods are used to compare large quantities of geo-tagged Twitter data; the outputs are then combined to support the research aims. The result is a set of maps that show different profiles of Manchester and Amsterdam. Initially the maps show a comparison between a weekday and a public holiday, from which one can determine the functions of the city’s urban space. Once the functions have been categorised, the maps offer a more fine-grained perspective into the city, which can be tuned further to sense areas and establishments that are frequented by tourists and by locals of the respective cities. The output of this research will support a different methodological approach to Twitter research within an urban studies perspective and could provide city stakeholders with a novel way of gaining insight into their city.

2 Qualitative research is typically concerned with understanding human behavior from an informant’s perspective; data is collected through participant observation (McLoed).
3 Quantitative research refers to facts about social phenomena: fixed and measurable realities that are quantifiable (McLoed).
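
As an illustration of the first step in this approach, keeping only voluntarily geo-tagged tweets and assigning each one to a city, the following is a minimal Python sketch rather than the pipeline used in this research. It assumes each tweet is a dictionary whose optional "coordinates" entry is a GeoJSON-style point with longitude first; the BoundingBox helper, the function names and the rough city boxes are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class BoundingBox:
    """Rectangular area in decimal degrees (hypothetical helper)."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat: float, lon: float) -> bool:
        return self.south <= lat <= self.north and self.west <= lon <= self.east


# Rough, illustrative boxes: assumptions, not the study areas of the thesis.
CITY_BOXES: Dict[str, BoundingBox] = {
    "Amsterdam": BoundingBox(south=52.30, west=4.75, north=52.45, east=5.05),
    "Manchester": BoundingBox(south=53.40, west=-2.35, north=53.55, east=-2.15),
}


def city_of(tweet: dict) -> Optional[str]:
    """Return the city a geo-tagged tweet falls in, or None.

    Assumed record shape: {"text": ..., "coordinates":
    {"type": "Point", "coordinates": [lon, lat]}}.
    """
    point = tweet.get("coordinates")
    if not point:
        return None  # the user did not opt in to geo-tagging: discard
    lon, lat = point["coordinates"]
    for city, box in CITY_BOXES.items():
        if box.contains(lat, lon):
            return city
    return None  # geo-tagged, but outside both cities


def geotagged_by_city(tweets: Iterable[dict]) -> Dict[str, List[dict]]:
    """Group geo-tagged tweets by city, dropping everything else."""
    grouped: Dict[str, List[dict]] = {city: [] for city in CITY_BOXES}
    for tweet in tweets:
        city = city_of(tweet)
        if city is not None:
            grouped[city].append(tweet)
    return grouped
```

Tweets without coordinates, or with coordinates outside both boxes, are dropped; the remaining tweets in each group are the material to which a close reading would then be applied.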

2. Understanding the City

Over the past decade, there has been a rapid increase in the global population living within the confines of the concrete infrastructures that we call cities. The city is an interesting area of research, as it provides a snapshot in which one can explore localised global forces, such as the impact of tourism, globalisation and the ever-increasing pervasiveness of social media driven by the rise of smart hand-held devices. The ubiquity of technology today has created an interconnected digital landscape, in which users can extend their presence across the virtual and the physical world almost simultaneously. This technology is blurring the boundaries between virtual and physical space, creating a ‘spatial hybrid’ in which the city is recreated as a digital double in the virtual world (Owen & Imre; Quan-Haase & Kim-Martin). As these two worlds mesh, exploring the virtual world to understand this spatial hybrid can offer valuable insight into a city. Scattered traces of user data have become increasingly available; these digital footprints reveal social and temporal patterns from which one can understand the urban dynamics of a city, by modeling patterns of human mobility and understanding the functions of its urban space. Technology has revolutionised the way in which society communicates and has made it less imperative for humans to live in cities, yet there is still mass migration of people towards cities (Sassen). Currently over 54% of the global population lives in urban space; this is expected to increase 1.5 times by 2045, adding 2 billion more people to our already overpopulated cities (The World Bank).
This influx will bring numerous challenges to a city’s infrastructure: for example, managing the flow of visitors, ensuring a reliable cross-city public transport system, reducing a city’s large contribution to climate change and, more importantly, developing a sustainable city that works for all its stakeholders. It has therefore never been more important for our society to explore, understand and manage our cities to cater for this urban phenomenon. To grasp such a contemporary issue, it has never been more important
to embrace the new technologies and techniques that have become available to us. Social media platforms, such as Twitter, can be harnessed to scrape the digital traces of their users. This user data can be analyzed, made sense of, and used in a meaningful way to uncover a different angle on the city. Combining these results with traditional methods could allow us to manage urban challenges successfully. A city needs to recognise the various economic, social, physical and environmental forces at play, aided by an understanding of its land-use patterns and the dynamics of its urban space. A city’s urban dynamics concern the movement and activities of individuals around the urban environment (Fransen). Understanding this urban motion is vital to maintaining the equilibrium of a city: discovering the areas frequented by users of the city provides insightful data that can support sustainable urban development. Knowledge of these areas can inform the decision-making process and ensure that proper infrastructure supports areas that are becoming increasingly popular. I propose that location-based service social networks, such as Twitter, can be used as strong supplementary data for understanding a city’s urban dynamics. First, however, one must fully understand what is meant by the term “city”, which is itself highly contested. It is applied and redefined differently by governments, organisations, institutions and individuals. Does one define a city by the municipality’s definition, that is, by its legal boundaries? By its physical infrastructure, built up in a modular fashion as its population increases? Or by its metropolitan area, with the city’s urban core as the primary influence on the contiguous areas and their mobility patterns (BBC)?
Defining a city can be difficult because of its subjective nature, and this causes further problems when comparing different kinds of urban space. There therefore needs to be a rudimentary breakdown of what is classified as a city.

A City & its Urban Space

A city’s boundaries depend on what is regarded as “urban”, so it is useful to understand what is meant by that term. A considerable number of interrelated factors determine whether a place can be classified as “urban”, such as an area’s population size and density, its economic
function, its labour supply and its economic and social structure. For more statistical purposes, such as government censuses, the definition of an urban space tends to be based on governmental boundaries and the size of the population of a specific area. This would imply that a city’s boundary is a static border, whereas urban space is fluid and often changes with its local environment and ever-increasing population (Frey). In their paper “Defining The City”, William Frey and Zachary Zimmer distinguish three key elements of “urban” as a concept. These can be applied to differentiate between urban and rural space, which in turn helps to develop the understanding of a city for this research. The first key element is the ecological element; it differentiates the two typologies based on their spatial characteristics, such as population density and total population size. Countries develop their own definitions of how many inhabitants an area needs, and at what population density, before it can be classified as urban space. For example, the Netherlands regards an area as “urban” if it is inhabited by 2,000 or more individuals and has a 44% population density (Dijkstra & Poelman; UN). In comparison, the United Kingdom treats a space as “urban” if it has a population of 10,000 or more and a 57% population density (Bibby & Brindley; UN). The second element refers to an area’s economic function: specifically, how the area makes its money and the financial activities that take place within it. Some countries, such as Botswana, take into account the percentage of inhabitants engaged in agricultural versus non-agricultural work to determine an area’s economic function (UN).
The third element is the one most aligned with what this thesis explores, as it looks at the social character of the area. This covers the nature, behaviour, characteristics and values of the individuals within an area. It became a common trend among researchers to classify cities based on their commercial, administrative and transport centres. This was especially so in the 1960s, when there was a strong influence from Marxism and society began to use the political economy of industrial capitalism as a structure for explaining issues of social movements, class and power (Castells). Social character is
vital, as an urban area is defined not simply by the functions of the space found there, but by the people who make the space and create the community by actively participating in daily life, turning these spaces into places. The success of many cities can be credited to the multiculturalism within a small area: once combined, the community forms a synergy of different opinions, thoughts and attitudes (Florida). Understanding the human mobility network can provide a unique perspective into the attitudes of the community. Recognizing the areas frequented by locals indicates the types of places they visit and which places are the most popular within a city. Developing this awareness of the community could help to create conditions that encourage community synergy, thus working towards a more sustainable city (Florida). Frey and Zimmer’s research gives a better understanding of the three core elements that distinguish urban space and of how its definition varies across nations. Their work provides a strong framework from which to develop an applicable definition of a city for this research. Earlier definitions, relating to the wave of new urbanism in Western Europe after the Middle Ages, focused strongly on commercial and business elements as the dominant factor, which gave the city a level of autonomy from its surrounding areas (Kuper & Kuper; Pirenne). Max Weber developed this idea further in his 1921 work The City, in which he held that a market was the central institution of a city, combined with a form of fortification and a degree of independent administration and legal system. These more historical definitions allow us to see the foundations of a city.
Confusion arises, however, when these definitions are applied to distinguish between a town and a city, as both urban forms can hold similar institutions and economic functions. As industrialism dawned, academics and urbanists alike began to dissect the inner characteristics of the new kind of city, in which a growing concentration of hard physical labour helped to rapidly expand the contemporary cityscape (Engels; Kuper & Kuper). Sprouting from this new form of urban development was a generation of subcultures and lifestyles, with society beginning to appreciate the cultural diversity that became readily available in the urban ethnography
(Sanjek). This began to form what society knows as a city; in English usage this means a large town, or any town within the United Kingdom that has a cathedral (Cambridge). Unfortunately, there is still little distinction in English between a large town and a city. The distinction therefore needs to be made on the basis of the previous discussion of the urban, taking into consideration what countries deem to be urban space. The working definition combines the English definition of a city with the United Kingdom’s definition of urban space: for this thesis, a city is a large town with a population of over 10,000 and a population density of 57% (Bibby & Brindley; UN).

Assessing a City

Now that there is an understanding of what is meant by a city, we need to grasp how cities assess themselves. Many countries do not share a definition of a city; however, they do use common methods when it comes to assessing one. Traditionally, a city’s municipality collates data sets based on government census data, which provide demographic records and spatial statistics that can inform urban planners and be used in further analysis of the urban space. A government census is the oldest form of measuring the population and the most commonly used way of undertaking statistical measurements of an area to generate social data (Hristova et al). Such methods can supply local authorities with the ecological element that helps to determine whether an area is rural or urban (Frey). They also allow governments to gain insight into the people living within their country: their occupation, who they live with, how they usually travel to work and how many hours a week they work (Office for National Statistics). Further interpretation of this data can show areas of inequality in income distribution or a top-heavy demographic, and can be used to predict population growth and current and future land-use allocation.
Conventional forms of urban social data, such as the national census, are praised for their high degree of quality, accuracy and cogency (Psyllidis et al). However, they have some significant limitations, which can be seen to outweigh some of their benefits; a prime example is the timeliness of the census. Implementing a government
census is an expensive and time-consuming process, owing to factors such as its high level of detail and the large number of people it needs to reach. Managing such an operation is demanding, and it may therefore take many years for a census to move from conception to implementation. The United Kingdom census, for example, has run every decade since 1801 (with the exception of 1941, due to the Second World War) and provides a detailed snapshot of the national demographic of the United Kingdom (Wikipedia). The large gap between refreshes allows the UK’s national characteristics and demographic to change without its assessors’ knowledge. Policy decisions may then be developed with a lack of information, and crucial statistics could be left out of the decision timeline. In addition, the census fails to portray the full image of a city, ignoring some of the social characteristics that make the space what it is; for example, it cannot map urban mobility or the current functions of space in the city. The census can therefore provide an in-depth snapshot of the dynamics and well-being of a nation or a city, but it fails to provide a real-time, accurate depiction of a city’s social dynamics. There are, however, other methods of assessing a city’s social progress. In April 2012 the chairman of the United Nations considered an alternative way of benchmarking countries, called the Happy Planet Index. Other measures, such as Gross Domestic Product or Gross National Product, are based entirely on economic performance as the key indicator of success. The Happy Planet Index rethinks our cities and countries as more than just places for economic growth or places to work, by measuring characteristics that should be more important to us, such as how happy people are where they live.
The Happy Planet Index publishes the “Happiness Report”, which covers over 150 countries and questions 3,000 respondents in each of them to understand their current well-being. It takes into account six key variables, including GDP per capita, the social support received within each country, the individual’s sense of freedom and healthy years of life expectancy (Helliwell, Layard & Sachs). Arguably, an individual’s inclination to avoid suffering and desire to be happy is the underlying factor in that individual’s behavior (Argyle; Snyder & Lopez). Therefore, it would be
more applicable to derive the average key motivations of a community, so that policy makers are able to support and encourage their local inhabitants. The Happy Planet Index is a positive step towards a more humanistic way of understanding a country’s well-being. It addresses areas of weakness in the government census by giving a more insightful description of the social characteristics of certain countries. Yet, like the government census, the Happy Planet Index lacks efficiency and timeliness and is still unable to provide a real-time snapshot of a city’s current composition. Both methods are based on self-reporting as a way of assessing an individual, which makes the result dependent on the individual’s capacity for self-reflection and self-perception (Killworth and Bernard). In addition, because filling out a government census is invasive and time-consuming, the data can easily become skewed when the respondent finishes the questionnaire in haste. This has led some policy makers to advocate the use of other socio-economic measures that can more accurately depict the subjective well-being of individuals (Quercia & Saez). As data has become increasingly available through the proliferation of smartphones, researchers have begun to use decentralized location-based service networks (LBSNs) to explore specific urban phenomena such as human activities, a community’s happiness, mobility and space functionality. Compared with conventional data, LBSNs provide an alternative that can dramatically speed up the process of assessing a city’s urban dynamics. User-generated content, such as geo-tagged tweets, can provide a relatively real-time mirror of the activities that people undertake within urban space.
Researchers have begun to explore how applicable these tweets could be to the physical world, by creating meaningful links between the online virtual world and its implications for offline interactions (Gordon and de Souza e Silva).

What is Twitter?

This thesis focuses on the location-based social network Twitter. Twitter is an online social networking platform that was created in March 2006 by Jack Dorsey, Evan Williams, Biz Stone and Noah Glass (Wikipedia). Just over five years after its initial launch, Twitter already boasted a global audience of just under 180 million monthly users (van Dijck). It is seen as a micro-blogging platform, as it adopted the 160-character limit applied to SMS text messages (Rogers). Of these, 20 characters are reserved for the Twitter handle and the other 140 for the content of the message, known as a ‘tweet’ (Rogers). A geo-tagged tweet is a tweet for which the user has voluntarily opted in to posting their geographical location, in the form of latitude and longitude. Tweets are posted onto Twitter’s timeline and can be searched by hashtag, username or keyword. A user who is ‘following’ another user will automatically have all of that user’s tweets displayed on their personalized ‘timeline’. The timeline shows a stream of tweets from the Twitter users that the user has chosen to follow; the user can then interact with a tweet by replying to it, re-tweeting it or pressing the like button (Twitter). Unlike on some social networking platforms, users ‘follow’ rather than befriend one another; this reduces the sense of intimacy and the presumed relationship that typically accompanies the notion of a ‘friend’. Twitter launched itself into a niche market: it provided a service that only allowed short messages to be sent and received, which gave it the nickname “SMS of the Internet” (van Dijck). Some of its success can be attributed to its character limit, which removed certain formalities that applied to other means of communication, such as a phone call, an e-mail or a blog post (van Dijck).
A phone call is personal, intimate and exclusive; an email exchange becomes formulaic, and blogging laborious. The tweet overcame these forms of communication by undermining their technological design and rethinking how members of society can interact with one another. We entrust invaluable insight to the platform: our thoughts, our intentions and the activities of our society (Phelan et al.). Twitter’s self-proclaimed title as the “window to the world” is now accurate, as it can portray what is occurring in the real, physical world, but now it also
represents the activities of the virtual world, earning it the new nickname “the pulse of the internet” (Hiner).

Twitter, an open system?

Many people view Twitter as an open and public medium, as it allows anyone to sign up as long as they have two things: a name and an email address. In 2009 one of its co-founders, Jack Dorsey, expressed his desire for Twitter to be used “as a utility” in which users can “use it like electricity” (CNET). His aim was to bring Twitter to the level of other communication devices, something that can “fade into the background” (CNET). Like other social networks, Twitter adopted a walled-garden approach, in which certain areas of information are unavailable to users who do not have an account. This forces a viewer to sign up in order to gain access to the site’s features, such as user profiles and top trending hashtags. This echoes Jonathan Zittrain’s concept of ‘appliancization’, in which he argues that we are shifting away from an open information ecosystem towards more “tethered appliances and services which increasingly constrict a potential sea of users” (Zittrain). The concept can be witnessed in the ‘appliancization’ of hardware devices, but also applies to online social media platforms, as both areas are seeing a generative shift in which end users are encouraged to join a closed system. Once on a network like Twitter, a user is restricted in the way they can express themselves and communicate because of the character limit on the site (van Dijck). Restrictions also apply to non-Twitter users, who have to sign up just to view the site’s core features. Twitter also chased away third-party developers who wanted to support the platform by creating applications for its users.
Twitter safeguarded its API shortly after becoming a public company; any Twitter client had to be approved by Twitter, ensuring that it would pay some dividends or consent to Twitter’s advertisements.

Objectifying Twitter as a platform

Some criticize the content of Twitter’s stream. In his opening of Twitter and Society, Richard Rogers discusses how Twitter can be seen as banal and phatic, but also as a shallow and egotistic medium (Rogers; Marwick & Boyd). Rogers attributes this to
the large percentage of tweets that can be regarded as meaningless, “devoid of substantive content” (Miller), and that merely generate “daily chatter” (Java et al) in the Tweetosphere. This encourages a transitory state as users flick through their endless stream of reverse-chronological tweets (Rogers). However, if we follow Rogers’s advice to rethink and repurpose the medium, thinking along with the device, we can repurpose Twitter as more than just a social network: as a research tool (Rogers). We can take advantage of the 1% of tweets that users have voluntarily geo-tagged (Morstatter et al 2013) to gain new insight into urban dynamics. This may seem a small percentage of usable data, but 1% of the ~500 million tweets sent daily on a global scale still amounts to 5 million geo-tagged tweets per day (InternetLiveStats). As the number of smartphones increases, the number of geo-tagged tweets and of Twitter users will naturally grow, making the platform more useful for urbanists and researchers alike (Hawelka et al). Revealing the pulse of the city with Twitter has the potential to reduce the cost and delay that accompany traditional methods. Unlike some traditional methods, which tend to have a high response rate, Twitter cannot be used as a completely accurate reflection of a city’s demographic, as not everybody who visits or lives within a city is on Twitter. The platform can therefore only provide a snapshot of the city through the lens of Twitter, meaning that all results are measured on the demographic of Twitter users (Hristova et al). This can sway the results and must be taken into account when exploring the city through Twitter. However, Enrique Martinez, who explored complaints about cities through Twitter, concluded that a set of generic tweets “allowed for a global perspective about the areas to improve in the city” (Martinez).
This means the snapshot of the city can still be a vital tool in understanding areas frequented by explorers of the urban environment. Now that we have a firm understanding of some of the key characteristics of Twitter, further explanation of Twitter research is needed to ground my thesis within its field of study. There are three main research areas in which I want to explore


Twitter’s usability: sentiment analysis, patterns of mobility and user behavior, and urban dynamics on Twitter.

Sentiment analysis using Twitter

Research undertaken by Mitchell et al explored the potential of the micro-blogging site Twitter as a research tool, using it to analyze the geography of happiness. They managed to reveal and explain disparities in society’s temporal happiness by exploring 80 million words posted on Twitter and applying forms of text-based sentiment analysis to them. Using this, they were able to estimate the level of happiness within a state or a city, creating links between census data on demographic characteristics, word choice and message length. From this they were able to estimate variations in key urban characteristics, such as education levels and obesity rates, while showing the plausibility of using social media data as a supplementary data source for urban analysis (Mitchell et al). Similarly, Quercia et al managed to track gross community happiness by examining Twitter users in London, analyzing the relationship between the content of the tweets and the sentiment that was expressed within their 140 characters. They found that self-reporting online was a “reasonably accurate” way of measuring a person’s well-being; their research supports the claim that social media can be used to track the overall well-being of an individual (Quercia et al). Noulas et al looked at the semantic annotations that accompany specific clusters of user data from the location-based social networks Foursquare and Twitter. The research showed how it was possible to profile users and break them down into communities based simply on user-generated data; from this they were able to work out the function of a geographic area and characterize it (Noulas et al).
This research was expanded further by Golder and Macy, who found that seasonal and diurnal patterns on Twitter correlated strongly with users’ patterns of work, sleep and daily activity (Golder & Macy). More recently, Frank et al scraped 37 million geo-located tweets, applying a hedonometer⁴, introduced by Mitchell et al, to the content of the 180,000

4 The hedonometer was created by the economist Francis Edgeworth, who described it as “an ideally perfect instrument, a psychophysical machine, continually registering the height of pleasure experienced by an individual” (Dodds, Peter, et al).


Twitter users. Using this, the team was able to map out patterns of mobility across various cities in the United States and to determine how happiness increases with distance from a user’s average location (Frank et al). Lastly, Saghl et al looked at the collection of human behavior patterns by using user-generated mobile network traffic and geo-tagged information on social media platforms from a range of European cities. They conclude that utilizing social media data as a way of sensing the city can further one’s understanding of urban social dynamics. One of the interesting points that they discuss in their conclusion is whether spatio-temporal patterns from social media channels are representative enough of the larger population, and they query whether they can ground-truth their own conclusions. It is this last point that resonated with me and has provided me with a framework of limitations whose impact on my study I can aim to reduce. These researchers provide us with a strong foundation and further our understanding of how one can use Twitter to understand the overall sentiment of individuals and how this can be applied to urban analysis. With such a strong foundation, built by researchers all over the world applying their methodology to hundreds of thousands of tweets, any attempt to replicate and improve on similar studies would only be overshadowed by the large quantities of tweets they were capable of analyzing. Therefore, this research will not undertake sentiment analysis, but an awareness of what is possible was necessary to ground the research on Twitter as a city sensor.

Assessing urban dynamics and human mobility through Twitter

There have been some research efforts to investigate patterns of urban dynamics and mobility through the prism of Twitter.
Cranshaw et al applied a cluster-based model, using data from Foursquare check-ins that were shared on Twitter, as a way of understanding the social composition of Pittsburgh, USA. The paper was built on the notion that municipal boundaries for neighborhoods were becoming outdated and no longer accurately reflect the individuals who are living in these areas. This idea was also discussed in Rainie and Wellman’s book Networked: The New Social


Operating System, in which they argued that a societal shift in how people socialize and communicate through mobile methods is altering the traditional boundaries of neighborhoods (Rainie & Wellman). Cranshaw et al reflected this notion in their research, as they managed to identify clusters within the city that provided a more dynamic viewpoint of Pittsburgh’s neighborhoods. Their clusters reflected the overall collective activity patterns of people living in Pittsburgh, which allowed them to understand the dynamic nature of urban areas, including the forces that were shaping the function of each area (Cranshaw et al). Portraying a dynamic, real-time view of the city allowed them to generate a live character of the city, which was further validated through qualitative research with local residents. The visual character produced, together with the internal social sub-clusters that were discovered, would have been difficult and costly to obtain with conventional governmental census data. Census records would likely have provided an inadequate reflection of the sub-clusters within each neighborhood and would have required further labour-intensive on-site testing. Even with the greater financial support needed for a third-party assessor to observe the neighborhood, the results still might not provide us with patterns of Pittsburgh’s urban dynamics. Their research could have expanded beyond just capturing the social dynamics of a city: using Twitter, they could have understood the reason for mobility by applying semantic analysis to the content of the tweets (Bimbo et al). Kling and Pozdnoukhov tried to address the limitations of Cranshaw et al by combining the textual nature of the tweet with the geo-tagged movement data of a user. Using this data, they applied topic analysis, which enabled them to understand patterns in the function and structure of specific urban spaces (Kling & Pozdnoukhov).
One of the limitations I found in this research was the lack of variation in the defined topics, which were either nightlife or home; I argue this only scratches the surface of Twitter’s capabilities as a tool for urban analysis. Scraping Twitter data can provide much more information than simply whether the user is at home or at a nightlife venue. Sub-categories could be applied in order to visualize and differentiate between the locations of individual users. Manually sifting through the tweet content and specifically defining that area’s function from


the tweet could provide the type of nightlife venue: a bar, a club, or a concert. It could also capture extra information, such as who the user is going with and how users feel about the venue. Twitter’s capabilities are under-used in their research, and further studies need to explore the various functions Twitter holds. This notion is also echoed in the field of study, with many researchers noting in their concluding remarks that further research needs to be done in exploring the use of LBSN (location-based social networks) as a new approach to urban analysis (Kling & Pozdnoukhov; Bimbo et al; Cranshaw et al; Noulas et al; Saghl et al; Quercia et al). Zagheni et al also pointed to the lack of consistent research in their paper on understanding international and internal migration patterns from Twitter data. Their research concentrated on migration, as it is one of the key drivers of change within a country’s demographic; they concluded specifically that there is a severe lack of migration statistics covering the period between traditional census records and any recent trends. Their answer was to harness the power of Twitter, using user data to map out movements of individuals within and between countries over a short period. The results showed that Twitter was capable of predicting turning points of migration and can provide an insight into the relationship between internal and international migration. The richness of volunteered geo-tagged tweets and crowd-sourced data on human activities has the potential to strengthen the user’s experience in real time, while opening up a new wave of opportunities for enriching a city’s information systems. This thesis aims to support and test some of the previous research discussed by adopting similar methods, while addressing the limitations that became apparent, in an attempt to create a more recent, concrete study.
It upholds Uprichard’s argument on social research, that “cases are ‘made’, both conceptually and empirically, by constantly and iteratively re-shaping and re-matching theory and empirical evidence together” (Uprichard; Borra & Rieder). It explores Twitter as more than just a micro-blogging site: as a research tool that could assess a city’s urban dynamics, and as something capable of enhancing and enriching our traditional census data. Twitter has the capability of providing a real-time reflection of the fundamental forces that are altering the urban landscape. Complementing these new techniques with the more traditional methods could reduce the pressure of


measuring a city’s or a country’s welfare, many measures of which are still based on material or economic methods (Quercia & Saez).


3. METHODOLOGY

3.1 Research question and aims

This research aims to explore how Twitter can be used to assess a city’s urban dynamics, with some underlying sub-questions:

1. What kind of city does Twitter put forward for Manchester and Amsterdam?
2. Do previous scholarly views of Twitter still hold when it is used for city research?
3. Can Twitter be used for a more fine-grained approach to city research?

3.2 Twitter as an object of study & research tool

This research will use Twitter as a source of data, and as a way of grounding my research within a digital methods and new media context. Research undertaken by Miller, Java et al, and Marwick and Boyd explored the earlier content of Twitter, also known as “Twitter I” (Rogers), in which they debate the usefulness that large quantities of Twitter data can hold. It was found in these earlier stages that tweets were typically “devoid of substantive content” and full of “daily chatter” (Miller; Java et al). The prevalence of banal tweets helped to shift the field of study from traditional literary methods towards understanding more phatic expression, where the “content is not king” (Malinowski; Miller), steering research to objectify and analyse the platform as a space rather than by its content. Miller felt that tweets did not convey any useful information, and that the tweets and connections made through Twitter simply represent the “process of communication”, which is void of any beneficial information. Even if tweets are phatic in content, I argue they can still provide some meaning, especially if the tweet has been voluntarily geo-tagged. A geo-located tweet is a tweet to which a user has opted to add the latitude and longitude of the position from which they sent it. “Geo-located Twitter is one of the most freely and easily available global data sources that stores millions of digital and fully objective records of human activity located in space and time” (Hawelka et al).
The user grants Twitter access to their GPS device and/or the location from their mobile telephone; the co-ordinates are then transmitted as metadata⁵ attached to the

5 Metadata is often described as data about data, as it is a structured form of information that describes, explains, locates, or otherwise allows the user to easily retrieve, use or manage an information resource (NISO).


tweet (Freelon). In addition, developing a connection with an individual is not necessarily based on verbal or textual communication; it is possible to connect with another individual without words (Rogers). If a Twitter user follows another Twitter user without any tweets being sent between them, the visible connection could still have connotations that are useful for research, such as network analysis. Having this ambient connection creates and maintains a form of digital intimacy with another user (Reichelt); observing such a relationship in the digital world could inform our understanding of the physical world. This image of Twitter changed after the adjustment of Twitter’s tagline in November 2009 from “What are you doing?” to “What’s happening?”, which marked a significant shift in the quality and content of tweets (Rogers). Shortly after the tagline alteration, co-founder Biz Stone publicly announced Twitter’s new purpose as a “discovery engine for finding out what is happening right now” (Stone). Rogers argues that this “move from an ego to a reporting machine” was caused by encouragement from users and researchers that created an internal cultural shift on the platform (Rogers; Tate). However, it could be argued that this culture of Twitter being a reporting machine has always been in its foundations, but was simply not utilised by the masses in this way. Twitter became the kaleidoscope for worldwide events on many occasions before its recent transformation. Examples include its role in facilitating revolutions in Iran, which demonstrated “as never before the power and influence of social media” and gave rise to the phrase “Iran’s Twitter Revolution”; the live documentation of the Mumbai terrorist attacks in November 2008 (Arthur); the San Diego fires (Rogers); and the simple yet provocative tweet “Arrested.”, composed by James Karl Buck after being arrested in Egypt in 2008.
It was a warning beacon that reached thousands of people with just one tweet, reverberating around the globe through Twitter and leading to his subsequent release from prison. Therefore, I would argue Twitter has always had some involvement in live-reporting events, but it is now recognised even more for this purpose. It can provide access to a reverse chronological, constant stream of users with various viewpoints on contemporary global affairs.


This evolution into “Twitter II” (Rogers) has re-shaped the purpose of Twitter, making it more credible to use as an object of study, as a research tool, and as a city sensor. For this thesis, there were some methodological considerations that had to be taken into account. For example, the research team consisted of just one person, so careful deliberation was needed over which platform to utilise. The output of each individual data set had to be taken into consideration, i.e. what format the data came in, how long each individual data set was, and how long the data would take to map and analyse. The 140-character limit applied to each individual tweet made Twitter an even more attractive option for research, as it allows one to easily apply textual analysis and a more attentive close reading approach (Marres & Weltevrede). Access to tools such as DMI-TCAT⁶ allowed me to gather tweets with ease; the tool grants access to Twitter’s well-guarded API and enables navigation through a data set without any prior coding experience. There is also an abundance of third-party commercial applications that accompany Twitter. Some, such as Tweetstats, provide basic analytics on hashtags, follower count and engagement; for more detailed network analysis one can use Gephi. Tweetstats is a third-party web application that allows you to visualise a Twitter user’s statistics, from number of tweets per hour to monthly user engagement and the change in followers over a period of time (Tweetstats). Gephi is open-source software capable of visualising networks and applying forms of network analysis, which allows you to explore, filter and manipulate large quantities of data (Gephi). At the time of writing this thesis, these programs were available for free public use; however, since its IPO⁷, Twitter has tightened its policies.
What was once relatively easy access to Twitter’s API has now become increasingly difficult to obtain; this is linked to the initial public offering, through which Twitter became listed on the stock market. This greatly affected the openness of the platform, as Twitter has reinforced and entrenched areas of its terms of service, which has had a

6 DMI-TCAT is a Twitter data capture and analysis tool, which is only available to students or affiliates of the University of Amsterdam (DMI-TCAT). This tool is explained in more detail in the “Data Collection” section of the methodology.

7 Initial Public Offering, or IPO, refers to the first sale of stock from a private company to the public (Investopedia).


knock-on effect on access to Twitter’s application program interface, subsequently reducing the pool of available third-party apps. This might be a growing trend for Twitter further down its organisational timeline, as it begins to focus more on its shareholders than on its users. In addition, the ephemeral nature of Twitter adds to its attraction as a platform for research, as it only takes into account tweets from the last 7 days on its stream. This reiterates the functional nature of Twitter as a live-reporting tool; by only displaying contemporary and prominent events, it can then be seen as an accurate sensor for the world (Guian-Illanes). There is an obvious limitation in choosing Twitter, in that only a small percentage of tweets are geo-tagged; this can make it rather difficult to map a city’s urban dynamics specifically, as it provides us with a limited selection of only certain users’ activities. That being said, this research aims to show the potential and capabilities of Twitter as a tool by utilising it in a novel way, in which it can be repurposed and re-applied to fit the needs of the user. Utilising new digital technologies and adapting them to new practices can considerably improve the range of empirical and analytical social research, while producing social research that is more pertinent to social life (Latour et al; Rogers). I would therefore like to take Twitter further than just a social networking channel, using it as a research tool to show that Twitter can become a sensitive city sensor.


3.3 City Selection

To explore and answer the research questions and aims, I will be using two major European cities as the case studies for this research: Amsterdam and Manchester.

3.3.1 Why Amsterdam?

“Even in the middle of the city the serenity of the canals is around every corner.”
Twitter user, Amsterdam data set, 4th April 2016

Amsterdam provides an interesting area of research due to its international demographic, its prominence on an international scale, its compact size, and its reputation as a smart metropolis (City of Amsterdam). Amsterdam is home to over 180 different nationalities, with around 830,000 people residing in the historical centre of the city, while nearly 2 million people live in the surrounding metropolitan and urban areas (Wikipedia). The diversity of its residents is mirrored in its tourist demographic: Dutch hotels received over 22 million overnight visitors in 2013, holidaymakers from all across the globe. This cultural diversity provides a variety of tastes, views, thoughts and opinions, all of which will be attracted to different points of the city, and which might be visible through Twitter. Tourism is a significant problem in Amsterdam and is increasingly putting pressure on the city centre and its support infrastructures. Understanding how tourists move through the city is essential not only for city planners, but also for business developers, local shop owners and any other stakeholder invested in Amsterdam. Therefore, uncovering and revealing places that are frequented by tourists will allow the municipality to manage tourists effectively and understand which infrastructure needs greater support. Amsterdam has a rich, intertwined history with all things digital; it became one of the pioneers in encouraging Internet access. On 15th January 1994, the cultural centre De Balie and Hack-Tic launched a freenet initiative called The Digital City (Waag).
It was one of the first attempts worldwide to connect a large group of citizens to the global online Internet community. It managed to encourage over


100,000 users to connect to the Internet; this has allowed Amsterdam to have one of the highest percentages of Internet users in the world (Amsterdam). The high level of Internet activity could suggest that a large proportion of these Internet users are active Twitter users. Their freely publicised data could provide us with quantities of meaningful data for understanding the dynamics of Amsterdam’s urban space. In addition, Amsterdam provides an interesting area of research as it is a high-tech, innovative metropolis within a small urban boundary. From a research perspective, this might make it more straightforward to identify areas frequented by users of the city, and could allow greater detail to be observed due to the smaller land area. An air of transparency and openness surrounds Amsterdam’s culture, in terms of its people but also its policies, especially its ‘Open Data’ policy. The Municipality of Amsterdam has created a central point for citizens, researchers or just inquisitive minds to access over 175 public datasets. The topics range from management and organization, to tourism and culture, to more niche topics such as parking data (Municipality of Amsterdam). The availability of supplementary data sources may become useful when trying to validate claims made when using Twitter to assess a city, and the availability of such data was a contributing factor when choosing my case study cities. In addition to this, Amsterdam is an incredibly well-connected city, both physically through its transport systems and virtually through digital methods. The city boasts Schiphol Airport less than 15 minutes away from the city centre, with high-speed rail connections to London, Paris, Frankfurt, Dusseldorf and Brussels. Amsterdam’s accessibility makes it an ideal city for trade, which has pushed it into the top 5 best locations for business in Europe (City of Amsterdam).
Regarding its digital accessibility, Amsterdam has the second largest Internet exchange in the world and one of the world’s highest percentages of Internet users (City of Amsterdam). With such a high percentage of Internet users active within Amsterdam and its increased availability of data sources, it could in theory allow us to research a ‘city of the future’. What I mean by this is that, if the trend is towards a more connected and networked world, where citizens are more calculable and data is freely available, then I argue Amsterdam is encapsulating many of these characteristics


to ensure that it is seen as one of the pioneers of this digital revolution. This digital revolution is not novel for Amsterdam: the city launched “The Digital City” on 15th January 1994 as an initiative to encourage Internet accessibility for everyone and to remove the ‘just aristocrats’ label that was applied to Internet access (van Lieshout). Therefore, undertaking research in a city that is aiming to become a futuristic and smart city, while it is in its early stages, may prove useful for further research. The final reason for picking this case study was more personal, as I am currently living and studying in Amsterdam as I write this thesis.

3.3.2 Why Manchester?

“Manchester is a city, which has witnessed a great many stirring episodes, especially of a political character. Generally speaking, its citizens have been liberal in their sentiments, defenders of free speech and liberty of opinion.”
Emmeline Pankhurst

Manchester was picked as a case study for a number of reasons: its dominant position in the north of the UK, the interesting phenomenon of urban regeneration and growth it is undergoing, the current lack of research on the digital world of Manchester, its openness with data, its cultural diversity, and my personal connection to the city. Manchester is situated in the North of England, which was once described by New York Times foreign correspondent R.W. Apple Jr as “the other Britain” (New York Times). At the time of his publication in 1985, this description could not have been more apt. However, Manchester, which became the world’s first industrialized city, has seen unprecedented growth over the past 20 years in terms of investment, infrastructural development and its diverse population (Kidd). This revival can be traced back to the 1996 IRA bombing in the city centre, which wounded over 200 people and reportedly caused £1 billion worth of damage to the surrounding area (McDermott).
The city responded to this act of terror by attempting to rebuild and regenerate Manchester into a modern


post-industrial city. Manchester is now recognised as an open and international city; its modern cityscape is juxtaposed with its industrial heritage, as the open brickwork and renovated factories mingle with the modern glass and steel structures that loom over the city. The most recent government census has shown that there was almost a 20% increase in its population over a ten-year period; this growth was 3 times higher than the United Kingdom’s national average (Neighbourhood Statistics). In 2013, Manchester’s population was 514,417 in the city and 2,714,900 in the Greater Manchester area. Manchester is the most linguistically diverse city in the whole of Western Europe, with over 200 languages spoken across the 493-square-mile area of Greater Manchester (Brown; Wikipedia). The population is expected to continue to grow, in conjunction with the increase in trade, commerce, and industry as they begin to thrive once again, giving Manchester the title of second largest city and second capital of the UK. Such cultural diversity within Manchester provides an interesting area of research, as it may be viewable through Twitter: an individual from a different background might visit different areas of Manchester, for example China Town or the infamous Curry Mile of Rusholme. The past two decades have allowed Manchester to become “the northern powerhouse” (McDermott) and to grow with its rapidly expanding population, creating the capital of the North and the second city of the United Kingdom (Wikipedia). This growth will be amplified further, especially after Manchester’s recent win of a £10 million prize to become one of the world leaders in ‘smart city’⁸ development. Attracting such considerable donations from wealthy investors,

8 The term “smart city” is overused and vague; for many, Amsterdam is seen as a ‘smart city’ and Manchester is en route to becoming one. The term is underpinned by the drive to ‘make sense’ of the vast amounts of digital data that are produced and collected in an age of ubiquitous computing, and to utilize this data as a way of managing and planning a city. By enabling connected devices to communicate with one another in real time, it has the potential to drastically improve the efficiency of major cities. There is a growing shift towards cities using “smart” technology more as a promotional device: to drive marketing campaigns, to compete with rival cities, and to attract capital and labor to their city (Shelton et al).


government grants and the steady increase of business opportunities has allowed Manchester to regenerate and redevelop as the ‘cultural counterbalance to London’ (Jupp). This rapid growth and regeneration makes Manchester an interesting city to study, as there are many complex forces currently changing its cityscape. It will be interesting to explore how this sudden influx of financial investment is changing the city, how it is bringing new life to Manchester, and whether this could alter the dynamics of the city. My study could be used as a benchmark to assess the impact that further investments may have on the city’s urban dynamics. Will the type of city that Twitter portrays hold similar characteristics 5-10 years down the line, or will ‘locals’ have been pushed out of the city and the bustling scene that Manchester once offered have been altered? I would also be keen to notice any effect on the availability of data once the £10 million smart city prize has come to fruition and is taking effect in Manchester. As the city is being reimagined using technology, it has the potential to generate even more data, which in turn will be able to drive further analytics, making a real difference to the people, business and research of Manchester (Vaizey; Leese). By providing some research on the virtual world of Manchester, this study can paint a contemporary picture of what Manchester is now and be used as a snapshot for comparison in future research. Greater efforts are also needed to contribute data about Manchester: there is a gap in city data, which the city is trying to close by driving for an open, freely available data source for any stakeholder of Manchester. The city’s open data organization, DataGM, is a collaborative effort between numerous public sector organizations and local councils (DataGM).
It is currently still in its beta stage; however, it has already managed to centralize 369 datasets on a variety of subjects, for example map files for authority and city boundaries, or leisure centres across the Greater Manchester area (DataGM). The availability of data on the city was another contributing factor when choosing my case study, as I would like to have the option of cross-examining what I find through Twitter against readily available data, especially as one of my proposals is to utilize Twitter data as supplementary data to assess a city’s dynamics. Therefore, it may be useful to be able to compare and


validate with other supplementary data that is available. The last factor is my personal involvement with Manchester; I lived there for three years while I completed my bachelor’s degree in social science. I have a strong connection with Manchester as a city, which I still maintain by regularly visiting and being fascinated by how much has changed since my last visit. The city is undergoing a really exciting change and I want to capture it; I argue that new digital methods can shed new light on Manchester as a city.

3.3.3 Similarities between the two

Manchester and Amsterdam lie many miles apart; however, they hold some similarities, which must be accounted for before one goes any further. Firstly, both cities have metropolitan areas of a similar size and are seen as leaders within their respective countries. Where they differ is in the physical size of the city, in which Manchester is considerably larger; this also suggests that this methodology can be scaled to other cities. Secondly, both cities are encouraging an open data initiative, which can provide cross-analysis and validation for some of my data sets if necessary. Thirdly, Manchester and Amsterdam are both attempting to become digital cities and are attracting similar business demographics. Lastly, there is a strong personal connection, as I have lived for a long period of time in both cities.

3.4 Data Collection

How was the data collected?

The main tool I used for data extraction and collection was the Twitter Capture and Analysis Toolset (TCAT), which was created and developed by the Digital Methods Initiative (DMI) and is nicknamed DMI-TCAT. It allows the user to capture a defined dataset using a variety of methods: based on specific queries, by excluding keywords or hashtags, or by a specific date range within a predefined bounding box (Digital Methods Initiative). Once a dataset has been defined by your query, either by a geo-capture or a keyword capture, DMI-TCAT will scrape your data and produce an
Once a dataset has been defined by your query, either by a geo-capture or a keyword capture, DMI-TCAT will scrape your data and produce an
output in either a comma-separated values (CSV) file or a Graph Exchange XML Format (GEXF) file. The CSV file stores numbers and text in a tabular format with values separated by commas; it is a common file type that can be read in Microsoft Excel. The GEXF file is a network file that describes the structure of a complex data network, its core characteristics and any data associated with the network (Gephi). A GEXF file allows a hierarchical structure to be built from nodal representations, enabling a network to represent clustering (Gephi). Both files can be useful in data analysis, as they provide key information on Twitter users, hashtags, mentions, conversations and relationships in the “tweetosphere” (Collins).

DMI-TCAT is restricted to users who are currently enrolled in a media studies master’s at the University of Amsterdam, or researchers who are affiliated with the university’s media studies department. DMI-TCAT uses Twitter’s application programming interface (API), which provides the tool with predefined rules and regulations that are applicable to third-party applications and their developers. It provides key computational information on how the application should interact with the platform, and access is granted only by application to Twitter headquarters.

For my thesis, I used TCAT to retrieve data on my two chosen cities, Amsterdam and Manchester. DMI-TCAT has been consistently scraping data on Amsterdam since 2014, defined by a bounding box and any tweets containing the word [Amsterdam] in all European languages. The Digital Methods Initiative and one of the developers of TCAT, Erik Borra, helped to enable DMI-TCAT to track Manchester activity based on my pre-defined bounding box and keyword capture. Data capture for Manchester started on 3rd March 2016.
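As a rough illustration of how a CSV export of this kind could be narrowed down to the voluntarily geo-tagged tweets, the following Python sketch reads a small in-memory CSV and keeps only rows that carry both coordinates. The column names (`from_user_name`, `text`, `lat`, `lng`) and the three sample rows are assumptions invented for this example; a real export may label its fields differently.

```python
import csv
import io

# Invented sample rows with hypothetical column names; a real export
# may use different field labels.
SAMPLE_EXPORT = """id,from_user_name,text,lat,lng
1,alice,Morning in the Northern Quarter,53.4839,-2.2360
2,bob,Manchester derby tonight,,
3,carol,Coffee near Piccadilly,53.4774,-2.2309
"""

def load_geotagged(csv_file):
    """Return only the rows that carry both a latitude and a longitude."""
    rows = []
    for row in csv.DictReader(csv_file):
        if row["lat"] and row["lng"]:
            # Convert the coordinate strings to floats for later use.
            row["lat"] = float(row["lat"])
            row["lng"] = float(row["lng"])
            rows.append(row)
    return rows

geotagged = load_geotagged(io.StringIO(SAMPLE_EXPORT))
print(len(geotagged), "of 3 sample tweets are geo-tagged")
```

Passing an opened file instead of `io.StringIO` would apply the same filter to an export saved on disk.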
Manchester

A bounding box was set up based on four corners, with a central point in Manchester city centre and a large radius to include peripheral points, such as Manchester Airport, Manchester United’s football stadium and the borough of Salford. These spillover areas were taken into consideration in the study, as I believe they are crucial constituents of the county of Greater Manchester. Manchester Airport handles over 22 million passengers a year and supports a workforce of over 61,500 people (Manchester Airport). Manchester United’s football stadium, Old Trafford, has a
capacity of 75,635, while the Etihad Stadium of its rivals Manchester City, just 10.4 miles away, has a capacity of 55,097 (Wikipedia). These highly populated and dense areas provide centralised, Wi-Fi enabled spaces where large numbers of people with a variety of attitudes, opinions and backgrounds congregate. Popular venues and key entry points into Manchester could therefore provide greater insight into the city’s urban dynamics.

The bounding box co-ordinates for Manchester were:
- South West corner (latitude/longitude): 53.33743, -2.47055
- North East corner (latitude/longitude): 53.67800, -2.04895
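To make the role of the two corners concrete, the sketch below shows one way a point could be tested against the Manchester bounding box in Python. The `BoundingBox` class and the two test points are illustrative assumptions, not part of DMI-TCAT; only the corner co-ordinates come from the text above.

```python
from typing import NamedTuple

class BoundingBox(NamedTuple):
    """Axis-aligned box defined by its south-west and north-east corners."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat: float, lon: float) -> bool:
        # A point is inside when both co-ordinates fall between the corners.
        return self.south <= lat <= self.north and self.west <= lon <= self.east

# Corner values as reported for the Manchester capture.
MANCHESTER = BoundingBox(south=53.33743, west=-2.47055,
                         north=53.67800, east=-2.04895)

# Approximate points chosen for illustration only.
print(MANCHESTER.contains(53.48, -2.24))  # roughly Manchester city centre
print(MANCHESTER.contains(52.37, 4.90))   # roughly central Amsterdam
```

A simple comparison like this is adequate for a box of this size; it does not handle boxes that cross the 180° meridian.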

The keyword capture was based on all European language translations of Manchester, which are as follows:

European Language    Language Version
Polish               Manchester
Danish               Manchester
Italian              Manchester
French               Manchester
Spanish              Mánchester
Dutch                Manchester
German               Manchester
Portuguese           Manchester
Swedish              Manchester
Greek                Manchester
Romanian             Manchester
Slovak               Manchester
Slovenian            Manchester
Lithuanian           Mančesteris
Latvian              Mančestra
Greek                Μάντσεστερ

Amsterdam

Unlike Manchester, Amsterdam has been running on DMI-TCAT for a few years now; therefore, I had less control over the bounding box. However, this did not affect my results, as the bounding box included everything within the S100 ring road (i.e. the city centre). The co-ordinates for the bounding box were:
- South West corner (longitude/latitude): 4.768520, 52.321629
- North East corner (longitude/latitude): 5.017270, 52.425129

The keyword capture was based on all European translations of Amsterdam:

European Language    Translation
Polish               Amsterdam
Danish               Amsterdam
Italian              Amsterdam
French               Amsterdam
Spanish              Amesterdão
Dutch                Amsterdam
German               Amsterdam
Portuguese           Amesterdão
Swedish              Amsterdam
Greek                Amsterdam
Romanian             Amsterdam
Slovak               Amsterdam
Slovenian            Amsterdam
Lithuanian           Amsterdama
Latvian              Amsterdama
Greek                Άμστερνταμ
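The keyword side of the capture can be pictured as a case-insensitive match against the distinct spellings in the table above. The Python sketch below is a simplified illustration under that assumption; it is not a description of how DMI-TCAT or the Twitter API matches queries internally, and the function name `mentions_keyword` is hypothetical.

```python
# Distinct spellings taken from the translation table.
AMSTERDAM_KEYWORDS = {"Amsterdam", "Amesterdão", "Amsterdama", "Άμστερνταμ"}

def mentions_keyword(text, keywords):
    """Return True if any keyword occurs in the text, ignoring case."""
    lowered = text.casefold()
    return any(keyword.casefold() in lowered for keyword in keywords)

print(mentions_keyword("Weekend in AMSTERDAM!", AMSTERDAM_KEYWORDS))
print(mentions_keyword("Rainy day in Manchester", AMSTERDAM_KEYWORDS))
```

Because this is a substring test, the spelling "Amsterdam" already matches longer forms such as "Amsterdama"; the longer variants are listed only to mirror the table.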

3.5 Noise Reduction

This research applied a quali-quantitative approach when taking into account spam, noise or any irrelevant tweets. A quali-quantitative approach applies both qualitative and quantitative methods to social research.

For the first area of noise reduction, I used the quantitative tool Microsoft Excel to manually clean the data of any evident errors, such as by removing illegible characters (e.g. @ symbols, corrupted textual data and hyperlinks), and to detect artificial Twitter noise by working out the ratio of tweets to users. This ratio can reflect the quality of the data set, as a high ratio implies that a large number of tweets come from a small number of users. This could indicate that a user is repeatedly posting tweets, which have a high chance of containing self-promotion, non-human Twitter activity such as professional advertising, or just “daily chatter” (Java et al). These users can skew my data by mass-producing tweets, which can alter my findings, as such users do not always represent a human physical presence at the time and place the tweet was sent (Hawelka et al). A social media company might run some accounts, or an automated Twitter account might send out scheduled, pre-defined tweets. These handles provide little information regarding the dynamics of a city. Being aware of this ratio keeps me more vigilant while analysing my data, as I am informed of the characteristics of the data set. It can also provide some validation for the data set, as a comparatively low ratio represents an equal and well-balanced set of tweets.

The second area of noise reduction was to apply a qualitative method by manually categorizing each individual tweet. One of the categories I added to this process was “noise”, which covered any tweets that were not relevant to my study because they did not provide information about the city they were tweeted from; this is expanded further in my Data Categorization section.
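The quantitative cleaning step was carried out manually in Microsoft Excel; the Python sketch below only illustrates the same two ideas, stripping hyperlinks and "@" symbols and computing the tweets-to-users ratio. The function names and the four sample tweets are invented for this example.

```python
import re
from collections import Counter

URL_RE = re.compile(r"https?://\S+")

def clean_text(text):
    """Remove hyperlinks and '@' symbols, then collapse extra whitespace."""
    text = URL_RE.sub("", text)
    text = text.replace("@", "")
    return " ".join(text.split())

def tweets_per_user(tweets):
    """Return (ratio, per-user counts) for an iterable of (user, text) pairs."""
    counts = Counter(user for user, _ in tweets)
    ratio = sum(counts.values()) / len(counts) if counts else 0.0
    return ratio, counts

# Invented sample: one account posting repeatedly, one posting once.
sample = [
    ("dealbot", "Deal 1 https://example.com/a"),
    ("dealbot", "Deal 2 https://example.com/b"),
    ("dealbot", "Deal 3 https://example.com/c"),
    ("alice", "Sunny in town @friend"),
]

ratio, counts = tweets_per_user(sample)
print(ratio)                  # average number of tweets per user
print(counts.most_common(1))  # the most active handle
print(clean_text(sample[3][1]))
```

A high ratio, or a single handle dominating `counts`, would flag accounts worth inspecting by hand before the qualitative categorization.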
