
Mapmaking for Language Documentation and Description

Lauren Gawne

SOAS, University of London

Hiram Ring

Nanyang Technological University

This paper introduces readers to mapmaking as part of language documentation.

We discuss some of the benefits and ethical challenges in producing good maps, drawing on linguistic geography and GIS literature. We then describe current tools and practices that are useful when creating maps of linguistic data, particularly using locations of field sites to identify language areas/boundaries. We demonstrate a basic workflow that uses CartoDB, before demonstrating a more complex workflow involving Google Maps and TileMill. We also discuss presentation and archiving of mapping products. The majority of the tools identified and used are open source or free to use.

1. Introduction¹ Linguists engage in representing linguistic data so that others can easily see and understand what a linguist is attempting to describe. There are many ways to represent a language and its context. Some are non-spatial, such as indexes and tables. Others are spatial, although the space may be abstract, such as in Neighbornets and Bayesian phylogeography. Representing the distribution of a language in geographical space is a useful visualization, and being able to create maps is a valuable skill. People are used to processing geographical and spatial cues on a map, and well-designed maps can help researchers convey facts about a language to a wider audience (Upton 2010). However, language maps such as those found in high school atlases can give the erroneous impression of a clear relationship between nation-states and homogeneous national languages (Mackey 1988:24–25), while even detailed language maps of some linguistically-diverse areas can obscure small languages (Dahl & Veselinova 2005). The majority of the world’s languages are under-documented, and thus our knowledge of where these languages are located is also lacking. Often the location of language-speaking populations is known only to the speakers themselves and possibly specialist researchers (Dahl & Veselinova 2005). The ability to make maps has also been limited to those with access to the technical skills and capital, creating an imbalance in who can claim the right to be represented (Peeters 1992:7).

¹ Lauren Gawne would like to thank the Research Bazaar team at The University of Melbourne for their TileMill training course as part of ArtsHack, with additional thanks to Steven Bennett for server space and answering endless questions. Lauren would also like to thank Christine Gawne, who did the tedious work of tracing the maps (duplicated in Figures 1 and 2) that Lauren used before she learned to make maps for herself. Thanks also to the CartoDB support team, and the makers of GoogleMaps and TileMill. Hiram Ring would like to thank the programmers of GoogleMaps, CartoDB, and TileMill for creating excellent tools, and NASA for GPS. Hiram would also like to thank the Ethnologue team and members of SIL for introducing him to language mapping and the identification of small languages in remote and inaccessible places.

In this paper we explore the “cartographic representation of linguistic facts,” to borrow a useful definition of linguistic geography from Kehrein et al. (2010:xi). This may refer to ‘language maps’ that map the location of one or more language varieties, or ‘linguistic’ maps that map particular linguistic features of one or more language varieties (Girnth 2010:100). We particularly focus on languages that are generally underrepresented in the geographical tradition. As noted above, many maps relating to languages or dialects are out of date or based on incomplete information. This is particularly the case in remote or inaccessible areas where little is known about a particular language group or variety.

This paper introduces readers to language mapmaking in relation to the language documentation process. This includes some of the considerations necessary for making language maps, and the introduction of techniques and methods that have made the task of map creation much easier in the last few years (see Dahl & Veselinova 2005, who a decade ago observed a lack of an easily accessible tool for language mapping). These techniques are centered around making use of Global Positioning System (GPS) data with free software such as Google Earth² and TileMill³ and online services such as CartoDB,⁴ converting the data into features to overlay on a digital map of the world. We first discuss some considerations regarding mapmaking, particularly in the language documentation context (§2). We then identify ways of gathering GPS data (§3.1), how to convert the data into a usable format (§3.2), how to use the data with various free mapping software (§3.3–3.5), and how to share your map, including exporting the map as a print-ready image (§3.6).

Mapping has a long history in linguistic research, even if minority and endangered languages and varieties are generally underrepresented in the literature. Some of the earliest mapping of linguistic information by geographical area in the Western linguistic tradition includes the Sprachatlas des Deutschen Reiches by Georg Wenker and Ferdinand Wrede in 1888 (Scheuringer 2010) and the Atlas Linguistique de la France⁵ by Jules Gilliéron in 1902 (Swiggers 2010). Since then, maps have been used by linguists for a variety of purposes, such as the historical mapping of dialects (Moore et al. 1935), sociolinguistic variation and dialect studies (Trudgill 1983; Labov et al. 2006; Kretzschmar 2013b), language attitudes (Waller 2009), perception of variation (Alfaraz 2014; Bounds 2015; Yan 2015), as well as variation and change over time (Kulkarni et al. 2014; MacKenzie 2014; Prichard 2014), and many other areas of linguistics.

Mapping is not an entirely objective activity, and methods of mapping draw on the same theoretical underpinnings as any other tool of linguistic analysis. For example, as Kretzschmar (2013a:53) observes, mapping language variation is grounded in a Neogrammarian assumption that there are ‘dialects’ which may exist and can be mapped (see also Labov et al. 2006:5; Murray 2010). While it is possible to undertake a kind of word geography, where points are plotted without any claim to dialects, variationist approaches may result in striving for neat generalizations in messy data (Kretzschmar 2013a:53–54). Mapping has also been a valuable tool in linguistic typology (Mühlhäusler 2010), and many of the mapping skills in this paper can also be applied to maps illustrating typological features.⁶

² http://earth.google.com.

³ http://www.mapbox.com/tilemill.

⁴ http://cartodb.com/.

⁵ Online at: http://diglib.uibk.ac.at/ulbtirol/content/titleinfo/149029.

Unlike sociolinguists and others who work with relatively better-described languages, those who document languages are often providing a baseline of knowledge that was previously unavailable and which can serve as a stepping-stone for future research (cf. Roth 2013, building on previous work by SIL). Although we touch on some other uses of maps in linguistics, our aim here is not to produce a comprehensive discussion of the history of mapmaking in linguistics. Auer & Schmidt (2010) and Lameli et al. (2010) provide comprehensive discussion of the history and contemporary state of linguistic mapping, and further useful information can be found in chapters of Krug & Schlüter (2013), as well as Kretzschmar et al. (2014).⁷ The Journal of Linguistic Geography⁸ was launched in 2013, and provides recognition that mapping is not simply a factual representation, but a core element of the “…pursuit of a better understanding of the nature of language structure and language change” (Labov & Preston 2013:1). The process we describe can be useful to linguists in many subfields, not just language documentation. We hope to encourage more accessible and engaging mapping and visualizations that can add to our knowledge and understanding of language and of human interaction.

Throughout this paper we will be drawing on maps that illustrate some facet of information about language, or about communities of language speakers. This will include maps of language groups that we have worked with. Author Gawne works with varieties of Yolmo (Tibeto-Burman, ISO 639-3 scp) that are spoken in Nepal. The traditional area that Yolmo is spoken in centers on the Melamchi and Helambu valleys north of Kathmandu; however, there are multi-village communities that have been living in other parts of Nepal for over a century (see Gawne 2013b for more detail). These include communities in Lamjung, Ilam, and Ramechhap (where the language is known as Kagate, ISO 639-3 syw). Gawne has worked mostly with the Lamjung Yolmo and Kagate varieties. Author Ring works with speakers of Pnar (ISO 639-3 pbv) and other Khasian languages spoken in Meghalaya state of North-East India, Austroasiatic languages in Thailand, and languages in North-East India which border Pnar areas, such as Biate (Tibeto-Burman, ISO 639-3 biu) and Assamese (Indo-Aryan, ISO 639-3 asm).

⁶ WALS, mentioned in §2.1, is an example of this on a large scale. For an example of mapping a specific typological feature, see this map by the first author plotting the distribution of reported speech evidentiality in Tibeto-Burman languages: http://lgawne.cartodb.com/viz/95fc31aa-3543-11e5-bc55-0e0c41326911/public_map.

⁷ A useful and accessible blog post from Hugh Paterson can be found here: http://hugh.thejourneyler.org/2012/types-of-linguistic-maps-the-mapping-of-linguistic-features/.

⁸ http://journals.cambridge.org/jid_JLG


2. Considerations regarding mapmaking In this section we will discuss the nature of linguistic mapmaking, and considerations that will need to be addressed before starting the mapmaking process. This includes both practical and theoretical considerations. Even though language documentation often involves site-specific communities who have lived in an area for a long period of time, there has been a general lack of discussion regarding mapmaking in the language documentation literature. Bowern (2008:111) mentions the need to take a device on fieldwork to collect GPS data and, where possible, use existing high-quality maps to mark features. The ability to generate your own high-quality maps from the GPS data collected has simplified this process somewhat, as we will show below.

Mapping involves the manipulation of what are actually a limited number of data types. In §3 we introduce file type conventions and manipulation, but it is worth covering some basic terminology here: points, lines, polygons, and layers.

Point data is a pair of latitude and longitude coordinates represented as a single point. What is represented by a point depends on what is being mapped: it may be a physical object like a house, or a proxy, such as a house standing in for the place where a person lives, even though the person obviously moves away from that point.

Line data is a series of two or more points that are joined. They can be used to represent physical information, such as paths or rivers, non-physical socio-political boundaries, potential dialect boundaries or migration pathways. Lines are also known as vectors in some of the literature.

Polygon data is a series of points that form a bounded shape. As with the other two data types, this may be a representation of a physical object like a lake, or a more abstract bounded locale like a dialect region.

Layers are a way of organizing and styling map data. When thinking of maps it is best to think in terms of layers. Each group of thematically linked data that is kept in the same file will be represented and manipulated in a map as a single layer. A layer will contain only one type of data; that is, it will contain exclusively points, lines or polygons. Knowing whether to map points, lines or polygons is part of deciding what story you want your map to tell. We discuss some of the issues around this in §2.4.
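The four terms above can be made concrete with a small sketch. It assumes GeoJSON as the storage format, which is one common choice and not the only one; all names and coordinates are invented for illustration. Note that GeoJSON writes positions as longitude first, then latitude. GeoJSON itself allows mixed geometry types in one collection, so the one-type-per-layer check in the hypothetical `make_layer` helper reflects the convention described here, not a requirement of the format.

```python
import json

# Hypothetical coordinates for illustration only; GeoJSON stores
# positions as [longitude, latitude].
village = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [92.20, 25.45]},
    "properties": {"name": "Village A"},
}

path = {
    "type": "Feature",
    "geometry": {
        "type": "LineString",
        "coordinates": [[92.20, 25.45], [92.30, 25.50], [92.40, 25.48]],
    },
    "properties": {"name": "Road between villages"},
}

# A polygon ring must end on the point it started from.
language_area = {
    "type": "Feature",
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [92.10, 25.40], [92.50, 25.40], [92.50, 25.60],
            [92.10, 25.60], [92.10, 25.40],
        ]],
    },
    "properties": {"name": "Approximate language area"},
}


def make_layer(features):
    """Bundle features of a single geometry type into one layer."""
    kinds = {f["geometry"]["type"] for f in features}
    if len(kinds) != 1:
        raise ValueError("a layer should hold one geometry type, got %s" % sorted(kinds))
    return {"type": "FeatureCollection", "features": features}


layers = {
    "villages": make_layer([village]),
    "roads": make_layer([path]),
    "areas": make_layer([language_area]),
}

print(json.dumps(layers["villages"], indent=2))
```

Keeping each layer in its own collection means that points, lines and polygons can be styled or switched off separately later.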

Effective mapmaking skills allow researchers to make maps that specifically convey relevant information. By way of demonstration of the advantages of well-made maps, Figure 1 and Figure 2 are the maps used by Gawne in her completed PhD dissertation. This outline of Nepal was traced from an atlas, and then MS Paint was used to label the relevant language varieties. The villages in Figure 2 were identified in Google Earth using satellite imagery and the road traced in approximation. These maps offer limited contextual information and are limited in their reproducibility.

Figure 1. Map of Nepal with Yolmo villages used in Gawne (2013a)

Figure 2. Map of Yolmo villages in Lamjung used in Gawne (2013a)

We are unable to reproduce the map Gawne was using earlier, as it was an image with copyright restrictions. Copyright-restricted lack of reproducibility is, in itself, sufficient reason for such a map to be unsatisfactory, but it also contained a great deal of irrelevant information and was not very visually appealing. In contrast to the hand-drawn and difficult to replicate maps above, Figure 3 is a map that Gawne created after attending a mapping workshop, using TileMill and GPS data she collected. Unlike Figure 1 it is easily re-purposed and replicated, and looks much more professional. This map has appeared in subsequent publications including Gawne (2013b). In §3.5 we will demonstrate how to create a map like the one in Figure 3.

Figure 3. Map of Nepal made in TileMill used in Gawne (2013b)

2.1 Different types of maps In this section we introduce some further terminology to distinguish different types of mapping products that we will use throughout the paper. We identify a number of different map product types, such as paper maps and digital maps. Within digital maps we distinguish between static maps, dynamic maps, and multimedia maps. Each of these map types has its own specific advantages and limitations.

Print maps have a number of uses. Such maps are still required in traditional publishing formats. These maps are not dependent on digital literacy or access to an internet connection. Additionally, communities may appreciate having a physical copy of a map as a sign of legitimation (Bowern 2008, ch. 14). Paper maps are often a product of digital maps, but may require different colors or designs to look good.

Historically, print maps were created manually; Kretzschmar (1996:15) notes that the maps for the Linguistic Atlas of the Middle and South Atlantic States (LAMSAS) were originally all hand-drawn, taking six to eight hours to complete a single map of a single feature of lexical variation. Only then could isoglosses be observed, if they existed, and analyzed. This meant that the researchers had to be highly selective about which features to map.

Digital maps are those with recoverable geolocation information. It is possible to design a map using digital illustration software by drawing directly onto an existing map. These images do not involve spatial encoding and so the data is not recoverable, and cannot be used to make other maps. Kretzschmar (1996) illustrates the benefits of keeping geospatial data separate from the image rendered. The same LAMSAS datasets that would take a full working day to draw were digitally encoded and in 1992 a GIS program took around 90 seconds to plot all the points. This meant that more maps could be generated in less time, allowing for quicker observation of potential isoglosses and full use of the survey data. As Labov & Preston (2013:1) observe, digital maps are not constrained by printed space, budgets for color printing, or other practicalities that have been holding linguistic geography back as a discipline.
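As a minimal sketch of what keeping the data separate from the rendered image can look like, the example below stores survey points as plain CSV and derives two different point layers from the same records. The column names, village names, coordinates and the helper functions `read_points` and `to_feature_collection` are all assumptions made for illustration, and the output uses GeoJSON-style dictionaries as one possible target format.

```python
import csv
import io

# Hypothetical survey records; names, coordinates and varieties are invented.
SURVEY_CSV = """name,lat,lon,variety
Village A,28.01,85.54,Variety 1
Village B,28.20,84.40,Variety 2
Village C,27.45,86.05,Variety 2
"""


def read_points(text):
    """Parse CSV text into a list of dicts with numeric coordinates."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["lat"] = float(row["lat"])
        row["lon"] = float(row["lon"])
        rows.append(row)
    return rows


def to_feature_collection(rows, keep=lambda row: True):
    """Build a GeoJSON-style layer from the rows that pass the `keep` test."""
    features = [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [r["lon"], r["lat"]]},
            "properties": {"name": r["name"], "variety": r["variety"]},
        }
        for r in rows
        if keep(r)
    ]
    return {"type": "FeatureCollection", "features": features}


points = read_points(SURVEY_CSV)

# Two different maps drawn from the same stored data.
all_villages = to_feature_collection(points)
variety_2_only = to_feature_collection(points, keep=lambda r: r["variety"] == "Variety 2")

print(len(all_villages["features"]), len(variety_2_only["features"]))
```

Because the CSV is never altered by either view, a third or fourth map can be generated later without redrawing anything.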

Not all digital maps are the same, however, and in the paragraphs below we describe three varieties of digital map that will be relevant to our discussion.

Static maps have a fixed number of features that are presented. Print maps are inherently static, as they are fixed. Digital maps may be static, although even in static digital maps it may be possible to scroll and zoom. In the WALS maps it is possible to also change the base map and the style of the points used, but the map still shows fundamentally the same data. Static maps can be made in all digital mapping software.

Dynamic maps, in contrast to static maps, allow the user to alter what information they see, be that through an ability to toggle certain features on or off or by showing different information at different zoom levels. The kinds of features that can be built into a dynamic map depend on the software used to build the map. For example, CartoDB gives you the option of publishing a digital map where the viewer can easily toggle features on or off. However, to build a map where different features appear at a certain level of zoom requires the user to move beyond the simple interface and to code that information using their knowledge of HTML/CSS. Dynamic maps can also represent changes and movement over time, which can be useful for illustrating migration patterns. Although easily created animated maps still have a way to go, Hanewinkel & Losang (2010) provide an introduction to the topic.

Multimedia maps integrate other media into a map. For example, hovering over or clicking on a village name may bring up an image of that village, or a recording of the village name. The Algonquian Linguistic Atlas⁹ has recordings linked to mapped points. Media can be linked to static maps, or form part of a dynamic map. Different software can allow users to build in multimedia links with varying degrees of ease.

The status of maps as objects has changed drastically in recent decades. Peeters (1992:7) and Dahl & Veselinova (2005) note that static maps must, by their nature, exclude and overcrowd information. While all maps are selective snapshots of a moment in time, modern digital mapping offers some solutions to the problems inherent in static maps.

Digital maps allow collected data points to be represented in different ways, so that multiple maps can be made to demonstrate the complexity of a linguistic situation. Also, each map feature set is produced on its own layer of the map, so additional information can be easily toggled on or off. Digital maps can include a zooming feature, allowing for detailed village-by-village views on closer zoom and more generalized polygons at greater distance. These digital maps are obviously not suitable for print publications, and there are still the same concerns about what is included and excluded; however, they allow for more complex language stories to be told.

⁹ http://www.atlas-ling.ca/
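The zoom-dependent behavior described above comes down to a simple rule, sketched here in isolation from any particular mapping tool. The layer names and the threshold value are hypothetical, and the sketch assumes the usual web-map convention in which larger zoom numbers mean a closer view.

```python
# Hypothetical threshold: at zoom 9 and closer, show individual villages;
# further out, show the generalized language-area polygon instead.
VILLAGE_MIN_ZOOM = 9


def layers_for_zoom(zoom, village_min_zoom=VILLAGE_MIN_ZOOM):
    """Return the names of the layers to draw at a given zoom level."""
    if zoom >= village_min_zoom:
        return ["village_points"]
    return ["language_area_polygon"]


for zoom in (5, 9, 12):
    print(zoom, layers_for_zoom(zoom))
```

In practice this rule would be written in the styling language of whichever tool renders the map; the logic stays the same.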

2.2 Currently available maps Linguists generally use one of a number of strategies when making language maps for presentation and publication. Some may use existing resources like Google Maps, dropping digital pins to locate languages, or may use image editing software to add that information to an existing static map. While this serves the necessary functional purpose, it is limited. Firstly, the information drawn onto the map is hard to re-purpose into another map, unlike digital mapping where GPS data points are created and stored separately from the styled map. Secondly, every map is designed with a specific purpose in mind (cf. MacEachren 1995), so co-opting an existing map may mean including extraneous information. Thirdly, there may be limits to reproducing such maps in publications for reasons of copyright and visual clarity (i.e., graphic resolution). Finally, these maps are often not particularly aesthetically appealing. Speakers of endangered languages are often minorities, rendered invisible on official maps. Making targeted and professional-quality maps for these languages can help increase their perceived status, both within the community and among the general public.

The most comprehensive language maps that currently exist are those created by the Summer Institute of Linguistics (SIL), which are updated periodically for editions of the Ethnologue (Lewis et al. 2013), based on the World Language Mapping System (WLMS).¹⁰ This information is also used to provide the GPS information for ISO 639-3 labelling,¹¹ which is used by services such as Glottolog¹² and OLAC.¹³ These maps can be useful, especially the ISO 639-3 point data; however, they also have some limitations. The first is that individual linguists and specialists would often prefer to have a map that illustrates different features than what these larger datasets offer.

Linguists (and the larger linguistic community) would benefit from a way to create new maps based on their research and knowledge of a particular area, using GPS data to clarify and assist their efforts. These maps could then be posted online or be printed in publications. The second limitation is that these maps are licensed products of WLMS and cost money to obtain. Current publicly available software is free to use, and builds upon open access data. This means that maps created by researchers can be easily distributed online and in print without the licensing limitations that pre-made maps and map data can have.

There are a number of other projects that have a strong geo-spatial element and a focus on global linguistic diversity. Many researchers interested in typology will have used the World Atlas of Language Structures (WALS).¹⁴ The discussion of each typological feature in WALS includes a map of the distribution of that feature across the languages surveyed. On a similar scale, the Endangered Languages Project¹⁵ employs a map based on Google Maps technology. The data that formed the basis of the initial launch of the project was put together by the Catalogue of Endangered Languages (ELCat) project, and is drawn from a range of sources. Each location source is cited on the language page. For example, the geolocation information for Yolmo (also known as Helambu Sherpa) is from Glottolog,¹⁶ while the geolocation information for Alyawarr and many other Australian languages comes from Bowern (2011).¹⁷ If there is more than one source with differing geolocational information, the Endangered Languages Project cites both. Language Landscape¹⁸ is a web-based participation project that takes a different perspective on the distribution of language, and instead plots the geospatial location of an individual when they make a specific recording. This moves beyond the representation of languages as grounded in specific locations, and instead maps specific moments of language use.

¹⁰ http://www.worldgeodatasets.com/language/.

¹¹ http://www.ethnologue.com/about/language-maps

¹² http://glottolog.org/.

¹³ http://www.language-archives.org/REC/discourse.html.

¹⁴ http://wals.info/.

There is also an increasing number of projects that effectively use language mapping for a specific language, or group of languages. The Algonquian Linguistic Atlas has a large collection of phrases spoken in Algonquian languages from across Canada; the viewer can click on each one to hear how the phrase is spoken in the language of that area. The Yup’ik Environmental Knowledge Project¹⁹ is also concerned with mapping indigenous language, but focuses on place names and other geographic knowledge of the Yup’ik communities. Bowern’s work on the history of Australian languages includes the production of mapping data as both a research tool and a research output.²⁰ There are many other useful examples of maps used for research and educational purposes. We hope the examples discussed here provide some illustration for the interested linguist about different ways in which it is possible to use language mapping to support both research and community-oriented purposes.

2.3 Ethical considerations Language maps can be problematic constructions. As static and selective representations of complex phenomena, they can act as both representations of power and tools of power (Luebbering 2013:49). As Peeters (1992:7) notes, there is a certain authority given to a place or a people by putting them on a map. For endangered language communities this may be a good thing, giving them recognition and status that they may never previously have experienced, even if those maps are only used within the community and research publications. But mapping can also be used as a tool of division, or evidence in territorial claims. Therefore, linguists need to think very hard about the potentially exclusionary or over-expansive story their map might tell.

¹⁵ http://www.endangeredlanguages.com.

¹⁶ http://www.endangeredlanguages.com/lang/5710. (The Glottolog data appears to be based on the Ethnologue information, and is available here: http://glottolog.org/meta/downloads)

¹⁷ http://www.endangeredlanguages.com/lang/5249.

¹⁸ http://languagelandscape.org/.

¹⁹ http://eloka-arctic.org/communities/yupik/atlas/index.html.

²⁰ http://www.pamanyungan.net/.

It is also worth considering whether the community will want their location mapped and distributed. While some groups may approve of this activity, it should not be assumed that everyone wants to be ‘on the map.’ It may be appropriate to only indicate their location very generally within the larger nation or state where they reside, and then present a more detailed map that doesn’t necessarily give traceable GPS information. For some communities, being locatable may be politically or socially undesirable. It may be appropriate to still collect geospatial data for the researcher’s project, but not make publicly available maps. There may also be contexts where it is not appropriate to collect this data, or to map anything other than a very generalized location.
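One simple way to publish only a general location is to coarsen the coordinates before they go onto a public map. The sketch below does this by rounding; the function name `generalize` and the coordinates are hypothetical, and rounding alone is no guarantee that a settlement cannot be identified, so the appropriate level of detail remains something to agree with the community.

```python
def generalize(lat, lon, decimals=1):
    """Round coordinates so the published point only indicates a broad area.

    One degree of latitude is roughly 111 km, so one decimal place
    corresponds to a grid of roughly 11 km north-south.
    """
    return (round(lat, decimals), round(lon, decimals))


# Hypothetical field coordinates, kept private in the researcher's records.
precise = (28.0123, 85.5432)

# Coarser version that could be used on a public map, if the community agrees.
public = generalize(*precise)
print(public)
```

The precise values stay in the project's own records, and only the rounded pair is passed to the mapping software.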

Researchers should check with their IRBs about the handling, storage, and dissemination of geolocation data in relation to the specific context of each research project. We believe, however, that it is important to ensure that you do have a process in place for seeking permission from the community to use geospatial data. This is particularly the case if multimedia maps are made featuring images of community spaces linked to their specific geographical location. The kinds of permission you will need depend on the social structure of the community you are working with; in some communities there may be a specific official body that it is best to approach about collecting this type of data (for more on this topic see Bowern 2008:3; Chelliah & De Reuse 2011:124).

It is also worth considering whether all local geographic features are appropriate for mapping. Some communities may have sacred spaces or totem lands that they do not wish to have located. Including the community in any GPS coordinate plotting and sharing early map drafts with them can help ensure that they are comfortable with the amount of information you are building into a map. As in all language documentation data collection, sensitivity to context is key.

2.4 What to put on the map Choosing what to put on a map can change how it is read. Girnth (2010) suggests asking four basic questions before starting a map:

1. What is the aim of the map?

2. What is mapped?

3. How is it mapped?

4. What mental picture does the map produce?

In this paper, we take it that the aim of the map is to represent the location of one or more languages or language communities. But for the second question, there may be other geological or political features that need to be considered. You may choose to include national or local borders to indicate the relationship between a language group and larger geo-political constructions. Including other language groups or urban centers may help contextualize the linguistic landscape. Some features of the local geography may provide specific challenges to map clarity that will need to be addressed; as one reviewer noted, the Himalayas provide a particular challenge in this regard due to the rapid changes in elevation. Different map types will allow you to focus on different features and help you and the community visualize what is most important.

Within the target group being mapped, you may choose to map them as a single unit or break the groups down further. You will need to decide on what grounds you wish to make such distinctions and what divisions that may indicate in a map. Some groups that you have good reason to consider as separate dialects or languages on linguistic grounds may have strong social or cultural reasons to be identified as a single group.

Of course, the location of language groups may not be the only language-related mapping target. As Luebbering (2013:44) notes, there is a diversity of topics and variables, and their geographic arrangement, when it comes to languages. With the tools we demonstrate in this paper, it would also be possible to create maps that illustrate lexical variation between groups, density of language maintenance in particular areas, or areal contact.

Whatever you plan to map, considering the design elements will be vital to ensuring the map tells the story you want. This addresses Girnth’s third question above. Ormeling (2010) provides a good overview of the kinds of elements you will need. Some will be more relevant to particular mapping projects than others.

The ‘projection’ is the way that the Earth’s sphere is rendered in flat space. Some projections can make the polar areas appear relatively large compared to equatorial areas; however, if you are mapping a space the size of Germany or smaller, as is most likely when mapping smaller languages, any distortion from different projections should not be noticeable (Ormeling 2010:29). The ‘zoom level’ will determine how much terrain is visible in the mapped pane, be it the entire globe or a small pocket of land. The colors and forms you use for different map elements will also be part of the story. Points of the same size will be perceived to represent a similar number of speakers, and darker colors will be more prominent on the map and should be used to map key elements. Digital mapping allows you to try different design styles, and modifying different features will give you insight into what works best for the map you wish to make.
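The enlargement of polar areas can be quantified for one familiar case. The sketch below assumes the Mercator projection, the family on which many web maps are based; other projections distort differently. It computes how much distances are stretched at a given latitude relative to the equator. The function name `mercator_scale` is ours.

```python
import math


def mercator_scale(lat_degrees):
    """Linear scale factor of the Mercator projection at a given latitude.

    Distances on the map are stretched by 1 / cos(latitude) relative to
    the equator, which is why polar regions look enlarged.
    """
    return 1.0 / math.cos(math.radians(lat_degrees))


for lat in (0, 30, 60, 80):
    print(f"{lat:>2} degrees: x{mercator_scale(lat):.2f}")
```

At 60 degrees the factor is 2, so features there are drawn twice as wide as same-sized features on the equator. Comparing the factor at the northern and southern edges of a planned map gives a rough sense of how much the scale varies across it.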

Accessibility for people with color-blindness is another design consideration to keep in mind, as some color distinctions are difficult to see for up to eight percent of the population. If you are using a red and green distinction it is possible this may not be visible to some members of your audience. There are a number of color-blindness simulators online so that you can see the effect of some design choices on common color-blindness types.²¹ The advantage of making digital maps is that it is very easy to modify color schemes or, if you are required to work with a particular palette, to duplicate the map in color-safe contrasts as well.²²
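One practical way to produce a color-safe duplicate is to start from a palette designed for that purpose. The sketch below uses the Okabe-Ito palette, a widely used color-blind-friendly set; choosing it is our assumption here and not a recommendation from the sources cited above, and the category names and the helper `assign_colors` are hypothetical.

```python
from itertools import cycle

# Okabe-Ito palette, a widely used set of colors chosen to stay
# distinguishable under common forms of color-blindness.
OKABE_ITO = [
    "#E69F00",  # orange
    "#56B4E9",  # sky blue
    "#009E73",  # bluish green
    "#F0E442",  # yellow
    "#0072B2",  # blue
    "#D55E00",  # vermillion
    "#CC79A7",  # reddish purple
]


def assign_colors(categories, palette=OKABE_ITO):
    """Map each category to a palette color, reusing colors if they run out."""
    return dict(zip(categories, cycle(palette)))


# Hypothetical map categories.
colors = assign_colors(["Variety 1", "Variety 2", "Contact language"])
print(colors)
```

It is still worth checking the finished map in a simulator, since point size, background imagery and print conversion also affect legibility.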

Each language context will present its own challenges and points of interest for the task of mapping. Any map is inherently controversial, as no one map will satisfy all users (Peeters 1992:6), but there are some generalizable considerations to be made.

²¹ We like: http://www.color-blindness.com/coblis-color-blindness-simulator/.

²² Duplication was the solution for these Australian variation maps from the Linguistics Roadshow, with a series of color safe maps: http://lingroadshow.com/resources/englishes-in-australia/vocabulary/mapping-words-around-australia/mapping-words-around-australia-cs/.

For this paper we are assuming that the map will be of the location of settlements in which the target variety is spoken, be they individual homesteads, small villages, or large towns and cities. You will have to decide if you want to mark all of these as individual points (which may be preferable if there are a manageable number of tokens), or to indicate the language area with a single polygon (an enclosed shape made from a collection of points; see GPS section below for more detail). You may wish to represent each village as a separate point, but then have larger polygons representing the influence of different contact languages or related languages. The following images from Google Earth demonstrate these differences in visualization: Figure 4 provides a map of some Pnar-speaking towns in NE India as individual points, Figure 5 represents these towns as a single polygon, and Figure 6 represents both points and polygon in relation to other polygons representing other language communities.²³ Each allows for a different understanding of the relationship between Pnar speakers and neighboring linguistic communities, giving different mental pictures of the distribution of Pnar as far as Girnth’s fourth question is concerned.
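The polygons in the figures were drawn by hand. As an alternative sketch, and not the method used for those figures, a first-draft polygon can be computed from the village points as their convex hull, here with Andrew's monotone chain algorithm. The coordinates are invented, and longitude and latitude are treated as flat x and y values, which is only a reasonable approximation over small areas.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the convex hull of (lon, lat) points, counter-clockwise.

    Uses Andrew's monotone chain algorithm; points on the hull's edges
    and in its interior are dropped.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


# Hypothetical village coordinates as (longitude, latitude).
villages = [(92.1, 25.4), (92.5, 25.4), (92.5, 25.6), (92.1, 25.6), (92.3, 25.5)]

hull = convex_hull(villages)
ring = hull + [hull[0]]  # close the ring so it can be stored as a polygon
print(ring)
```

A convex hull will often cover land where the language is not spoken, so it is best treated as a starting shape to be corrected with local knowledge and not as a claim about the language area.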

Figure 4. Some Pnar villages as points (Basemap © Google Earth, rest © Ring)

23 It can be noted that the polygons drawn here in Figures 5 and 6 have odd corners. This is partly attributable to the difficulty associated with using a mouse to draw the points. One promising way that the authors have used to get around this difficulty is to load Google Earth on a touch-sensitive device with a stylus, such as the Microsoft Surface Pro. Polygons and lines produced with the stylus are much closer to the lines drawn by hand on paper.


Figure 5. Pnar villages collectively represented as a polygon (Basemap © Google Earth, rest © Ring)

Figure 6. Some Pnar villages represented as points and a polygon in contrast to polygons of neighboring languages (Basemap © Google Earth, rest © Ring)


2.5 The challenges of borders in language mapping There are challenges with assigning languages to space and delimiting between languages (Williams & Ambrose 1988). One particular challenge is how to represent the borders of different language groups. Unlike natural geological features, which are inherent to the landscape, and geopolitical borders, which are measured with great precision, the fluidity of language and the nature of language contact can make representing the edge of a language group a particular challenge.

Mapping methods with one-language-per-area and clear borders between language groups are at odds with our understanding of the way languages can often co-exist in an environment (Luebbering 2013:41). Dahl & Veselinova (2005) suggest that with modern mapping technology and small language speaking populations the best way to represent endangered language communities is to map at the level of individual settlements where feasible. This allows us to see the specific location that languages are linked to, rather than forcing ourselves to carve specific dominions for each language. In Figure 6, this is represented by overlapping borders, with a particularly large population of an exogamous language group being identified by a polygon overlay within the Pnar area.

Even this kind of boundary-marking may distract from the reality of language contact, bilingualism, trade, or marriage movements in an area (Mackey 1988:26; Williams & Ambrose 1988:110; Stanford 2009; Luebbering 2013:44). A focus on language or dialect boundaries distracts from the fact that there are many interesting features of the contact between language groups. Conversely, linguistic boundaries may correlate with other kinds of boundaries, which may be part of a larger story of migration, history, and trade. As a classic example from language variation in North America, Labov (2001:208) discusses the ‘Inland North’ speech area, which also serves as a boundary for a variety of other activities including migration, politics, and woodwork joining styles in house-building. For those interested in establishing isoglosses for variation boundaries within a group, Labov et al. (2006:41-44) provide a detailed methodology in the introduction to The Atlas of North American English.

Dynamic digital maps allow for the viewer to toggle on and off certain information, depending on how the map is built, which may allow for better representation of these factors than a static map. Building your own map allows you to represent the areal reality in the best way possible; however, representing borders will always be a particular challenge to keep in mind.

2.6 Possibilities of GIS In §3 we demonstrate how to make basic maps for locating the distribution of languages in space. This is not the only use of mapping that linguists may find useful. Geographic Information Systems (GIS) allow not only for the collection and representation of geographic data, but analysis as well. While exploration of the possibilities of GIS for language documentation projects is beyond the scope of this paper, we hope that in giving language documentation researchers a basic introduction to language mapping we can open the possibility of further collaboration with GIS researchers.


Hoch & Hayes (2010) give a good overview of GIS/linguistics research and discuss potential analytical possibilities in the field. Montgomery & Stoeckle (2013:57-64) give an updated discussion with particular reference to Perceptual Dialectology research (see p. 75 for a description of GIS visualization of aggregated PD data; see also Preston 2010 and Goebl 2010 on perceptual dialectology). Kretzschmar (1996; 2013a; 2014) provides a good overview of the analytic possibilities when working with variation data, such as the likelihood of eliciting a particular item in a particular area, and Wattel & van Reenen (2010) give an introduction to probabilistic maps in variation.

Linguists may find that working with GIS practitioners gives them new perspectives on language data in relation to other geographical features. Manni (2010) gives an example of mapping both linguistic and genetic variation, demonstrating that Dutch dialects and the distribution of genetic features show some parallels but linguistic features are not dependent on genetics alone. Inoue (2010) looks at the relationship between rail line distances and use of standardized linguistic forms in Japan, finding cross-dependency between language phenomena and infrastructure access, rather than just absolute distance between varieties.

If linguists have not made full use of the possibilities of geographic research, it is comforting to know that many geographers have long neglected the complexities of mapping language (Williams 1988:1). This situation has changed, with the development of the field of ‘geolinguistics’ or ‘linguistic geography’ (Mackey 1988; Williams 1988; Luebbering 2013; Labov & Preston 2013); however, there is still much possibility for developing geolinguistics within the domain of endangered languages. Hildebrandt & Hu (2013) offer one example of such a collaboration, creating a multimedia mapping of languages and their varieties in the state of Manang in Nepal, which is helping to visualize and re-think the complex settlement patterns and histories of Manang, as well as providing new perspectives on the vitality of the languages spoken in the area. Linguists who understand the basic process of language mapping will be better equipped for possible collaboration with GIS researchers to identify existing research or develop new research directions (cf. Chamberlain 2015, in which language shift was related to environmental watersheds using maps).

3. Making maps Above we briefly discussed how geographic maps provide a useful visual representation for language documentation. In the following sections we work through the process of making maps, from GPS data gathering and processing to using that data to build a map. Currently available tools mean that making effective maps, both static and dynamic, is a skill easily within reach of language documentation practitioners.

It is possible to make many different maps using the same dataset, so it is important to have decided exactly what you want your map to represent before starting the process. We begin with an explanation of how to gather data using GPS (§3.1), and how to make this data ready for mapmaking (§3.2). We then introduce popular software for map creation (§3.3), which varies in terms of the degree of flexibility it offers and the amount of coding required to manipulate the final product. We introduce a simple workflow (§3.4) using CartoDB, which allows you to add and manipulate data, and has many predefined map features. Following this we introduce a more complex workflow (§3.5), involving editing data in Google Maps and styling the final map in TileMill. This option requires use of CartoCSS, a simple markup language based on CSS and built specifically for mapmaking. We conclude with a discussion of the distribution of electronic and static maps, attribution of copyright and archiving (§3.6).

3.1 Gathering GPS data GPS data is the primary means of locating positions on Earth in the 21st century. Satellites orbiting the Earth have allowed us to identify particular points in relation to a global spatial map since at least the 1980s (see Parkinson 1996; Guier & Weiffenbach 1997; Pellerin 2006). In the last 10 years this has become more commonplace for consumers, with dedicated GPS devices becoming accessible for hikers, trekkers, and ocean and land navigation.

As noted above, our focus in this paper is to describe maps and mapmaking for linguists doing language documentation who want a visual representation of geographical spaces and would like to use open-source software or tools.24 Collecting GPS location data is another facet of language documentation workflow that needs to be considered when developing and undertaking fieldwork. As van Uytvanck et al. (2008) note, it is important to consider the method of collecting this data and storing it.

In the last 5 years GPS has become an essential part of smartphone technology and is used by various apps to identify users in relation to nearby businesses and advertising, or for navigation in apps such as Google Maps, with GPS functionality generally being referred to as ‘Location Services’.25 Smartphones provide a useful tool for the mapmaking field linguist. Smartphone use is increasing in developing countries (where many of the world’s unwritten and under-described languages are spoken), and carrying a smartphone will no longer make you look out of place. There are also many free GPS tracking apps that can store and display real-time GPS information. Thus, collecting GPS data that can be used in mapping does not have to be an additional task in the linguist’s workflow, nor interrupt other activities. Searching for ‘GPS’ in your phone’s app store brings up thousands of free and paid apps that can be downloaded and used for a range of purposes.

24 Excellent proprietary mapping tools do exist and are widely used, such as ArcGIS (http://www.arcgis.com), but since many of the features that linguist mapmakers will use are shared with open-source tools, we describe the latter here.

25 Location services are available on any modern mobile smartphone. On an iPhone, this is accessed by selecting ‘Settings > Privacy > Location Services’ and turning Location Services to ‘On.’ Then location services for individual applications can be enabled. To have photos that you take on your device automatically store GPS coordinates, for example, select ‘Camera’ under Location Services and turn on the location services for the built-in camera. Other phones may vary in how the location services are enabled for photos, but the process is similar: searching ‘How to enable location services for camera on [Name of Your Device]’ using Google should give step-by-step instructions for your particular smartphone.


With location services enabled for photographs taken on a smartphone device, the GPS information is automatically recorded when the picture is taken.26 It is often culturally acceptable and even expected in many places that a foreigner would take pictures at every opportunity, allowing for the collection of GPS locations while also taking images, although sensitivity should of course be exercised. The photographs may be used by the linguist as visual aids in reconstructing their travels for mapmaking, or they may form part of the final mapping output by linking the mapped points to the images in a multimedia map. This is particularly useful in ethnobotanical or local history projects. As we discussed in §2.3 the researcher must always be attentive to their ethical obligation to the community when collecting photographic as well as GPS data.

Keep in mind that this location information may be displayed or recoverable online publicly if you upload the photo to a public site such as Flickr, Facebook, Instagram, or a personal website. Here your particular privacy settings will determine the photo’s accessibility. You may have to remove the GPS information from the photo if you do not want the GPS coordinates to be included, as they are embedded in the image file metadata (with ‘location services’ on). While many people do not mind this, it is worth being aware of, particularly if you or community members have privacy concerns.

This location information is often viewable on your device by selecting the photo and looking at its ‘information’ or ‘details.’ However, this is not always the case. On iPhone, for example, the information is only viewable on third-party apps such as ‘Koredoko,’27 a screen shot of which is shown in Figure 7. If you have privacy concerns, it may be good to use a similar app to check whether GPS data is embedded in your photos.
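When inspecting photo metadata it helps to know that EXIF stores each coordinate as degrees, minutes, and seconds together with a hemisphere letter (N, S, E, or W), whereas mapping tools generally expect signed decimal degrees. The following is a minimal sketch of that conversion; the function name is our own, and reading the values out of an image file (which requires a separate EXIF library or tool) is deliberately left out.

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float, ref: str) -> float:
    """Convert an EXIF-style coordinate to signed decimal degrees.

    `ref` is the hemisphere letter stored alongside the coordinate:
    'N' or 'S' for latitude, 'E' or 'W' for longitude.
    """
    if ref not in ("N", "S", "E", "W"):
        raise ValueError(f"unexpected hemisphere reference: {ref!r}")
    value = degrees + minutes / 60 + seconds / 3600
    # Southern latitudes and western longitudes are negative.
    return -value if ref in ("S", "W") else value


# Illustrative values only: roughly the latitude given for Lamjung Yolmo in Table 1.
latitude = dms_to_decimal(28, 14, 5.95, "N")
```

If a photo yields no GPS values at all, location services were probably switched off when it was taken, or the metadata has been stripped by the site or app the photo passed through.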

A dedicated GPS device can also be extremely useful. The benefits of a dedicated device are that they generally have longer battery lives, are more robust, and some can manipulate data points on the go. However, they are not as commonplace and as such may attract more attention in a fieldwork context than comparable use of a smartphone. They also may require additional software to transfer the data to your computer when mapping. For those who are doing a lot of walking or who want to plot boundaries or roads manually, however, a dedicated device may be the better investment. They can also be used to demonstrate to people the task of collecting data for mapping, therefore starting the community discussion of what should be mapped. GPS trackers may allow you to mark specific locations, but may also simply auto-generate a numeric value for each of the recorded GPS points, so ensure you test your GPS device, including extracting data from it, before taking it to the field. You may need to record your own metadata for the names of the enumerated points so that you can remember what they refer to when you return to your mapping data at a later date.

26 At the time of writing this paper, all current iPhone models are able to collect GPS data with each image, as are mid-to-high price point Android models. Small consumer camera models like the waterproof Nikon Coolpix and Canon PowerShot and professional (DSLR) models like some of the Sony Alpha and Canon EOS series also have this feature. Video cameras with GPS functionality include the Sony Handycam models and the ContourGPS camcorder. We only expect this feature will become more ubiquitous, but it is worth confirming that the model you want has it along with the other features necessary for your particular use case.

27 http://labs.kawabatafarm.jp/Koredoko/.

Figure 7. Screen capture of a photo with its GPS point, viewed in the Koredoko app

It should be noted that GPS data exists primarily as points or lines. Keeping this in mind can help the linguist decide what kind of GPS data (and how much of it, i.e. how frequently points should be taken) is necessary to help identify the points and boundaries for creating a map to display the kind of information that the linguist is interested in. For example, creating a map of villages where each village is represented as a single point will require only one GPS data point per village, while collecting data on routes between villages will require a lot more GPS data points to be taken in order to produce an accurate map.
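To see why routes need more points than villages, it can help to compare the straight-line distance between two endpoints with the length of a recorded track. The sketch below uses the haversine formula with a mean Earth radius of 6371 km; the function names and the three route points are hypothetical and serve only as an illustration.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_KM = 6371.0  # mean radius; adequate for rough field estimates


def haversine_km(lon1: float, lat1: float, lon2: float, lat2: float) -> float:
    """Great-circle distance in kilometres between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def track_length_km(track: List[Tuple[float, float]]) -> float:
    """Sum the distances between consecutive (longitude, latitude) points."""
    return sum(haversine_km(*a, *b) for a, b in zip(track, track[1:]))


# Hypothetical route between two villages with one intermediate point.
route = [(84.30, 28.20), (84.32, 28.25), (84.35, 28.22)]
straight_line = haversine_km(*route[0], *route[-1])
recorded = track_length_km(route)
```

The more a path winds, the larger the gap between the two figures becomes, and the more intermediate points are needed for the mapped line to follow the real route.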

It is also possible to take more of an armchair approach to aggregating GPS data. Services like Google Earth (discussed below) now offer high-quality satellite images of many places on the earth’s surface. Remote hillsides may not have the same detail as a major urban centre, but once basic GPS locations have been established it may be possible to mark out other villages, landmarks or features based on the satellite photography and familiarity with local landmarks rather than using a GPS device in situ. New technology also allows scans of (older or hand-drawn) maps to be aligned with existing coordinates,28 which could become a base map for future work.

28 See: http://www.davidrumsey.com/ (Georeferencer App) and the ‘georeferencer’ plugin for QGIS, i.e. http://www.qgistutorials.com/en/docs/georeferencing_basics.html. Also: http://spatial.scholarslab.org/creating-gis-datasets-from-historic-maps/. This tutorial describes working with ArcGIS, which is not free or open-source software.


3.2 Uploading, editing, cleaning GPS data The process of obtaining GPS data from a device will vary from product to product, but many will have support software, or online documentation for how to do this. Whatever the file type the GPS program works with, you should find a workflow where the final product is a KML (or sometimes a KMZ) file. KML is an acronym for ‘keyhole markup language,’ a specific arrangement of XML data that many of the mapping programs use to store GPS coordinates and other information. If you open a KML file in a text editor you will see XML tags being used to organize the data. A KMZ file is a zipped file that contains a KML and additional supporting files. This zipped file is often smaller than a KML because of the compression, and many programs that can read KML also support KMZ.
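As a rough illustration of what these files contain, the sketch below parses a small hand-written KML document with Python’s standard library, then zips it into a KMZ and reads it back. The folder name is made up, the two placemarks reuse values from Table 1, and the file name ‘doc.kml’ inside the archive is assumed here as the conventional name rather than a requirement; files exported from a real program will typically carry more tags than this. Note that KML lists each coordinate as longitude, latitude, and optional altitude, in that order.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

# A minimal, hand-written KML document used only for illustration.
KML_BYTES = b"""<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Folder>
      <name>Yolmo varieties</name>
      <Placemark>
        <name>Lamjung Yolmo</name>
        <Point><coordinates>84.316263,28.234986,0</coordinates></Point>
      </Placemark>
      <Placemark>
        <name>Kagate</name>
        <Point><coordinates>86.071987,27.349813,0</coordinates></Point>
      </Placemark>
    </Folder>
  </Document>
</kml>
"""


def read_points(kml_bytes: bytes):
    """Return (name, longitude, latitude) for every point placemark."""
    root = ET.fromstring(kml_bytes)
    points = []
    for placemark in root.iterfind(".//kml:Placemark", KML_NS):
        name = placemark.findtext("kml:name", default="", namespaces=KML_NS)
        coords = placemark.findtext("kml:Point/kml:coordinates", namespaces=KML_NS)
        if coords is None:
            continue  # lines and polygons are skipped in this sketch
        lon, lat = (float(v) for v in coords.strip().split(",")[:2])
        points.append((name, lon, lat))
    return points


def read_kmz(kmz_bytes: bytes) -> bytes:
    """Extract the first .kml file found inside a KMZ archive."""
    with zipfile.ZipFile(io.BytesIO(kmz_bytes)) as archive:
        kml_name = next(n for n in archive.namelist() if n.lower().endswith(".kml"))
        return archive.read(kml_name)


# Build a KMZ in memory, then read the same points back out of it.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("doc.kml", KML_BYTES)

kml_points = read_points(KML_BYTES)
kmz_points = read_points(read_kmz(buffer.getvalue()))
```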

When managing mapping data you should think in terms of layers. ‘Layers’ refer to demarcations that you intend to build into your map. These are collections of similar information that you will color or shade the same, in order to contrast with other layers. Layers are the primary way that data is organized and identified in most mapping software. Maps may have a single layer that identifies oceans, another that identifies land masses, and others that identify geographical features such as rivers and mountains, or those that identify political boundaries such as country, state, and local borders. Geographical layers may be shaded or colored to give the impression of elevation, while political boundaries are often lines of different width or type.

Thinking in terms of layers will help you identify and separate the kind of GPS information you want to represent on your map. As mentioned in §3.1, this will likely be a series of points to begin with. However, you may need several different sets of points—a set of village points for one language variety, a set of village points for another variety, a line or periodic points to identify potential trade routes, or a set of points to identify marketplaces or cultural/historical points of interest. With careful metadata assisted by photos that you take along the way, these GPS coordinates become a rich source of information to support future maps and mapping projects.

If you take points without photographs, remember to make note of the location of each point (for example the village name or the route name).

An alternative to using KML files is to create a list of longitude and latitude values for points you want to map and save them as a CSV (comma separated values) file in a program like Excel. Each layer of information you want to display should have its own CSV file. You may already have these values, or have taken them from Google Maps or another mapping platform. An example for locations in a CSV is given in Table 1, which lists the location of the four main Yolmo language areas in Nepal. The X column is for longitude and the Y column is for latitude. These points were used to generate the map in Figure 3 and are also in the downloadable CSV file ‘Yolmosites’ that accompanies this paper.

CSV tables can only be created for point data, not for line or polygon data. If you have a larger number of points it may be more time-consuming to generate a CSV table than working with GPS and a program like Google Earth.
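The one-file-per-layer arrangement can also be produced programmatically when points for several layers have been collected in a single list. The following sketch is one possible approach, not a required step: the layer names, the ‘Name’ column, and the approximate Kathmandu coordinates are our own illustrative assumptions, and the CSV text is built in memory rather than written to disk.

```python
import csv
import io
from collections import defaultdict

# Hypothetical mixed list of points; "layer" decides which file each row goes to.
records = [
    {"layer": "yolmo_sites", "Name": "Lamjung Yolmo", "X": 84.316263, "Y": 28.234986},
    {"layer": "yolmo_sites", "Name": "Kagate", "X": 86.071987, "Y": 27.349813},
    {"layer": "cities", "Name": "Kathmandu", "X": 85.324, "Y": 27.7172},  # approximate
]


def layers_to_csv(rows):
    """Return a dict mapping a file name to CSV text, one entry per layer."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["layer"]].append(row)
    output = {}
    for layer, layer_rows in grouped.items():
        buffer = io.StringIO()
        writer = csv.DictWriter(
            buffer,
            fieldnames=["Name", "X", "Y"],
            extrasaction="ignore",  # drop the "layer" key from the written rows
            lineterminator="\n",
        )
        writer.writeheader()
        writer.writerows(layer_rows)
        output[f"{layer}.csv"] = buffer.getvalue()
    return output


files = layers_to_csv(records)
```

Each value in `files` could then be saved under its key as a file name and uploaded as a separate layer.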


Table 1. Set of longitude and latitude values for locations where Yolmo varieties are spoken

Language                            X (longitude)   Y (latitude)
Lamjung Yolmo                       84.316263       28.234986
Melamchi and Helambu Valley Yolmo   85.589647       27.949222
Kagate                              86.071987       27.349813
Ilam Yolmo                          88.040085       26.933091
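Before uploading a table like Table 1, it can be worth checking the values mechanically. The sketch below reads the same four rows from CSV text, rejects coordinates outside the valid ranges, and reports the bounding box of the points; the function names are our own. A range check of this kind cannot detect swapped X and Y columns for a country such as Nepal, where both values are below 90, so comparing the bounding box against the expected region remains a useful manual step.

```python
import csv
import io

# The contents of Table 1 as CSV text (X = longitude, Y = latitude).
CSV_TEXT = """Language,X,Y
Lamjung Yolmo,84.316263,28.234986
Melamchi and Helambu Valley Yolmo,85.589647,27.949222
Kagate,86.071987,27.349813
Ilam Yolmo,88.040085,26.933091
"""


def load_points(text: str):
    """Parse CSV text into (language, longitude, latitude) tuples."""
    points = []
    for row in csv.DictReader(io.StringIO(text)):
        lon, lat = float(row["X"]), float(row["Y"])
        if not (-180 <= lon <= 180 and -90 <= lat <= 90):
            raise ValueError(f"coordinates out of range for {row['Language']!r}")
        points.append((row["Language"], lon, lat))
    return points


def bounding_box(points):
    """Return (min_lon, min_lat, max_lon, max_lat) for a list of points."""
    lons = [lon for _, lon, _ in points]
    lats = [lat for _, _, lat in points]
    return min(lons), min(lats), max(lons), max(lats)


points = load_points(CSV_TEXT)
box = bounding_box(points)
```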

One of the most useful tools for uploading, editing, and cleaning GPS data is Google Earth, a free cross-platform program.29 Google Earth will accept data from a wide range of formats, which can then be manipulated, and allows KML export. Many GPS devices will allow direct upload to Google Earth. Google Earth allows you to move data points, create data points and export these for use in other programs. Other programs can also be created as add-ons to Google Earth.30 It is possible to take a screen capture of Google Earth to use as a display map; however, it is not a clear or elegant way to represent geodata and therefore we only use it as a first step for editing and manipulating.

Figure 8 is a screenshot of Google Earth Pro, which is now freely available for consumers, with Lamjung Yolmo village data visible. Google Earth Pro allows for higher resolution screen shots, and the ability to create videos of movement through the mapped environment.

Figure 8. The Google Earth representation of Yolmo village GPS data in Figure 2

29 http://www.google.com/earth/.

30 Such as those found at: http://earth.tryse.net/programs.


The panel on the left includes lists of GPS data, which can be arranged thematically. You can observe that the Lamjung Yolmo village points are kept together in one folder (“Yolmo villages”), alongside other villages and roads traced either with a GPS unit or by looking at high-quality Google Earth satellite imagery. The Lamjung folder is next to a Ramechhap folder, which contains locations of the Kagate-speaking villages in another district.

Figure 9 shows the main features of the Google Earth toolbar for creating map data. The red rectangle highlights elements that can be added to the current view. From left to right: points, polygons, paths, and image overlays, which each create a new layer. The final image of the camcorder allows for the creation of ‘flyover’ tours of areas, which may be useful for some illustrations of linguistic areas.

Figure 9. Screen capture of a Google Earth toolbar

Google Earth allows you to move data points, create data points, and export these for use in other programs. It is possible to take a screen capture in Google Earth. Using the small ‘printer’ icon on the top bar, the program will allow you to create a title and a legend.31 The screen is then saved as a PDF. The Google Earth logo will be created on the map and cannot be removed.

Figure 10 is a screenshot of the Google Earth representation of the Yolmo villages in Figure 2 that could be exported for publication. You can see clearly where Google Earth has stitched together different GPS satellite images.

One advantage of displaying maps in Google Earth is the impressive representation of the three-dimensional environment. Figure 11 is the same map data, rotated to demonstrate more clearly the distribution of these five villages on the hillside, and the nature of Lamjung’s mountainous terrain.

If you have taken pictures with embedded GPS data, you can download or transfer your pictures to your computer, ensure that they contain GPS data, and use a program such as Geotag32 or Photo KML33 (both free), to convert your images into sets of GPS points that you can open in Google Earth or another map editor. Geotag is a Java-based program for getting GPS data (in EXIF format) from images and exporting the GPS information into several formats, including KML. To use it, simply download the Java program from the SourceForge site (link in footnotes), make sure you have the latest version of Java, and run it. To add data select ‘File > Add image…’ or ‘File > Add images from directory…’ (Figure 12). To export data, select the files that you wish to group together, right click, and select an export option such as ‘Google Earth > Export selected images’ (Figure 13). Then select a location on your computer to store the data and type a name for the KML file.

31 This process does not always work as desired, which is why the legends in Figures 4–6 were created manually.

32 http://www.geotag.sourceforge.net/

33 http://www.visualtravelguide.com/Photo-kml.html

Figure 10. The Google Earth representation of Yolmo village GPS data in Figure 2

Figure 11. The Google Earth representation of Yolmo village GPS data in 3D

Figure 12. Opening images in Geotag

Figure 13. Exporting image information from Geotag

There are a number of steps you can take to make it easier to create geospatial datasets from GPS photodata. The first is to organize your photos into folders. Create folders for each layer that you want to build into your map, and copy or move the photos that are associated with that layer into the proper folder. Then open each folder individually within Geotag and save each group of images in the Google Earth format. Here you will notice that the KML file links to the location on your hard drive where your image is stored, and that all the images in a layer are grouped together under the heading ‘Geotag.’ This is information that you can edit in Google Earth. You can change the names/titles/headings, delete the images, and even adjust the location of each of the points. To export from Google Earth, the easiest process is to right-click on the point (or folder of points) you wish to export, select ‘email,’ and email them to yourself (Figure 14). They will then be in an email as a downloadable KMZ attachment.

Figure 14. Exporting from Google Earth

Although there are some advantages to using Google Earth, it is not always the best way to create a map for illustration or display. As you can see, the pushpins are not very attractive,34 and while the satellite images of the terrain look fine on screen, it may be too much information for other mapping purposes or for black and white publication. Therefore we mostly use it as a first step for editing and manipulating.

3.3 Mapmaking using existing software In this section we identify some of the useful mapping software that is freely available. Some of these require an internet connection, while others can be downloaded for use offline. These programs also vary in terms of user friendliness, i.e., visual simplicity, variety of menu options, format of data, intuitive nature of interface, and access to underlying code. The programs that we review briefly here are (in order of complexity for the user from simplest to most technical): Google Earth, CartoDB,35 TileMill,36 and QGIS.37 This section is intended as a basic outline of the kinds of tools that exist, and the advantages and disadvantages of each tool from the perspective of linguists interested in mapping. It is not intended to provide a comprehensive review. More detailed walk-through tutorials are given for CartoDB in §3.4 and TileMill in §3.5. These are not the only mapping programs available; amongst many others, Story Maps38 allows for the creation of map-based infographics, and Mapbox39 is the fully online successor of TileMill.

34 The pushpin style can be edited by right-clicking, but there are a limited number of options.

35 http://cartodb.com.

36 http://www.mapbox.com/tilemill.

37 http://www.qgis.org.

Google Earth, already introduced in §3.2 above, is freely downloadable software for visualizing the earth as a globe and for zooming in to specific points or geographical features. It incorporates GIS data and allows users to add and share points and descriptions, and is particularly useful for identifying places you have been on the globe through geographical features. It has some of the most detailed satellite imagery available to general users. It is possible to import, draw, and export points, lines, and polygons in KML and KMZ formats. For those who have an internet connection but do not want to install Google Earth, Google also offers My Maps as part of Google Maps.40 This has fewer features than Google Earth but allows for the creation of points or lines in layers that can then be exported to KML format. Google Earth and My Maps also allow you to share your maps publicly within these programs.

TileMill and CartoDB are the two programs we have had the most experience with, both in our own research and in training other mapmakers. They build on the same fundamental skills as other mapping programs, but represent two ends of the scale in terms of ease of accessibility.

CartoDB is an online map-making tool, primarily for dynamic digital maps, although they do also have a static map export option. Because the maps are hosted on the CartoDB website they can be accessed anywhere and embedded into other websites. CartoDB requires users to register an account, with the free user accounts limited in their storage size. Maps made in the free accounts are currently limited to only four layers, which will constrain the number of features you can represent on a single map. All maps made with the free version are publicly viewable, so may not be appropriate for mapping sensitive data. Paid accounts offer more storage and private maps. CartoDB allows you to upload existing datasets, for example those created in Google Earth, but also to create or modify datasets within the program, which is not possible in map design programs like TileMill. The data created or modified in CartoDB can also be exported again. CartoDB offers design interfaces that offer less flexibility than fully featured software but allow for the quick creation of maps with no knowledge required of scripting languages. CartoDB have a version of their free accounts for researchers with an academic email address.41 This allows for live-update linking to a Google Docs form, which may be useful for data collection and update.

38 http://storymaps.arcgis.com/.

39 http://www.mapbox.com/.

40 http://www.google.com/mymaps.

41 http://cartodb.com/industries/non-profits/.


TileMill is specifically designed with the non-specialist cartographer in mind, and focuses on allowing users to create visually pleasing maps. Unlike CartoDB, it is not possible to edit datapoints within the program; they must be edited in another program such as Google Earth. Built mainly to create maps used in web browsers, a coding window allows users to fine-tune adjustments to layers using a variety of CSS called CartoCSS. While it is not necessary to fully learn this code, users may find that learning the basics will make their maps more visually interesting and understandable. Tutorials are available online,42 and it is possible to follow increasingly difficult tutorials working with provided data to master the features. The data that you use to make maps is stored locally on your own computer, which means TileMill requires you to set up your own hosting if you want the maps to appear online.

With a focus on creating interactive web-based maps, TileMill is not always ideal for users who are primarily interested in creating print-ready maps, as exporting may require some finessing. As another caveat, TileMill is no longer in active development, with the developers having moved to development of Mapbox Studio.43 TileMill is currently easier to use than Mapbox Studio, offers better static map export, and we believe it is still preferable for the kind of language mapping work linguists are likely to undertake. One limitation is that TileMill does not work on Mac OS X 10.10 (Yosemite) and above.44

QGIS is a full-featured professional graphical GIS editor.⁴⁵ This program is freely available and runs on all major operating systems. It imports from nearly any mapping format and allows the user to edit and create GIS layers of points, lines, and polygons in a visual layout similar to Photoshop. However, the level of detail and array of menus and options outstrip most basic mapmaking needs. This is an excellent tool but is much more powerful than the typical linguistic mapmaker requires. We do not include QGIS in our workflows, as we find that the less powerful programs are still more than sufficient for the needs of most language mapmakers. If you are interested in building more complex GIS projects, learning QGIS may be worth the time, but starting with the programs we discuss will offer good basic training in mapping nomenclature and design. QGIS has additional features beyond the scope of tools such as TileMill and CartoDB. One particularly useful feature is the ability to align existing printed maps with geographical coordinates in a process known as ‘georectification’.⁴⁶
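To give a rough sense of what georectification involves, the sketch below fits a simple affine transform from pixel positions on a scanned map to geographical coordinates, using three control points. This is only an illustration of the general idea under stated assumptions: it is not how QGIS implements the feature, the function names `fit_affine` and `pixel_to_geo` are hypothetical, and the control points are invented values rather than measurements from a real map.

```python
def fit_affine(pixel_pts, geo_pts):
    """Fit lon = a*x + b*y + c and lat = d*x + e*y + f
    from exactly three (x, y) -> (lon, lat) control points."""
    (x1, y1), (x2, y2), (x3, y3) = pixel_pts
    # Determinant of the matrix [[x1, y1, 1], [x2, y2, 1], [x3, y3, 1]].
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if abs(det) < 1e-12:
        raise ValueError("control points are collinear")

    def solve(v1, v2, v3):
        # Cramer's rule for one output coordinate (longitude or latitude).
        a = (v1 * (y2 - y3) - y1 * (v2 - v3) + (v2 * y3 - v3 * y2)) / det
        b = (x1 * (v2 - v3) - v1 * (x2 - x3) + (x2 * v3 - x3 * v2)) / det
        c = (x1 * (y2 * v3 - y3 * v2) - y1 * (x2 * v3 - x3 * v2)
             + v1 * (x2 * y3 - x3 * y2)) / det
        return a, b, c

    lon_coeffs = solve(*[lon for lon, _ in geo_pts])
    lat_coeffs = solve(*[lat for _, lat in geo_pts])
    return lon_coeffs, lat_coeffs


def pixel_to_geo(coeffs, x, y):
    """Apply the fitted transform to one pixel position."""
    (a, b, c), (d, e, f) = coeffs
    return a * x + b * y + c, d * x + e * y + f


# Hypothetical control points: three pixel positions on a scanned map
# and the (longitude, latitude) each is known to correspond to.
pixels = [(0, 0), (100, 0), (0, 100)]
coords = [(85.0, 28.0), (86.0, 28.0), (85.0, 27.0)]
transform = fit_affine(pixels, coords)
print(pixel_to_geo(transform, 50, 50))
```

In practice more than three control points are usually collected so that errors in any one of them can be averaged out, which requires a least-squares fit rather than the exact solution shown here.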

There are many different ways to create maps, but as Kretzschmar (2013a:67–68) notes, “the worst kind of linguistic map is the one that the analyst cannot make, and so cannot look at or show.” In this section we describe two approaches that we have found to be both useful and accessible. The first is a ‘simple’ workflow using CartoDB that allows the user to create a basic map easily and also enables further expansion and collaboration, but which requires an internet connection. The second is a more complex workflow that makes use of Google Earth and TileMill.

⁴²http://www.mapbox.com/tilemill/docs/manual/.

⁴³http://www.mapbox.com.

⁴⁴People have had success running TileMill on Mac through Wine (http://www.winehq.org/) or Docker (http://hub.docker.com/); however, as both of these require familiarity with Terminal commands we cannot readily recommend these options to all readers. TileMill servers can be created, where the user logs into the program through a web browser, which avoids the limitation for Mac users. It may be possible to talk to your University IT department about setting up such a service.

⁴⁵http://www.qgis.org.

⁴⁶A tutorial on how to do this can be seen here: http://www.youtube.com/watch?v=A-jBYc9pLiQ.

3.4 Basic workflow: CartoDB CartoDB allows users to either import existing data in a range of formats, including KML and CSV, or to create and edit datasets within the program. In this section we will walk through creating a basic map of Yolmo language locations using the CSV sheet given in Table 1. To address Girnth’s (2010) four questions from §2.4: the aim of this map is to represent the location of the four known Yolmo varieties (q. 1). These will be mapped, as will the geopolitical border of Nepal and the relevant major cities (q. 2). The border will be mapped using line data, while the language varieties and cities will be mapped using points (q. 3). The mental picture that this map aims to create is an understanding of the location of these varieties in relation to each other, to the country of Nepal, and to its major cities (q. 4).

The dataset can be downloaded alongside this paper, and is called ‘Yolmosites.’⁴⁷ Locations for Kathmandu and Pokhara are also mapped, as they are key locations in Nepal; these are available for download as ‘Cities’ from the same location.
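Before uploading a CSV of point locations, it can be worth checking that every row has usable coordinates. The following is a minimal sketch of such a check and is not part of the workflow described here. The column names `name`, `latitude` and `longitude` are assumptions (the downloadable files may use different headers), and the two rows are approximate city coordinates supplied for illustration, not the contents of the ‘Cities’ file.

```python
import csv
import io

# Illustrative data only: approximate coordinates for two cities,
# with assumed column names (check the headers of the real CSV files).
SAMPLE_CSV = """name,latitude,longitude
Kathmandu,27.7172,85.3240
Pokhara,28.2096,83.9856
"""


def check_sites(csv_text):
    """Return (valid_rows, problems) for a CSV of point locations."""
    valid, problems = [], []
    reader = csv.DictReader(io.StringIO(csv_text))
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            lat = float(row["latitude"])
            lon = float(row["longitude"])
        except (KeyError, TypeError, ValueError):
            problems.append((line_no, "missing or non-numeric coordinate"))
            continue
        # Latitude runs from -90 to 90 and longitude from -180 to 180.
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            problems.append((line_no, "coordinate out of range"))
            continue
        valid.append({"name": row.get("name", ""),
                      "latitude": lat, "longitude": lon})
    return valid, problems


valid_rows, problems = check_sites(SAMPLE_CSV)
print(len(valid_rows), "valid rows;", len(problems), "problems")
```

A common error this kind of check can catch is a latitude and longitude pair entered in the wrong order, although for locations in Nepal both orders happen to fall within the valid ranges, so a visual check on the map is still advisable.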

You will need to create a user account before you begin, and all of your data will be stored online, so if you create or modify data in CartoDB, remember to download the data occasionally for backup and archiving. CartoDB allows you to create datasets, with each representing one layer of information on a map, and then to combine multiple datasets into maps. Figure 15 is a screenshot of the opening page when you log into your CartoDB account (the maps are from Gawne’s account).

Figure 15. CartoDB list of maps in Gawne’s account

Clicking on ‘maps’ at the top left will allow users to toggle over to the page where their map data is stored. Datasets provide the basis of creating different maps—multiple maps can draw on the same datasets, allowing you to create different visualizations of the same data very quickly. It is possible to view each dataset as a separate map as well as building them into combined maps.

⁴⁷The datasets are available for download from https://scholarspace.manoa.hawaii.edu/bitstream/10125/24692/2/Yolmosites.csv and https://scholarspace.manoa.hawaii.edu/bitstream/10125/24692/3/Cities.csv.


When you create a new map (green button at the top-right), CartoDB gives you the option of creating a blank map, or using one or more of your datasets. Datasets can be imported in multiple formats (including KML and CSV) or you can browse the datasets available on the website, although their datasets may not match your needs.
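Since both KML and CSV can be imported, it may occasionally be useful to convert point data saved from Google Earth into a plain CSV, for example to inspect or archive it in a spreadsheet. The sketch below shows one way this could be done with the Python standard library. The function name `kml_points_to_csv` and the output column names are assumptions, and the KML snippet is hand-written for illustration rather than taken from the paper's data. Note that KML stores coordinates in longitude, latitude order.

```python
import csv
import io
import xml.etree.ElementTree as ET

KML_URI = "http://www.opengis.net/kml/2.2"
KML_NS = {"kml": KML_URI}

# A tiny hand-written KML document for illustration only.
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Kathmandu</name>
      <Point><coordinates>85.3240,27.7172,0</coordinates></Point>
    </Placemark>
  </Document>
</kml>
"""


def kml_points_to_csv(kml_text):
    """Convert KML point placemarks to CSV text (name, latitude, longitude)."""
    root = ET.fromstring(kml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["name", "latitude", "longitude"])
    for placemark in root.iter("{%s}Placemark" % KML_URI):
        name = placemark.findtext("kml:name", default="", namespaces=KML_NS)
        coords = placemark.findtext(".//kml:Point/kml:coordinates",
                                    namespaces=KML_NS)
        if coords is None:
            continue  # skip placemarks that are lines or polygons
        # KML coordinates are "longitude,latitude[,altitude]".
        lon, lat = coords.strip().split(",")[:2]
        writer.writerow([name.strip(), float(lat), float(lon)])
    return out.getvalue()


print(kml_points_to_csv(SAMPLE_KML))
```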

Gawne had already uploaded the Yolmo sites dataset to her CartoDB account, so she selected the ‘yolmosites’ dataset. She also selected a dataset called ‘cities,’ which is a CSV with the locations of Kathmandu and Pokhara, as these provide useful references for people familiar with Nepal. This dataset is available in the CartoDB database.

When datasets are selected and the map is created, CartoDB applies a default styling to the points on the newly created map. Figure 16 is a screenshot taken once the map has been created.⁴⁸

Figure 16. New map with Yolmo sites and cities datasets selected

Moving around the screen from the top in a clockwise direction, we see that ‘Map view’ is currently highlighted. It is possible to also select ‘Data view’ which will show a table of values for whichever layer you currently have selected. Each dataset layer is represented in its own tab on the right of the map visualization. In this screenshot ‘yolmosites’ is currently open, and ‘cities’ is stacked at the bottom.

The toggle at the top of the tab can hide that layer from the visualization. Currently displayed in the ‘yolmosites’ tab is the layer wizard, which is used for basic styling of the dataset. Other useful features along the left side of the tab include a CSS window for more detailed styling of data using CartoCSS code, which we discuss further in the workflow below (§3.5). It is also possible to enter SQL code or filter the data providing basic analytical tools, as well as creating legends and info windows for digital map display. The options tab allows you to add features for digital display of maps, including a share function, and locking the zoom if you only want people

⁴⁸The completed map can be viewed at http://lgawne.cartodb.com/viz/c5d973ee-9342-11e5-bd91-0e5db1731f59/public_map.
