The Order of the Query: An investigation into the epistemic affordances of databases within digital methods

Academic year: 2021

The Order of the Query

An investigation into the epistemic affordances of databases within digital methods

Cristel Kolopaking

10002223 | cristelkolopaking@gmail.com

Supervisor: dr. Bernhard Rieder

Second Reader: dr. Thomas Poell

RMA New Media & Digital Culture

Graduate School of Humanities | University of Amsterdam

Submitted to the Department of Media Studies at the University of Amsterdam, Faculty of Humanities, in partial fulfilment of the requirements for the degree of

Master of Arts (M.A.)

11th


Table of Contents

Table of Contents 2

List of Figures 3

Acknowledgements 4

Abstract 5

Introduction 6

The Scientific Unconscious 6

Perception 8

Affordances & Purposes 11

Untangling Digital Methods 13

Chapter I: The Evolution of Order 17

1.1 Representation and Classification 17

1.2 List, table and tree to structure ontology 19

1.3 Static versus flexible classification 22

1.4 Moving from one to several tables: the relational database 25

Chapter II: Untangling Digital Methods 30

2.1 ePistolarium 31

2.1.1 Tool Analysis 31

2.1.2 User Experience 34

2.2 Digital Methods Initiative’s Twitter Capturing and Analysis Toolset 37

2.2.1 Tool Analysis 37

2.2.2 User Experience 41

2.3 Reflection 45

2.3.1 Methodological Considerations 45

2.3.2 Methodological Interaction 47

Conclusion 50

Bibliography 53

List of Appendices

Appendix 1: Interview Charles van den Heuvel 57

Appendix 2: Interview Erik Borra 68

Appendix 3: Interview Thomas Poell 72

Appendix 4: Interview Oscar Coromina 79


List of Figures

Figure 1: p. 21
Figure 2: p. 22
Figure 3: p. 32
Figure 4: p. 34
Figure 5: p. 40
Figure 6: p. 45


Acknowledgements

The sole act of reading and writing is not enough to complete a project as ambitious as an M.A. thesis; one should therefore acknowledge the importance of those sparking moments that connect the dots through inspiring conversations. I would like to reserve this space for a personal note of gratitude towards the people who contributed to this research. First of all, I am deeply grateful to my supervisor Bernhard Rieder, who has transferred to me not only his expertise on this shared research interest, but also his confidence in my ability to undertake such an inquiry. Our numerous conversations have always rejuvenated and intellectually stimulated me. Next to that, I would like to thank Thomas Poell, who has taken the time to act as second reader and for whom I had the pleasure of working as an assistant, thereby being introduced to research activities beyond the R.M.A. program. A warm thanks to all those who have taken the time to share a conversation with me on this subject through interviews: prof. Charles van den Heuvel, Erik Borra, dr. Thomas Poell, dr. Oscar Coromina and finally prof. Richard Rogers, whose interview is not quoted directly but acted as a guide for the research directions. I would further like to thank all the members of the New Media staff who have made this Research Master program an intellectually inspiring experience. Finally, I would like to thank Guus Rietbergen for helping me structure my thoughts.


Abstract

This thesis develops an approach towards untangling digital research methods in order to gain a deeper understanding of the way the database, as underlying information structure, facilitates the ordering and analysis of data. Since researchers working with digital research methods are often shielded from the database and its management system by a web-based interface, the database and its ordering practices tend to be part of the ‘scientific unconscious’: contributing to scientific progress but not discussed accordingly within the scientific community. Two digital research methods will be untangled by analyzing the ePistolarium tool, which works with scholarly letters from the 17th century, and the Digital Methods Initiative Twitter Capturing and Analysis Toolset (DMI-TCAT), which captures and allows analysis of Twitter microblogs. The two examples give insight into the way databases work with digitized and digital native data and how the structuring of these data is crucial for the analytical possibilities they afford. The thesis is unique in providing a number of interviews with researchers working with both tools as designers or users, thereby giving insight into purposes and perception in relation to the database’s affordances. This empirical account is strengthened by a theoretical framework on the concept of order and its development through different information structures. Altogether, it becomes possible to reveal the essence of databases as part of digital methods and to understand how their affordances contribute to knowledge formation within Humanistic disciplines working with computational technology.

Keywords

Databases Ÿ order Ÿ epistemology Ÿ digital research methods Ÿ methodological transparency Ÿ philosophy of technology Ÿ software studies


Introduction

The Scientific Unconscious

What starts as a personal interest forms the fundamental root of an intellectual inquiry. One of the most profound and never-ending questions that attracts me concerns what knowledge is and how it is composed. A key moment in this exploration came when I discovered that knowledge, or episteme as the Greeks called it, can itself be considered a historically contingent concept, prone to different culture- and time-dependent interpretations (Foucault 2002). My interest was further sparked when Michel Foucault related this contingent notion of episteme to the way the world is perceived in terms of representations and how these are relatively ordered. The following quote is telling for the arbitrariness that surrounds ordering processes:

‘On what ‘table’, according to what grid of identities, similitudes, analogies, have we become accustomed to sort out so many different and similar things? What is this coherence – which, as is immediately apparent, is neither determined by an a priori and necessary concatenation, nor imposed on us by immediately perceptible contents? For it is not a question of linking consequences, but of grouping and isolating, of analysing, of matching and pigeon-holing concrete contents; there is nothing more tentative, nothing more empirical (superficially, at least) than the process of establishing an order among things’ (2002: p. xxi)

This passage raises some profound questions with regard to the origin as well as the procedure of ordering things. First, it is unclear whether order is situated within the characteristics of external objects or whether it is inherent to the perceiving subjects, who could impose an order among things. Second, and this point builds on the first, since there is ambiguity surrounding the origin of order, its empirical procedure is likewise contested, and it could be argued that order can become justified only in relation to validated techniques, i.e. a method. In that sense, the goal of order can be interpreted as a way to bridge the gap between the perceiving subject and the external world of objects and phenomena, so that intelligible claims can be made about it. The first point raised here is of course a fundamental question, as well as an open-ended one to which there is no singular answer. An answer might best be found in the interaction between the subject and object, so that it becomes visible how order comes into being. This directs my approach to the middle part that sits between any question and answer, that is, the methodology. Since the second point also addresses the methodological sphere as a validation of ordering, this strengthens the choice for this direction as a way of studying order in terms of its creation and manifestation.

The phenomenon of order tends to remain quite invisible, whether in relation to its situatedness in subjects or objects, or with regard to its role within knowledge formation. It forms part of the ‘scientific unconscious’, which can be conceived as the invisible processes that influenced the visible marks of progress or discovery within the history of science (Foucault 2002: p. xi)1. However, for Foucault the process of ordering is inherent to the formation of knowledge at large, since ‘the sciences always carry within themselves the project, however remote it may be, of an exhaustive ordering of the world; they are always directed, too, towards the discovery of simple elements and their progressive combination; and at their centre they form a table on which knowledge is displayed in a system contemporary with itself’ (2002: p. 82). Thus, in the inquiry to gain more understanding of the way order comes about in terms of its origins and procedures as well as its relation to knowledge formation, one is necessitated to investigate a scientific field, its accompanying projects and the combination of a grid and system that establish this whole.

The field that I am interested in is the Humanities, or more specifically the part of it that deals with computational techniques for analytical purposes. With the introduction of computational techniques and their digital format, the process of ordering transformed in terms of information representation and manipulation. This transformation is important to study, since it changes our interaction with, and the formation of, knowledge on a methodological level. Therefore, I want to study the way knowledge is composed through digital research methods within the fields of history and new media studies, the first dealing with traditional and preserved content and the second with the most recent and ‘digital only’ content. More specifically, I will investigate the underlying ordering processes within these digital methods as manifested in databases, which I consider to be the current version of ‘the table on which knowledge is displayed in a system contemporary with itself that allows studying the most simple elements and their progressive combination’.

For many researchers who work with digital methods, the underlying management system of the database remains concealed behind a web-based interface. I aim to offer a model of investigation that takes into account the interaction between the affordances of those databases as well as the perception and purposes of the researchers using them. Ultimately, the goal of this thesis is to reveal the ‘scientific unconscious’ of order as embedded within databases and to demonstrate how methodological interaction with databases potentially contributes to the production of knowledge within the Humanities.

1 ‘The history of science traces the progress of discovery, the formulation of problems, and the clash of controversy; it also analyses theories in their internal economy; in short, it describes the processes and products of the scientific consciousness’, whereas [the scientific unconscious] tries to restore what eluded that consciousness: the influences that affected it, the implicit philosophies that were subjacent

Perception

As mentioned above, the fundament on which this thesis builds is a model to investigate the methodological interaction between subject and object, or, in this case, researcher and system of order. Since the system of order will be the database within digital research methods, it is interesting to first delineate the relation between technology and knowledge from a historical perspective. Along the way the key elements of the model of investigation will be introduced, starting with the subject’s perception, as a way to grasp the affordances that an object has to offer, which is ultimately guided by the purposes of the subject. Opening with perception, it will become clear how it can be interpreted as one’s attitude towards external objects or (technological) developments, but can at the same time be understood for its primary role within any scientific inquiry, being the key sense that allows humans to grasp their position in relation to the external world. The initial foundation for methodology as recorded by Aristotle consisted of observation and reasoning: techniques inherent to the capacities of mankind. This method determined the classical Greek perception of knowledge as relying on such capacities, defining ‘episteme, as a representation of science or theoretical (pure) knowledge’ (Boon 2011: p. 52).

However, a key moment appears when this order of things is disrupted and the capacities of mankind are no longer deemed sufficient to meet the goals of scientific investigation. The first realization of concrete techniques for scientific purposes occurred during the Scientific Revolution of the 17th century, when Galileo and Bacon combined abstract reasoning with the modeling, building and use of instruments for the purpose of experiments, related to the tradition of technical know-how of artisans within workshops (Machamer 2006: p. 69). Originally, technology was deemed hierarchically inferior to science, as Aristotle considered ‘techne, as art or craft, or (experience-based) practical knowledge’ (Boon 2011: p. 52). However, the rise of the empirically driven scientific method led to a new understanding of science based around the Modern ‘mechanical worldview’, where technology provides intelligibility through intervention with nature, differing from the Aristotelian idea that nature and its ‘secret order’ will reveal itself.

From the Scientific Revolution onwards, technology became crucial to support theoretical theses, making knowledge dependent on technically infused methods. This was supported by the idea that human incompleteness is naturally ‘leading him to develop technological tools and methods to overcome this situation’ (Galimberti as cited in Russo 2012: p. 70). Philosopher of technology Steven Dorrestijn affirms that ‘in “early philosophy of technology”, the first phase of philosophical thinking about technology, from the Enlightenment until well into the twentieth century, the dominant conception of technology was, in general, very positive, sometimes “utopian”. Scientific reason and technical progress would bring humanity to a next stage, progressively overcoming the precarious state of human existence, thus moving towards perfection and completion’ (2015: p. 3).

However, this technological perception is also historically variable and relative, especially in its relation to knowledge (Mayr 1976: p. 664). Therefore, when technological inventions turned out badly for humanity, as with the atomic bomb and capitalist industries, a more pessimistic account emerged during the mid twentieth century, urging the limitation of technological development to preserve human control2 (Dorrestijn 2015: p. 5). Within the scientific sphere, this spirit reinforced the classical Greek science and technology dichotomy: ‘science aims at enlarging our knowledge through devising better and better theories; technology aims at creating new artifacts through devising means of increasing effectiveness’ (Skolimowski 1966: p. 376). This meant that methods could be technologically driven, but they should at all points be regarded as instrumental and not as taking control over humans or their interpretative responsibilities.

Although this critical attitude towards technology is still present, there has also been an argument from the 1970s onwards for a hybrid view of science and technology3. It focuses on the different types of impact of technology through an empirical approach, and is therefore referred to as the ‘empirical turn’ or the third phase of philosophy of technology (Dorrestijn 2015: p. 6-7). In response to the science-technology dichotomy, the notion of techno-science attempts to acknowledge the actual overlap in intention and ‘a reversal of the values attached to science’ and technology (Forman as cited in Bensaude-Vincent 2009: p. 1). The way philosopher of technology Gilbert Hottois introduced the term ‘techno-science’ honors the increased need for concrete techniques, i.e. technological artifacts, within modern scientific research and also conceives it as a driving force for new research fields (Bensaude-Vincent 2009: p. 2).

One such field that has taken these ‘technological tools’ in hand to (re)interpret historical or present phenomena can be found in the Digital Humanities, which can be broadly understood as the collective of fields within the Humanities that make use of computational technology for research purposes. However, its exact meaning is continually reformulated, which can be explained by the parallel technological development that constantly broadens the horizon of research possibilities for the different fields and practitioners. The name ‘Digital Humanities’ itself is of quite recent origin, resulting from a methodological expansion and the attraction of a wider audience when the term was adopted with the publication of its Blackwell Companion in 2004 (Fitzpatrick 2012: p. 13). Before Digital Humanities, one would speak of humanities computing, which marks the initial turn towards the calculative capacities of the computer for textual or literary purposes.

2 In relation to the ‘pessimistic’ or second phase of philosophy of technology, Dorrestijn refers to philosophers such as ‘Ellul (1964) [who] argued that modern technology had become ‘autonomous’ at the expense of the autonomy of humans. Heidegger (1977) believed that the technical way of thinking had come to determine how humans relate to the world: they see the world as a stock of resources for humans to use and manipulate’ (2015: p. 5)

3 With regards to the third phase of philosophy of technology, Dorrestijn notes that ‘Haraway (1985) uses the image of the ‘cyborg’, while Latour (1993) argues that humans and things do not exist without each other, but are always ‘hybrids’ (2015: p. 6).

Johanna Drucker, a key figure in the debate on the development of the Digital Humanities, reflects: ‘as the first phase of digital humanities reveals, the exigencies of computational method reflect its origins in the automation of calculation. Counting, sorting, searching and finding non-ambiguous instances of discrete and identifiable strings of information coded in digital form are the capacities on which digital technology performed the tasks of corpus linguistics to create Father Busa’s concordance’ (2012: p. 87). The Roberto Busa referred to here was the first to explore the automation of linguistic analysis in the 1940s; he created a concordance of St. Thomas Aquinas’ work called the ‘Index Thomisticus’, a lemmatized list of 11 million words that allows searching for their immediate context and meaning (Schreibman et al. 2004). Despite the fact that the result ‘offers a powerful instrument for extending the capacities of humanistic scholars’, Drucker argues that it is based on a-humanistic principles, since ‘the creation of the concordance does not depend on interpretation even if the resulting work supports it’ (2012: p. 87). This is exemplary for the distinction between ‘those who suggest that digital humanities should always be about making (whether making archives, tools, or new digital methods) and those who argue that it must expand to include interpreting’ (Fitzpatrick 2012: p. 14). A distinction that ultimately goes back to the classical science and technology dichotomy, due to the fear of a decreased value for human capacities.
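The concordance operation Drucker describes, searching for non-ambiguous string instances and returning their immediate context, can be sketched in a few lines of code. The following is an illustrative toy example, not a reconstruction of Busa’s actual system; the sample sentence, function name and window size are invented for demonstration:

```python
def concordance(text, keyword, window=3):
    """Return each occurrence of `keyword` together with up to
    `window` words of context on either side (a simple keyword-in-context view)."""
    words = text.lower().split()
    hits = []
    for i, word in enumerate(words):
        if word == keyword:
            left = " ".join(words[max(0, i - window):i])
            right = " ".join(words[i + 1:i + 1 + window])
            hits.append((left, word, right))
    return hits

# A toy corpus line; each match is printed in its immediate context.
sample = "the order of things is not the order of the query"
for left, kw, right in concordance(sample, "order"):
    print(f"{left} [{kw}] {right}")
```

Run on the sample line, this prints each occurrence of ‘order’ bracketed by its neighbouring words, which is the basic gesture behind a lemmatized concordance, minus the lemmatization itself.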

The debate on the intentions and practices of the Digital Humanities can thus be traced to the different attitudes that shape the ‘technological perceptions’ we saw earlier within the philosophy of technology, varying from optimistic and pessimistic to empirically driven. These technological perceptions are decisive for the ‘interpretation of the instrument’. Take for example the case of literary scholar Margaret Masterman, who was, like Busa, eager to explore the potential functionalities the computer could have for provoking new insights in her field. In the 1960s she asked whether the computer could be conceived as a ‘telescope for the mind’ instead of a purely menial tool for calculations. Her reading encouraged contemplating ‘the whole range of what its possessors could see and do [so] that, in the end, it was a factor in changing their whole picture of the world’ (Masterman as quoted in McCarty 2012: p. 113). Literary scholar Louis Milic aligned with Masterman’s exploratory approach and decried the lack of imagination on the side of the researcher, stating in 1966: ‘we do not yet understand the true nature of the computer. And we have not yet begun to think in ways appropriate to the nature of this machine’ (Milic as quoted in McCarty 2012: p. 115).

The examples of Busa and Masterman demonstrate the influence that one’s attitude or perception can have on the exploration of technology for new or different purposes. Both used computational techniques for purposes other than those they were originally intended for, adapting them for linguistic ends such as quantified content analysis and its qualitative interpretation, whereas before, the computer was conceived as a calculative tool for quantitative purposes and numerical content only. It is interesting to delve a little deeper into the way the ‘true nature of the machine’ can be grasped and extended for varying and previously unimagined purposes.

Affordances & Purposes

In ‘The Question Concerning Technology’ (1977), Martin Heidegger provokes the technological perception of mankind by arguing that technology should be conceived instrumentally and as inferior to its makers, since technology threatens to take over otherwise. Heidegger’s account is rather pessimistic, belonging to the ‘second phase of philosophy of technology’ (Dorrestijn 2015), and the sort of technology described is mechanical, based on the model of the engine. However, the core idea can be translated to digital technology as well, and the ‘technological perception’ could also be reversed towards a positive one. Strength lies in Heidegger’s fundamental argument that humans are called upon to interact with technology, which he calls enframing, so that technology will reveal its essence. If we take the idea of un-concealment with regard to ‘essence’ in combination with an optimistic technological perception, then one could grasp the core capacities of digital technology that potentially serve scientific purposes.

Heidegger’s ‘essence’ can also be expressed through the term ‘affordances’, which originates from psychologist James Gibson’s ‘Theory of Affordances’ (1977). He created this noun in relation to the verb ‘to afford’, allowing him to describe objects as well as environments without needing to classify or label them in terms of qualities such as surface, colour or form (1977: p. 134). Affordances can in that sense be broadly understood as the properties that an object or environment offers. Information expert John Unsworth states that ‘it is important to distinguish a tool from the various uses that can be made of it, if for no other reason than to evaluate the effectiveness of the tool for different purposes. A hammer is a very good nail-driver, not such a good screw-driver, a fairly effective weapon and a lousy musical instrument’ (2002: p. 1). As Unsworth mentions, affordances stand in direct relation to purposes, which means that affordances can be perceived differently based on the different purposes of different observers. However, the affordances are there to be unconcealed and remain independent from the needs of the observer.4 For Gibson, this means that an affordance possesses a kind of materiality, but is simultaneously dependent on a degree of subjectivity, namely the observer, in order to reveal its inner workings.5 This encapsulates an invitational relation between the object and the observer, which we can also find in Heidegger’s conception of the interaction between humans and technology.

The concept of affordances is fruitful to work with in order to reveal and determine the contribution that a specific instrument could make to knowledge formation. Media studies scholar Esther Weltevrede transfers the concept of affordances to the digital realm, thereby expanding its application beyond the concrete objects of physical environments, and she pays special attention to digital media’s research affordances that contribute to analytical purposes (2016: p. 13). In her account she honours the diversity of interpretation of affordances by different actors as well as the inherent assumptions and designed use of digital media (2016: p. 12). It must indeed be stressed that the affordances of a computer are extremely broad and that their ‘revealing’ is logically constrained due to the different parties and stakeholders that together form the assemblage of hardware, software and interface levels. Unsworth likewise stresses that, ‘because the computer is – much more than the hammer – a general purpose machine (in fact, a general purpose modelling machine) it tends to blur distinctions among the different activities it enables (…) but they are not all equally valuable, they don’t all work on the same assumptions – they’re not, in fact, interchangeable’ (2002: p. 1). Thus, the computational affordances found on the interface level, features such as like-buttons, tags or hyperlinks, serve other purposes than, for example, the affordances on the software level, such as databases or algorithms.

While affordances can be understood as inherent to the ‘object’, the computer in this case, purposes are located within the subject, that is, the researcher. The way affordances can be revealed depends on the one hand on the perception or attitude of the researcher and on the other hand on the purposes, or objectives, that the researcher might have. Since the affordances might be geared towards different purposes than those of the researcher, they can be repurposed methodologically. We saw this already in the case of Busa and Masterman, who repurposed the calculative affordances for linguistic analysis. Weltevrede reflects on the way current digital research methods repurpose computational affordances, stating: ‘by building scripts and scrapers on top of the digital media, researchers negotiate with the assumptions and purposes inscribed into these media and appropriate and recontextualize them in their research design’ (2016: p. 16). This is in line with the vision of web epistemologist Richard Rogers, who initiated the development of digital methods to do Internet-related research that takes into account both online culture and society at large as apparent through digital media. For him, the concept of repurposing is linked to the kind of research that is sensitive to the affordances of the medium, instead of importing existing methods from the social sciences or humanities (2013: p. 19).

4 ‘The affordance of something does not change as the need of the observer changes. The observer may or may not perceive or attend to the affordance, according to his needs, but the affordance, being invariant, is always there to be perceived. An affordance is not bestowed upon an object by a need of an observer and his act of perceiving it. The object offers what it does, because it is what it is.’ (Gibson 1979: p. 138)

5 Gibson aims to overcome the mind-matter duality and therefore argues for the idea that affordances can be both objective and subjective: ‘An important fact about the affordances of the environment is that they are in a sense objective, real, and physical, unlike values and meanings, which are often supposed to be subjective, phenomenal and mental. But actually, an affordance is neither an objective property nor a subjective property; or it is both if you like. An affordance cuts across the dichotomy of subjective-objective and helps us to understand its inadequacy. It is equally a fact of the environment and a fact of behaviour. It is both physical and psychical, yet neither. An affordance points both ways,

Thus, the concept of repurposing supports the exploration of the affordances of computational technology, and together with the notions of perception and purposes it can help flesh out methodological interaction. Since databases are embedded within digital research methods, I will first probe which types of digital methods there are and how they can be investigated for this specific inquiry.

Untangling Digital Methods

As may have become clear by now, there has been an increased usage of the affordances of computational technology within the Humanities, for different purposes depending on the field of inquiry. The Digital Research Tools (DiRT) directory offers an overview of the variety of purposes and accompanying tools one could work with for different data types, stating: ‘DiRT makes it easy for digital humanists and others conducting digital research to find and compare resources ranging from content management systems to music OCR, statistical analysis packages to mind-mapping software’6. Next to these publicly available tools, it is often the case that tools are built as part of a university program, whereby the tools form the fundament of a new digital research method or a specific project. Examples can be found within ‘Digital Humanities’ research centres, such as the one at Stanford, where projects like ‘Text Technologies’ and ‘Mapping the Republic of Letters’ show how historical texts can be repurposed for digital research that investigates the development of print technology or maps letter exchange within a European network of modern scholars7. Bernhard Rieder and Theo Röhle, who design and work with digital methods, explain that ‘the heuristic function of digital research methods in the humanities is mostly focused on the finding of patterns, dynamics, and relationships in data’ (2012: p. 70). Due to this heuristic functionality, we can understand digital methods as a means to an end and not as an end in themselves, since they are not problem solving in nature, but rather serve as a way of discovering or studying aspects of a phenomenon. Next to that, these technically infused methods can be thought of as exploring the new way of working with digital representations and the ‘digital folding’ of reality (Berry 2011: p. 1).

6 The alphabetical list classifies tools purpose-wise, affording tools for annotation, archiving, capturing, modelling, organizing, programming and visualizing data amongst others. See http://dirtdirectory.org, accessed 23-06-2016

However, with regard to the kind of data, Richard Rogers notes that one should be attentive to the difference between research working with pre-digital information and with digital-only information: ‘an ontological distinction may be made between the natively digital and the digitized, that is, between the objects, content, devices, and environment that are “born” in the new medium and those that have “migrated” to it’ (2013: p. 19). The work of the Digital Humanities is in that sense mostly concerned with digitized data approached through digitized methods, whereas the field of New Media works with digital native data and medium-specific methods. With regard to this methodological distinction, Rogers stresses that ‘one may view current Internet methods as either those that follow the medium (and the dominant techniques employed in authoring and ordering information, knowledge and sociality) or those that remediate or digitize existing method’ (2013: p. 38). The idea of following the medium is in line with the previously mentioned concepts of using the affordances of digital devices for research-driven purposes and thereby repurposing them with regard to their initial purposes, as with social media platforms or search engines. In contrast to this approach, non-medium-driven research focuses on human expressions such as the word in general, music, theatre, design, painting and phonetics, which translates into methods that have their origin in disciplines such as hermeneutics and sociology.

Since digital methods are built into and used through the software of digital devices, their composition is often concealed from their users. Digital humanists Ramsay and Rockwell reflect on this, stating: ‘a well-tuned instrument might be used to understand something, but that doesn’t mean that you as the user, understand how the tool works’ (2012: p. 80). They distinguish the type of tools that are made for general usage, which are often transparent in their role as facilitators although they do not explain or argue their contribution (2012: p. 78). This leads Ramsay to claim that methods themselves are worthy objects for study and reflection (2004: p. 177). In addition to this, Rieder and Röhle emphasize that one should be attentive to the embedded purposes of the makers of tools, as they state that ‘in creating truly digital methods, we mechanise parts of the heuristic process, and we specify and materialise methodological considerations in technical form’ (2012: p. 75). Consequently, they argue for transparency in response to this concealment, which ‘simply means our ability to understand the method, to see how it works, which assumption it is built on, to reproduce it, and to criticise it’ (ibid). Especially in cases where the users of the methods are not the creators, the concern should not merely be with the output of the method, but equally with the input, since researchers might use the method for a variety of purposes other than the pre-imagined ones.

7 See for more examples http://shc.stanford.edu/digital-humanities or the projects by UCLA’s Digital Humanities program: http://www.cdh.ucla.edu/projects/, that of Berkeley: http://digitalhumanities.berkeley.edu/projects, King’s College:

However, in addition to this critical attitude on the side of the users, the demand for methodological transparency can also be answered by untangling digital research methods: an investigation of a method’s affordances and purposes, which refers to the computational affordances as well as the intended usage that is inscribed into the method by its creators. Untangling can in that sense be understood as the act of revealing, as discussed earlier in relation to Heidegger’s interaction with technology. As mentioned earlier, the introduction of computational technology and its digital format have transformed the process of ordering in terms of information representation and manipulation. The locus of order within computational technology is the database, which explains the specific focus on this information system, especially within the realm of digital research methods, where knowledge is shaped and ultimately composed. Therefore, the process of untangling will focus on two different digital methods that rely on databases, one from the field of digital history and one from new media. The first is a tool called ePistolarium, which works with digitized data from the 17th century consisting of letters between Dutch scholars. The second tool is called the Digital Methods Initiative’s Twitter Capturing and Analysis Toolset (DMI-TCAT), which, as the name already denotes, captures and allows analysis of Twitter data, i.e. natively digital data from the Internet.

By investigating the interfaces of both tools first, the underlying mechanisms of the databases will be grasped in combination with available documentation. This is expanded through interviews with the designers and a selection of users of both tools, in order to understand what they perceive the affordances of the tool and its underlying database to be, contrasting this with their specific research purposes. The interviews will in that sense serve as a direct investigation of the methodological interaction between the users or makers and the tools, along the lines of the affordances-purposes model outlined above. This will be brought into relation with the rationale behind ordering processes as constituted in the theoretical part. Altogether, the combination of theoretical and first-hand analysis of the platform and software through the direct practitioners allows me to get as close as possible to the current Humanistic approach towards the usage and imagination that surrounds databases within the process of knowledge formation.


Chapter I

The Evolution of Order

Databases are an ubiquitous feature of life in the modern age, and yet the most all- encompassing definition of the term ‘‘database’’ – a system that allows for the efficient storage and retrieval of information – would seem to belie that modernity. The design of such systems has been a mainstay of humanistic endeavor for centuries; the seeds of the modern computerized database being fully evident in the many text-based taxonomies and indexing systems which have been developed since the Middle Ages. Whenever humanists have amassed enough information to make retrieval (or comprehensive understanding) cumbersome, technologists of whatever epoch have sought to put forth ideas about how to represent that information in some more tractable form. (Ramsay 2004: p. 177)

This passage from Stephen Ramsay, a professor in English and lecturer in programming within Digital Humanities departments, underlines how traceable our technologies are. Databases and their underlying logic have their origin in a variety of previous techniques, such as the taxonomy and index, as well as information structures, such as the list, table and tree. These techniques help us to store information so that we can get an overview of what is in there and analyse or retrieve it, as well as expand it over time. Although the overall goal of these techniques is shared, each of them serves individual purposes, based on its specific affordances. In this part I aim to trace the technologies – or techniques – that cumulatively formed the database as our present condition of possibility for the efficient storage and retrieval of information, so that we can better understand the phenomenon of order as it is present within our most recent and technological methods for epistemic inquiry.

1.1 Representations and Classification

Before one can arrange things, they need to exist as concepts, and thus we will begin tracing order’s origin to the formation of representations. According to Smiraglia and van den Heuvel, ‘conceptions form part of the ideas we have of reality, they can serve to obtain a classification of objects – even of the whole universe – and to find new truths and facts’ (2013: p. 368). Plato considered two ‘orders of reality’ and his concepts are based on a correspondence between these two realities, where one is the plane of perfection, i.e. the universe or macrocosm, and the other is the physical world of imperfections, our earthly microcosm. His logical reasoning was based on resemblance between these two spheres and therefore he argued that physical objects or appearances have a perfect Form in the higher realm of the universe. So ‘Form’ is the universal, the singular category that binds together all particulars, which correspond to that Form in terms of their characteristics. For example, there can be many beautiful things, but there is only one Form called ‘Beauty’ and it remains unaffected by the increase or decrease of beautiful particulars8. In this way, he understood Form to be unchangeable and existing on its own, while the particulars ‘participate’ in the Form. Altogether, Form can be understood as the main characteristic that groups together things with that same characteristic, although Form exists on a higher plane than the participating things.

In contrast to Plato’s ‘two orders of reality’, Aristotle grounds truth within the physical realm in eyesight of human beings, and he came to the conclusion that Form is not a thing on its own in another reality, but can actually be found within the concrete substance of things. This allowed thinking of Forms as ‘principles’ or classes that declare why some things fit into one category and why others do not (Weinberger 2007: p. 69). This move fully

realized the process of classification as a technique to group items according to the one-over-many principle. A more recent definition of classification can be found in C.M Sperberg-McQueen account, where he neatly outlines the rationale behind classification: ‘Classification is, strictly speaking, the assignment of some thing to a class; more generally it is the grouping together of objects into classes. A class, in turn, is a collection (formally, a set) of objects which share some property’ (2004: p. 161). Interestingly enough, the rationale behind classification is often pinpointed on the grouping of things based on similarities; however, in relation to these similarities the process of ordering automatically marks the distinction between things or groups of things as well. Altogether, there are two levels that constitute order; the creation of conceptual representations of things and the grouping of these based on similarities and distinctions, which is called classification.
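Sperberg-McQueen’s definition can be made concrete in a short sketch. The following Python fragment (the objects and their properties are invented for illustration) groups things into classes by one shared property; the choice of grouping key stands in for the selection of characteristics described in this paragraph:

```python
from collections import defaultdict

def classify(objects, key):
    """Group objects into classes according to one shared property.

    The choice of `key` decides which similarities count and which
    are ignored -- it both groups and, implicitly, distinguishes.
    """
    classes = defaultdict(list)
    for obj in objects:
        classes[key(obj)].append(obj)
    return dict(classes)

# Hypothetical objects, each describable by several properties at once.
things = [
    {"name": "rose", "colour": "red",   "kind": "plant"},
    {"name": "ruby", "colour": "red",   "kind": "mineral"},
    {"name": "fern", "colour": "green", "kind": "plant"},
]

# Two different choices of property yield two different orders of the same things.
by_colour = classify(things, key=lambda t: t["colour"])
by_kind = classify(things, key=lambda t: t["kind"])
```

Grouping by colour places the rose and the ruby in one class; grouping by kind separates them: the same collection, two orders, depending on which property is committed to.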

Within the process of classification itself there is a methodological distinction between ‘perfect’ order and ‘imperfect’ order. A ‘perfect’ order would stick as close as possible to the entire set of characteristics of an object to determine the classification process. ‘Imperfect’ or artificial order, on the contrary, refers to processes where one or several characteristics are picked out at the expense of others, or where artificial characteristics are added, depending on the final purpose of the order. According to Sperberg-McQueen, a ‘“perfect” classification scheme, in the sense described above of a scheme that perfectly captures every imaginable similarity among the objects of O, is thus a purely imaginary construct; actual classification schemes necessarily capture only a subset of the imaginable properties of the objects, and we must choose among them on pragmatic grounds’ (2004: p. 172). But the same argument could be made for the level of representations, since despite the fact that Plato considered ‘Forms’ to be perfect representations, this view itself could also be contested. Unsworth provokes this thought by stating that ‘the only completely accurate representation of an object is the object itself. All other representations are inaccurate; they inevitably contain simplifying assumptions and possibly artifacts’ (2002). Nonetheless, Unsworth encourages working with representations in order to map out terrains of knowledge and enable them to be navigated for different purposes.

8 As Plato states in ‘The Republic’ 476a: ‘That, since beauty and ugliness are opposites, they are two.’ And as they are two, each of them is single. The same is true of justice and injustice, good and evil, and all qualities; each of them is in itself single, but they seem to be a multiplicity because they appear

For Unsworth, the process of selecting a representation implies making a set of ontological commitments, which ‘are in effect a strong pair of glasses that determine what we can see, bringing some part of the world into sharp focus, at the expense of blurring other parts. These commitments and their focusing/blurring effects are not an incidental side effect of a representation choice; they are of the essence’ (2002). Interestingly, Unsworth considers ontological commitments as essential to what is being represented and this idea can also be translated to the level of classification. We can thus understand ordering as an ontological commitment, whereby a selection is made of objects to be represented as well as the characteristics on which they are grouped or classified. This corresponds to the idea by Sperberg-McQueen that ‘every ontology can be interpreted as providing the basis for a classification of the entities it describes. And conversely, every classification scheme can be interpreted with more or less ease, as the expression of a particular ontology’ (Sperberg-McQueen 2004: p. 162).

Working with this idea that a classification reflects an ontology and, moreover, an ontological commitment, allows us to see the intentions of the makers behind an order. Therefore, by examining historical examples of concrete orders it will become clear how different sorts of orders are based on different perceptions and different purposes. In addition, the examples will make more concrete how order can manifest itself methodologically, in the form of techniques and grids.

1.2 List, table and tree to structure ontology

As we have seen in the previous paragraph, order is based on the process of selecting representations of things and grouping them based on similarities and distinctions, creating the one-over-many structure of information organization. The most basic expression of order can be found in a list that presents similar things under one header, i.e. the class that binds the similarities. However, lists do not allow grasping the relations between different classes, or groups of things, and this urged the need for systematics: a method of organizing information based on a system. The first example of this can be found in the 18th century’s Systema Naturae, i.e. the system of nature: ‘such, traced out, as it were, in dotted lines, was the great grid of empirical knowledge: that of non-quantitative orders. And perhaps the deferred but insistent unity of Taxinomia universalis appeared in all clarity in the work of Linnaeus, when he conceived the project of discovering in all the concrete domains of nature or society the same distributions and the same order’ (Foucault 2002: p. 84). Carl Linnaeus (1707-1788) laid the foundation for taxonomy, i.e. the science of classifying organisms, and is therefore considered to be the father of modern classification as it is still in use today.

So how did his order come about, in terms of procedures as well as format? Linnaeus divided nature into three sorts of representations: the kingdoms of animals, vegetables and minerals, and in line with Aristotelian logic, he presented a method of classifying the plant kingdom into five levels: classes, orders, genera, species and varieties (Headrick 2000: p. 22). The first two of these levels, ‘classes and orders’, can be conceived as the manifestation of Linnaeus’ system, in the sense that they allow identifying plants efficiently. They are thus cultural constructs, whereas the middle two levels, ‘genera and species’, were deemed ‘natural’ and eternal: ‘every genus is natural, and created as such in the beginning: genera and species are confirmed by revelation, discovery and observation’ (Linnaeus as cited in Headrick 2000: p. 22). The eternal aspect reflects the Platonic idea of the ‘perfect’ Form, which is the unchangeable class that exists before and despite human observation. The importance of genera and species shines through in the binomial system that Linnaeus created in line with Aristotle’s logic to name each organism by a two-word combination based on its genus and species, i.e. similarities and differentiating characteristics. In this way, Linnaeus transformed Aristotle’s method into a concrete system for representing and ordering nature, as well as exporting it into an all-encompassing table, as can be seen in figure 1, which was to be systematically expanded over time.


Figure 1: Linnaeus taxonomy of the animal kingdom as presented in a tabular structure, 1735

One would expect that Linnaeus’ classification of organisms stays as close as possible to the entire set of characteristics that can be found in nature in order to define and differentiate it. However, as outlined above, Linnaeus consciously chose an imperfect order that prioritizes certain characteristics over others to create an efficient system. He identified plants based on their distinctive male and female organs, thus creating the sexual system, thereby excluding other characteristics of the plant such as their parts and functions, physiology and ecology (Headrick 2000: p. 22). In a similar vein, the binomial system might have imposed a pattern of names, i.e. representations, which did not correspond to the actual underlying reality. Despite these moves towards imperfect or artificial order, the system proved to be efficient and it was internationally adopted, which allowed the discipline of botany to grow. The choice for artificial order can in this case also be justified, since it is most applicable in cases where there are many varieties and no clear discontinuities (Headrick 2000: p. 21). This shows that imperfect order might not be so much about order as a direct representation of things, but rather an undertaking for the analytical purposes that can be generated from it. In Linnaeus’ case, the taxonomy made it possible to systematically identify and expand knowledge on organisms. Moreover, the tabular grid surpassed the former structure of organizing information in lists, since the table allows for the storage, retrieval and comparative analysis of large quantities of information, as well as the development of other complex classifications.


Although Linnaeus’ table might not directly show its underlying scheme of order, it can be retraced to the shape of a tree-structure, as can be seen in figure 2, which starts with a few main classes that branch off into different levels of sub-categories. This hierarchical structure allows adding new findings into the existing categories by connecting them through a ramification. As Weinberger states: ‘trees are a supremely powerful way of understanding systems as complex, as say, the universe. (…) The tree of knowledge, the tree of species, the breakdown of the human body into major biological subsystems, the division of consciousness into reason and emotion, even the division of the earth into continents and countries – all are ways of understanding, not ways of looking up information’ (2007: p. 71). Here we can recall Unsworth’s encouragement to work with representations in order to map out terrains of knowledge and enable them to be navigated for different purposes. Moreover, Linnaeus’ classification is illustrative of the way the ontology of the natural kingdoms came into being, as well as the ontological commitment he made to identify them.
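The tree-structure described above can be sketched as data. The Python fragment below models a drastically simplified taxonomy (the ranks and names are illustrative, not Linnaeus’ actual tables): every new finding must attach to exactly one existing parent class, and every class is reachable by exactly one path from the root, which is precisely what makes the tree both powerful and rigid:

```python
# A much-simplified, invented taxonomy tree: class name -> sub-branches.
tree = {
    "Animalia": {
        "Mammalia": {"Homo": {}, "Canis": {}},
        "Aves": {"Corvus": {}},
    }
}

def insert(tree, parent, child):
    """Add a new finding by ramification: the entry must be attached
    to exactly one existing parent class."""
    for name, branches in tree.items():
        if name == parent:
            branches[child] = {}
            return True
        if insert(branches, parent, child):
            return True
    return False

def path_to(tree, target, trail=()):
    """Recover the single route from root to a class -- the one
    'starting point' a static tree imposes on its navigator."""
    for name, branches in tree.items():
        here = trail + (name,)
        if name == target:
            return here
        found = path_to(branches, target, here)
        if found:
            return found
    return None

insert(tree, "Mammalia", "Felis")  # a new genus slots under one parent
```

An item that plausibly belongs under two branches at once cannot be expressed here without duplication, which anticipates the limits of enumerative classification discussed below.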

Figure 2: the tree-structure as a potential organisation of information

1.3 Static versus flexible classification

Next to the taxonomy, indexes form a major part of the history of order as well. While the taxonomy contributes to the formation of knowledge by affording an informational overview for the purpose of analysis, the index can be considered a structure of existing knowledge for the main purpose of information retrieval.

A special example can be found in the case of reference works, such as the dictionary and encyclopaedia, whose purpose seems to be both explanatory as well as retrieval: ‘a dictionary explains words, whereas an encyclopaedia explains things. Because words achieve their usefulness by reference to things, however, it is difficult to construct a dictionary […]’ (cited in Headrick 2000: p. 144). Their difference is therefore to be found on the level of the arrangement of information: the dictionary being arranged alphabetically and thereby following a list-structure, whereas the encyclopaedia works thematically, based on the classifications of a hierarchical tree-structure. As Headrick outlines, their arrangement is also telling for their specific purpose: ‘the alphabetical order emerged as the key characteristic of compendia designed for efficient information retrieval, while the thematic arrangement characterized works of didactic erudition’ (2000: p. 160). The benefit of the alphabetical order is that it is easily learned, but above all it is arbitrary and impersonal, thus allowing an order that is not based on characteristics or importance. A thematic order, on the contrary, arranges concepts according to a preconceived plan, thereby prioritizing certain characteristics over others in a hierarchical way.

Although the information structure of an encyclopaedia is expressed on paper as a list of things, its thematic order refers to the underlying tree structure. This inherently shows the ontological commitment that is made to cluster fields of knowledge according to the vision of its makers. Due to the tree structure, the position of each discipline counts in terms of its status within the bigger hierarchy of knowledge, each ramification thus implying a step down in rank. A good example of this can be found in the French 18th century Encyclopédie by Diderot and d’Alembert. The aim of their encyclopaedia was to provide an explanation of all existing knowledge: ‘the word encyclopaedia, Diderot explained in the Prospectus, derived from the Greek term for circle, signifying “concatenation (enchaînement) of the sciences”. Figuratively, it expressed the notion of a world of knowledge, which the Encyclopedists could circumnavigate and map’ (Darnton 1999: p. 194). Diderot and d’Alembert were both philosophers, which directed their thematic order, ‘for in the tree of the Encyclopédie philosophy was not so much a branch as the principal trunk’ and ‘the philosophes came to be recognized or reviled as a kind of party, the secular apostles of civilization’ (Darnton 1999: p. 199 & 208). In contrast to that, religion was removed from its former prime position, being designated to the non-empirical unknowable, which did not deserve a place in a work that explains the known through experiments and reason in the Modern world (Darnton 1999: p. 205). Thus, by studying the underlying tree of knowledge, one can grasp the ontological commitment and trace the context and vision of its makers, which is telling for the political aspect of imposing an order among things.9

Another example of indexes as an order of things can be found in the organization of knowledge in libraries for the purpose of efficient retrieval. In 1876 Melvil Dewey proposed the Dewey Decimal System as a universal way of cataloguing books for libraries, organizing books alphabetically within subjects, which implied rearranging books relative to each other topically, instead of physically. He combined this mix of thematic and alphabetical order with the attribution of a numerical system to make his system more expansive, since the ‘decimals offered Dewey an infinity of subdivisions; by placing topics to the right of the decimal point, he could stretch his subject areas without limit’ (Weinberger 2007: p. 54). The underlying classification scheme thus includes multiple characteristics and thereby involves a hierarchy of ‘increasingly fine distinctions’, organizing for example literature in the 800s, then breaking down into the 820s for English literature, the 830s for German literature etc., whereas the third number denotes the genre, such as drama or fiction (Sperberg-McQueen 2004: p. 163). Despite the fact that the numerical system has no quantitative significance, the sequence of values is carefully chosen. Since this is another way of structuring knowledge for public use undertaken by a philosopher, it would come as no surprise to find philosophy in the top rank of the 100s, showing how numerical classification in combination with a thematic tree still preserves a hierarchical or ranked order of things.

9 For an extensive outline of the tree of knowledge behind the Encyclopédie in comparison to those of
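Dewey’s notation can be read as a hierarchy encoded in the digits themselves. The sketch below uses a tiny, invented fragment of a Dewey-style schedule (the labels echo the 800s example in the text); each shorter digit-prefix, padded with zeros, names a broader class, so the tree is implicit in the call number rather than stored separately:

```python
# Invented fragment of a Dewey-style schedule: call number -> class label.
schedule = {
    "800": "Literature",
    "820": "English literature",
    "822": "English drama",
    "830": "German literature",
}

def broader_classes(call_number):
    """Walk leftward through the digits: every shorter prefix (padded
    with trailing zeros) names a broader class, from general to fine."""
    trail = []
    for depth in (1, 2, 3):
        cls = call_number[:depth].ljust(3, "0")
        # Skip prefixes absent from the schedule and repeated padded forms.
        if cls in schedule and (not trail or trail[-1][0] != cls):
            trail.append((cls, schedule[cls]))
    return trail
```

Digits to the right of the decimal point would extend the same logic without limit, but the order of the trail is fixed: one must always enter through the broadest class first, which is exactly the static, ranked quality the next paragraph responds to.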

In response to this ‘static’ order of things, which is based on a predetermined tree-structure, the rise of ‘flexible’ classification marks a turning point. We can find an example of this in the successor of the DDC, namely the Universal Decimal Classification system, or UDC in short. Its makers, Paul Otlet and Henri La Fontaine, introduced in 1904 tables of general application that were free from predefined concepts, but did have detailed, reusable attributes and, above all, ensured the relation between different fields of knowledge through interlinked symbols and syntax. As current UDC Chief Editor Aida Slavic explains with regards to this form of synthetic classification: ‘the role of classification in such systems is to serve as an underlying knowledge structure that provides systematic subject organisation and thus complements the search using natural language terms’ (2004).

We can see a further expansion of flexibility in S.R. Ranganathan’s colon classification, where classification is based on colons that separate facets, i.e. five basic categories: personality, matter, energy, space and time, which are then subdivided into lists of possible values called ‘isolates’. By combining the isolates with the facets, a flexible way of classifying books emerged that did not require advancing a tree of knowledge beforehand (Weinberger 2007: p. 80). The rise of faceted or colon classification changed the practice of information retrieval into a more flexible system that allows navigating multiple paths, since there is no static or hierarchical order, i.e. no pre-defined starting points. Thus, whereas an enumerative classification contains a full set of entries and a linear procedure for all concepts, as we have seen in the case of the encyclopaedic tree of knowledge, a faceted classification is based on semantic categories that can be combined interchangeably in order to create concepts. The rise of flexible classification systems allows rethinking the role of knowledge and the way it should or could be ordered.
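Faceted composition can likewise be sketched. In the fragment below, the five fundamental categories are Ranganathan’s, but the isolate values are invented for illustration; a compound subject is assembled on demand from facet-isolate pairs joined by colons, with no tree drawn up in advance and with any facet freely omitted:

```python
# Ranganathan's five fundamental categories; the isolates are invented.
facets = {
    "personality": ["libraries", "medicine"],
    "matter": ["manuscripts", "x-rays"],
    "energy": ["cataloguing", "diagnosis"],
    "space": ["India", "Europe"],
    "time": ["1950s", "19th century"],
}

def compose(**chosen):
    """Build a compound class on demand by joining isolates with colons.

    Unlike a tree, no enumeration of all possible compounds exists
    beforehand; any subset of facets can be combined."""
    for facet, isolate in chosen.items():
        if isolate not in facets[facet]:
            raise ValueError(f"{isolate!r} is not an isolate of {facet!r}")
    order = ["personality", "matter", "energy", "space", "time"]
    return ":".join(chosen[f] for f in order if f in chosen)

subject = compose(personality="medicine", energy="diagnosis", matter="x-rays")
```

With five facets of two isolates each, dozens of compound subjects are expressible from ten stored terms; an enumerative scheme would have to list every compound in advance as its own node.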


The examples of enumerative classification show the politics that surround hierarchical order and the top-down view that comes along with it. The introduction of faceted classification can be seen as a more flexibly adapted order, since knowledge and reality are multifaceted, and in that sense a system of order might take a more basic approach so as to stick to those characteristics. Moreover, its understanding and access might depend more on the ‘navigator’ than on the ‘mapmaker’. By creating a universal order, one implicitly creates a unilateral view, and this caused the enumerative projects to fall short in acknowledging the diversity of knowledge and its subjective ways of understanding and retrieving.

In addition, the approaches behind the Encyclopédie and the DDC are not coherent with the fact that knowledge is an ever-growing phenomenon that does not fit in a tree-structure or in the right place of a decimal, since its diversity requires different starting points and not just different end points. This is not only the case for the indexes, but also for the table of Linnaeus, as it is bound to the same restrictions of the tree-structure as well as its format of expression: ‘Linnaeus’ system not accidentally shares properties with the paper that expresses it: bounded, unchanging, the same for all readers, two-dimensional, and thus only with difficulty able to represent exceptions and complex overlaps, making all visible in a glance, with no dark corners. Linnaeus’s organization took the shape it did in part because he constructed it out of paper’ (Weinberger 2007: p. 77).

Since knowledge is in constant development, the desire to increase and improve its format of expression requires a matching pace of continuous editing. The physical format of paper is rigid in the sense that it needs to be reprinted when changes or updates occur. Physical limitations thus guide the organization of knowledge: ‘from management structures to encyclopaedias (…) we have organized our ideas with principles designed for use in a world limited by the laws of physics’ (Weinberger 2007: p. 7). From the moment one abstracts a conceptual order that is not based on the physical laws of atoms, one can make any abstraction for categorization that serves several interests at once, that is, one which honours the diversity of knowledge.

1.4 Moving from one to several tables: the relational database

As we have seen in the previous paragraphs, there have been quite a number of analogue attempts at ordering information. With the introduction of the computer as a new medium of expression, the doors opened for the adaptation of previous information structures and for more expansive experimentation with them. In addition, its expressive power allowed the content itself to become digitized, that is, turned into computational representations that are increasingly atomized and modular. Due to this extra level of abstraction, an increased flexibility becomes possible in terms of combining, connecting and attributing pieces of information to several places, in line with faceted classification. As Weinberger states: ‘the gap between how we access information and how the computer accesses it is at the heart of the revolution in knowledge. Because computers store information in ways that have nothing to do with how we want it presented to us, we are freed from having to organize the original information the way we eventually want to get at it’ (2007: p. 99).

This liberation is inherently linked to the relational database: an information system that is deliberately unconcerned with how data is stored physically and, due to this abstraction, allows for more flexibility than previous information management systems. The database can be considered an accumulation of the previously discussed information systems: it stores, orders and retrieves information. However, the affordances of databases differ from those of previous systems, for several reasons that will be discussed here in order to prepare for the next chapter, where they will be examined on their epistemic contribution. To recall, we can understand affordances in line with Gibson’s statement that ‘an affordance is not bestowed upon an object by a need of an observer and his act of perceiving it. The object offers what it does, because it is what it is’ (1977: p. 138). Thus, it becomes possible to investigate the core characteristics of databases in comparison to previous information systems.

In general, the database’s functionality can be understood as that of a ‘computerized record-keeping system’, implying that information can be broken down into parts, i.e. data, which are stored in a repository, or base. Storage of data takes place in physical hardware, such as magnetic disks or SSD drives, and the manipulation of data occurs via the software part called the ‘Database Management System’, or DBMS in short (Date 2004: p. 6). The database is thus a concrete set of stored data that can only be accessed via a DBMS. This makes the DBMS the central place of action in terms of ordering and retrieving, and it works on demand, which means that it requires user interaction in order to perform. This aligns with the purpose of a database, which is ‘to store information about a particular domain (sometimes called the universe of discourse) and to allow one to ask questions about the state of that domain’ (Ramsay 2004: p. 179). The currently dominant database model, the relational one, can thus be conceived as the first classification system that enables a searchable ontology.

Within the development of classification, the possibility of searching an ontology means that there is an increased degree of control over and interaction with information for more varied purposes. Whereas the static order often represents a top-down and ‘universal’ order, the dynamic, or searchable, order allows creating a personalized ‘sub-order’ within a bigger scheme of classified information. The way this ‘sub-order’ is constituted is through the query, which can be understood on the software level as a set of instructions to the DBMS, but on the interface level it means that any user can ask a question of the specific domain through one word or a string of words. On the software level, the instructions that are given through the database query language SQL (Structured Query Language) can be as varied as adding files to or removing files from the database, as well as inserting, changing and deleting data in existing files (Date 2004: p. 3). Ramsay illuminates these operations further:

‘Database queries can reach a significant level of complexity; the WHERE clause can accept Boolean operators (e.g. ‘‘WHERE year_of_birth > 1890 AND year_of_birth < 1900’’), and most SQL implementations offer keywords for changing the ordering of the output (e.g. ORDER BY year_of_birth). Taking full advantage of the relational model, moreover, requires that we be able to gather information from several tables and present it in one result list’ (2004: p. 190).

What becomes clear from this quote is how specified the interaction between a user and the database can be, and especially that ‘most SQL implementations offer keywords for changing the ordering of the output’. This means that the query represents a key moment within database interaction; it transforms the database from a ‘passive’ information storage system into a performance in which classification is composed on demand. The query embodies the key class on which the dataset is ordered: it groups the related data and excludes the non-related data, thereby creating a new table that contains the inquired sub-order. Sperberg-McQueen confirms this idea by explaining that ‘when the axes and their values are derived post hoc from the items encountered in the collection of objects being classified, we may speak of an a posteriori or data-driven system. Author-specified keywords and free-text searching are simple examples of data-driven classification’ (2004: p. 167). Order is in that sense situated within the query, and the query can be conceived as a personal inquiry, although it only becomes possible in interaction with and through the medium. The DBMS is centred on this user interaction, and this can be considered the first and main affordance of databases, as it allows constantly generating a personalized ‘order of the query’ in correspondence to its user.
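The kind of query Ramsay describes — Boolean operators in a WHERE clause, reordered output, and information gathered from several tables into one result list — can be sketched with Python’s sqlite3 module. The tables, names and works here are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT, year_of_birth INTEGER)")
conn.execute("CREATE TABLE works (author_id INTEGER, title TEXT)")
conn.executemany("INSERT INTO authors VALUES (?, ?, ?)",
                 [(1, "Woolf", 1882), (2, "Kafka", 1883),
                  (3, "Brecht", 1898), (4, "Dickens", 1812)])
conn.executemany("INSERT INTO works VALUES (?, ?)",
                 [(1, "Mrs Dalloway"), (2, "Der Process"), (3, "Mutter Courage")])

# Boolean WHERE clause, ordered output, and a join gathering
# information from two tables into one result list.
rows = conn.execute("""
    SELECT authors.name, works.title
    FROM authors JOIN works ON works.author_id = authors.id
    WHERE year_of_birth > 1890 AND year_of_birth < 1900
    ORDER BY year_of_birth
""").fetchall()
print(rows)  # [('Brecht', 'Mutter Courage')]
```

The result is itself a small table: a ‘sub-order’ composed on demand from the larger scheme of classified information.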

The second affordance can be located in the structure of data organization, also referred to as the database model. There have been several database models, starting with the hierarchical topology at the end of the 1960s, which was a direct adaptation of the tree structure and thus ran into the problem of having to ascribe each new data entity to a parent (Stonebraker & Hellerstein 2005: p. 3). This led to overlapping data within several branches of the tree, and changing a type of data had to be done ‘manually’ for each entity within each branch. Therefore, the network database model was introduced in the 1970s, in which new entities did not need to be added to a single parent but could have multiple or zero parents due to its graph topology (Stonebraker & Hellerstein 2005: p. 8). However, this was still a very complex and time-costly way of maintaining data, since it required a programmer to navigate back and forth in order to change data.
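The redundancy problem of the hierarchical model can be illustrated with a toy tree built from nested Python dictionaries; the catalogue and its entries are invented for illustration:

```python
# A toy hierarchical (tree-shaped) catalogue: every record hangs under a parent.
# The same book appears under two branches, so its data is duplicated.
catalogue = {
    "Fiction": {
        "Moby-Dick": {"author": "Melvile", "year": 1851},
    },
    "Sea Stories": {
        "Moby-Dick": {"author": "Melvile", "year": 1851},  # duplicate copy
    },
}

# Correcting a misspelled field means walking every branch 'manually',
# touching each duplicated entity in turn.
for branch in catalogue.values():
    if "Moby-Dick" in branch:
        branch["Moby-Dick"]["author"] = "Herman Melville"
```

Miss one branch and the tree becomes inconsistent — exactly the maintenance burden that the relational model was designed to remove.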

Around the same time, in the 1970s, came the relational database model, which had a tabular structure consisting not of one but of several linked tables. The term ‘relational’ could be misunderstood as referring to the relations between tables, but ‘relation’ is actually a mathematical term for table (Date 2004: p. 26). Codd, the inventor of the relational database model, was a mathematician and therefore did not focus on physical implementation but rather on the logical abstraction of data. He specifically envisioned the interaction between users and databases as taking place entirely on the software level of the DBMS, as he states: ‘future users of large data banks must be protected from having to know how the data is organized in the machine’ (1970). This resulted in the physical data independence that allows separating the DBMS from the hardware, and thus the data from its physical organization (Date 2004: p. 22). This move towards physical data independence is exactly the sort of abstraction that facilitates data organization in more diverse ways than would have been possible in the physical realm. By dividing information over several tables, the problem of data redundancy is solved: instead of duplicating data, the table in question can be referred to through a specific operator, so that updating data can be done in a single place. This ultimately saves time and reduces possible inconsistency. Moreover, the relational aspect literally amplifies relations on several levels: between data points, objects, classes, datasets, and ultimately between databases themselves.
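How dividing information over several linked tables removes redundancy can be sketched as follows; the tables, keys and names are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Author data lives in exactly one table; other tables refer to it by key,
# so a correction happens once instead of in every duplicated record.
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE books (title TEXT, author_id INTEGER REFERENCES authors(id))")
conn.execute("INSERT INTO authors VALUES (1, 'Melvile')")          # misspelled once
conn.executemany("INSERT INTO books VALUES (?, 1)",
                 [("Moby-Dick",), ("Billy Budd",)])

conn.execute("UPDATE authors SET name = 'Melville' WHERE id = 1")  # one correction

# Both books now show the corrected name, though only one row was changed.
rows = conn.execute("""
    SELECT books.title, authors.name
    FROM books JOIN authors ON books.author_id = authors.id
    ORDER BY books.title
""").fetchall()
print(rows)  # [('Billy Budd', 'Melville'), ('Moby-Dick', 'Melville')]
```

This is the contrast with the hierarchical tree: the shared fact is stored once and related to, rather than copied into every branch.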

Within a research context, this multiplies the epistemic possibilities. Although the database does not possess any theory, it allows looking for different kinds of relations, such as patterns or correlations, since the information is as atomized as possible, thereby accommodating unknown future interests. As Ramsay emphasizes, ‘dealing with patterns necessarily implies the cultivation of certain habits of seeing (…), to use a database is to reap the benefits of the enhanced vision which the system affords’ (2004: p. 195). In that way one can learn from the data instead of just ‘having’ it. Researchers can enter their research objectives or initial ideas into the database through queries. Working with a small dataset means that one can quickly get an overview of ‘what is in there’, which is enhanced by moving back and forth within the data through queries. In this way, one can construct a sub-order that answers questions or constructively sparks new ones.
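A minimal sketch of such pattern-seeking: an aggregate query surfaces a regularity that is not visible in any single record. The correspondence data below is invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE letters (sender TEXT, year INTEGER)")
conn.executemany("INSERT INTO letters VALUES (?, ?)",
                 [("Huygens", 1650), ("Huygens", 1651), ("Huygens", 1652),
                  ("Descartes", 1640), ("Descartes", 1641)])

# Grouping and counting reveals a pattern across the atomized records:
# who wrote most, ranked from most to least active.
rows = conn.execute("""
    SELECT sender, COUNT(*) AS n
    FROM letters
    GROUP BY sender
    ORDER BY n DESC
""").fetchall()
print(rows)  # [('Huygens', 3), ('Descartes', 2)]
```

The query was not anticipated when the letters were entered; because the data is atomized, the question could still be asked afterwards.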

Database expert C.J. Date emphasizes the possibility of generating new insights through the relational structure as well: ‘a relational system is a system in which the data is perceived by the user as tables (and nothing but tables). The operators available to the user for (e.g.) retrieval are operators that derive “new” tables from “old” ones’ (2004: p. 26). Since databases store records from enterprises, such as companies or institutions, the database is said to be a ‘bucket of facts’, which could lead to new facts by making connections between the data. Date stresses that inferring additional facts from given facts is exactly what the DBMS does when it responds to a user query: ‘the word data derives from the Latin for “to give”; thus, data is really given facts, from which additional facts can be inferred’ (2004: p. 15).11 Thus, the process of deriving new tables from existing ones, additional facts from given ones, allows an iterative working chain through user queries that ultimately results in ‘one final’ output table for the user, which encapsulates the personalized order of things based on the query. The resulting output table can be compared to Linnaeus’ taxonomy, since it shares the technique of ordering information in a tabular structure for analysis. However, the digital format of the database encourages distinctive user interaction, and this makes the output table potentially different each time.
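Date’s point that the operators derive ‘new’ tables from ‘old’ ones can be sketched concretely: the output of one query becomes a table that the next query takes as input. The data is again invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, year_of_birth INTEGER)")
conn.executemany("INSERT INTO people VALUES (?, ?)",
                 [("Woolf", 1882), ("Kafka", 1883), ("Dickens", 1812)])

# A query derives a 'new' table from the 'old' one...
conn.execute("""
    CREATE TABLE moderns AS
    SELECT * FROM people WHERE year_of_birth > 1850
""")

# ...and a further query can be chained onto that derived table.
rows = conn.execute(
    "SELECT name FROM moderns ORDER BY year_of_birth").fetchall()
print(rows)  # [('Woolf',), ('Kafka',)]
```

Each step of the chain is itself a table, which is what makes the iterative working chain of queries possible.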

Consequently, the omnipresence of relational databases opens up the power of information ordering to a wider spectrum of users with distinctive needs, allowing for different embedded worldviews and varied orders. As Date emphasizes: ‘any given user will typically be concerned only with some small portion of the total database; moreover, different users’ portions will overlap in various ways. In other words, a given database will be perceived by different users in many different ways. In fact, even when two users share the same portion of the database, their views of that portion might differ considerably at a detailed level’ (2004: p. 8). Therefore, it is interesting to now turn to practical examples of relational database usage within digital research methods, to explore how their purposes align with the affordances explored above and to enquire how different users and creators of these methods experience this interaction.

11 The idea that data consists of facts allows making true propositions according to the formal theory of logic, where a statement is either true or false. This level of understanding the effects of data within databases is inherently linked to the fact that most databases are built on a relational model of data, which is essentially a formal theory based on logic that allows making a logical abstraction of data in contrast to its physical layer.
