Comparing Social Network Diagrams

An empirical experiment on Node-link and Matrix diagrams

Master thesis Information Science

11th December 2019

Abstract

For the visual representation of social networks, it is common to use node-link diagrams. However, the matrix diagram is often suggested as a good alternative. The goal of this study is to investigate the viability of the matrix diagram as a method of network representation. Forty participants were recruited to take part in an experiment in which their task was to answer questions by looking at node-link diagrams or matrices. Aside from being presented as either node-link diagrams or matrices, the networks also vary in the number of persons and relationships they contain. The questions require the participants to look at the persons in the diagram and the relationships between them. For each given answer the response time and accuracy were measured. The results of our experiment suggest that matrix diagrams are generally more usable for all networks, as well as for questions that require the participant to scan the entire diagram. This may be because participants need less time to search a matrix than to search a node-link diagram.

1 Introduction

This section will first outline the problem this study seeks to address, before discussing why this problem is worth solving from both theoretical and practical perspectives.

1.1 Problem introduction

Social media platforms such as Facebook or Twitter generate ever-larger data-sets that may be used to study social networks among humans (Khan et al., 2014). Although statistical analysis and automated classification are often used to draw conclusions from these large amounts of data, the preliminary exploration of the same data can be facilitated by well-designed and useful diagrams. Visualisation research with respect to social networks has focused on the use of node-link diagrams, but this style of diagram becomes less effective as the social network becomes large (Ghoniem et al., 2004). In this study we consider a possible alternative to the node-link diagram: the matrix diagram. This diagram style is better suited for large and dense networks, but is relatively under-used in studies into the use of diagrams despite this advantage (Ghoniem et al., 2004).

1.1.1 Theoretical Motivation

The approach of this study is partially inspired by the work of Ghoniem et al. (2004), who studied the usability of node-link diagrams and matrices. They conclude that node-link diagrams and matrices are complementary visualization techniques, with the former being best suited for small networks and the latter more appropriate for representing larger networks. While the authors consider the two types as equally valuable, they note the under-use of matrix diagrams for social network visualisation. The goal of this study is to gain more insight into the usability of matrices and node-link diagrams representing social networks. Investigating the task performance of participants on different types of tasks and social networks of different sizes and composition will help achieve that goal.

Size is not the only variable whose effect on usability can be investigated. In Bakker and Bosveld-de Smet (2018), the authors discuss several additions to make node-link diagrams more expressive, including different node sizes and color encoding of both nodes and links. This study will look at several such variables.

1.1.2 Practical Motivation

Aside from the research goal described above, this study also hopes to provide practical insights into which tasks and networks work best for either node-link diagrams or matrices. Having more insight into how the different diagram aspects affect their ease of use will also guide development and design theory for visualization tools: these tools may then be able to recommend a particular diagram type or indicate how useful it is for a particular data-set.

2 Background

2.1 Visual Communication

Discussing visual communication in a way that helps us explain what we think makes for a good diagram starts with deconstructing it into three main parts: the graphical domain that describes what kind of image is used (or alternatively the verbal domain if the representation uses spoken or written language), the application domain that describes the information being depicted (or alternatively the information being verbalized), and the link that determines how information is translated between the two. It is the nature of this link component that tells us the most about how well a visualization works as a representation of information.

(a) Graphical communication

(b) Textual communication

Figure 1: The parts involved in graphical and textual communication, inspired by Wang et al. (1995)

The quality of the transfer between the two domains determines how well-suited the image is for transferring information to the user. According to Wang et al. (1995), a good link should be:

• natural: users consider the graphical elements used naturally representative of the information in the application domain.

• not misleading: no incorrect inferences can be made with the image.

• expressive: all and only the application domain data are represented.

• effective: user interpretations are relatively fast, content-rich and less

A link lacking in some or all of these characteristics may result in incorrect inferences about the source data.

The distinction between textual/verbal and visual communication depicted in Figure 1 was also explored by Shimojima (1999), who concludes that ’graphic’ representations restrict the spectrum of information that can be expressed. This seems to benefit their use as an information-transfer instrument, as it limits the scope users need to consider in order to understand the image (Shimojima, 1999).

This tells us that a good understanding of visual communication requires a look at more than just the empirical results of its use; the underlying mechanisms cannot be ignored. Yet, research into visualization seems to favour empirical, design-focused studies. While new visualization techniques and products are regularly developed and tested, there remains a lack of insight into the cognitive processes that underpin them: most studies are limited to comparing performance with and without use of visualizations (Novick, 2006). This study seeks to investigate visualization, and the social network diagram in particular, in a more comprehensive manner.

2.2 Characteristics of diagrams

A diagram can be defined as a symbolic representation of information, making it an abstract form of visual communication. Indeed, Bosveld-de Smet (2005) describes diagrams as "abstract graphical representations that share their abstractness with words, and their exploitation of space with other pictorial representations." This seems to be a key characteristic here. Abstractness strives to express important aspects of information without necessarily trying to convey everything.

Our way of testing the usability of social network diagrams is based on research into classifying and defining the capacity of diagrams to represent information. These studies on visualization investigate both the shape and the underlying mechanisms that facilitate information transfer, with two main approaches: an inferential one that focuses on the effects of a diagram and an analytical one that carefully examines the shape of a diagram.

2.2.1 An inferential approach

In a tutorial on graphical representations Shimojima (2004) compares visual representations and textual/verbal/symbolic representations using an approach based on formal logic and natural language. The result is a list of facilitating effects diagrams have on information transfer:

• Free ride properties: some types of diagrammatic representations may enable the expression of secondary information along with the primary data it is meant to represent.

• Auto-consistency: a diagrammatic representation is unable to express inconsistent information.

• Specificity: if the representation expresses certain types of information, it must also express other, additional information.

• Meaning derivation properties: the representation’s ability to express semantic content that can only be derived from other, more basic semantic content.

To illustrate this with an example, consider a standard node-link diagram. A line between two nodes expresses a link. The number of lines connecting a single node to other nodes also expresses something about that node’s relative importance or activity (Free ride properties). If the lines connecting the nodes are colored, this can express what type of link they represent. But at the same time this also, unavoidably, tells those observing the whole diagram which type of link is most prevalent across the whole network (Specificity). It is also impossible to draw the lines in an incorrect way, since all that a line needs to be is a connection between two points (Auto-consistency). The main takeaway from this is that, according to Shimojima et al., an image cannot be inconsistent, as opposed to a verbal/textual expression of the same information.

2.2.2 An analytical approach

As with other forms of visualization, a rigorous methodology is required to analyze and compare diagrams properly. This analytical approach differs from the inferential one Shimojima et al. used above, in that it covers physical attributes rather than effects on information transfer. Hegarty et al. (1991) divide diagrams into three distinct categories: iconic diagrams in which the elements resemble what they represent, charts and graphs that display typically quantitative information, and schematic diagrams that depict more abstract concepts. The latter category contains the type of diagrams used most for scientific purposes, and is therefore the one we are interested in (Novick, 2006).

Seeking a way to classify the structural properties of diagrams, Novick and Hurley (2001) and Novick (2006) describe a structural analysis based on 10 static and dynamic properties of hierarchies, matrices and node-link diagrams. These are divided into two broad categories: those that deal with the static appearance of a diagram, and those that deal with the dynamics of information movement in the diagram.

• Static

– Global structure: the general form of the diagram.

– Building block: the basic units of which the diagram is composed.

– Number of sets: the number of different object/concept sets that the diagram is optimally suited for.

– Item/link constraints: the degree to which items can be linked in the diagram.

– Distinguishability: the degree to which items in a set differ to the viewer.

– Link type: the type of relationships the diagram is best suited to display.

– Absence of a relation: the ability of the diagram to convey information about the non-existence of a possible relationship.

• Dynamic

– Linking relations: indicates whether the style of linking between nodes is one-to-many and/or many-to-one.

– Path: the ability of the diagram to display paths connecting at least three items.

– Traversal: the manner in which paths can self-connect.

These properties are used to describe aspects of the graphical domain. These descriptions may then be used to compare the usability of node-link and matrix diagrams.

The two perspectives of diagram investigation discussed in this section are both useful; the perspective of this study is that both the appearance of a diagram and the way it facilitates information transfer must be investigated. This multi-faceted approach should provide the most comprehensive answer to our question: why is one diagram more usable than the other?

2.2.3 Usability of diagrams

The goal of this study is to gain more insight into the usability of matrices and node-link diagrams representing social networks.

To compare the two diagram types, it is necessary to define what makes a social network diagram ’good’ at representing the social network. Preferably this is a value (such as ’readability’) that indicates which of the two alternatives is better suited to represent a particular social network. Various authors use different terms and definitions for this value: Ghoniem et al. (2004) define readability as the relative ease with which the user finds the information they are looking for. The term preferred by this study is usability, defined as the degree to which a diagram promotes or hinders the user’s ability to make inferences about the social network data it represents. Given this definition, the goal of this study is to determine whether matrix diagrams or node-link diagrams are more usable, both for large and/or dense network data and across different tasks. The next step, then, is to devise a way to compare the two diagram types.

Green and Petre (1996) described an approach to evaluate the usability of an interactive system using a list of what they call Cognitive Dimensions. These dimensions are a way of making general statements about the user-system relationship; they are not meant as a definitive model of the user’s cognitive processes or as guidelines for the design of a system. The authors describe them as Discussion Tools, used to hold a conversation about design instead of being a list of design criteria. The framework was originally developed for visual programming languages, but the authors intend it to be applicable to other systems as well; and although it is meant for interactive systems, it also provides some handholds for usability analysis in general. Versions of the cognitive dimensions modified for the less interactive task used in this study will be used to investigate and explain any differences in usability between the two diagram types.

The Cognitive Dimensions are terms selected to explain the human-machine interaction aspects of a system as well as the results of this interaction. Green and Petre also felt that the terms must be easily understood and coherent with each other, in line with their intended purpose as discussion tools. Each dimension, slightly generalized from the original, is listed briefly below:

• Abstraction Gradient: how much of the system does the user need to see to answer the question?

• Closeness of mapping: how useful is knowledge about the ’real’ situation to the user?

• Consistency: how much does the user’s understanding of one problem help solve a subsequent problem?

• Diffuseness: how many symbols does the system need to express information?

• Error-proneness: what is the chance that the user misinterprets the system?

• Hard mental operations: how much does the user need to remember while solving a problem?

• Hidden dependencies: can the user use the relationship between different aspects of the system to solve problems faster?

• Premature Commitment: can the user choose a strategy that hampers them in solving problems correctly?

• Progressive evaluation: is there a way for the user to check that their approach is correct before settling on a solution?

• Role-expressiveness: how clear is the meaning of a system aspect in expressing what its use is?

• Secondary notation: what different ways does the user have in the system to express meaning?

• Viscosity: how many mental transformations does the user need to make to solve a problem?

• Visibility: how many elements that the user needs to solve a problem are actually visible/distinguishable?

2.3 The application domain: Social networks

While the diagrams themselves embody the graphical domain, the application domain encompasses what we seek to represent with those diagrams: social network data. After a general description of this type of data, we will focus on a particular class of social network: social media such as Facebook and Twitter, as well as social business software such as Microsoft’s SharePoint and Jive. Afterwards, we will discuss how the two domains come together and then present our hypotheses.

The study of social networks is long established, with the work of Jacob L. Moreno being an early example and relevant research likely pre-dating even that (Freeman, 2000). Social networks describe the presence and nature of relationships between individuals. Each network consists of a set of entities and a relation between these entities.

2.3.1 On-line social networks

We focus specifically on the type of social network that can be found on platforms that enable instant communication. This type of social network is typically computer-based, with individuals communicating via messages in varying ways (e-mails, live text chatting etc.) and with varying content (text, images, files etc.). One important point of difference with off-line networks is the explicit nature of the on-line variants: interaction between two users, such as messages sent or membership of the same sub-group in the system, can be measured directly and completely. By contrast, interactions in real life are harder to quantify in the same manner.

2.3.2 Enterprise social media

The focus of this study is a specific form of social media: Enterprise Social Media (ESM). This focus was chosen because it provides a scenario that can be realistically emulated for this study.

Also known as Social Business Software (SBS), ESM are tools used to facilitate communication and interaction within organizations. The users of ESM are generally all members of the same group or place of employment. In addition, they are only used for communication within the organization; although these companies may have a presence in public social media, communication in that environment is meant for clients, not members (Leonardi et al., 2013). Another important difference is the motivation for the use of social media: whereas users use public versions mainly to keep in touch with each other, ESM is adopted to stimulate knowledge-sharing and cooperation (Richter and Riemer, 2009). ESM applications are typically referred to as platforms to emphasize that they support a range of communication methods: in addition to direct messaging, they may also support micro-blogging (Riemer et al., 2010). The latter involves posting messages in a stream that is available to all users, but not ’served’ to any of them unless they have subscribed to that particular stream. A notable example of a micro-blogging service is Twitter (Riemer et al., 2010).

Use of ESM generates communication data much like public social media networks do and, like the latter, these data hold valuable information on the community that generates them. While statistical analysis has a significant role in the process of generating insights from these data, its use requires exploration of the data first. This first look at the value of the gathered information may use images and diagrams to represent the data-points in a form that is more easily interpreted by the user.

While our study does not use actual ESM data, this application domain does provide a useful, realistic scenario. ESM networks are limited in both size and complexity, which makes them easier to realistically simulate than other social networks.

2.4 Social Network Diagrams

Exploring ESM data, much like with any other data from social networks, can be aided by using diagrams. Proper social network diagrams allow researchers to discover patterns that they would otherwise have missed (Freeman, 2000). One way to represent the information from a network is an adjacency matrix in which the cell at the intersection of the target and the source entity contains information on the nature of the relationship. This can be as simple as a ’1’ if the relationship exists and a ’0’ otherwise, or more detailed symbols representing more complex information. Alternatively, an edge list simply consists of a list of each pairing of nodes (see Figure 2a). Transforming the data this way makes it less intelligible, but it is more easily expanded for larger networks (Butts, 2008). The edge list can also easily be transformed into a diagram via computer software (see Figure 2b).

(a) Edge list & Adjacency Matrix

(b) Node-link & matrix diagrams

Figure 2: Transformation of network data into diagrams
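The relation between the two data formats in Figure 2a can be sketched in a few lines of Python. This is a minimal illustration, not the software used in this study: the function name `adjacency_matrix`, the three example persons and their links are all hypothetical, and links are assumed to be directed from a source to a target.

```python
from typing import List, Tuple


def adjacency_matrix(nodes: List[str], edges: List[Tuple[str, str]]) -> List[List[int]]:
    """Build a 0/1 adjacency matrix; rows are sources, columns are targets."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target in edges:
        # A '1' marks that a relationship from source to target exists.
        matrix[index[source]][index[target]] = 1
    return matrix


# Hypothetical example network (not data from the study).
nodes = ["Ann", "Bob", "Cat"]
edges = [("Ann", "Bob"), ("Bob", "Cat"), ("Cat", "Ann"), ("Ann", "Cat")]

matrix = adjacency_matrix(nodes, edges)
for name, row in zip(nodes, matrix):
    print(name, row)
```

The edge list grows by one entry per link, whereas the matrix always holds one cell per ordered pair of persons, which is why the former is easier to extend and the latter easier to read off directly.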

This study focuses on two types of diagrams that can be used for social networks, which differ in both design and use.

2.4.1 Node-link diagrams

The use of node-link diagrams to study social networks can be traced back to 1932, the earliest known examples being hand-drawn graphs (Freeman, 2000). In their most basic form, node-link diagrams represent persons as circles, and the existence of a relationship between two persons as a line connecting the relevant circles. This style of diagram is easy to understand, as it uses visual language that is common and requires little instruction to comprehend. Indeed, few students will leave secondary education without having encountered such a diagram (Novick and Hurley, 2001).

But this style of diagram has disadvantages as well. Node-link diagrams get cluttered when the number of nodes and links increases; the lines will begin to overlap and increase in length to connect nodes across larger distances (see Figure 3). While the nodes can usually be kept distinguishable from each other, this becomes impossible for the links as link density increases (Ghoniem et al., 2004).

This inherent tendency to become cluttered is not present in an alternative representation method: matrix diagrams.

2.4.2 Matrix diagrams

Matrix diagrams employ a more abstract approach than node-link diagrams do. Matrices are grids of cells, with each cell containing an equal amount of information. In addition, the position of a cell in the matrix also represents information. Data in the cells may be represented by numbers, words or symbols, but also by coloring the cells. Parts of the matrix may be set apart with dividing lines as well.

In a matrix diagram used for social network representation, persons are represented by rows and columns in a grid. Each person is represented by both a row and a column; the square at the intersection of a row and a column represents the relationship between the two persons connected to that row and column. At first glance this diagram type is more difficult to use, with each element containing less meaning and thus putting a higher burden on the user’s ability to comprehend it. In addition, few potential users are familiar with matrix diagrams: users typically experience difficulties in using them without prior experience (Kriglstein et al., 2018).

However, recent evidence suggests that a matrix diagram might be the superior choice to depict denser social networks: Kriglstein et al. (2018) investigate how users transform node-link diagrams to matrices and vice-versa, and conclude that matrix diagrams allowed for more creativity in representing the links found in a social network. These findings suggest there is a place for matrix diagrams in social network representations, leading to the question whether they might compare favourably to node-link diagrams at least in some situations.

(a) Node-link diagram (b) Matrix diagram

Figure 3: Comparison of two diagram types for the same network containing 50 nodes and 400 links (Ghoniem et al., 2004)
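To give a rough sense of how dense the network in Figure 3 is, the following sketch computes two simple ratios for 50 nodes and 400 links. It assumes directed links with at most one link per ordered pair of persons and uses the standard graph-theoretic density; this is an assumption for illustration, and Ghoniem et al. (2004) may define density differently. The function names are hypothetical.

```python
def directed_density(n_nodes: int, n_links: int) -> float:
    """Share of possible directed links (self-links excluded) that are present."""
    return n_links / (n_nodes * (n_nodes - 1))


def matrix_fill(n_nodes: int, n_links: int) -> float:
    """Share of cells in an n-by-n matrix diagram that contain a link."""
    return n_links / (n_nodes ** 2)


n_nodes, n_links = 50, 400  # the network shown in Figure 3
print(f"density:     {directed_density(n_nodes, n_links):.3f}")
print(f"matrix fill: {matrix_fill(n_nodes, n_links):.3f}")
```

Under these assumptions a node-link diagram has to draw 400 separate lines, while the matrix needs a fixed grid of 2,500 cells regardless of how many of them are filled.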

2.5 Comparing node-link and matrix diagrams

While node-link and matrix diagrams can both be used to represent most if not all data from a social network, they differ greatly in how this is done, making direct comparisons difficult. One inferential approach is to let test subjects use the diagrams, and then see which type of diagram performs best as an information transfer tool.

The research of Ghoniem et al. (2004) into the readability of matrices and node-link diagrams centered on asking participants to complete a series of tasks while viewing a diagram. These tasks required the participant to either identify individual elements, assess the overall counts of a type of element or find a series of connected elements. If these tasks can be completed speedily and correctly by users, then the diagram in question can be taken as very readable.

Ghoniem et al. conclude that the matrix diagram was better suited to represent large or dense networks, though they also stated that node-link diagrams were still superior for more limited networks. Notably, they remark that matrix-style representations seem under-used for this purpose considering their relative simplicity and suitability to certain tasks. They also suggest wider use might diminish the negative effects of unfamiliarity on the readability of the matrices.

Another, more recent work that looks at both node-link and matrix diagrams is Kriglstein et al. (2018). Assuming that the two diagram types complement each other as pointed out by Ghoniem et al. (2004), the authors study the implications arising from that combined use. Specifically, they seek to analyze how users transformed one style of diagram into the other.

Using a set of networks differing from each other in complexity, the authors create pairs of diagrams, each consisting of one matrix and one node-link diagram. In an experimental setting, the diagrams are presented to test subjects. For each diagram, they are asked to draw the corresponding other diagram type based on the diagram they saw.

Interpreting the process of how the participants create them, the authors describe a number of patterns they uncover: the aesthetics of both diagrams become more important as the number of links in the source network increases, but the order of the different links and link-types has little influence on the way the participants process them. They conclude that whereas node-link diagrams are fairly uniform in how participants think they should appear, matrix diagrams are afforded more flexibility in appearance.

In this study, the flexible nature of matrix diagrams is studied by integrating several types of additional information: size of the network (few or many people in the network), number and type of persons in the network, number and type of messages between persons, and their activity in the network. This allows us to compare the two diagram types with respect to the ease with which users can interpret the information they represent. It also provides variety; different combinations of these variables will create different diagrams to use in an experiment.

2.6 Hypotheses

Based on the work of Ghoniem et al. (2004) described above and the research methods described alongside it, we predict that matrix diagrams will outperform node-link diagrams under specific circumstances. To check this prediction, we have created a task involving questions that must be answered using a social network diagram. The network diagrams differ from each other in their composition, while the questions cover different types of actions the participant must take to answer them. The predictions about how these varying circumstances will affect the usability of the diagrams form the basis of our hypotheses below. We then proceed to describe the experimental study used to test these hypotheses.

2.6.1 Main hypothesis

H0

Matrix diagrams and node-link diagrams are equally usable for the representation of social networks involving 50 persons (i.e. large networks) and those involving 20 persons (i.e. small networks).

HA1

Matrix diagrams are more usable than node-link diagrams for the representation of social networks involving 50 persons (i.e. large networks).

HA2

Matrix diagrams are less usable than node-link diagrams for the representation of social networks involving 20 persons (i.e. small networks).

2.6.2 Sub-hypotheses

In order to test the main hypothesis, this study first tests a number of more specific sub-hypotheses. These make use of the following information that is included in the social network represented:

• number of persons in the network: 20 (small networks) or 50 (large networks)

• number of messages between the persons in the network

• activity of the persons in the network, based on the number of their outgoing messages (high versus low activity)

• type of messages between persons in the network (work-related (ask a question versus answer a question) versus non work-related)

The sub-hypotheses are listed below. The names of the hypotheses follow a specific pattern that makes clear the kind of observation the participant may adopt in order to be able to interpret the diagram correctly. The first letter of each name indicates the level of focus for a question: Local or Global. A local level of focus means that participants have to look for specific employees and/or relations between employees in the diagram, whereas at a global level the participants are expected to look at groups of these elements without checking their identities or number.

The second letter indicates the size of the network: Large (50 employees) or Small (20 employees). The number indicates which specific type of question the hypothesis refers to, as indicated in the summary descriptions above. For example: GS2 refers to a question that requires the participant to look at a Global level at a Small network, with the ’2’ referring to questions involving activity categories.
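The naming pattern can be made explicit with a small decoding sketch. The function `parse_hypothesis` and the dictionary names are hypothetical and serve only as an illustration; the mapping of ’1’ to message type and ’2’ to activity follows the hypotheses as they are listed in this section.

```python
# Mappings taken from the naming scheme described in the text.
FOCUS = {"L": "local", "G": "global"}
SIZE = {"L": ("large", 50), "S": ("small", 20)}
QUESTION = {"1": "message type", "2": "activity"}


def parse_hypothesis(name: str) -> dict:
    """Split a sub-hypothesis name such as 'GS2' into its three parts."""
    if len(name) != 3:
        raise ValueError(f"expected a three-character name, got {name!r}")
    focus, size, question = name
    if focus not in FOCUS or size not in SIZE or question not in QUESTION:
        raise ValueError(f"unknown hypothesis name: {name!r}")
    size_label, employees = SIZE[size]
    return {
        "focus": FOCUS[focus],
        "size": size_label,
        "employees": employees,
        "question": QUESTION[question],
    }


print(parse_hypothesis("GS2"))
```

Note that the letter L is used in both positions: as the first character it means Local, as the second it means Large.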

• Local: Participant must look for 1 to 3 employees and the messages they send or receive.

– LL1: If the network is large, participants can establish which message type is sent between two employees better when looking at matrix diagrams, compared to node-link diagrams.

– LL2: If the network is large, participants can answer questions about the activity of individual employees equally well when looking at matrix diagrams and node-link diagrams.

– LS1: If the network is small, participants can identify the type of a message between two employees better when looking at node-link diagrams, compared to matrix diagrams.

– LS2: If the network is small, participants can answer questions about the activity of individual employees equally well when looking at matrix diagrams or node-link diagrams.

• Global: participant must look for and compare majorities and minorities of employee activity and message types across the entire diagram.

– GL1: If the network is large and one of the message types is more prevalent, participants can answer questions about the majority message type better when looking at matrix diagrams, compared to node-link diagrams in the same circumstances.

– GL2: If the network is large, participants can answer questions about majority activity better when looking at matrix diagrams, compared to node-link diagrams.

– GS1: If the network is small and one of the message types is more prevalent, participants can establish which message type is in the majority better when looking at matrix diagrams, compared to node-link diagrams in the same circumstances.

– GS2: If the network is small, participants can answer questions about majority employee activity equally well when looking at matrix diagrams or node-link diagrams.

2.6.3 Hypothesis summary

Table 1: Listed for each hypothesis is which diagram is expected to yield the fastest response times and greatest accuracy for either network size: M = matrix, NL = node-link, Eq = equal performance

Hypothesis  Description                          Large  Small
L_1         Message type between two employees   M      NL
L_2         Activity category of one employee    Eq     Eq
G_1         Majority message type                M      M
G_2         Majority activity category           M      Eq

3 Method

3.1 Experimental design

In order to study and compare the usability of the two diagram types, we use an empirical, participant-based experiment. We expose a group of test subjects to a series of social network diagrams, with their task being to answer one question for each diagram. Aside from diagram type (node-link or matrix), the diagrams also vary in size and in the relative number of questions sent. The questions target the social network represented by the diagram. They differ in what elements the participants must look for in the diagram, and what they must do with the identified elements to answer the question.

The aspects in which the diagrams and questions differ serve as the independent variables, while aspects of the participant’s answers are the dependent variables.

Independent variables

• Network type: matrix or node-link

• Network size: 50 or 20 employees

• Proportion of communication types: the percentage of non-green messages that are colored red (indicating questions) rather than blue (indicating answers); either 20%, 50% or 70%.

• Number of employees targeted by question: either 1, 2 or all

• Topic of the question: whether the question asks the participant to identify the colors of the employee nodes (which indicate their activity), the colors the links between them or both.

Dependent variables

• Answer response time: how fast the question is answered after the diagram is displayed

• Answer correctness: whether the answer to the question was correct

To give the participants a concrete situation to use when interpreting the diagrams, we created a scenario that frames the networks as usage patterns of a Social Business Software communication program. This scenario can be found in Appendix A.

3.2 Participant description

3.2.1 Demographics

Our preferred participant has already encountered social network diagrams at least a few times in her/his career, so that the concept is at least familiar to her/him.

Participants have to meet two main requirements before being allowed to participate: they have to be able to read English at a moderate level or better (in order for them to understand the instructions), and they must be able to distinguish colors normally (in order to use the diagrams effectively). In total, 40 participants generated valid data, 20 of whom identified themselves as female and 20 as male. The data of 3 additional participants were judged invalid due to technical errors in the display of the question screens.

3.2.2 Recruitment

Participants were recruited via messages posted to social media sites, posters placed on bulletin boards around university buildings, as well as via personal interactions among acquaintances and university course groups. They were rewarded with a choice of candy and/or fruit, at an estimated average value between €1,00 and €4,00 per person.

3.3 Materials

3.3.1 Social Networks

Figure 4: Example of a generated social network

The diagrams used for the experiment were based on artificial networks created for this study (an example can be seen in Figure 4). The generated networks vary along three axes:

• The number of employees in the network is either 20 or 50. The employees are represented as circles in the node-link diagram and as squares in the left column and bottom row in the matrix diagram. They are colored according to their activity level, with employees that send out more messages being colored darker.

• The percentage of messages, represented as lines in node-link diagrams and as cells in matrix diagrams, that are non-work related. Three different values are used: 70%, 50% and 20%.

• The percentage of work-related messages that are questions. Again, three different values are used: 70%, 50% and 20%.

For example: 50% of a network’s messages might be non-work related. If the percentage of questions in that network is 50% as well, that means 25% of all the sent messages are questions. The remainder are considered answers.
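The arithmetic in this example can be sketched for every combination of the two percentage variables. The function and variable names below are illustrative assumptions and are not taken from the thesis scripts in Appendix C.

```python
from itertools import product

# The three percentage levels named in the text.
LEVELS = (0.20, 0.50, 0.70)

def message_shares(p_non_work, p_question):
    """Return the share of ALL messages that falls in each category.

    p_non_work: fraction of all messages that are non-work related
    p_question: fraction of the work-related messages that are questions
    """
    work = 1.0 - p_non_work
    return {
        "non-work": p_non_work,
        "question": work * p_question,
        "answer": work * (1.0 - p_question),
    }

if __name__ == "__main__":
    for p_non_work, p_question in product(LEVELS, LEVELS):
        shares = message_shares(p_non_work, p_question)
        # On a tie, max() returns the first category that reaches the maximum.
        largest = max(shares, key=shares.get)
        summary = ", ".join(f"{k}: {v:.0%}" for k, v in shares.items())
        print(f"non-work={p_non_work:.0%} questions={p_question:.0%} -> "
              f"{summary} (largest: {largest})")
```

With 50% non-work messages and 50% questions this gives 25% questions, matching the example above.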

The different combinations of these three variables result in 36 different networks, 30 of which have been used in our experiment.

A custom script has been written in Python using the NetworkX package (Hagberg et al., 2008), and a directed configuration model has been used to generate an edge list. These scripts may be found in Appendix C. In order to simulate a network with varying communication activity among the employees in that network, half of the employees are designated as knowledge holders: these are more likely to send messages. The remaining employees are designated knowledge seekers. Once every employee has been assigned one such role, the required number of links is divided among the employees according to their role by generating an in-degree and out-degree count for each employee.

As the networks represent communication patterns, all messages are directed; each message has a source and a target employee. In order to enable the use of matrix diagrams, parallel messages in the same direction are removed. Self-loops (connections from an employee to her/himself) are also removed as they hold no practical meaning for the stated experiment. As a final step, the messages are assigned different meanings: the three categories of messages are non-work related, questions and answers. Each message can only belong to one of these categories. Which category of messages is in the majority varies according to the two message percentage variables that were mentioned earlier.
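The generation steps described above can be sketched as follows. This is a minimal illustration and not the script from Appendix C: it uses only the Python standard library to show the stub-matching idea behind a directed configuration model (NetworkX offers `directed_configuration_model` for the same purpose). The weighting `holder_weight`, the function name and the random per-message category assignment are assumptions made for this sketch; with random assignment the category shares are only approximately equal to the requested percentages.

```python
import random
from collections import Counter

def generate_messages(n_employees, n_messages, p_non_work, p_question,
                      holder_weight=3.0, seed=0):
    """Return {(sender, receiver): category} for one artificial network."""
    rng = random.Random(seed)
    employees = list(range(n_employees))

    # Assumption: the first half are knowledge holders, who are
    # holder_weight times more likely to send a message than seekers.
    half = n_employees // 2
    send_weights = [holder_weight] * half + [1.0] * (n_employees - half)

    # Degree sequences: how many messages each employee sends and receives.
    out_degree = Counter(rng.choices(employees, weights=send_weights, k=n_messages))
    in_degree = Counter(rng.choices(employees, k=n_messages))

    # Configuration model: one stub per unit of degree, paired at random.
    out_stubs = [e for e in employees for _ in range(out_degree[e])]
    in_stubs = [e for e in employees for _ in range(in_degree[e])]
    rng.shuffle(in_stubs)

    # The set drops parallel messages; the condition drops self-loops.
    edges = sorted({(s, t) for s, t in zip(out_stubs, in_stubs) if s != t})

    # Every remaining message gets exactly one category.
    messages = {}
    for edge in edges:
        if rng.random() < p_non_work:
            messages[edge] = "non-work"
        elif rng.random() < p_question:
            messages[edge] = "question"
        else:
            messages[edge] = "answer"
    return messages

if __name__ == "__main__":
    network = generate_messages(20, 60, p_non_work=0.5, p_question=0.5)
    print(len(network), "messages;", Counter(network.values()))
```

Because duplicates and self-loops are removed after pairing, the resulting network can contain fewer messages than requested.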

3.3.2 Diagrams

The social network information (our application domain), with the variations generated using the method described above, is converted into diagrams (our graphical domain) using third-party software.

Gephi is used to create traditional node-link diagrams (Bastian et al., 2009), with the Yifan Hu Proportional layout algorithm used to arrange the employees into ordered clusters with a semi-random layout. These diagrams follow the classic node-link layout: employees are represented by nodes, whereas messages between them are depicted by links between nodes. The direction of messages is denoted by arrowheads on the lines pointing towards the message receiver.

Matrix diagrams are created in Microsoft Excel. The matrix diagrams depict the same information that is represented in the node-link diagram, though in a different format: the employees are represented by the squares in the left-most column and the bottom row. A square at the intersection of two employees denotes a message sent between them, with the employee at the left being the message sender. Examples of both diagram types can be seen in Figure 5.
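To make the matrix encoding concrete, the following sketch converts a small, made-up edge list into a grid in which each row is a sender and each column a receiver. The thesis diagrams were produced in Excel; this text rendering, the three example messages and all names in the code are assumptions for illustration only. The colors follow the scheme described in this section (green for non-work messages, red for questions, blue for answers).

```python
# Color per message category, as described in the text.
COLORS = {"non-work": "green", "question": "red", "answer": "blue"}

def to_matrix(n_employees, messages):
    """Build a sender-by-receiver grid; cells without a message stay None."""
    matrix = [[None] * n_employees for _ in range(n_employees)]
    for (sender, receiver), category in messages.items():
        # One cell per ordered pair, which is why parallel messages
        # in the same direction cannot be shown.
        matrix[sender][receiver] = COLORS[category]
    return matrix

def render(matrix):
    """Render the grid as text: first letter of the color, '.' if empty."""
    return "\n".join(
        " ".join(cell[0].upper() if cell else "." for cell in row)
        for row in matrix
    )

if __name__ == "__main__":
    # Hypothetical network of three employees.
    example = {(0, 1): "question", (1, 0): "answer", (2, 0): "non-work"}
    print(render(to_matrix(3, example)))
```

Reading a row from left to right shows everything one employee has sent; reading a column shows everything one employee has received.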

(a) node-link diagram (b) matrix diagram

Figure 5: The two diagram types. Note that both diagrams are based on the same network data in the application domain.

The coloring scheme of the diagrams is the same for both diagram types. The employees are colored salmon or orange based on their activity level: the amount of messages an employee has sent. The color of the messages is determined by the message type they represent: Non-work related messages are green, whereas questions and answers are red and blue, respectively.

One important difference between our images and the diagrams as used by Ghoniem et al. (2004) pertains to interactivity; whereas the latter used images that respond to the participant’s mouse position by highlighting relevant nodes and links, our images are static. While this may result in decreased usability and clarity of the diagrams at a local level, we believe it also constitutes a much more realistic scenario; aside from the impossibility of translating such interactivity to a paper medium, most images present on both web-pages and electronically transmitted documents are static.

3.3.3 Tools

The experiment has been executed in the Qualtrics (2018) online survey tool, an online platform that can be accessed through any web browser.

3.4 Questions

Whereas Ghoniem et al. (2004) asked their participants to fulfill 7 tasks for each diagram they presented, we pose one question for each diagram. Each question only appears with one diagram; to ensure the questions are not too difficult, each question was formulated with a specific diagram in mind.

3.4.1 Aspects targeted by questions

The questions are ordered in categories according to two criteria: the level at which the participant must investigate the graph, and the nature of the elements they must study to answer the questions:

• Level

– Local focus: the characteristics of 1-3 elements must be studied, which means the participant must ’zoom in’ on a specific part of a diagram.

– Global focus: the characteristics of many or all elements must be studied, which means the participant must ’scan’ the diagram and form an overview of the whole image.

• Element

– Node color: the activity of the nodes

– Link color: the type of the messages

– Combination: both node and link colors must be studied

Each pair of node-link and matrix diagram is associated with one question, in order to study the effect the diagram type has on the participants’ ability to answer it. Each question is thus posed twice.

3.4.2 Question design

Each question screen in the survey (see Figure 6 for an example) consists of the following elements: a diagram, the appropriate legend to remind participants of how the representations work, the question with answer buttons and a NEXT button. The latter lets the participant proceed to the next page with a new diagram and question, once they have answered the current question. All questions use a Yes/No answer format, with the addition of an ’I do not know’ option. While either Yes or No is the correct answer for every question, we think omitting this third option would encourage guessing the answer for questions that require more work. Furthermore, the use of this third option provides an additional measure for difficulty.


Figure 6: A question as posed in our experiment. Note the addition of a legend explaining the meaning of each element

Each question is accompanied by an invisible (to the participant) element that produces several timing-related measures: time elapsed before first mouseclick, time elapsed before the last mouseclick, and time elapsed before the page is submitted. We consider the last measure to be the most significant, since it records the time the participant required to produce an answer which in turn may indicate how difficult they found the question.

3.5 Experiment procedure

3.5.1 Experiment conditions

Participants take part in the experiment in a controlled environment with no other activities occurring in the same room. The room varied in size for technical reasons, but the set-up was always identical to the one displayed in Figure 7.


Figure 7: Typical experiment set-up.

3.5.2 Procedure

Participants are welcomed into the room by the experimenter, who takes them through the procedure using a script:

Experimenter procedure

1. The participant signs a consent form.

2. The participant is asked whether they can read English well enough, and whether they have any eyesight deficiencies.

3. A brief overview is given of the different phases of their participation and what their task will be.

4. The participant is advised to turn off or silence any devices that may interrupt them.

5. The participant is asked whether they have any questions, and advised that they may ask questions until the main phase of the experiment begins, after which they are expected to remain quiet.

6. The participant is then asked to start the experiment.

7. After the experiment is completed, the participant undergoes a short scripted interview with the experimenter to document their experience.

The experimenter remains present in the same room during the whole procedure to ensure that distractions, both from outside sources and caused by the participant, are minimized. The task itself is detailed below:

Participants procedure

1. The participant’s demographic information is retrieved using a short form.

2. The scenario of the experiment is explained through text.


3. Instructions explaining how the diagrams are to be used in the task and what their components represent are given. Each diagram type is explained separately, along with a legend that displays the meaning of the diagram components.

4. Each participant is then given four practice questions, selected to be relatively straightforward, with the diagrams chosen to represent the spectrum of diagrams available. If the participant answers any of the practice questions incorrectly, they are given feedback on their error and instructed to choose the correct answer; the participant cannot proceed unless the correct answer is selected.

5. Once all practice questions have been passed, the main experiment begins; the participant is presented with all 30 questions, in a randomized sequence.

6. After all questions have been answered the experiment ends with a message thanking the participant, as well as an opportunity for them to leave their e-mail address should they want to be informed about their score and the results of the study.

3.6 Lessons from pilot

A pilot study conducted before the main study inspired several points of improvement:

• Some questions were considered too difficult or time consuming, leading us to shorten the experiment by removing some of the more difficult questions.

• Participants requested a more detailed explanation before and after the experiment.

• The matrix diagram was re-designed to be more intuitively readable by moving the node row to the bottom of the diagram.

• The images were re-sized so that elements appeared to have the same size in both the larger and smaller network variants.


4 Results

In total, 43 participants completed the tasks. The data of three participants were removed during data-gathering because of technical difficulties in the software used to display the questionnaire. All results from the remaining 40 participants were used for the descriptive and inferential statistics in this chapter.

4.1 Participant Demographics

Leaving out the results from the rejected participants, the data set includes the results of 40 participants, 20 of which identified as female and 20 as male. When asked to list their occupation 7 participants described themselves as employed whereas 12 were Bachelor students, the remaining 21 being Master students. As can be seen in Tables 2 and 3, most participants are 30 or younger, with a mean age of 25.

           Min  Max  Mean    Std. Dev.
Age (yrs)  18   61   25.426  7.5

Table 2: Statistics for participant age

Age group  Female  Male
<= 20      3       1
21 - 30    16      15
31 - 40    1       2
40 - 50    0       1
50 - 60    0       0
60 - 70    0       1
Total      20      20

Table 3: Gender distribution per age group

4.2 Descriptive statistics

4.2.1 Overview of measures and performance

In this section we provide an overview of the distribution of the performance data across the entire data-set. The impact each variable had on the performance of the participants, individually and in combination with others, is discussed further below.

Note that these data concern the 30 questions of the main phase only: performance for the instruction and practice segments of the task is not included. There are four performance measures:

• Response time: the time between the display of a question and the participant’s answer, measured in seconds.

– Total Response time: the response times of all participants for one question, summed and expressed in minutes.

• Correctness: Correctness of the answer to one question, either 1 (Correct) or 0 (incorrect).

– Error rate: the error count per question.

Both Error Rate and Total Response Time are summed derivatives of Correctness and Response time respectively. This means that the 40 answers (one from each participant) given to each question are collapsed into a single value: 1200 measures become 30. The Error rate can be 40 at most when measured across questions, since that is the number of participants that have answered each question.

Since the sample sizes for Error Rate and Total Response Time are much smaller compared to Response Time and Correctness, they were not included in the statistical analysis.

Table 4 lists distribution data for these measures when summed across all questions. Similar information summed across participants can be found in Appendix D.

For each question there are three possible answers: ’Yes’, ’No’ and ’I don’t know the answer’. For our data-set, any answer but the correct one was considered incorrect, with the error rate thus reflecting two different types of answer. The ’I don’t know’ option was used 66 times across the data-set, with questions L.2.2_N and L.5.7_N causing the majority of these answers with 20 and 15 total errors respectively.
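The scoring rule and the collapse from individual answers to per-question totals can be sketched as follows. The record layout, the question identifiers and the function names are assumptions for illustration; they do not describe the format of the actual Qualtrics export.

```python
from collections import defaultdict

DONT_KNOW = "I don't know"

def score(answer, correct_answer):
    """Correctness: 1 if correct, otherwise 0 (including 'I don't know')."""
    return 1 if answer == correct_answer else 0

def summarise(responses, answer_key):
    """Collapse individual responses into per-question totals.

    responses: iterable of (participant, question, answer, seconds)
    answer_key: {question: "Yes" or "No"}
    Returns {question: (total_response_minutes, error_count, dont_know_count)}
    """
    totals = defaultdict(lambda: [0.0, 0, 0])
    for _participant, question, answer, seconds in responses:
        entry = totals[question]
        entry[0] += seconds / 60.0                       # Total Response Time
        entry[1] += 1 - score(answer, answer_key[question])  # Error Rate
        entry[2] += int(answer == DONT_KNOW)
    return {question: tuple(values) for question, values in totals.items()}

if __name__ == "__main__":
    # Hypothetical answers of three participants to one question.
    key = {"Q1": "Yes"}
    data = [
        (1, "Q1", "Yes", 30.0),
        (2, "Q1", "No", 90.0),
        (3, "Q1", DONT_KNOW, 60.0),
    ]
    print(summarise(data, key))
```

In this made-up example the question has an error count of 2, one of which is an ’I don’t know’ answer.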

Measure                     Min  Max   Mean  Std. Dev.  Nq
Total Response Time (mins)  6.3  43.9  19.6  9.3        30
Error Rate                  0    27    5.8   6.46       30

Table 4: Statistics for participant performance across questions

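Participants take an average of 15 minutes to complete the main phase of the experiment. Each question is answered incorrectly by 5.8 of the 40 participants on average, which corresponds to roughly 4.4 mistakes per participant across the 30 questions (5.8 x 30 / 40). This distribution is illustrated in Figure 8.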


Figure 8: Distribution of summed performance per question

4.2.2 Diagram type

The main variable whose influence we investigated is the type of diagram used: either node-link diagram or matrix. As seen in Table 5, the participants need roughly half a minute on average to answer a question. Questions involving matrix diagrams seem to be answered around 5 seconds faster on average, and the lower standard deviation indicates that response times are more consistent as well. The accuracy of the answers is slightly higher too.

Measure            Diagram Type  Min   Max     Mean   Std. Dev.  Np∗q
Response time (s)  Matrix        3.50  136.29  26.83  19.88      40x15
                   Node-link     3.12  585.89  31.93  32.76      40x15
Correctness        Matrix        0     1       0.88   0.33       40x15
                   Node-link     0     1       0.83   0.38       40x15

Table 5: Performance for the two types of diagram

Note that the sample size indicates how many data points each category consists of: in Table 5, the sample size for matrix diagrams is 40 participants x 15 questions = 600.


4.2.3 Network aspects

Apart from the type of diagram, there is one other variable that determines the appearance of the diagrams: size. It should be noted here that not all combinations of variables have the same number of corresponding questions; the sample size for each group of measurements is listed under N in Table 6.

Size

Diagram size can be either 20 nodes (small) or 50 nodes (large). The performance differences between the two sizes can be seen in Table 6: size seems to influence correctness more than response time, and it does so in line with expectations: questions involving smaller networks seem to be answered more accurately and somewhat more quickly.

Measure            Network size  Min    Max     Mean   Std. Dev.  N
Response Time (s)  Small         4.18   161.77  28.46  20.87      40x18
                   Large         3.116  585.89  30.76  34.58      40x12
Correctness        Small         0      1       0.88   0.32       40x18
                   Large         0      1       0.81   0.39       40x12

Table 6: Performance for the network size

Proportion of communication types

There are three types of communication present in the social networks:

• Non-work related

• Questions (work-related)

• Answers (work-related)

The Proportion of communication types (pQuestion) indicates what percentage of the work-related messages are questions rather than answers. This percentage can be either 20, 50 or 70 %. To illustrate: if pQuestion is 50%, that means there are just as many questions as answers in the network.

The ratio work/non-work messages varies with the same 20/50/70 % proportions, resulting in various combinations of the three message types. For example, if pQuestion is 70 % and the percentage of non-work messages is 20%, that means most messages in the network are questions.

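As seen in Table 7, the proportion of communication types affects correctness in a roughly linear manner: larger concentrations of question messages in the network increase mean correctness. Response times do not follow the same pattern: they drop sharply between 20% and 50% questions, but are nearly identical for 50% and 70%.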


Measure            Prop. Comm. Type  Min   Max     Mean   Std. Dev.  Np∗q
Response Time (s)  20 %              4.24  585.89  35.53  37.37      40x10
                   50 %              4.18  137.5   25.94  17.23      40x10
                   70 %              3.12  161.77  26.67  21.77      40x10
Correctness        20 %              0     1       0.81   0.39       40x10
                   50 %              0     1       0.86   0.35       40x10
                   70 %              0     1       0.90   0.31       40x10

Table 7: Performance for the proportion of communication types

4.2.4 Question aspects

Two variables affect the questions posed to participants.

Target employees

Target employees refers to the number of employees the participant needs to look at to answer the question. This may be either 1, 2 or all employees, the latter referring to questions that ask the participant to look at all employees depicted. The employees are represented as nodes in the node-link diagram, and as 2 square cells in the left-most column and the bottom row of the matrix diagram. The results for this variable are shown in Table 8. N target employees seems to influence performance, with longer response times and lower correctness when more employees need to be looked at. In particular, questions involving just one employee are answered far faster and more accurately. Another notable difference is the one in standard deviation between ’all’ and the other two values: having to look at all employees seems to make response time much less consistent, even though correctness is not affected in the same way.

Measure            N target employees  Min   Max     Mean   Std. Dev.  N
Response Time (s)  1                   3.12  79.62   17.97  12.02      40x6
                   2                   4.24  137.5   29.92  19.62      40x10
                   All                 4.18  585.89  33.88  34.28      40x14
Correctness        1                   0     1       0.96   0.20       40x6
                   2                   0     1       0.85   0.36       40x10
                   All                 0     1       0.82   0.39       40x14

Table 8: Performance for number of target employees

Question topic

The topic of each question can be either activity category of the employee (the color of the nodes / left and bottom squares), message type (the color of the link) or both in one question. The influence of Question topic is ambiguous; while activity category questions are answered faster on average, the difference in accuracy is less pronounced.


Measure            Topic              Min   Max     Mean   Std. Dev.  Np∗q
Response Time (s)  Activity category  3.12  585.89  26.95  33.44      40x14
                   Message type       4.20  135.62  30.38  18.66      40x12
                   Both types         9.92  161.77  34.88  23.37      40x4
Correctness        Activity category  0     1       0.84   0.37       40x14
                   Message type       0     1       0.86   0.35       40x12
                   Both types         0     1       0.89   0.31       40x4

Table 9: Performance for question topic

4.2.5 Interaction

The interaction between the different variables and Network Type is a possible explaining factor for the performances measured. For example: since questions involving small diagrams and matrix diagrams individually result in the best performance, we expect small matrix diagrams to lead to the best performance of the 4 possible combinations. If this is not the case, there may be an interaction effect we should check for. Tables 10 to 13 display performance statistics for the interactions between Network Type and other Independent Variables involved in the experiment: vs. Size in Table 10, vs. pQuestion in Table 11, vs. N target employees in Table 12 and vs. question Topic in Table 13.

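In Table 10, mean Response Time increases by a similar amount for both diagram types when moving from Small to Large networks (2.35 s for matrix diagrams, 2.25 s for node-link diagrams). There is also a corresponding decrease in Correctness in these circumstances (0.07 and 0.08 respectively).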

Measure            Type       Size   Min   Max     Mean   Std. Dev.  Np∗q
Response Time (s)  Matrix     Large  3.50  136.29  28.24  23.12      40x6
                              Small  4.18  124.90  25.89  17.37      40x9
                   Node-link  Large  3.12  585.89  33.28  43.01      40x6
                              Small  4.92  161.77  31.03  23.61      40x9
Correctness        Matrix     Large  0     1       0.84   0.37       40x6
                              Small  0     1       0.91   0.29       40x9
                   Node-link  Large  0     1       0.78   0.41       40x6
                              Small  0     1       0.86   0.35       40x9

Table 10: Performance for the interaction between network type and size

As shown in Table 11, the influence of pQuestion on Response time is not linear for matrix diagrams: 50% questions leads to the fastest response here. The influence on Correctness is more like the individual effect of pQuestion: Correctness increases as the percentage of questions increases.


Measure            Type       pQuestion  Min   Max     Mean   Std. Dev.  Np∗q
Response Time (s)  Matrix     20 %       4.24  136.29  34.41  24.53      40x5
                              50 %       4.18  72.57   21.18  10.63      40x5
                              70 %       3.50  135.62  24.90  19.56      40x5
                   Node-link  20 %       4.92  585.89  36.65  46.86      40x5
                              50 %       6.69  137.5   30.71  20.90      40x5
                              70 %       3.12  161.77  28.43  23.70      40x5
Correctness        Matrix     20 %       0     1       0.85   0.36       40x5
                              50 %       0     1       0.87   0.34       40x5
                              70 %       0     1       0.93   0.26       40x5
                   Node-link  20 %       0     1       0.78   0.42       40x5
                              50 %       0     1       0.85   0.36       40x5
                              70 %       0     1       0.87   0.34       40x5

Table 11: Performance for the interaction between network type and proportion of communication types

The interaction of N Target Employees with Network Type is summarized in Table 12. For both Response Time and Correctness the difference between 2 and All is much smaller for node-link diagrams than for matrix diagrams, whereas the difference between 1 and 2/All is more like the linear individual effect of N target employees for both network types.

Measure            Type       N target employees  Min   Max     Mean   Std. Dev.  Np∗q
Response Time (s)  Matrix     1                   3.50  58.60   15.46  8.97       40x3
                              2                   4.24  79.44   24.97  13.25      40x5
                              All                 4.18  136.29  33.04  24.35      40x7
                   Node-link  1                   3.12  79.63   20.49  14.03      40x3
                              2                   4.92  137.5   34.87  23.38      40x5
                              All                 4.20  585.89  34.73  41.95      40x7
Correctness        Matrix     1                   0     1       0.97   0.18       40x3
                              2                   0     1       0.89   0.31       40x5
                              All                 0     1       0.84   0.37       40x7
                   Node-link  1                   0     1       0.95   0.22       40x3
                              2                   0     1       0.8    0.40       40x5
                              All                 0     1       0.8    0.40       40x7

Table 12: Performance for the interaction between network type and N target employees

Table 13 shows the effect of the question topic on performance differences between matrix and node-link diagrams. Performance on Message type identification seems steady between the two types, whereas Activity category identification is done faster and with higher correctness on average for matrix diagrams. One can also see that response times for Activity category identification are much less consistent for node-link diagrams.


Measure            Type       Question Topic  Min    Max     Mean   Std. Dev.  Np∗q
Response Time (s)  Matrix     Activity cat.   3.50   136.29  23.45  19.79      40x7
                              Message type    5.74   135.62  30.30  20.60      40x6
                              Both types      9.92   83.3    28.27  15.89      40x2
                   Node-link  Activity cat.   3.12   585.89  30.45  42.72      40x7
                              Message type    4.20   86.86   30.47  16.55      40x6
                              Both types      10.40  161.77  41.49  27.54      40x2
Correctness        Matrix     Activity cat.   0      1       0.89   0.31       40x7
                              Message type    0      1       0.85   0.35       40x6
                              Both types      0      1       0.91   0.28       40x2
                   Node-link  Activity cat.   0      1       0.79   0.41       40x7
                              Message type    0      1       0.86   0.35       40x6
                              Both types      0      1       0.88   0.33       40x2

Table 13: Performance for the interaction between network type and question topic

4.3 Inferential Statistics

4.3.1 Single Independent Variable

The summaries below list the results of statistical tests on the effect of single Independent Variables on the performance measures. The goal is to statistically test the independent effect of the variables explored in the descriptive analyses. For example, the effect of diagram type on Response time (see Table 14) may be considered significant: the p-value for this listing indicates how likely it is that a difference in performance at least this large between the two diagram types would arise by pure chance. In this case it is just 0.00116, and since this is well below the usual alpha limit of 0.05 we can view this difference as significant. The same conclusion may be drawn for correctness given the corresponding listing in Table 15.

Response Time: ANOVA

Table 14 shows significance test results for the response times. The Analysis Of Variance test has been used because of the nature of the measure (continuous) and the variables (categorical).

All variables seem to have a significant impact on the response time, except for Size. The latter has a p-value well above 0.05, indicating that it’s less certain that this influence was not caused by chance.

Independent Variable  N1 / N2 / N3           dfDV  dfIV  f-Value  p-value
Type                  40x15 / 40x15          1198  1     10.61    0.00116
Size                  40x12 / 40x18          1198  1     2.063    0.151
N target employees    40x6 / 40x10 / 40x14   1197  2     30.24    1.54e-13
Topic                 40x14 / 40x4 / 40x12   1197  2     5.877    0.00288

Table 14: ANOVA test summaries for influence of individual Independent Variables on Response times
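As a reminder of where the f-Value and the degrees of freedom in such a summary come from, the following minimal sketch computes a one-way ANOVA by hand. It is not the analysis script used for the thesis; the function name and the toy data are assumptions. For 1200 observations, two groups give 1 and 1198 degrees of freedom and three groups give 2 and 1197, the values that appear in Table 14.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

if __name__ == "__main__":
    # Toy response times in seconds for two hypothetical conditions.
    print(one_way_anova([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]))
```

Turning the F value into a p-value additionally requires the F distribution, which statistical packages provide; SciPy's `scipy.stats.f_oneway`, for instance, returns both values at once.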


Correctness: Wilcoxon

Tables 15 and 16 show accuracy results for the four variables, with the influence of Size being significant in this case. The impact of Question topic is tested as not being significant however, with a p-value well above 0.05.

Independent Variable  N1 / N2        Test statistic  p-value    Effect Size
Type                  40x15 / 40x15  V = 4270.5      0.005372   0.08
Size                  40x12 / 40x18  W = 160560      0.0006438  0.10

Table 15: Wilcoxon signed rank test summaries for influence of individual Independent Variables on Correctness

Independent Variable  N1 / N2 / N3           chi-squared  df  p-value    Effect Size
N target employees    40x6 / 40x10 / 40x14   27.202       2   1.239e-06  0.14
Topic                 40x14 / 40x4 / 40x12   2.8549       2   0.2399     0.03

Table 16: Kruskal-Wallis rank sum test summaries for influence of individual Independent Variables on Correctness
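For a binary measure such as Correctness, a rank-based comparison of two independent groups reduces to counting pairs. The sketch below computes the rank-sum (Mann-Whitney) statistic directly from that definition. The per-group counts of correct answers are not reported in the thesis: 390 of 480 for Large and 636 of 720 for Small are assumptions chosen to agree with the rounded means in Table 6 (0.81 and 0.88). With these assumed counts the statistic works out to 160560, the value listed for Size in Table 15.

```python
def mann_whitney_u(x, y):
    """Rank-sum statistic of sample x against sample y.

    Counts the pairs (a, b) with a > b; tied pairs count as one half.
    """
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u

if __name__ == "__main__":
    # Assumed counts, consistent with the rounded means in Table 6.
    large = [1] * 390 + [0] * 90    # 480 answers on Large networks
    small = [1] * 636 + [0] * 84    # 720 answers on Small networks
    print(mann_whitney_u(large, small))
```

Because most pairs are ties when the data are binary, the statistic stays close to half the number of pairs (480 x 720 / 2 = 172800), which is one reason the reported effect sizes are small even when the p-value is low.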

4.3.2 Interaction between independent variables

Having tested the effects of several individual variables on performance, the next step is to test any interaction effects that may exist between variables that had a significant impact on the performance measures.

Response Time: Factorial ANOVA

IV           Sample Sizes                dfDV  dfIV  f-Value  Pr(>F)
Type         40x15 / 40x15               1194  1     10.61    0.0012
Size         40x12 / 40x18               1194  1     2.08     0.15
Type : Size  40x6 / 40x9 / 40x6 / 40x9   1194  1     0.0009   0.975862

Table 17: Factorial ANOVA test summaries for the influence of the diagram type - size interaction on Response time.
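The very small F value for the Type : Size interaction can be made intuitive with the cell means from Table 10. The sketch below computes the effect of Size on mean response time separately for each diagram type and then the difference between those two effects; an interaction would show up as a clearly non-zero difference. This is an illustration of the idea using the reported means only, not a replacement for the factorial ANOVA, and the names in the code are assumptions.

```python
# Mean response times in seconds, copied from Table 10.
MEAN_RT = {
    ("Matrix", "Large"): 28.24,
    ("Matrix", "Small"): 25.89,
    ("Node-link", "Large"): 33.28,
    ("Node-link", "Small"): 31.03,
}

def size_effect(diagram_type):
    """Increase in mean response time from Small to Large networks."""
    return MEAN_RT[(diagram_type, "Large")] - MEAN_RT[(diagram_type, "Small")]

def interaction_contrast():
    """Difference between the Size effects of the two diagram types."""
    return size_effect("Matrix") - size_effect("Node-link")

if __name__ == "__main__":
    print(round(size_effect("Matrix"), 2))      # 2.35
    print(round(size_effect("Node-link"), 2))   # 2.25
    print(round(interaction_contrast(), 2))     # 0.1
```

The two Size effects differ by only about a tenth of a second, which is tiny compared to the standard deviations of roughly 17 to 43 seconds in the same table.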

A notable result of the Interaction analysis can be seen in Table 18: while both diagram type and the number of Employees in the question have a significant effect on response time, the interaction between them does not. This indicates that the impact of one of these variables is not dependent on the other. In other words, the number of employees the participant has to look for affects the response time just as much when looking at a matrix diagram as when looking at a node-link diagram.


IV            Sample Sizes                               dfDV  dfIV  f-Value  Pr(>F)
Type          40x15 / 40x15                              1194  1     11.16    0.0008608
N Emp         40x6 / 40x10 / 40x14                       1194  2     30.59    1.108e-13
Type : N Emp  40x3 / 40x5 / 40x7 / 40x3 / 40x5 / 40x7    1194  2     2.82     0.0601812

Table 18: Factorial ANOVA test summaries for the influence of the diagram type - N target employees interaction on Response time

Correctness: Factorial logistic regression

Tables 19 and 20 list the results of tests for the interaction effect on correctness. As with response time, the interaction effect is tested between Type and each of two other variables individually. Table 19 shows a p-value above 0.05 for the interaction between the two variables. This means there is no significant interaction effect between Size and Type that affects correctness. The same is true for N target employees and Type, as seen in Table 20.

IV                           Sample Sizes  Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)                                0.84      0.02        37.25    <2e-16
Type-Node-link               40x15         -0.06     0.03        -1.83    0.0682
Size-Small                   40x18         0.06      0.03        2.19     0.0287
Type-Node-link : Size-Small  40x9          0.01      0.04        0.34     0.7364

Table 19: Factorial logistic regression summaries for the influence of the diagram size - type interaction on the correctness.

IV                         Sample Sizes  Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)                              0.97      0.03        30.45    < 2e-16
Type-Node-link             40x15         -0.02     0.04        -0.37    0.710522
nEmp-2                     40x10         -0.08     0.04        -1.91    0.056464
nEmp-All                   40x14         -0.13     0.04        -3.45    0.000577
type-Node-link : nEmp-2    40x5          -0.07     0.06        -1.29    0.19682
type-Node-link : nEmp-All  40x7          -0.02     0.05        -0.36    0.722670

Table 20: Factorial logistic regression summaries for the influence of the diagram type - N target employees interaction on the Correctness.

Summary

When looking at the interaction analysis, it seems that none of the interaction effects tested for had a significant impact on the participants’ performance. This indicates that the variables influence the performance of the participants independently, without this intrinsic influence being affected by the Type of diagram.

Table 21 summarizes what the statistical analysis says about the impact of several variables on performance.


Variable                           Value              Best Resp. Time  Best Accuracy
Size                               Small              Matrix           Matrix*
                                   Large              Matrix           Matrix*
Proportion of communication types  20 %               Matrix           Matrix
                                   50 %               Matrix           Matrix
                                   70 %               Matrix           Matrix
N target employees                 1                  Matrix*          Matrix
                                   2                  Matrix*          Matrix
                                   All                Matrix*          Matrix
Question topic                     Activity category  Matrix*          Matrix*
                                   Message Type       Equal*           Equal*
                                   Both Types         Matrix*          Matrix*

Table 21: Summary of the performance of matrix diagrams vs. node-link


5 Discussion

5.1 Conclusions

Our main hypothesis is that matrix diagrams have greater usability than node-link diagrams when representing large networks. This has been confirmed in a more comprehensive sense than we anticipated, as matrix diagrams were more usable in all conditions. Table 22 summarizes the status of our sub-hypotheses following our findings. The high performance induced by matrix diagrams likely overshadows the expected reverse case of node-link diagrams performing better with small networks; we will give a possible explanation for this effect below.

Sub-Hypothesis  Description                                    Predicted best type  Test result
LL1             Large Network, link color                      Matrix               Confirmed
LL2             Large Network, node color                      Equal                Rejected
LS1             Small Network, link color                      Node-link            Rejected
LS2             Small Network, node color                      Equal                Rejected
GL1             Large Network, Majority link color (High P)    Matrix               Confirmed
GL2             Large Network, Majority node color             Matrix               Confirmed
GS1             Small Network, Majority link color (High P)    Matrix               Confirmed
GS2             Small Network, Majority node color             Matrix               Confirmed

Table 22: For each sub-hypothesis, whether the statistically tested results matched the prediction.

The influence of network type is supported statistically: Response Time was significantly shorter for questions involving matrix diagrams, and correctness was higher.

Correctness was also affected by Network size: questions about Small networks were answered correctly more often, whereas Size did not affect response time in the same way.

The strongest indicator of question difficulty seems to have been N target employees: response time differed significantly under the influence of this variable. Looking at the descriptive analysis, this seems to be explained mostly by the difference between 1 employee and the other two options. This is not difficult to explain, as questions involving 1 employee usually did not require much searching.

5.1.1 Interaction effects

The impact of the type of diagram was demonstrated in our experiment, but whether this effect depends on the size of the network represented is less clear. As a result, while the experiment does provide evidence for a positive impact of matrix-type diagrams on usability, it does not provide a clear indication of how this effect interacts with other aspects of difficulty.

The lack of significant interaction effects present in the results means that the impact of variables is fairly straightforward.


Put simply, all independent variables have the same effect on performance regardless of network size or type: while network Type and Size both affect performance, this effect was not significantly different for any particular combination of the two. Because Small networks lead to better performance than Large networks, and matrix-type diagrams are similarly better than node-link diagrams, the best combination of the two is the Small matrix diagram.

A possible complication for the checking of interaction effects was the unequal group sizes of all independent variables except Network Type. For example, there were more than twice as many questions that target ‘All’ employees compared to those that target just one. Although the prerequisites were checked and passed for all significant statistical results listed, the difference in sample size may still have had a confounding effect.

In particular, it should be noted that the influence of size on the impact of the other variables cannot be completely ruled out for these reasons.

5.1.2 Surprising results

The generally superior performance by participants on questions using matrix diagrams is surprising. Following the conclusions of Ghoniem et al. (2004), it was expected that node-link diagrams would result in better performance for small networks. Most notably, identifying the type of message sent between two employees was thought to be much easier for node-link diagrams, since that does not involve looking up a cross-section in a matrix.

The Cognitive Dimensions approach discussed by Green and Petre (1996) offers a possible explanation for this unexpected under-performance of node-link diagrams. Node-link diagrams have a higher Closeness of mapping than matrix diagrams, because their shape more closely matches the real situation, whereas matrix diagrams only resemble Excel sheets. On the other hand, matrix diagrams maintain a higher Visibility at all times: given their ability to represent persons and relationships without overlap, a better performance with large networks was expected.

That this is also true for small networks was not foreseen. With a node-link diagram, participants do not need to perform an ‘intersection look-up’ as in a matrix; with fewer mental operations to carry out (Viscosity) and being less Error-prone, node-link diagrams were expected to let participants perform better.

A possible explanation for this effect is that the identification of the employees, rather than the messages they send, is a better predictor for performance: whereas the position of the numbered nodes in a node-link diagram varies, their position is ordered in matrices. In Cognitive Dimensions terms, Consistency may be more important than Closeness of mapping in our task.

This is supported by the results as well: the only variable value for which matrix diagrams did not lead to better performance than node-link diagrams was the Question Topic ‘message type’, i.e. the color of the links. If the cluttered appearance of large node-link diagrams had really been a decisive factor, matrix diagrams should have performed better here.


With a mean Error rate of only 5.8 out of 30 questions, the task was performed better than we expected. If the difficulty was too low, this might have hidden differences in performance between diagram types that could be more obvious if the questions were harder to answer.

Finally, a possible confounding factor was the fact that the employees were always ordered by ID number in the matrix diagrams. As such, finding individual employees will have been an easier task than in node-link diagrams, where the position of an individual employee in the image was different for every network. Consequently, questions that involved looking up employee ID numbers were likely answered faster on average with matrix diagrams.

5.2 Project Experience

As our project centers around a participant-based experimental study, many of our difficulties stemmed from this part of the project. The number of participants (40) limits the sample size. The majority were young university students, which means intrinsic variables like level of education and reaction speed may have skewed the results.

Other problems were technical in nature. The Qualtrics online questionnaire software was upgraded by its developers at least twice during the project, each time causing unannounced alterations to the task. On the first occasion this was only discovered when the task appeared differently during a participant session. Regular checks of the integrity of the task by the experimenter at the start of participation sessions were implemented to help prevent any further incidents.

5.3 Future work

One thing clearly left to explore is the question of what impacts performance more: the numerical ordering of the nodes in a matrix diagram, as discussed above, or the cluttering of the links between nodes in a node-link diagram. By varying the degree of links in the networks, which we kept static across all variations, one might be able to test the effect of clutter better than this study has done. Similarly, the effect of ordering the nodes in matrix diagrams may be investigated by clustering them differently, possibly with randomization. In addition, it is worth looking into how the location of the node ID column/row affects performance. For example, one could investigate how performance would change if the bottom node row was moved to the top instead.
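As an illustration of the randomization suggested above, the following sketch reorders the rows and columns of a small adjacency matrix with the same random permutation, so that the node order changes while every relationship is preserved. This is not the procedure used to generate the diagrams in our experiment; the four-employee network, the fixed seed and the helper name `reorder` are hypothetical and serve only as an example.

```python
import random


def reorder(matrix, order):
    """Return a copy of the matrix with rows and columns arranged per `order`."""
    return [[matrix[i][j] for j in order] for i in order]


# Hypothetical network: entry [i][j] == 1 means employee i messaged employee j.
ids = [1, 2, 3, 4]
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]

# A fixed seed (an arbitrary choice) makes the same stimulus reproducible.
rng = random.Random(42)
order = list(range(len(ids)))
rng.shuffle(order)

# Apply the same permutation to the ID labels and to both matrix axes.
shuffled_ids = [ids[k] for k in order]
shuffled = reorder(adjacency, order)

print("Node order:", shuffled_ids)
for label, row in zip(shuffled_ids, shuffled):
    print(label, row)
```

Using one permutation for both axes keeps the diagram a valid matrix representation of the same network, so any difference in performance could be attributed to the node ordering rather than to a change in the relationships shown.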

A more general recommendation for an experiment similar to ours is to vary the different variables more between diagrams. For example: while we varied the relative amount of communication types in two linked variables, one of which we analyzed, a more straightforward approach may be to use one variable.


Acknowledgements

First of all I would like to thank my supervisor Leonie for guiding and supporting me intensely and immensely throughout this lengthy project, and for combating my repeated writing incongruities.

I’m also deeply grateful to the people who volunteered to sit through my experiment, as well as to my friends and family who either participated, found participants and/or offered advice and a listening ear.

Finally, I extend heartfelt thanks and apologies to the custodians at the Faculty of Arts, for their support and tolerance of my efforts to arrange rooms to perform the experiment in.