
A User-Centered Study and Evaluation of Current RDF Data Storage Managers

SUBMITTED IN PARTIAL FULFILLMENT FOR THE DEGREE OF MASTER OF SCIENCE

Khuong Ho Si
10591168

MASTER INFORMATION STUDIES
HUMAN-CENTERED MULTIMEDIA
FACULTY OF SCIENCE
UNIVERSITY OF AMSTERDAM

July 20th, 2018

1st Supervisor: Dr. Ali Khalili, Network Institute
2nd Supervisor: Dr. F.M. Frank Nack, Informatics Institute


A User-Centered Study and Evaluation of Current RDF Data

Storage Managers

Khuong Ho Si

University of Amsterdam
Amsterdam, the Netherlands

khuonghs@gmail.com

ABSTRACT

The number of Linked Data tools has grown since 2005, which shows an increase in both attention and usage. A literature review of the existing LD tools shows that there are many tools within the same categories, which implies that tools differ in terms of usability and functionality. What these differences are, in terms of usability, has remained a gap in the scientific literature to this day. This paper reports on a user-centered study of a selection of six RDF data storage managers. Evaluating the tools with an evaluation framework and a post-task questionnaire, while performing RDF-related CRUD tasks, has brought various usability issues to the surface.

KEYWORDS

Linked Data, LOD, Linked Data tools, User-centered study, usability, RDF

1 INTRODUCTION

The internet has provided humankind with a new form of information sharing by lowering the barriers to publishing and accessing information. Using Web browsers, users can navigate through the World Wide Web via hypertext links in search of information relevant to their queries. Search engines index the documents and analyze the structure of links between them [9]. However, the Web has evolved into a global sharing platform for more than just documents. It is mostly data that modern Web publishers are distributing. This data can be of any kind and in any form, therefore sacrificing much of its structure and semantics [8]. Consequently, Tim Berners-Lee came up with a standard to structure the data on the Web and make it machine-readable, thereby creating linkable data [6].

Linked Data relies on two technologies: Uniform Resource Identifiers (URIs) [35] and the HyperText Transfer Protocol (HTTP) [15]. A URI is a globally unique identification mechanism, a compact sequence of characters that identifies an abstract or physical resource [35]. HTTP is a universal access mechanism that has been used by the World Wide Web global information initiative since 1990 [15].

With the amount and rate of data being published on the Web, numerous efforts have been initiated to build tools that assist Web content creators in publishing and interlinking structured data on the Web. Linked Data tools can be categorized into different stages. The Linked Data life cycle classifies the following categories: Storage, Authoring, Interlinking, Classification, Quality, Evolution/Repair, Search, Browsing and Exploration, and Extraction [4].

While there are numerous applications that assist Web content creators in making their data linkable, the user experience of these Linked Data tools is at the moment not very intuitive or graceful and is geared towards experienced Web content creators. Over the years, the emphasis has been mainly on ontologies and the correctness of the Linked Data specification rather than on how users would use and interact with it. Therefore, in terms of usability, there is still a lot of room for improvement. The Storage category is the most important category, as it is the basis for Linked Data. To create linkable data, the data has to be created and stored in a data storage manager, either in a graph-based or in a relational-based system. The stored data is expressed in RDF (Resource Description Framework), a framework in which each statement is a triple that uses URIs as identifiers. Graph-based systems can be classified as either RDF graph-based or other graph-based systems, such as labeled property graph systems, whose graphs are attributed, labeled multigraphs [45]. To keep this study feasible, its scope is limited to the Storage category of the Linked Data life cycle and to graph-based data storage systems, due to their importance and the limited attention they have received. What the usability differences between the different types of storage are has never been studied.
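To make the triple structure concrete, the snippet below is a minimal sketch of RDF data in Turtle notation. The base URI and property names are illustrative only (loosely modeled on the example city data that recurs in Section 5) and are not the exact file used in this study:

    # Each statement is a (subject, predicate, object) triple identified by URIs.
    @base <http://example.org/> .
    <Amsterdam> a <City> ;            # rdf:type
        <areaCode> "020" ;            # literal object
        <part> "Amsterdam-Noord" .    # literal object

Because subjects and predicates are URIs, the same resources can be referenced and linked from other datasets, which is what makes the data linkable.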

This leads to the following research question:

• What usability problems do users face when interacting with RDF data storage managers?

To find the answers to the research question, the following, more concrete, subquestions support the research:

• To what extent does the tool focus on the user interface?
• What are the differences between RDF data systems and other graph-based systems?
• How can the user experience be improved?

This paper describes the usability study concerning the following RDF data storage managers: Virtuoso Conductor1, Stardog2, Blazegraph3, AllegroGraph WebView 6.4.14, ClioPatria5 and Neo4j6. The tools will be evaluated using an evaluation framework and a post-task questionnaire, while performing RDF related create, retrieve, update and delete (CRUD) tasks.

This paper is divided into 8 sections. Section 2 describes the scientific work that others have performed related to Linked Data tools and their usability. Section 3 illustrates the literature review methodology that was conducted in order to obtain and make the selection of the tools. Section 4 describes and justifies the methodology of the user-centered study.

1 www.virtuoso.openlinksw.com
2 www.stardog.com
3 www.blazegraph.com
4 www.franz.com/agraph/allegrograph
5 www.cliopatria.swi-prolog.org
6 www.neo4j.com


Section 5 reports on the results, which are presented per tool. Section 6 discusses the results and research questions. Section 7 gives the conclusion of the paper. The final section, Section 8, outlines the future work that could and should be conducted after this study.

2 RELATED WORK

This section illustrates various studies on the usability of Linked Data tools, outlining the evaluation methods used in previous user-centered studies.

Over the past twenty years, overall usability studies of Linked Data tools have been lacking. However, when searching for specific tools, various studies concerning ontology editing appear. In 2000, [13] presented a study of several ontological engineering tools based on their usability. In this study, they used the system evaluation framework of [25], which had been introduced three years earlier. The framework is intended for the evaluation of software systems; however, it was a very subjective and therefore hard-to-measure method. Therefore, in [13] they decided to add another evaluation method, the checklist evaluation. As a continuation of [13], [19] presents a comparison of six ontology-engineering tools along three different dimensions: the user interface, the ontology-related issues found in the tool, and the tool's capacity to support the construction of an ontology by several people at different locations. To include objective as well as subjective metrics, the usability evaluation methods used were heuristic evaluation and user testing.

A different view on created or edited ontologies is obtained by using a visual modeling tool. [18] presents an evaluation of a visual modeling tool called OWL-VisMod (no longer functional). The evaluation is based on a user-centered approach using only questionnaires. [17] reported an evaluation of two ontology visualization techniques for information visualization, one being an indented tree and the other a graph technique. They measured effectiveness, efficiency, workload and satisfaction using three questionnaires (NASA-TLX [23], SUS [10], and the Usefulness, Satisfaction and Ease of Use questionnaire [32]) as well as reaction cards [5]. In addition, the participants performed tasks while the experimenters measured task success and the time spent on the tasks.

[42] presents an evaluation report for a different category in the Linked Data life cycle, namely the Search, Browsing and Exploration category. The report concerns Rexplore, a scholarly data explorer tool. The evaluation involved participants performing tasks and filling in questionnaires. After completing certain tasks, the participants were asked to fill in a SUS usability questionnaire and a second questionnaire about the strengths and weaknesses of the tested systems. The objective metrics measured were the task success rate and the time spent on the tasks.

Another category in the Linked Data life cycle is Authoring. [29] presents a systematic literature review on semantic content authoring tools and in particular their user interfaces. In this paper, they report on the interface types and features that could contribute to pre-defined quality attributes and eventually realize user-friendly interfaces.

An evaluation study related to the Storage category, in terms of databases and querying interfaces, was conducted in [26]. In this paper, they evaluated four different query interfaces and compared the four systems against each other in a usability study using the post-test questionnaire SUS and a comparison questionnaire. The four interfaces are Ginseng [7], NLP-Reduce [27], Querix [28] and Semantic Crystal [48]. Furthermore, an evaluation of various RDF databases is described in [50]. This paper evaluated the databases that support SPARQL query languages and assessed their general features, such as software producer, associated licenses, project documentation, support, extensibility, architecture overview, available query languages and interpretable RDF data formats. [14] presents a survey of RDF storage approaches, reviewing twenty-four different tools based on their storage scheme, storage support, query language, update support, inference support, scalability and distribution. A survey of RDF storage managers such as 3store [21], 4store [22], Virtuoso, RDF-3X [39], Hexastore [55], Apache Jena7, SW-Store [1], BitMat [3], AllegroGraph and Hadoop/HBase [11] was carried out in [41]. This comparative study was based on the technical implementations of the tools. In addition to [41], [37] presents a survey of RDF stores, in particular the software components dedicated to the storage, representation and retrieval of semantic information. The survey included AllegroGraph, Bigdata(R) [53], Apache Marmotta8, OpenLink Virtuoso, Oracle 12c (www.oracle.com/database/technologies/index.html) and Stardog. In 2016, a survey was performed in [2], which categorizes different tools into four main RDF storage models: giant Triple Table storage, concept-based storage, entity storage and binary storage. [43] provides an overview of the different RDF data management approaches. This paper differs from [41] in the sense that its focus lies on the approaches and not the tools. A very recent survey is described in [44]. This paper introduces state-of-the-art RDF storage and query technologies according to some classification criteria.

Although various overview and comparative studies have been conducted on RDF data storage managers, minimal attention has been given to their usability. This paper aims to fill the research gap of usability studies in the Storage category of the Linked Data life cycle, and in particular of RDF data storage managers, regardless of their technical implementations.

3 DATA COLLECTION

This section describes the data collection method and the investigated Linked Data tools. The data collection method subsection can be further divided into the research question, search strategy and inclusion/exclusion criteria.

3.1 Data collection method

In order to find the desired data from all the available literature, an altered systematic literature review method was used. The reason for this approach was to explore a wider range of journals and conferences in order to discover a greater number of Linked Data tools. This section can be divided into the formulation of a research question, the search strategy and the inclusion/exclusion criteria.

7 www.jena.apache.org
8 www.marmotta.apache.org


The research questions give us an aim for the desired outcome. The search strategy includes the reviewed libraries, journals/conferences and keywords, as well as the inclusion and exclusion criteria, which specify the types of study designs and outcomes that will be included in or excluded from the review.

3.1.1 Research question. The goal of this review is to go through the scientific literature on Linked Data tools, to find out which tools exist, and to classify them into categories. To achieve this goal, the aim is to answer the following general research question:

• What Linked Data tools exist?

This research question can be divided into the following sub-questions:

• RQ1. Has the published tool been created?
• RQ2. Is the tool still functional?
• RQ3. What is the corresponding Linked Data category?
• RQ4. What are the corresponding activities?

3.1.2 Search strategy. To find the answer to the research questions, the following electronic databases were reviewed as these were considered to be the most relevant ones:

• ACM Digital Library9
• SpringerLink10
• IEEE Xplore Digital Library11
• ScienceDirect12

9 www.dl.acm.org/
10 www.link.springer.com/
11 www.ieeexplore.ieee.org
12 www.sciencedirect.com

Within these digital libraries, the major journals and conferences on the Semantic Web were reviewed. These are listed below:

• Semantic Web Journal
• Journal of Web Semantics
• International Semantic Web Conference
• European Semantic Web Conference
• Semantics Conference
• Knowledge Engineering and Knowledge Management Conference
• Knowledge Engineering and Semantic Web Conference

Besides these journals and conferences, any references found in the papers that refer to other journals or conferences were also taken into account. This resulted in a more expansive overview of the published Linked Data tools than purely looking at the seven major journals and conferences on the Semantic Web.

Based on the research questions and pilot searches, the following basic search strings were formalized, as they seemed to be the most appropriate ones for the review:

• Linked Data OR LOD OR Open Data OR Semantic Web
• tool OR application OR framework OR workbench OR system
• entity OR entities
• classify OR classification
• ontology OR ontologies
• triple store OR triple storage OR RDF
• data graph OR data graphs
• storing OR storage
• query OR querying
• linking OR linkage
• quality
• enriching OR enrichment
• browse OR browsing OR search OR searching OR explore OR exploration
• extracting OR extraction
• user interface OR UI OR user experience OR UX OR user-support

Only one of these search terms had to be present in an article for it to be reviewed. This resulted in a substantial number of papers. To refine this search, inclusion and exclusion criteria were added.

3.1.3 Inclusion. The inclusion criteria narrow the large number of papers down to the tools that are actually valuable. The inclusion criteria are:

• Accepted peer-reviewed papers
• Papers that are published after 2013
• Papers that are affiliated with Linked Data tools and/or their user interfaces
• Tools that still exist and are functional

3.1.4 Exclusion. The exclusion criteria also help to identify the valuable papers by excluding those that are:

• Published before 2013. However, as previously mentioned, papers published before 2013 that are referenced by other papers and that mention LD tools that still exist are acknowledged.
• Not affiliated with Linked Data tools, but rather with languages and/or approaches

3.2 Linked Data tools

The literature review yielded a great number of tools; however, a tool is only relevant for this study if it actually exists and is functional. If this is the case, the tool is listed along with some general information, its reference, its year of publication, and the LD category and activities it is applicable to. The detailed table can be found in figures 1 to 3 in appendix A.

However, since the categorization of tools was troublesome, due to the overlapping of categories or the ambiguity of the tools and/or categories, the categories were left out while making the graphs. In Appendix A, figure 4 shows the number of tools that were published per year. Using a trendline, it is evident that the number of tools increases. The reason for this is presumably the rise in attention and utilization of Linked Data tools in recent years. Moreover, it is noticeable that the number of tools declined after 2016, which is possibly caused by the saturation of the Linked Data tools developed over the years.

In Appendix A, figure 5 displays the number of tools published per journal or conference. It is evident that the number of tools also increases over the years per journal/conference. Moreover, the outliers showing a high number of publications (5 and 4 publications per year) are from major semantics journals and conferences (Semantic Web Journal, International Semantic Web Conference and European Semantic Web Conference).


3.3 Tools selection

For the evaluation of the tools, a selection has been made based on their category, the availability of a UI for performing the tasks, differences in RDF implementation, the availability of community editions, and their popularity and usage.

The Storage category of the Linked Data life cycle is chosen since it is the most important stage of Linked Data. The creation and storage of linkable data is critical because it is the basis for building other Linked Data applications. Furthermore, if this step, which is the first threshold to using Linked Data, is already out of reach, adoption by new end-users will be difficult; this can be improved by recognizing and improving the usability of these tools.

Besides the category, all the tools are required to have some sort of UI that facilitates the execution of the tasks. Additionally, the goal was to evaluate the end-user's experience with different RDF implementations; therefore, Neo4j was included since it differs from the others in terms of RDF implementation.

Moreover, the tools are required to be freely available in some form. OpenLink Virtuoso and AllegroGraph have an open-source edition, the Stardog Enterprise edition has a 30-day trial, Blazegraph and ClioPatria are fully open-source, and AllegroGraph and Neo4j have a free community edition.

Additionally, they are all popular and in use, which can be deduced from the related work: most of the tools are mentioned or evaluated there. Furthermore, the tools are used for considerable projects. Virtuoso Universal Server powers DBpedia13, a project aiming to extract structured content from the information in Wikipedia14. NASA15 uses Stardog to build knowledge graphs. Wikidata16 is powered by Blazegraph. AllegroGraph is used in open-source and commercial projects, but also in Department of Defense projects [47]. The Dutch Ships and Sailors datacloud17 (DSS) is an example of a connected knowledge graph hosted on a ClioPatria semantic server. Furthermore, Neo4j is implemented in over 250 commercial companies, including Comcast18, Cisco19, eBay20 and Walmart21 [36].

The following list gives an overview of the criteria for the selection of the tools:

• Tools are in the Storage category of the Linked Data life cycle
• Tools provide a UI to perform the tasks
• Tools provide a community edition
• Tools are popular
• Tools are being used

From all the tools discovered during the literature review, the following selection has been made:

RDF graph-based:
(1) OpenLink Virtuoso
(2) Stardog
(3) Blazegraph
(4) AllegroGraph
(5) ClioPatria

Labeled property-based:
(1) Neo4j

13 www.dbpedia.org
14 www.wikipedia.org
15 www.nasa.gov
16 www.wikidata.org
17 www.2016.semantics.cc/dutch-ships-and-sailors
18 www.corporate.comcast.com
19 www.cisco.com
20 www.ebay.com
21 www.walmart.com

Table 1 shows the characteristics of the selected tools, such as the underlying data model, license, type of user and underlying technology.

4 METHODS

This section describes how the various Linked Data tools are evaluated based on their usability. This section is divided into the evaluation methods, measuring tools, tasks, evaluation framework and environment.

Usability is measured along three dimensions: effectiveness, efficiency and satisfaction [12]. The International Organization for Standardization (ISO) defines usability as the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use [49].

Effectiveness measures the completeness and accuracy with which users achieve specific goals. It is the measurement of the user's ability to perform a given task [30]. This is also known as the completion rate or the fundamental usability metric. It is typically recorded as a binary metric, with 1 = task completed and 0 = task not completed.

Efficiency describes the resources that a user expends to perform these tasks. It is the relation between the accuracy and completeness with which users perform these tasks and the resources used to achieve them [16]. Those resources can be the number of clicks, the time to perform the task, the number of errors, page views, etc.

Satisfaction describes the user's assessment of how well the product/system met his or her needs and desires [20]. It is a measurement of the comfort and positive attitudes that a user experiences when using the system [16].
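As an illustration (these formulas are not part of the cited definitions), effectiveness and efficiency are commonly operationalised as follows, where n_ij is 1 if participant i completed task j and 0 otherwise, t_ij is the time participant i spent on task j, N is the number of participants and T the number of tasks:

    \text{completion rate} = \frac{\sum_{i=1}^{N}\sum_{j=1}^{T} n_{ij}}{N \cdot T},
    \qquad
    \text{time-based efficiency} = \frac{1}{N \cdot T}\sum_{i=1}^{N}\sum_{j=1}^{T}\frac{n_{ij}}{t_{ij}}

In this study N = 1 (a single tester), so the completion rate reduces to the fraction of the four tasks completed per tool.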

4.1 Evaluation methods

In the field of usability evaluation methods, there are three types of methods: theory-based methods, expert-based methods and user-based methods [58][51][52][34][40].

4.1.1 Theory-based methods. Theory-based methods examine the processes involved between a system's processes and its outcomes. It is a method to understand how programs work [56]. This method does not use any testers; instead, it tries to construe how the system works using logical reasoning. The theory-based method is not suitable for a comparison between the various tools and is lacking in measurable metrics, for the reason that there are no testers involved. This method is therefore not applicable and thus not useful.

4.1.2 Expert-based methods. Expert-based methods are evaluation methods that are performed by usability experts. These methods include heuristic evaluations, cognitive walkthroughs and checklists.


Tool | Underlying data model | License | Type of user | Underlying technology
OpenLink Virtuoso | RDF graph | Community edition | Developers | C, C++
Stardog | RDF graph | 30-day free trial Enterprise edition & Community edition | Developers | Java, Groovy, Clojure and .NET
Blazegraph | RDF graph | Fully open-source | Developers | Java
AllegroGraph | RDF graph | Community edition | Developers | C#, C, Common Lisp, Java and Python
ClioPatria | RDF graph | Community edition | Developers, End-users & Academics | C and Prolog
Neo4j | Labeled property graph | Fully open-source | Developers & End-users | Java, .NET, JavaScript, Python and Ruby

Table 1: Characteristics of the tools

It requires a usability expert, which I am not. Furthermore, this method measures quantitative metrics but does not measure any qualitative metrics. This method is not suitable for a non-expert and only shows the quantitative aspect. It is therefore not useful for this evaluation study.

4.1.3 User-based methods. User-based methods are methods that require users to perform the evaluations. These empirical usability evaluations capture the performance of the user while taking objective and subjective measures into account [46]. Because various testers are involved, it can be time-consuming. However, the obtained measures are very valuable and provide great depth of insight. Therefore, for this study, I evaluate the LD tools using the user-based method, albeit with myself instead of multiple testers.

The usability metrics can be divided into objective/quantitative and subjective/qualitative metrics. Quantitative metrics are the number of clicks, time spent on a task, number of page views, number of errors, heat maps and completion rate. Although quantitative metrics are objectively measurable, the use of qualitative usability metrics is also very common and has significant value. Qualitative metrics allow a tester to compare similar interfaces or to benchmark against current competitors to determine whether the usability of another product/system is comparable and/or has exceeded an agreed-upon relative threshold [30].

For this study, the measured quantitative metrics are the time spent on a task and the completion rate. These metrics are measured using the screen recordings described in the subsection below. As for the qualitative metrics, a post-task questionnaire is administered immediately after each task to improve accuracy [24]. The questionnaire consists of three short questions that are relevant to the tasks, see figure 1. Furthermore, the goal was to complete each task within a maximum task time of 15 minutes. Whenever a task could not be completed, using Google22 to look for instructions was permitted.

4.2 Measuring tool

Usability evaluation tools for Web design are divided into four categories: screen recordings, heat maps, experiment management and behavioral analytics [31]. Screen recordings give the best insights; however, the designer time costs are high.

22 www.google.com

Figure 1: Task questionnaire

Nonetheless, since our goal is to obtain a great depth of insight with less emphasis on the designer time cost, screen recordings are valuable when performing certain tasks. The screen recordings are able to measure the previously described quantitative metrics. This method is also called remote usability testing, which captures the user interactions. This happens synchronously in real time; however, since the screen recordings can be stored, the videos can also be viewed asynchronously [31]. The screen recorder used is SimpleScreenRecorder23, which captures the whole screen and can be hidden and run in the background. This function is especially useful for minimizing the Hawthorne effect, the phenomenon whereby participants in any human-centered study show atypically high levels of performance simply because they are aware that they are being studied [33]. Furthermore, a timer is used to record the duration of each task.

4.3 Tasks

Due to the complexity and dissimilarity of the tools, it was not feasible to test all of their functionalities. Therefore, for this study, we limit the tasks to those that are relevant for any data storage system.

In order to analyze the differences in usability for each tool, several tasks related to RDF store data management systems were formulated. These tasks include the CRUD (Create, Retrieve, Update and Delete) operations. The four tasks are:

(1) Create and store RDF data
(2) Retrieve RDF
(3) Update/edit RDF
(4) Delete RDF

23 http://www.maartenbaert.be/simplescreenrecorder/


General
1. Evaluate the installing process
2. Evaluate the clarity of the interface
3. Evaluate the speed of updating after new data is inserted
4. Is the meaning of the commands clear?
5. Evaluate the stability of the tool
6. Does the tool provide feedback?
7. Evaluate the help system

RDF
1. Evaluate the creating process of RDF data
2. Evaluate the query process of RDF data
3. Evaluate the editing process of RDF data
4. Evaluate the deletion process of RDF data

Table 2: Evaluation framework


The specific tasks can be found in Appendix B.
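As an illustration, the four tasks roughly correspond to the SPARQL 1.1 statements sketched below, as they could be typed into a tool's query window. This is only a sketch: the prefix, resource and property names are hypothetical (modeled on the example city data that recurs in Section 5), and the exact task descriptions are the ones in Appendix B.

    PREFIX ex: <http://example.org/>
    # (each statement is submitted as a separate request, repeating the PREFIX line)

    # Task 1 - Create: add triples describing a city
    INSERT DATA {
      ex:Amsterdam a ex:City ;
                   ex:areaCode "020" ;
                   ex:part "Amsterdam-Noord" .
    }

    # Task 2 - Retrieve: list every triple with Amsterdam as the subject
    SELECT ?p ?o WHERE { ex:Amsterdam ?p ?o }

    # Task 3 - Update/edit: add or change a value for Amsterdam
    INSERT DATA { ex:Amsterdam ex:populationTotal "842343" . }

    # Task 4 - Delete: remove every triple about Amsterdam
    DELETE WHERE { ex:Amsterdam ?p ?o }

In practice, several of the evaluated tools expose these operations through file-upload or browse dialogs rather than a query window, which is exactly where the usability differences described in Section 5 appear.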

4.4 Evaluation Framework

The evaluation follows a system evaluation framework based on the framework described in [13]. That framework is intended for the evaluation of software systems in general; here it is tailored to the evaluation of RDF data management systems. Table 2 displays the evaluation framework.

The first part of the evaluation framework evaluates general usability aspects of the tool. The second part evaluates the CRUD tasks as well as RDF related processes.

4.5 Environment

The tasks took place in a quiet study room. The machine used is a Dell XPS 13 9360 running Windows 10 Home, 64-bit, with 8 GB RAM. However, since installing the tools on Windows was troublesome, the operating system (OS) was changed to Linux, using an external bootable Linux hard drive for installing the tools. The OS used is Ubuntu 18.04 LTS.

4.5.1 Tester. The evaluation was conducted by myself only, with no other testers. The goal was to find participants to perform the tasks; however, recruiting them was not possible in the short time period.

5 RESULTS

This section describes the results acquired from evaluating the tools using the evaluation framework with a think-aloud method. In addition to the evaluation framework, the post-task questionnaire provides the qualitative metrics obtained with every task. The section is divided into a subsection per tool, each subdivided into the general and the RDF components of the evaluation framework, with the post-task questionnaire results per task.

Figure 2: Virtuoso file upload

5.1 Virtuoso Conductor

OpenLink Virtuoso is developed by OpenLink Software as a database engine hybrid that combines the functionality of a traditional RDBMS, virtual database, RDF and XML in a single system [37].

5.1.1 General. The installation process took around 3 hours. A pre-defined list of packages was required to be installed; however, the gperf, gawk, bison and flex packages were not installed successfully. Nonetheless, the server is running and the tool can be used. The interface is crowded, with nine tabs, each having their own sub-tabs, and seven navigation menus on the left. These various options do not make it pleasant to navigate through the tool. Virtuoso uses the term ‘repository’ for data management. The speed of updating after new data is inserted is fast and the tool is stable. The meaning of the buttons is clear; however, the names of the tabs are not. There is limited feedback, and it can also be incorrect; an example is given in the RDF section below. This makes it hard to notice whether anything has been changed and/or changed correctly. A plus side of the feedback is that it has a yellow background, which makes it very visible. There is no overall help system. However, there are two hyperlinks, ‘documentation’ and ‘tutorial’, in the navigation menu, which is always visible.

5.1.2 RDF. The expectation was to create RDF data in the ‘Database’ tab; however, within that tab there are five options, four of which are ambiguous. This resulted in going through the tabs one by one. The ‘Linked Data’ tab has 9 sub-tabs, one being ‘Quad Store Upload’. Task 1 is the creation of a triple and not a quad; however, it is the closest designation there is. Once the file was uploaded, no visualization of the data was present. Going to another tab, ‘Web Application Server’, there is also an option to upload data, see figure 2. After uploading, the file name showed up in the WebDAV Content. Due to the visualization of the file, the uncertainty about whether the task was completed disappeared. The task time duration is 06:24 minutes. The post-task questionnaire scores are: Difficulty score = 2, Satisfaction score = 2 and Time score = 2.

The same maze holds for the query process. Similar to task 1, finding where to perform a query meant going through all the tabs one by one. When arriving at the ‘Linked Data’ tab, the first query attempt was incorrect due to a typo. This was made clear by an error saying “Virtuoso 3700 Error SP030: SPARQL compiler, line 6: syntax error at ; ”. There is no visible line numbering, and when counting the code lines, the query reached a maximum of 5 lines, which means that the error feedback was not accurate in that respect, although it was right about the ‘;’.


Figure 3: Virtuoso query results

The query results visualization is pleasant. It is clear what the subject and object of the query results are, see figure 3. The task time duration is 07:52 minutes. The post-task questionnaire scores are: Difficulty score = 3, Satisfaction score = 3 and Time score = 2.

For task 3, the idea was to find the triples and delete those that had to be changed. However, going through the tabs did not result in an overview of the stored triples. In the WebDAV Content tab, where the file is shown, there is an edit icon, which opens a text editor. Changing and saving the data there did not result in a changed result when querying for the updated data. Options ran out and browsing on Google was necessary. The maximum task time of 15 minutes passed and the task was not completed. The post-task questionnaire scores are: Difficulty score = 1, Satisfaction score = 1 and Time score = 1.

The final task also required changing the data, by finding the instance Amsterdam and deleting it. From the previous task, it had become obvious that deleting data was awfully complex, let alone finding it. Therefore, Google was used, which led to the manual from OpenLink themselves. However, the examples given consisted of RDF data with a title. Our data did not have a title, which made the deletion process problematic. The location of where to type the code examples was also unclear. Trying all possible adjusted examples in the query window did not complete the task. The maximum task duration passed and the task was not completed. The post-task questionnaire scores are: Difficulty score = 1, Satisfaction score = 1 and Time score = 1.

5.2 Stardog

Stardog is a graph database based on pure Java storage and designed for mission-critical applications [37].

5.2.1 General. The installation process was problematic, with an installation duration of multiple hours spread over multiple days. The Debian-based installation approach described on the site did not work: the installation process got stuck at one point, after which the Linux OS approach was carried out. This showed an error when connecting to the server in the terminal; it turned out that the PATH creation had gone wrong. The error could not be resolved using their installation guidelines. A third-party guideline explained the installation process in more detail, which eventually enabled the server to run. The interface is very clean and therefore clear. It has only three tabs: Databases, Security and Query Management. Stardog uses the term ‘database’ for data management.

Figure 4: Stardog file upload

The speed of updating after new data is inserted is very fast. The commands all have easy-to-understand names, which makes navigating through the tool pleasant, with a feeling of control. The tool is stable; however, the properties view in the database browser does not work: it does not show any properties, although it should. The tool shows some user-friendly feedback, such as placeholder texts for adding and removing data. Furthermore, there are some additional information bars and a confirmation overlay when trying to delete a database: to delete a database, one must type in the name of the database to confirm. There is no real help system. There is a search bar and a help icon; however, the search bar does not show any results and the help icon just shows 2 keyboard shortcuts. The only help assistance is at the bottom of the page, which has two hyperlinks, Stardog Docs and Stardog Community, which open a new webpage on their website, one being the manual and the other the community forum.

5.2.2 RDF. The creation process of RDF data is relatively easy, since the only relevant tab is the Database tab. On the page, there is a big green button ‘New DB’, and nothing else. Within the database creation page, there are numerous options available, most of them incomprehensible. When completing the database, the following feedback is presented: ‘Database created! Database (name) was created, go to (name) console to add data.’ The (name) console was not visible in the tabs; however, it is clickable in the feedback. Once within the console, there are again three tabs: Query, Browse and Data. The Data tab shows an arrow pointing downwards, which implies a drop-down menu. This is indeed the case; the menu consists of ‘Add’, ‘Remove’ and ‘Export’, see figure 4. When clicking on ‘Add’, an overlay window pops up where you can browse for a file and/or add a tag. Once uploaded, the following feedback is shown: ‘Success! Data added successfully.’ The task time duration is 01:29 minutes. The post-task questionnaire scores are: Difficulty score = 4, Satisfaction score = 5 and Time score = 5.

When trying to perform task 2, the initial thought was to go to the ‘Query Management’ tab. However, the page titled ‘Running Queries’ showed nothing; it seems to be a page to review and manage past queries, not to create new ones. This led to navigating to the ‘Database’ tab and selecting the newly created database. The page showed a Query sub-tab, which led to the Query Panel, where one can type the query. To execute the query, the expectation is that the execution button is at the bottom; however, it is positioned at the top. The results show the subject as well as the object in a clear table; however, you have to scroll down to see the results, see figure 5. The task time duration is 01:22 minutes. The post-task questionnaire scores are: Difficulty score = 4, Satisfaction score = 4 and Time score = 5.


Figure 5: Stardog query results

Figure 6: Stardog delete

For task 3, the ‘Browse’ tab seems to be a logical step if a representation of the data is desired. This feature makes it easy to find the data and to edit/update it. The idea was to remove all the classes and properties with Amsterdam as subject. ‘Classes’ displays the City class, in which Amsterdam and Rotterdam are the two instances. At ‘Properties’, the properties ‘areaCode’ and ‘part’ are not further clickable and cannot be edited. The last method was to delete all the Amsterdam-related properties and relations and to upload a new file with the updated data requested in task 3. Using the browse function to see whether it was uploaded successfully was not an option, since this function does not work well. Querying the new results did produce the right data, which confirms that the updated file was successfully uploaded. The task time duration is 07:34 minutes. The post-task questionnaire scores are: Difficulty score = 2, Satisfaction score = 2 and Time score = 2.

Once it was clear that Stardog did make the changes correctly, and that it was merely the browse function that did not work properly, deleting data could be done more comfortably. In the drop-down menu of the ‘Data’ tab there is an option ‘Remove’. However, inadequate knowledge of RDF data prevented the use of this method, so another method was required. In the Browse tab, there is an option to select the classes and properties that are related to Amsterdam and to delete those instances, see figure 6. The task time duration is 01:55 minutes. The post-task questionnaire scores are: Difficulty score = 4, Satisfaction score = 5 and Time score = 5.

5.3 Blazegraph

Blazegraph is an ultra high-performance graph database supporting Apache TinkerPop and RDF/SPARQL APIs. It supports up to 50 Billion edges on a single machine [38].

5.3.1 General. The installation process was rather easy, with a duration of 1.5 hours. The main problem was that the tool specifically requires Java version 8. The more recent version (Java 9) is not compatible, which is not made clear in the installation guidelines. The interface is clean, with 7 tabs and no sub-tabs. While Stardog uses the term ‘database’ for data management, Blazegraph uses the term ‘Namespaces’. The speed of updating after new data is inserted is fast and the commands and buttons are clear. The tool seems rather stable once it is connected. Problems with the connection to the server did arise on the testing day, which forced another testing day. Feedback is lacking: when creating a new namespace, there is no feedback saying that it was successfully created, and the feedback in the query window looks more like a log file than a suggestion or recommendation. One feature that is helpful is the coloring of text in the query window. It shows when things go right or wrong; however, it uses a color for everything, which loses the overview. There is a help system and a status tab. The status tab displays a sort of log file, which does not make any sense to an end-user. However, at the ‘Welcome’ and ‘Namespaces’ tabs, a small text provides information on how to perform certain actions. Furthermore, at the bottom of the page, there is a hyperlink that says ‘Blazegraph - Wiki’; however, this leads to their main website (www.blazegraph.com) and not a Wikipedia page as expected.

5.3.2 RDF. For task 1, creating RDF data, the initial thought was to go to ‘Update’, since the task is to add data and to update it. However, it did not seem like the right place to add RDF data. Going through the tabs again, the first one to stumble upon is the ‘Welcome’ tab. Reading the instructions more carefully this time, it shows a hyperlink to the ‘Namespaces’ tab. Here you are able to create a namespace. Once done, there is no feedback response, only an addition to the table under the title ‘Namespaces’. After reading the instruction below, it became obvious that you first have to select ‘Use’. Going to the ‘Update’ tab, it says “(Type in or drag a file containing RDF data, a SPARQL update or a file path or URL)”. After uploading a file, the feedback response at the bottom of the screen shows a very small ‘modified: 6, millisecond: 187’, which indicates that there have been some modifications. The task time duration is 01:46 minutes. The post-task questionnaire scores are: Difficulty score = 4, Satisfaction score = 2 and Time score = 4.

Querying is very easy, since the only relevant tab for querying is ‘Query’. The results are shown in a clear representation, where the object and subject are displayed. Additionally, the time of execution, the query, the results, the execution time and a delete function are also shown, see figure 7. The task time duration is 01:29 minutes. The post-task questionnaire scores are: Difficulty score = 5, Satisfaction score = 35 and Time score = 5.

For task 3, finding the file in order to update it is the first step. Going to the ‘Update’ tab, there is an option to upload the file. Once uploaded, the data appeared as editable text. Editing and saving it looked as if the data had been updated, see figure 8.


Figure 7: Blazegraph query results

Figure 8: Blazegraph query update

However, when performing a query to test the results, it seemed that the Update tab only adds data and does not remove data. Therefore, in order to change the data, the old file had to be deleted and a new, updated file had to be uploaded. The task time duration is 02:30 minutes. The post-task questionnaire scores are: Difficulty score = 4, Satisfaction score = 3 and Time score = 4.

The same procedure holds for task 4. Finding the instance Amsterdam did not succeed and, just as in task 3, a new file without the instance Amsterdam was uploaded and the old file was deleted. The task time duration is 03:05 minutes. The post-task questionnaire scores are: Difficulty score = 4, Satisfaction score = 3 and Time score = 4.

5.4 AllegroGraph WebView 6.4.1

AllegroGraph is a closed source RDF store developed by Franz Inc. in Common Lisp. Currently, it is being used in various open source and commercial projects. It is the storage component for the TwitLogic project that is bringing the Semantic Web to Twitter data [37].

5.4.1 General. The installation guidelines are not clear, with a lot of redirection links. There is no clear indication of which version to download and for which operating system. The installation duration was 1.5 hours; however, the login did not work for some reason. When contacting Franz Inc. for assistance, they offered a different username and password, but logging in was still not possible. After re-installing the tool, the login finally succeeded. The interface is very clean, with only three tabs: ‘Utilities’, ‘Admin’ and ‘User test’. The first page leads to ‘Repository’, where one can add data, so their terminology for data management is ‘repository’.

Figure 9: AllegroGraph file upload

The speed of updating after inserting new data is a fraction slower than with the previous tools; however, it does not create any hindrance. The meaning of the commands is very clear; there is no ambiguity in them. As mentioned before, the tool is stable, although the problematic login is something to consider. There is feedback, which shows up in the middle of the screen with a yellow background. It is visible and easy to understand; however, the feedback window has to be dismissed manually, which is odd and aggravating. The help system is lacking: although there is a ‘Documentation’ drop-down menu, most of its hyperlinks do not work.

5.4.2 RDF. It is easy to find how to create new data. Once logged in, the first page leads to ‘Repository’, where creating a new repository is the first option, see figure 10. Once within the repository, a list of options is shown. One of them is called ‘Import RDF from an uploaded file’, see figure 9. The feedback shown when uploading a file is suddenly on the right and in a different layout, saying ‘Triples successfully imported.’, which is not very visible. However, it does show the title of the created repository and the number of statements, which corresponds to the uploaded data file. The task time duration is 01:19 minutes. The post-task questionnaire scores are: Difficulty score = 4, Satisfaction score = 4 and Time score = 5.

For task 2, both methods for the task were performed. The viewing method was found in the ‘Explore the Repository’ section under ‘View triples’, which showed all the triples with the Subject, Predicate and Object, the number of results, the query time and other information. The SPARQL query was easily performed in the ‘Queries’ tab, which shows a window where the SPARQL query can be executed. I made a typo, with an extra ‘’ at the end. The feedback showed “Executing query failed: Line 5, found ‘’. Was expecting one of: EOF, GROUP, HAVING, LIMIT, OFFSET, ORDER, VALUES”. The explanation was not correct; however, the first comment was very useful since it showed the problem. The results are in the same format as the ‘View triples’ feature, see figure 10. The task time duration is 01:33 minutes. The post-task questionnaire scores are: Difficulty score = 4, Satisfaction score = 4 and Time score = 4.


Figure 10: AllegroGraph query results

Figure 11: AllegroGraph delete

For task 3, the repository menu under ‘Load and Delete Data’ showed ‘Delete statements’. Here you have to fill in the Subject, Predicate, Object and Graph; however, it gave as feedback ‘0 statements deleted’. This was odd, since the query results matched the Subject, Predicate and Object. Adding the statement populationTotal with the value ‘842343’ was not accepted either. This led to the deletion of the entire file and the uploading of a new, updated file. The task time duration is 13:22 minutes. The post-task questionnaire scores are: Difficulty score = 2, Satisfaction score = 2 and Time score = 2.

Knowing the difficulties from task 3, the simplest way to delete statements was to go to ‘View triples’. There it is possible to click on a subject and delete all the statements with ‘Amsterdam’ as the subject, see figure 11. One has to do this one by one, statement by statement, each with its own confirmation window. After this deletion process, going back to ‘View triples’ was a simple way to see whether it had succeeded. The task time duration is 01:09 minutes. The post-task questionnaire scores are: Difficulty score = 4, Satisfaction score = 2 and Time score = 5.

5.5 ClioPatria

ClioPatria is a comprehensive Semantic Web development framework based on SWI-Prolog. SWI-Prolog provides an efficient C-based main-memory RDF store that is designed to cooperate naturally and efficiently with Prolog, realizing a flexible RDF-based environment for rule-based programming [57].

5.5.1 General. The installation process is relatively easy, with a duration of 2 hours.

Figure 12: ClioPatria upload file

The reason for this simple but relatively long installation is that GIT24 and SWI-Prolog25 have to be installed before ClioPatria can be installed. The interface is very clear, with six clearly defined tabs. The only problem is that two tabs are ambiguous: ‘Admin’ and ‘Administrator’. The designers have thought about this and made the ‘Administrator’ tab italic. The terminology used for data management is ‘Repository’. The speed of updating after new data is inserted is very quick. As said before, the tabs are very clear and the same holds for the commands; the reason for this clarity is the simplicity of the tool. There are no crashes whatsoever, which makes the tool very stable. The feedback is very clear. If a query goes wrong, a red bar shows the error and a red exclamation mark appears at the line number where it went wrong. For other errors, the tool opens a new window showing the error. There is a help system in the ‘Help’ tab, which has the following sections: Documentation, Tutorial, Roadmap and HTTP Services. The links redirect to these sections without leaving the tool's web page, which is satisfying.

5.5.2 RDF. Due to the simplicity and clarity of the interface, it is obvious that to upload a file, one has to go to the ‘Repository’ tab, which opens a drop-down menu with ‘Load local file’ as its first option. Once uploaded, the feedback shows ‘Operation completed’. Furthermore, as confirmation of the uploaded file, information such as CPU time, resources and triples is shown, see figure 12. The task time duration is 00:23 minutes. The post-task questionnaire scores are: Difficulty score = 5, Satisfaction score = 5 and Time score = 5.

For task 2, there are no viewing options, so a SPARQL query was required. This was easy to find, as shown in figure 13. The Query tab has two options: ‘YASGUI SPARQL Editor’ and ‘Simple Form’. The Simple Form page did not make any sense, so going to the YASGUI SPARQL Editor resulted in a familiar environment. After typing the query, the execution button is not a regular button saying ‘execute’, but a ‘play’ icon. It initially looked odd; however, there were no other options that could have the same meaning as ‘execute’. After running the query, a table showed the results, with the number of entries, subjects and objects. There is also an option to view the results in other ways than via a table, such as a Raw Response, Pivot Table or Google Chart, see figure 13. The task time duration is 01:40 minutes. The post-task questionnaire scores are: Difficulty score = 5, Satisfaction score = 5 and Time score = 5.

Removing triples can be done under the ‘Repository’ tab, via an option called ‘Remove triples’. The explanation given is that the three fields (Subject, Predicate and Object) should be in N-Triples/Turtle notation.

24 www.git-scm.com
25 www.swi-prolog.org


Figure 13: ClioPatria query results

Figure 14: ClioPatria delete

Having tried to remove ‘City’ as an RDF type for the subject Amsterdam, it returned an internal server error saying “Type error: ‘ntriples(predicate)’ expected, found rdf:type (an atom) (Field must be in N-triples notation)”. Not knowing how to rewrite the data into an N-Triples notation readable by the tool, the last resort was to upload a new file with the changes. This could easily be done in the first tab, ‘Places’, under ‘Graphs’, see figure 14. The task time duration is 03:13 minutes. The post-task questionnaire scores are: Difficulty score = 3, Satisfaction score = 3 and Time score = 3.
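For reference, the error suggests that the form expects each field as a full URI in N-Triples syntax rather than a prefixed name such as rdf:type. A hypothetical rendering of the triple in question in that notation (reusing the http://example.org/ base mentioned in Section 5.6) would be:

    <http://example.org/Amsterdam> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://example.org/City> .

where the predicate field would then receive the full rdf:type URI instead of the abbreviation.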

For task 4, the same actions are required as at the beginning of task 3. Another way to find the triples, in a graph-based or table-based view, was discovered by using the query results and clicking on them. However, there are no options to delete those triples, which led to uploading a new file without the instance Amsterdam and deleting the old file. The task time duration is 03:57 minutes. The post-task questionnaire scores are: Difficulty score = 2, Satisfaction score = 2 and Time score = 3.

5.6 Neo4j

Neo4j is a JVM-based NoSQL database. As the leading graph database, its model is intuitive and expressive, mapping closely to your whiteboard domain model. For highly connected data, Neo4j is thousands of times faster than relational databases, making it ideal for managing complex data across many domains, from finance to social, telecoms to geospatial [54].

5.6.1 General. The installation process is relatively easy, with a duration of 1 hour. The login is very problematic: you would expect to log in first in order to work with the tool; however, when trying to log in, a new window opened with a ‘server not found’ error. The interface looks very clear and professional. It has 3 navigation menus at the top and 3 at the bottom on the left of the screen, instead of tabs.

Figure 15: Neo4j add data

The downside is the window for the command lines, which is very small. The speed of updating after new data is inserted is very fast and the meaning of the commands is clear. The major problem is the stability of the tool: it crashes continuously and takes the whole machine down with it. Why this occurs is not known. There is feedback in the command window showing the previous command and its result; however, there is no feedback when the system crashes. Furthermore, there is a clickable tutorial as assistance; however, it is very limited and not very helpful.

5.6.2 RDF. The first thing was trying to log in, which did not work; the server was not found. The next step was to try creating the data without logging in. On the left, there is a database icon, which was thought to be the place to create/upload data. However, there was no option to create a new database. The star icon below, which stands for ‘Favorites’, has an option to create a new folder; however, no further actions are possible after that. At the bottom, there is an option to drag in a .txt file, for which the standard .ttl file had to be renamed to .txt in order to drag it. After dragging, the feedback showed ‘(name).text has been added.’ and the file appeared in the saved scripts with a clickable link. After clicking on the link, the text was loaded into the command line. When pressing the play button, the following error was shown: ‘ERROR Neo.ClientError.Statement.SyntaxError: Invalid input @: expected <init> (line 1, column 1 (offset: 0)) “@base <http://example.org/>”’. At this point, Google was used, which made it clear that you have to create the nodes manually with code, see figure 15. Creating the nodes succeeded; however, the tool and my machine froze for 1 minute at 12:04. The task time duration is 14:09 minutes. The post-task questionnaire scores are: Difficulty score = 1, Satisfaction score = 2 and Time score = 1.

For task 2, there was an option called ‘Monitor the system’, which implied monitoring the data; however, it was all about the technical status. When pressing the database icon, it showed Amsterdam, City and Rotterdam as node labels. Those labels are clickable, and the Amsterdam node was clicked. There appeared 2 red dots and 2 green dots, each color showing 020. Most likely, a ‘create’ command went wrong, which created additional dots. When hovering the mouse over a dot, the bottom showed areaCode: 010 and part: Hoogvliet. Furthermore, in the database information on the left navigation menu, there were 3 property keys: city, areaCode and part.


Figure 16: Neo4j query results

Figure 17: Neo4j delete

entity ‘nodefi and part ‘Amsterdam-Noord’ and ‘Hoogvliet’. Same holds for clicking on ‘City’ and ‘areaCode’, see figure 16. The task time duration is 01:57 minutes. The post-task questionnaire scores are: Difficulty score = 4, Satisfaction score = 3 and Time score = 4. When trying to perform task 3. The tool was very ‘buggy’ and crashed continuously. Thinking that closing all the existing win-dows below the command line would help, it did not change any-thing about the stability of the tool. The tool crashed four times during this task and even lost all the nodes due to the lost connec-tion to the Neo4j server. After turning back on the server via the terminal, the nodes returned. However, the maximum task time duration was reached and the task was not succeeded. The post-task questionnaire scores are: Difficulty score = 1, Satisfaction score = 1 and Time score = 1.

Figure 17: Neo4j delete

For task 4, going through the nodes, there is an option to delete the attributes; however, the deletion is only visual and temporary. Seeing no other delete options, Google was used, which led to the Neo4j developer manual. The manual showed a straightforward piece of code for deleting a node, see figure 17. Furthermore, during this task the tool did not crash; perhaps restarting the server changed something. The task time duration is 00:56 minutes. The post-task questionnaire scores are: Difficulty score = 5, Satisfaction score = 5 and Time score = 5.
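As a hedged reconstruction of the kind of statement shown in the developer manual (cf. figure 17), deleting a node in Cypher amounts to matching it and removing it together with any attached relationships; the property used to match the node is an assumption based on the task data.

// Hedged sketch: delete the Amsterdam node, including any relationships.
MATCH (n {city: 'Amsterdam'})
DETACH DELETE n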

6 DISCUSSION

This section summarizes the results from the evaluation framework and the post-task questionnaire. Furthermore, it answers the research questions and presents the discussion points of this study.

6.1 Results overview

When evaluating the tools according to the evaluation framework, the first general criterion is 'Evaluate the installing process'. The installation process for Blazegraph and Neo4j was relatively easy compared to the other tools. Virtuoso Conductor and Stardog both had some difficulties, while AllegroGraph 6.4.1 was the most difficult one, due to the unclear manual on its website and the problematic login. The second criterion, the clarity of the interface, is perceived as very positive for most of the tools. Only the interface of Virtuoso Conductor is far too complicated and cluttered.

The third criterion, the speed of updating, is excellent for every tool. No tool responded so slowly that it caused any tedious usability issues.

The same holds for the fourth criterion, the clarity of the commands, except for Virtuoso Conductor. The names of its tabs are often ambiguous, which does not make the functions clear; however, the buttons are straightforward.

The fifth criterion, the stability of the tool, differs per tool. For the stability of Virtuoso Conductor and ClioPatria, no negative remarks can be given. On the other hand, Neo4j crashed continuously when performing the tasks. Stardog, Blazegraph and AllegroGraph WebView 6.4.1 are all acceptable: Stardog has non-functional browsing actions, Blazegraph has problems with the connection to the server, while AllegroGraph WebView 6.4.1 has complications with logging in.

The sixth criterion, the evaluation of the feedback, is satisfactory for all tools except Blazegraph. Blazegraph's feedback looks more like a log file than an actual recommendation or suggestion. The feedback of Virtuoso Conductor and AllegroGraph WebView 6.4.1 is acceptable, while the feedback of Stardog, ClioPatria and Neo4j has clearly been well considered by the designers.

The seventh and last general criterion, the evaluation of the help system, is poorly evaluated overall for all tools except ClioPatria. ClioPatria has a help tab with a drop-down menu of 4 sections, which can be clicked through inside the tool. Virtuoso Conductor and Stardog both lack a built-in help system, but they provide links to their documentation and other relevant websites. Blazegraph has no help system, only one hyperlink, 'Blazegraph Wiki', which redirects to its main website. AllegroGraph WebView 6.4.1 has a Documentation tab with a drop-down menu; however, most links do not work. Neo4j has a clickable tutorial, but it does not provide much information.

The first RDF-related criterion, the evaluation of the creating process, is acceptable for most tools, except for Virtuoso Conductor and Neo4j. For Virtuoso Conductor, the problem lies in the complicated interface and not knowing where to upload the data. For Neo4j, the problem is that the creation method differs significantly from the other graph-based tools, which was not known in advance and therefore extended the task duration.

The second criterion, the evaluation of the query process, is evaluated with a high to very high subjective score. Only Virtuoso Conductor scores meagerly, again due to the complexity of the tool's interface.

The third criterion, the evaluation of the editing process, is inadequate for all tools, except for Blazegraph, which barely scores above the average of 3.


The last criterion related to RDF data management is the evaluation of the deletion process. Stardog and Neo4j score very high, whereas Virtuoso Conductor scores very low. Blazegraph and AllegroGraph WebView 6.4.1 are acceptable, while ClioPatria scores below average.

The summary of the evaluation framework is presented in table 3. A plus (+) means positive, while a double plus (++) means very positive. A zero (0) means reasonable, a minus (-) is negative, while a double minus (--) means very negative. For the General criteria, the + and - are given subjectively, while for the RDF criteria, the score reflects the total average score ((Difficulty score + Satisfaction score + Time score)/3) derived from the post-task questionnaire, where a fractional part below 0.5 is rounded down and 0.5 or above is rounded up. A total average score of 1 = --, 2 = -, 3 = 0, 4 = + and 5 = ++.
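As a worked example of this mapping: the Neo4j creation task (section 5.6.2) received Difficulty score 1, Satisfaction score 2 and Time score 1, while its deletion task received 5 on all three variables, which yields the -- and ++ entries for Neo4j in table 3:

\[
\frac{1 + 2 + 1}{3} \approx 1.33 \;\rightarrow\; 1 \;(\text{--}),
\qquad
\frac{5 + 5 + 5}{3} = 5 \;\rightarrow\; 5 \;(\text{++})
\]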

Furthermore, for an overview of all the task scores per tool, figures 6 to 9 in Appendix B show all the post-task questionnaire scores per task. Additionally, figures 10 to 13 in Appendix B show all the task duration times per task in minutes. It is noticeable that the task duration time of Virtuoso Conductor is high for every task and reaches its maximum in task 3 and task 4, since these were not completed within the time frame. Furthermore, the task times differ per tool and per task and do not point to one particular tool, besides Virtuoso Conductor, with an overall high task time duration.

6.2 Answering the research questions

From the results, it is noticeable that there are components that disrupt the usability of the tools. It starts with the installation process. Many tools are not designed for a Windows operating system, thereby complicating the installation process on those machines, which represent the majority of computer users. Answering the first subquestion, To what extent does the tool focus on the user interface?: it is noticeable that most tools, except Virtuoso Conductor, offer a clear interface; however, the stability and help system are lacking overall. Virtuoso Conductor most definitely does not have its focus on the user interface, whereas the other tools try to optimize this part of the usability.

The second subquestion, What are the differences between RDF data systems and other graph-based systems?, starts with the creation of the data. Whereas the RDF data systems allow for a .ttl file upload, Neo4j does not. In Neo4j, one has to upload a .txt file or write code. Once the data is there, the representation of the data also differs. In RDF data systems, the data is often represented in a table form, while in Neo4j the data is presented as nodes containing attributes. The editing and deleting process is also performed with code in Neo4j, whereas for the other tools the UI provides some editing features.

The third and last subquestion, How can the user experience be improved?, starts, as previously mentioned, with the installation process. There should be executable installers (such as .exe files for Windows) available for all operating systems, which can be downloaded and installed with two clicks. Furthermore, the UI of Virtuoso Conductor needs to be simplified by categorizing the tabs properly, considering that the UI is of major importance for the usability; a better UI lowers the barrier for end-users to utilize the tool. Additionally, Neo4j has to vigorously stabilize its tool and login procedure. Stardog has to make the browser options functional, Blazegraph should stabilize the connection to the server and AllegroGraph has to fix its login procedure. Another important RDF-related process that all tools fail at, except Blazegraph, is the editing process. This process should be easier, especially by improving the exploration of the data and by making it possible to delete certain triples instead of the whole dataset.

Table 4 answers the general research question: What usability problems do users face when interacting with RDF data storage managers? It lists all the encountered problems that occur in one or more tools.

6.2.1 Guidelines. After evaluating the tools, it was evident that some usability problems occur in all the tools. The following guidelines serve as general recommendations to improve the overall usability:

(1) Make the installation process straightforward
(2) Keep the interface clean
(3) Provide feedback and a help system
(4) Make the data visible

1. Create one full package that can be installed from one source. Make clear distinctions between the different editions and operating systems.
2. Avoid countless tabs and ambiguous names. Group the features and place them under one tab.
3. Always show feedback, even when actions are performed correctly. Explain errors in a comprehensible manner and provide a help system that can be consulted when needed. Add general examples in the help system.
4. Show the data, in whatever format, to the users. Additionally, make this data editable without having to use a SPARQL query.

6.3 Discussion points

Most of the tools are not Windows-friendly, which forced me to boot a hard disk with Ubuntu in order to install the programs. The installation process was very difficult for a non-developer who has never booted and/or used Linux. The methodology of the testing is obviously questionable, especially since I tested the tools myself. The Hawthorne effect is present, due to my awareness of the recordings. Furthermore, the results could be very different with larger testing groups, since in this study there is only one tester. Moreover, the purpose of the evaluated tools may differ, which could affect the usability results of the evaluation. Besides, for the summary of the evaluation framework (table 3), the valuation of the RDF-related criteria assumes an equal weighting of the difficulty, satisfaction and time scores.

7 CONCLUSION

The focus of this study was on the evaluation framework, which cost time that could otherwise have been spent on recruiting testers and on increasing the number of tools, which was certainly intended. My knowledge of RDF statements is very limited, which led to, on the one hand, interesting results in terms of usability for inexperienced end-users. On the other hand, the results are therefore exclusively representative for end-users with limited knowledge of RDF data management. This paper shows the usability issues of Virtuoso Conductor, Stardog, Blazegraph, AllegroGraph WebView 6.4.1, ClioPatria and Neo4j according to their effectiveness, efficiency and satisfaction.


Criterion | Virtuoso Conductor | Stardog | Blazegraph | AllegroGraph WebView 6.4.1 | ClioPatria | Neo4j
General
Evaluate the installing process | - | - | ++ | -- | + | ++
Evaluate the clarity of the interface | - | ++ | ++ | ++ | ++ | ++
Evaluate the speed of updating after new data is inserted | ++ | ++ | ++ | ++ | ++ | ++
Is the meaning of the commands clear? | 0 | ++ | ++ | ++ | ++ | ++
Evaluate the stability of the tool | ++ | 0 | 0 | 0 | ++ | --
Does the tool provide feedback? | + | ++ | - | + | ++ | ++
Evaluate the help system | - | - | -- | -- | + | --
RDF
Evaluate the creating process of RDF data | - | ++ | 0 | + | ++ | --
Evaluate the query process of RDF data | 0 | + | ++ | + | ++ | +
Evaluate the editing process of RDF data | -- | - | + | - | 0 | --
Evaluate the deletion process of RDF data | -- | ++ | + | + | - | ++

Table 3: A summary of the evaluation framework results. A plus (+) means positive, while a double plus (++) means very positive. A zero (0) means reasonable, a minus (-) is negative, while a double minus (--) means very negative.

Usability problems
1. Installation difficulties
2. Complex interface
3. Ambiguous labels on the tabs
4. Non-functional features
5. Instability of the tools during login or during the tasks
6. Insufficient feedback
7. Non-existing or limited help system
8. Created or uploaded data is hidden
9. No option to edit data

Table 4: Usability problems

It describes what should be considered for improvement, which lowers the barrier of Linked Data tools. Lowering the barriers, in terms of usability, encourages new and inexperienced users to get involved with Linked Data tools. This is particularly desirable, given all the unregulated data on the Web and the need for structured and reusable data.

8 FUTURE WORK

For an improved future study, it is essential to add more participants to obtain more reliable results. Furthermore, using domain-expert participants with more detailed CRUD tasks will lead to a deeper insight into usability problems that may not have been brought to light in this study.

Moreover, as previously mentioned, the selection of six tools can be extended and/or applied to other categories of the Linked Data life cycle. Additionally, the evaluation framework can be enlarged and/or other usability evaluation methods can be added for more detailed results.

ACKNOWLEDGMENTS

I would like to thank my supervisor Ali Khalili for his valuable feedback, time and enthusiasm. Furthermore, I would like to thank Frank Nack for his time and assistance as a second reader, but especially as a teacher during this whole year.

REFERENCES

[1] Daniel J. Abadi, Adam Marcus, Samuel R. Madden, and Kate Hollenbach. 2009. SW-Store: a vertically partitioned DBMS for Semantic Web data management. The VLDB Journal 18, 2 (2009), 385–406.

[2] Saleh Albahli and Austin Melton. 2016. RDF Data Management: A Survey of RDBMS-Based Approaches. In Proceedings of the 6th International Conference on Web Intelligence, Mining and Semantics. ACM, 31.

[3] Medha Atre, Jagannathan Srinivasan, and James Hendler. 2008. BitMat: A main-memory bit matrix of RDF triples for conjunctive triple pattern queries. In Proceedings of the 2007 International Conference on Posters and Demonstrations - Volume 401. CEUR-WS.org, 1–2.

[4] Sören Auer, Lorenz Bühmann, Christian Dirschl, Orri Erling, Michael Hausenblas, Robert Isele, Jens Lehmann, Michael Martin, Pablo N. Mendes, Bert Van Nuffelen, and others. 2012. Managing the life-cycle of linked data with the LOD2 stack. In International Semantic Web Conference. Springer, 1–16.

[5] Joey Benedek and Trish Miner. 2002. Measuring Desirability: New methods for evaluating desirability in a usability lab setting. Proceedings of Usability Professionals Association 2003, 8-12 (2002), 57.
