
Design and Implementation of Full Stack Search: The Case of Unified Multilanguage Reader

Bachelor Thesis

Feiko Ritsema

Supervisors:

Dr. M. F. Lungu Dr. V. Andrikopoulos

Faculty of Science and Engineering

Software Engineering Department


Abstract

This thesis describes the design and implementation of full-stack search and subscription functionality, together with relevant research into search techniques and user interaction. It first presents and motivates the chosen implementation and architecture. Second, it demonstrates how the selected approach provides a good level of usability. Finally, it presents the evaluation of the search and subscription functionality by test users, which shows that the usability of the application improves because the added functionality personalizes the content to a greater extent.

Contents

1 Introduction
  1.1 Structure
2 Related Work
  2.1 Zeeguu
    2.1.1 Zeeguu Core
    2.1.2 Zeeguu API
    2.1.3 Zeeguu Unified Multilanguage Reader
  2.2 Related Applications
    2.2.1 Google
    2.2.2 Feedly
3 Problem Description
4 User Experience and Interaction Design
  4.1 Old user interaction overview
  4.2 New user interaction decisions
    4.2.1 Adding dialogs
    4.2.2 Search
    4.2.3 Naming conventions
5 Architecture and implementation
  5.1 Zeeguu-Core back-end
    5.1.1 Model
    5.1.2 Mixed recommender
    5.1.3 Content Configuration Hashing
  5.2 Zeeguu-API
    5.2.1 Search
    5.2.2 Topics
    5.2.3 User Languages
  5.3 Unified Multilanguage Reader architecture
    5.3.1 Dialogs
    5.3.2 Search
    5.3.3 Events
6 Results and Evaluation
  6.1 Questionnaire Results
7 Conclusion
8 Future Work
  8.1 Ordering
  8.2 Natural Language Processing
  8.3 AI to predict what users like
Bibliography
Appendix
A Workflow
  A.1 Iterations
  A.2 Continuous feedback loop
  A.3 Regular meetings
B User Feedback

Chapter 1 Introduction

More and more people try to learn a new language, whether to communicate better when travelling, to keep mentally fit, or purely out of interest in the language or country. Most of them do not know where to start: there are many options, and choosing the right one can be a challenge. With new technologies and a highly mobile and connected society, learning languages online is on the rise.

There are many language learning applications and websites on the internet, such as Duolingo, Babbel and Memrise. All these applications have one thing in common: they try to teach users a new language (mostly) by teaching the broad and basic vocabulary and grammar of that language. Because everyone is taught the same words and sentences, learners may have difficulties speaking in real-life situations: there is no average user, as every user has different interests and favourite topics. Moreover, one needs the ability to understand full sentences and context, which is often not taught in vocabulary learners. Challenging the idea that such vocabulary learners are needed, Atilgan concluded that: “The constant repetition of words in an extensive text has the potential to lead the learner to pick up words subconsciously that will subsequently contribute to vocabulary growth and better writing in terms of vocabulary and content.” [1]. Besides the context, real-life situations call for other sentences and words than the idealized, strictly correct learning situation. Timothy Bell concludes that extensive reading is highly effective for improvement in reading and overall development: “Extensive reading programs can provide very effective platforms for promoting reading improvement and development from elementary levels upwards.” [2].

The Zeeguu project was built to facilitate the acquisition of foreign languages to a greater extent and with greater ease than some of its predecessors. The goal is to create an online personalized textbook in which the user can read articles in preferred languages and translate words when necessary. “Instead of manually adding words to an external vocabulary practice system, the most relevant unknown words encountered in the readings are automatically scheduled for practice, when possible in the context of the original text, since learning a word in context is more effective.” [3]. These personalized words can later be practiced in personalized exercises that have been proven effective [4]. This learning from context shows great promise: “Whereas learning from context is demonstrably more difficult in a second language, second-language readers have been shown to gain significant word knowledge simply from reading, and increasing second-language students’ volume of reading has been found to produce significant gains in vocabulary knowledge and other aspects of linguistic proficiency.” [5] To increase the level to which the reader can be personalized into a personal textbook, the user can select sources to read from in the current Unified Multilanguage Reader [6] [7]. This, however, is still not a highly convenient and personalized environment, as such a reader would require a more specific selection of the articles the user wants to read.

1.1 Structure

This document describes the Online Practice Platform. The next sections are focused on describing the process of analyzing, developing and evaluating the essential and minimal building blocks for a language practice tool. Thus, this paper is structured in the following manner:

1. Related Work: This section investigates the research done in the field, the related applications that solve similar problems, and the direction of this project.

2. Problem Description: This part defines the problem and lists the necessary requirements to solve the problem at hand.

3. User Experience and Interaction Design: This section describes the solution to the problem and its visual implementation.

4. Architecture and Implementation: This is about the architecture and back-end decisions that have been made for the platform.

5. Evaluation: The results and insights gained are described in this part.

6. Conclusion: This consists of a reiteration and draws the final conclusions.

7. Future Work: This section discusses future work and possible improvements to the system.


Chapter 2

Related Work

Several applications are related to what Zeeguu is trying to achieve, although each takes a different approach. This chapter considers the existing work on the Zeeguu project as well as two applications related to the reader: Google, the well-known online search engine, and Feedly, an online platform for reading news articles in which users can customize the feed they read from.

2.1 Zeeguu

The Zeeguu Ecosystem presented by Lungu [3] [8] is an ecosystem consisting of several separate components, all designed to interact with each other in a particular way. There is the Zeeguu-Core for the back-end computations and model, the API, whose endpoints expose the needed methods to other applications interacting with the Core, and lastly the Unified Multilanguage Reader, which serves as a user interface for finding and reading articles.


2.1.1 Zeeguu Core

The Zeeguu Core¹ is the main model behind the entire Zeeguu infrastructure and the back-end for retrieving and changing data in the model.

2.1.2 Zeeguu API

The Zeeguu API² is a RESTful API which exposes the back-end to other applications. Through the API, other applications such as the reader and mobile applications can GET data from or POST data to the back-end, as these applications cannot communicate directly with the Zeeguu Core.

2.1.3 Zeeguu Unified Multilanguage Reader

The Zeeguu Unified Multilanguage Reader³ described by van den Brand and Chirtoaca is the main user interface of the system, where users can select and read articles. The Unified Multilanguage Reader communicates with the API to access the model and to be able to function properly.

2.2 Related Applications

This section considers the applications that are closest to the Unified Multilanguage Reader and therefore to this project.

2.2.1 Google

Google is a very well-known search engine that can be used to search virtually anything on the web, which amounts to a very large volume of data. As the aim of this project is to search through many articles efficiently, the purpose and idea of Google are closely related to parts of this thesis. The efficient and well-developed way of searching and ranking that Google uses could be useful when developing full-stack search [9].

¹ https://github.com/zeeguu-ecosystem/Zeeguu-Core

² https://github.com/zeeguu-ecosystem/Zeeguu-API

³ https://github.com/zeeguu-ecosystem/Unified-Multilanguage-Reader

2.2.2 Feedly

Feedly is an online news reader where users can create their own custom feeds based on sources and interests. The user interface lets users add standard topics through hashtags, and users can also search for any name or keyword in the search bar. Users can then choose the language in which the articles for that content should be displayed. Feedly therefore has quite a few similarities to the Unified Multilanguage Reader, though it can only be used for reading articles in different languages. The Zeeguu project additionally offers the functionality to translate words live and to practice these words later in the exercises. The purposes of the two applications therefore differ: Feedly aims to offer a personalized feed of articles in your own language, while Zeeguu tries to offer a personalized feed of articles in a language the user wants to learn, combined with the capability of translating and practicing words.


Chapter 3

Problem Description

The current news reader in the Zeeguu ecosystem, built by van den Brand [7] and Chirtoaca [6], suffers from a few issues. Van den Brand concluded:

“Many users would like to be able to filter articles based on content. A search feature or similar query tool is often given as an example.” [7]

Furthermore, some of the feedback of a user group who used the platform for a month was:

• ”I would like to avoid articles with information about accidents with human casualties”

• ”Better display of the articles and tags such as Gaming or News”

• ”Add a choice for different topics not only for the sources”

• ”... Order articles in different subjects like Animals, Fashion and not Politics”

These issues come from three sources: the evaluation by the creators of the Unified Multilanguage Reader, the users who used the platform for a month, and the future work section of [3]: “Second, improved content recommenders and difficulty estimators are needed in order to provide an even better personalization of content and thus increase learner interest and motivation.” From them one can conclude that the main problem users are facing is that there is no way to filter articles by content, and that the currently available sources are not focused enough, so users have no real means of selecting content they like or excluding content they dislike.

RQ: How can the level of personalization of the reader be increased without substantially modifying the current architecture?

Based on this research question, the hypothesis is that content filtering functionality is sufficient for users to personalize the article feed to the desired level. This functionality includes, but is not limited to, live filtering and searching for specific content, subscribing to pre-defined and custom topics, and filtering out pre-defined topics and custom filters. With these extra features, users can personalize the feed of articles to a great extent: they can subscribe to favorite topics and filter out topics they really do not want to read about.

These features should be implemented in a way that the features are:

• Intuitive: without any extra explanation, the user should be able to understand and use the features.

• Fast: the system must respond and return feedback to the user fast in order to be user-friendly.

• Expressive: the system must be expressive, and therefore the user must be able to find the desired content.

In conclusion, there is a user-driven need for accurate content filtering functionality to increase the joy and usability of the Unified Multilanguage Reader. At the least, this functionality must be easy to use, self-contained, correct and efficient, so that it increases the level to which the reader is personalized for the user.


Chapter 4

User Experience and Interaction Design

This chapter describes the user experience and interaction design decisions. First, an overview of the old system is given, followed by the newly implemented design choices and the corresponding motivation and explanation. Throughout this chapter, ‘the reader’ refers to the Unified Multilanguage Reader.

4.1 Old user interaction overview

In the old Unified Multilanguage Reader, shown in Figure 4.1, Google’s Material Design was used as the base of the design: “... the Material Design movement initiated by Google fits our particular needs in many ways.” [7].

The reason for this was the bright but minimal design, which is tactile and intuitive for the prospective users. Furthermore, some choices were made concerning the positioning of elements, the introduction of new users, and the use of, for example, Material Light Icons. With this, the creators laid out a strong, maintainable code base and a user-friendly design for the reader to be built upon and improved further, together with the well-documented GitHub pages¹ about the project, its dependencies and development. When the user selects an article to read, the user is redirected to the actual reading environment [6], which is out of the scope of this thesis and therefore not included in the design decisions made.

¹ https://zeeguu-ecosystem.github.io/Unified-Multilanguage-Reader/

Figure 4.1: The original Unified Multilanguage Reader

4.2 New user interaction decisions

In order to accommodate the requirement that the reader be more personalized for its users, several changes had to be made, each of which is elaborated on in its own section. As this thesis is about personalizing and thus filtering the user content, and not about the presentation of the actual articles and reading, the article presentation and the actual reader are kept as they are, since they have proved to work well. The majority of the visual changes take place in the top right menu, which used to be called “My Sources”. A brief overview of the new reader is given, along with the design choices for:

• Dialogs for adding topics or filters

• Search bar and notification

• Naming conventions


Figure 4.2: The new overview of the right menu with nothing selected

In Figure 4.2 the new content menu is displayed. Noticeable changes include:

1. Renaming of ’My Sources’ to ’My Content’ as this pane is not only about the sources anymore. Instead, the pane functions as a content selector, where users can personalize the content of articles shown.

2. The material design search icon which, when clicked, expands into a search bar. More on this in a later section.

3. The addition of three chips, for adding languages, ‘Interesting content’² and ‘Not Interesting content’. These reuse the existing chip style to stay in line with the design of the reader.

In Figure 4.3, one language, ‘English’, has been selected, along with two topics: the predefined topic ‘Sport’ and the custom topic ‘California’. The articles for the new selection of content are immediately displayed in the article list in the main view. Some visual indentation of the topics and languages has been added in order to distinguish them from the chips that open the dialogs. Furthermore,

² This consists of predefined topics and custom topics

Figure 4.3: View after having selected a language and two topics

all of the chips which indicate content filtering have a material cancel icon in order to remove the content filter easily and with one click.

Figure 4.4: View after adding ‘Six nations’ to the ‘Not Interesting’ list

In Figure 4.4, a filter has been created for ‘Six nations’. As one might notice, the article about the Six Nations is immediately removed from the article list. Custom topics, searches and filters are not case-sensitive, in order to be more user-friendly.

4.2.1 Adding dialogs

The main purpose of these dialogs is to add particular content to the selection of the user. Therefore these dialogs must be clear, easy to use and concise. Furthermore one of the most important things about the dialogs is to keep the design in line with the rest of the reader, and make sure the dialogs scale well to smaller devices like smartphones and tablets, as the reader is to be used on multiple devices.

Figure 4.5: The dialog that opens when one wants to add a language

When a user clicks on the material add icon in the ‘Languages’ chip, for example, a dialog window opens as demonstrated in Figure 4.5. When the user clicks on the ‘Close’ button or outside the dialog window, or presses the ‘esc’ key, the dialog closes and the user returns to the view in Figure 4.2.

In the actual dialog to add a language, illustrated in Figure 4.6, the user is presented with the languages not followed yet, accompanied by the adding icon and description text to clarify to the user.


Figure 4.6: The dialog for adding languages

Figure 4.7: The dialog for adding topics

The dialog for adding ‘Interesting’ content, displayed in Figure 4.7, is similar to the language dialog. The main difference is that the user can either choose from the predefined topics, which are topics created by the Zeeguu team, or add their own ‘custom’ topic. When clicking on the link ‘Or add your own topic!’ at the bottom of the dialog, the user is redirected to a new dialog, demonstrated in Figure 4.8. A separate dialog was chosen mainly to keep the design clean and easy to use, and to make sure the reader scales to smaller devices.


Figure 4.8: Adding a custom topic

As aforementioned, when the user clicks on the ’Or add your own topic!’ link, the custom topic dialog opens, as illustrated in Figure 4.8.

This dialog simply has an input text field where the user can input the topic or keywords to subscribe to, and two buttons. The properties of closing the dialog are the same as the others.

The ’Not Interesting’ dialog and the custom filter dialog, following the first, are similarly shaped and functional as the ’Interesting’ dialogs and therefore are not explicitly mentioned here. The reader is instead invited to visit https://zeeguu.unibe.ch/ and create a beta-tester account in order to try this out.


4.2.2 Search

Figure 4.9: A search for the term ’England’

The search bar in Figure 4.9 appears only when the user clicks the material search icon, which makes the search bar blend in smoothly.

As soon as the bar appears, the user can input search terms, and use the enter key to submit the search. The search that will be conducted is a live search and will return articles that match the particular search term(s).

Figure 4.10: The notification bar when searched for the term ’England’

After the user has searched for ‘England’, for example, by typing this into the search bar as in Figure 4.9 and pressing the enter key to submit, the side bar is closed and the user is presented with the articles that match the search. As one can see in Figure 4.10, the user is also presented with the search notification. The search notification is implemented in the style of the articles and is placed above the articles in the article list. The reason for the notification is that users should be aware that the current content in the article list is the content for a particular search term. Besides that, the notification and the articles remain in place after reading an article and returning to the list, so the user can further explore the articles for the search terms. The notification is easily removed by clicking on it; upon removal, the articles are reloaded for the actual content configuration the user has set in the menu.

4.2.3 Naming conventions

The naming of the content menu and its options went through several iterations. The first thought, from a computer scientist’s point of view, was to name the chips and dialogs ‘Topics’ and ‘Filters’, which seemed to be the logical option. When receiving feedback from actual users, however, this turned out not to be apparent: users assumed ‘Filters’ were content filters, like the topics. For this reason the final decision was to go with ‘Interesting’ content and ‘Not Interesting’ content.


Chapter 5

Architecture and implementation

In order to realize the design elaborated on in the previous chapter, the user interface changes alone are not enough. For the reader to work well, the back-end of the reader and the server, which comprise different components, need extended functionality. This chapter gives a rough overview of the most important and useful parts that were developed for the following components of the Zeeguu ecosystem [3]:

• Zeeguu Core The backbone of the system, retrieving, modifying and adding articles and other components in the Zeeguu model.

• SQL Database For the new topic and subscription management, several components of the database had to be changed or added.

• Zeeguu API The endpoints added to expose the core to the reader and other applications.

• Unified Multilanguage Reader The user interface to find and read articles.

A brief overview and explanation of all the parts will be given, accompanied by motivation for certain decisions that were made. The line number references given will only work for the current (July 2018) version of the Core, API and reader.


5.1 Zeeguu-Core back-end

5.1.1 Model

For the content management and subscriptions to be implemented, several extra database tables had to be added to the old model.

Figure 5.1: The new database additions

Only the additions are shown, for the purpose of having a clear overview and because the rest of the database is out of scope for this thesis. As a consequence, the foreign key constraints on languages, articles and users are not shown. For every article id, user id and language id there is a foreign key constraint with the corresponding ‘Article’, ‘User’ and ‘Language’ tables.

Article Word

The ArticleWord¹ class is used to store the keywords found in the articles, which are later used to perform search queries for the live article search. The ArticleWord class has a many-to-many relationship with the Article class, as multiple words can occur in an article and multiple articles can contain the same word. The words in ArticleWord are unique, as the table is used as a look-up table to find a search word and retrieve the articles matched with that particular word. The class has two fields:

• id : A unique and auto-incremented id.

• word : A string which is the keyword.

¹ https://github.com/zeeguu-ecosystem/Zeeguu-Core/blob/master/zeeguu/model/article_word.py
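The look-up role of the keyword table can be illustrated with a minimal in-memory sketch. It is not the Zeeguu-Core implementation: the class name KeywordIndex, its methods, and the choice to require every search term to match are assumptions made for illustration only.

```python
from collections import defaultdict


class KeywordIndex:
    """Hypothetical in-memory stand-in for the ArticleWord look-up table."""

    def __init__(self):
        # word -> set of article ids; each word is stored once (unique key),
        # and one article id may appear under many words (many-to-many).
        self._articles_by_word = defaultdict(set)

    def add_article(self, article_id, keywords):
        # Lower-case the keywords so that look-ups are case-insensitive.
        for word in keywords:
            self._articles_by_word[word.lower()].add(article_id)

    def find(self, search_terms):
        """Return the ids of articles that contain every search term (assumed semantics)."""
        terms = [t.lower() for t in search_terms.split()]
        if not terms:
            return set()
        result = set(self._articles_by_word.get(terms[0], set()))
        for term in terms[1:]:
            result &= self._articles_by_word.get(term, set())
        return result


index = KeywordIndex()
index.add_article(1, ["England", "rugby"])
index.add_article(2, ["England", "election"])
print(index.find("england"))        # ids of both articles
print(index.find("England rugby"))  # only the id of article 1
```

Keeping each word unique means a search term resolves to its articles with a single key look-up, which is the property the text attributes to the table.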

Articles Cache

The ArticlesCache² class is used to cache the articles found for certain content configurations. The class calculates the hash for the configuration of source subscriptions, language subscriptions, topic subscriptions, topic filters, search subscriptions and search filters (articles_cache.py#L39). This hash is calculated in such a way that the same configuration always yields the same result. The method that calculates the content hash takes all the content as parameters and uses the respective sorted ids to create the hash string, which is further explained in the Content Configuration Hashing section. As the hash is not linked to a user, it can be reused for every user with that particular configuration. The ArticlesCache class consists of 3 fields:

• id : A unique and auto-incremented id.

• content_hash : A string for the calculated hash.

• article_id : This links the hash to an article.
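A deterministic configuration key of this kind could be sketched as follows. This is a minimal sketch, not the code in articles_cache.py: the function name content_hash, the type prefixes ("lan", "top", ...) and the separators are hypothetical; only the use of sorted ids comes from the description above.

```python
def content_hash(languages, topics, topic_filters, searches, search_filters, sources=()):
    """Build a deterministic key from the ids of a content configuration.

    The prefixes and separators are assumptions made for this sketch.
    """
    parts = [
        ("lan", languages),
        ("top", topics),
        ("fil", topic_filters),
        ("sea", searches),
        ("sef", search_filters),
        ("sou", sources),
    ]
    # Sorting the ids makes the key independent of the order in which
    # the user subscribed to the individual items.
    return " ".join(
        prefix + ":" + ",".join(str(i) for i in sorted(ids))
        for prefix, ids in parts
    )


a = content_hash(languages=[2, 1], topics=[7, 3], topic_filters=[], searches=[4], search_filters=[])
b = content_hash(languages=[1, 2], topics=[3, 7], topic_filters=[], searches=[4], search_filters=[])
print(a == b)  # True: same configuration, same key
```

Because the key depends only on the ids and not on the user, two users with the same configuration map to the same cache entries.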

Topic

This class is the general Topic³ in English. This topic is used for the mapping of the articles to topics in the article topic map. Furthermore, this general variant of the topic is displayed in the reader for the articles and in the dialog. The Topic class contains only two fields:

• id : A unique and auto-incremented id.

• title : The English name of the topic as a string.

² https://github.com/zeeguu-ecosystem/Zeeguu-Core/blob/master/zeeguu/model/articles_cache.py

³ https://github.com/zeeguu-ecosystem/Zeeguu-Core/blob/master/zeeguu/model/topic.py

Localized Topic

The LocalizedTopic⁴ is the localized version of a topic. It is the same as the corresponding topic, but for a specific language. This way, each topic has a localized topic for every language. The localized topic is used to map the topics to articles, as it contains the keywords, in each language, for which the topic should be added to an article. The LocalizedTopic has the following 5 fields:

• id : A unique and auto-incremented id.

• topic_id : This links to the topic it belongs to.

• language_id : This links to the language it belongs to.

• topic_translated : This is the translation of the topic title for the given language.

• keywords : This is a string with all the space-separated keywords for the particular language and topic.
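How space-separated keywords per language could drive the mapping of topics to articles is sketched below. The dataclass mirrors the fields listed above, but the function topics_for_article and the choice to match keywords against the words of the article text are assumptions; the thesis does not state what the keywords are compared with.

```python
from dataclasses import dataclass


@dataclass
class LocalizedTopic:
    topic_id: int
    language_id: int
    topic_translated: str
    keywords: str  # space-separated, as in the field description


def topics_for_article(article_text, language_id, localized_topics):
    """Return the ids of topics with at least one keyword in the text (assumed rule)."""
    words = set(article_text.lower().split())
    matched = set()
    for lt in localized_topics:
        # Only the localized topics of the article's own language apply.
        if lt.language_id != language_id:
            continue
        if words & set(lt.keywords.lower().split()):
            matched.add(lt.topic_id)
    return matched


# Hypothetical data: language 10 = English, language 20 = Dutch.
topics = [
    LocalizedTopic(1, 10, "Sport", "rugby football tennis"),
    LocalizedTopic(1, 20, "Sport", "rugby voetbal tennis"),
    LocalizedTopic(2, 10, "Politics", "election parliament"),
]
print(topics_for_article("England wins the rugby match", 10, topics))
```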

Topic Subscription

The TopicSubscription⁵ class contains the user topic subscriptions. This is used to keep track of which user is subscribed to which topic, only for the pre-defined topics. The subscription is created when a user subscribes to a topic, and deleted when the user unsubscribes. Logically the fields in this class are:

• id : A unique and auto-incremented id.

• user_id : This links the subscription to a user.

• topic_id : This links the subscription to a particular topic in the Topic table.

⁴ https://github.com/zeeguu-ecosystem/Zeeguu-Core/blob/master/zeeguu/model/localized_topic.py

⁵ https://github.com/zeeguu-ecosystem/Zeeguu-Core/blob/master/zeeguu/model/topic_subscription.py

Topic Filter

The TopicFilter⁶ class contains the user topic filter subscriptions. This is used to keep track of which user is subscribed to which topic filter, only for the pre-defined topics. The filter subscription is created when a user subscribes to a topic filter, and deleted when the user unsubscribes. Logically the fields in this class are:

• id : A unique and auto-incremented id.

• user_id : This links the filter subscription to a user.

• topic_id : This links the filter subscription to a particular topic in the Topic table.

Search

A Search⁷ is a string which any user can enter in the user interface. When a user uses the instant search, no new object is created; a Search object is created only when a user subscribes to a search or adds it as a filter. When a user unsubscribes from a search or search filter, the Search object is removed again. The Search class is a simple class consisting of:

• id : A unique and auto-incremented id.

• keywords : A string containing the keywords for which a user searched.

Search Subscription

The SearchSubscription⁸ class contains the user search subscriptions. This is used to keep track of which user is subscribed to which search. The subscription is created when a user subscribes to a search, and deleted when the user unsubscribes. Logically the fields in this class are:

• id : A unique and auto-incremented id.

• user_id : This links the subscription to a user.

• search_id : This links the subscription to a particular search in the Search table.

⁶ https://github.com/zeeguu-ecosystem/Zeeguu-Core/blob/master/zeeguu/model/topic_filter.py

⁷ https://github.com/zeeguu-ecosystem/Zeeguu-Core/blob/master/zeeguu/model/search.py

⁸ https://github.com/zeeguu-ecosystem/Zeeguu-Core/blob/master/zeeguu/model/search_subscription.py

Search Filter

The SearchFilter⁹ class contains the user search filter subscriptions. This is used to keep track of which user is subscribed to which search filter. The search filter subscription is created when a user subscribes to a search filter, and deleted when the user unsubscribes. Logically the fields in this class are:

• id : A unique and auto-incremented id.

• user_id : This links the filter subscription to a user.

• search_id : This links the filter subscription to a particular search in the Search table.

User Language

In the old model, the user learned only one language. Therefore the user class contained the learned language and the native language of a user. With the new model, a user can read in more than one language, and therefore the UserLanguage¹⁰ class is needed. This class keeps track of which user learns which language, and is thus a personalized version of a language. Besides that, the class keeps track of whether the user is reading news in the language and whether the user is doing exercises in the language. Lastly, it keeps track of the inferred and declared level in each language. The fields used for this are:

• id : A unique and auto-incremented id.

• user_id : This links the user language to a user.

• language_id : The language this user language is about.

• declared_level : A user-declared level of the language.

• inferred_level : The inferred level, based on what the user is reading and translating.

• reading_news : A boolean set to true if the user has selected this language in the reader.

• doing_exercises : A boolean set to true if the user wants to do exercises in this language.

⁹ https://github.com/zeeguu-ecosystem/Zeeguu-Core/blob/master/zeeguu/model/search_filter.py

¹⁰ https://github.com/zeeguu-ecosystem/Zeeguu-Core/blob/master/zeeguu/model/user_language.py
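The per-user language record can be pictured as a small data structure. The sketch below mirrors the listed fields in a plain dataclass; the integer type of the levels, the defaults, and the helper reading_languages are assumptions for illustration and are not taken from user_language.py.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserLanguage:
    user_id: int
    language_id: int
    declared_level: Optional[int] = None  # level type is an assumption
    inferred_level: Optional[int] = None
    reading_news: bool = False
    doing_exercises: bool = False


def reading_languages(user_id, user_languages):
    """Hypothetical helper: language ids the user has selected in the reader."""
    return sorted(
        ul.language_id
        for ul in user_languages
        if ul.user_id == user_id and ul.reading_news
    )


rows = [
    UserLanguage(user_id=1, language_id=10, reading_news=True, doing_exercises=True),
    UserLanguage(user_id=1, language_id=20, reading_news=False, doing_exercises=True),
    UserLanguage(user_id=1, language_id=30, reading_news=True),
    UserLanguage(user_id=2, language_id=20, reading_news=True),
]
print(reading_languages(1, rows))  # [10, 30]
```

Separating reading_news from doing_exercises lets one user read in several languages while practicing only some of them.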

5.1.2 Mixed recommender

The mixed recommender¹¹ serves mostly as a central point for retrieving the recommended articles and the articles for a live search. Furthermore, the mixed recommender has a method recompute_recommender_cache_if_needed (mixed_recommender.py#L35), which is called on every change of the content configuration of a user. The method computes the articles for the configuration, gets the calculated hash and adds all the articles for the hash. The articles are computed in the way described below.

    user_articles = get articles for the user sources and languages
    user_topic_articles = get articles for the user topics
    if user_topic_articles is not empty
        user_articles = intersect user_articles list with user_topic_articles list
    user_filter_articles = get articles for the user filters
    if user_filter_articles is not empty
        user_articles = get the difference between user_articles list and user_filter_articles list
    sort the user_articles list on most recent date
    return user_articles

¹¹ https://github.com/zeeguu-ecosystem/Zeeguu-Core/blob/master/zeeguu/content_recommender/mixed_recommender.py

Where the get methods perform the queries to the database in order to get the correct articles for the particular topics, filters, languages or sources.
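The steps above can be sketched in plain Python. This is a minimal sketch rather than the code in mixed_recommender.py: the database queries are replaced by lists passed in as arguments, and the function name `recommend` and the dictionary keys `id` and `published` are hypothetical.

```python
from datetime import date


def recommend(source_language_articles, topic_articles, filter_articles):
    """Combine the three article lists; each article is a dict with 'id' and 'published'."""
    articles = {a["id"]: a for a in source_language_articles}

    # Topics narrow the selection, but only when at least one topic article exists.
    if topic_articles:
        topic_ids = {a["id"] for a in topic_articles}
        articles = {i: a for i, a in articles.items() if i in topic_ids}

    # Filters remove articles from whatever is left.
    if filter_articles:
        filter_ids = {a["id"] for a in filter_articles}
        articles = {i: a for i, a in articles.items() if i not in filter_ids}

    # Most recent article first.
    return sorted(articles.values(), key=lambda a: a["published"], reverse=True)


base = [
    {"id": 1, "published": date(2018, 6, 1)},
    {"id": 2, "published": date(2018, 6, 3)},
    {"id": 3, "published": date(2018, 6, 2)},
]
result = recommend(base, topic_articles=[base[1], base[2]], filter_articles=[base[2]])
result_ids = [a["id"] for a in result]
unfiltered_ids = [a["id"] for a in recommend(base, [], [])]
```

Note that an empty topic list leaves the selection untouched instead of emptying it, which mirrors the "is not empty" checks in the pseudocode.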

5.1.3 Content Configuration Hashing

As described in the Articles Cache part in the model of the Zeeguu Core, the user configuration is hashed and the articles are cached for this hash when a user changes the content configuration in the reader.

The hash is calculated for the entire content configuration of the user (articles_cache.py#L39), which includes the languages, topics, filters, searches and search filters. The decision was made to hash the entire configuration instead of the separate components, as this increases the speed.

Instead of having to make up to 10-15 queries to the database, a single query with the entire hash suffices. The price of this speed is that all the user configurations have to be stored, but the storage space used for the cache is a justifiable trade-off for the faster response the user experiences.

The hash is calculated based on the ids of all the aforementioned components. The first letter(s) of each content type are prepended to its ids, so that the hash is always the same for the same configuration and is intended to be unique for each configuration. The hashing should be improved by using a separator like a comma between the different ids, so the content configuration can be calculated from the hash. This makes the cache invalidation and recalculation easier. Some examples of hashed configurations:

• l57tfssf : languages with id 5 and 7 selected

• l57t11121314fssf : languages with id 5 and 7 selected, topics with id 11, 12, 13 and 14 selected.

• l57t12f11s11sf12 : languages with id 5 and 7 selected, topic with id 12 selected, filter with id 11 selected, search with id 11 selected and search filter with id 12 selected.
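The examples above can be reproduced with a few lines of Python. This is a sketch under assumptions, not the code in articles_cache.py: the function name `configuration_key`, the `separator` parameter and the sorting of the ids are all choices made for the illustration.

```python
def configuration_key(languages=(), topics=(), filters=(), searches=(),
                      search_filters=(), separator=""):
    """Build the cache key for a content configuration from the ids of its parts."""
    parts = [
        ("l", languages),
        ("t", topics),
        ("f", filters),
        ("s", searches),
        ("sf", search_filters),
    ]
    # Assumption: ids are sorted so the key does not depend on selection order.
    return "".join(
        prefix + separator.join(str(i) for i in sorted(ids))
        for prefix, ids in parts
    )


key = configuration_key(languages=[5, 7], topics=[12], filters=[11],
                        searches=[11], search_filters=[12])

# Without a separator, two different configurations can end up with the same key.
collision = configuration_key(languages=[57]) == configuration_key(languages=[5, 7])

# With a comma between the ids, they can be read back from the key.
separated = configuration_key(languages=[5, 7], topics=[11, 12], separator=",")
```

The `collision` check illustrates why the separator is worth adding: a language with id 57 and the pair of languages 5 and 7 produce the same string unless the ids are delimited.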


Because the hashed configurations are not linked to a user, the articles belonging to a configuration can be retrieved quickly. Furthermore, when other users have the same configuration, the articles do not have to be recalculated. Lastly, to make sure the articles stay up to date, the cache is invalidated and all the hashes and corresponding articles are recalculated every night.
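The sharing and the nightly recalculation can be illustrated with a small in-memory cache. The class `ArticlesCache`, its methods and the counter are hypothetical and only model the behaviour described here; the real cache lives in the database.

```python
class ArticlesCache:
    """Hypothetical in-memory cache keyed by configuration hash, not by user."""

    def __init__(self, compute_articles):
        self._compute = compute_articles  # expensive function: hash -> list of articles
        self._store = {}
        self.computations = 0  # counts how often articles had to be computed

    def get(self, key):
        # Compute only on a miss; any user with the same hash reuses the entry.
        if key not in self._store:
            self._store[key] = self._compute(key)
            self.computations += 1
        return self._store[key]

    def recompute_all(self):
        """Nightly job: refresh the articles of every stored configuration."""
        for key in list(self._store):
            self._store[key] = self._compute(key)
            self.computations += 1


cache = ArticlesCache(lambda key: ["article for " + key])
first = cache.get("l57tfssf")   # first user: computed
second = cache.get("l57tfssf")  # second user, same configuration: served from cache
computed_before_night = cache.computations
cache.recompute_all()
```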


5.2 Zeeguu-API

The Zeeguu API contains all the endpoints that can be accessed by applications like the Unified Multilanguage Reader, the Chrome extension and mobile applications. In order to use the new core functionality in the Unified Multilanguage Reader front-end, several new endpoints are required to interact with the core back-end. These new endpoints are listed in this section, each with a short explanation of how it works.

5.2.1 Search

The Search endpoints¹² are used to get the articles for an instant search, subscribe to and unsubscribe from searches, and get the subscribed searches. The last three endpoints mentioned here are also available for search filters, but as these are exactly the same except for the name, they are not explained separately.

• GET search (search.py#L176) : A GET request that expects one parameter, a string with the search keywords. This request returns the articles as a JSON list.

• GET subscribe_search (search.py#L25) : This GET request expects one parameter, just like the instant search: a string with the search keywords. This is a GET and not a POST request (as one would expect) because the Search object is created on the back-end and returned in JSON format, to be displayed in the reader.

• POST unsubscribe_search (search.py#L47) : This POST request expects a search id as data and uses this id to find the particular search object and remove it, along with the corresponding subscription.

¹² https://github.com/zeeguu-ecosystem/Zeeguu-API/blob/master/zeeguu_api/api/search.py


• GET subscribed_searches (search.py#L76) : To get the subscribed searches, the method uses the Flask user to find the subscriptions and returns all the search objects as a JSON list.
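The subscription life cycle behind these endpoints can be modelled without a web framework. The sketch below is an in-memory stand-in, not the code in search.py: the class name, the JSON field names `id` and `search`, and the "OK"/"FAIL" return values are assumptions.

```python
import json
from itertools import count


class SearchSubscriptions:
    """Hypothetical in-memory model of the subscribe/unsubscribe behaviour."""

    def __init__(self):
        self._ids = count(1)
        self._searches = {}  # search id -> keywords

    def subscribe_search(self, keywords):
        # The Search object is created here, so it is handed back as JSON.
        search_id = next(self._ids)
        self._searches[search_id] = keywords
        return json.dumps({"id": search_id, "search": keywords})

    def unsubscribe_search(self, search_id):
        # Removing the search also removes the subscription in this model.
        removed = self._searches.pop(search_id, None)
        return "OK" if removed is not None else "FAIL"

    def subscribed_searches(self):
        return json.dumps(
            [{"id": i, "search": k} for i, k in self._searches.items()]
        )


api = SearchSubscriptions()
created = json.loads(api.subscribe_search("world cup"))
api.subscribe_search("birds")
status = api.unsubscribe_search(created["id"])
remaining = json.loads(api.subscribed_searches())
```

The reader needs the id returned by the subscribe call, because that id is what a later unsubscribe request has to send.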

5.2.2 Topics

The Topic endpoints¹³ are used to get the topics, subscribe to and unsubscribe from topics, and get the subscribed topics. The last three endpoints mentioned here are also available for topic filters, but as these are exactly the same except for the name, they are not explained separately.

• GET interesting_topics (topics.py#L95): Interesting topics are defined as the topics that a user has not followed or filtered yet. The GET interesting_topics request returns these as a JSON list of topic objects.

• POST subscribe_topic (topics.py#L23): This POST request expects the topic id as data in the request and uses this to find the corresponding topic and create the subscription.

• POST unsubscribe_topic (topics.py#L44): This POST request expects the topic id as data in the request and uses this to find the corresponding topic and delete the subscription.

• GET subscribed_topics (topics.py#L69): The GET subscribed_topics request simply returns a JSON list of subscribed topics for the particular user.
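The definition of an interesting topic amounts to a set difference, which the following sketch makes explicit. The function and the topic dictionaries are hypothetical; the real endpoint reads the subscriptions and filters from the database.

```python
def interesting_topics(all_topics, subscribed_ids, filtered_ids):
    """Return the topics the user has neither followed nor filtered."""
    seen = set(subscribed_ids) | set(filtered_ids)
    return [topic for topic in all_topics if topic["id"] not in seen]


topics = [
    {"id": 1, "title": "Sport"},
    {"id": 2, "title": "Politics"},
    {"id": 3, "title": "Culture"},
]
suggestions = interesting_topics(topics, subscribed_ids=[1], filtered_ids=[3])
```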

5.2.3 User Languages

As defined in the core-model section, a UserLanguage is a personalized version of a language. It stores data about the user with respect to the particular language. There are several endpoints¹⁴ used for the UserLanguage.

¹³ https://github.com/zeeguu-ecosystem/Zeeguu-API/blob/master/zeeguu_api/api/topics.py

¹⁴ https://github.com/zeeguu-ecosystem/Zeeguu-API/blob/master/zeeguu_api/api/user_languages.py


• GET user_languages (user_languages.py#L89): Returns a JSON list of user languages for the user that requested the list.

• POST user_languages/modify (user_languages.py#L27): The modify POST request for a user language is one of the more interesting ones. This POST request can take from one to four parameters in the body. The body must contain the language_id of the language the user wants to modify. Secondly, the request can contain two booleans, language_reading and language_exercises. Finally, the language_level can be added to modify the inferred level. If any of these last three is sent, the request will change the specific user language accordingly (user_languages.py#L51) and return "OK" for success.

• GET user_languages/delete (user_languages.py#L65): Deleting an entire user language should not be necessary, but is possible with this request by sending only the language_id.

• GET user_languages/interesting_for_reading (user_languages.py#L133): This quite specific endpoint returns all the languages that are interesting for reading for a particular user, which are defined as the languages that the user is not reading yet. It returns the languages as a JSON list.

• GET user_languages/reading (user_languages.py#L111): The last endpoint does the opposite of the aforementioned interesting_for_reading: it returns a JSON list of the languages the user is currently reading.
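The handling of the optional parameters in the modify request can be sketched as follows. This is an illustration, not the code in user_languages.py: the user language is a plain dictionary, the form values are assumed to arrive as strings, and the "FAIL" response for a missing language id is an assumption (the text only specifies "OK").

```python
def modify_user_language(user_language, form):
    """Apply the optional fields of a modify request to a user language dict."""
    if "language_id" not in form:
        return "FAIL"  # assumption: only the "OK" response is documented
    # Each optional parameter changes the record only when it was sent.
    if "language_reading" in form:
        user_language["reading_news"] = form["language_reading"] == "true"
    if "language_exercises" in form:
        user_language["doing_exercises"] = form["language_exercises"] == "true"
    if "language_level" in form:
        user_language["inferred_level"] = int(form["language_level"])
    return "OK"


german = {"language_id": 5, "reading_news": False,
          "doing_exercises": False, "inferred_level": 1}
status = modify_user_language(german, {"language_id": "5", "language_reading": "true"})
```

Because absent parameters are skipped, a request that only switches reading on leaves the exercise setting and the level as they were.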


5.3 Unified Multilanguage Reader architecture

The main goal for the Unified Multilanguage Reader was to keep the style in line with the already existing reader, i.e. Material Design Lite combined with several libraries, as the old reader is scalable and has a smooth user interface. This part elaborates on how the design given in chapter 4 was realized architecturally.

5.3.1 Dialogs

The dialogs used for adding languages, sources, interesting content and non-interesting content are created using the Sweetalert library. The reason for using Sweetalert is that these dialogs look good and scale well. The custom layout templates loaded in the Sweetalert dialog are already in the source, and the options which are loaded from the server are rendered using Mustache, after which the corresponding actions are attached to the different buttons in the dialog. When the user adds a certain option to the content, this option is removed from the list.

5.3.2 Search

There are several decisions that were made concerning the search interaction with the user, in order to make the interaction as user-friendly and effective as possible.

Material Search Bar

To make the content menu drawer as minimal and good-looking as possible, we decided to use the material search icon for the search and not show a standard text input box by default. When a user clicks the icon, the minimal input box opens and the user can enter the search query, and search with the return key. The search input is then 'validated' by checking that the input is not empty, and the search is performed. This is accompanied by closing the drawer, emptying the input, showing the search notification for the particular search term, and lastly loading the articles that match the search term.

Search Cookie

When a user searches for specific keywords, the articles that are found are returned and displayed in the article list by clearing the list and loading the appropriate articles. When a user opens an article to read it and returns to the article list afterwards, the search results should still be available, in order to minimize the number of actions a user has to perform. To achieve this, a cookie (main.js#L140) is stored locally with the particular search term. When the article list is loaded, the cookie is retrieved first, and if the cookie exists, the search articles are loaded from the cache (ArticleList.js#L135)¹⁵ and the user can continue looking through the search results. This cookie is only removed on clicking the search notification bar, and thus 'deleting' the search (main.js#L136).

5.3.3 Events

In order to make sure the user is notified when a request is loading, and the article list is refreshed when a user changes the content configuration, events are used within the reader. Any class, like the subscription lists, can fire these events when needed, i.e. when the content configuration of the user is updating or has been updated. When such an event is fired, the event listener (main.js#L47) in the main controller¹⁶ catches the event. If it is the loading event, the article list is cleared and the loading elephant icon is shown; if it is the subscription event, the article list is cleared and the new articles for the content configuration are loaded. This way the reader is always up to date with the back-end.
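The event flow can be summarized in a small, language-neutral sketch. The reader itself is written in JavaScript; the Python below only models the pattern, and the class names, method names and the event names "loading" and "subscription" are hypothetical.

```python
class EventBus:
    """Minimal publish/subscribe helper."""

    def __init__(self):
        self._listeners = {}

    def on(self, name, listener):
        self._listeners.setdefault(name, []).append(listener)

    def fire(self, name):
        for listener in self._listeners.get(name, []):
            listener()


class ArticleList:
    def __init__(self, fetch_articles):
        self._fetch = fetch_articles
        self.articles = []
        self.loading = False

    def show_loader(self):
        # Loading event: clear the list and show the loading icon.
        self.articles = []
        self.loading = True

    def reload(self):
        # Subscription event: load the articles for the new configuration.
        self.articles = self._fetch()
        self.loading = False


bus = EventBus()
article_list = ArticleList(lambda: ["fresh article"])
bus.on("loading", article_list.show_loader)
bus.on("subscription", article_list.reload)

bus.fire("loading")        # a subscription list starts updating the configuration
state_while_loading = (article_list.articles, article_list.loading)
bus.fire("subscription")   # the update has finished
```

The subscription lists only need to know the event names, not the article list, which keeps the two parts of the reader decoupled.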

¹⁵ https://github.com/zeeguu-ecosystem/Unified-Multilanguage-Reader/blob/master/src/umr/static/scripts/app/subscription/ArticleList.js

¹⁶ https://github.com/zeeguu-ecosystem/Unified-Multilanguage-Reader/blob/master/src/umr/static/scripts/app/subscription/main.js


Chapter 6

Results and Evaluation

In order to evaluate the project and therefore the new article browser in the reader, we created a questionnaire which we had several users fill out, to see whether the requirements we set in the problem description are met.

Because the project remained in development for a long time, the response to the many questionnaires we sent out was low. Nonetheless, we were able to collect valuable feedback and insight into how users interact with the system from the users that did participate in the questionnaire.

The main feedback is analyzed in this chapter and the full feedback of the users is in the appendix.

6.1 Questionnaire Results

There were several different sections which tested the different parts of the system on intuitiveness, user-friendliness and speed. Figure 6.1 represents the answers to the question "How hard was it to find English articles?". It shows that half of the users found adding the English language rather easy, while the other half had problems finding out how to do it. From the extra feedback that was collected for this question, it became apparent that the users who had trouble adding the language had trouble finding the menu due to a slight bug in the system. Thanks to the feedback, the bug was solved, and therefore this result does not imply that the language selection is hard. The users who had used the old reader before did not have any trouble finding the menu and adding the language.

Figure 6.1: How hard was it to find English articles?

Figure 6.2: How hard was it to add the topic 'Sport'?

For the majority of the users, adding a topic was already significantly easier than adding the language, when comparing Figure 6.1 to Figure 6.2. This is most likely because the users could find the menu more easily, as the menu had already been accessed in the previous question. Most users were satisfied with topic selection and only had feedback on the ordering of the articles, which will be discussed later. In Figure 6.3, the users responded that overall the adding of the topic 'England' worked well, although some users confused adding a custom topic with searching for the keyword 'England', which gets all the articles for that keyword.

Figure 6.3: How hard was it to add the topic 'England'?

Figure 6.4: How hard was it to search for your search words?

For most of the users, searching for their own preferred search term was really clear, as demonstrated in Figure 6.4. One user, however, scored the search a 4, thereby indicating that it was rather hard. In the extra feedback, though, the user indicated that the articles were old and had scored the question high for that reason. This was a temporary bug that has been fixed, and therefore one can conclude that the search is intuitive and easy to use. In Figure 6.5 one can see the clear distinction between the users who found the process of filtering out content easy and others who found it rather hard. Several users had nothing to add and thought the process was really clear. On the other hand, the users who struggled commented "It isn't clear to me where you can filter" and "Should this go via menu option Not Interesting?". This makes it seem an issue with the wording of the question, as the choice between 'filter' and 'not interesting' has been discussed in the design chapter. Users might have misunderstood the phrase 'filtering out' content and therefore failed or were confused.

Figure 6.5: How hard was it to filter out your unwanted content?

The overall experience of the users in Figure 6.6 averages to just below a 7 out of 10. This indicates that the users found the reader rather pleasant and easy to use, though most of the users have some critical feedback on things that could be improved: "User-friendly, however: perhaps an indication to highlight the menu-option".

Figure 6.6: How was your overall experience with the reader?

Figure 6.7: Do you think that with such an article browser configuration you could personalize your news to your liking?

Figure 6.7 shows that 100% of the users who responded think that the reader is definitely suited to personalizing the content to their personal preference. This is very positive, as this was the original goal and intent of the project.

Overall, the user feedback was positive though very critical. This is, for the obvious reason of improving the reader, very useful. More on what will and can be done with the user feedback will be explained in the conclusion and future work.


Chapter 7

Conclusion

For the project, all the functional requirements have been implemented.

The evaluation shows, however, that not all the non-functional requirements are fully met. As adding topics, languages and filters had to be made slower in order to be able to cache all the articles for the user content, users now experience a small, and sometimes somewhat larger, delay when adding items in the menu. This compromise was made in order to improve the overall reading experience, as the articles load practically instantly.

Furthermore, users have delivered valuable feedback on things that could be improved. The most important feedback was about not being able to sort the articles in a particular way, and about the pre-defined ordering, which was perceived as 'random'. Currently the ordering is based on the date of the articles. This will be elaborated on in the Future Work chapter.

In the realization and design of the user interaction, the most important thing was to try to think like the user would. Often, especially when designing and building the application as a computer scientist, this can be hard. Therefore one of the best ways to go is to start with a first iteration, get feedback from users, discuss this and apply changes when necessary. When iterating over the design several times, with a continuous feedback loop, the application can get close to the optimal level of user experience and ease of use. From the evaluation it seemed that users were positive about the new reader and became more positive as they got acquainted with it.

Every user that has given feedback thinks that the platform could be very useful for reading content in multiple languages and could be adjusted to the exact interests of the particular user. The goal was to have a higher level of personalization in the reader, and taking this feedback into account, this goal has been achieved. Therefore the hypothesis stated in the problem description holds. There are however several points for improvement, which are discussed in the Future Work chapter.


Chapter 8

Future Work

8.1 Ordering

Firstly, based on the user feedback received in the evaluation, the ordering of the articles displayed to the user in the reader could be improved upon.

Currently the ordering is rather simple, ordering the articles by date.

Adding a button to order the articles by difficulty, closest match, date descending or number of words would be an interesting improvement.

This way users can not only filter the content of the articles, but can also select the order in which the articles should be displayed, and thus make it easier to choose the right article.
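A selectable ordering could be as simple as a lookup table of sort keys. The sketch below is a hypothetical illustration of that idea: the keys `difficulty` and `word_count` are assumed article attributes, and ordering by closest match is left out because it would need a relevance score.

```python
from datetime import date

# Each ordering is a (sort key, reverse) pair.
ORDERINGS = {
    "date": (lambda a: a["published"], True),       # newest first
    "difficulty": (lambda a: a["difficulty"], False),  # easiest first
    "length": (lambda a: a["word_count"], False),      # shortest first
}


def order_articles(articles, by="date"):
    key, reverse = ORDERINGS[by]
    return sorted(articles, key=key, reverse=reverse)


articles = [
    {"title": "A", "published": date(2018, 6, 1), "difficulty": 3, "word_count": 900},
    {"title": "B", "published": date(2018, 6, 3), "difficulty": 1, "word_count": 1200},
    {"title": "C", "published": date(2018, 6, 2), "difficulty": 2, "word_count": 300},
]
titles_by_date = [a["title"] for a in order_articles(articles)]
titles_by_length = [a["title"] for a in order_articles(articles, by="length")]
```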

8.2 Natural Language Processing

The live search, custom topics and topic mapping could be improved to a great extent when introducing NLP (Natural Language Processing).

The articles could also be filtered on the summary or content when these contain several word combinations or the same word often. Furthermore, with comprehensive NLP search technology, there would also be room for searching for synonyms of the actual search terms, thus yielding a greater number of articles for the preferred topic or filter of the user.


8.3 AI to predict what users like

Lastly, an extra feature that could be implemented in the future would be a predictive model that decides which articles a certain user would like, based on the articles that the user has already read. This would greatly improve the personalization of the reader without the user having to select the suited topics and filters, and would make readers overall more motivated to use the system, as a consequence of the system being even easier to use. One would however have to be careful with predicting the articles for a user, as wrong predictions could have serious consequences.

(44)

Bibliography

[1] Aylin Baris Atilgan. Effects of extensive reading on writing in terms of vocabulary. INTESOL Journal, 10(1), 2013. 2

[2] Timothy Bell. Extensive reading: Why? and how? The Internet TESL Journal, 4(12):1–6, 1999. 2

[3] Mircea Filip Lungu, Luc van den Brand, Dan Chirtoaca, and Martin Avagyan. As we may study: Towards the web as a personalized lan- guage textbook. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, CHI 2018, Montreal, QC, Canada, April 21-26, 2018, page 338, 2018. 2, 4, 7, 18

[4] Martin Avagyan. Building blocks for online language practice plat- forms. 2017. 2

[5] William E Nagy. On the role of context in first-and second-language vocabulary learning. Technical report, University of Illinois at Urbana- Champaign, Center for the Study of Reading, 1995. 2

[6] D. Chirtoaca. Apollo simplicity and intuitiveness in a personalized multilingual reading tool. 2017. 2, 7,10

[7] L.A.H. van den Brand. Prometheus efficiency and usability in a per- sonalized multilingual feed manager. 2017. 2, 7,9

[8] Mircea F. Lungu. Bootstrapping an ubiquitous monitoring ecosystem for accelerating vocabulary acquisition. In Proccedings of the 10th European Conference on Software Architecture Workshops, ECSAW 2016, pages 28:1–28:4, New York, NY, USA, 2016. ACM. 4

(45)

BIBLIOGRAPHY

[9] Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd.

The pagerank citation ranking: Bringing order to the web. Technical report, Stanford InfoLab, 1999. 5

(46)

Appendix A

Workflow

A.1 Iterations

We worked with an iterative process, setting deadlines at the beginning of the project for every sprint. There would also be certain specifics for each deadline like a UI mockup or the minimum viable product. By working through the process iteratively, we could have several rounds of feedback, also explained in the next section.

A.2 Continuous feedback loop

Due to the iterative process, there was feedback at the end of every iteration and improvement. This is really valuable, as the project is not just delivered with possibly one round of feedback, but keeps evolving to the needs of users. With this continuous feedback loop of building, receiving feedback and improving upon the feedback, one gets the most user-friendly and easy to use product in the end.

A.3 Regular meetings

By meeting up regularly, almost every week, during the course of the project, the project ended up a lot better. During these meetings one can discuss different problems and new or improved requirements much more effectively. This way, the process is not one-sided but gets input from both sides, and can become better through the different views.


Appendix B

User Feedback

• I clicked on French instead of English therefore I didnt see any articles but I understood where to go

• The menu items were not very explanatory. More effective to describe the different menu items

• Maybe an arrow to indicate to the menu in the top right corner

• I first had only Spanish selected as the language I wanted to learn. When I changed it to English, the website could not find any English articles. Even when I changed the sources it could look in, it did not manage to display any English articles.

• Place the menu button more central and add search instructions

• I found the menu as it was shaking but there could be a better indication where it is for new users

• Instructions that you have to delete the other language you’re learning first and then don’t forget to add a source.

Table B.1: Responses to: "Do you have any feedback on the process of finding English articles?"


• I found an article about the World cup and then I clicked on the sports, then all the other articles faded. Was that because there arent any sports articles?

• The website only displayed sports articles of 1-3 months ago. I would have liked to see more recent articles.

• It was not that hard, but the loading took a few seconds, is this normal?

Table B.2: Responses to: "Do you have any feedback on the process of finding articles about sports?"

• I added England as search engine, now I realize that I didnt do it rught in the previous question and I could have added sports to my search on the right

• The selection with all selected interesting took a while. In general if you want to change your selction you have to select Menu again ( bit in efficient) have to press het Loupe again

• For this case I also could only find articles from 1-3 months ago. Would have liked to see some more recent articles.

• Make it clear that you have to add your own topic

• I think i added england but i’m not sure if it works, I still see the sports articles

Table B.3: Responses to: "Do you have any feedback on the process of finding articles about England?"

• Sort the highest ranked articles from high to low ( now it looks random order)

• I could only find one article about birds, which was 2 months old. Would have liked to see some more and more recent articles.

• Articles are old, 1 month at least..

Table B.4: Responses to: "Do you have any feedback on the process of finding articles about a word of your liking?"


• I didnt see any articles anymore, is that the idea?

• Should this go via menu option Not Interesting?

• It isn’t clear to me where you can filter

• I’m not too sure if it worked, but I think it did

Table B.5: Responses to: "Do you have any feedback on the process of filtering out articles about a topic you dislike?"

• I like the simplicity

• User-friendly, however: perhaps an indication to highlight the menu-option

• It took me some time to figure out how to find the right articles. Maybe make some sort of an introduction at the beginning, explaining the user how to use the website. There should, for example, always be sources and topics selected. However, since I did not know this, I could not find any articles to browse in.

• Try to make the website a bit more user friendly

• More messages that it worked or something and some explanation where the menu is for new users maybe

Table B.6: Responses to: "Do you have any general feedback on the reader?"
