Cloudy with a chance of Concepts
Towards the automatic creation of concept maps and
topic clouds from case law documents
Suzanne Bardelmeijer
10716971
Bachelor thesis
Credits: 18 EC
Bachelor Opleiding Kunstmatige Intelligentie
University of Amsterdam
Faculty of Science
Science Park 904
1098 XH Amsterdam
Supervisors
dr. R.G.F. Winkels
dr. A.W.F. Boer
Leibniz Center for Law
Faculty of Law
University of Amsterdam
Vendelstraat 8
1012 XX Amsterdam
Abstract
Over the past few years, more and more legal documents have become publicly available online. These legal documents are oftentimes overly complex due to their lengthy and complicated structure. Concept maps and topic clouds both aim to visualize topics that pervade a large corpus and may therefore provide instant comprehension of a particular topic. Different algorithms have been created that generate topic clouds automatically; however, the creation of a concept map is still a labour-intensive and time-consuming procedure. In order to help automate this process, the proposed method used pre-selected promising n-grams and Latent Dirichlet Allocation to create concept maps automatically from Dutch case law documents. To evaluate the created concept maps and to compare these with topic clouds, novices in the field of law were given a survey in which they had to rank four concept maps and four topic clouds in descending order according to their similarity with a given section of a case law. This resulted in an overall Spearman rank-order correlation coefficient of 0.7750 for the concept maps and 0.8384 for the topic clouds. It is concluded that it is possible to automatically create concept maps from Dutch case law documents; however, due to the marginal difference in performance of the two visualizations, it cannot be concluded that there exists a distinct preference for topic clouds over concept maps for the visualization of underlying topics in case law documents.
Contents
1 Introduction 4
2 Theoretical Foundation 5
2.1 Concept maps . . . 5
2.2 Topic modeling algorithms . . . 5
2.3 Scope of this research . . . 6
3 Method 7
3.1 Data collection . . . 7
3.2 Data pre-processing . . . 7
3.3 Creating the list of stop words . . . 7
3.4 Selecting promising n-grams . . . 7
3.5 Building a topic model . . . 8
3.6 Creation concept maps . . . 9
3.7 Creation topic clouds . . . 10
3.8 Evaluation . . . 10
4 Results 11
5 Evaluation 12
6 Conclusion 14
7 Discussion 15
References 16
Appendices 17
A MALLET commands 17
B Survey Part I: questions 18
Acknowledgement
I would like to thank my supervisors Radboud Winkels and Alexander Boer for their expert advice and encouragement throughout this project.
Moreover, I would like to thank everybody who filled in my survey and Maarten Sukel for his assistance.
1
Introduction
Over the past few years, more and more legal documents have become publicly available online. In 2016, 1.6 million cases were published on the website www.rechtspraak.nl (Rechtspraak, 2017). The legal documents that accompany these cases are oftentimes overly complex due to their lengthy and complicated structure. Therefore, new techniques need to be established to mine important topics from these documents. A clear visualization of the different topics that a document is composed of is needed for instant comprehension of the content of that document. Understanding a legal document quickly could prevent professionals and novices from feeling overwhelmed by the length of the document. In addition, it allows them to work efficiently with all the available information.
One way to visualize different topics is through topic clouds. A topic cloud is a visual representation of words (concepts) in which the importance of a word in an underlying set of data (text) is expressed by its size (Castano, Ferrara, & Montanelli, 2013). Another way of visualizing knowledge in complex documents is via concept maps. A concept map is a graphical representation of knowledge in which the core concepts and the relations that connect these concepts are hierarchically structured in a diagram (Novak & Cañas, 2008). These concept maps, constructed by an expert in a certain field, help organize prior and newly acquired knowledge and therefore enable information gathering (Novak & Cañas, 2008). Although both methods aim to visualize topics in a large and structured set of texts, named a text corpus, a concept map concentrates on showing the relations between concepts whereas a topic cloud highlights the concepts themselves. Topic clouds and concept maps may provide professionals and novices in the field of law with instant comprehension of a particular topic, which could save time when evaluating the relevance of a document.
In recent years, different algorithms have been created that generate topic clouds automatically. However, despite the fact that concept maps have proven their value (Novak & Cañas, 2008), the creation of a concept map is still a labour-intensive and time-consuming process, mainly because substantial expert knowledge is needed (Boer & Sijtsma, 2014).
Although in previous research attempts were made, using different approaches and topic modelling algorithms such as Latent Dirichlet Allocation, to automatically create concept maps from text documents, these attempts were unsuccessful in perfecting results and leave room for improvement (Scholten, 2016). One of the suggested improvements is the use of n-grams. An n-gram is a contiguous sequence of n terms, where a term represents a word or number in a given text (Wang, McCallum, & Wei, 2007).
With previous results and attempts in mind, this thesis presents a new approach for the creation of a system that is able to automatically create concept maps from case law documents published on www.rechtspraak.nl. Rather than focusing on creating complete concept maps, this thesis aims to create comprehensible concept maps: although the maps may be incomplete, they will capture the essence. This leads to the research question: to what extent is it possible to automatically create a comprehensible concept map from Dutch case laws using pre-selected promising n-grams and Latent Dirichlet Allocation? Furthermore, a comparison will be made between the generated concept maps and topic clouds to answer the question of whether there exists a preference for one of the two.
This thesis builds on existing research and will start off by describing related work in topic modelling of Dutch case law documents. Thereafter, the proposed method is discussed, followed by the evaluation and interpretation of the experimental results. To conclude, recommendations for future research will be made.
2
Theoretical Foundation
2.1
Concept maps
Fundamental research in concept mapping was first carried out by Novak and his research group. This research was based on the assimilation theory of Ausubel, who emphasized that acquiring new knowledge is based on the assimilation of new concepts into existing frameworks (Ausubel, 1963). Concept maps are graphical tools for representing knowledge by organizing concepts and their relations, enabling them to advance human learning and understanding (Novak & Cañas, 2008).
Due to the fact that domain experts manually construct most concept maps, the process of constructing a concept map requires a substantial amount of time and expert knowledge (Chen, Wei, & Chen, 2008). In order to facilitate this process, a number of studies have focused on providing methods to help automate it.
The procedure for the construction of a concept map starts with defining a topic, identifying the most important concepts corresponding to that topic and connecting these concepts by adding at most three relations per concept. These relations, called linking phrases, often consist of a verb phrase and, together with the two concepts they connect, form a proposition (Novak & Cañas, 2008).
The method for the automatic creation of concept maps proposed by Boer and Sijtsma aimed to extract these propositions based on the term frequency of the concepts: the higher the term frequency of the concepts in the proposition, the greater the importance of this proposition to the concept map (Boer & Sijtsma, 2014). Although this method shows potential for the automatic creation of concept maps, the selection of the most important concept nodes needs to be improved. In order to achieve better results, Scholten extended their method by using a topic modeling algorithm (Scholten, 2016). This algorithm identifies topics that pervade a large collection of documents, where a topic is represented as a list of words that accompany that topic. Propositions that contained terms corresponding to words describing a topic were marked as informative and were therefore more likely to eventually appear in the concept map.
2.2
Topic modeling algorithms
As stated above, topic modeling algorithms are used for the discovery of topics that pervade a large collection of documents and they have proven their value as a powerful text-mining tool (Blei & Lafferty, 2009). Since these algorithms do not require any prior annotations and topics “emerge” from the analysis of the corpus automatically, topic modeling algorithms could be of great interest in the process of the automatic construction of a concept map.
Villalon and Calvo proposed a method for the automatic extraction of important concepts from student essays using grammatical parses and a topic modelling algorithm called Latent Semantic Analysis (LSA) (Villalon, Calvo, & Chen, 2009). Their method starts by identifying noun phrases grammatically and selecting nouns and compound nouns. Thereafter, using LSA, a topic representation of associations between these compound nouns was extracted based on the term-document co-occurrence matrix (Villalon et al., 2009). Despite the fact that LSA is simple and efficient, it exhibits a major limitation: it has great difficulties with polysemous terms (Blei & Lafferty, 2009).
To overcome this limitation most topic models are currently constructed using Latent Dirichlet Allocation (LDA), which was first presented by Blei, Ng and Jordan in 2003 (Blei, Ng, & Jordan, 2003). The idea behind LDA is that documents are seen as a mixture of different topics, where a topic is formally defined as a distribution over a fixed vocabulary (Blei et al., 2003). LDA is part of the larger field of generative probabilistic modeling algorithms. The statistics underlying LDA are based on the idea of a generative process that has led to the content of the document, including hidden variables (Blei et al., 2003). Both the observed and hidden random variables are defined by this generative process over a joint probability distribution. This joint probability distribution is used to compute the posterior distribution of the hidden variables given the observed variables
(Blei et al., 2003). Here, the observed variables can be understood as the words of the documents and the hidden variables as the hidden topic structures.
LDA results in the creation of a topic model where each topic is associated with a document in different proportions and has probabilities of generating various words (Blei et al., 2003). Therefore, topics can be seen as collections of terms that frequently appear together, where these terms have different probabilities of appearance in documents associated with that topic. For instance, a topic about food is more likely to produce words such as "rice" and "cheese" than a topic about taxes. Some terms are more strongly connected to a topic than other terms; this strength is the posterior probability of finding a word associated with a topic. One topic might be composed of many occurrences of "rice", "cheese" and "pasta", whereas another may be composed of a great deal of "milk" and "cow" with a few occurrences of "cheese".
LDA operates under the bag-of-words assumption, meaning that word order is not taken into account. This assumption is plausible for the identification of a topic, but is a disadvantage when interpreting one (Blei & Lafferty, 2009). In order to overcome this problem, more advanced topic modeling algorithms were created with small adjustments or additions to the existing LDA algorithm. Research showed that topic models based on LDA with a multi-word expression approach provide a better understanding of what a topic is about (Wang et al., 2007; Blei & Lafferty, 2009). Multi-word expression approaches, such as Turbo Topics or the Topical N-gram Model, aim to find significant n-grams related to a topic (Blei & Lafferty, 2009; Wang et al., 2007). An n-gram is a contiguous sequence of n terms, where a term represents a word or number, in a given text (Wang et al., 2007). An n-gram of size two is referred to as a bi-gram, size three is a tri-gram and an n-gram of size four is called a tetra-gram. For example, this approach allows a multi-word expression such as "white house" to be found and categorized under a political topic.
2.3
Scope of this research
In order to build a system that is able to automatically create comprehensible concept maps, a topic modeling algorithm needs to be used that takes word order into account. For the scope of this research, LDA will be used to create a topic model. However, to preserve word order, promising n-grams will be created and added to the text documents before applying LDA. This method of constructing a topic model takes bi-grams, tri-grams and tetra-grams into account and will be explained in the next section.
3
Method
3.1
Data collection
The data set was created by collecting Dutch case laws. A corpus of Dutch legal documents is published on www.rechtspraak.nl in XML formatted files, and case laws from different areas of law can be downloaded from this website. The field of law that this research is interested in is Immigration Law. The code for the collection of case laws from www.rechtspraak.nl was constructed by Erwin van den Berg (Berg, 2015). Its usage led to a data set consisting of 250 case laws from the years 2015 to 2017. Although this code was used for Immigration Law in particular, it is not domain specific and can be used for the collection of documents from other areas of law as well.
3.2
Data pre-processing
Most Dutch case laws have a common structure, starting off with a summary of the case (Dutch: Inhoudsindicatie), followed by the actual verdict, which consists of the procedure (Dutch: Procesverloop), the considerations (Dutch: Overwegingen) and the decision (Dutch: Beslissing). Since every section of the case law contains information about the case and could therefore be of interest in identifying underlying topics, all sections are considered relevant.
Since case laws are XML formatted files, every section could be extracted by searching for the right sequence of XML markup. The section "inhoudsindicatie" is preceded by "<inhoudsindicatie id =" and closed by "</inhoudsindicatie>". The other sections are identified by the sequence "<section>" at the start of a section and the sequence "</section>" at the end of that same section. Once all sections are extracted, the XML markup is removed, since these XML constructs should not be taken into account when constructing a topic model. Eight case laws were excluded from the data set because these files did not contain all of the sections.
After this pre-processing step the data set consisted of 968 text files. Every file represents a section of a case law. A file is named after its ECLI number, which is a unique identification number, and a section number. To finalize the pre-processing step, punctuation was removed from every file and all capital letters were converted to lowercase.
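The extraction and clean-up steps above can be sketched as follows. This is an illustrative reimplementation, not the original collection code (Berg, 2015); the function names and the simplified regular expressions are assumptions:

```python
import re

def extract_sections(xml_text):
    """Pull the summary and numbered sections out of a rechtspraak.nl
    XML document (simplified sketch; the real files contain more markup)."""
    sections = []
    # summary: <inhoudsindicatie id="..."> ... </inhoudsindicatie>
    m = re.search(r"<inhoudsindicatie[^>]*>(.*?)</inhoudsindicatie>",
                  xml_text, re.DOTALL)
    if m:
        sections.append(m.group(1))
    # verdict parts: <section> ... </section>
    sections += re.findall(r"<section[^>]*>(.*?)</section>",
                           xml_text, re.DOTALL)
    return sections

def clean(text):
    """Strip remaining XML markup and punctuation, lowercase everything."""
    text = re.sub(r"<[^>]+>", " ", text)   # drop nested tags
    text = re.sub(r"[^\w\s]", " ", text)   # drop punctuation
    return re.sub(r"\s+", " ", text).strip().lower()
```

Each cleaned section would then be written to its own text file, named after the ECLI number and section number as described above.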
3.3
Creating the list of stop words
Before proceeding to the construction of a topic model, a list of stop words was created, which prevents the creation of meaningless topics. Stop words are words that are very common in a particular language, such as "the" and "of", and therefore have very little informational value. The list of stop words is created by computing the frequency of every word in the corpus: if a word occurs in more than a specified percentage of all case laws, the word is added to the list of stop words. In this thesis the threshold is set to 20%, hence words that appear in more than 20% of the files are not taken into account while constructing the topic model. The code used for the construction of the list of stop words was created by Erwin van den Berg (Berg, 2015).
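The document-frequency threshold can be sketched as follows; the function name and the token-list input format are illustrative assumptions, not Van den Berg's actual code:

```python
from collections import Counter

def build_stop_words(documents, threshold=0.20):
    """Words occurring in more than `threshold` of all documents are
    treated as stop words (the thesis uses 20%). `documents` is a list
    of already-cleaned token lists, one per file."""
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))   # count each word once per document
    cutoff = threshold * len(documents)
    return {word for word, df in doc_freq.items() if df > cutoff}
```

Counting each word at most once per document (via `set(tokens)`) is what makes this a document frequency rather than a raw term frequency.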
3.4
Selecting promising n-grams
As stated in the literature section, LDA operates under the bag-of-words assumption and does not take word order into account. To overcome this disadvantage, n-grams were created before constructing the topic model. This was done by dividing the text in every file into separate word groups. For example, when constructing bi-grams the sentence "this is a sentence" is divided into the word groups "this is", "is a" and "a sentence", whereas for tri-grams the resulting word groups are "this is a" and "is a sentence", and for a tetra-gram "this is a sentence".
Once all possible bi-, tri- and tetra-grams for a file are constructed, it is necessary to identify the promising n-grams. An n-gram is promising when none of its terms consists of a single character or digit (0-9). Based on the example above, the bi-grams "is a" and "a sentence" are not considered promising because these n-grams contain the term "a". Furthermore, for an n-gram to be promising, none of its terms should be included in the list of stop words. Eventually the promising n-grams that arise from a text file are added to the end of this file and all files are saved in the directory "ngramcaselaws".
3.5
Building a topic model
In order to discover the underlying topics in the data set, the open source implementation of LDA in MALLET was used to create a topic model. MALLET (MAchine Learning for LanguagE Toolkit) is a package for topic modelling algorithms written by McCallum (McCallum, 2002). The creation of a topic model in MALLET follows two steps. First, MALLET requires two input arguments: all the files of the directory "ngramcaselaws" and the list of stop words. A regular expression is included to ensure that the n-grams are properly processed and that terms of length one cannot describe a topic.
The output is a MALLET formatted file called "thesis2.mallet", which is then used in the second step to determine the topics. At the beginning of this step, the target number of topics that MALLET will search for needs to be set. In this research the target number is set to 50, which leads to the creation of three files: "thesis50 keys.txt" contains the top 19 terms that each topic consists of, "thesis50 composition.txt" shows to what degree a topic matches a section of a case law, and "thesiswordcount.txt" expresses the importance of a term to a topic. All used MALLET commands can be found in appendix A.
Although the number of topics that LDA produces is arbitrary, it has a big influence on the informational value of the created topics. If the target number is reduced, separate topics have to fuse; if the number increases, topics have to split. In order to ensure a topic describes a theme that is well represented in a set of documents, only documents that have more than 20% similarity with one of the topics are selected.
To guarantee a topic is not too specific, the remaining topics are compared based on the number of documents they describe. Figure 1 shows the topics on the x-axis and the number of documents they describe on the y-axis. For the 20 topics that describe the most documents, from topic 9 to topic 45, a concept map is created.
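The two selection steps above (the 20% similarity threshold and keeping the 20 topics that describe the most documents) can be sketched as follows, assuming the composition output has already been parsed into a dictionary; the function name and input format are illustrative:

```python
from collections import Counter

def select_topics(doc_topics, min_share=0.20, n_maps=20):
    """`doc_topics` maps each document to its topic-proportion dict (as
    read from MALLET's composition output). A document counts toward a
    topic only if that topic explains more than `min_share` of it; the
    `n_maps` topics covering the most documents are kept."""
    coverage = Counter()
    for shares in doc_topics.values():
        for topic, share in shares.items():
            if share > min_share:
                coverage[topic] += 1
    return [t for t, _ in coverage.most_common(n_maps)]
```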
3.6
Creation concept maps
After finishing the steps mentioned above, 20 topics with their describing terms were extracted from the data set. These terms represent the concept nodes in a concept map of that topic. Since a concept map is created out of concepts and the relations between them, it is necessary to assign weights to these relations in order to identify the links that are most important.
Figure 2 shows an overview of the weight assigning process. First all possible links are created between the 19 terms that describe a topic, which resulted in 171 possible links. Then a weight is assigned to a link based on the word frequencies of the two terms that are linked.
Figure 2: Assigning weights to links
In order to take all possible connections of the two terms into account, the weight for a link per file is computed by multiplying (instead of adding) those two frequencies. In figure 3 this process is shown.
Figure 3: Computing weights
To obtain the total weight of a link, all computed weights per file are added and stored in a list in descending order. The most informative links, the links with the highest weight, were selected manually, after which the concept map was constructed using CmapTools (Cañas et al., 2004). However, as stated in the literature section, a comprehensible concept map should not contain concepts with more than three (incoming and outgoing) links. Therefore, once a concept already has three links with other concepts, new links to that concept will not be added to the concept map. This process continues until the concept map consists of 15 connected concepts.
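The weighting and link-selection procedure can be sketched as follows. Note that in the thesis the final selection was done manually; the greedy routine below only illustrates the stated constraints (at most three links per concept, at most 15 connected concepts), and all names are illustrative:

```python
from collections import Counter
from itertools import combinations

def link_weights(topic_terms, files):
    """Weight every candidate link between the terms describing a topic.
    Per file the weight of a link is the PRODUCT of the two term
    frequencies; totals over all files are returned sorted descending
    (19 terms give 171 candidate links)."""
    totals = Counter()
    for tokens in files:
        freq = Counter(tokens)
        for a, b in combinations(topic_terms, 2):
            totals[(a, b)] += freq[a] * freq[b]
    return totals.most_common()

def build_map(ranked_links, max_degree=3, max_concepts=15):
    """Greedy selection: walk the links in descending weight, skip a
    link when either endpoint already has `max_degree` links, stop once
    `max_concepts` concepts are connected."""
    degree = Counter()
    chosen = []
    for (a, b), _ in ranked_links:
        if degree[a] >= max_degree or degree[b] >= max_degree:
            continue
        new = (a not in degree) + (b not in degree)
        if len(degree) + new > max_concepts:
            continue
        chosen.append((a, b))
        degree[a] += 1
        degree[b] += 1
        if len(degree) >= max_concepts:
            break
    return chosen
```

Multiplying rather than adding the two frequencies rewards links whose terms co-occur heavily within the same file, instead of links dominated by one very frequent term.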
However, some of the created concept maps contain fewer concepts because of the way the describing terms of a topic are merged: some of these terms are single words and some are n-grams. When a word also appears in an n-gram, the word and the n-gram are considered equal and only the n-gram is added as a concept to the concept map. For example, "Minderjarige Kinderen", "Minderjarige" and "Kinderen" all belong to the same concept: "Minderjarige Kinderen". The same applies to two terms that are inflections of the same verb or noun; hence, "verleend" becomes "verlenen" and "lidstaten" becomes "lidstaat".
Although in traditional concept maps the linking phrases contain a verb, the linking phrases in the created concept maps consist of a number. Nonetheless, these linking phrases still refer to text: each number corresponds to the section of a case law in which the link has the highest score. This approach ensures that the idea of the traditional concept map is maintained. For the scope of this project, this information was not used; however, when integrated in an application, this information could eventually provide users more insight into the origin of the relation between two concepts.
3.7
Creation topic clouds
Running LDA using MALLET resulted in the creation of the "thesiswordcount.txt" document, in which the importance of a term to a topic is expressed. This document gives valuable information, since not every term is equally associated with a topic and some terms are more strongly connected to a topic than others. The topic clouds are created using the Lexos text analysis tool, which is developed by the Lexomics group and is freely available online (Kleinman, LeBlanc, Drout, & Zhang, 2016). The Lexos tool requires one input argument, the "thesiswordcount.txt" document, after which the topic clouds are created. The strength of association of a term to a topic is expressed in a topic cloud by the size of the term.
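The size mapping a topic cloud performs can be illustrated with a minimal sketch; Lexos' actual scaling may differ, and the linear mapping and point-size range below are assumptions:

```python
def font_sizes(term_weights, min_pt=10, max_pt=48):
    """Map each term's topic weight to a font size, linearly between
    `min_pt` and `max_pt`, mirroring how a topic cloud expresses
    strength of association by size."""
    lo, hi = min(term_weights.values()), max(term_weights.values())
    span = hi - lo or 1   # avoid division by zero when all weights are equal
    return {t: min_pt + (w - lo) * (max_pt - min_pt) / span
            for t, w in term_weights.items()}
```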
3.8
Evaluation
To evaluate the created concept maps and topic clouds, novices in the field of law were given a survey in which they had to complete a sorting task. The survey consisted of four questions; in each question the participant was given a section of a case law and was asked to:
1. Rank in descending order four concept maps according to their similarity with the given document. Here the concept map placed on number 1 fitted the document the best.
2. Rank in descending order four topic clouds according to their similarity with the given document. Here the topic cloud placed on number 1 fitted the document the best.
In appendix B the complete survey can be found. The sections of case law documents that were used are presented in appendix C.
4
Results
The proposed method has led to the creation of 20 concept maps and 20 topic clouds. Figure 4 and 5 show the concept map and topic cloud created from topic 36.
Figure 4: Concept map of topic 36
As stated before, the linking phrases do not exhibit verbs, but numbers. These numbers correspond to the document in which the link has the highest weight value. The list of numbers together with their corresponding documents is shown in Table 1.
Table 1: Links and corresponding documents
Link Number Document
1 ECLI:NL:RBDHA:2017:2028-2.txt
2 ECLI:NL:RBDHA:2016:3998-1.txt
3 ECLI:NL:RBDHA:2017:3356-2.txt
4 ECLI:NL:RBDHA:2017:3356-2.txt
5 ECLI:NL:RBDHA:2017:2654-2.txt
6 ECLI:NL:RBDHA:2017:2028-2.txt
7 ECLI:NL:RBROT:2017:1105-2.txt
8 ECLI:NL:RBDHA:2016:3998-1.txt
9 ECLI:NL:RBDHA:2017:2654-2.txt
10 ECLI:NL:RBDHA:2017:2654-2.txt
11 ECLI:NL:RBDHA:2016:5480-2.txt
12 ECLI:NL:RBDHA:2017:2654-2.txt
13 ECLI:NL:RBDHA:2017:2654-2.txt
14 ECLI:NL:RBDHA:2016:15324-2.txt
15 ECLI:NL:RBDHA:2017:2654-2.txt
16 ECLI:NL:RBDHA:2017:976-2.txt
17 ECLI:NL:RBDHA:2017:3356-2.txt
18 ECLI:NL:RBDHA:2017:2910-2.txt
19 ECLI:NL:RBDHA:2017:2654-2.txt
20 ECLI:NL:RBDHA:2016:5480-2.txt
21 ECLI:NL:RBDHA:2016:5480-2.txt
5
Evaluation
For this thesis, six novices in the field of law filled in the survey. For each question in the survey, participants had to rank concept maps and topic clouds according to their degree of similarity with a given section of a case law. Table 2 presents the documents, identified by their ECLI number, that were used for the survey. These documents were selected since their content covers a wide variety of topics, which makes them suitable for ranking. As shown in Table 2, the topics beneath each ECLI number belong to the document to different degrees and this value decreases gradually. For instance, document ECLI:NL:RBDHA:2017:3176-2 consists for 51% of topic 7, 12% of topic 21, 6% of topic 33 and 1% of topic 16.
Table 2: Documents and the topics they consist of
ECLI:NL:RBDHA:2017:3176-2: Topic 7 - 0.513392857; Topic 21 - 0.117857143; Topic 33 - 0.05625; Topic 16 - 0.014285714
ECLI:NL:RBDHA:2017:417-2: Topic 48 - 0.295010846; Topic 33 - 0.1127; Topic 30 - 0.067245119; Topic 12 - 0.015184382
ECLI:NL:RBDHA:2017:2654-2: Topic 36 - 0.489282386; Topic 29 - 0.131407269; Topic 3 - 0.021435228; Topic 9 - 0.009319664
ECLI:NL:RBDHA:2017:780-2: Topic 10 - 0.328301887; Topic 45 - 0.105660377; Topic 30 - 0.030188679; Topic 12 - 0.01509434
The Spearman rank-order correlation, which is a non-parametric measure of the degree of association between two variables, was used to evaluate the survey. This test does not assume that both data sets are normally distributed, but requires that the data is measured on a scale that is at least ordinal (Zwillinger & Kokoska, 2000). In order to measure the correlation between the ranks made by the participants and the original ranks, the Spearman rank-order correlation coefficient for both the concept maps and topic clouds was computed. This was done using a statistical function implemented in the Python package SciPy (Jones, Oliphant, Peterson, et al., 2001). This statistical function is based on equation 1, where ρ denotes the Spearman rank-order correlation coefficient, d the difference between the ranks of corresponding variables and N the number
of observations.
ρ = 1 − (6 ∑ d²) / (N(N² − 1))    (1)
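For tie-free rankings, equation 1 can be implemented directly; a minimal sketch (SciPy's scipy.stats.spearmanr returns the same coefficient together with a p-value):

```python
def spearman_rho(rank_x, rank_y):
    """Spearman rank-order correlation for two tie-free rankings,
    following rho = 1 - 6 * sum(d^2) / (N * (N^2 - 1))."""
    n = len(rank_x)
    d_sq = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))
    return 1 - 6 * d_sq / (n * (n ** 2 - 1))
```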
A coefficient equal to 1 signifies a perfect positive correlation between the two variables and implies that all participants were able to rank the concept maps or topic clouds in the correct order. If all participants ranked all of the concept maps and topic clouds in exactly the reversed order, this results in a coefficient equal to -1. If there exists no correlation at all, which implies that participants were not able to make adequate rankings, the coefficient is equal to 0.
The statistical function in SciPy not only returns a value for ρ but a p-value as well. The p-value gives the probability of an uncorrelated system producing data sets that have a Spearman rank-order correlation coefficient similar to the one computed from these data sets (Zwillinger & Kokoska, 2000). However, the reliability of the p-value depends on the size of the data sets: p-values corresponding to data sets larger than 500 observations (N) are presumed to be valid (Zwillinger & Kokoska, 2000).
Table 3 presents the Spearman rank-order correlation coefficient and p-value for every survey question and in total. For question one, the Spearman rank-order correlation coefficient equals 1 and the p-value 0 for both the concept maps and topic clouds. For question two, ρ equals 0.76667 and 0.966667 and p equals 1.00E-05 and 0 for concept maps and topic clouds respectively. The values of ρ and p for question three correspond to 0.93333 and 0 for concept maps and to 1 and 0 for topic clouds. Answers to question four resulted in values of ρ and p of 0.4 and 0.05278 for concept maps and 0.394993 and 0.0561 for topic clouds. All observations combined lead to a total Spearman rank-order correlation coefficient of 0.7750 for concept maps and 0.8384 for topic clouds, with p-values of 1.94E-04 and 1.62E-10 respectively.
Although p-values were computed and are shown in this section, these values are not trustworthy, since only six novices completed the survey. This leads to an N of 24 per question and an N of 96 in total, whereas an N of at least 500 is needed for a valid statistical interpretation of the p-values.
Table 3: Spearman rank-order correlation coefficient and the p-value per question
Question Concept Maps p-value (Concept Maps) Topic Clouds p-value (Topic Clouds)
1 1 0 1 0
2 0.76667 1.00E-05 0.966667 0
3 0.93333 0 1 0
4 0.4 0.05278 0.394993 0.0561
Total 0.7750 1.94E-04 0.8384 1.62E-10
6
Conclusion
The method that is presented in this thesis for the automatic creation of comprehensible concept maps from case law documents shows potential since novices in the field of law were able to rank the created concept maps adequately given a section of a case law.
As shown in Table 3, although the number of participants is too small to draw any statistically proven conclusions, the correlation coefficients accompanying question one indicate a perfect positive correlation for both concept maps and topic clouds. This implies that all participants were able to rank all concept maps and topic clouds in the correct order for question one. The value of ρ for questions two and three is 0.76667 and 0.93333 for the concept maps, whereas this value equals 0.966667 and 1 for the topic clouds. This suggests that participants were capable of making more adequate rankings using topic clouds than concept maps in questions two and three. In question four, participants found it difficult to rank the concept maps and topic clouds in the correct order, causing a relatively low value of ρ, 0.4 for concept maps and 0.394993 for topic clouds, compared to questions one, two and three. This indicates that question four was the most difficult question in the survey.
All results combined lead to a correlation coefficient of 0.7750 for the rankings concerning the concept maps and a coefficient of 0.8384 for the topic clouds. This suggests that overall participants performed slightly better at ranking topic clouds than at ranking concept maps, which could imply a minor preference for topic clouds for the visualization of topics in Dutch case law documents.
Based on the results from the evaluation, it can be concluded that it is possible to automatically create comprehensible concept maps from Dutch case law documents using pre-selected promising n-grams and LDA. However, due to the marginal difference in performance between the two visualizations, it cannot be concluded that there exists a distinct preference for topic clouds over concept maps for the visualization of underlying topics in the case law documents. Moreover, the variation of the correlation coefficients per question indicates that the difficulty of each question was not equal. To obtain results that are not influenced by this disparity, the survey should have consisted of more questions.
7 Discussion
The development of the proposed method has shown promising results for the automatic construction of comprehensible concept maps from Dutch case law documents, although some improvements could still be made.
While these results suggest that participants were able to produce more correct rankings using topic clouds than concept maps, this does not necessarily imply that topic clouds have more informational value. Participants could use two main methods to rank the concept maps and topic clouds: one aims at identifying the internal structure of the given document, whereas the other focuses on the words themselves. The first method requires substantial knowledge of the content of the document, while the second is based on a more superficial resemblance. The results of the survey could therefore be misleading, since the group of participants consisted only of novices in the field of law, who do not have substantial knowledge of immigration law. It is thus possible that participants based their rankings on superficial resemblance instead of on the internal structure between concepts, which was essentially the purpose of the constructed concept maps. Further research could include experts as well to examine this hypothesis. In addition, more participants are needed in order to draw statistically sound conclusions.
Although MALLET is capable of creating topic models in an instant, its usage requires decisions that can have a significant impact on the results. For example, the number of topics in this research was set to 50; if this number is set to a lower value, topics that were separate merge, and if it increases, the opposite applies and topics split. Setting the number of topics to different values therefore leads to the creation of different topics. More research needs to be done to evaluate the number of topics used for a topic model, and the results could eventually lead to a feature that selects the appropriate number of topics automatically.
Despite the fact that the created concept maps consisted of only distinct concepts, some of the topics were in fact described by terms that stem from the same word. For example, "lidstaat" and "lidstaten" both described topic 12. Although a number of preliminary pre-processing steps were performed in order to establish a clear-cut topic model, the addition of stemming could lead to improved results. Stemming was not performed since Dutch parsers do not perform well on Dutch case law documents due to their complicated structure and use of language. The development of parsers made specifically for the analysis of legal documents could lead to better results in future research.
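The "lidstaat"/"lidstaten" case can be illustrated with a deliberately naive normalization rule. This is a toy sketch, not the Snowball Dutch stemmer or any parser discussed above, and it would over- and under-stem on real text; it only shows how the two surface forms of the topic-12 term could be collapsed to one key.

```python
import re

def naive_dutch_stem(word):
    """Toy normalization: strip a plural '-en' suffix, then collapse
    doubled vowels (aa, ee, oo, uu). Illustration only; not a real
    Dutch stemmer."""
    if word.endswith("en"):
        word = word[:-2]
    return re.sub(r"([aeou])\1", r"\1", word)

# Both surface forms of the topic-12 term map to the same key:
print(naive_dutch_stem("lidstaat"))   # lidstat
print(naive_dutch_stem("lidstaten"))  # lidstat
```

A real solution would use a proper Dutch stemmer (e.g. the Snowball algorithm) during pre-processing, which handles the many suffix and vowel rules this toy ignores.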
The last notion that needs to be addressed concerns the appearance of the concept map. As stated in the literature section, a traditional concept map is composed of a number of propositions in which two concepts are linked by a linking phrase. Although this linking phrase normally consists of a (single) verb, the linking phrases in this thesis contain a whole section of a case law. Moreover, the sections corresponding to the linking phrases were not shown to participants during the evaluation. This deprives participants of potentially valuable information about the origin of the link between two concepts and could therefore result in different rankings than when this information was available to them. Further research could integrate the sections of a case law corresponding to a linking phrase into an application to provide this knowledge.
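One way such an application could carry the case-law section behind each link is a simple proposition record. The field names and the `label` helper below are illustrative assumptions, not the implementation used in this thesis; the example text is an arbitrary fragment.

```python
from dataclasses import dataclass

@dataclass
class Proposition:
    """A concept-map proposition: two concepts joined by a linking
    phrase, here backed by the case-law section the link came from."""
    source: str
    target: str
    section_id: str    # e.g. an ECLI plus section number (illustrative)
    section_text: str  # the full section text, available on demand

    def label(self, max_len=40):
        # Short label for display on the map edge; the full text stays
        # available for an interactive "show origin of this link" action.
        text = self.section_text.strip().replace("\n", " ")
        return text if len(text) <= max_len else text[:max_len] + "..."

p = Proposition("lidstaat", "overdracht", "ECLI:NL:RBDHA:2017:3176-2",
                "De rechtbank stelt vast dat ...")
print(p.label())
```

Storing the section identifier alongside the text would let an interface show a short edge label by default and reveal the full section only when a user asks where a link comes from.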
References
Ausubel, D. P. (1963). The psychology of meaningful verbal learning.
van den Berg, E. (2015). Development of a recommender system for Dutch case law, with the use of a topic model. Bachelor thesis, Universiteit van Amsterdam.
Blei, D. M., & Lafferty, J. D. (2009). Visualizing topics with multi-word expressions. arXiv preprint arXiv:0907.1013 .
Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3 (Jan), 993–1022.
Boer, A., & Sijtsma, B. (2014). Semi-automatic construction of skeleton concept maps from case judgments. Network Analysis in Law .
Cañas, A. J., Hill, G., Carff, R., Suri, N., Lott, J., Gómez, G., . . . Carvajal, R. (2004). CmapTools: A knowledge modeling and sharing environment.
Castano, S., Ferrara, A., & Montanelli, S. (2013). Mining topic clouds from social data. In Proceedings of the fifth international conference on management of emergent digital ecosystems (pp. 108–112). New York, NY, USA: ACM. Retrieved from http://doi.acm.org/10.1145/2536146.2536171 doi: 10.1145/2536146.2536171
Chen, N., Wei, C., & Chen, H. (2008). Mining e-learning domain concept map from academic articles. Computers & Education, 3, 1009–1021.
Jones, E., Oliphant, T., Peterson, P., et al. (2001). SciPy: Open source scientific tools for Python. Retrieved from http://www.scipy.org/ ([Online; accessed 6-26-2017])
Kleinman, S., LeBlanc, M., Drout, M., & Zhang, C. (2016). Lexos v3.0. (https://github.com/WheatonCS/Lexos/)
McCallum, A. K. (2002). MALLET: A machine learning for language toolkit. (http://mallet.cs.umass.edu)
Novak, J., & Cañas, A. (2008). The theory underlying concept maps and how to construct and use them. Florida Institute for Human and Machine Cognition.
Rechtspraak, D. (2017). Jaarverslag 2016. Retrieved from http://www.jaarverslagrechtspraak.nl ([Online; accessed 05-30-2017])
Scholten, V. (2016). Automatic creation of skeleton concept maps from case law documents. Bachelor Thesis, Universiteit van Amsterdam.
Villalon, J., Calvo, R., & Chen, H. (2009). Concept extraction from student essays, towards concept map mining. IEEE International Conference on Advanced Learning Technologies.
Wang, X., McCallum, A., & Wei, X. (2007). Topical n-grams: Phrase and topic discovery, with an application to information retrieval (pp. 697–702). Retrieved from http://dx.doi.org/10.1109/ICDM.2007.86 doi: 10.1109/ICDM.2007.86
Zwillinger, D., & Kokoska, S. (2000). CRC standard probability and statistics tables and formulae. Chapman & Hall/CRC.
Appendices
A MALLET commands
For this research, MALLET 2.0.7 was run on an OS X Yosemite machine.
MALLET 2.0.7 was located at:
/Users/suzannebardelmeijer/Documents/mallet-2.0.7
The pre-processed case laws with n-grams were located at:
/Users/suzannebardelmeijer/Documents/mallet-2.0.7/ngramcaselaws
The list of stop words was located at:
/Users/suzannebardelmeijer/Documents/mallet-2.0.7/stopwords20.txt
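This appendix lists the file locations but not the invocations themselves. A plausible reconstruction of the two commands, assuming the standard MALLET `import-dir` / `train-topics` pipeline and the 50-topic setting reported in this thesis, would be as follows. The output file names (`caselaws.mallet`, `topic-keys.txt`, `doc-topics.txt`) are illustrative assumptions, not recorded from the original runs.

```shell
# Import the pre-processed case-law sections into MALLET's format,
# removing the custom stop words listed in stopwords20.txt.
bin/mallet import-dir \
  --input ngramcaselaws \
  --output caselaws.mallet \
  --keep-sequence \
  --remove-stopwords \
  --stoplist-file stopwords20.txt

# Train an LDA topic model with 50 topics, as used in this research.
bin/mallet train-topics \
  --input caselaws.mallet \
  --num-topics 50 \
  --output-topic-keys topic-keys.txt \
  --output-doc-topics doc-topics.txt
```

These are standard MALLET 2.0.x options; exact additional flags (e.g. optimization intervals) used for this research are not recoverable from the appendix.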
Dear respondent,

In collaboration with my supervisors dr. Radboud Winkels and dr. Alexander Boer, I have developed a system for my BSc Artificial Intelligence graduation project that automatically creates concept maps of the most important topics occurring in a collection of documents. For this research, the collection of documents consists of sections of court decisions from www.rechtspraak.nl. A concept map is a compact visualization in which the most important concepts of a topic are connected to one another. In this way, a concept map can make the topic of a corresponding set of documents comprehensible at a glance.
Another way to represent the topic of a set of documents is through topic clouds. In a topic cloud, the concepts are not connected to one another; instead, the size of a concept shows how strongly it is associated with that topic.
Below, an example of a concept map (left) and a topic cloud (right) is shown.
To assess the generated concept maps and topic clouds, I am looking for respondents for my evaluation.
The evaluation consists of four questions. In each question, you are asked, for a section of a court decision:
a) to indicate (in a ranking) which of the concept maps below fits the given document best;
b) to indicate (in a ranking) which of the topic clouds below fits the given document best.
Thank you for your cooperation.

Kind regards,

Suzanne Bardelmeijer
Question 1 – Document: ECLI/NL/RBDHA/2017/3176-2
A) Which of the concept maps below fits the document above best? Rank your answer such that the concept map at number 1 fits the document best.
Topic 21:
Topic 33:
Topic 7:
Topic 16:
1: Topic
2: Topic
3: Topic
4: Topic
B) Which of the topic clouds below fits the document above best? Rank your answer such that the topic cloud at number 1 fits the document best.
1: Topic
2: Topic
3: Topic
4: Topic
Question 2 – Document: ECLI/NL/RBDHA/2017/417-2
A) Which of the concept maps below fits the document above best? Rank your answer such that the concept map at number 1 fits the document best.
Topic 33:
Topic 30:
Topic 12:
Topic 48:
1: Topic
2: Topic
3: Topic
4: Topic
B) Which of the topic clouds below fits the document above best? Rank your answer such that the topic cloud at number 1 fits the document best.
1: Topic
2: Topic
3: Topic
4: Topic
Question 3 – Document: ECLI:NL:RBDHA:2017:2654-2
A) Which of the concept maps below fits the document above best? Rank your answer such that the concept map at number 1 fits the document best.
Topic 36:
Topic 3:
Topic 29:
Topic 9:
1: Topic
2: Topic
3: Topic
4: Topic
B) Which of the topic clouds below fits the document above best? Rank your answer such that the topic cloud at number 1 fits the document best.
1: Topic
2: Topic
3: Topic
4: Topic
Question 4 – Document: ECLI:NL:RBDHA:2017:780-2
A) Which of the concept maps below fits the document above best? Rank your answer such that the concept map at number 1 fits the document best.
Topic 10:
Topic 30:
Topic 45:
Topic 12:
1: Topic
2: Topic
3: Topic
4: Topic
B) Which of the topic clouds below fits the document above best? Rank your answer such that the topic cloud at number 1 fits the document best.
1: Topic
2: Topic
3: Topic
4: Topic
ECLI:NL:RBDHA:2017:3176 Rechtbank Den Haag 2017-03-28
ECLI:NL:RBDHA:2017:3176 2017-03-30
2 Considerations 1. The respondent did not take the asylum applications of the claimants into consideration on the ground that Italy is responsible for handling these applications. 2. The court establishes that at the hearing the claimants withdrew the ground of appeal that they did not use the visa issued by Italy to enter the territory of the Member States. 3. The claimants argue, invoking the judgment of the European Court of Human Rights (ECtHR) of 4 November 2014 in Tarakhel v. Switzerland (ECLI:CE:ECHR:2014:1104JUD002921712), that transfer to Italy without an individual guarantee of direct placement in a SPRAR location would in their case constitute a violation of Article 3 of the Convention for the Protection of Human Rights and Fundamental Freedoms (ECHR). In support, the claimants point to the report of the Danish Refugee Council (DRC) and the Swiss Refugee Council (SRC) of 9 February 2017, 'Is Mutual Trust Enough? The situation of persons with special reception needs upon return to Italy' (DRC/SRC report), and to the AIDA update of 28 February 2017 in which this report is mentioned. It follows from this report that families with minor children, despite the guarantee of the Italian authorities that they will be received in a SPRAR location, run the risk of ending up in interim reception facilities unsuitable for them. The claimants also fear ending up in an unsuitable SPRAR location, as described on page 14 of the DRC/SRC report. This ground of appeal succeeds. 3.1. The starting point is that, on the basis of the principle of mutual trust between states, the respondent may assume that Italy will fulfil its treaty obligations, and that it is up to the claimants to make the contrary plausible. 3.2. It is not in dispute that the claimants and their minor child must be regarded as vulnerable persons within the meaning of the Tarakhel judgment, for whom adequate reception is required. In circular letters of 8 June 2015 and 15 February 2016, the Italian authorities gave guarantees for the reception of families with minor children. 3.3. The Administrative Jurisdiction Division has ruled, in among others its decisions of 16 September 2016 (ECLI:NL:RVS:2016:2533), 9 December 2016 (ECLI:NL:RVS:2016:3291) and 16 January 2017 (ECLI:NL:RVS:2017:73), that on the basis of the principle of mutual trust between states the respondent may in principle assume that the Italian authorities will honour the aforementioned guarantees by receiving families with minor children in so-called SPRAR locations, and that they will notify the respondent when they cannot provide the previously guaranteed reception of families with minor children within these SPRAR locations. In its decision of 9 December 2016, the Division considered in this regard, with reference to the judgment of the ECtHR of 21 July 2016 in the case of N.A. and others v. Denmark (ECLI:CE:ECHR:2016:0628DEC001563616), that although the number of reception places within the SPRAR locations is limited, it has not been shown that the Italian authorities will fail to honour their undertaking to increase the reception capacity if the need arises. In addition, according to the Division's decision of 16 September 2016, significance attaches to the respondent's undertaking that the transfer