Reproducibility review of: Comparing supervised learning algorithms for spatial nominal entity recognition

Academic year: 2021

Reproducibility review: "Comparing supervised learning algorithms for Spatial Nominal Entity recognition"

This report is part of the reproducibility review at the AGILE conference. For more information see https://reproducible-agile.github.io/

This document is published on OSF at https://osf.io/suwpj/. To cite this report, use:

Ostermann, F. O., and Nüst, D. (2020, July). Reproducibility review of: Comparing supervised learning algorithms for Spatial Nominal Entity recognition. https://doi.org/10.17605/OSF.IO/SUWPJ

Reviewed paper

Amine Medad, Mauro Gaio, Ludovic Moncla, Sébastien Mustière and Yannick Le Nir: Comparing supervised learning algorithms for Spatial Nominal Entity recognition. AGILE GiScience Ser., 1, 15. https://doi.org/10.5194/agile-giss-1-15-2020, 2020.

Source code: https://github.com/MedadAmine/Spatial-nominal-entity-recognition

Summary

The authors have done a commendable job of providing all required input data, scripts, and documentation to run the analysis. The reproduction was initially hindered because differences in the computational environment required adjustments to the libraries used; these adjustments were undocumented at first and have now been documented. Note that the analysis requires substantial downloads, disk space, and processing power. In the end, the reproduction was mostly successful.

Reproducibility reviewer notes

The materials on GitHub have an MIT license.

Data

- Original hiking texts: not available, although there is a list of words

- Lexicon: FastText, freely available online

- Corpus: entire corpus not available, although there is a list of words

- Samples for analysis: available (named corpus), but without documentation of their meaning

Processing

- uses open source libraries

- installing the libraries from requirements.txt into a new virtual environment throws an error (incompatible versions); fixed by installing the libraries manually

- the pre-trained FastText model is a massive download

- the example installation path for the model doesn't match the load path in the scripts

- cudart64 error for TensorFlow (ignored), depending on GPU

- TreeTaggerError: "Can't locate TreeTagger directory (and no TAGDIR specified)"

We were able to resolve the TreeTagger issue by following the TreeTagger installation instructions (https://cis.uni-muenchen.de/~schmid/tools/TreeTagger/) and downloading the French parameter file. This has now been documented in the repository as well. Afterwards we were able to execute all cells in the provided Jupyter Notebook within a local container (using repo2docker with the --editable option).
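The TreeTagger error above could be caught before any notebook cell runs with a small pre-flight check. The following is a minimal sketch, not part of the reviewed repository: the function name `find_treetagger_problems` is hypothetical, and the expected layout (a `bin/` directory plus a parameter file under `lib/`, here assumed to be named `french.par`) is an assumption based on the standard TreeTagger installation and may need adjusting.

```python
import os
from pathlib import Path


def find_treetagger_problems(tagdir, param_file="french.par"):
    """Return a list of problems; an empty list means the layout looks usable.

    `param_file` is an assumed file name for the French parameter file.
    """
    if not tagdir:
        return ["TAGDIR is not set"]
    root = Path(tagdir)
    if not root.is_dir():
        return [f"{root} is not a directory"]
    problems = []
    # Assumed layout: executables in bin/, parameter files in lib/.
    if not (root / "bin").is_dir():
        problems.append(f"missing {root / 'bin'}")
    if not (root / "lib" / param_file).is_file():
        problems.append(f"missing parameter file {root / 'lib' / param_file}")
    return problems


if __name__ == "__main__":
    # Report every problem found for the directory named by the TAGDIR variable.
    for problem in find_treetagger_problems(os.environ.get("TAGDIR")):
        print("TreeTagger check:", problem)
```

Running this before the notebook turns a late failure inside the tagging step into an early, readable message about what is missing.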

Results

The direct link between paper and code/models still has to be inferred. The outputs showed small numerical differences, which are recorded in this commit:

https://github.com/reproducible-agile/Spatial-nominal-entity-recognition/commit/872d110b507e3ba94aa1a23f29fa8539bc9255ff
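One way to make "small numerical differences" explicit is to compare reported and reproduced metrics against a stated tolerance. The sketch below is illustrative only: the function `compare_metrics`, the tolerance of 0.01, and all metric values are hypothetical placeholders and are not taken from the paper or from the reproduction.

```python
import math


def compare_metrics(reported, reproduced, abs_tol=0.01):
    """Map each metric name to (reported, reproduced, within_tolerance)."""
    result = {}
    for name, expected in reported.items():
        actual = reproduced.get(name)
        # A metric missing from the reproduction counts as out of tolerance.
        ok = actual is not None and math.isclose(expected, actual, abs_tol=abs_tol)
        result[name] = (expected, actual, ok)
    return result


# Hypothetical placeholder values, NOT results from the paper or this review.
reported = {"precision": 0.90, "recall": 0.85}
reproduced = {"precision": 0.895, "recall": 0.80}

for name, (expected, actual, ok) in compare_metrics(reported, reproduced).items():
    print(f"{name}: reported={expected} reproduced={actual} within_tolerance={ok}")
```

Stating the tolerance up front lets readers judge for themselves whether a given deviation still counts as a successful reproduction.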

Some suggestions for further improvements on an already very commendable effort:

- The README should clearly mention the data source of the download (Facebook AI research?)

- Maybe you could provide a suitable test dataset of a more manageable size, for demonstration and testing; this would even allow you to share your workflow as a Binder (see https://mybinder.org)

- Lastly, you could mention your execution times (along with a description of the hardware used)
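The last suggestion could be implemented with the Python standard library alone. The following is a minimal sketch under that assumption; the helper names `describe_environment` and `timed` are hypothetical, and the timed workload is a stand-in for an actual training or evaluation step.

```python
import os
import platform
import time


def describe_environment():
    """Collect a short, shareable description of the machine and interpreter."""
    return {
        "python": platform.python_version(),
        "system": platform.system(),
        "machine": platform.machine(),
        "processor": platform.processor(),
        "cpu_count": os.cpu_count(),
    }


def timed(step, *args, **kwargs):
    """Run `step` and return its result together with the elapsed seconds."""
    start = time.perf_counter()
    result = step(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    print(describe_environment())
    # Stand-in workload; replace with the real training or evaluation call.
    _, seconds = timed(sum, range(1_000_000))
    print(f"elapsed: {seconds:.3f} s")
```

Note that this does not capture GPU details; those would have to be added by hand or queried through the deep learning framework in use.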
