A Workflow for the Semantic Annotation of Field Books and Specimen Labels


Biodiversity Information Science and Standards 2: e25839 doi: 10.3897/biss.2.25839

Conference Abstract

Lise Stork‡, Andreas Weber§, Eulàlia Gassó Miracle|, Katherine Wolstencroft‡

‡ Leiden Institute of Advanced Computer Science, Leiden, Netherlands
§ University of Twente, Twente, Netherlands
| Naturalis Biodiversity Center, Leiden, Netherlands

Corresponding author: Lise Stork (l.stork@liacs.leidenuniv.nl)

Received: 15 Apr 2018 | Published: 13 Jun 2018

Citation: Stork L, Weber A, Miracle E, Wolstencroft K (2018) A Workflow for the Semantic Annotation of Field Books and Specimen Labels. Biodiversity Information Science and Standards 2: e25839. https://doi.org/10.3897/biss.2.25839

Abstract

Geographical and taxonomical referencing of specimens and documented species observations from within and across natural history collections is vital for ongoing species research. However, much of the historical data, such as field books, diaries and specimens, is challenging to work with. These sources are computationally inaccessible, refer to historical place names and taxonomies, and are written in a variety of languages.

In order to address these challenges and elucidate historical species observation data, we developed a workflow to (i) crowd-source semantic annotations from handwritten species observations, (ii) transform them into RDF (Resource Description Framework) and (iii) store and link them in a knowledge base.
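Step (ii) of the workflow can be illustrated with a minimal sketch; it is not the project's implementation. It serialises one verbatim annotation as N-Triples using Darwin Core term URIs. The base URI `http://example.org/sfb/`, the function names and the sample values are assumptions made for illustration only.

```python
# Darwin Core terms namespace (published by TDWG).
DWC = "http://rs.tdwg.org/dwc/terms/"
# Hypothetical base URI for observation subjects; not taken from the project.
BASE = "http://example.org/sfb/"


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def annotation_to_ntriples(obs_id: str, fields: dict[str, str]) -> list[str]:
    """Return one N-Triples line per annotated Darwin Core field."""
    subject = f"<{BASE}observation/{obs_id}>"
    return [
        f'{subject} <{DWC}{term}> "{escape_literal(text)}" .'
        for term, text in sorted(fields.items())
    ]


# Sample values are invented; the text is kept verbatim, without interpretation.
triples = annotation_to_ntriples(
    "obs-1",
    {"scientificName": "Pteropus edulis", "verbatimLocality": "Buitenzorg"},
)
for line in triples:
    print(line)
```

Keeping the literal exactly as written on the page matches the verbatim approach described in the abstract: resolving a name to a current taxon would be a separate, later statement rather than a change to these triples.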

Instead of full transcription, we directly annotate digital field book scans with key concepts that are based on Darwin Core standards. Our workflow stresses the importance of verbatim annotation. The interpretation of the historical content, such as resolving a historical taxon to a current one, can be done by individual researchers after the content is published as linked open data. Through the storage of annotation provenance (who created the annotation and when), we allow multiple interpretations of the content to exist in parallel, stimulating scientific discourse.

© Stork L et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

The semantic annotation process is supported by a web application, the Semantic Field Book (SFB)-Annotator, driven by an application ontology. The ontology formally describes the content and metadata required to semantically annotate species observations. It is based on the Darwin Core standard (DwC), Uberon and the Geonames ontology. The provenance of annotations is stored using the Web Annotation Data Model. Because the workflow adheres to the principles of FAIR (Findable, Accessible, Interoperable & Reusable) and Linked Open Data, the content of the specimen collections can be interpreted homogeneously and aggregated across datasets. This work is part of the Making Sense project: makingsenseproject.org. The project aims to disclose the content of a natural history collection: a 17,000-page account of the exploration of the Indonesian Archipelago between 1820 and 1850 (Natuurkundige Commissie voor Nederlands-Indie).
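As a rough illustration of how provenance in the Web Annotation Data Model lets interpretations coexist, the sketch below builds two annotations on the same region of a page scan, each with its own `creator` and `created` values. The property names follow the W3C model; the helper name, all `example.org` URIs, the region coordinates and the body texts are hypothetical, and the SFB-Annotator's actual output may differ.

```python
import json


def make_annotation(anno_id, creator, created, image, region, body_text):
    """Build a Web Annotation (as a JSON-LD dict) for a region of a page scan."""
    x, y, w, h = region
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "creator": creator,   # who created the annotation
        "created": created,   # when it was created
        "body": {"type": "TextualBody", "value": body_text},
        "target": {
            "source": image,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


# Hypothetical scan and region; two annotators read the same handwriting.
page = "http://example.org/scans/page-12.jpg"
region = (120, 340, 400, 60)
first = make_annotation(
    "http://example.org/anno/1", "http://example.org/user/a",
    "2018-04-15T10:00:00Z", page, region, "reading by annotator A",
)
second = make_annotation(
    "http://example.org/anno/2", "http://example.org/user/b",
    "2018-06-13T09:30:00Z", page, region, "reading by annotator B",
)

# Both interpretations are kept side by side; neither overwrites the other.
interpretations = [a for a in (first, second) if a["target"]["source"] == page]
print(json.dumps(first, indent=2))
```

Because each annotation is a separate resource with its own provenance, a later reader can compare or filter interpretations by creator or date instead of seeing only the most recent one.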

With a knowledge base, researchers are given easy access to the primary sources of natural history collections. For their research, they can aggregate species observations, construct rich queries to browse through the data and add their own interpretations regarding the meaning of the historical content.
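The kind of aggregation described above can be sketched over a handful of in-memory triples. This is only an illustration under stated assumptions: the abstract does not say how the knowledge base is queried (a deployed system would presumably expose a query language such as SPARQL), and the observation identifiers, names and helper functions below are placeholders.

```python
from collections import defaultdict

DWC = "http://rs.tdwg.org/dwc/terms/"

# Placeholder triples: (subject, predicate, object).
triples = [
    ("obs-1", DWC + "scientificName", "Taxon A"),
    ("obs-1", DWC + "verbatimLocality", "Place X"),
    ("obs-2", DWC + "scientificName", "Taxon A"),
    ("obs-3", DWC + "scientificName", "Taxon B"),
]


def match(store, s=None, p=None, o=None):
    """Yield triples matching a pattern; None acts as a wildcard."""
    for triple in store:
        if all(want is None or want == got
               for want, got in zip((s, p, o), triple)):
            yield triple


def observations_by_name(store):
    """Group observation identifiers by their verbatim scientific name."""
    grouped = defaultdict(list)
    for subject, _, name in match(store, p=DWC + "scientificName"):
        grouped[name].append(subject)
    return dict(grouped)


print(observations_by_name(triples))
```

The wildcard pattern in `match` mirrors the basic triple-pattern idea behind RDF query languages, which is what makes observations from different field books comparable once they share Darwin Core terms.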

Keywords

Linked Data, Biodiversity, Natural History Collections, Ontologies, crowd-sourcing, Semantic Annotation, History of Science

Presenting author

Lise Stork
