Named Entity Extraction and Disambiguation from an Uncertainty Perspective

Mena B. Habib, Maurice van Keulen

Database Group, University of Twente, The Netherlands
{m.b.habib, m.vankeulen}@ewi.utwente.nl

Named entity extraction and disambiguation have received much attention in recent years, chiefly in information retrieval, natural language processing, and the semantic web. This work addresses two problems with named entity extraction and disambiguation. First, almost no existing work examines the interdependency between extraction and disambiguation. Second, existing disambiguation techniques mostly take extracted named entities as input without considering the uncertainty and imperfection of the extraction process.

This work aims to investigate both problems and to show that explicitly handling annotation uncertainty has much potential for making both extraction and disambiguation more robust. We conducted experiments on a set of holiday home descriptions, extracting and disambiguating toponyms as a representative example of named entities. We show that the effectiveness of extraction influences the effectiveness of disambiguation and, reciprocally, that retraining the extraction models with information automatically derived from the disambiguation results improves the extraction models. This mutual reinforcement is shown to still have an effect after several iterations.
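
The feedback loop between extraction and disambiguation can be illustrated with a minimal toy sketch. It is not the authors' method: the abstract speaks of retraining extraction models, whereas this sketch merely nudges a per-token confidence up or down depending on whether disambiguation succeeded. The gazetteer contents, the function names (`extract`, `disambiguate`, `reinforce`), the 0.5 threshold, and the 0.2 step size are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy gazetteer (toponym -> candidate referents); not from the paper.
GAZETTEER: Dict[str, List[str]] = {
    "Paris": ["Paris, France", "Paris, Texas"],
    "Twente": ["Twente, Netherlands"],
}


def extract(tokens: List[str], weights: Dict[str, float],
            threshold: float = 0.5) -> List[str]:
    """Keep capitalized tokens whose current confidence reaches the threshold.

    Tokens without a stored confidence start at 0.5 (an assumed prior).
    """
    return [t for t in tokens
            if t[:1].isupper() and weights.get(t, 0.5) >= threshold]


def disambiguate(candidates: List[str]) -> Dict[str, str]:
    """Resolve each candidate to its first gazetteer entry.

    Candidates that are absent from the gazetteer are left unresolved.
    """
    return {c: GAZETTEER[c][0] for c in candidates if c in GAZETTEER}


def reinforce(tokens: List[str], iterations: int = 3,
              step: float = 0.2) -> Tuple[Dict[str, str], Dict[str, float]]:
    """Alternate extraction and disambiguation, feeding results back.

    A candidate that disambiguates gains confidence; one that does not
    loses confidence and may be dropped by later extraction passes.
    """
    weights: Dict[str, float] = {}
    resolved: Dict[str, str] = {}
    for _ in range(iterations):
        candidates = extract(tokens, weights)
        resolved = disambiguate(candidates)
        for c in candidates:
            delta = step if c in resolved else -step
            weights[c] = min(1.0, max(0.0, weights.get(c, 0.5) + delta))
    return resolved, weights


tokens = "Lovely house near Paris and Twente".split()
resolved, weights = reinforce(tokens)
```

In this toy run the capitalized word "Lovely" is extracted in the first pass, fails to disambiguate, and is no longer extracted afterwards, while "Paris" and "Twente" gain confidence. This mirrors the claimed direction of the effect only qualitatively.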
