Detecting annotation noise in automatically labelled data

  • We introduce a method for detecting errors in automatically annotated text, aimed at supporting the creation of high-quality language resources at affordable cost. Our method combines an unsupervised generative model with human supervision from active learning (AL). We test our approach on in-domain and out-of-domain data in two languages, both in AL simulations and in a real-world setting. In all settings, the results show that our method detects annotation errors with high precision and high recall.
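To make the general idea concrete, here is a minimal sketch of an active-learning loop for annotation-error detection. It is not the authors' model: where the paper combines an unsupervised generative model with human supervision, the sketch swaps in a plain scikit-learn logistic-regression classifier, and the `oracle` callback standing in for the human annotator is hypothetical; all names and parameters are illustrative.

    # Hedged sketch: generic active-learning loop for flagging annotation
    # errors. NOT the authors' method; a discriminative classifier stands
    # in for their unsupervised generative model.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def find_annotation_errors(X, auto_labels, oracle, budget=50):
        """Query a human oracle about the instances whose automatic label
        the model is least confident in.

        X           : (n_samples, n_features) feature matrix
        auto_labels : labels produced by the automatic annotator
        oracle      : callable(index) -> True if the auto label is correct
        budget      : number of human judgements we can afford
        """
        model = LogisticRegression(max_iter=1000)
        labels = np.asarray(auto_labels).copy()
        checked, errors = set(), []

        for _ in range(budget):
            model.fit(X, labels)              # retrain on current labels
            proba = model.predict_proba(X)
            # Confidence the model assigns to each item's *current* label.
            col = {c: i for i, c in enumerate(model.classes_)}
            conf = proba[np.arange(len(labels)), [col[l] for l in labels]]
            conf[list(checked)] = np.inf      # never re-query an item
            q = int(np.argmin(conf))          # most suspicious instance
            checked.add(q)
            if not oracle(q):                 # human says: label is wrong
                errors.append(q)
                # Provisional correction so later rounds learn from it.
                labels[q] = model.predict(X[q:q+1])[0]
        return errors

Querying the instance with the lowest model confidence in its current label is the standard uncertainty-sampling heuristic: it concentrates the limited human budget on the items most likely to be annotation errors.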

Metadata
Author: Ines Rehbein, Josef Ruppenhofer
URN: urn:nbn:de:bsz:mh39-80343
URL: http://aclweb.org/anthology/P17-1107
DOI: https://doi.org/10.18653/v1/P17-1107
ISBN: 978-1-945626-75-3
Parent Title (English): Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017), vol. 1 (Long Papers), July 30 - August 4, 2017, Vancouver, Canada
Publisher: The Association for Computational Linguistics
Place of publication: Stroudsburg, PA, USA
Document Type: Part of a Book
Language: English
Year of first Publication: 2017
Date of Publication (online): 2018/10/04
Publication state: Published version (Veröffentlichungsversion)
Review state: Peer-reviewed
GND Keywords: Annotation; Automatische Sprachverarbeitung; Computerlinguistik; Fehleranalyse
First Page: 1160
Last Page: 1170
DDC classes: 400 Language / 400 Language, Linguistics
Open Access?: yes
Leibniz Classification: Language, Linguistics
Linguistics Classification: Computational Linguistics
Program areas: Pragmatics
Program areas: Digital Linguistics
Licence (German): Protected by copyright (Urheberrechtlich geschützt)