POS error detection in automatically annotated corpora
- Recent work on error detection has shown that the quality of manually annotated corpora can be substantially improved by applying consistency checks to the data and automatically identifying incorrectly labelled instances. These methods, however, cannot be used for automatically annotated corpora, where errors are systematic and cannot easily be identified by looking at the variance in the data. This paper targets the detection of POS errors in automatically annotated corpora, so-called silver standards, showing that by combining different measures sensitive to annotation quality we can identify a large part of the errors and obtain a substantial increase in accuracy.
Author: | Ines Rehbein |
---|---|
URN: | urn:nbn:de:bsz:mh39-55986 |
ISBN: | 978-1-941643-29-7 |
Parent Title (English): | Proceedings of the 8th Linguistic Annotation Workshop in conjunction with COLING 2014 (LAW-VIII). August 23-24, 2014. Dublin, Ireland |
Publisher: | ACL |
Place of publication: | Stroudsburg, PA |
Editor: | Lori Levin, Manfred Stede |
Document Type: | Conference Proceeding |
Language: | English |
Year of first Publication: | 2014 |
Date of Publication (online): | 2016/11/21 |
Publication state: | Published version |
GND Keyword: | Annotation; Automatische Sprachanalyse; Korpus <Linguistik> |
First Page: | 20 |
Last Page: | 28 |
DDC classes: | 400 Language / 400 Language, Linguistics |
Open Access?: | yes |
Linguistics-Classification: | Computational linguistics |