This paper presents Release 2.0 of the SALSA corpus, a German resource for lexical semantics. The new release adds annotations for German nouns, complementing the annotations of German verbs from Release 1.0. The corpus now includes around 24,000 sentences with more than 36,000 annotated instances. It was designed with NLP applications such as semantic role labeling in mind, but it will also be a useful resource for linguistic studies in lexical semantics.
Corpora with high-quality linguistic annotations are an essential component in many NLP applications and a valuable resource for linguistic research. Obtaining these annotations requires a large amount of manual effort, which makes such resources time-consuming and costly to create. One attempt to speed up the annotation process is to use supervised machine-learning systems to automatically assign (possibly erroneous) labels to the data and then ask human annotators to correct them where necessary. However, it is not clear to what extent these automatic pre-annotations reduce human annotation effort, or what impact they have on the quality of the resulting resource. In this article, we present the results of an experiment in which we assess the usefulness of partial semi-automatic annotation for frame labeling. We investigate the impact of automatic pre-annotation of differing quality on annotation time, consistency and accuracy. While we found no conclusive evidence that automatic pre-annotation speeds up human annotation, we found that it does increase the overall quality of the resulting annotations.