OPUS 4 | Search

Refine

Has Fulltext

yes (6)

6 search hits

1 to 6

Sort by

Year
Year
Title
Title
Author
Author

Auf dem Weg zu einer Kartographie: automatische und manuelle Analysen am Beispiel des Korpus ISW (2021)

Flinz, Carolina ; Ruppenhofer, Josef

Is it worth the effort? Assessing the benefits of partial automatic pre-labeling for frame-semantic annotation (2012)

Rehbein, Ines ; Ruppenhofer, Josef ; Sporleder, Caroline

Corpora with high-quality linguistic annotations are an essential component in many NLP applications and a valuable resource for linguistic research. For obtaining these annotations, a large amount of manual effort is needed, making the creation of these resources time-consuming and costly. One attempt to speed up the annotation process is to use supervised machine-learning systems to automatically assign (possibly erroneous) labels to the data and ask human annotators to correct them where necessary. However, it is not clear to what extent these automatic pre-annotations are successful in reducing human annotation effort, and what impact they have on the quality of the resulting resource. In this article, we present the results of an experiment in which we assess the usefulness of partial semi-automatic annotation for frame labeling. We investigate the impact of automatic pre-annotation of differing quality on annotation time, consistency and accuracy. While we found no conclusive evidence that it can speed up human annotation, we found that automatic pre-annotation does increase its overall quality.

Editorial (2020)

Ruppenhofer, Josef ; Siegel, Melanie ; Struß, Julia Maria

IGGSA-STEPS: Shared Task on Source and Target Extraction from Political Speeches (2014)

Ruppenhofer, Josef ; Struß, Julia Maria

Accurate opinion mining requires the exact identification of the source and target of an opinion. To evaluate diverse tools, the research community relies on the existence of a gold standard corpus covering this need. Since such a corpus is currently not available for German, the Interest Group on German Sentiment Analysis decided to create such a resource and make it available to the research community in the context of a shared task. In this paper, we describe the selection of textual sources, development of annotation guidelines, and first evaluation results in the creation of a gold standard corpus for the German language.

Treebanking user-generated content: a UD based overview of guidelines, corpora and unified recommendations (2022)

Sanguinetti, Manuela ; Bosco, Cristina ; Cassidy, Lauren ; Cetinoglu, Özlem ; Cignarella, Alessandra Teresa ; Lynn, Teresa ; Rehbein, Ines ; Ruppenhofer, Josef ; Seddah, Djamé ; Zeldes, Amir

This article presents a discussion on the main linguistic phenomena which cause difficulties in the analysis of user-generated texts found on the web and in social media, and proposes a set of annotation guidelines for their treatment within the Universal Dependencies (UD) framework of syntactic analysis. Given on the one hand the increasing number of treebanks featuring user-generated content, and its somewhat inconsistent treatment in these resources on the other, the aim of this article is twofold: (1) to provide a condensed, though comprehensive, overview of such treebanks—based on available literature—along with their main features and a comparative analysis of their annotation criteria, and (2) to propose a set of tentative UD-based annotation guidelines, to promote consistent treatment of the particular phenomena found in these types of texts. The overarching goal of this article is to provide a common framework for researchers interested in developing similar resources in UD, thus promoting cross-linguistic consistency, which is a principle that has always been central to the spirit of UD.

Automatic generation of lexica for sentiment polarity shifters (2021)

Schulder, Marc ; Wiegand, Michael ; Ruppenhofer, Josef

Alleviating pain is good and abandoning hope is bad. We instinctively understand how words like alleviate and abandon affect the polarity of a phrase, inverting or weakening it. When these words are content words, such as verbs, nouns, and adjectives, we refer to them as polarity shifters. Shifters are a frequent occurrence in human language and an important part of successfully modeling negation in sentiment analysis; yet research on negation modeling has focused almost exclusively on a small handful of closed-class negation words, such as not, no, and without. A major reason for this is that shifters are far more lexically diverse than negation words, but no resources exist to help identify them. We seek to remedy this lack of shifter resources by introducing a large lexicon of polarity shifters that covers English verbs, nouns, and adjectives. Creating the lexicon entirely by hand would be prohibitively expensive. Instead, we develop a bootstrapping approach that combines automatic classification with human verification to ensure the high quality of our lexicon while reducing annotation costs by over 70%. Our approach leverages a number of linguistic insights; while some features are based on textual patterns, others use semantic resources or syntactic relatedness. The created lexicon is evaluated both on a polarity shifter gold standard and on a polarity classification task.

1 to 6

Open Access

Refine

Author

Year of publication

Document Type

Language

Has Fulltext

Is part of the Bibliography

Keywords

Publicationstate

Reviewstate

Publisher

6 search hits