OPUS 4 | Search

19 search hits

Detecting annotation noise in automatically labelled data (2017)

We introduce a method for error detection in automatically annotated text, aimed at supporting the creation of high-quality language resources at affordable cost. Our method combines an unsupervised generative model with human supervision from active learning. We test our approach on in-domain and out-of-domain data in two languages, in AL simulations and in a real-world setting. For all settings, the results show that our method is able to detect annotation errors with high precision and high recall.

Who’s in, who’s out? Predicting the inclusiveness or exclusiveness of personal pronouns in parliamentary debates (2022)

Rehbein, Ines ; Ruppenhofer, Josef

This paper presents a compositional annotation scheme to capture the clusivity properties of personal pronouns in context, that is, their ability to construct and manage in-groups and out-groups by including/excluding the audience and/or non-speech act participants in reference to groups that also include the speaker. We apply and test our schema on pronoun instances in speeches taken from the German parliament. The speeches cover the period from 2017 to 2021 and comprise manual annotations for 3,126 sentences. We achieve high inter-annotator agreement for our new schema, with a Cohen’s κ in the range of 89.7-93.2 and a percentage agreement of > 96%. Our exploratory analysis of in/exclusive pronoun use in the parliamentary setting provides some face validity for our new schema. Finally, we present baseline experiments for automatically predicting clusivity in political debates, with promising results for many referential constellations, yielding an overall 84.9% micro F1 for all pronouns.

Sprucing up the trees – error detection in treebanks (2018)

Rehbein, Ines ; Ruppenhofer, Josef

We present a method for detecting annotation errors in manually and automatically annotated dependency parse trees, based on ensemble parsing in combination with Bayesian inference, guided by active learning. We evaluate our method in different scenarios: (i) for error detection in dependency treebanks and (ii) for improving parsing accuracy on in- and out-of-domain data.

Argument omissions in multiple German corpora (2018)

Ruppenhofer, Josef

FrameNet (2018)

Ruppenhofer, Josef ; Boas, Hans Christian ; Baker, Collin F.

The FrameNet approach to relating syntax and semantics (2013)

Ruppenhofer, Josef ; Boas, Hans Christian ; Baker, Collin F.

Verifying the robustness of opinion inference (2016)

Ruppenhofer, Josef ; Brandes, Jasper

There is increasing interest in recognizing opinion inferences in addition to expressions of explicit sentiment. While different formalisms for representing inferential mechanisms are being developed and lexical resources are being built alongside, we here address the need for deeper investigation of the robustness of various aspects of opinion inference, performing crowdsourcing experiments with constructed stimuli as well as a corpus study of attested data.

I’ve got a construction looks funny – representing and recovering non-standard constructions in UD (2020)

Ruppenhofer, Josef ; Rehbein, Ines

The UD framework defines guidelines for a crosslingual syntactic analysis in the framework of dependency grammar, with the aim of providing a consistent treatment across languages that not only supports multilingual NLP applications but also facilitates typological studies. Until now, the UD framework has mostly focussed on bilexical grammatical relations. In this paper, we propose to add a constructional perspective and discuss several examples of spoken-language constructions that occur in multiple languages and challenge the current use of basic and enhanced UD relations. The examples include cases where the surface relations are deceptive, and syntactic amalgams that either involve unconnected subtrees or structures with multiply-headed dependents. We argue that a unified treatment of constructions across languages will increase the consistency of the UD annotations and thus the quality of the treebanks for linguistic analysis.

Overview of the IGGSA 2016 Shared Task on Source and Target Extraction from Political Speeches (2016)

Ruppenhofer, Josef ; Struß, Julia Maria ; Wiegand, Michael

We present the second iteration of IGGSA’s Shared Task on Sentiment Analysis for German. It resumes the STEPS task of IGGSA’s 2014 evaluation campaign: Source, Subjective Expression and Target Extraction from Political Speeches. As before, the task is focused on fine-grained sentiment analysis, extracting sources and targets with their associated subjective expressions from a corpus of speeches given in the Swiss parliament. The second iteration exhibits some differences, however; mainly the use of an adjudicated gold standard and the availability of training data. The shared task had 2 participants submitting 7 runs for the full task and 3 runs for each of the subtasks. We evaluate the results and compare them to the baselines provided by the previous iteration. The shared task homepage can be found at http://iggsasharedtask2016.github.io/.

Distinguishing affixoid formations from compounds (2018)

Ruppenhofer, Josef ; Wiegand, Michael ; Wilm, Rebecca ; Markert, Katja

We study German affixoids, a type of morpheme in between affixes and free stems. Several properties have been associated with them – increased productivity; a bleached semantics, which is often evaluative and/or intensifying and thus of relevance to sentiment analysis; and the existence of a free morpheme counterpart – but have not been validated empirically. In experiments on a new data set that we make available, we put these key assumptions from the morphological literature to the test and show that, despite the fact that affixoids generate many low-frequency formations, we can classify these as affixoid or non-affixoid instances with a best F1-score of 74%.
