
Towards a gold standard corpus for detecting valencies of Zulu verbs (2019)

We report on a new project building a Natural Language Processing resource for Zulu by making use of resources already available. By semi-automatically combining tagging results with the results of morphological analysis, we expect to reduce the amount of manual work needed to generate a fine-grained gold standard corpus usable for training a tagger. From the tagged corpus, we plan to extract verb-argument pairs with the aim of compiling a verb valency lexicon for Zulu.

A Supervised learning approach for the extraction of opinion sources and targets from German text (2019)

Wiegand, Michael ; Chikobava, Margarita ; Ruppenhofer, Josef

We present the first systematic supervised learning approach for the extraction of opinion sources and targets on German language data. A wide choice of different features is presented, particularly syntactic features and generalization features. We point out specific differences between opinion sources and targets. Moreover, we explain why implicit sources can be extracted even with fairly generic features. To ensure comparability, our classifier is trained and tested on the dataset of the STEPS shared task.

A descriptive analysis of a German corpus annotated with opinion sources and targets (2019)

Wiegand, Michael ; Lapp, Leonie ; Ruppenhofer, Josef

We present a descriptive analysis of the two datasets from the shared task on Source, Subjective Expression and Target Extraction from Political Speeches (STEPS), the only existing German dataset for opinion role extraction of its size. Our analysis discusses the individual properties of the three components (subjective expressions, sources, and targets) and their relations to each other. Our observations should help practitioners and researchers when building a system to extract opinion roles from German data.

Overview of GermEval Task 2, 2019 shared task on the identification of offensive language (2019)

Struß, Julia Maria ; Siegel, Melanie ; Ruppenhofer, Josef ; Wiegand, Michael ; Klenner, Manfred

We present the second edition of the GermEval Shared Task on the Identification of Offensive Language. This shared task deals with the classification of German tweets from Twitter. Two subtasks were continued from the first edition, namely a coarse-grained binary classification task and a fine-grained multi-class classification task. As a novel subtask, we introduce the classification of offensive tweets as explicit or implicit. The shared task had 13 participating groups submitting 28 runs for the coarse-grained task, another 28 runs for the fine-grained task, and 17 runs for the implicit-explicit task. We evaluate the results of the systems submitted to the shared task. The shared task homepage can be found at https://projects.fzai.h-da.de/iggsa/

“Konservenglück in Tiefkühl-Town”: The Songkorpus as an empirical resource for interdisciplinary research on German-language pop lyrics (2019)

Schneider, Roman

This paper describes a multiply annotated corpus of German-language song lyrics as a data basis for interdisciplinary research scenarios. The resource permits empirically grounded analyses of linguistic phenomena, systemic-structural interrelations, and tendencies in the lyrics of modern pop music. We present the design and annotations of the corpus, which is stratified into thematic and author-specific archives, along with descriptive statistics using the Udo Lindenberg archive as an example.

Detecting the boundaries of sentence-like units on spoken German (2019)

Ruppenhofer, Josef ; Rehbein, Ines

Automatic division of spoken language transcripts into sentence-like units is a challenging problem, caused by disfluencies, ungrammatical structures and the lack of punctuation. We present experiments on dividing up German spoken dialogues where we investigate the impact of task setup and data representation, encoding of context information as well as different model architectures for this task.

Metaphor detection for German poetry (2019)

Reinig, Ines ; Rehbein, Ines

This paper presents first steps towards metaphor detection in German poetry, in particular in expressionist poems. We create a dataset with adjective-noun pairs extracted from expressionist poems, manually annotated for metaphoricity. We discuss the annotation process and present models and experiments for metaphor detection where we investigate the impact of context and the domain dependence of the models.

Deep learning for free indirect representation (2019)

Brunner, Annelen ; Tu, Ngoc Duyen Tanja ; Weimer, Lukas ; Jannidis, Fotis

In this paper, we present our work in progress on automatically identifying free indirect representation (FI), a type of thought representation used in literary texts. With a deep learning approach using contextual string embeddings, we achieve F1 scores between 0.45 and 0.5 (sentence-based evaluation for the FI category) on two very different German corpora, a clear improvement on earlier attempts for this task. We show how consistently marked direct speech can help in this task. In our evaluation, we also consider human inter-annotator scores and thus address measures of certainty for this difficult phenomenon.

A tool for visualizing the use of spelling variants (2019)

Fischer, Peter M. ; Lang, Christian

In our paper, we present the development of a component-based tool for querying, analyzing, and visualizing spelling variants.

9 search hits