Refine
Year of publication
- 2018 (112) (remove)
Document Type
- Article (48)
- Part of a Book (45)
- Conference Proceeding (11)
- Book (4)
- Review (3)
- Part of Periodical (1)
Is part of the Bibliography
- yes (112) (remove)
Keywords
- Deutsch (38)
- Korpus <Linguistik> (27)
- Gesprochene Sprache (12)
- Konversationsanalyse (9)
- Interaktionsanalyse (8)
- Semantische Analyse (7)
- Annotation (6)
- Automatische Sprachverarbeitung (6)
- Grammatik (6)
- Multimodalität (6)
Publicationstate
- Veröffentlichungsversion (67)
- Zweitveröffentlichung (38)
- Postprint (17)
Reviewstate
- Peer-Review (112) (remove)
Publisher
- European language resources association (ELRA) (13)
- de Gruyter (11)
- Erich Schmidt (8)
- Znanstvena založba Filozofske fakultete Univerze v Ljubljani / Ljubljana University Press, Faculty of Arts (7)
- Heidelberg University Publishing (5)
- Verlag für Gesprächsforschung (5)
- Cambridge University Press (3)
- Leibniz-Zentrum allgemeine Sprachwissenschaft (ZAS); Humboldt-Universität zu Berlin (3)
- Springer (3)
- The Association for Computational Linguistics (3)
The workshop presents ATHEN 1 (Annotation and Text Highlighting Environment), an extensible desktop-based annotation environment which supports more than just regular annotation. Besides being a general purpose annotation environment, ATHEN supports indexing and querying support of your data as well as the ability to automatically preprocess your data with Meta information. It is especially suited for those who want to extend existing general purpose annotation tools by implementing their own custom features, which cannot be fulfilled by other available annotation environments. On the according gitlab, we provide online tutorials, which demonstrate the use of specific features of ATHEN
Projektvorstellung – Redewiedergabe. Eine literatur- und sprachwissenschaftliche Korpusanalyse
(2018)
Das laufende DFG-Projekt „Redewiedergabe“ stellt einen Anwendungsfall quantitativer Sprach-und Literaturwissenschaft dar und beschäftigt sich mit dem Phänomen „Redewiedergabe“ auf der Grundlage großer Datenmengen. Zu diesem Zweck wird zum einen ein Korpus manuell mit Redewiedergabeformen annotiert, zum anderen werden Verfahren zur automatischen Erkennung des Phänomens entwickelt. Ziel ist es, Forschungsfragen nach der Entwicklung von Redewiedergabe vor allem im 19. Jahrhundert zu beantworten.
In this paper we discuss a type of copular clause – specificational copular clauses – in which subject properties may be split between two nominative noun phrases. In particular, while the first noun phrase occupies the canonical preverbal subject position, in some languages the finite verb can agree with the postverbal nominative. Such agreement might be expected, on some theoretical assumptions, to show person restrictions. We discuss this phenomenon in two SVO Germanic languages – Icelandic and Faroese – and present new data from Faroese showing that the person effect here follows from the existence of distinct probes for Number and Person agreement.
Das Forschungs- und Lehrkorpus Gesprochenes Deutsch (FOLK), zugänglich über die Datenbank für Gesprochenes Deutsch (DGD), strebt den Status eines Referenzkorpus für den aktuellen mündlichen Sprachgebrauch im deutschen Sprachraum an. Es enthält einen wachsenden Bestand von Audio- und Videoaufnahmen authentischer Gespräche aus verschiedenen Bereichen des gesellschaftlichen Lebens. Die Dokumentation und Repräsentation von Interaktions- und Sprecherinformationen sind bereits seit den Anfängen des Korpusaufbaus integrale Bestandteile von FOLK. Allerdings lag bislang kein ausgearbeitetes, empirisch erprobtes und vollständig in die Korpusinfrastruktur integrierbares Stratifikationskonzept vor. Mit dem vorliegenden Artikel wird ein solches Konzept vorgeschlagen. Es knüpft an frühere Konzeptionen an und wurde anhand der vorhandenen Daten überprüft, korrigiert und erweitert. Dieser Prozess verlief parallel zur Überarbeitung des XML-Schemas zur Metadatendokumentation, um die konkrete Implementierung vorzubereiten. Im Anschluss an eine Skizzierung genereller Aspekte des Korpusdesigns werden die stratifikationsleitenden und ergänzenden Parameter vorgestellt und erläutert. Abschließend werden Ansätze und Strategien zum Korpusausbau diskutiert.
Wir diskutieren in diesem Beitrag Implikationen, mit denen man zu tun bekommt, wenn man kleinste Formen situativer Vergesellschaftung – wir sprechen von kommunikativen Minimalformen – untersucht. Kommunikative Minimalformen sind kurzzeitige, nur wenige Sekunden dauernde, gemeinsam konstituierte Interaktionsereignisse. Ungeachtet ihrer Kürze weisen sie zum einen eine komplexe Interaktionsstruktur auf. Zum anderen besitzen sie auch eine klare soziale Implikation und eigene Wertigkeit. In dem hier untersuchten Fall, bei dem Passanten durch ein offenes Fenster in einen Privatraum blicken und dabei ertappt werden, zeigt sich diese soziale Implikativität als moralische Kommunikation im Sinne der interaktiven Bearbeitung eigenen Fehlverhaltens.
We study German affixoids, a type of morpheme in between affixes and free stems. Several properties have been associated with them – increased productivity; a bleached semantics, which is often evaluative and/or intensifying and thus of relevance to sentiment analysis; and the existence of a free morpheme counterpart – but not been validated empirically. In experiments on a new data set that we make available, we put these key assumptions from the morphological literature to the test and show that despite the fact that affixoids generate many low-frequency formations, we can classify these as affixoid or non-affixoid instances with a best F1-score of 74%.
In this paper we use methods for creating a large lexicon of verbal polarity shifters and apply them to German. Polarity shifters are content words that can move the polarity of a phrase towards its opposite, such as the verb “abandon” in “abandon all hope”. This is similar to how negation words like “not” can influence polarity. Both shifters and negation are required for high precision sentiment analysis. Lists of negation words are available for many languages, but the only language for which a sizable lexicon of verbal polarity shifters exists is English. This lexicon was created by bootstrapping a sample of annotated verbs with a supervised classifier that uses a set of data- and resource-driven features. We reproduce and adapt this approach to create a German lexicon of verbal polarity shifters. Thereby, we confirm that the approach works for multiple languages. We further improve classification by leveraging cross-lingual information from the English shifter lexicon. Using this improved approach, we bootstrap a large number of German verbal polarity shifters, reducing the annotation effort drastically. The resulting German lexicon of verbal polarity shifters is made publicly available.
Both for psychology and linguistics, emotion concepts are a continuing challenge for analysis in several respects. In this contribution, we take up the language of emotion as an object of study from several angles. First, we consider how frame semantic analyses of this domain by the FrameNet project have been developing over time, due to theory-internal as well as application-oriented goals, towards ever more fine-grained distinctions and greater within-frame consistency. Second, we compare how FrameNet’s linguistically oriented analysis of lexical items in the emotion domain compares to the analysis by domain experts of the experiences that give rise (directly or indirectly) to the lexical items. And finally, we consider to what extent frame semantic analysis can capture phenomena such as connotation and inference about attitudes, which are important in the field of sentiment analysis and opinion mining, even if they do not involve the direct evocation of emotion.
We present the pilot edition of the GermEval Shared Task on the Identification of Offensive Language. This shared task deals with the classification of German tweets from Twitter. It comprises two tasks, a coarse-grained binary classification task and a fine-grained multi-class classification task. The shared task had 20 participants submitting 51 runs for the coarse-grained task and 25 runs for the fine-grained task. Since this is a pilot task, we describe the process of extracting the raw-data for the data collection and the annotation schema. We evaluate the results of the systems submitted to the shared task. The shared task homepage can be found at https://projects.cai. fbi.h-da.de/iggsa/
Offensive language in social media is a problem currently widely discussed. Researchers in language technology have started to work on solutions to support the classification of offensive posts. We present the pilot edition of the GermEval Shared Task on the Identification of Offensive Language. This shared task deals with the classification of German tweets from Twitter. GermEval 2018 is the fourth workshop in a series of shared tasks on German processing.