410 Linguistics
Die Ordnung des öffentlichen Diskurses der Wirtschaftskrise und die (Un-)Ordnung des Ausgeblendeten
(2011)
We introduce a system that learns the participants of arbitrary given scripts. This system processes data from web experiments, in which each participant can be realized with different expressions. It computes participants by encoding semantic similarity and global structural information into an Integer Linear Program. An evaluation against a gold standard shows that we significantly outperform two informed baselines.
Semantic argument structures are often incomplete in that core arguments are not locally instantiated. However, many of these implicit arguments can be linked to referents in the wider context. In this paper we explore a number of linguistically motivated strategies for identifying and resolving such null instantiations (NIs). We show that a more sophisticated model for identifying definite NIs can lead to noticeable performance gains over the state-of-the-art for NI resolution.
In recent years, ethnolectal forms of German have developed in many large European cities among second- and third-generation migrant youths. They are characteristic of multilingual contexts in which speakers of different heritage languages use the regional colloquial variety of the country they live in as a lingua franca. The new forms overlap considerably with the regional varieties but differ from them prosodically and phonetically, lexically, and morphosyntactically. They are mostly used only in certain contexts, and speakers switch with great skill between regional varieties, heritage varieties, mixed forms, and ethnolectal forms.
On the basis of three ethnographic case studies in Mannheim, it is shown what the ethnolectal forms developed by the migrant youths look like and for what purposes the young people use them. The young people have a broad linguistic repertoire, command both ethnolectal and near-standard forms, and use the difference between the two as a communicative resource.
Active Learning (AL) has been proposed as a technique to reduce the amount of annotated data needed in the context of supervised classification. While various simulation studies for a number of NLP tasks have shown that AL works well on gold-standard data, there is some doubt whether the approach can be successful when applied to noisy, real-world data sets. This paper presents a thorough evaluation of the impact of annotation noise on AL and shows that systematic noise resulting from biased coder decisions can seriously harm the AL process. We present a method to filter out inconsistent annotations during AL and show that this makes AL far more robust when applied to noisy data.
In this contribution, we discuss and compare alternative options of modelling the entities and relations of wordnet-like resources in the Web Ontology Language OWL. Based on different modelling options, we developed three models of representing wordnets in OWL, i.e. the instance model, the class model, and the metaclass model. These OWL models mainly differ with respect to the ontological status of lexical units (word senses) and the synsets. While in the instance model lexical units and synsets are represented as individuals, in the class model they are represented as classes; both model types can be encoded in the dialect OWL DL. As a third alternative, we developed a metaclass model in OWL Full, in which lexical units and synsets are defined as metaclasses, the individuals of which are classes themselves. We apply the three OWL models to each of three wordnet-style resources: (1) a subset of the German wordnet GermaNet, (2) the wordnet-style domain ontology TermNet, and (3) GermaTermNet, in which TermNet technical terms and GermaNet synsets are connected by means of a set of “plug-in” relations. We report on the results of several experiments in which we evaluated the performance of querying and processing these different models: (1) a comparison of all three OWL models (class, instance, and metaclass model) of TermNet in the context of automatic text-to-hypertext conversion, (2) an investigation of the potential of the GermaTermNet resource by the example of a wordnet-based semantic relatedness calculation.
Einführung
(2011)