In this paper, we present the Multiple Annotation approach, which solves two problems: the problem of annotating overlapping structures, and the problem that occurs when documents should be annotated according to different, possibly heterogeneous tag sets. This approach has many advantages: it is based on XML, the modeling of alternative annotations is possible, each level can be viewed separately, and new levels can be added at any time. The files can be regarded as an interrelated unit, with the text serving as the implicit link. Two representations of the information contained in the multiple files (one in Prolog and one in XML) are described. These representations serve as a base for several applications.
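The idea of the text serving as the implicit link between separately annotated files can be illustrated with a small sketch. It is not the representation described in the paper (which uses Prolog and XML); it only assumes that each annotation level is a well-formed XML file over the identical character sequence, so that element boundaries can be compared as character offsets. The function names `spans` and `crossing_pairs` and the sample tag names `clause` and `unit` are hypothetical.

```python
import xml.etree.ElementTree as ET


def spans(xml_string):
    """Return (plain_text, [(tag, start, end), ...]) for one annotation layer.

    Offsets are character positions in the text with all markup removed.
    """
    root = ET.fromstring(xml_string)
    parts = []
    found = []
    pos = 0

    def walk(elem):
        nonlocal pos
        start = pos
        if elem.text:
            parts.append(elem.text)
            pos += len(elem.text)
        for child in elem:
            walk(child)
            if child.tail:
                parts.append(child.tail)
                pos += len(child.tail)
        found.append((elem.tag, start, pos))

    walk(root)
    return "".join(parts), found


def crossing_pairs(layer_a, layer_b):
    """List pairs of elements from two layers whose spans cross (overlap without nesting)."""
    text_a, spans_a = spans(layer_a)
    text_b, spans_b = spans(layer_b)
    # The shared text is the only link between the files, so it must be identical.
    if text_a != text_b:
        raise ValueError("layers do not annotate the same text")
    pairs = []
    for a in spans_a:
        for b in spans_b:
            _, a1, a2 = a
            _, b1, b2 = b
            if a1 < b1 < a2 < b2 or b1 < a1 < b2 < a2:
                pairs.append((a, b))
    return pairs


# Two hypothetical levels over the same sentence; their boundaries cannot be
# merged into a single well-formed XML tree.
LAYER_A = "<text><clause>I know </clause><clause>she left early.</clause></text>"
LAYER_B = "<text><unit>I know she </unit><unit>left early.</unit></text>"

print(crossing_pairs(LAYER_A, LAYER_B))
```

Because each level stays in its own file, a further level can be added later without touching the existing ones; only the offset comparison needs to be rerun.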
This paper aims at contributing to the analysis of overlaps in turns-at-talk from both a sequential and a multimodal perspective. Overlaps have been studied within Conversation Analysis by focusing mainly on verbal and vocal resources; taking into account multimodal resources such as gesture, bodily posture, and gaze contributes to a better understanding of participants’ orientations to the sequential organization of overlapping talk and their management of speakership. First, we introduce the way in which overlaps have been studied in Conversation Analysis, mainly by Jefferson (1973, 1983, 2004) and Schegloff (2000); then we propose possible implications of their multimodal analysis. In order to demonstrate that speakers systematically orient to the overlap onset and resolution we analyze the multimodal conduct of overlapped speakers. Findings show methodical variations in trajectories of overlap resolution: speakers’ gestures in overlap display themselves as maintaining or withdrawing their turn, thereby exhibiting the speakership achieved and negotiated during overlap.
In two eye-tracking experiments, we investigated the relationship between the subject preference in the resolution of subject-object ambiguities in German embedded clauses and semantic word order constraints (i.e., prominence hierarchies relating to the specificity/referentiality of noun phrases, case assignment and thematic role assignment). Our central research question concerned the timecourse with which prominence information is used and particularly whether it modulates the subject preference. In both experiments, we replicated previous findings of reanalysis effects for object-initial structures. Our findings further suggest that noun phrase prominence does not alter initial parsing strategies (viz., the subject preference), but rather modulates the ease of later reanalysis processes. In Experiment 1, the object case assigned by the verb did not affect the ease of reanalysis. However, the syntactic reanalysis was rendered more difficult when the order of the two arguments violated the specificity/referentiality hierarchy. Experiment 2 revealed that the initial subject preference also holds for verbs favoring an object-initial base order (i.e., dative object-experiencer verbs). However, the advantage for subject-initial sentences is neutralized in relatively late processing stages when the thematic role hierarchy and the specificity hierarchy converge to promote scrambling.
This article reports on ongoing work on a new version of the metadata framework Component Metadata Infrastructure (CMDI), which is central to the CLARIN infrastructure. Version 1.2 introduces a number of important changes based on the experience gathered in the last five years of intensive use of CMDI by the digital humanities community, addressing problems encountered but also introducing new functionality. In addition to consolidating the structure of the model and improving schema sanity, the new version introduces means for lifecycle management aimed at combating the observed proliferation of components. A new mechanism for the use of external vocabularies will contribute to more consistent use of controlled values, and cues for tools will allow improved presentation of the metadata records to human users. The feature set has been frozen and approved, and the infrastructure is now entering a transition phase in which all tools and data need to be migrated to the new version.
This paper explores the relationship between the complexity of political argumentation processes and the diversification of the semantics of keywords whose meaning is contested in the argumentation process and unfolded in numerous facets. The object of study is the use of „Ökologie" ("ecology") in the arbitration talks on the railway project Stuttgart 21. In contrast to previous analyses of semantic struggles, the focus is less on how one party assigns meaning to an expression in opposition to others. Rather, the paper shows how semantic diversification and ambiguity of „Ökologie" arise in the expert argumentation process and what communicative effects this has on the possibility of citizen participation. Three practices are identified with which the interaction participants themselves respond to semantic diversification and ambiguity and try to make the expression unambiguously interpretable and the quaestio decidable: imputations of strategy, popularizations, and populism. The interaction analyses show that these practices themselves reproduce the very problem they are meant to solve.
Many applications in Natural Language Processing require a semantic analysis of sentences in terms of truth-conditional representations, often with specific desiderata as to which information needs to be included in the semantic analysis. However, there are only very few tools that allow such an analysis. We investigate the representations produced by an automatic analysis pipeline consisting of the C&C parser and Boxer to determine whether Boxer's analyses in the form of Discourse Representation Structures can be successfully converted into a more surface-oriented event semantic representation, which will serve as input for a fusion algorithm for fusing hard and soft information. We use a data set of synthetic counterintelligence messages for our investigation. We provide a basic pipeline for conversion and subsequently discuss areas in which ambiguities and differences between the semantic representations present challenges in the conversion process.
Brown clustering has been used to help increase parsing performance for morphologically rich languages. However, much of the work has focused on using clustering techniques to replace terminal nodes or as a feature for parsing. Instead, we examine how effective Brown clustering is for unlexicalized parsing by creating data-driven POS tagsets, which are then used with the Berkeley parser. We investigate cluster sizes as well as which information (e.g. words vs. lemmas) should be clustered to yield the best parser performance. Our results approach the current state-of-the-art results for the German TüBa-D/Z treebank when using parser-internal tagging.
We present contrastive analyses of the filling and frequency distribution of prefields (Vorfelder) in German and their French, Italian, Norwegian, Polish, and Hungarian equivalents in morphosyntactically annotated Wikipedia corpora. Using corpus-analytic methods, the study established quantitative correlations in the language-specific realizations of prefields that are consistent with typical structural properties of the contrast languages examined. The results suggest, however, that the prefield structures examined, despite the considerable size and thematic diversity of the Wikipedia corpora, are not sufficiently representative to allow unrestricted conclusions about general structural properties of the six contrast languages. This is due in particular to the pronounced text-type specificity of the media genre (online) encyclopedia, which was demonstrated with the help of further comparison corpora.