This paper describes EXMARaLDA, an XML-based framework for the construction, dissemination and analysis of corpora of spoken language transcriptions. Departing from a prototypical example of a “partitur” (musical score) transcription, the EXMARaLDA “single timeline, multiple tiers” data model and format is presented alongside the EXMARaLDA Partitur-Editor, a tool for inputting and visualizing such data. This is followed by a discussion of the interaction of EXMARaLDA with other frameworks and tools that work with similar data models. Finally, this paper presents an extension of the “single timeline, multiple tiers” data model and describes its application within the EXMARaLDA system.
This paper addresses questions that arise in connection with the archiving and public provision of conversation-analytic data (audio or video recordings and their transcriptions). It first gives an overview of the research perspectives that an improved practice of data archiving would offer conversation research, and then names some of the main problems that can stand in the way of creating such archives in current practice. It then presents existing approaches that can help to overcome these problems.
This paper presents ongoing work on a multilingual (English, French, German) lexical resource of soccer language. The first part describes how lexicographic descriptions based on frame-semantic principles are derived from a partially aligned multilingual corpus of soccer match reports. The remainder of the paper then discusses how different types of ontological knowledge are linked to this resource in order to provide an access structure to the resulting dictionary. It is argued that linking lexical resources and ontologies in such a way gives a dictionary user novel ways of navigating a domain vocabulary.
This paper presents the Kicktionary, a multilingual (English - German - French) electronic lexical resource of the language of football. It explains how a corpus of football match reports was analysed according to the FrameNet and WordNet approaches and how the result of this analysis is presented to a dictionary user via a website.
This paper presents the Kicktionary, a multilingual (English - German - French) electronic lexical resource of the language of football. In the Kicktionary, methods from corpus linguistics and two approaches to lexical semantics - the theory of frame semantics and the concept of semantic relations - are combined to construct a lexical resource in which the user can explore relationships between lexical units in various ways. This paper explains the theoretical background of the Kicktionary, sketches the data and methods which were used in its construction, and describes how the resulting resource is presented to users via a set of hyperlinked webpages.
Rescuing Legacy Data
(2008)
This paper discusses issues that arise in the transformation of electronic language data from outdated to modern, sustainable formats. We first describe the problem and then present four different cases in which corpora of spoken language were converted from legacy formats to an XML-based representation. For each of the four cases, we describe the conversion workflow and discuss the difficulties that we had to overcome. Based on this experience, we formulate some more general observations about transforming legacy data and conclude with a set of best practice recommendations for a more sustainable handling of language corpora.