This paper describes EXMARaLDA, an XML-based framework for the construction, dissemination and analysis of corpora of spoken language transcriptions. Starting from a prototypical example of a “partitur” (musical score) transcription, the EXMARaLDA “single timeline, multiple tiers” data model and format is presented alongside the EXMARaLDA Partitur-Editor, a tool for inputting and visualizing such data. This is followed by a discussion of the interaction of EXMARaLDA with other frameworks and tools that work with similar data models. Finally, this paper presents an extension of the “single timeline, multiple tiers” data model and describes its application within the EXMARaLDA system.
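As a rough illustration of what a “single timeline, multiple tiers” model can look like, the following minimal Python sketch may help. It is not EXMARaLDA's actual XML format or API: the class names (`Event`, `Tier`, `Transcription`), the `events_at` helper, the category label and the sample utterances are all assumptions made for this example. The idea shown is only that every tier anchors its events to points on one shared, ordered timeline, so that overlapping speech can be read off column by column as in a musical score.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    start: str  # id of the shared timeline point where the event begins
    end: str    # id of the shared timeline point where the event ends
    text: str


@dataclass
class Tier:
    speaker: str
    category: str  # hypothetical label, e.g. "v" for a verbal tier
    events: list[Event] = field(default_factory=list)


@dataclass
class Transcription:
    timeline: list[str]  # ordered ids of the timeline points shared by all tiers
    tiers: list[Tier] = field(default_factory=list)

    def events_at(self, point: str) -> list[tuple[str, str]]:
        """Return (speaker, text) for each event covering the interval that starts at `point`."""
        pos = {p: i for i, p in enumerate(self.timeline)}
        i = pos[point]
        hits = []
        for tier in self.tiers:
            for ev in tier.events:
                # An event covers the interval if it starts at or before it and ends after it.
                if pos[ev.start] <= i < pos[ev.end]:
                    hits.append((tier.speaker, ev.text))
        return hits


# Invented sample data: two speakers whose utterances overlap between T1 and T2.
demo = Transcription(
    timeline=["T0", "T1", "T2", "T3"],
    tiers=[
        Tier("A", "v", [Event("T0", "T2", "so what do you think")]),
        Tier("B", "v", [Event("T1", "T3", "well I am not sure")]),
    ],
)

print(demo.events_at("T1"))
```

Querying the interval that starts at `T1` returns one event from each tier, which corresponds to the overlap that a partitur display would show as vertically aligned text.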
Time-based data models and the Text Encoding Initiative’s guidelines for transcription of speech
(2005)
This paper describes EXMARaLDA, a system for computer transcription of spoken discourse developed and used by the SFB "Mehrsprachigkeit" at the University of Hamburg. EXMARaLDA consists of several DTDs for XML coding of transcription data and some input and output tools for these formats. Apart from being a transcription system in its own right, EXMARaLDA also plays the role of a mediator between older existing data formats at the SFB and between these formats and a planned database of multilingual spoken discourse.
This paper presents the Kicktionary, a multilingual (English - German - French) electronic lexical resource of the language of football. It explains how a corpus of football match reports was analysed according to the FrameNet and WordNet approaches and how the result of this analysis is presented to a dictionary user via a website.
This paper presents the Kicktionary, a multilingual (English - German - French) electronic lexical resource of the language of football. In the Kicktionary, methods from corpus linguistics and two approaches to lexical semantics - the theory of frame semantics and the concept of semantic relations - are combined to construct a lexical resource in which the user can explore relationships between lexical units in various ways. This paper explains the theoretical background of the Kicktionary, sketches the data and methods which were used in its construction, and describes how the resulting resource is presented to users via a set of hyperlinked webpages.
This paper describes a new research initiative addressing the issue of sustainability of linguistic resources. This initiative is a cooperation between three linguistic collaborative research centres in Germany, which comprise more than 40 individual research projects altogether. These projects are involved in creating manifold language resources, especially corpora, tailored to their particular needs. The aim of the project described here is to ensure effective and sustainable access to these data by third-party researchers beyond the termination of these projects. This goal involves a number of measures, such as the definition of a common data format to completely capture the heterogeneous information encoded in the individual corpora, the development of user-friendly and sustainably usable tools for processing (e.g. querying) the data, and the specification of common inventories of metadata and terminology. Moreover, the project aims at formulating general rules of best practice for creating, accessing, and archiving linguistic resources.