Developing Solutions for Long-Term Archiving of Spoken Language Data at the Institut für Deutsche Sprache
- This document presents ongoing work on spoken language data within a project that aims to establish a common, unified infrastructure for the sustainable provision of linguistic primary research data at the Institut für Deutsche Sprache (IDS). In furtherance of the institute's mission to “document the German language as it is currently used”, the project expects to give the research community access to a broad empirical base of working material via a single platform. The eventual goal is to cover all linguistically relevant digital resources of the IDS, including lexicographic information systems such as the IDS German Vocabulary Portal (OWID), written language corpora such as the IDS German Reference Corpus (DeReKo), and spoken language corpora such as the IDS German Speech Corpus for Research and Teaching (FOLK). The work presented here, however, focuses predominantly on the last of these, i.e. speech corpora. Within this context, the document describes the project’s contributions to the development of standards and best practice guidelines concerning data storage, process documentation and legal issues for the sustainable preservation and long-term accessibility of primary linguistic research data.
Author: | Peter M. Fischer, Andreas Witt |
---|---|
URN: | urn:nbn:de:bsz:mh39-44958 |
URL: | http://lrec-conf.org/proceedings/lrec2012/index.html |
Parent Title (English): | Proceedings of the LREC-12 Workshop on Best Practices for Speech Corpora in Linguistic Research. Istanbul, Turkey, May 2012 |
Publisher: | European Language Resources Association (ELRA) |
Place of publication: | Paris |
Editor: | Michael Haugh, Şükriye Ruhi, Thomas Schmidt, Kai Wörner |
Document Type: | Conference Proceeding |
Language: | English |
Year of first Publication: | 2012 |
Date of Publication (online): | 2015/12/16 |
Publication state: | Published version |
Review state: | (Publisher's) editorial review |
Tag: | Best-Practice; Long-Term Archiving; Spoken Language Data |
GND Keyword: | Gesprochene Sprache; Korpus <Linguistik>; Langzeitarchivierung |
First Page: | 47 |
Last Page: | 50 |
DDC classes: | 400 Language / 410 Linguistics |
Open Access?: | yes |
Leibniz-Classification: | Language, Linguistics |
Linguistics-Classification: | Computational linguistics |