memasysco: XML schema based metadata management system for speech corpora

The metadata management system for speech corpora “memasysco” has been developed at the Institut für Deutsche Sprache (IDS) and is applied for the first time to document the speech corpus “German Today”. memasysco is based on a data model for the documentation of speech corpora and contains two generic XML schemas that drive data capture, XML native database storage, dynamic publishing, and information retrieval. The development of memasysco’s information architecture was mainly based on the ISLE MetaData Initiative (IMDI) guidelines for publishing metadata of linguistic resources. However, since we also have to support the corpus management process in research projects at the IDS, we need a finer atomic granularity for some documentation components as well as more restrictive categories to ensure data integrity. The XML metadata of different speech corpus projects are centrally validated and natively stored in an Oracle XML database. The extension of the system to the management of annotations of audio and video signals (e.g. orthographic and phonetic transcriptions) is planned for the near future.
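
The idea of "more restrictive categories to ensure data integrity" can be illustrated with a small sketch. Everything specific in it is an assumption made for illustration: the element names `session`, `recordingType`, and `signalType` and their allowed values are hypothetical and not taken from memasysco's schemas, and the check is done in plain Python rather than by an XML Schema validator or the Oracle XML database mentioned in the abstract.

```python
import xml.etree.ElementTree as ET

# Hypothetical closed vocabularies; memasysco's real categories are not
# given in the abstract, so these names and values are assumptions.
ALLOWED_VALUES = {
    "recordingType": {"interview", "reading", "map_task"},
    "signalType": {"audio", "video"},
}


def find_violations(xml_text):
    """Return (element name, offending value) pairs for restricted fields."""
    root = ET.fromstring(xml_text)
    violations = []
    for element in root.iter():
        allowed = ALLOWED_VALUES.get(element.tag)
        if allowed is None:
            # Element is not restricted to a closed vocabulary.
            continue
        value = (element.text or "").strip()
        if value not in allowed:
            violations.append((element.tag, value))
    return violations


# Hypothetical metadata records for two recording sessions.
VALID_RECORD = """<session id="s1">
  <recordingType>interview</recordingType>
  <signalType>audio</signalType>
</session>"""

INVALID_RECORD = """<session id="s2">
  <recordingType>chat</recordingType>
  <signalType>audio</signalType>
</session>"""

print(find_violations(VALID_RECORD))    # no violations expected
print(find_violations(INVALID_RECORD))  # 'chat' is outside the vocabulary
```

In an XML Schema, a closed vocabulary of this kind would typically be expressed as an enumeration restriction on a simple type, so that a record with a free-text value fails validation before it is stored.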

Metadata
Author: Joachim Gasch, Caren Brinckmann, Sylvia Dickgießer
URN: urn:nbn:de:bsz:mh39-68335
Parent Title (English): Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2008). Marrakech, Morocco
Publisher: European Language Resources Association (ELRA)
Place of publication: Paris
Contributor(s): Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Daniel Tapias
Document Type: Conference Proceeding
Language: English
Year of first Publication: 2008
Date of Publication (online): 2017/12/13
GND Keyword: German; Spoken language; Corpus <linguistics>; Metadata
First Page: 2865
Last Page: 2870
Dewey Decimal Classification: 400 Language / 430 German
BDSL-Classification: Language in the 20th century. Contemporary language
Leibniz-Classification: Language, Linguistics
Linguistics-Classification: Computational linguistics
Linguistics-Classification: Corpus linguistics
Open Access?: Yes
Licence (German): German copyright law (UrhG) applies