Refine
Document Type
Language
- English (2)
Has Fulltext
- yes (2)
Is part of the Bibliography
- yes (2)
Keywords
- Gesprochene Sprache (2)
- Korpus <Linguistik> (2)
- Annotation (1)
- Clarin (1)
- Datenmanagement (1)
- FAIR data (1)
- Forschungsdaten (1)
- ISO/TEI (1)
- Sprachübersetzung (1)
- Standardisierung (1)
Publicationstate
Reviewstate
- Peer-Review (2)
Publisher
- Linköping University Electronic Press (2) (remove)
This paper describes the TEI-based ISO standard 24624:2016 ‘Transcription of spoken language’ and other formats used within CLARIN for spoken language resources. It assesses the current state of support for the standard and the interoperability between these formats and with rele- vant tools and services. The main idea behind the paper is that a digital infrastructure providing language resources and services to researchers should also allow the combined use of resources and/or services from different contexts. This requires syntactic and semantic interoperability. We propose a solution based on the ISO/TEI format and describe the necessary steps for this format to work as an exchange format with basic semantic interoperability for spoken language resources across the CLARIN infrastructure and beyond.
We present an approach to making existing CLARIN web services usable for spoken language transcriptions. Our approach is based on a new TEI-based ISO standard for such transcriptions. We show how existing tool formats can be transformed to this standard, how an encoder/decoder pair for the TCF format enables users to feed this type of data through a WebLicht tool chain, and why and how web services operating directly on the standard format would be useful.