Creating and working with spoken language corpora in EXMARaLDA

Spoken language corpora, as used in conversation analytic research, language acquisition studies and dialectology, pose a number of challenges that are rarely addressed by corpus linguistic methodology and technology. This paper starts by giving an overview of the most important methodological issues distinguishing spoken language corpus work from work with written data. It then shows what technological challenges these methodological issues entail and demonstrates how they are dealt with in the architecture and tools of the EXMARaLDA system.

Metadata
Author:	Thomas Schmidt
URN:	urn:nbn:de:bsz:mh39-22548
Parent Title (German):	Proceedings of the Second Colloquium on Lesser Used Languages and Computer Linguistics (LULCL II) : "Combining efforts to foster computational support of minority languages" ; Bozen-Bolzano, 13th - 14th November 2008. (Europäische Akademie <Bozen>: EURAC book ; 54, EURAC research)
Publisher:	Europ. Akad.
Place of publication:	Bozen
Editor:	Verena Lyding
Document Type:	Conference Proceeding
Language:	English
Year of first Publication:	2009
Date of Publication (online):	2014/05/07
GND Keyword:	Computerlinguistik; Korpus <Linguistik>; geschriebene Sprache; gesprochene Sprache
First Page:	151
Last Page:	164
DDC classes:	400 Language / 400 Language, Linguistics
Open Access?:	yes
Linguistics-Classification:	Computational linguistics
Linguistics-Classification:	Conversation research / Spoken language
Licence (German):	Protected by copyright
