The KiezDeutsch Korpus (KiDKo) Release 1.0

  • This paper presents the first release of the KiezDeutsch Korpus (KiDKo), a new language resource with multiparty spoken dialogues of Kiezdeutsch, a newly emerging language variety spoken by adolescents from multi-ethnic urban areas in Germany. The first release of the corpus includes the transcriptions of the data as well as a normalisation layer and part-of-speech annotations. In the paper, we describe the main features of the new resource and then focus on automatic POS tagging of informal spoken language. Our tagger achieves an accuracy of nearly 97% on KiDKo. While we did not succeed in further improving the tagger using ensemble tagging, we present our approach to using the tagger ensembles for identifying error patterns in the automatically tagged data.

Metadata
Author:Ines Rehbein, Sören Schalowski, Heike Wiese
URN:urn:nbn:de:bsz:mh39-55999
URL:http://www.lrec-conf.org/proceedings/lrec2014/index.html
ISBN:978-2-9517408-8-4
Parent Title (English):Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14). May 26-31, 2014. Harpa Concert Hall and Conference Center. Reykjavik, Iceland
Publisher:European Language Resources Association
Place of publication:Paris
Editor:Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Document Type:Conference Proceeding
Language:English
Year of first Publication:2014
Date of Publication (online):2016/11/21
Publicationstate:Published version
Reviewstate:Publisher's editorial review
Tag:Kiezdeutsch; spoken language corpora; urban youth language
GND Keyword:Gesprochene Sprache; Jugendsprache; Korpus <Linguistik>; Multikulturelle Gesellschaft; Stadtmundart
First Page:3927
Last Page:3934
Dewey Decimal Classification:400 Language / 400 Language, Linguistics
Linguistics-Classification:Corpus linguistics
Open Access?:Yes
Licence (English):Creative Commons - Attribution-NonCommercial 4.0 International