The German reference corpus DeReKo: a primordial sample for linguistic research

This paper describes DeReKo (Deutsches Referenzkorpus), the Archive of General Reference Corpora of Contemporary Written German at the Institut für Deutsche Sprache (IDS) in Mannheim, and the rationale behind its development. We discuss its design, its legal background, how to access it, available metadata, linguistic annotation layers, underlying standards, ongoing developments, and aspects of using the archive for empirical linguistic research. The paper focuses on the advantages of DeReKo's design as a primordial sample from which virtual corpora can be drawn for the specific purposes of individual studies. Both concepts, primordial sample and virtual corpus, are explained and illustrated in detail. Furthermore, we describe in more detail how DeReKo deals with the fact that all its texts are subject to third parties' intellectual property rights, and how it handles the issue of replicability, which is particularly challenging given DeReKo's dynamic growth and the possibility of constructing an open-ended number of virtual corpora from it.

Metadata
Author:Marc Kupietz, Cyril Belica, Holger Keibel, Andreas Witt
URN:urn:nbn:de:bsz:mh39-28379
URL:http://www.lrec-conf.org/proceedings/lrec2010/pdf/414_Paper.pdf
ISBN:2-9517408-6-7
Parent Title (English):Proceedings of the 7th International Conference on Language Resources and Evaluation : Workshops & Tutorials May 17-18, May 22-23, Main Conference May 19-21, Valletta
Publisher:ELRA
Place of publication:Paris
Document Type:Conference Proceeding
Language:English
Year of first Publication:2010
Date of Publication (online):2014/07/04
Tag:Deutsches Referenzkorpus (DeReKo); Institut für Deutsche Sprache <Mannheim>
GND Keyword:Deutsch; Korpus <Linguistik>; Textkorpus
First Page:1848
Last Page:1854
Dewey Decimal Classification:400 Language / 430 German
Leibniz-Classification:Language, Linguistics
Linguistics-Classification:Corpus linguistics
Open Access?:Yes
Licence (German):German copyright law (UrhG) applies