National library as corpus: introducing DeLiKo@DNB – a large synchronous German fiction corpus

  • This paper introduces DeLiKo@DNB, a large, linguistically annotated, freely accessible corpus of contemporary German fiction. The corpus currently comprises 2 billion words from over 26,000 books published between 2005 and the present, spanning pulp and genre fiction as well as award-winning literary works. We provide a detailed account of the corpus composition, metadata, and key features. Additionally, we outline our approach to ensuring lawful and productive access by deploying an instance of the open-source corpus analysis platform KorAP within the German National Library.

Metadata
Author:Marc Kupietz, Peter Leinen, Nils Diewald, Philippe Genêt, Rebecca Wilm, Andreas Witt, Rameela Yaddehige
URN:urn:nbn:de:bsz:mh39-130705
DOI:https://doi.org/10.5281/zenodo.14943116
Parent Title (Multiple languages):Book of Abstracts. DHd 2025: Under Construction. 11. Jahrestagung des Verbands Digital Humanities im deutschsprachigen Raum e.V.. Universität Bielefeld und HSBI, 3.–7. März 2025, Bielefeld, Deutschland
Publisher:Zenodo
Place of publication:Geneva
Editor:Nils Reiter, Thomas Haider, Daniel Kababgi, Hendrik Buschmeier
Document Type:Conference Proceeding
Language:English
Year of first Publication:2025
Date of Publication (online):2025/03/21
Publishing Institution:Leibniz-Institut für Deutsche Sprache (IDS)
Publicationstate:Published version
Reviewstate:Peer-Review
Tag:Annotating; DeLiKo@DNB; Literature; Collection; Text; Transformation; Virtual research environments
IPR; contemporary; corpus; corpus analysis; fiction; library as corpus; linguistic annotation; literature; metadata
GND Keyword:Annotation; Deutsch; Deutsche Nationalbibliothek; Korpus <Linguistik>; Metadaten; Nationalbibliothek
First Page:482
Last Page:485
DDC classes:400 Language / 400 Language, Linguistics
Open Access?:yes
Linguistics-Classification:Corpus linguistics
Program areas:Digital linguistics
Licence (English):Creative Commons - Attribution 4.0 International