National library as corpus: introducing DeLiKo@DNB – a large synchronous German fiction corpus
- This paper introduces DeLiKo@DNB, a large, linguistically annotated, and freely accessible contemporary corpus of German fiction. The corpus currently comprises 2 billion words from over 26,000 books published between 2005 and the present, spanning pulp and genre fiction as well as literary award-winning works. We provide a detailed account of the corpus composition, metadata, and key features. Additionally, we outline our approach to ensuring lawful and productive access by deploying an instance of the open-source corpus analysis platform KorAP within the German National Library.
Author: | Marc Kupietz, Peter Leinen, Nils Diewald, Philippe Genêt, Rebecca Wilm, Andreas Witt, Rameela Yaddehige |
---|---|
URN: | urn:nbn:de:bsz:mh39-130705 |
DOI: | https://doi.org/10.5281/zenodo.14943116 |
Parent Title (Multiple languages): | Book of Abstracts. DHd 2025: Under Construction. 11. Jahrestagung des Verbands Digital Humanities im deutschsprachigen Raum e.V.. Universität Bielefeld und HSBI, 3.–7. März 2025, Bielefeld, Deutschland |
Publisher: | Zenodo |
Place of publication: | Geneva |
Editor: | Nils Reiter, Thomas Haider, Daniel Kababgi, Hendrik Buschmeier |
Document Type: | Conference Proceeding |
Language: | English |
Year of first Publication: | 2025 |
Date of Publication (online): | 2025/03/21 |
Publishing Institution: | Leibniz-Institut für Deutsche Sprache (IDS) |
Publication state: | Published version |
Review state: | Peer-reviewed |
Tag: | Annotieren; DeLiKo@DNB; Literatur; Sammlung; Text; Umwandlung; Virtuelle Forschungsumgebungen IPR; contemporary; corpus; corpus analysis; fiction; library as corpus; linguistic annotation; literature; metadata |
GND Keyword: | Annotation; Deutsch; Deutsche Nationalbibliothek; Korpus <Linguistik>; Metadaten; Nationalbibliothek |
First Page: | 482 |
Last Page: | 485 |
DDC classes: | 400 Language / 400 Language, Linguistics |
Open Access?: | yes |
Linguistics-Classification: | Corpus linguistics |
Program areas: | Digital Linguistics |