Making great work even better. Appraisal and digital curation of widely dispersed electronic textual resources (c. 15th-19th centuries) in CLARIN-D
Numerous high-quality primary textual resources (in the context of this paper, full-text transcriptions and corresponding image scans of German texts originating from the 15th to the 19th century) are scattered across the web or stored remotely on institutional or private servers. They are often kept on degrading recording media and encoded in outdated or inflexible storage formats. Textual resources are often accompanied by scarce, insufficient or inaccurate bibliographic information, which is one further reason why valuable resources, even if available on the web, remain undiscovered. Additionally, idiosyncratic, project-specific markup conventions often hinder further use and analysis of the data. Because of these and other problems, a great many of the above-mentioned transcriptions of historical sources can hardly be found, let alone accessed, by third parties and are of little use to the wider research community. This situation is unsatisfactory from the perspective of a (corpus-)linguistic project like the one described here, but also from the perspective of any text-based research in the humanities and social sciences. Integrating as many of these ‘dispersed’ high-quality primary textual resources as possible into an encompassing repository such as the sustainable, web- and centres-based research infrastructure of CLARIN-D is an important step towards solving this problem, and at least a necessary prerequisite. This paper summarizes the work of an 18-month project funded by the German Federal Ministry of Education and Research (BMBF) which dealt with the curation and integration of historical text resources of the 15th to the 19th century into the CLARIN-D infrastructure.
Author: | Christian Thomas, Frank Wiegand |
---|---|
URN: | urn:nbn:de:bsz:mh39-125958 |
ISBN: | 978-3-8233-6922-6 |
Parent Title (English): | Historical corpora. Challenges and perspectives |
Series (Serial Number): | Korpuslinguistik und interdisziplinäre Perspektiven auf Sprache / Corpus Linguistics and Interdisciplinary Perspectives on Language / CLIP (5) |
Publisher: | Narr |
Place of publication: | Tübingen |
Editor: | Jost Gippert, Ralf Gehrke |
Document Type: | Part of a Book |
Language: | English |
Year of first Publication: | 2015 |
Date of Publication (online): | 2024/04/03 |
Publishing Institution: | Leibniz-Institut für Deutsche Sprache (IDS) |
Publication state: | Secondary publication |
Review state: | Publisher's editorial review |
GND Keyword: | Deutsch; Historische Sprachwissenschaft; Korpus <Linguistik> |
First Page: | 181 |
Last Page: | 196 |
DDC classes: | 400 Language / 400 Language, Linguistics |
Open Access?: | yes |
BDSL-Classification: | Grammar |
Leibniz-Classification: | Language, Linguistics |
Linguistics-Classification: | Grammar research |
Linguistics-Classification: | Corpus linguistics |
Licence (German): | Protected by copyright |