Harmonizing language data. Standards for linguistic resources

Standards function as safeguards to ensure that data remains interpretable, uniformly queryable, and archivable over time, a critical challenge for digital humanists working with complex linguistic resources. This book provides an overview of essential standards for ensuring the sustainability of data in the Digital Humanities (DH). It addresses the selection of data encoding formats, methods of annotating primary data, and approaches to making resources findable and accessible. The focus is on various forms of linguistic data, such as texts, lexicons, or parallel arrangements (e.g., translations or transcribed recordings). The work explains the role of annotations and metadata in structuring and contextualizing data and examines the influence of diverse data formats, shaped by local academic or industrial practices. In contrast to neural language models, which often yield impressive but opaque results, DH projects aim for transparency, reproducibility, and sustainability. Achieving these goals requires interoperability, that is, the seamless interaction between data and tools. The book demonstrates how clear guidelines and best practices help ensure the long-term usability of data. It offers digital humanists practical approaches and well-founded standards to sustainably archive and efficiently utilize their data, making it an indispensable resource for the field.

Metadata
URN:urn:nbn:de:bsz:mh39-135853
DOI:https://doi.org/10.1515/9783112208212
ISBN:978-3-11-220821-2
ISSN:2751-1286
Series (Serial Number):Digital Linguistics (4)
Publisher:de Gruyter
Place of publication:Berlin/Boston
Editor:Piotr Bański, Ulrich Heid, Laura Herzberg
Document Type:Book
Language:English
Year of first Publication:2025
Date of Publication (online):2025/12/05
Publishing Institution:Leibniz-Institut für Deutsche Sprache (IDS)
Publication state:Published version
Review state:(Publisher's) editorial review
Tag:data sustainability; data standards; interoperability; metadata and annotations; linguistic resources
GND Keyword:Annotation; Computational linguistics; Data; Digital Humanities; Interoperability; Metadata; Standard <standardization>; Language data
Page Number:VIII; 462
DDC classes:400 Language / 400 Language, Linguistics
Open Access?:yes
Linguistics-Classification:Computational linguistics
Program areas:Grammar
Program areas:Digital Linguistics
Licence (English):Creative Commons - Attribution 4.0 International