Automated quality control for the morphological annotation of the Old High German text corpus. Checking the manually adapted data using standardized inflectional forms
- The project Referenzkorpus Altdeutsch (‘Old German Reference Corpus’) aims to establish a deeply annotated text corpus of all extant Old German texts. As the automated part-of-speech and morphological pre-annotation is amended by hand, a quality control system for the results seems a desirable objective. To this end, standardized inflectional forms, generated using the morphological information, are compared with the attested word forms. Their creation is described by way of example for the Old High German part of the corpus. As is shown, in a few cases some features of the attested word forms are also required in order to determine as exactly as possible the shape of the inflected lemma form to be created.
Author: | Roland Mittmann |
---|---|
URN: | urn:nbn:de:bsz:mh39-125560 |
ISBN: | 978-3-8233-6922-6 |
Parent Title (English): | Historical corpora. Challenges and perspectives |
Series (Serial Number): | Korpuslinguistik und interdisziplinäre Perspektiven auf Sprache / Corpus Linguistics and Interdisciplinary Perspectives on Language / CLIP (5) |
Publisher: | Narr |
Place of publication: | Tübingen |
Document Type: | Part of a Book |
Language: | English |
Year of first Publication: | 2015 |
Date of Publication (online): | 2024/03/06 |
Publishing Institution: | Leibniz-Institut für Deutsche Sprache (IDS) [secondary publication] |
Publication state: | Secondary publication |
Review state: | (Publisher's) editorial review |
GND Keyword: | Althochdeutsch; Historische Sprachwissenschaft; Korpus <Linguistik> |
First Page: | 65 |
Last Page: | 76 |
DDC classes: | 400 Language / 400 Language, Linguistics |
Open Access?: | yes |
BDSL-Classification: | Grammatik |
Linguistics-Classification: | Grammatikforschung |
Linguistics-Classification: | Korpuslinguistik |
Licence (German): | Urheberrechtlich geschützt (protected by copyright) |