Creating synopses of ‘parallel’ historical manuscripts and early prints. Alignment guidelines, evaluation, and applications
- In this paper we introduce the task of aligning parallel historical texts to create synopses for comparing similarities and deviations between them. We present guidelines for manually annotating corresponding words and phrases. A test annotation reveals considerably high inter-annotator agreement, ranging from kappa = 0.76 to 0.98, depending on the specific text. In an application scenario we show a typical use case for which token and phrase alignments are of value.
Author: | Stefanie Dipper, Julia Krasselt, Simone Schultz-Balluff |
---|---|
URN: | urn:nbn:de:bsz:mh39-125848 |
ISBN: | 978-3-8233-6922-6 |
Parent Title (English): | Historical corpora. Challenges and perspectives |
Series (Serial Number): | Korpuslinguistik und interdisziplinäre Perspektiven auf Sprache / Corpus Linguistics and Interdisciplinary Perspectives on Language / CLIP (5) |
Publisher: | Narr |
Place of publication: | Tübingen |
Editor: | Jost Gippert, Ralf Gehrke |
Document Type: | Part of a Book |
Language: | English |
Year of first Publication: | 2015 |
Date of Publication (online): | 2024/03/27 |
Publishing Institution: | Leibniz-Institut für Deutsche Sprache (IDS) |
Publication state: | Secondary publication |
Review state: | (Publisher's) editorial review |
GND Keyword: | Historische Sprachwissenschaft; Korpus <Linguistik>; Mittelhochdeutsch |
First Page: | 137 |
Last Page: | 150 |
DDC classes: | 400 Language / 400 Language, Linguistics |
Open Access?: | yes |
BDSL-Classification: | Grammar |
Leibniz-Classification: | Language, Linguistics |
Linguistics-Classification: | Grammar research |
Linguistics-Classification: | Corpus linguistics |
Licence (German): | Urheberrechtlich geschützt |