As foreign words become integrated into the German vocabulary, they undergo paradigmatic changes in orthography, grammar, and semantics. In this context, German orthography is also strongly shaped by orthographic codification, which continues to influence the development of spelling to the present day. This study compares digital, linguistically annotated corpora containing texts written by professional as well as non-professional writers; these corpora contain several billion foreign words (of Greek, Latin and French origin, and in the second part of the study of English/American and Italian origin), studied over a period of 20 years following the German orthographic reform of 1996. The results may help to adapt the official regulations to the spelling practices observed, either by describing the rules more precisely, by proposing possible spelling variants, or by eliminating variants that are not in common use. The study may also help to support correct lexicographic codification in dictionaries.
Aktuelle Änderungen des Rats für deutsche Rechtschreibung 2016 - Hintergründe und Begründungen
(2016)
The CELEX database is one of the standard lexical resources for German. It yields a wealth of data, especially for phonological and morphological applications. The morphological part comprises deep-structure morphological analyses of German. However, as it was developed in the 1990s, both its encoding and its spelling are outdated. About one fifth of the more than 50,000 datasets contain umlauts and special characters such as ß. A modern version cannot be obtained by simple substitution. In this paper, we briefly describe the original content and form of the orthographic and morphological database for German in CELEX. We then present our work on modernizing the linguistic data. Lemmas and morphological analyses are transferred to a modern encoding standard by first merging the orthographic and morphological information of the lemmas and their entries and then performing a second substitution for the morphs within their morphological analyses. Changes to modern German spelling are performed by substitution rules that follow current orthographic standards. We show an example of how the data can be used to disambiguate morphological structures. The discussion outlines prospects for future work on this or similar lexicons. The Perl script is publicly available on our website.
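
As a rough illustration of the two kinds of substitution mentioned in this abstract (re-encoding of special characters, then spelling modernization), here is a minimal Python sketch. It is not the authors' Perl script. The ASCII notation for umlauts and ß, the small word list, and the function names are all assumptions made for the example; the actual CELEX conventions and the authors' rules may differ.

```python
# Hypothetical ASCII notation for diacritics (assumption for this sketch;
# the real CELEX encoding conventions may differ).
ASCII_TO_UNICODE = {
    '"a': "ä", '"o': "ö", '"u': "ü",
    '"A': "Ä", '"O': "Ö", '"U': "Ü",
    "$": "ß",
}

# Illustrative old -> new spellings after the 1996 reform.
# A lookup is used because "ß -> ss" applies only after short vowels
# and cannot be applied blindly (compare "Straße", which stays unchanged).
REFORM_SPELLINGS = {
    "daß": "dass",
    "Fluß": "Fluss",
    "muß": "muss",
}


def decode_diacritics(text):
    """Step 1: replace the assumed ASCII notation with Unicode characters."""
    for ascii_form, char in ASCII_TO_UNICODE.items():
        text = text.replace(ascii_form, char)
    return text


def modernize(lemma):
    """Step 2: map a decoded lemma to its reformed spelling, if listed."""
    decoded = decode_diacritics(lemma)
    return REFORM_SPELLINGS.get(decoded, decoded)


if __name__ == "__main__":
    for raw in ['Flu$', 'M"adchen', 'Stra$e']:
        print(raw, "->", modernize(raw))
```

The lookup table stands in for the substitution rules described in the abstract. A real rule set would need the morphological information to decide where a change applies, which is presumably why the abstract notes that simple substitution is not enough.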