From the perspective of processing and learning German as a foreign language, this contribution addresses how to deal with the intermediate zones that lie between a pole of purely lexical knowledge and a pole of lexicon-independent grammatical rules. It distinguishes between knowledge of abstract constructions, which learners need in order to build adequate expectations when receiving foreign-language input, and valency- and frame-based knowledge, which must be attached to specific lexical units in order to guide learner-language production.
The project Referenzkorpus Altdeutsch (‘Old German Reference Corpus’) aims to establish a deeply-annotated text corpus of all extant Old German texts. As the automated part-of-speech and morphological pre-annotation is amended by hand, a quality control system for the results seems a desirable objective. To this end, standardized inflectional forms, generated using the morphological information, are compared with the attested word forms. Their creation is described by way of example for the Old High German part of the corpus. As is shown, in a few cases, some features of the attested word forms are also required in order to determine as exactly as possible the shape of the inflected lemma form to be created.
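The comparison described in this abstract can be illustrated with a minimal sketch. It is not the project's implementation: the ending table, the inflection-class label, the token fields, and the exact-match criterion are all assumptions made for illustration, and a real check would have to allow for spelling variation in the attested forms.

```python
# Toy ending table keyed by (inflection class, case, number).
# Illustrative only; not taken from the Referenzkorpus Altdeutsch.
ENDINGS = {
    ("a-stem-masc", "nom", "sg"): "",
    ("a-stem-masc", "gen", "sg"): "es",
    ("a-stem-masc", "dat", "sg"): "e",
    ("a-stem-masc", "nom", "pl"): "a",
}


def generate_form(stem, infl_class, case, number):
    """Build a standardized inflected form, or None if no ending is known."""
    ending = ENDINGS.get((infl_class, case, number))
    return None if ending is None else stem + ending


def check_token(token):
    """Compare the generated form with the attested form of one annotated token."""
    expected = generate_form(
        token["stem"], token["class"], token["case"], token["number"]
    )
    if expected is None:
        return "unknown"  # the toy table cannot generate this form
    # Exact match is a simplifying assumption; mismatches go to manual review.
    return "ok" if expected == token["attested"].lower() else "review"


# Hypothetical annotated tokens: the second carries a wrong case annotation.
tokens = [
    {"attested": "tages", "stem": "tag", "class": "a-stem-masc",
     "case": "gen", "number": "sg"},
    {"attested": "tage", "stem": "tag", "class": "a-stem-masc",
     "case": "gen", "number": "sg"},
]
results = [check_token(t) for t in tokens]
```

A token flagged `review` only signals a mismatch between annotation and attested form; whether the annotation or the generation rule is at fault still has to be decided by hand.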
The availability of electronic corpora of historical stages of languages has been welcomed as possibly attenuating the inherent problem of diachronic linguistics, i.e. that we only have access to what has chanced to come down to us - the problem which was memorably named by Labov (1992) as one of “Bad Data”. However, such corpora can only give us access to an increased amount of historical material, and this can essentially still only be a partial and possibly distorted picture of the actual language at a particular period of history. Corpora can be improved by taking a more representative sample of extant texts if these are available (as they are in significant number for periods after the invention of printing). But, as examples from the recently compiled GerManC corpus of seventeenth and eighteenth century German show, the evidence from such corpora can still fail to yield definitive answers to our questions about earlier stages of a language. The data still require expert interpretation, and it is important to be realistic about what can legitimately be expected from an electronic historical corpus.
Using a bottom-up as well as a bottom-down procedure, this contribution examines the network-like linking relations between constructions and the mechanisms intended for linking and networking them, such as fusion, inheritance hierarchies, and constructional polysemy, among others, as well as their implementation in practice. My aim is a proposal for a network-like systematics of constructions, illustrated using German verbs that express sensation. The common denominator of all the constructs I analyze is the presence of an affected participant or experiencer.
Multi-faceted alignment. Toward automatic detection of textual similarity in Gospel-derived texts
(2015)
Ancient Germanic Bible-derived texts stand in as test material for producing computational means for automatically determining where textual contamination and linguistic interference have influenced the translation process. This paper reports on the results of research efforts that produced a text corpus; a method for decomposing the texts involved into smaller, more directly comparable thematically-related chunks; a database of relationships between these chunks; and a user-interface allowing for searches based on various referential criteria. Finally, the state of the product at the end of the project is discussed, namely as it was handed over to another researcher who has extended it to automatically find semantic and syntactic similarities within comparable chunks.
In this paper we present some preliminary considerations concerning the possibility of automatically parsing an annotated corpus for N-N compounds. This should in principle be possible at least for relational and stereotype compounds, if the lemmatization of the corpus connects the lemmata with lexical entries as described in Höhle (1982). These lexical entries then supply the necessary information about the argument structure of a relational noun or about the stereotypical purpose associated with the noun’s referent, which can be used to establish a relation between the first and the head constituent of the compound.
The relative order of dative and accusative objects in older German is less free than it is today. The reason for this could be that speakers of the direct predecessor of Old High German organized the referents according to the Thematic Hierarchy. If one applies a Case Hierarchy Nom>Acc>Dat to this, the order Nom - Dat - Acc falls out. It becomes apparent that the status of the Thematic Hierarchy is not a factor governing underlying word order, but a factor inducing scrambling. Arguments from binding theory, whose validity is discussed, indicate that the underlying order is ‘accusative before dative’.
This volume collects the papers presented at the 9th Hildesheimer Evaluierungs- und Retrieval-Workshop (HIER), held on 9 and 10 July 2015 at the Universität Hildesheim. The HIER workshop series began in 2001 with the aim of presenting and discussing the research results of information science at Hildesheim. Cooperation partners from other institutions now take part regularly, which we very much welcome. HIER also provides a forum for system presentations and practice-oriented contributions.
In a previous article (Faaß et al., 2012), a first attempt was made at documenting and encoding morphemic units of two South African Bantu languages, i.e. Northern Sotho and Zulu, with the aim of describing and storing the morphemic units of these two languages in a single relational database, structured as a hierarchical ontology. As a follow-up, the current article describes the implementation of our part-of-speech ontology. We give a detailed description of the morphemes and categories contained in the database, highlighting the need and reasons for a flexible ontology which will provide for both language-specific and general linguistic information. By giving a detailed account of the methodology for the population of the database, we provide linguists working on other Bantu languages with a road map for extending the database to also include their languages of specialization.
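One way a hierarchical ontology can be stored in a single relational database is a self-referencing category table. The sketch below shows that general pattern, not the schema used in the article: the table names, column names, category labels, and the placeholder morpheme form are all hypothetical.

```python
import sqlite3

# In-memory database with a hypothetical schema (not the article's schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    parent_id INTEGER REFERENCES category(id)  -- NULL marks the root
);
CREATE TABLE morpheme (
    id          INTEGER PRIMARY KEY,
    form        TEXT NOT NULL,
    language    TEXT NOT NULL,  -- language-specific information
    category_id INTEGER NOT NULL REFERENCES category(id)
);
""")

# General (language-independent) categories, arranged as a hierarchy.
conn.executemany(
    "INSERT INTO category VALUES (?, ?, ?)",
    [(1, "morpheme", None), (2, "prefix", 1), (3, "subject concord", 2)],
)
# A language-specific entry attached to a leaf category (placeholder form).
conn.execute(
    "INSERT INTO morpheme VALUES (?, ?, ?, ?)",
    (1, "placeholder-form", "Zulu", 3),
)


def ancestors(conn, category_id):
    """Return category names from the given node up to the root."""
    rows = conn.execute(
        """
        WITH RECURSIVE chain(id, name, parent_id, depth) AS (
            SELECT id, name, parent_id, 0 FROM category WHERE id = ?
            UNION ALL
            SELECT c.id, c.name, c.parent_id, chain.depth + 1
            FROM category c JOIN chain ON c.id = chain.parent_id
        )
        SELECT name FROM chain ORDER BY depth
        """,
        (category_id,),
    ).fetchall()
    return [row[0] for row in rows]


path = ancestors(conn, 3)
```

Keeping the language in the `morpheme` table and the hierarchy in `category` means a further language can be added by inserting rows, without changing the shared category tree.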