
Datenbank attributive Adjektive (2018)

Münzberg, Franziska ; Falke, Stefan ; Hansen-Morath, Sandra ; Waßner, Ulrich Hermann

The database for the dataset attributive_Adjektive_1.csv contains 1,598 attestations of articleless noun phrases, each with two attributive adjectives in the dative singular masculine or neuter. For each attestation, the Datenbank attributive Adjektive provides the sentence context and a set of annotations. These include metadata such as register and regional assignment, as well as annotations on phonology, morphosyntax, semantics, and frequency. The annotations can be used to test hypotheses about adjective inflection and adjective order. You can search a selection of these annotations here. Alternatively, under “Download” you can download the complete search result with all annotations, including all attestations that were classified as false hits in the study of adjective inflection and order.

Dokumentationen zur Korpusgrammatik (2018)

Hansen-Morath, Sandra

Example-based querying for linguistic specialist corpora (2018)

Schneider, Roman

The paper describes preliminary studies regarding the usage of Example-Based Querying for specialist corpora. We outline an infrastructure for its application within the linguistic domain. Example-Based Querying deals with retrieval situations where users would like to explore large collections of specialist texts semantically, but are unable to explicitly name the linguistic phenomenon they look for. As a way out, the proposed framework allows them to input prototypical everyday language examples or cases of doubt, which are automatically processed by CRF and linked to appropriate linguistic texts in the corpus.

Extracting specialized terminology from linguistic corpora (2018)

Lang, Christian ; Schneider, Roman ; Suchowolec, Karolina

In this paper, we present our approach to automatically extracting German terminology in the domain of grammar, using texts from the online information system grammis as our corpus. We analyze existing repositories of German grammatical terminology and develop part-of-speech patterns for our extraction, thereby showing the importance of unigrams in this domain. We contrast the results of the automatic extraction with a manually extracted standard. By comparing the performance of well-known statistical measures, we show how measures based on corpus comparison outperform alternative methods.

GeCoTagger: annotation of German verb complements with conditional random fields (2018)

Fürbacher, Monica ; Schneider, Roman

Complement phrases are essential for constructing well-formed sentences in German. Identifying verb complements and categorizing complement classes is challenging even for linguists who are specialized in the field of verb valency. Against this background, we introduce an ML-based algorithm which is able to identify and classify complement phrases of any German verb in any written sentence context. We use a large training set consisting of example sentences from a valency dictionary, enriched with POS tagging, and the ML-based technique of Conditional Random Fields (CRF) to generate the classification models.

Grammar and corpora 2016 (2018)

In recent years, the availability of large annotated and searchable corpora, together with a new interest in the empirical foundation and validation of linguistic theory and description, has sparked a surge of novel and interesting work using corpus-based methods to study the grammar of natural languages. However, a look at relevant current research on the grammar of the Germanic, Romance, and Slavic languages reveals a variety of different theoretical approaches and empirical foci, which can be traced back to different philological and linguistic traditions. Still, this current state of affairs should not be seen as an obstacle but as an ideal basis for a fruitful exchange of ideas between different research paradigms.

Grammar and Corpora – past, present, and future (2018)

Fuß, Eric ; Konopka, Marek ; Trawiński, Beata ; Waßner, Ulrich Hermann

In recent years, the availability of large annotated and searchable corpora, together with a new interest in the empirical foundation and validation of linguistic theory and description, has sparked a surge of novel and interesting work using corpus-based methods to study the grammar of natural languages. However, a look at relevant current research on the grammar of the Germanic, Romance, and Slavic languages reveals a variety of different theoretical approaches and empirical foci, which can be traced back to different philological and linguistic traditions. Still, this current state of affairs should not be seen as an obstacle but as an ideal basis for a fruitful exchange of ideas between different research paradigms.

Grammatische Terminologie am IDS – ein terminologisches Online-Wörterbuch als ein vernetztes Begriffssystem (2018)

Lang, Christian ; Schwinn, Horst ; Suchowolec, Karolina

As part of an ongoing redesign of the content and user interface of the online portal grammis, a project group has been formed to revise and extend the terminology system for German grammar maintained at the IDS. This concerns the revision and extension of the terminology inventory as well as the underlying methodology and technical infrastructure. To make this undertaking understandable, the existing preliminary work and foundations are presented first.

Keine Grammatik der politischen Sprache (2018)

Eichinger, Ludwig M.

From the somewhat apophthegmatic wording of the title one can derive the claim that there is no grammar of political language. This can mean three things. First, it could mean that there is no political language, in which case the question of its grammar does not arise a fortiori. Less presupposition-laden, and therefore more immediately plausible, is a reading according to which there is a political language, but it has no grammar of its own. The third reading is perhaps just a more specific interpretation of the second: it is not so important what exactly the term “political language” means and what corresponds to it in a probable reality. In any case, linguistic interaction in the political sphere is a special case of public action in general (under specific social/political constellations) and therefore shows corresponding grammatical preferences. In this contribution we gather arguments for this last position.

Muster, Dynamik, Komplexität – eine Einführung in den Gegenstand des Bandes (2018)

Engelberg, Stefan ; Lobin, Henning ; Steyer, Kathrin ; Wolfer, Sascha

In the history of linguistics, the lexicon has received varying degrees of attention. More recently, the availability of large-scale language data and the development of methods for analysing them have moved it back towards the centre of interest. This has not only sharpened our view of lexical phenomena but is currently also having a profound influence on the emergence of new theories of language, from questions about the nature of lexical knowledge to the dissolution of the lexicon-grammar dichotomy. The Institut für Deutsche Sprache has taken these developments as an occasion to devote its current yearbook, following on from the 2017 annual conference “Wortschätze: Dynamik, Muster, Komplexität”, to the theory of the lexicon and the methods of its investigation.

Phonological analysis at the word level: the role of corpora (2018)

Raffelsiefen, Renate ; Geumann, Anja

Notions such as “corpus-driven” versus “theory-driven” bring into focus the specific role of corpora in linguistic research. As for phonology with its intrinsic focus on abstract categorical representation, there is a question of how a strictly corpus-driven approach can yield insight into relevant structures. Here we argue for a more theory-driven approach to phonology based on the concept of a phonological grammar in terms of interacting constraints. Empirical validation of such grammars comes from the potential convergence of the evidence from various sources including typological data, neutralization patterns, and in particular patterns observed in the creative use of language such as acronym formation, loanword adaptation, poetry, and speech errors. Further empirical validation concerns specific predictions regarding phonetic differences among opposition members, paradigm uniformity effects, and phonetic implementation in given segmental and prosodic contexts. Corpora in the narrowest sense (i.e. “raw” data consisting of spontaneous speech produced in natural settings) are useful for testing these predictions, but even here, special purpose-built corpora are often necessary.

Verbalkomplex (2018)

Konopka, Marek


