In many European languages, propositional arguments (PAs) can be realized as different types of structures. Cross-linguistically, complex structures with PAs show a systematic correlation between the strength of the semantic bond and the syntactic union (cf. Givón 2001; Wurmbrand/Lohninger 2023). Also, different languages show similarities with respect to the (lexical) licensing of different PAs (cf. Noonan 1985; Givón 2001; Cristofaro 2003 on different predicate types). However, on a more fine-grained level, a variation across languages can be observed both with respect to the syntactic-semantic properties of PAs as well as to their licensing and usage. This presentation takes a multi-contrastive view of different types of PAs as syntactic subjects and objects by looking at five European languages: EN, DE, IT, PL and HU. Our goal is to identify the parameters of variation in the clausal domain with PAs and by this to contribute to a better understanding of the individual language systems on the one hand and the nature of the linguistic variation in the clausal domain on the other hand. Phenomena and Methodology: We investigate the following types of PAs: direct object (DO) clauses (1), prepositional object (PO) clauses (2), subject clauses (3), and nominalizations (4, 5). Additionally, we discuss clause union phenomena (6, 7). The analyzed parameters include among others finiteness, linear position of the PA, (non) presence of a correlative element, (non) presence of a complementizer, lexical-semantic class of the embedding verb. The phenomena are analyzed based on corpus data (using mono- and multilingual corpora), experimental data (acceptability judgement surveys) or introspective data.
It is well known that the distribution of lexical and grammatical patterns is size- and register-sensitive (Biber 1986, and later publications). This fact alone presents a challenge to many corpus-oriented linguistic studies focusing on a single language. When it comes to cross-linguistic studies using corpora, the challenge becomes even greater due to the lack of high-quality multilingual corpora that are comparable with respect to size and register (Kupietz et al. 2020; Kupietz/Trawiński 2022). That was the motivation for the creation of the European Reference Corpus EuReCo, an initiative started in 2013 at the Leibniz Institute for the German Language (IDS) together with several European partners (Kupietz et al. 2020). EuReCo is an emerging federated corpus, with large virtual comparable corpora across various languages and with an infrastructure supporting contrastive research. The core of the infrastructure is KorAP (Diewald et al. 2016), a scalable open-source platform supporting the analysis and visualisation of properties of texts annotated by multiple and potentially conflicting information layers, and supporting several corpus query languages. Until recently, EuReCo consisted of three monolingual subparts: the German Reference Corpus DeReKo (Kupietz et al. 2018), the Reference Corpus of Contemporary Romanian Language (Barbu Mititelu/Tufiş/Irimia 2018), and the Hungarian National Corpus (Váradi 2002). The goal of the present submission is twofold. On the one hand, it reports on the new component of EuReCo: a sample of the National Corpus of Polish (Przepiórkowski et al. 2010). On the other hand, it presents the results of a new pilot study using the newly extended EuReCo. This pilot study investigates selected Polish collocations involving light verbs and their prepositional / nominal complements (Fig. 1) and extends the collocation analyses of German, Romanian and Hungarian (Fig. 2) discussed in Kupietz/Trawiński (2022).
This paper describes the motivation and goals behind the European Reference Corpus (EuReCo) initiative. Starting from the desiderata that arise from the shortcomings of available research data for language comparison, such as monolingual corpora, parallel corpora and comparable corpora, it presents the completed and ongoing work within EuReCo and, drawing on comparative German-Romanian co-occurrence analyses, outlines the new perspectives for contrastive corpus linguistics that the EuReCo initiative opens up.
Validating the Performativity Hypothesis to Neg-Raising using corpus data: Evidence from Polish
(2021)
Negation raising and mood. A corpus-based study of Polish sądzić ‘think’ and wierzyć ‘believe’
(2021)
The paper describes the distribution of two negation raising predicates in Polish: sądzić ‘think’ and wierzyć ‘believe’ in the National Corpus of Polish with a particular focus on their morphosyntax and the mood of their clausal complements. The aim was to examine whether there are any correlations between these two parameters, and to what extent negation raising with those verbs exhibits performative features (in terms of Prince, 1976). The results of the study support the performative approach to negation raising as per Prince (1976) only for cases with subjunctive complements. The corpus findings further imply that Polish negation raising predicates encode two different degrees of (un)certainty concerning the truth of the embedded proposition depending on the mood of their complements. Structures with indicative complements express weaker uncertainty than structures with subjunctive complements.
With this paper, the new online series IDSopen of the Leibniz Institute for the German Language is conceptually launched. The series offers authors and readers from all areas of linguistics a modern and open platform for digital publishing. IDSopen provides an up-to-date publication environment that focuses on work that is based on IDS resources and demonstrates their potential uses particularly well. At the same time, IDSopen is distinguished by its openness to unconventional forms and formats of publication. Transparent review processes are as much a part of the series' profile as an open publication schedule and the addressing of different target groups. In line with the guidelines of the IDS and the Leibniz Association (cf. LeibnizOpen), IDSopen follows the open-access principle and publishes exclusively in digital form, with no print edition (online-only). These measures aim to enable short publication times for manuscripts, to offer unrestricted and free access on the internet to quality-assured scholarly information about the IDS resources, and to support fluid publication processes.
This paper presents the new multilingual resource CoMParS (Collection of Multilingual Parallel Sequences). CoMParS is conceived as a functionally and semantically oriented database of parallel sequences in German and other European languages. In addition to language-specific and universal (in the sense of Universal Dependencies) morphosyntactic annotations, all data are annotated with cross-linguistic functional-semantic information on the newly defined annotation layer Functional Domains and are linked to one another on several levels (including across levels). CoMParS is encoded in TEI P5 XML and modelled both as a monolingual and as a multilingual language resource.
This paper describes the motivation and goals of the European Reference Corpus EuReCo, an open initiative that aims to provide and use dynamically definable virtual comparable corpora based on existing national, reference or other large corpora. Given the well-known shortcomings of other types of multilingual corpora, such as parallel or translation corpora or purely web-based comparable corpora, EuReCo constitutes a unique linguistic resource that opens up new perspectives for German studies and for comparative as well as applied corpus linguistics, particularly in the European context.
Polish żeby under negation
(2021)
The paper addresses two patterns in the distribution of complement clauses headed by the complementizer żeby in Polish related to the presence of sentential negation. It is argued that żeby-clauses with an obligatory negation in the matrix clause, licensed by epistemic verbs, can be treated in terms of negative polarity, with żeby defined as an n-word. Structures with żeby-clauses and an obligatory negation in the embedded clause, licensed by verbs of fear, are argued to be an instance of negative complementation, with żeby specified as a negative complementizer. A uniform lexicalist analysis within the framework of HPSG is provided, employing tools developed to account for Negative Concord in Polish.
This paper deals with pragmatic aspects of negation raising (NR), discussed above all in Horn (1978), and with performative properties of NR constructions, originally discussed in Prince (1976), mainly with reference to French data. The aim is to validate the core claims of Horn (1978) and Prince (1976) with corpus data in a cross-linguistic context. German and Polish NR constructions serve as the object of investigation, and accordingly two different monolingual corpora are used as data sources.
This paper reports on recent developments within the European Reference Corpus EuReCo, an open initiative that aims at providing and using virtual and dynamically definable comparable corpora based on existing national, reference or other large corpora. Given the well-known shortcomings of other types of multilingual corpora such as parallel/translation corpora (shining-through effects, over-normalization, simplification, etc.) or web-based comparable corpora (covering only web material), EuReCo provides a unique linguistic resource offering new perspectives for fine-grained contrastive research on authentic cross-linguistic data, applications in translation studies and foreign language teaching and learning.
In recent years, the availability of large annotated and searchable corpora, together with a new interest in the empirical foundation and validation of linguistic theory and description, has sparked a surge of novel and interesting work using corpus-based methods to study the grammar of natural languages. However, a look at relevant current research on the grammar of the Germanic, Romance, and Slavic languages reveals a variety of different theoretical approaches and empirical foci, which can be traced back to different philological and linguistic traditions. Still, this current state of affairs should not be seen as an obstacle but as an ideal basis for a fruitful exchange of ideas between different research paradigms.
This paper argues that there is a correlation between functional and purely grammatical patterning in language, yet the nature of this correlation has to be explored. This claim is based on the results of a corpus-driven study of the Slavic aspect, drawing on the so-called Distributional Hypothesis. According to the East-West Theory of the Slavic aspect, there is a broad east-west isogloss dividing the Slavic languages into an eastern group and a western group. There are also two transitional zones in the north and south, which share some properties with each group (Dickey 2000; Barentsen 1998, 2008). The East-West Theory uses concepts of cognitive grammar such as totality and temporal definiteness, and is based on various parameters of aspectual usage in discourse, including contexts such as habituals, general factuals, historical (narrative) present, performatives, sequenced events in the past etc. The purpose of the above-mentioned study is to challenge the semantic approach to the Slavic aspect by comparing the perfective and imperfective verbal aspect on the basis of purely grammatical co-occurrence patterns (see also Janda & Lyashevskaya 2011). The study focused on three Slavic languages: Russian, which, following the East-West Theory, belongs to the eastern group, Czech, which belongs to the western group, and Polish, which is considered as transitional in its aspectual patterning.
CoMParS is a resource under construction in the context of the long-term project German Grammar in European Comparison (GDE) at the IDS Mannheim. The principal goal of GDE is to create a novel contrastive grammar of German against the background of other European languages. Alongside German, which is the central focus, the core languages for comparison are English, French, Hungarian and Polish, representing different typological classes. Unlike traditional contrastive grammars available for German, which usually cover language pairs and are based on formal grammatical categories, the new GDE grammar is developed in the spirit of functionalist typology. This implies that, instead of formal criteria, cognitively motivated functional domains in terms of Givón (1984) are used as tertia comparationis. The purpose of CoMParS is to document the empirical basis of the theoretical assumptions of GDE-V and to illustrate the otherwise rather abstract content of grammar books with as many naturally occurring and adequately presented multilingual examples as possible, including information on their use in specific contexts and registers. These examples come from existing parallel corpora, and our presentation will focus on the legal aspects and consequences of this choice of language data.
This paper argues for using authentic data not only as an empirical basis for linguistic generalizations but also for exemplification purposes in monolingual and particularly in bi- and multilingual contrastive studies. It shows that parallel data extracted from the available parallel corpora can - after enrichment with semantic-functional information while maintaining the available contextual, register-related and linguistic information - serve as a perfect data source for multilingual exemplification. Moreover, the analysis of semantic-functionally equivalent parallel sequences allows the investigation and exemplification of similarities and differences in how different languages express similar meaning from both a semasiological and an onomasiological perspective.
This paper ties in with the discussion on the use of formal grammatical categories in language comparison (cf. in particular Haspelmath 2007, 2010a, b and Newmeyer 2007, 2010). The question is not whether cross-linguistic grammatical categories (or, more precisely, category values) exist, or whether language-specific grammatical categories can be meaningfully used in language comparison, but how similar or different language-specific categories or categorizations are. The aim is thus to present a method for measuring the degree of equivalence of grammatical categories in different languages; this is illustrated with the example of the IMPERATIVE in German, English, Polish and Czech.
The present investigation targets the phenomenon commonly called control. Many languages, including German and Polish, employ non-finite clauses (besides finite clauses) as propositional complements. The subject of these complement clauses is left unexpressed and must generally be interpreted co-referentially with the subject or object of the matrix clause (subject or object control). However, there are also infinitive-selecting verbs that do not allow for a co-referential interpretation of the embedded subject; semantically, the embedded infinitives of these anti-control verbs are thus less dependent on or less unifiable with the matrix proposition. In Polish anti-control constructions, non-finite complements are overtly marked with the complementizer żeby, suggesting that they are structurally more complex (namely, containing a C-projection) than the non-finite complements in control constructions lacking żeby (modulo special contexts, viz. 'control switch'). In a comparative perspective, the paper brings corpus-linguistic and experimental evidence to bear on the question whether, surface appearances notwithstanding, the infinitival complements of anti-control verbs in German should similarly be analyzed as truly sentential, i.e., C-headed structures.