This paper consists of a short analysis of the sources and the treatment of the legal lexicon in the first dictionary published by the Spanish Royal Academy (1726–1739), followed by a longer commentary on the representation and treatment of the concept of judge, which focuses on how extralinguistic factors are reflected in the definitions. The results highlight the relevance of the legal context of that era for the treatment of the lexicon related to the legal domain, and they also show how the lexicographic data reflect peculiarities of legal matters.
Pogled u e-leksikografiju
(2015)
The paper gives an overview of the basic concepts and classifications in the field of e-lexicography. It presents a classification of e-dictionaries, describes the lexicographic process of compiling an e-dictionary, and surveys the most widely used dictionary writing systems (DWS) and corpus query systems (CQS). As an example of good practice, the online dictionary elexiko (Institute for the German Language, Mannheim) is described in more detail: its aims and purpose, theoretical and methodological foundations, modules, and possibilities of use. As a possible basis for compiling a corpus-based e-dictionary of Croatian in line with the most recent lexicographic (and, more generally, linguistic) theories and practices, the paper presents work on an online lexical-semantic repository of Croatian (a database of semantic frames, image schemas, cognitive primitives, and lexical units) carried out within the project Repozitorij metafora hrvatskoga jezika (Repository of Metaphors of the Croatian Language).
This paper reports on the efforts of twelve national teams in building the International Comparable Corpus (ICC; https://korpus.cz/icc) that will contain highly comparable datasets of spoken, written and electronic registers. The languages currently covered are Czech, Finnish, French, German, Irish, Italian, Norwegian, Polish, Slovak, Swedish and, more recently, Chinese, as well as English, which is considered to be the pivot language. The goal of the project is to provide much-needed data for contrastive corpus-based linguistics. The ICC corpus is committed to the idea of re-using existing multilingual resources as much as possible and the design is modelled, with various adjustments, on the International Corpus of English (ICE). As such, ICC will contain approximately the same balance of 40 percent written language and 60 percent spoken language, distributed across 27 different text types and contexts. A number of issues encountered by the project teams are discussed, ranging from copyright and data sustainability to technical advances in data distribution.
The Component MetaData Infrastructure (CMDI) is a framework for the creation and usage of metadata formats to describe all kinds of resources in the CLARIN world. To better connect to the library world, and to allow librarians to enter metadata for linguistic resources into their catalogues, a crosswalk from CMDI-based formats to bibliographic standards is required. The general and rather fluid nature of CMDI, however, makes it hard to map arbitrary CMDI schemas to metadata standards such as Dublin Core (DC) or MARC 21, which have a mature, well-defined and fixed set of field descriptors. In this paper, we address the issue and propose crosswalks between CMDI-based profiles originating from the NaLiDa project and DC and MARC 21, respectively.
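To make the idea of a crosswalk concrete, the following is a minimal sketch, not the mapping proposed in the paper. It assumes a CMDI-like record that has already been flattened into path/value pairs; the field paths in CROSSWALK and the function name to_dublin_core are hypothetical and chosen only for illustration. Only the target names (title, description, creator, language) are actual Dublin Core elements.

```python
from typing import Dict, List

# Hypothetical crosswalk table: the CMDI-style paths on the left are invented
# for illustration; the values on the right are Dublin Core element names.
CROSSWALK: Dict[str, str] = {
    "GeneralInfo/ResourceName": "title",
    "GeneralInfo/Description": "description",
    "Creation/Creator": "creator",
    "GeneralInfo/Language": "language",
}


def to_dublin_core(cmdi_record: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map a flattened CMDI-like record onto Dublin Core elements.

    Fields with no entry in CROSSWALK are kept under the key 'unmapped'
    as 'path=value' strings, so nothing is silently dropped.
    """
    dc: Dict[str, List[str]] = {}
    for path, values in cmdi_record.items():
        target = CROSSWALK.get(path)
        if target is None:
            dc.setdefault("unmapped", []).extend(f"{path}={v}" for v in values)
        else:
            dc.setdefault(target, []).extend(values)
    return dc


if __name__ == "__main__":
    record = {
        "GeneralInfo/ResourceName": ["Sample corpus"],
        "Creation/Creator": ["A. Author", "B. Author"],
        "Tool/Version": ["1.0"],  # no Dublin Core counterpart in this sketch
    }
    print(to_dublin_core(record))
```

Collecting unmapped fields rather than discarding them reflects the difficulty the abstract names: an open-ended CMDI profile can contain descriptors that a fixed element set such as Dublin Core has no slot for.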
The ISOcat registry reloaded
(2012)
The linguistics community is building a metadata-based infrastructure for the description of its research data and tools. At its core is the ISOcat registry, a collaborative platform to hold a (to be standardized) set of data categories (i.e., field descriptors). Descriptors have definitions in natural language and few explicit interrelations. With the registry growing to many hundreds of entries, authored by many, it is becoming increasingly apparent that the rather informal definitions and their glossary-like design make it hard for users to grasp, exploit and manage the registry’s content. In this paper, we take a large subset of the ISOcat term set and reconstruct from it a tree structure, following in the footsteps of schema.org. Our ontological re-engineering yields a representation that gives users a hierarchical view of linguistic, metadata-related terminology. The new representation adds to the precision of all definitions by making explicit information which is only implicitly given in the ISOcat registry. It also helps to uncover and address potential inconsistencies in term definitions as well as gaps and redundancies in the overall ISOcat term set. The new representation can serve as a complement to the existing ISOcat model, providing additional support for authors and users in browsing, (re-)using, maintaining, and further extending the community’s terminological metadata repertoire.
This study investigated whether an analysis of narrative style (word use and cross-clausal syntax) of patients with symptoms of generalised anxiety and depression disorders can help predict the likelihood of successful participation in guided self-help. Texts by 97 people who had made contact with a primary care mental health service were analysed. Outcome measures were completion of the guided self-help programme, and change in symptoms assessed by a standardised scale (CORE-OM). Regression analyses indicated that some aspects of participants' syntax helped to predict completion of the programme, and that aspects of syntax and word use helped to predict improvement of symptoms. Participants using non-finite complement clauses with above-average frequency were four times more likely to complete the programme (95% confidence interval 1.4 to 11.7) than other participants. Among those who completed, the use of causation words and complex syntax (adverbial clauses) predicted improvement, accounting for 50% of the variation in well-being benefit. These results suggest that the analysis of narrative style can provide useful information for assessing the likelihood of success of individuals participating in a mental health guided self-help programme.
In their analysis of methods that participants use to manage the realization of practical courses of action, Kendrick and Drew (2016/this issue) focus on cases of assistance, where the need to be addressed is Self’s, and Other lends a helping hand. In our commentary, we point to other forms of cooperative engagement that are ubiquitously recruited in interaction. Imperative requests characteristically expect compliance on the grounds of Other’s already established commitment to a wider and shared course of actions. Established commitments can also provide the engine behind recruitment sequences that proceed nonverbally. And forms of cooperative engagement that are well glossed as assistance can nevertheless be demonstrably oriented to established commitments. In sum, we find commitment to shared courses of action to be an important element in the design and progression of certain recruitment sequences, where the involvement of Other is best defined as contribution. The commentary highlights the importance of interdependent orientations in the organization of cooperation. Data are in German, Italian, and Polish.
Drawing on research from conversation analysis and developmental psychology, we point to the existence of “supporters” of morally responsible agency in everyday interaction: causes of our behavior that we are often unaware of, but that would make good-enough reasons for our actions, were we made aware of them.
How to propose an action as an objective necessity. The case of Polish trzeba x (‘one needs to x’)
(2011)
The present study demonstrates that language-specific grammatical resources can afford speakers language-specific ways of organizing cooperative practical action. On the basis of video recordings of Polish families in their homes, we describe action affordances of the Polish impersonal modal declarative construction trzeba x (“one needs to x”) in the accomplishment of everyday domestic activities, such as cutting bread, bringing recalcitrant children back to the dinner table, or making phone calls. Trzeba-x turns in first position are regularly chosen by speakers to point to a possible action as an evident necessity for the furthering of some broader ongoing activity. Such turns in first position provide an environment in which recipients can enact shared responsibility by actively involving themselves in the relevant action. Treating the necessity as not restricted to any particular subject, aligning responsive actions are oriented to when the relevant action will be done, not whether it will be done. We show that such sequences are absent from English interactions by analyzing (a) grammatically similar turn formats in English interaction (“we need to x,” “the x needs to y”), and (b) similar interactive environments in English interactions. We discuss the potential of this research to point to a new avenue for researchers interested in the relationship between language diversity and diversity in human action and cognition.
The authors compare the use of two formats for requesting an object in informal everyday interaction: imperatives, common in our Polish data, and second-person polar questions, common in our English data. Imperatives and polar questions are selected in the same interactional “home environments” across the languages, in which they enact two social actions: drawing on shared responsibility and enlisting assistance, respectively. Speakers across the languages differ in their choice of request format in “mixed” interactional environments that support either. The findings shed light on the orderly ways in which cultural diversity is grounded in invariants of action formation.
Sometimes in interaction, a speaker articulates an overt interpretation of prior talk. Such moments have been studied as involving the repair of a problem with the other’s talk or as formulating an understanding of the matter at hand. Stepping back from the established notions of formulations and repair, we examine the variety of actions speakers do with the practice of offering an interpretation, and the order within this domain. Results show half a dozen usage types of interpretations in mundane interaction. These form a largely continuous territory of action, with recognizably distinct usage types as well as cases falling between these (proto)typical uses. We locate order in the domain of interpretations using the method of semantic maps and show that, contrary to earlier assumptions in the literature, interpretations that formulate an understanding of the matter at hand are actually quite pervasive in ordinary talk. These findings contribute to research on action formation and advance our understanding of understanding in interaction. Data are video- and audio-recordings of mundane social interaction in the German language from a variety of settings.
The present paper explores how rules are enforced and talked about in everyday life. Drawing on a corpus of board game recordings across European languages, we identify a sequential and praxeological context for rule talk. After a game rule is breached, a participant enforces proper play and then formulates a rule with an impersonal deontic statement (e.g. “It’s not allowed to do this”). Impersonal deontic statements express what may or may not be done without tying the obligation to a particular individual. Our analysis shows that such statements are used as part of multi-unit and multi-modal turns where rule talk is accomplished through both grammatical and embodied means. Impersonal deontic statements serve multiple interactional goals: they account for having changed another’s behavior in the moment and at the same time impart knowledge for the future. We refer to this complex action as an “instruction.” The results of this study advance our understanding of rules and rule-following in everyday life, and of how resources of language and the body are combined to enforce and formulate rules.
We examine moments in social interaction in which a person formulates what another thinks or believes. Such formulations of belief constitute a practice with specifiable contexts and consequences. Belief formulations treat aspects of the other person's prior conduct as accountable on the basis that it provided a new angle on a topic, or otherwise made a surprising contribution within an ongoing course of actions. The practice of belief formulations subjectivizes the content that the other articulated and thereby topicalizes it, mobilizing commitment to that position, an account, or further elaboration. We describe how the practice can be put to work in different activity contexts: sometimes it is designed to undermine the other's position as a subjective 'mere belief', at other times it serves to mobilize further topic talk. Throughout, belief formulations show themselves to be a method by which we get to know ourselves and each other as mental agents.
Linguistic relativists have traditionally asked 'how language influences thought', but conversation analysts and anthropological linguists have moved the focus from thought to social action. We argue that 'social action' should in this context not become simply a new dependent variable, because the formulation 'does language influence action' suggests that social action would already be meaningfully constituted prior to its local (verbal and multi-modal) accomplishment. We draw on work by the gestalt psychologist Karl Duncker to show that close attention to action-in-a-situation helps us ground empirical work on cross-cultural diversity in an appreciation of the invariances that make culture-specific elements of practice meaningful.
This article makes an empirical and a methodological contribution to the comparative study of action. The empirical contribution is a comparative study of three distinct types of action regularly accomplished with the turn format du meinst x (“you mean/think x”) in German: candidate understandings, formulations of the other’s mind, and requests for a judgment. These empirical materials are the basis for a methodological exploration of different levels of researcher abstraction in the comparative study of action. Two levels are examined: the (coarser) level of conditionally relevant responses (what a response speaker must do to align with the action of the prior turn) and the (finer) level of “full alignment” (what a response speaker can do to align with the action of a prior turn). Both levels of abstraction provide empirically viable and analytically interesting descriptive concepts for the comparative study of action. Data are in German.
When formulating a request for an object, speakers can choose among different grammatical resources that would all serve the overall purpose. This paper examines the social contexts indexed and created by the choice of the turn format can I have x to request a shared good (the pepper grinder, a tissue from a box on the table, etc.) in British English informal interaction. The analysis is based on a video corpus of approximately 25 h of everyday interaction among family and friends. In its home environment, a request in the format can I have x treats the other as being in control over the relevant material object, a control that is the contingent outcome of ongoing courses of action. This contingent control over a shared good produces an obligation to make it available. This analysis is supported by an examination of similarly formatted request turns in other languages, of can I have x in another interactional environment (after a relevant offer has been made) in British English, and of deviant cases. The results highlight the intimate connection of request format selection to the present engagements of (prospective) request recipients.
This paper introduces a method for computer-based analyses of metaphor in discourse, combining quantitative and qualitative elements. This method is illustrated with data from research on German newspaper discourse concerning the ongoing system transformations of the late 1980s and early 1990s. Methodological aspects of the research procedure are discussed and it is argued that quantitative elements can enhance comparability in cross-cultural and cross-lingual research. Some basic findings of the research are presented. The peculiarities of the German Wende discourse - especially the salience of a passive perspective on the ongoing political and social changes - are outlined.
The article presents an analysis of the metaphorical structure of Polish discourses about the end of state communism. The analysis is based on a database of 1008 metaphors drawn from press texts from 1999 that commemorate important events of 1989. It turns out that the metaphorical structures of different discourses express and entrench ideologically shaped interpretations of history. Metaphorical interpretations of two phenomena were examined in more detail: the conduct of the representatives of the government and of the opposition at the Round Table, and the question of the continuity of history. These two phenomena, whose conceptualization plays an important role in defining the auto-stereotype of the Pole in the Third Republic (III RP), are interpreted by means of different kinds of metaphor. The metaphorical understanding of the continuity of history can be analysed with Lakoff and Johnson's so-called “conceptual theory of metaphor”. The conduct of the communists and the oppositionists, by contrast, is interpreted by means of intertextual metaphors. These are constructed not on the basis of bodily experience but on the basis of experience specific to a given culture. It therefore seems that the formation of different kinds of concepts in discourse activates different areas of the experiential base.
Ethnolinguistic research has gained considerable popularity over the past two decades. The most important metaphorical formula defining the main object of this research is the “linguistic picture of the world” (językowy obraz świata). Since large-scale comparative projects are now being set up, it may be worth considering what this conception of ethnolinguistics leaves out. The visual metaphor of pictures implies that speakers are able to step outside the world and look at it (and name it) from the outside. The article discusses two consequences of this metaphor that may cause problems. First, isolating language from the world of human action, of which language is after all a part, leads to the adoption of a cognitivist model of meaning as a separate stream of communication. Such a model does not fit the everyday experience of the transparency of language. Second, isolating language from life favours “timeless” methods and the study of words abstracted from the situation in which they were used (if not taken out of context). By adopting such metaphors and methods, we may lose sight of a considerable part of what matters for colloquial language, the object of study of ethnoscience.
In this article I examine the opposition between one's own (swój) and the foreign (obcy), which is fundamental to the linguistic picture of the world, in sample texts belonging to Polish and German ideological (political) discourse. Following van Dijk, I assume that establishing and reproducing the distinction between the in-group and other groups is characteristic of ideological discourse. The function of ideological discourse is to legitimize the actions and beliefs of the in-group and to delegitimize the actions and beliefs of other groups. In popular Polish and German magazines dealing with political topics (Wprost and Spiegel), this understanding of own and foreign appears to be accepted. The concrete content given to the abstract concepts swój and obcy is not fixed, however, but functionally variable, depending on who is to be perceived as belonging to the in-group and who is to be excluded from it.
‘Can’ and ‘must’-type modal verbs in the direct sanctioning of misconduct across European languages
(2023)
Deontic meanings of obligation and permissibility have mostly been studied in relation to modal verbs, even though researchers are aware that such meanings can be conveyed in other ways (consider, for example, the contributions to Nuyts/van der Auwera (eds.) 2016). This presentation reports on an ongoing project that examines deontic meaning but takes as its starting point not a type of linguistic structure but a particular kind of social moment that presumably attracts deontic talk: the management of potentially ‘unacceptable’ or untoward actions (taking the last bread roll at breakfast, making a disallowed move during a board game, etc.). Data come from a multi-language parallel video corpus of everyday social interaction in English, German, Italian, and Polish. Here, we focus on moments in which one person sanctions another’s behavior as unacceptable. Using interactional-linguistic methods (Couper-Kuhlen/Selting 2018), we examine similarities and differences across these four languages in the use of modal verbs as part of such sanctioning attempts. First results suggest that modal verbs are not as common in the sanctioning of misconduct as one might expect. Across the four languages, only 10–20% of relevant sequences involve a modal verb. Most of the time, in this context, speakers achieve deontic meaning in other ways (e.g., infinitives such as German nicht so schmatzen, ‘no smacking’). This raises the question of what exactly modal verbs, on those relatively rare occasions when they are used, contribute to the accomplishment of deontic meaning. The reported study pursues this question in two ways: (1) by considering similarities across languages in the ways that modal verbs interact with other (verbal) means in the sanctioning of misconduct; (2) by considering differences across languages in the use of modal verbs.
Here, we find that the relevant modal verbs are used similarly in some activity contexts (enforcing rules during board games), but less so in other activity contexts (mundane situations with no codified rules). In sum, the presented study adds to cross-linguistically grounded knowledge about deontic meaning and its relationship to linguistic structures.
This contribution describes the corpus Deutsch in Namibia (DNam), which is freely accessible via the Datenbank für Gesprochenes Deutsch (DGD). The corpus is a new digital resource that comprehensively and systematically documents the language use of the German-speaking minority in Namibia and the associated language attitudes. We describe the data collection and the methods used (free conversations, “language situations”, semi-structured interviews), the data preparation including transcription, normalization and tagging, as well as the properties of the available corpus (size, available metadata, etc.) and some basic functionalities within the DGD. First research results obtained with the new resource illustrate the versatile usability of the corpus for questions in contact linguistics, variationist linguistics and sociolinguistics.
In examples such as
(1) Du hast scheints / Weiß Gott nichts begriffen.
(2) It cost £200, give or take.
(3) Qu’est ce qu’il a dit?
verbal constructions (German abbreviation: VK; here in each case the parts set in bold) are used in a way that runs counter to the grammar of verbal constructions. In (1) and (2) the verbal construction is used like an adverb or particle, or like an expression functioning as an (adverbial) adjunct/supplement. In (3) the verbal construction has become part of a periphrastic interrogative construction. How are such ‘refunctionalizations’ (Umfunktionalisierungen), as I will call the phenomenon pre-theoretically for now, to be classified? Are they cases of lexicalization or of grammaticalization? Or a phenomenon of a third kind? The refunctionalization of verbal syntagms or constructions (I use the abbreviation UVK for ‘umfunktionalisierte verbale Konstruktion(en)’, i.e. refunctionalized verbal construction(s)) has so far been less well studied than, for instance, the refunctionalization of prepositional phrases, which cross-linguistically can become complex, “secondary” prepositions (compare German auf Grund + genitive / von, English on top of, French à cause de).
Descriptive and generative approaches are compared on the basis of one grammatical detail. Three different types of non-phoric es are described with regard to the grammatical dimensions 'position', 'fixedness' and 'complement association', and the profile of each type is established. Generative solutions are primarily concerned with the subject status of the es types and thus, more generally, with the contested assumption of a structural subject position in the German clause. It is shown that non-phoric es generally does not qualify as a filler of a structural subject position. Generative solutions to that effect contradict the descriptively established grammatical profile of es.
The grammar of als-nominals has not yet been sufficiently researched. Their constituent status is assessed in different ways; the only syntactic functions identified are the adnominal function and the function as verb complement. It is shown that this reductionist approach does not do justice to als-nominals from the perspective of sentence semantics: dislocation out of the NP is accompanied by sentence-semantic changes that are to be understood as interpretations of a changed syntactic function in each case. The paper argues for a total of four possible syntactic functions; in addition to the two already mentioned, there are the verb-related adverbial and the sentence-related adverbial function.
Using two case studies, the paper raises the question of how far insights into linking elements (connectives) and connective structures gained from a single language can be generalized. Empirically, it is concerned on the one hand with the topology of adverbial connectives, and on the other with the relationship between adverbial connectives, subjunctors (or subordinate-clause introducers) and the prepositions underlying them. The methodological starting point in each case is the analyses and classifications of the HDK, that is, an approach geared decidedly to German. The aim is to show that the fine-grained single-language analysis offered by the HDK can profitably be adapted to other European languages, here English, French and, marginally, Polish, provided the framework conditions are right, that is, provided the underlying functional comparative concepts and language-specific structural principles are taken into account. In that case a gain for the description of German can be expected as well.
This paper develops a proposal for distinguishing word classes among nominal function words that is based on the principle of ‘underspecification’. The feature for which nominal function words can be underspecified is ‘independence’ (Selbstständigkeit). Accordingly, ‘only-independent nominal function words’ (genuine pronouns) are distinguished from ‘only-adnominal’ ones (genuine determiners) and from ‘non-independent’ ones (non-selbstständig). Particular attention is paid to the non-independent items such as German dieser, which are underspecified with respect to independence. Following English grammaticography, a typology of uses is presented for this group. Their competition with the only-independent items is worked out cross-linguistically, above all in the contrast between English and German. From these observations, more general conclusions are drawn about the phenomenon of indeterminacy or adaptivity of linguistic expressions, its description by means of underspecification, and its different manifestations in inflectional morphology and in the lexicon of function and content words. The background to the paper is the IDS project „Grammatik des Deutschen im europäischen Vergleich“ (GDE).
This paper develops a new, functionally motivated systematics for the adnominal genitive and corresponding von-phrases, referred to collectively as ‘possessive attributes’. It is based on findings from language-typological research and on comparison with other, above all Germanic, languages. The descriptive framework for the NP, with the overarching ‘functional domain’ of reference and its subdomains, is presented. Possessive attributes can be defined as one form of expression of the subdomain modification. It is shown that possessive attributes can realize different functional types of modification: referentially anchoring (der Hut meiner Schwester), qualitative (ein Autor deutscher Herkunft) and classificatory (ein Mann der Tat). Marginal possessive attributes such as the ‘partitive genitive’ (Teilungsgenitiv; eine Tasse heißen Tees) and the identity genitive (das Laster der Unbescheidenheit) are also taken into account. The new ordering of possessive attributes by functional subdomains is preferable to the traditional classification in that it considers only basic distinctions according to the reference-semantic status of the modifier (conceptual versus referential) and according to the modifier's contribution to the meaning composition of the NP (anchoring versus qualitative or classificatory). It is moreover backed up by test procedures such as the pronominalization test.
The paper pursues two aims, one descriptive and one methodological. At the level of grammatical description, the German relative clause construction is analysed by contrasting it with corresponding constructions in other European languages, in particular English, French, Polish and Hungarian, the core contrast languages of the project „Grammatik des Deutschen im europäischen Vergleich“. The analysis draws on the central project concepts ‘functional domain’ and ‘parameter of variation’. The functional domain of the relative clause is defined as a contribution to the overarching function of nominal constructions, namely reference, and more specifically as referential modification of the conceptual core by an anchoring state of affairs. Three of the parametrizations that differentiate the languages are singled out and discussed in their correlation. Methodologically, the relative clause serves to show how typological generalizations, contrasts between languages (in this case mostly closely related or linked through language contact) and language-specific properties are to be related to one another, always in the service of better insight into how German works.
The traditional classification of man as an indefinite pronoun is called into question, and alternative classifications are examined. To this end, the morphosyntax and semantics of man are worked out. Particular attention is given to the dichotomy between 'generic' and 'particular' use. The paper closes with a brief look at man from the learner's perspective and in cross-linguistic comparison.
The nominal possessor construction of the type dem Vater sein Hut, which is widespread in colloquial spoken German and in dialects, is out of line in morphological, syntactic and semantic respects. Nevertheless, it persists stubbornly in these varieties and thus appears to be functionally adequate.
The paper gives an overview of the data for German and presents the analyses proposed for its morphology and its syntactic and semantic structure. Looking at other languages and at descriptive approaches in general linguistic typology opens up a new perspective that places this construction in the context of fundamental alternatives for marking syntactic relations (“head-marking” versus “dependent-marking”). This broader perspective also sheds new light on the much-discussed question of whether the construction arose through reanalysis or grammaticalisation. Finally, the paper asks which properties make this construction attractive to speakers despite its grammatical idiosyncrasies and its condemnation by normative grammar.
Relational adjectives, i.e. adjectives that are derived from nouns and that, in attributive construction with a head noun, express an unspecific relation between the concept of the head and the concept of the base, play an important role in the classical languages. Taking Virgil's silvestris musa, the woodland muse, as its starting point, this paper traces the after-effects of this pattern in European languages: French, English and, above all, German. The semantic function of such adjectives is assigned to the functional domain of ‘classificatory modification’. Cross-linguistic commonalities and differences are worked out. Relational adjectives in Polish and Hungarian, the other comparison languages of the project „Grammatik des Deutschen im europäischen Vergleich“, are also considered briefly. Against this background, the question of how universal, language-family-specific, areal and language-specific properties of the construction pattern relate to one another, and of the extent of Latin influence, can be formulated more precisely.
What is the subject of German linguistics? This seemingly simple question has no obvious answer. In the ZGL’s first issue, the editors required contributions to cover the whole of the German language and to be theoretically sound but application-orientated, whereas the current ZGL-homepage defines the German language of present and history in all its differentiations as its subject matter.
Looking through the fifty volumes of ZGL, three relationships can be identified as presumably enlightening the role of language, in particular the German language: language and mind; language and language use; language and culture. Though of a different systematic type, language and data should be added as an increasingly important pairing for conceptualizing language. On this basis, I also discuss the position of linguistic studies of the German language, mirrored in the ZGL-volumes, between social, cultural and natural sciences, as well as the corresponding epistemic approaches – like explaining vs. understanding.
From the perspective of social worlds, the paper discusses how the interpretive patterns and stocks of knowledge that migrants draw on relate to the forms of their social participation. The empirical analysis is based on “intra-ethnic” interaction processes in the social world of a “Turkish” football club in Mannheim. It is shown that, in the case studied, ethnic self-organisation and integration combine in a specific way. The structural features of this local social world include, in particular, its embedding in a large number of different contexts and its internal differentiation. Further features are the pragmatic everyday use of “Turkish” cultural patterns and the universalist character of the symbolic legitimations in the everyday philosophy of the club's members. Finally, the social world studied is characterised by the dominance of action requirements and interpretations from the world of football over those from the “ethnic” milieu, and by the questioning of the categories “German” and “Turkish”.
A constructicon, i.e., a structured inventory of constructions, essentially aims at documenting the functions of lexical and grammatical constructions. Among other parameters, so-called constructional collo-profiles, as introduced by Herbst (2018, 2020), are conclusive for determining constructional meanings. They provide information on how relevant individual words are for construction slots, they hint at usage preferences of constructions, and they serve as a helpful indicator of semantic peculiarities of constructions. However, even though collo-profiles constitute an indispensable component of constructicon entries, they pose major challenges for constructicographers: for a constructicographic enterprise it is not feasible to conduct collostructional analyses for hundreds or even thousands of constructions. In this article, we introduce a procedure based on the large language model BERT that makes it possible to predict collo-profiles without having to extensively annotate instances of constructions in a given corpus. Specifically, by discussing the constructions X macht Y ADJP (‘X makes Y ADJ’, e.g. he drives him crazy) and N1 PREP N1 (e.g., bumper to bumper, constructions over constructions), we show how the developed automated system generates collo-profiles based on a limited number of annotated instances. Finally, we place collo-profiles alongside other dimensions of constructional meaning included in the German Constructicon.
Vorwort
(2021)
This chapter starts out by giving a brief overview of the main priorities of international and German studies in the area of linguistic landscape research. The contributions to this volume are then embedded in current debates and developments in the field. Finally, we outline important desiderata of linguistic landscape research that focus on German and address challenges of knowledge transfer and application as well as possible contributions to international lines of research.
Novel formats of construction-based description hold great potential for phenomena that fall through the cracks in traditional kinds of linguistic reference works. Using the example of German verb argument structure constructions with a prepositional object, we demonstrate that a construction-based description of such phenomena is superior to existing lexicographic and grammaticographic treatments, but that it also poses a number of new problems. The most fundamental of these relates to the fact that construction-based analyses can be proposed on different levels of abstraction. We illustrate pertinent problems relating to the precise identification of constructional form and meaning and suggest a multi-layered descriptive format for web-based electronic reference constructica that can accommodate these challenges. Semantically, the proposed solution integrates both lumping and splitting perspectives on constructional grain size and permits users to flexibly zoom in and out on individual elements in the resource. Formally, it can capture variation in the number and marking of realised arguments as found in e.g. passives and transitivity alternations. Aspects of the theoretical controversy between Construction Grammar and Valency Theory are addressed where relevant, but our focus is on questions of description and the practical implementation of construction-based analyses in a suitable type of linguistic reference work.
Co-development of action, conceptualization and social interaction mutually scaffold and support each other within a virtuous feedback cycle in the development of human language in children. Within this framework, the purpose of this article is to bring together diverse but complementary accounts of research methods that jointly contribute to our understanding of cognitive development and in particular, language acquisition in robots. Thus, we include research pertaining to developmental robotics, cognitive science, psychology, linguistics and neuroscience, as well as practical computer science and engineering. The different studies are not at this stage all connected into a cohesive whole; rather, they are presented to illuminate the need for multiple different approaches that complement each other in the pursuit of understanding cognitive development in robots. Extensive experiments involving the humanoid robot iCub are reported, while human learning relevant to developmental robotics has also contributed useful results.
Disparate approaches are brought together via common underlying design principles. Without claiming to model human language acquisition directly, we are nonetheless inspired by analogous development in humans and consequently, our investigations include the parallel co-development of action, conceptualization and social interaction. Though these different approaches need to ultimately be integrated into a coherent, unified body of knowledge, progress is currently also being made by pursuing individual methods.
Within cognitive linguistics, there is an increasing awareness that the study of linguistic phenomena needs to be grounded in usage. Ideally, research in cognitive linguistics should be based on authentic language use, its results should be replicable, and its claims falsifiable. Consequently, more and more studies now turn to corpora as a source of data. While corpus-based methodologies have increased in sophistication, the use of corpus data is also associated with a number of unresolved problems. The study of cognition through off-line linguistic data is, arguably, indirect, even if such data fulfils desirable qualities such as being natural, representative and plentiful. Several topics in this context stand out as particularly pressing issues. This discussion note addresses (1) converging evidence from corpora and experimentation, (2) whether corpora mirror psychological reality, (3) the theoretical value of corpus linguistic studies of ‘alternations’, (4) the relation of corpus linguistics and grammaticality judgments, and, lastly, (5) the nature of explanations in cognitive corpus linguistics. We do not claim to resolve these issues nor to cover all possible angles; instead, we strongly encourage reactions and further discussion.
An experiment on the English caused motion construction in adult- and child-directed speech was conducted to assess to what extent (i) verbal frequency biases and (ii) a register-specific preference for explicit and redundant coding influence speakers' selection of argument structure constructions during speaking. Subjects retold the contents of short cartoon video clips to adult and child interaction partners. The stimuli showed events of caused motion which suggested designations with verbs for which caused motion complementation was either (i) uncommon/unattested, (ii) conventional or (iii) the dominant usage in a sample extracted from the BNC. The results show a significant tendency to avoid more compact coding (using the caused motion construction instead of a possible two-clause paraphrase) in child-directed speech. At the same time, they also point to an interaction between the register-specific preference for explicitness and verbs' relative conventionality in the construction that neutralizes the effect for verbs that are highly frequent in the target environment.
Research on syntactic ambiguity resolution in language comprehension has shown that subjects' processing decisions are influenced by a variety of heterogeneous factors such as e.g., syntactic complexity, semantic fit and the discourse frequency of the competing structures. The present paper investigates a further potentially relevant factor in such processes: effects of syntagmatic lexical chunking (or matching to a complex memorized prefab) whose occurrence would be predicted from usage-based assumptions about linguistic categorisation. Focusing on the widely studied so-called DO/SC-ambiguity in which a post-verbal NP is syntactically ambiguous between a direct object and the subject of an embedded clause, potentially biasing collocational chunks of the relevant type are identified in a number of corpus-linguistic pretests and then investigated in a self-paced reading experiment. The results show a significant increase in processing difficulty from a collocationally neutral over a lexically biasing to a strongly biasing condition. This suggests that syntagmatically complex and partially schematic templates of the kind envisioned in usage-based Construction Grammar may impinge on speakers' online processing decisions during sentence comprehension.
Introduction
(2008)
Smooth turn-taking in conversation depends in part on speakers being able to communicate their intention to hold or cede the floor. Both prosodic and gestural cues have been shown to be used in this context. We investigate the interplay of pitch movements and hand gestures at locations at which speaker change becomes relevant, comparing their use in German and Swedish. We find that there are some shared functions of prosody and gesture with regard to turn-taking in the two languages, but that these shared functions appear to be mediated by the different phonological demands on pitch in the two languages.
Gestures, viewed as a means of communication, can serve conversational participants at several levels. As co-speech gestures, they can add information to the verbally expressed content and they can serve to manage turn-taking. In order to look more closely at the interplay between these resources in face-to-face conversation, we annotated hand gestures, syntactic completion points and the related turn organisation, and measured the timing of gesture strokes relative to their lexical/phrasal referents. In a case study on German, we observe a trend for speakers to vary less in gesture-lexis on- and offsets when keeping the turn after syntactic completions than at speaker changes, backchannels or other locations in a conversation. This indicates that timing properties of non-verbal cues interact with verbal cues to manage turn-taking.
Social media, as the fifth estate, increasingly influence public discourse and play a major role in shaping public opinion. Undoubtedly, they have the potential to promote participation and democracy. On the other hand, they also constitute a risk for democratic societies, as the spread of hate speech and fake news has shown. As a response, forms of counterspeech organised by civil society have emerged in social media to counter the normalisation of hate speech and democracy-threatening discourses. In order to influence discourse in social media in the sense of a fifth estate, counterspeech campaigns must also be visible quantitatively. In this ethnographic contrastive study, I analysed the activities of the German and Finnish Facebook groups of the network #iamhere international. The intensity and continuity of their activities are clearly influenced by their strategic organisation: conventionalised rules support them, whereas missing or inconsistent rules appear to be counterproductive.
The aim of this paper is to investigate silence and its linguistic shaping in relation to the macro- and microstructure of the literary text. The theoretical background is provided by linguistic and literary studies that deal with communicative, pragmatic, semantic, cultural and literary-historical aspects of silence (Schweigen) and emphasise its distinction from stillness (Stille), which is to be understood as a natural phenomenon. Starting from the model of literary communication, the paper points to the role of silence in the triad author-text-reader and to the ways it can be realised in the structure and language of narrative text. Attention is paid not only to silence as not-speaking but also to empty talk, which actualises the semantics of silence within the communicative situation. These two opposing forms of silence appear in the Berlin novels of Robert Walser (1878-1956) and are analysed closely from the perspective of macro- and microstylistics. The paper examines the narrative principle of garrulousness in Geschwister Tanner (1907), irony in Der Gehülfe (1908) and the fragmentary mode of narration in Jakob von Gunten (1909), through which silence is realised both at the thematic level and in the structure and language of the text. As a narrative strategy, silence shapes the form and content of Walser's Berlin novels and thus achieves the effect on the reader that the author intended.
Accentuation, Uncertainty and Exhaustivity - Towards a Model of Pragmatic Focus Interpretation
(2010)
This paper presents a model of pragmatic focus interpretation that is assumed to be part of a complete language comprehension model and that is inspired by Levelt's language processing model. The model is derived from our empirical data on the role of accentuation, prosodic indicators of uncertainty and context for pragmatic focus interpretation. In its present state, the model is restricted to these data, but nevertheless generates predictions.
Many studies on dictionary use presuppose that users do indeed consult lexicographic resources. However, little is known about what users actually do when they try to solve language problems on their own. We present an observation study where learners of German were allowed to browse the web freely while correcting erroneous German sentences. In this paper, we are focusing on the multi-methodological approach of the study, especially the interplay between quantitative and qualitative approaches. In one example study, we will show how the analysis of verbal protocols, the correction task and the screen recordings can reveal the effects of intuition, language (learning) awareness, and determination on the accuracy of the corrections. In another example study, we will show how preconceived hypotheses about the problem at hand might hinder participants from arriving at the correct solution.
In this paper we start from the premise that the appropriateness of linguistic forms should be judged not across the board but in relation to the context at hand. Using an online questionnaire study on subordinate clauses introduced by weil, we examine the hypothesis that, in forms of communication that are less oriented towards standard and written-language norms, variants that do not conform to the written standard are perceived as (at least) as appropriate as a written-standard variant, or at any rate perceived differently. We investigate this with three tasks: reception, production, and association with particular media and text types. We can show that the variant conforming to the written norm is consistently rated as the most acceptable. In all three tasks, however, there are also clear and concordant effects suggesting that the different variants are indeed rated, produced and associated differently depending on the text type.
Wiktionary is increasingly gaining influence in a wide variety of linguistic fields such as NLP and lexicography, and has great potential to become a serious competitor for publisher-based and academic dictionaries. However, little is known about the "crowd" that is responsible for the content of Wiktionary. In this article, we want to shed some light on selected questions concerning large-scale cooperative work in online dictionaries. To this end, we use quantitative analyses of the complete edit history files of the English and German Wiktionary language editions. Concerning the distribution of revisions over users, we show that — compared to the overall user base — only very few authors are responsible for the vast majority of revisions in the two Wiktionary editions. In the next step, we compare this distribution to the distribution of revisions over all the articles. The articles are subsequently analysed in terms of rigour and diversity, typical revision patterns through time, and novelty (the time since the last revision). We close with an examination of the relationship between corpus frequencies of headwords in articles, the number of article visits, and the number of revisions made to articles.
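The concentration of revisions among a few authors described above can be illustrated with a minimal sketch. It is not the authors' analysis: the function name `top_share`, the toy author list and the 10% cut-off are assumptions made purely for illustration.

```python
from collections import Counter

def top_share(revision_authors, fraction):
    """Share of all revisions made by the most active `fraction` of users.

    `revision_authors` holds one author name per revision (hypothetical input;
    real edit-history dumps would need to be parsed first).
    """
    counts = sorted(Counter(revision_authors).values(), reverse=True)
    k = max(1, round(len(counts) * fraction))  # number of top users considered
    return sum(counts[:k]) / sum(counts)

# Toy data: one very active author and nine occasional ones.
authors = ["a"] * 90 + list("bcdefghij")
share = top_share(authors, 0.10)
print(f"Top 10% of users account for {share:.1%} of revisions")
```

On real data, a share close to 1 for a small fraction would correspond to the skew the abstract reports.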
Dictionary usage research views dictionaries primarily as tools for solving linguistic problems. A large proportion of dictionary use now takes place online and can thus be easily monitored using tracking technologies. Using the data gathered through tracking usage data, we hope to optimize user experiences of dictionaries and other linguistic resources. Usage statistics are also used for external evaluation of linguistic resources. In this paper, we pursue the following three questions from a quantitative perspective: (1) What new insights can we gain from collecting and analysing usage data? (2) What limitations of the data and/or the collection process do we need to be aware of? (3) How can these insights and limitations inform the development and evaluation of linguistic resources?
Dictionaries have been part and parcel of literate societies for many centuries. They assist in communication, particularly across different languages, to aid in understanding, creating, and translating texts. Communication problems arise whenever a native speaker of one language comes into contact with a speaker of another language. At the same time, English has established itself as a lingua franca of international communication. This marked tendency gives lexicography of English a particular significance, as English dictionaries are used intensively and extensively by huge numbers of people worldwide.
We present ESDexplorer (https://owid.shinyapps.io/ESDexplorer), a browser application which allows the user to explore the data from a large European survey on dictionary use and culture. We built ESDexplorer with several target groups in mind: our cooperation partners, other researchers, and a more general public interested in the results. Also, we present in detail the architecture and technological realisation of the application and discuss some legal aspects of data protection that motivated some architectural choices.
The coronavirus pandemic may be the largest crisis the world has had to face since World War II. It does not come as a surprise that it is also having an impact on language as our primary communication tool. In this short paper, we present three inter-connected resources that are designed to capture and illustrate these effects on a subset of the German language: An RSS corpus of German-language newsfeeds (with freely available untruncated frequency lists), a continuously updated HTML page tracking the diversity of the vocabulary in the RSS corpus and a Shiny web application that enables other researchers and the broader public to explore the corpus in terms of basic frequencies.
We introduce DeReKoGram, a novel frequency dataset containing lemma and part-of-speech (POS) information for 1-, 2-, and 3-grams from the German Reference Corpus. The dataset contains information based on a corpus of 43.2 billion tokens and is divided into 16 parts based on 16 corpus folds. We describe how the dataset was created and structured. By evaluating the distribution over the 16 folds, we show that it is possible to work with a subset of the folds in many use cases (e.g., to save computational resources). In a case study, we investigate the growth of vocabulary (as well as the number of hapax legomena) as an increasing number of folds are included in the analysis. We cross-combine this with the various cleaning stages of the dataset. We also give some guidance in the form of Python, R, and Stata markdown scripts on how to work with the resource.
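The vocabulary-growth case study can be sketched roughly as follows. This is not one of the scripts shipped with the resource: the function `vocabulary_growth` and the toy folds (plain lemma-to-frequency mappings) are hypothetical, and the actual file format of the dataset is not assumed here.

```python
from collections import Counter

def vocabulary_growth(folds):
    """Return (number of types, number of hapax legomena) after each fold
    has been added cumulatively."""
    totals = Counter()
    growth = []
    for fold in folds:
        totals.update(fold)  # add this fold's frequencies to the running totals
        n_types = len(totals)
        n_hapax = sum(1 for freq in totals.values() if freq == 1)
        growth.append((n_types, n_hapax))
    return growth

# Toy folds standing in for per-fold 1-gram frequency lists (invented data).
toy_folds = [
    {"Haus": 3, "gehen": 1},
    {"Haus": 2, "Baum": 1},
    {"gehen": 1, "See": 1},
]
growth = vocabulary_growth(toy_folds)
for i, (types, hapaxes) in enumerate(growth, start=1):
    print(f"folds 1..{i}: {types} types, {hapaxes} hapax legomena")
```

Because the totals are accumulated fold by fold, the same loop can be stopped early to work with only a subset of the folds.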
Neologisms, i.e., new words or meanings, are finding their way into everyday language use all the time. In the process, already existing elements of a language are recombined or linguistic material from other languages is borrowed. But are borrowed neologisms accepted by the speech community as readily as neologisms that were formed from “native” material? We investigate this question based on neologisms in German. Building on the corresponding results of a corpus study, we test the hypothesis that “native” neologisms are more readily accepted than those borrowed from English. To do so, we use a psycholinguistic experimental paradigm that allows us to estimate the degree of uncertainty of the participants based on the mouse trajectories of their responses. Unexpectedly, our results suggest that the neologisms borrowed from English are accepted more frequently, more quickly, and more easily than the “native” ones. These effects, however, are restricted to people born after 1980, the so-called millennials. We propose potential explanations for this mismatch between corpus results and experimental data and argue, among other things, for a reinterpretation of previous corpus studies.
Based on the privative derivational suffix -los, we test statements found in the literature on word formation using a – at least in this field – novel empirical basis: a list of affective-emotional ratings of base nouns and associated -los derivations. In addition to a frequency analysis based on the German Reference Corpus, we show that, in general, emotional polarity (so-called valence, positive vs. negative emotions) is reversed by suffixation with -los. This change is stronger for more polarized base nouns. The perceived intensity of emotion (so-called arousal) is generally lower for -los derivations than for base nouns. Finally, to capture the results theoretically, we propose a prototypical -los construction in the framework of Construction Morphology.
The public acceptance and impact of research in the natural sciences and engineering depend fundamentally on whether its aims and results can be conveyed to the public. Yet the content of current research projects is often hard for a lay audience to access and understand. With the aim of improving public debate on scientific and technological research, the project PopSci – Understanding Science empirically examines and evaluates an important sector of popular science discourse in Germany. To this end, we identify the linguistic features of German popular science texts using corpus-based methods and examine their effect on how lay readers cognitively process the texts. For this we use pre- and post-reading knowledge tests. We also record readers' eye movements while they read popular science texts. From this combination of methods we attempt to derive initial recommendations for improving the linguistic style and the knowledge representation of popular science texts.
We present an empirical study addressing the question of whether, and to what extent, lexicographic writing aids improve text revision results. German university students were asked to optimise two German texts using (1) no aids at all, (2) highlighted problems, or (3) highlighted problems accompanied by lexicographic resources that could be used to solve the specific problems. We found that participants from the third group corrected the largest number of problems and introduced the fewest semantic distortions during revision. Also, they reached the highest overall score and were most efficient (as measured in points per time). The second group, with highlighted problems, lies between the two other groups in almost every measure we analysed. We discuss these findings in the context of intelligent writing environments, the effectiveness of writing aids in practical usage situations and the teaching of dictionary skills.
This replication study aims to investigate a potential bias toward addition in the German language, building upon previous findings of Winter and colleagues who identified a similar bias in English. Our results confirm a bias in word frequencies and binomial expressions, aligning with these previous findings. However, the analysis of distributional semantics based on word vectors did not yield consistent results for German. Furthermore, our study emphasizes the crucial role of selecting appropriate translational equivalents, highlighting the significance of considering language-specific factors when testing for such biases for languages other than English.
Poster by the Text+ partner Leibniz-Institut für Deutsche Sprache Mannheim, presented at the workshop "Wohin damit? Storing and reusing my language data" on 22 June 2023 in Mannheim. The poster was produced in the context of the work of the association Nationale Forschungsdateninfrastruktur (NFDI) e.V. NFDI is funded by the Federal Republic of Germany and the 16 federal states, and the Text+ consortium is funded by the Deutsche Forschungsgemeinschaft (DFG), project number 460033370. The authors are grateful for the funding and support. Thanks also go to all institutions and individuals who are committed to the association and its goals.
The Leibniz-Institute for the German Language (IDS) was established in Mannheim in 1964. Since then, it has been at the forefront of innovation in German linguistics as a hub for digital language data. This chapter presents various lessons learnt from over five decades of work by the IDS, ranging from the importance of sustainability, through its strong technical base and FAIR principles, to the IDS’ role in national and international cooperation projects and its expertise on legal and ethical issues related to language resources and language technology.
This paper summarises the current state of discussion on the design of an organisational model for the institutional consolidation of the collaborative research project TextGrid and brings together the results achieved so far in work package 3, Strukturelle und organisatorische Nachhaltigkeit (structural and organisational sustainability). The organisational model outlined here is based on the sustainability concepts developed in D-Grid and WissGrid and adapts the concept of the Virtual Organisation (VO) for TextGrid. Overall, TextGrid aims to put its activities on a permanent institutional footing after the end of the project term and intends to establish pathways and processes jointly with virtual research environments from other academic disciplines. On 24/25 February 2011, TextGrid hosted a strategy workshop in Berlin at which a panel of experts convened on the „Nachhaltigkeit von Virtuellen Forschungsumgebungen“ (sustainability of virtual research environments). The discussion will address how virtual research environments can be sustainable on the basis of today's financial and organisational structures and what recommendations follow from this for TextGrid. The results of the expert panel's discussion, together with the considerations in this paper, will feed into the design of a more comprehensive organisational model that will form the basis for consolidating TextGrid.
The actual or anticipated impact of research projects can be documented in scientific publications and project reports. While project reports are available at varying levels of accessibility, they may rarely be used or shared outside of academia. Moreover, the connection between the outcomes of an actual research project and their potential secondary use may not be made explicit in a project report. This paper outlines two methods for classifying and extracting the impact of publicly funded research projects. In the first method, subject matter experts identify impact categories and assign them to research projects, and by extension to their reports, without considering the content of the reports. This process resulted in a classification schema that we describe in this paper. In the second method, which is still work in progress, impact categories are extracted from the actual text data.
In this paper, we present the Multiple Annotation approach, which solves two problems: the problem of annotating overlapping structures, and the problem that occurs when documents should be annotated according to different, possibly heterogeneous tag sets. This approach has many advantages: it is based on XML, the modeling of alternative annotations is possible, each level can be viewed separately, and new levels can be added at any time. The files can be regarded as an interrelated unit, with the text serving as the implicit link. Two representations of the information contained in the multiple files (one in Prolog and one in XML) are described. These representations serve as a base for several applications.
Precisely because the topic of this year's working conference has repeatedly been the subject of various research directions for several decades and is still being investigated in equally diverse ways today, the conference was intended to present current projects from different disciplines and to discuss them across disciplinary boundaries. The aim of the conference was to offer physicians, psychologists and conversation analysts a platform for making contact with one another, for jointly discussing the approaches, research interests and methods presented, and for identifying the points at which these differ from their own.
Like many other linguistic expressions, artifact designations in German exhibit context-dependent meaning variation that follows systematically recurring patterns. One aim of this study is to find out how this meaning variation comes about and which semantic relations or features form the link between the individual variants of a word's meaning. This also makes it possible to determine more precisely the degree of systematicity or regularity of the polysemy. Meaning variation in artifact designations is treated here essentially as a case of metonymic meaning shift. The starting point of the analysis is an underspecified semantic form of the linguistic expressions, which is enriched and modeled step by step by means of various inferential procedures, drawing on context and world knowledge.
This paper describes work directed towards the development of a syllable prominence-based prosody generation functionality for a German unit selection speech synthesis system. A general concept for syllable prominence-based prosody generation in unit selection synthesis is proposed. As a first step towards its implementation, an automated syllable prominence annotation procedure based on acoustic analyses has been performed on the BOSS speech corpus. The prominence labeling has been evaluated against an existing annotation of lexical stress levels and manual prominence labeling on a subset of the corpus. We discuss methods and results and give an outlook on further implementation steps.
A case study is used to show that legal work and language work are closely intertwined in the practical work of the European Court of Justice (EuGH). When a disputed case requires the concrete drafting of a tenable description of the facts, it becomes apparent that the court's legal work and its language work are in fact identical. In such a case it is useful and advantageous for the court to be able to draw on as many linguistic formulations as possible (including in different languages). The aim is to consider as many interpretations as possible in order to make the judgment robust against challenge. In this situation, proposals to restrict in advance and across the board the range of languages in which the court works are counterproductive.
Zur Theorie der Eigennamen
(1972)
Linguistik
(1975)
The present study uses electromagnetic articulography, by which the position of tongue and lips during speech is measured, for the study of dialect variation. By using generalized additive modeling to analyze the articulatory trajectories, we are able to reliably detect aggregate group differences, while simultaneously taking into account the individual variation of dozens of speakers. Our results show that two Dutch dialects show clear differences in their articulatory settings, with generally a more anterior tongue position in the dialect from Ubbergen in the southern half of the Netherlands than in the dialect of Ter Apel in the northern half of the Netherlands. A comparison with formant-based acoustic measurements further reveals that articulography is able to reveal interesting structural articulatory differences between dialects which are not visible when only focusing on the acoustic signal.
The present study introduces articulography, the measurement of the position of tongue and lips during speech, as a promising method for the study of dialect variation. By using generalized additive modeling to analyze articulatory trajectories, we are able to reliably detect aggregate group differences, while simultaneously taking into account the individual variation across dozens of speakers. Our results on the basis of Dutch dialect data show clear differences between the southern and the northern dialect with respect to tongue position, with a more frontal tongue position in the dialect from Ubbergen (in the southern half of the Netherlands) than in the dialect of Ter Apel (in the northern half of the Netherlands). Thus, articulography appears to be a suitable tool for investigating structural differences in pronunciation at the dialect level.
We examine the new task of detecting derogatory compounds (e.g. curry muncher). Derogatory compounds are much more difficult to detect than derogatory unigrams (e.g. idiot) since they are more sparsely represented in lexical resources previously found effective for this task (e.g. Wiktionary). We propose an unsupervised classification approach that incorporates linguistic properties of compounds. It mostly depends on a simple distributional representation. We compare our approach against previously established methods proposed for extracting derogatory unigrams.
We present an approach for modeling German negation in open-domain fine-grained sentiment analysis. Unlike most previous work in sentiment analysis, we assume that negation can be conveyed by many lexical units (and not only common negation words) and that different negation words have different scopes. Our approach is examined on a new dataset comprising sentences with mentions of polar expressions and various negation words. We identify different types of negation words that have the same scopes. We show that negation modeling based on these types alone largely outperforms traditional negation models, which assume the same scope for all negation words and employ window-based scope detection rather than scope detection based on syntactic information.
We present the pilot edition of the GermEval Shared Task on the Identification of Offensive Language. This shared task deals with the classification of German tweets from Twitter. It comprises two tasks, a coarse-grained binary classification task and a fine-grained multi-class classification task. The shared task had 20 participants submitting 51 runs for the coarse-grained task and 25 runs for the fine-grained task. Since this is a pilot task, we describe the process of extracting the raw data for the data collection and the annotation schema. We evaluate the results of the systems submitted to the shared task. The shared task homepage can be found at https://projects.cai.fbi.h-da.de/iggsa/
We address the detection of abusive words. The task is to identify such words among a set of negative polar expressions. We propose novel features employing information from both corpora and lexical resources. These features are calibrated on a small manually annotated base lexicon which we use to produce a large lexicon. We show that the word-level information we learn cannot be equally derived from a large dataset of annotated microposts. We demonstrate the effectiveness of our (domain-independent) lexicon in the cross-domain detection of abusive microposts.
We discuss the impact of data bias on abusive language detection. We show that classification scores on popular datasets reported in previous work are much lower under realistic settings in which this bias is reduced. Such biases are most notably observed on datasets that are created by focused sampling instead of random sampling. Datasets with a higher proportion of implicit abuse are more affected than datasets with a lower proportion.
We examine predicative adjectives as an unsupervised criterion to extract subjective adjectives. We compare this criterion not only with a weakly supervised extraction method but also with gradable adjectives, i.e. another highly subjective subset of adjectives that can be extracted in an unsupervised fashion. In order to demonstrate the robustness of this extraction method, we evaluate the extraction with the help of two different state-of-the-art sentiment lexicons (as a gold standard).
Implicitly abusive language – What does it actually look like and why are we not getting there?
(2021)
Abusive language detection is an emerging field in natural language processing which has received a large amount of attention recently. Still, the success of automatic detection is limited. In particular, the detection of implicitly abusive language, i.e. abusive language that is not conveyed by abusive words (e.g. dumbass or scum), is not working well. In this position paper, we explain why existing datasets make learning implicit abuse difficult and what needs to be changed in the design of such datasets. Arguing for a divide-and-conquer strategy, we present a list of subtypes of implicitly abusive language and formulate research tasks and questions for future research.