Refine
Year of publication
- 2021 (107) (remove)
Document Type
- Article (47)
- Conference Proceeding (26)
- Part of a Book (20)
- Book (4)
- Other (3)
- Part of Periodical (3)
- Report (2)
- Course Material (1)
- Working Paper (1)
Language
- English (107) (remove)
Keywords
- Interaktion (32)
- Konversationsanalyse (32)
- Korpus <Linguistik> (24)
- Deutsch (23)
- Kommunikation (12)
- Forschungsdaten (11)
- conversation analysis (11)
- Computerlinguistik (9)
- Semantik (8)
- Grammatik (7)
Publicationstate
- Veröffentlichungsversion (76)
- Zweitveröffentlichung (19)
- Postprint (11)
- Erstveröffentlichung (1)
Reviewstate
- Peer-Review (87)
- (Verlags)-Lektorat (11)
Publisher
- Taylor & Francis (15)
- Benjamins (7)
- Leibniz-Institut für Deutsche Sprache (IDS) (7)
- Association for Computational Linguistics (6)
- Linköping University Electronic Press (6)
- Leibniz-Institut für Deutsche Sprache (4)
- Verlag für Gesprächsforschung (4)
- CLARIN (3)
- Cambridge University Press (3)
- Deutsche Gesellschaft für Sprachwissenschaft (3)
Our paper discusses family language policies among multilingual families in Latvia with Russian as home language. The presentation is based on three case studies, i.e. interviews conducted with Russophones who have chosen to send their children to Latvian-medium pre-schools and schools. The main aim is to understand practices and regards among such families “from below,” i.e. which family-internal and family-external factors influenced the choice of Latvian-medium education and what impact this choice has on linguistic practices.
The paper shows that there have been critical events which both encouraged and discouraged the choice of Latvian-medium education. The wish to integrate into mainstream society has been met by obstacles both from ethnic Russians and Latvians. Yet, the three families consider their choices to be the right ones for the future development of their children in a multiethnic Latvia in which Latvian serves as the unifying language of society.
Communicative deviations of respondents in political video interviews in Ukrainian and German
(2021)
The research has the objective to establish the peculiarities of communicative deviations as a cognitive and at the same time discursive phenomenon in Ukrainian- and German-language video interviews from the viewpoint of respondents. The procedure of the research involves the integrated application of methods and techniques of pragmatics, deviatology and communicative linguistics. A new methodological basis has been developed for the reconstruction of communicative deviations using discourse analysis, namely for the reconstruction of a single event in two discursive environments, determining the communicative context and communication of interview in compared languages. The results of the research allow us to identify the features of communicative deviations in political interviews at the external, internal structural levels and at the situational level. The conclusions of the research indicate that the types of communicative deviations in political video interviews are universal in Ukrainian and German, but reflect national and cultural specifics given the peculiarities of both languages and each linguoculture, as well as existing realias, norms, conventions, maxims and rules of communication.
Polish żeby under negation
(2021)
The paper addresses two patterns in the distribution of complement clauses headed by the complementizer żeby in Polish related to the presence of sentential negation. It is argued that żeby-clauses with an obligatory negation in the matrix clause, licensed by epistemic verbs, can be treated in terms of negative polarity, with żeby defined as an n-word. Structures with żeby-clauses and an obligatory negation in the embedded clause, licensed by verbs of fear, are argued to be an instance of negative complementation, with żeby specified as a negative complementizer. A uniform lexicalist analysis within the framework of HPSG is provided, employing tools developed to account for Negative Concord in Polish.
This article sketches the development of paronym dictionaries in German. These dictionaries document and describe commonly confused words which cause uncertainties because they are similar in sound, spelling and/or meaning (e.g. effektiv/effizient, sportlich/sportiv). First, an overview of existing reference guides is provided, covering different traditions. Numerous lemma lists have been collected for pedagogical purposes and there has always been an interest in the lexicological treatment of paronyms. However, only a handful of dictionaries covering commonly confused pairs and a small number of genuine paronym dictionaries have ever been compiled. I will focus on lexicographic endeavours, including Wustmann (1891), Müller (1973) and Pollmann and Wolk (2001). Secondly, I will shed light on the differences in descriptions in these dictionaries. This includes how prescriptive approaches have been replaced over time by empirical descriptive accounts and how dictionaries have moved away from restricted, static hardback editions towards dynamic e-dictionaries. Finally, an e-dictionary, “Paronyme — Dynamisch im Kontrast”, is presented with contrastive and flexible two-level consultation views. Its three key elements are its corpus-based foundation, the implementation of meta-lexicographic requirements and a consideration of users’ interests. This dictionary has implemented a user-friendly and dynamic interface and it records conventionalized patterns and preferences in authentic communication.
This paper deals with a specific type of lexeme, namely binary preposition-noun combinations containing temporal references like am Ende [at (the) end] or für Sekunden [for seconds]. The main characteristic of these combinations is the recurrent internal zero gap. Despite the fact that the omission of the determiner can often be explained by grammatical rules, the zero gaps indicate a higher degree of lexicalization. Therefore, we interpret these expressions as minimal phraseological units with holistic meanings and functions. The corpusdriven exploration of typical context patterns (e.g. using collocation profiles and the lexpan slot filler analysis) shows that a) even such minimal expressions are based on semi-abstract schemes and b) temporal expressions can also fulfill modal or discursive functions, usually with fuzzy borders and overlapping structures. In the case of modalization or pragmatization one can regard such PNs as distinct lexicon entries.
This paper reports on an ongoing international project of compiling a freely accessible online Dictionary of German Loans in Polish Dialects. The dictionary will be the first comprehensive lexicographic compendium of its kind, serving as a complement to existing resources on German lexical loans in the literary or standard language. The empirical results obtained in the project will shed new light on the distribution of German loanwords among different dialects, also in comparison to the well-documented situation in written Polish. The dictionary will have a strong focus on the dialectal distribution of Polish dialectal variants for a given German etymon, accessible through interactive cartographic representations and corresponding search options. The editorial process is realized with dedicated collaborative web tools. The new resource will be published as an integrated part of an online information system for German lexical borrowings in other languages, the Lehnwortportal Deutsch, and is therefore highly cross-linked with other loanword dictionaries on Polish as well as Slavic and further European languages.
This poster summarizes the results of the CLARIAH-DE Work Package 5 - Community Engagement: Outreach/Dissemination and Liaison.
Work package 5 engages with the community through dissemination activities, outreach and liaison. The work package set itself the following sub goals:
- Combining the existing dissemination and outreach activities of CLARIN-D and DARIAH-DE in a meaningful way and elaborating on them. In some cases this meant continuity, in other cases a new appearance for resources.
- Providing a web portal as a gateway to the CLARIAH-DE project.
- Creating a common identity and corporate identity and maintaining the established level of trust users already put into CLARIN-D and DARIAH-DE.
- Providing a social media presence as well as a physical presence at workshops, conferences and other meetings in the Digital Humanities.
CLARIAH-DE cross-service search - prospects and benefits of merging subject-specific services
(2021)
CLARIAH-DE combines services and offerings of CLARIN-D and DARIAH-DE. This includes various search applications which are made directly available to researchers. These search applications are presented in this working paper based on their main characteristics and compared with a focus on possible harmonizations. Opportunities and risks of different forms of technical integration are highlighted. Identified challenges can be explained in particular considering the background of different organizational and technical frameworks as well as highly specific and discipline-dependent requirements. The integration work that has already been carried out and the experiences gained with regard to future work and possible integration of further applications are also discussed. The experiences made in CLARIAH-DE can especially be of interest for other projects in the field of digital research infrastructures.
In order to differentiate between figurative and literal usage of verb-noun combinations for the shared task on the disambiguation of German Verbal Idioms issued for KONVENS 2021, we apply and extend an approach originally developed for detecting idioms in a dataset consisting of random ngram samples. The classification is done by implementing a rather shallow, statistics-based pipeline without intensive preprocessing and examinations on the morphosyntactic and semantic level. We describe the overall approach, the differences between the original dataset and the dataset of the KONVENS task, provide experimental classification results, and analyse the individual contributions of our feature sets.
This poster summarizes the results of the CLARIAH-DE Work Package 3: Skills Training and Promotion of Junior Researchers.
For a research field that is characterised by rapid technical development, CLARIAH-DE has to include the promotion of data literacy necessary for the efficient use of this digital research infrastructure as part of its objective. To develop, consolidate and refine a common programme in this area, work package 3 set itself the following sub goals:
- Consolidation of the activities from the previous projects into a joint service
- Cataloguing and reflecting on the methods and tools used in the research field, with the aim of identifying remaining gaps
- Skills training of, individual support for and the promotion of junior researchers
We discuss the modal uses of the Hausa exclusive particle sai (≈ only). We argue that the distribution of sai in modal environments provides evidence for the following claims on the composition of modal meaning that have been independently made in the literature: i) Future-oriented modality involves a prospective aspect operator that can be realized covertly in some languages (e.g. English, Kratzer 2012b) and overtly in others (e.g. Gitksan, Matthewson 2012, 2013). ii) Necessity interpretations arise from exhaustifying possibilities, i.e. an exhaustivity operator applying to existential modality (e.g. Kaufmann 2012 for the case of imperatives and Leffel 2012 for a relevant analysis of necessity meaning in Masalit). We show that future-oriented necessity in Hausa decomposes into EXH((PROSP)), with sai contributing exhaustivity.
The prohibitive is typically defined as the negative imperative, i.e. it “implies making someone not do something, having the effect of forbidding, preventing, or restricting” (Aikhenvald, 2017: 3). This chapter focuses on the formation of the prohibitive in the languages of Daghestan and neighboring regions, analyzing two different aspects of the morphological coding: first, the verb form (especially whether it is an imperative form or not), and second, the type of negation marker/affix used. Based on this, the general encoding types are deduced. Additionally, the phonological form of the markers is shortly analyzed.
Verbs may be attributed to higher agency than other grammatical categories. In Study 1, we confirmed this hypothesis with archival datasets comprising verbs (N = 950) and adjectives (N = 2115). We then investigated whether verbs (vs. adjectives) increase message effectiveness. In three experiments presenting potential NGOs (Studies 2 and 3) or corporate campaigns (Study 4) in verb or adjective form, we demonstrate the hypothesized relationship. Across studies, (overall N = 721) grammatical agency consistently increased message effectiveness. Semantic agency varied across contexts by either increasing (Study 2), not affecting (Study 3), or decreasing (Study 4) the effectiveness of the message. Overall, experiments provide insights in to the meta-semantic effects of verbs – demonstrating how grammar may influence communication outcomes.
Repeating the movements associated with activities such as drawing or sports typically leads to improvements in kinematic behavior: these movements become faster, smoother, and exhibit less variation. Likewise, practice has also been shown to lead to faster and smoother movement trajectories in speech articulation. However, little is known about its effect on articulatory variability. To address this, we investigate the extent to which repetition and predictability influence the articulation of the frequent German word “sie” [zi] (they). We find that articulatory variability is proportional to speaking rate and the duration of [zi], and that overall variability decreases as [zi] is repeated during the experiment. Lower variability is also observed as the conditional probability of [zi] increases, and the greatest reduction in variability occurs during the execution of the vocalic target of [i]. These results indicate that practice can produce observable differences in the articulation of even the most common gestures used in speech.
This paper investigates synchronic variation in the lexical and grammatical environments of the German lexical verb verdienen ‘earn’, ‘deserve’. In its lexical uses, verdienen co-occurs with an object noun phrase whose head is either concrete (e.g. Geld ‘money’) or, more commonly, abstract (e.g. Beachtung ‘attention’). When it is used more grammatically with deontic modal meaning, verdienen is followed by a passive or active infinitive. This paper uses collostructional analyses to contrast lexical and grammatical uses in terms of the most strongly attracted lexical items, which are grouped into semantic classes. The results reflect different degrees of host-class expansion (cf. Himmelmann 2004), whereby the collexemes of verdienen expand from concrete to abstract and their morpho-syntactic contexts from nominal to infinitival complement and subsequently from passive to active. Synchronic distribution can thus serve as a window on diachronic development (Kuteva 2001), in this case the rise of a deontic modality marker.
Negation raising and mood. A corpus-based study of Polish sądzić ‘think’ and wierzyć ‘believe’
(2021)
The paper describes the distribution of two negation raising predicates in Polish: sądzić ‛think’ and wierzyć ‛believe’ in the National Corpus of Polish with a particular focus on their morphosyntax and the mood of their clausal complements. The aim was to examine whether there are any correlations between these two parameters, and to what extent negation raising with those verbs exhibits performative features (in terms of Prince, 1976). The results of the study support the performative approach to negation raising as per Prince (1976) only for cases with subjunctive complements. The corpus findings further imply that Polish negation raising predicates encode two different degrees of (un)certainty concerning the truth of the embedded proposition depending on the mood of their complements. Structures with indicative complements express weaker uncertainty than structures with subjunctive complements.
This article examines how the most frequent imperative forms of the verb to show in German (zeig mal) and Czech (ukaž) are deployed in object-centred sequences. Specifically, it focuses on smartphone-based showing activities as these were the main sequential environments of show imperatives in the datasets investigated. In both languages, the imperative form does not merely aim to elicit a responsive action from the smartphone holder (such as making the device available) but projects an individual course of action from the requester’s side in the form of an immediate visual inspection of the digital content. This inspection is carried out as part of a joint course of action, allowing the recipient to provide a more detailed response to a prior action. Therefore, this specific imperative form is proven to be cross-linguistically suited to technology-mediated inspection sequences.
We describe a simple procedure for the automatic creation of word-level alignments between printed documents and their respective full-text versions. The procedure is unsupervised, uses standard, off-the-shelf components only, and reaches an F-score of 85.01 in the basic setup and up to 86.63 when using pre- and post-processing. Potential areas of application are manual database curation (incl. document triage) and biomedical expression OCR.
We are witnessing an emerging digital revolution. For the past 25–30 years, at an increasing pace, digital technologies—especially the internet, mobile phones and smartphones—have transformed the everyday lives of human beings. The pace of change will increase, and new digital technologies will become even more tightly entangled in human everyday lives. Artificial intelligence (AI), the Internet of Things (IoT), 6G wireless solutions, virtual reality (VR), augmented reality (AR), mixed reality (XR), robots and various platforms for remote and hybrid communication will become embedded in our lives at home, work and school.
Digitalisation has been identified as a megatrend, for example, by the OECD (2016; 2019). While digitalisation processes permeate all aspects of life, special attention has been paid to its impact on the ageing population, everyday communication practices, education and learning and working life. For example, it has been argued that digital solutions and technologies have the potential to improve quality of life, speed up processes and increase efficiency. At the same time, digitalisation is likely to bring with it unexpected trends and challenges. For example, AI and robots will doubtlessly speed up or take over many routine-based work tasks from humans, leading to the disappearance of certain occupations and the need for re-education. This, in turn, will lead to an increased demand for skills that are unique to humans and that technologies are not able to master. Thus, developing human competences in the emerging digital era will require not only the mastering of new technical skills, but also the advancement of interpersonal, emotional, literacy and problem-solving skills.
It is important to identify and describe the digitalisation phenomena—pertaining to individuals and societies—and seek human-centric answers and solutions that advance the benefits of and mitigate the possible adverse effects of digitalisation (e.g. inequality, divisions, vulnerability and unemployment). This requires directing the focus on strengthening the human skills and competences that will be needed for a sustainable digital future. Digital technologies should be seen as possibilities, not as necessities.
There is a need to call attention to the co-evolutionary processes between humans and emerging digital technologies—that is, the ways in which humans grow up with and live their lives alongside digital technologies. It is imperative to gain in-depth knowledge about the natural ways in which digital technologies are embedded in human everyday lives—for example, how people learn, interact and communicate in remote and hybrid settings or with artificial intelligence; how new digital technologies could be used to support continuous learning and understand learning processes better and how health and well-being can be promoted with the help of new digital solutions.
Another significant consideration revolves around the co-creation of our digital futures. Important questions to be asked are as follows: Who are the ones to co-create digital solutions for the future? How can humans and human sciences better contribute to digitalisation and define how emerging technologies shape society and the future? Although academic and business actors have recently fostered inclusion and diversity in their co-creation processes, more must be done. The empowerment of ordinary people to start acting as active makers and shapers of our digital futures is required, as is giving voice to those who have traditionally been silenced or marginalised in the development of digital technology. In the emerging co-creation processes, emphasis should be placed on social sustainability and contextual sensitivity. Such processes are always value-laden and political and intimately intertwined with ethical issues.
Constant and accelerating change characterises contemporary human systems, our everyday lives and the environment. Resilience thinking has become one of the major conceptual tools for understanding and dealing with change. It is a multi-scalar idea referring to the capacity of individuals and human systems to absorb disturbances and reorganise their functionality while undergoing a change. Based on the evolving new digital technologies, there is a pressing need to understand how these technologies could be utilised for human well-being, sustainable lifestyles and a better environment. This calls for analysing different scales and types of resilience in order to develop better technology-based solutions for human-centred development in the new digital era.
This white paper is a collaborative effort by researchers from six faculties and groups working on questions related to digitalisation at the University of Oulu, Finland. We have identified questions and challenges related to the emerging digital era and suggest directions that will make possible a human-centric digital future and strengthen the competences of humans and humanity in this era.
In conversation, speakers need to plan and comprehend language in parallel in order to meet the tight timing constraints of turn taking. Given that language comprehension and speech production planning both require cognitive resources and engage overlapping neural circuits, these two tasks may interfere with one another in dialogue situations. Interference effects have been reported on a number of linguistic processing levels, including lexicosemantics. This paper reports a study on semantic processing efficiency during language comprehension in overlap with speech planning, where participants responded verbally to questions containing semantic illusions. Participants rejected a smaller proportion of the illusions when planning their response in overlap with the illusory word than when planning their response after the end of the question. The obtained results indicate that speech planning interferes with language comprehension in dialogue situations, leading to reduced semantic processing of the incoming turn. Potential explanatory processing accounts are discussed.
In this paper, the meaning and processing of the German conditional connectives (CCs) such as wenn ‘if’ and nur wenn ‘only if’ are investigated. In Experiment 1, participants read short scenarios containing a conditional sentence (i.e., If P, Q.) with wenn/nur wenn ‘if/only if’ and a confirmed or negated antecedent (i.e., P/not-P), and subsequently completed the final sentence about Q (with or without negation). In Experiment 2, participants rated the truth or falsity of the consequent Q after reading a conditional sentence with wenn or nur wenn and a confirmed or negated antecedent (i.e., If P, Q. P/not-P. // Therefore, Q?). Both experiments showed that neither wenn nor nur wenn were interpreted as biconditional CCs. Modus Ponens (If P, Q. P. // Therefore, Q) was validated for wenn, whereas it was not validated in the case of nur wenn. While Denial of the Antecedent (If P, Q. not-P. // Therefore, not-Q.) was validated in the case of nur wenn, it was not validated for wenn. The same method was used to test wenn vs. unter der Bedingung, dass ‘on condition that’ in Experiment 3, and wenn vs. vorausgesetzt, dass ‘provided that’ in Experiment 4. Experiment 5, using Affirmation of the Consequent (If P, Q. Q. // Therefore, P.) to test wenn vs. nur wenn replicated the results of Experiment 2. Taken together, the results show that in German, unter der Bedingung, dass is the most likely candidate of biconditional CCs whereas all others are not biconditional. The findings, in particular of nur wenn not being semantically biconditional, are discussed based on available formal analyses of conditionals.
This paper will address the challenge of creating a knowledge graph from a corpus of historical encyclopedias with a special focus on word sense alignment (WSA) and disambiguation (WSD). More precisely, we examine WSA and WSD approaches based on article similarity to link messy historical data, utilizing Wikipedia as aground-truth component – as the lack of a critical overlap in content paired with the amount of variation between and within the encyclopedias does not allow for choosing a ”baseline” encyclopedia to align the others to. Additionally, we are comparing the disambiguation performance of conservative methods like the Lesk algorithm to more recent approaches, i.e. using language models to disambiguate senses.
Obwohl Smartphones und andere mobile Endgeräte mittlerweile ein fester Bestandteil unseres Alltags sind, betonen öffentliche und wissenschaftliche Diskurse immer noch bevorzugt mögliche negative Auswirkungen ihres Gebrauchs auf Gesundheit und Kommunikationsverhalten. Dieser Beitrag skizziert einen anderen Ansatz zur Analyse alltäglichen Technologiegebrauchs, indem er zunächst auf Studien aus der angewandten Linguistik und insbesondere der interaktionalen Forschung eingeht, die sich auf dessen öffentliche Beobachtbarkeit, Mobilität und Ubiquität konzentrieren. Anhand zweier Auszüge aus videoaufgezeichneten Interaktionen wird dann aufgezeigt, wie eine multimodale und sequentielle Analyse dazu beitragen kann, Technologiegebrauch als eine routinemäßige und geordnete soziale Praktik zu verstehen, die nicht mit sozialem, kooperativem Handeln in Widerspruch steht oder dieses gefährdet. Ein detaillierter Blick auf situierten Smartphonegebrauch in informellen und institutionellen Face-to-Face-Settings lenkt die analytische Aufmerksamkeit weg von einer generisch positiven oder negativen Bewertung der Technologie hin zu verschiedenen interaktionalen Phänomenen, die mit ihrer Handhabung und Erkundung in Zusammenhang stehen. Es wird abschließend argumentiert, dass diese Art von mikroanalytischem Ansatz zu einer facettenreichen und objektiveren Perspektive auf die situierte Nutzung mobiler Geräte beitragen kann.
Mock fiction is a genre of humorous, fictional narratives. It is pervasive in adolescents’ peer-group interaction. Building on a corpus of informal peer-group interaction among 14 to 17 year-old German adolescents, it is shown how mock fiction is used to sanction identity-claims of peer-group co-members that are taken to be inadequate by the teller of a mock fiction. Mock fiction exposes and ridicules those claims by fictional exaggeration. Mock fiction is an indirect, yet sometimes even highly abusive means for criticizing and negotiating identities and statuses of peer-group members. The analysis shows how mock fiction is collaboratively produced, how it is used to convey criticism and to negotiate social norms indirectly, and how, in addition, it allows for performative self-positioning of the tellers as skilled, entertaining tellers and socio-psychological diagnosticians.
The term “pivot” usually refers to two overlapping syntactic units such that the completion of the first unit simultaneously launches the second. In addition, pivots are generally said to be characterized by the smooth prosodic integration of their syntactic parts. This prosodic integration is typically achieved by prosodic-phonetic matching of the pivot components. As research on such turns in a range of languages has illustrated, speakers routinely deploy pivots so as to be able to continue past a point of possible turn completion, in the service of implementing some additional or revised action. This article seeks to build on, and complement, earlier research by exploring two issues in more detail as follows: (1) what exactly do pivotal turn extensions accomplish on the action dimension, and (2) what role does prosodic-phonetic packaging play in this? We will show that pivot constructions not only exhibit various degrees of prosodic-phonetic (non-)integration, i.e., differently strong cesuras, but that they can be ordered on a continuum, and that this cline maps onto the relationship of the actions accomplished by the components of the pivot construction. While tighter prosodic-phonetic integration, i.e., weak(er) cesuring, co-occurs with post-pivot actions whose relationship to that of the pre-pivot tends to be rather retrospective in character, looser prosodic-phonetic integration, i.e., strong(er) cesuring, is associated with a more prospective orientation of the post-pivot’s action. These observations also raise more general questions with regard to the analysis of action.
In this paper, we present an overview of freely available web applications providing online access to spoken language corpora. We explore and discuss various solutions with which the corpus providers and corpus platform developers address the needs of researchers who are working with spoken language. The paper aims to contribute to the long-overdue exchange and discussion of methods and best practices in the design of online access to spoken language corpora.
Between January 2020 and summer 2021, many new words and phrases contributed to the expansion of the German vocabulary in order to enable communication under the new conditions during the corona pandemic. This rapid expansion of vocabulary has most notably affected lexicography as a discipline of applied linguistics. General language dictionaries or terminological dictionaries have quickly reflected on how the German lexicon evolved during the corona pandemic: new entries were added, others were revised. This paper, however, focuses on the ways in which a German (specialized) neologism dictionary project, the "Neologismenwörterbuch" at the "Leibniz Institute for the German Language, Mannheim" published (online, see https://www.owid.de/docs/neo/start.jsp) has chosen to capture and document lexicographic information in a timely manner. Neologisms are (following the definition applied here) lexical units or senses/meanings which emerge in a language community over a specific period of time of language development, which diffuse, are generally accepted as language norms, and which the majority of speakers perceive as new for some time. Thus, the "Neologismenwörterbuch" used to record neologisms only retrospectively, that is after their lexicalization. As a consequence, users of the dictionary were often not able to obtain details on words that were particularly conspicuous at a particular time in a specific discourse, thus raising questions concerning their meaning, correct spelling, etc. This, however, did not imply that the lexicographers of the project had not already collected these words with some preliminary information in a list of candidates for inclusion in an internal database. Therefore, the project started to publish online an index of monitored words including lexical units that had emerged since 2011, for which only time will tell whether they will diffuse and manifest as language norms. This list format was used since April 2020 to also issue a compilation of corona-related neologisms as part of the "Neologismenwörterbuch". In October 2021, this inventory included more than 1.800 Corona-related neologisms, and still, more than 700 candidates in an internal database awaited lexicographic description and inclusion into the online index (see https://www.owid.de/docs/neo/listen/corona.jsp). In this paper many examples are presented to illustrate how new words, new senses and new uses in the context of the Covid-19 pandemic are reflected in the dictionary.
This paper investigates situations in French videogame interactions where non-players who share the same physical space as players, participate in the gaming activities as spectators. Through a detailed multimodal and sequential analysis, we show that being a spectator is a local achievement of all co-present participants - players and non-players.
This study analyzes how participants playing VR games construct co-presence and shared gameplay. The analysis focuses on instances of play where one person is wearing the VR equipment, and other participants are located nearby without the ability to directly interact with the game. We first show how the active player using the VR equipment draws on talk and embodied activity to signal their presence in the shared physical environment, while simultaneously conducting actions in the virtual space, and thus creates spaces for the other participants to take part in gameplay. Second, we describe how other participants draw on the contextual configurations of the moment in displaying co-presence and position themselves as active and consequential co-players. The analysis demonstrates how gameplay can be communicatively constructed even in situations where the participants have differential rights and possibilities to act and influence the game.
Schegloff (1996) has argued that grammars are “positionally-sensitive”, implying that the situated use and understanding of linguistic formats depends on their sequential position. Analyzing the German format Kannst du X? (corresponding to English Can you X?) based on 82 instances from a large corpus of talk-in-interaction (FOLK), this paper shows how different action-ascriptions to turns using the same format depend on various orders of context. We show that not only sequential position, but also epistemic status, interactional histories, multimodal conduct, and linguistic devices co-occurring in the same turn are decisive for the action implemented by the format. The range of actions performed with Kannst du X? and their close interpretive interrelationship suggest that they should not be viewed as a fixed inventory of context-dependent interpretations of the format. Rather, the format provides for a root-interpretation that can be adapted to local contextual contingencies, yielding situated action-ascriptions that depend on constraints created by contexts of use.
In so-called Let’s Plays, video gaming is presented and verbally commented by Let’s Players on the internet for an audience. When only watched but not played, the most attractive features of video games, immersion and interactivity, get lost – at least for the internet audience. We assume that the accompanying reactions (transmitted via a so-called facecam) and verbal comments of Let’s Players on their game for an audience contribute to an embodiment of their avatars which makes watching a video game more attractive. Following an ethnomethodological conversation analytical (EMCA) approach, our paper focusses on two practices of embodying avatars. A first practice is that Let’s Players verbally formulate their actions in the game. By that, they make their experiences and the 'actions' of avatars more transparent. Secondly, they produce response cries (Goffman) in reaction to game events. By that, they enhance the liveliness of their avatars. Both practices contribute to a co-construction of a specific kind of (tele-)presence.
Playing videogames is a popular social activity; people play videogames in different places, on different media, in different situations, alone or with partners, online or offline. Unsurprisingly, they thereby share space (physically or virtually) with other playing or non-playing people. The special issue investigates through different contexts and settings how non-players become participants of the gaming interaction and how players and non-players co-construct presence. The introduction provides a problem-related context for the individual contributions and then briefly presents them.
This study explores how ‘gatherings’ turn into ‘encounters’ in a virtual world (VW) context. Most communication technologies enable only focused encounters between distributed participants, but in VWs both gatherings and encounters can occur. We present close sequential analysis of moments when after a silent gathering, interaction among participants in a VW is gradually resumed, and also investigate the social actions in the verbal (re-)opening turns. Our findings show that like in face-to-face situations, also in VWs participants often use different types of embodied resources to achieve the transition, rather than rely on verbal means only. However, the transition process in VWs has distinctive characteristics compared to the one in face-to-face situations. We discuss how participants in a VW use virtually embodied pre-beginnings to display what we call encounter-readiness, instead of displaying lack of presence by avatar stillness. The data comprise 40 episodes of video-recorded team interactions in a VW.
Over the past decade, conducting empirical research in linguistics has become increasingly popular. The first of its kind, this book provides an engaging and practical introduction to this exciting versatile field, providing a comprehensive overview of research aspects in general, and covering a broad range of subdiscipline-specific methodological approaches. Subfields covered include language documentation and descriptive linguistics, language typology, corpus linguistics, sociolinguistics and anthropological linguistics, cognitive linguistics and psycholinguistics, and neurolinguistics. The book reflects on the strengths and weaknesses of each single approach and on how they interact with one-another across the study of language in its many diverse facets. It also includes exercises, example student projects and recommendations for further reading, along with additional online teaching materials. Providing hands-on experience, and written in an engaging and accessible style, this unique and comprehensive guide will give students the inspiration they need to develop their own research projects in empirical linguistics.
The teaching slides accompany the following textbook:
Svenja Völkel & Franziska Kretzschmar (2021): Introducing linguistic research. Cambridge: Cambridge University Press.
The slides follow the structure of the book chapters and can be used for teaching in class. They include the basic information per chapter and exercises to work on in class or as homework. More detailed information, additional exercises, suggestions for research projects and recommendations for further reading can be found in the textbook.
We present zu-excessive structures like Otto ist zu schwer ‘Otto is too heavy’ as instantiations of comparatives that have been reflexivized. Comparatives express asymmetric relations between distinguished referents, but reflexivization identifies argument places (or reduces two argument places to one), leading to a Symmetrie relation. Reflexivization is thus in conflict with the asymmetry property of comparatives and leads to an intermediate semantic representation that is con- tradictory. Two experiments substantiate that zu-excessives share this property with privative adjective and animal-for-statue constructions that similarly give rise to contradictory semantics. The processing of any of the constructions mentioned yields a positivity in the event-related-potential signature characteristic of concep- tual reorganization; however, the observed positivity occurs earlier in the case of zu-excessives than in the other cases. We propose this difference is due to zu signalling the mandatory preparation for an ensuing repair rather than reflecting the repair Operation itself that involves manipulating the Standard of comparison, coded elsewhere in the String (if at all).
This paper describes the TEI-based ISO standard 2462:2016 “Transcription of spoken language” and other formats used within CLARIN for spoken language resources. It assesses the current state of support for the standard and the interoperability between these formats and with relevant tools and services. The main idea behind the paper is that a digital infrastructure providing language resources and services to researchers should also allow the combined use of resources and/or services from different contexts. This requires syntactic and semantic interoperability. We propose a solution based on the ISO/TEI format and describe the necessary steps for this format to work as an exchange format with basic semantic interoperability for spoken language resources across the CLARIN infrastructure and beyond.
Sometimes legal scholars get relevant but baffling questions from laypersons like: “The reference to a work is personal data, so does the GDPR actually require me to anonymise it? Or, as my voice data is personal data, does the GDPR automatically give me access to a speech recognizer using my voice sample? Or, can I say anything about myself without the GDPR requiring the web host to anonymise or remove the post? What can I say about others like politicians? And, what can researchers say about patients in a research report?” Based on these questions, the authors address the interaction of intellectual property and data protection law in the context of data minimisation and attribution rights, access rights, trade secret protection, and freedom of expression.
Twitter data is used in a wide variety of research disciplines in Social Sciences and Humanities. Although most Twitter data is publicly available, its re-use and sharing raise many legal questions related to intellectual property and personal data protection. Moreover, the use of Twitter and its content is subject to the Terms of Service, which also regulate re-use and sharing. This extended abstract provides a brief analysis of these issues and introduces the new Academic Research product track, which enables authorized researchers to access Twitter API on a preferential basis.
In this paper we examine the composition and interactional deployment of suspended assessments in ordinary German conversation. We define suspended assessments as lexicosyntactically incomplete assessing TCUs that share a distinct cluster of prosodic-phonetic features which auditorily makes them come off as 'left hanging' rather than cut-off (e.g., Schegloff/Jefferson/Sacks 1977; Jasperson 2002) or trailing-off (e.g., Local/Kelly 1986; Walker 2012). Using CA/IL methodology (Couper-Kuhlen/Selting 2018) and drawing on a large body of video-recorded face-to-face conversations, we highlight the verbal, vocal and bodily-visual resources participants use to render such unfinished assessing TCUs recognizably incomplete and identify six recurrent usage types. Overall, the suspension of assessing TCUs appears to either serve as a practice for circumventing the production of assessments that are interactionally inapposite, or as a practice for coping with local contingencies that render the very doing of an assessment problematic for the speaker. Data are in German with English translations.
While there is a large amount of research in the field of Lexical Semantic Change Detection, only few approaches go beyond a standard benchmark evaluation of existing models. In this paper, we propose a shift of focus from change detection to change discovery, i.e., discovering novel word senses over time from the full corpus vocabulary. By heavily fine-tuning a type-based and a token-based approach on recently published German data, we demonstrate that both models can successfully be applied to discover new words undergoing meaning change. Furthermore, we provide an almost fully automated framework for both evaluation and discovery.
Privacy in its many aspects is protected by various legal texts (e.g. the Basic Law, Civil Code, Criminal Code, or even the Law on Copyright in artistic and photographic works (KunstUrhG), which protects image rights). Data protection law, which governs the processing of information about individuals (personal data), also serves to protect their privacy. However, some information referring to the public sphere of an individual’s life (e.g. the fact that X is a mayor of Smallville) may still be considered personal data (see below), and as such fall within the scope of data protection rules. In this sense, data protection laws concern information that is not private.
Therefore, privacy and data protection, although closely related, are distinct notions: one can violate someone else’s privacy without processing his or her personal data (e.g. simply by knocking at one’s door at night, uninvited), and vice versa: one can violate data protection rules without violating privacy.
The following handouts focus exclusively on data protection rules, and specifically on the General Data Protection Regulation (GDPR). However, please keep in mind that compliance with the GDPR is not the only aspect of protecting privacy of individuals in research projects. Other rules, such as academic ethics and community standards (such as CARE) also need to be observed.
The automatic recognition of idioms poses a challenging problem for NLP applications. Whereas native speakers can intuitively handle multiword expressions whose compositional meanings are hard to trace back to individual word semantics, there is still ample scope for improvement regarding computational approaches. We assume that idiomatic constructions can be characterized by gradual intensities of semantic non-compositionality, formal fixedness, and unusual usage context, and introduce a number of measures for these characteristics, comprising count-based and predictive collocation measures together with measures of context (un)similarity. We evaluate our approach on a manually labelled gold standard, derived from a corpus of German pop lyrics. To this end, we apply a Random Forest classifier to analyze the individual contribution of features for automatically detecting idioms, and study the trade-off between recall and precision. Finally, we evaluate the classifier on an independent dataset of idioms extracted from a list of Wikipedia idioms, achieving state-of-the art accuracy.
The German e-dictionary documenting confusables Paronyme – Dynamisch im Kontrast contains lexemes which are similar in sound, spelling and/or meaning, e.g. autoritär/autoritativ, innovativ/innovatorisch. These can cause uncertainty as to their appropriate use. The monolingual guide could be easily expanded to become a multilingual platform for commonly confused items by incorporating language modules. The value of this visionary resource is manifold. Firstly, e-dictionaries of confusables have not yet been compiled for most European languages; consequently, the German resource could serve as a model of practice. Secondly, it would be able to explain the usage of false friends. Thirdly, cognates and loan word equivalents would be offered for simultaneous consultation. Fourthly, users could find out whether, for example, a German pair is semantically equivalent to a pair in another language. Finally, it would inform users about cases where a pair of semantically similar words in one language has only one lexical counterpart in another language. This paper is an appeal for visionary projects and collaborative enterprises. I will outline the dictionary’s layout and contents as shown by its contrastive entries. I will demonstrate potential additions, which would make it possible to build up a large platform for easily misused words in different languages.
In this paper we present an experimental semantic search function, based on word embeddings, for an integrated online information system on German lexical borrowings into other languages, the Lehnwortportal Deutsch (LWPD). The LWPD synthesizes an increasing number of lexicographical resources and provides basic cross-resource search options. Onomasiological access to the lexical units of the portal is a highly desirable feature for many research questions, such as the likelihood of borrowing lexical units with a given meaning (Haspelmath & Tadmor, 2009; Zeller, 2015). The search technology is based on multilingual pre-trained word embeddings, and individual word senses in the portal are associated with word vectors. Users may select one or more among a very large number of search terms, and the database returns lexical items with word sense vectors similar to these terms. We give a preliminary assessment of the feasibility, usability and efficacy of our approach, in particular in comparison to search options based on semantic domains or fields.
Who is we? Disambiguating the referents of first person plural pronouns in parliamentary debates
(2021)
This paper investigates the use of first person plural pronouns as a rhetorical device in political speeches. We present an annotation schema for disambiguating pronoun references and use our schema to create an annotated corpus of debates from the German Bundestag. We then use our corpus to learn to automatically resolve pronoun referents in parliamentary debates. We explore the use of data augmentation with weak supervision to further expand our corpus and report preliminary results.
This introduction summarizes general issues combining lexicography and neology in the context of the Globalex Workshop on Lexicography and Neology series. We present each of the six papers composing this Special Issue, featuring two Slavic languages (Czech and Slovak) and two Romance ones (Brazilian Portuguese and Spanish in its European and Latin American varieties) and their diverse lexicographic research and representation, in specialized dictionaries of neologisms or general language ones, in monolingual, bilingual and multilingual lexical resources, and in print and digital dictionaries.
Control, typically defined as a specific referential dependency between the null-subject of a non-finite embedded clause and a co-dependent of the matrix predicate, has been subject to extensive research in the last 50 years. While there is a broad consensus that a distinction between Obligatory Control (OC), Non-Obligatory Control (NOC) and No Control (NC) is useful and necessary to cover the range of relevant empirical phenomena, there is still less agreement regarding their proper analyses. In light of this ongoing discussion, the articles collected in this volume provide a cross-linguistic perspective on central questions in the study of control, with a focus on non-canonical control phenomena. This includes cases which show NOC or NC in complement clauses or OC in adjunct clauses, cases in which the controlled subject is not in an infinitival clause, or in which there is no unique controller in OC (i.e. partial control, split control, or other types of controllers). Based on empirical generalizations from a wide range of languages, this volume provides insights into cross-linguistic variation in the interplay of different components of control such as the properties of the constituent hosting the controlled subject, the syntactic and lexical properties of the matrix predicate as well as restrictions on the controller, thereby furthering our empirical and theoretical understanding of control in grammar.