Validating the Performativity Hypothesis to Neg-Raising using corpus data: Evidence from Polish
(2021)
This report presents a corpus of articulations recorded with Schlieren photography, a recording technique that visualizes aeroflow dynamics. The corpus serves two purposes: first, to investigate aerodynamic processes during speech production without any obstruction of the lips and the nose; second, to provide material that lecturers of phonetics can use to illustrate these aerodynamic processes. Speech production was recorded at a 10 kHz frame rate for statistical video analyses. Downsampled videos (500 Hz) were uploaded to a YouTube channel for illustrative purposes. Preliminary analyses demonstrate the potential of applying Schlieren photography in research.
This paper investigates synchronic variation in the lexical and grammatical environments of the German lexical verb verdienen ‘earn’, ‘deserve’. In its lexical uses, verdienen co-occurs with an object noun phrase whose head is either concrete (e.g. Geld ‘money’) or, more commonly, abstract (e.g. Beachtung ‘attention’). When it is used more grammatically with deontic modal meaning, verdienen is followed by a passive or active infinitive. This paper uses collostructional analyses to contrast lexical and grammatical uses in terms of the most strongly attracted lexical items, which are grouped into semantic classes. The results reflect different degrees of host-class expansion (cf. Himmelmann 2004), whereby the collexemes of verdienen expand from concrete to abstract and their morpho-syntactic contexts from nominal to infinitival complement and subsequently from passive to active. Synchronic distribution can thus serve as a window on diachronic development (Kuteva 2001), in this case the rise of a deontic modality marker.
Higher agency may be attributed to verbs than to other grammatical categories. In Study 1, we confirmed this hypothesis with archival datasets comprising verbs (N = 950) and adjectives (N = 2115). We then investigated whether verbs (vs. adjectives) increase message effectiveness. In three experiments presenting potential NGOs (Studies 2 and 3) or corporate campaigns (Study 4) in verb or adjective form, we demonstrate the hypothesized relationship. Across studies (overall N = 721), grammatical agency consistently increased message effectiveness. Semantic agency varied across contexts by either increasing (Study 2), not affecting (Study 3), or decreasing (Study 4) the effectiveness of the message. Overall, the experiments provide insights into the meta-semantic effects of verbs, demonstrating how grammar may influence communication outcomes.
The prohibitive is typically defined as the negative imperative, i.e. it “implies making someone not do something, having the effect of forbidding, preventing, or restricting” (Aikhenvald, 2017: 3). This chapter focuses on the formation of the prohibitive in the languages of Daghestan and neighboring regions, analyzing two different aspects of the morphological coding: first, the verb form (especially whether it is an imperative form or not), and second, the type of negation marker/affix used. Based on this, the general encoding types are deduced. Additionally, the phonological form of the markers is briefly analyzed.
This article examines how the most frequent imperative forms of the verb to show in German (zeig mal) and Czech (ukaž) are deployed in object-centred sequences. Specifically, it focuses on smartphone-based showing activities as these were the main sequential environments of show imperatives in the datasets investigated. In both languages, the imperative form does not merely aim to elicit a responsive action from the smartphone holder (such as making the device available) but projects an individual course of action from the requester’s side in the form of an immediate visual inspection of the digital content. This inspection is carried out as part of a joint course of action, allowing the recipient to provide a more detailed response to a prior action. Therefore, this specific imperative form is proven to be cross-linguistically suited to technology-mediated inspection sequences.
Repeating the movements associated with activities such as drawing or sports typically leads to improvements in kinematic behavior: these movements become faster, smoother, and exhibit less variation. Likewise, practice has also been shown to lead to faster and smoother movement trajectories in speech articulation. However, little is known about its effect on articulatory variability. To address this, we investigate the extent to which repetition and predictability influence the articulation of the frequent German word “sie” [zi] (they). We find that articulatory variability is proportional to speaking rate and the duration of [zi], and that overall variability decreases as [zi] is repeated during the experiment. Lower variability is also observed as the conditional probability of [zi] increases, and the greatest reduction in variability occurs during the execution of the vocalic target of [i]. These results indicate that practice can produce observable differences in the articulation of even the most common gestures used in speech.
Mock fiction is a genre of humorous, fictional narratives. It is pervasive in adolescents’ peer-group interaction. Building on a corpus of informal peer-group interaction among 14- to 17-year-old German adolescents, it is shown how mock fiction is used to sanction identity-claims of peer-group co-members that are taken to be inadequate by the teller of a mock fiction. Mock fiction exposes and ridicules those claims by fictional exaggeration. Mock fiction is an indirect, yet sometimes even highly abusive means for criticizing and negotiating identities and statuses of peer-group members. The analysis shows how mock fiction is collaboratively produced, how it is used to convey criticism and to negotiate social norms indirectly, and how, in addition, it allows for performative self-positioning of the tellers as skilled, entertaining tellers and socio-psychological diagnosticians.
This study investigates how driving school instructors adapt their instructions to constraints and affordances of different activity types. Adopting a Conversation Analytic approach and building on a comparative corpus of theoretical and practical driving lessons in German, it compares sequences of instructions of the execution of the “shoulder check” (i.e., checking the blind spot) in stationary theoretical versus mobile practical driving lessons. In theoretical lessons, the instructor uses vivid and humorous embodied instructions. In practical driving lessons, the instructor orients to the complex multi‐activity and delivers instructions in a succinct manner, considering the students’ previous knowledge and the embeddedness into the global tasks. The paper shows how instructional practices are sensitive to contextual contingencies which they reflect and treat by their situated design.
Control, typically defined as a specific referential dependency between the null-subject of a non-finite embedded clause and a co-dependent of the matrix predicate, has been subject to extensive research in the last 50 years. While there is a broad consensus that a distinction between Obligatory Control (OC), Non-Obligatory Control (NOC) and No Control (NC) is useful and necessary to cover the range of relevant empirical phenomena, there is still less agreement regarding their proper analyses. In light of this ongoing discussion, the articles collected in this volume provide a cross-linguistic perspective on central questions in the study of control, with a focus on non-canonical control phenomena. This includes cases which show NOC or NC in complement clauses or OC in adjunct clauses, cases in which the controlled subject is not in an infinitival clause, or in which there is no unique controller in OC (i.e. partial control, split control, or other types of controllers). Based on empirical generalizations from a wide range of languages, this volume provides insights into cross-linguistic variation in the interplay of different components of control such as the properties of the constituent hosting the controlled subject, the syntactic and lexical properties of the matrix predicate as well as restrictions on the controller, thereby furthering our empirical and theoretical understanding of control in grammar.
This study examines head nods produced as embodied and silent answers to polar questions before a transition relevance place has been reached. It discusses the notion of “response” and the ways in which the literature conceptualizes head nods. The analysis of video recordings of ordinary and institutional multiparty interactions shows that answer-nods rely on mutual gaze and that affirmative head nods may co-occur with other facial expressions (e.g., eye blinks). By replying with a silent head nod, respondents may complete an unfolding adjacency pair without claiming speakership, thereby enabling the questioner to extend their turn-in-progress. Alternatively, respondents may expand their answer-nod with talk, in which case silent nodding may contribute to organizing the smooth transition of turns-at-talk. Head nods produced while a question is unfolding are described as a microsequential phenomenon that may affect the questioner’s turn-in-progress. Data are in French and Italian.
This article explores the relation between word order and response latency, focusing on responses to question-word questions. Qualitative (multimodal) and quantitative analyses of naturally occurring conversations in French—where question-words can occur in initial, medial, or final position within the question—show that variation in word order affects the timing of responses. It is argued that this is so because word order provides a differential basis for action ascription, creating different temporal opportunities for projecting the recipient’s next relevant action. The frequent occurrence of early responses to questions with an initial question-word, in particular, stresses the importance of the recognition point of an action under way for response timing and shows respondents’ pervasive orientation to sequential progressivity. Findings highlight how lexico-syntactic trajectories of emergent turns, prior talk and actions, material and bodily features of interaction, and participants’ shared expectations conspire in shaping the time-courses of action ascription and action projection.
In this article, we provide longitudinal evidence for the progressive routinization of a grammatical construction used for social coordination purposes in a highly specialized activity context: task-oriented video-mediated interactions. We focus on the methodic ways in which, over the course of 4 years, a second language speaker who is initially a novice to such interactions coordinates the transition between interacting with her coparticipants and consulting her own screen, which suspends talk, without creating trouble due to halts in progressivity. Initially drawing on diverse resources, she increasingly resorts to the use of a prospective alert constructed around the verb to check (e.g., “I will check”), which eventually routinizes in the lexically specific form “let me check” as a highly context- and activity-bound social action format. We discuss how such change over the participant’s video-mediated interactional history contributes to our understanding of social coordination in video-mediated interaction and of participants’ recalibrating their grammar-for-interaction while adapting to new situations, languages, or media. Data are in English.
This study documents change over time and across proficiency levels in French second-language (L2) speakers’ practices for initiating complaints. Prior research has shown that speakers typically initiate complaints in a stepwise manner that indexes the contingent, moral, and delicate nature of the activity. Although elementary speakers in my data often launch complaint sequences in a straightforward way, they sometimes embodiedly foreshadow verbal expressions of negative stance or delay negative talk through brief positively valenced prefaces. More advanced speakers in part rely on the same initiation practices as elementary speakers. In addition, they recurrently use extensive prefatory work that accounts for and legitimizes the upcoming complaint, and they regularly initiate complaints jointly with coparticipants through a progressive escalation of negative stance expressions. I document interactional resources involved in this change and discuss the findings in terms of speakers’ development of L2 interactional competence. Data are in French with English translations.
The human ability to anticipate upcoming behavior not only enables smooth turn transitions but also makes early responses possible, as respondents use a variety of cues that provide for early projection of the type of action that is being performed. This article examines resources for projection in interaction in three unrelated languages—Finnish, Japanese, and Mandarin—in sequences where speakers make evaluative assertions on a topic. The focus is on independently agreeing responses initiated in early overlap. Our cross-linguistic analysis reveals that while projection based on the ongoing turn-constructional unit relies on language-specific grammatical constructions, projection based on the larger context seems to be less language-dependent. A crucial finding is that in the target sequences, stances taken toward the topic already during earlier talk, as well as other structural patterns, are among the resources that recipients use for projecting how and when the ongoing turn will end.
Focusing on request sequences, this article explores dynamics of projection and anticipation, enabling participants to produce early responses to requests. In particular, the analyses highlight the importance of multimodal formatting and the specific temporalities of multiple multimodal resources for the emergence of projections and the possibility to anticipate an ongoing action. Moreover, the analyses pinpoint the relevance of the local ecology and the praxeological context for the participants, enabling them to anticipate the next relevant action. These features characterizing the temporality of multimodal Gestalts, the relevance of the local ecology, and the details of the praxeological context make it possible for participants to produce very early responses and also to accomplish an action even before it has been actually requested.
There has been a long-standing interest in projection and the resources on which participants rely to produce and recognize the import and organization of turns at talk. Less attention has been paid to the character of the activity of which utterances form part and the ways in which embodied action enables the intelligibility, coordination, and in some cases, coproduction, of particular actions. In this article, we examine specialized forms of embodied, institutional activity and focus in particular on simultaneity and the ways in which bodily action enables the progressive formation and reformation of an activity in the light of the (co)participants’ emerging contributions. We address how the routine structure of particular tasks enables participants to anticipate, prepare for, and even initiate actions in advance of the relevant activity and in turn, how participants may seek to ameliorate the interactional import of potentially premature action. The article explores the interplay of technical practice and interactional organization and points to the distinctive character of embodied action in understanding anticipation and coordination in complex forms of institutional interaction.
This paper reports on the efforts of twelve national teams in building the International Comparable Corpus (ICC; https://korpus.cz/icc) that will contain highly comparable datasets of spoken, written and electronic registers. The languages currently covered are Czech, Finnish, French, German, Irish, Italian, Norwegian, Polish, Slovak, Swedish and, more recently, Chinese, as well as English, which is considered to be the pivot language. The goal of the project is to provide much-needed data for contrastive corpus-based linguistics. The ICC corpus is committed to the idea of re-using existing multilingual resources as much as possible and the design is modelled, with various adjustments, on the International Corpus of English (ICE). As such, ICC will contain approximately the same balance of 40 percent of written language and 60 percent of spoken language distributed across 27 different text types and contexts. A number of issues encountered by the project teams are discussed, ranging from copyright and data sustainability to technical advances in data distribution.
This paper presents the QUEST project and describes concepts and tools that are being developed within its framework. The goal of the project is to establish quality criteria and curation criteria for annotated audiovisual language data. Building on existing resources developed earlier by the participating institutions, QUEST also develops tools that could be used to facilitate and verify adherence to these criteria. An important focus of the project is making these tools accessible for researchers without substantial technical background and helping them produce high-quality data. The main tools we intend to provide are a questionnaire and automatic quality assurance for depositors of language resources, both developed as web applications. They are accompanied by a knowledge base, which will contain recommendations and descriptions of best practices established in the course of the project. Conceptually, we consider three main data maturity levels in order to decide on a suitable level of strictness of the quality assurance. This division has been introduced so that a set of ideal quality criteria does not prevent researchers from depositing or even assessing their (legacy) data. The tools described in the paper are work in progress and are expected to be released by the end of the QUEST project in 2022.
The article focuses on determining responsible parties and the division of potential liability arising from sharing language data (LD) containing personal data (PD). A key issue here is to identify who has to ensure and guarantee GDPR compliance. The authors aim to answer 1) whether an individual researcher is a controller and 2) whether sharing LD results in joint controllership or separate controllership (whether the data's transferee becomes the controller, the joint controller or the processor). The article also analyses the legal relations of parties involved in data sharing and potential liability. The final section outlines data sharing in the CLARIN context. The analysis serves as a preliminary analytical background for redesigning the CLARIN contractual framework for sharing data.
Towards comprehensive definitions of data quality for audiovisual annotated language resources
(2021)
Though digital infrastructures such as CLARIN have been successfully established and now provide large collections of digital resources, the lack of widely accepted standards for data quality and documentation still makes re-use of research data a difficult endeavour, especially for more complex resource types. The article gives a detailed overview of relevant characteristics of audiovisual annotated language resources and reviews possible approaches to data quality in terms of their suitability for the current context. In conclusion, various strategies are suggested in order to arrive at comprehensive and adequate definitions of data quality for this specific resource type and possibly for digital language resources in general.
N-grams are of utmost importance for modern linguistics and language technology. The legal status of n-grams, however, raises many practical questions. Traditionally, text snippets are considered copyrightable if they meet the originality criterion, but no clear indicators as to the minimum length of original snippets exist; moreover, the solutions adopted in some EU Member States (the paper cites German and French law as examples) are considerably different. Furthermore, recent developments in EU law (the CJEU's Pelham decision and the new right of press publishers) also provide interesting arguments in this debate. The paper presents the existing approaches to the legal protection of n-grams and tries to formulate some clear guidelines as to the length of n-grams that can be freely used and shared.
Signposts for CLARIN
(2021)
An implementation of CMDI-based signposts and its use is presented in this paper. Arnold, Fisseni et al. (2020) present signposts as a solution to challenges in long-term preservation of corpora. Though applicable to digital resources in general, we focus on corpora, especially those that are continuously extended or subject to modification, e.g., due to legal injunctions, but also may overlap with respect to constituents, and may be subject to migrations to new data formats. We describe the contribution signposts can make to the CLARIN infrastructure, notably virtual collections, and document the design for the CMDI profile.
CMDI Explorer
(2021)
We present CMDI Explorer, a tool that empowers users to easily explore the contents of complex CMDI records and to process selected parts of them with little effort. The tool allows users, for instance, to analyse virtual collections represented by CMDI records, and to send collection items to other CLARIN services such as the Switchboard for subsequent processing. CMDI Explorer hence adds functionality that many users felt was lacking from the CLARIN tool space.
How do people’s interactional practices change over time? Can conversation analysis identify those changes, and if so, how? In this introductory article, we scrutinize the novel insights that can be gained from examining interactional practices over time and discuss the related methodological challenges for longitudinal CA. We first retrace CA’s interest in the temporality of social interaction and then review three lines of current CA work on change over time: developmental studies, studies of sociohistorical change, and studies of joint interactional histories. Existing work shows how the execution of locally coordinated actions and their meanings change over time; how prior actions inform future actions; and how resources, practices, and structures of joint action emerge over people’s repeated interactional encounters. We conclude by arguing that the empirical analysis of the microlevel organization of social interaction, which is the hallmark of CA, can elucidate the fine-grained situated interactional infrastructure that provides for the larger-scale social dynamics that have been of interest to other lines of research.
Taking the use of the esthetic term wabi sabi (Japanese compound noun) in a series of German- and English-language theater rehearsals as an example, this article studies the emergence of shared meanings and uses of an expression over an interactional history. We track how shared understandings and uses of wabi sabi develop over the course of a series of theater rehearsals. We focus on the practices by which understandings of wabi sabi are displayed, adopted, and negotiated. We discuss complexities and intransparencies of the manifestation of common ground in multiparty interactions and its relationship to the emergence of routine uses of the expression. Data are in English and German with English translation.
Information theory can be used to assess how efficiently a message is transmitted on the basis of different symbolic systems. In this paper, I estimate the information-theoretic efficiency of written language for parallel text data in more than 1000 different languages, both on the level of characters and on the level of words as information encoding units. The main results show that (i) the median efficiency is ∼29% on the character level and ∼45% on the word level, (ii) efficiency on the two levels is strongly correlated and (iii) efficiency tends to be higher for languages with more speakers.
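The abstract does not state how efficiency is estimated. As a rough illustration only, the sketch below uses one textbook definition: the empirical (unigram) entropy of a symbol sequence divided by the maximum entropy possible for its inventory. The function names and the toy inputs are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def entropy_bits(symbols):
    """Empirical Shannon entropy (bits per symbol) of a sequence."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def efficiency(symbols):
    """Entropy relative to the maximum entropy for the observed inventory.

    Assumed definition for illustration: H / log2(K), where K is the number
    of distinct symbols. Returns 0.0 when fewer than two distinct symbols
    occur, because the maximum entropy is then zero.
    """
    inventory_size = len(set(symbols))
    if inventory_size < 2:
        return 0.0
    return entropy_bits(symbols) / math.log2(inventory_size)


# Character level: pass a string; word level: pass a list of tokens.
char_level = efficiency("aaab")
word_level = efficiency("the cat saw the dog".split())
```

The same function covers both encoding units mentioned in the abstract, because it only needs a sequence of hashable symbols.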
In this paper we present an experimental semantic search function, based on word embeddings, for an integrated online information system on German lexical borrowings into other languages, the Lehnwortportal Deutsch (LWPD). The LWPD synthesizes an increasing number of lexicographical resources and provides basic cross-resource search options. Onomasiological access to the lexical units of the portal is a highly desirable feature for many research questions, such as the likelihood of borrowing lexical units with a given meaning (Haspelmath & Tadmor, 2009; Zeller, 2015). The search technology is based on multilingual pre-trained word embeddings, and individual word senses in the portal are associated with word vectors. Users may select one or more among a very large number of search terms, and the database returns lexical items with word sense vectors similar to these terms. We give a preliminary assessment of the feasibility, usability and efficacy of our approach, in particular in comparison to search options based on semantic domains or fields.
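The retrieval step described above, returning lexical items whose word sense vectors are similar to the vectors of the search terms, can be sketched with cosine similarity. This is a minimal illustration and not the LWPD implementation: the sense identifiers, the three-dimensional toy vectors, and the function names are hypothetical, and real pre-trained embeddings have far more dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_senses(query_vector, sense_vectors, top_k=3):
    """Return the top_k (sense_id, score) pairs, most similar first."""
    scored = [
        (sense_id, cosine_similarity(query_vector, vector))
        for sense_id, vector in sense_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy data standing in for word sense vectors in a portal.
sense_vectors = {
    "sense_a": [1.0, 0.0, 0.0],
    "sense_b": [0.9, 0.1, 0.0],
    "sense_c": [0.0, 1.0, 0.0],
}
results = rank_senses([1.0, 0.0, 0.0], sense_vectors, top_k=2)
```

When a user selects several search terms, one simple option (again an assumption) is to average their vectors and pass the mean as `query_vector`.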
Digital humanities research under United States and European copyright laws. Evolving frameworks
(2021)
This chapter summarizes the current state of copyright laws in the United States and European Union that most affect Digital Humanities research, namely the fair use doctrine in the US and research exceptions in Europe, including the Directive on Copyright in the Digital Single Market, which was finally adopted in 2019. This summary begins with a description of recent copyright advances most relevant to DH research, and finishes with an analysis of a significant remaining legal hurdle which DH researchers face: how do fair use and research exceptions deal with the critical issue of circumventing technological protection measures (TPM, a.k.a. DRM). Our discussion of the lawful means of obtaining TPM-protected material may contribute to both current DH research and planning decisions and inform future stakeholders and lawmakers of the need to allow TPM circumvention for academic research.
The General Data Protection Regulation (GDPR) on personal data protection in the European Union entered into application on 25 May 2018. With its 173 recitals and 99 articles, it may be one of the most ambitious pieces of EU legislation to date. Rather than a guide to GDPR compliance for Digital Humanities researchers, this chapter looks at the use of personal data in DH projects from the data subject’s perspective, and examines to what extent the GDPR kept its promise of enabling the data subject to “take control of his data”. The chapter provides an overview of the right to privacy and the right to data protection, a discussion of the relation between the concept of data control and privacy and data protection law, an introduction to the GDPR, and an explanation of its relevance for scientific research in general and DH in particular. The main section of the chapter analyses two types of data control mechanisms (consent and data subject rights) and their impact on DH research.
This paper will address the challenge of creating a knowledge graph from a corpus of historical encyclopedias with a special focus on word sense alignment (WSA) and disambiguation (WSD). More precisely, we examine WSA and WSD approaches based on article similarity to link messy historical data, utilizing Wikipedia as a ground-truth component, as the lack of a critical overlap in content paired with the amount of variation between and within the encyclopedias does not allow for choosing a “baseline” encyclopedia to align the others to. Additionally, we compare the disambiguation performance of conservative methods like the Lesk algorithm to more recent approaches, i.e. using language models to disambiguate senses.
Negation raising and mood. A corpus-based study of Polish sądzić ‘think’ and wierzyć ‘believe’
(2021)
The paper describes the distribution of two negation raising predicates in Polish, sądzić ‘think’ and wierzyć ‘believe’, in the National Corpus of Polish with a particular focus on their morphosyntax and the mood of their clausal complements. The aim was to examine whether there are any correlations between these two parameters, and to what extent negation raising with those verbs exhibits performative features (in terms of Prince, 1976). The results of the study support the performative approach to negation raising as per Prince (1976) only for cases with subjunctive complements. The corpus findings further imply that Polish negation raising predicates encode two different degrees of (un)certainty concerning the truth of the embedded proposition depending on the mood of their complements. Structures with indicative complements express weaker uncertainty than structures with subjunctive complements.
We describe a simple procedure for the automatic creation of word-level alignments between printed documents and their respective full-text versions. The procedure is unsupervised, uses standard, off-the-shelf components only, and reaches an F-score of 85.01 in the basic setup and up to 86.63 when using pre- and post-processing. Potential areas of application are manual database curation (incl. document triage) and biomedical expression OCR.
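For readers unfamiliar with the reported metric, the sketch below shows how a balanced F-score is commonly computed for word-level alignments, treating each alignment as a (printed-token index, full-text-token index) pair. The abstract does not specify the evaluation protocol, so the pair representation, the function name, and the toy data are assumptions; the result is on a 0 to 1 scale, whereas the abstract reports values on a 0 to 100 scale.

```python
def alignment_f_score(predicted, gold):
    """Balanced F-score (F1) between predicted and gold alignment pairs."""
    predicted, gold = set(predicted), set(gold)
    if not predicted or not gold:
        return 0.0
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy alignments: (index in OCR output, index in full text).
gold_pairs = {(0, 0), (1, 1), (2, 2), (3, 3)}
predicted_pairs = {(0, 0), (1, 1), (2, 3)}
score = alignment_f_score(predicted_pairs, gold_pairs)
```

In this toy case two of three predicted pairs are correct (precision 2/3) and two of four gold pairs are recovered (recall 1/2).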
We are witnessing an emerging digital revolution. For the past 25–30 years, at an increasing pace, digital technologies—especially the internet, mobile phones and smartphones—have transformed the everyday lives of human beings. The pace of change will increase, and new digital technologies will become even more tightly entangled in human everyday lives. Artificial intelligence (AI), the Internet of Things (IoT), 6G wireless solutions, virtual reality (VR), augmented reality (AR), mixed reality (XR), robots and various platforms for remote and hybrid communication will become embedded in our lives at home, work and school.
Digitalisation has been identified as a megatrend, for example, by the OECD (2016; 2019). While digitalisation processes permeate all aspects of life, special attention has been paid to its impact on the ageing population, everyday communication practices, education and learning and working life. For example, it has been argued that digital solutions and technologies have the potential to improve quality of life, speed up processes and increase efficiency. At the same time, digitalisation is likely to bring with it unexpected trends and challenges. For example, AI and robots will doubtlessly speed up or take over many routine-based work tasks from humans, leading to the disappearance of certain occupations and the need for re-education. This, in turn, will lead to an increased demand for skills that are unique to humans and that technologies are not able to master. Thus, developing human competences in the emerging digital era will require not only the mastering of new technical skills, but also the advancement of interpersonal, emotional, literacy and problem-solving skills.
It is important to identify and describe the digitalisation phenomena—pertaining to individuals and societies—and seek human-centric answers and solutions that advance the benefits of and mitigate the possible adverse effects of digitalisation (e.g. inequality, divisions, vulnerability and unemployment). This requires directing the focus on strengthening the human skills and competences that will be needed for a sustainable digital future. Digital technologies should be seen as possibilities, not as necessities.
There is a need to call attention to the co-evolutionary processes between humans and emerging digital technologies—that is, the ways in which humans grow up with and live their lives alongside digital technologies. It is imperative to gain in-depth knowledge about the natural ways in which digital technologies are embedded in human everyday lives—for example, how people learn, interact and communicate in remote and hybrid settings or with artificial intelligence; how new digital technologies could be used to support continuous learning and understand learning processes better and how health and well-being can be promoted with the help of new digital solutions.
Another significant consideration revolves around the co-creation of our digital futures. Important questions to be asked are as follows: Who are the ones to co-create digital solutions for the future? How can humans and human sciences better contribute to digitalisation and define how emerging technologies shape society and the future? Although academic and business actors have recently fostered inclusion and diversity in their co-creation processes, more must be done. The empowerment of ordinary people to start acting as active makers and shapers of our digital futures is required, as is giving voice to those who have traditionally been silenced or marginalised in the development of digital technology. In the emerging co-creation processes, emphasis should be placed on social sustainability and contextual sensitivity. Such processes are always value-laden and political and intimately intertwined with ethical issues.
Constant and accelerating change characterises contemporary human systems, our everyday lives and the environment. Resilience thinking has become one of the major conceptual tools for understanding and dealing with change. It is a multi-scalar idea referring to the capacity of individuals and human systems to absorb disturbances and reorganise their functionality while undergoing a change. Based on the evolving new digital technologies, there is a pressing need to understand how these technologies could be utilised for human well-being, sustainable lifestyles and a better environment. This calls for analysing different scales and types of resilience in order to develop better technology-based solutions for human-centred development in the new digital era.
This white paper is a collaborative effort by researchers from six faculties and groups working on questions related to digitalisation at the University of Oulu, Finland. We have identified questions and challenges related to the emerging digital era and suggest directions that will make possible a human-centric digital future and strengthen the competences of humans and humanity in this era.
In this paper, the meaning and processing of German conditional connectives (CCs) such as wenn ‘if’ and nur wenn ‘only if’ are investigated. In Experiment 1, participants read short scenarios containing a conditional sentence (i.e., If P, Q.) with wenn/nur wenn ‘if/only if’ and a confirmed or negated antecedent (i.e., P/not-P), and subsequently completed the final sentence about Q (with or without negation). In Experiment 2, participants rated the truth or falsity of the consequent Q after reading a conditional sentence with wenn or nur wenn and a confirmed or negated antecedent (i.e., If P, Q. P/not-P. // Therefore, Q?). Both experiments showed that neither wenn nor nur wenn was interpreted as a biconditional CC. Modus Ponens (If P, Q. P. // Therefore, Q) was validated for wenn, whereas it was not validated in the case of nur wenn. While Denial of the Antecedent (If P, Q. not-P. // Therefore, not-Q.) was validated in the case of nur wenn, it was not validated for wenn. The same method was used to test wenn vs. unter der Bedingung, dass ‘on condition that’ in Experiment 3, and wenn vs. vorausgesetzt, dass ‘provided that’ in Experiment 4. Experiment 5, which used Affirmation of the Consequent (If P, Q. Q. // Therefore, P.) to test wenn vs. nur wenn, replicated the results of Experiment 2. Taken together, the results show that in German, unter der Bedingung, dass is the most likely candidate for a biconditional CC, whereas none of the others is biconditional. The findings, in particular that nur wenn is not semantically biconditional, are discussed based on available formal analyses of conditionals.
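The three inference patterns named in this abstract can be checked against truth tables. The sketch below is only an illustration under textbook propositional readings, which are assumptions and not the paper's formal analysis: "if P, Q" as P → Q, "only if P, Q" as Q → P, and a biconditional as P ↔ Q. All function names are hypothetical.

```python
from itertools import product


def is_valid(premises, conclusion):
    """Valid means: the conclusion is true in every (P, Q) assignment that satisfies all premises."""
    return all(
        conclusion(p, q)
        for p, q in product([True, False], repeat=2)
        if all(premise(p, q) for premise in premises)
    )


# Assumed textbook readings of the connectives.
def conditional(p, q):      # "if P, Q"
    return (not p) or q


def only_if(p, q):          # "only if P, Q"
    return (not q) or p


def biconditional(p, q):    # "P if and only if Q"
    return p == q


def inference_patterns(connective):
    """Check Modus Ponens (MP), Denial of the Antecedent (DA) and Affirmation of the Consequent (AC)."""
    return {
        "MP": is_valid([connective, lambda p, q: p], lambda p, q: q),
        "DA": is_valid([connective, lambda p, q: not p], lambda p, q: not q),
        "AC": is_valid([connective, lambda p, q: q], lambda p, q: p),
    }


results = {
    "if": inference_patterns(conditional),
    "only if": inference_patterns(only_if),
    "iff": inference_patterns(biconditional),
}
```

Under these assumed readings, only the biconditional licenses all three patterns, which is why endorsing MP together with DA or AC is treated as the signature of a biconditional interpretation.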
In conversation, speakers need to plan and comprehend language in parallel in order to meet the tight timing constraints of turn taking. Given that language comprehension and speech production planning both require cognitive resources and engage overlapping neural circuits, these two tasks may interfere with one another in dialogue situations. Interference effects have been reported on a number of linguistic processing levels, including lexicosemantics. This paper reports a study on semantic processing efficiency during language comprehension in overlap with speech planning, where participants responded verbally to questions containing semantic illusions. Participants rejected a smaller proportion of the illusions when planning their response in overlap with the illusory word than when planning their response after the end of the question. The obtained results indicate that speech planning interferes with language comprehension in dialogue situations, leading to reduced semantic processing of the incoming turn. Potential explanatory processing accounts are discussed.
Playing videogames is a popular social activity; people play videogames in different places, on different media, in different situations, alone or with partners, online or offline. Unsurprisingly, they thereby share space (physically or virtually) with other playing or non-playing people. The special issue investigates through different contexts and settings how non-players become participants of the gaming interaction and how players and non-players co-construct presence. The introduction provides a problem-related context for the individual contributions and then briefly presents them.
The term “pivot” usually refers to two overlapping syntactic units such that the completion of the first unit simultaneously launches the second. In addition, pivots are generally said to be characterized by the smooth prosodic integration of their syntactic parts. This prosodic integration is typically achieved by prosodic-phonetic matching of the pivot components. As research on such turns in a range of languages has illustrated, speakers routinely deploy pivots so as to be able to continue past a point of possible turn completion, in the service of implementing some additional or revised action. This article seeks to build on, and complement, earlier research by exploring two issues in more detail: (1) what exactly do pivotal turn extensions accomplish on the action dimension, and (2) what role does prosodic-phonetic packaging play in this? We will show that pivot constructions not only exhibit various degrees of prosodic-phonetic (non-)integration, i.e., cesuras of differing strength, but that they can be ordered on a continuum, and that this cline maps onto the relationship of the actions accomplished by the components of the pivot construction. While tighter prosodic-phonetic integration, i.e., weak(er) cesuring, co-occurs with post-pivot actions whose relationship to that of the pre-pivot tends to be rather retrospective in character, looser prosodic-phonetic integration, i.e., strong(er) cesuring, is associated with a more prospective orientation of the post-pivot’s action. These observations also raise more general questions with regard to the analysis of action.
OKAY has been termed ‘a spectacular expression’ and ‘America’s greatest invention.’ This volume offers an in-depth empirical study of the uses that have resulted from its global spread. Focusing on actions and interactional practices, it investigates OKAY in a variety of settings in 13 languages. The collected work showcases the importance of a holistic analysis: prosodic realization and the placement of OKAY in its larger sequential and multimodal context emerge as constitutive for distinct uses in individual languages. An inductive approach makes it possible to identify practices not previously documented, for example OKAY used for ‘qualified acceptance’ or as a ‘continuer’, and to document a core of recurrent, similar uses across languages. This work also outlines new research directions for comparative analysis by offering first insights into the diachronic development of OKAY’s uses and the relationship of OKAY to other particles in specific languages.
We propose to use abusive emojis, such as the “middle finger” or “face vomiting”, as a proxy for learning a lexicon of abusive words. Since it represents extralinguistic information, a single emoji can co-occur with different forms of explicitly abusive utterances. We show that our approach generates a lexicon that offers the same performance in cross-domain classification of abusive microposts as the most advanced lexicon induction method. That method, in contrast, depends on manually annotated seed words and expensive lexical resources for bootstrapping (e.g. WordNet). We demonstrate that the same emojis can also be effectively used in languages other than English. Finally, we also show that emojis can be exploited for classifying mentions of ambiguous words, such as “fuck” and “bitch”, into generally abusive and just profane usages.
Digital research infrastructures can be divided into four categories: large equipment, IT infrastructure, social infrastructure, and information infrastructure. Modern research institutions often employ both IT infrastructure and information infrastructure, such as databases or large-scale research data. In addition, information infrastructure depends to some extent on IT infrastructure. In this paper, we discuss the IT, information, and legal infrastructure issues that research institutions face.
Research on multimodal interaction has shown that simultaneity of embodied behavior and talk is constitutive for social action. In this study, we demonstrate different temporal relationships between verbal and embodied actions. We focus on uses of German darf/kann ich? (“may/can I?”) in which speakers initiate, or even complete, the embodied action that is addressed by the turn before the recipient’s response. We argue that through such embodied conduct, the speaker bodily enacts high agency, which is at odds with the low deontic stance they express through their darf/kann ich?-TCUs. In doing so, speakers presuppose that the intersubjective permissibility of the action is highly probable or even certain. Moreover, we demonstrate how the speaker’s embodied action, joint perceptual salience of referents, and the projectability of the action addressed with darf/kann ich? allow for a lean syntactic design of darf/kann ich?-TCUs (i.e., pronominalization, object omission, and main verb omission). Our findings underscore the reflexive relationship between lean syntax, sequential organization and multimodal conduct.
Social actions
(2021)
Social actions are recipient-designed actions that occur in the context of interaction sequences. This chapter focuses on sources and practices for the formation and ascription of social actions. While linguists stress the relevance of linguistic social action formats, conversation analysts highlight the relevance of the sequential position of an action, and sociolinguists point to the influence of social identities for action-formation and -ascription. The combination of these three approaches helps us to solve the analytic problem of indirectness, which, however, only rarely becomes a problem for the participants in an interaction themselves. Social properties which recurrently apply when using verbal and bodily resources of action-formation, i.e. the social actions themselves, inferred meanings, projected next actions, the participation framework, the activity type, speaker’s stance, participants’ identities, etc. lead to stable pragmatic connotations of those forms, i.e. action-meanings, which become idiomatic and part of our common-sense competence. Still, social actions are multi-layered and can be ambiguous at times. Therefore, their meaning can be open for negotiation. Intersubjectivity of action ascription is ultimately secured neither by conventions nor by speaker’s intentions, but is accomplished by their treatment in subsequent discourse.
Implicitly abusive language – What does it actually look like and why are we not getting there?
(2021)
Abusive language detection is an emerging field in natural language processing which has received a large amount of attention recently. Still, the success of automatic detection is limited. In particular, the detection of implicitly abusive language, i.e. abusive language that is not conveyed by abusive words (e.g. dumbass or scum), is not working well. In this position paper, we explain why existing datasets make learning implicit abuse difficult and what needs to be changed in the design of such datasets. Arguing for a divide-and-conquer strategy, we present a list of subtypes of implicitly abusive language and formulate research tasks and questions for future research.
This paper explores how attitudes affect the seemingly objective process of counting speakers of varieties using the example of Low German, Germany’s sole regional language. The initial focus is on the basic taxonomy of classifying a variety as a language or a dialect. Three representative surveys then provide data for the analysis: the Germany Survey 2008, the Northern Germany Survey 2016, and the Germany Survey 2017. The results of these surveys indicate that there is no consensus concerning the evaluation of Low German’s status and that attitudes towards Low German are related to, for example, proficiency in the language. These attitudes are shown to matter when counting speakers of Low German and investigating the status it has been accorded.
The European language world is characterized by an ideology of monolingualism and national languages. This language-related world view interacts with social debates and definitions about linguistic autonomy, diversity, and variation. For the description of border minorities and their sociolinguistic situation, however, this view reaches its limits. In this article, the conceptual difficulties with a language area that crosses national borders are examined. It deals with the minority in East Lorraine (France) in particular. On the language-historical level, this minority is closely related to the language of its (big) neighbor Germany. At the same time, it looks back on a conflictive history with this country, has never filled a (subordinated) political–administrative unit, and has experienced very little public support. We want to address the questions of how speakers themselves reflect on their linguistic situation and what concepts and argumentative figures they bring up in relation to what (Germanic) variety. To this end, we look at statements from guideline-based interviews. In the paper, we present first observations gained through qualitative content analysis.
The paper explores factors that influence the distribution of constituent words of compounds over the head and modifier position. The empirical basis for the study is a large database of German compounds, annotated with respect to the morphological structure of the compound and the semantic category of the constituents. The study shows that the polysemy of the constituent word, its constituent family size, and its semantic category account for tendencies of the constituent word to occur in either modifier or head position. Furthermore, the paper explores the degree to which the semantic category combination of head and modifier word, e.g., x=substance and y=artifact, indicates the semantic relation between the constituents, e.g., y_consists_of_x.
We present empirical evidence of the communicative utility of conventionalization, i.e., convergence in linguistic usage over time, and diversification, i.e., linguistic items acquiring different, more specific usages/meanings. From a diachronic perspective, conventionalization plays a crucial role in language change as a condition for innovation and grammaticalization (Bybee, 2010; Schmid, 2015) and diversification is a cornerstone in the formation of sublanguages/registers, i.e., functional linguistic varieties (Halliday, 1988; Harris, 1991). While it is widely acknowledged that change in language use is primarily socio-culturally determined pushing towards greater linguistic expressivity, we here highlight the limiting function of communicative factors on diachronic linguistic variation showing that conventionalization and diversification are associated with a reduction of linguistic variability. To be able to observe effects of linguistic variability reduction, we first need a well-defined notion of choice in context. Linguistically, this implies the paradigmatic axis of linguistic organization, i.e., the sets of linguistic options available in a given or similar syntagmatic contexts. Here, we draw on word embeddings, weakly neural distributional language models that have recently been employed to model lexical-semantic change and allow us to approximate the notion of paradigm by neighbourhood in vector space. Second, we need to capture changes in paradigmatic variability, i.e. reduction/expansion of linguistic options in a given context. As a formal index of paradigmatic variability we use entropy, which measures the contribution of linguistic units (e.g., words) in predicting linguistic choice in bits of information. 
Using entropy provides us with a link to a communicative interpretation, as it is a well-established measure of communicative efficiency with implications for cognitive processing (Linzen and Jaeger, 2016; Venhuizen et al., 2019); also, entropy is negatively correlated with distance in (word embedding) spaces, which in turn shows cognitive reflexes in certain language processing tasks (Mitchel et al., 2008; Auguste et al., 2017). In terms of domain, we focus on science, looking at the diachronic development of scientific English from the 17th century to modern times. This provides us with a fairly constrained yet dynamic domain of discourse that has witnessed a powerful systematization throughout the centuries and developed specific linguistic conventions geared towards efficient communication. Overall, our study confirms the assumed trends of conventionalization and diversification, shown by diachronically decreasing entropy, interspersed with local, temporary entropy highs pointing to phases of linguistic expansion pertaining primarily to the introduction of new technical terminology.
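Entropy as an index of paradigmatic variability can be illustrated with a minimal sketch of Shannon entropy in bits. The abstract does not give the study's computation over embedding neighbourhoods, so the function name `entropy_bits` and the toy word choices below are assumptions for illustration only.

```python
import math
from collections import Counter


def entropy_bits(choices):
    """Shannon entropy (in bits) of the observed options filling one contextual slot."""
    counts = Counter(choices)
    total = sum(counts.values())
    # Sum of p * log2(1 / p) over all observed options.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical toy data: words observed in the same slot at two points in time.
early_period = ["shew", "show", "demonstrate", "prove"]   # four equally likely options
late_period = ["show", "show", "show", "demonstrate"]     # one option dominates

early_entropy = entropy_bits(early_period)
late_entropy = entropy_bits(late_period)
```

A drop from the earlier to the later value corresponds to fewer effective options in the slot, the kind of reduction in variability that the abstract associates with conventionalization.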