Refine
Year of publication
Document Type
- Part of a Book (72)
- Article (7)
- Conference Proceeding (2)
- Preprint (1)
- Review (1)
Language
- English (83) (remove)
Has Fulltext
- yes (83) (remove)
Keywords
- Deutsch (21)
- Konversationsanalyse (13)
- Interaktion (10)
- Korpus <Linguistik> (10)
- Sprachpolitik (9)
- Annotation (7)
- Computerlinguistik (7)
- Semantik (7)
- Mehrsprachigkeit (6)
- Gesprochene Sprache (5)
Publicationstate
- Postprint (83) (remove)
Reviewstate
- (Verlags)-Lektorat (83) (remove)
Publisher
- Benjamins (29)
- Springer (14)
- Oxford University Press (8)
- Equinox (4)
- Palgrave Macmillan (4)
- Buske (2)
- Multilingual Matters (2)
- Routledge (2)
- Routledge, Taylor & Francis Group (2)
- Asgard (1)
Latvia
(2019)
This chapter deals with current issues in bilingual education in the framework of language and educational policies in Latvia, and also outlines similarities or common tendencies in the two other Baltic states, Estonia and Lithuania. As commonly understood in the 21st century, the term ‘bilingual education’ includes ‘multilingual education, as the umbrella term to cover a wide spectrum of practice and policy’ (García, 2009: 9).
The idea of this article is to take the immaterial and somehow ethereal nature of aesthetic concepts seriously by asking how aesthetic concepts are negotiated and thus formed in communication. My examples come from theatrical production where aesthetic decisions naturally play a major role. In the given case, an aesthetic concept is introduced with which only the director, but none of the actors is familiar in the beginning of the rehearsals. The concept, Wabi Sabi, comes from Japanese culture. As the whole rehearsal process was video recorded, it is possible to track the process of how the concept is negotiated and acquired over time. So, instead of defining criteria what Wabi Sabi as an aesthetic concept “consists of,” this article seeks to show how the concept is introduced, explained and “used” within a practical context, in this case a theater rehearsal. In contrast to conventional models of aesthetic experience, I am interested in the ways in which an aesthetic concept is configured in and through socially organized interaction, and — vice versa — how that interaction contributes to the situational accomplishment of the same concept. In short: I am interested in the “doing” of aesthetic concepts, especially in “doing Wabi Sabi.”
The ubiquity of smartphones has been recognised within conversation analysis as having an impact on conversational structures and on the participants’ interactional involvement. However, most of the previous studies have relied exclusively on video recordings of overall encounters and have not systematically considered what is taking place on the device. Due to the personal nature of smartphones and their small displays, onscreen activities are of limited visibility and are thus potentially opaque for both the co-present participants (“participant opacity”) and the researchers (“analytical opacity”). While opacity can be an inherent feature of smartphones in general, analytical opacity might not be desirable for research purposes. This chapter discusses how a recording set-up consisting of static cameras, wearable cameras and dynamic screen captures allowed us to address the analytical opacity of mobile devices. Excerpts from multi-source video data of everyday encounters will illustrate how the combination of multiple perspectives can increase the visibility of interactional phenomena, reveal new analytical objects and improve analytical granularity. More specifically, these examples will emphasise the analytical advantages and challenges of a combined recording set-up with regard to smartphone use as multiactivity, the role of the affordances of the mobile device, and the prototypicality and “naturalness” of the recorded practices.
Introduction
(2023)
We argue that properties with a nominal origin get transferred regularly in certain Gentian particle verb constructions to properties that are propositional insofar as they characterize the temporal structure of eventualities, understood to be described by propositional (= truth-assessable) representations of state changes. Accordingly, the oft-noted perfectivizing function of certain verbal particles like ein- in einfahren ('pull in', cf. Kühnhold 1972) is the effect of redressing a conflict at the syntax-semantics interface: On the one hand, constructions like in [die Grube]acc einfahren ('pull into the mine’) exhibit transitive syntax (Gehrke 2008), requiring that the syntactic arguments be mapped onto well-distinguished or DIFFERENT referents in the semantics (Kemmer 1993). On the other hand, in/ein codes a spatio-temporal inclusion relation between its relata, contradicting the requirement imposed by the transitive syntax. Following Brandt (2019), we submit that the interface executes a manoeuvre that delays the interpretation of part of the contradiction-inducing DIFFERENCE feature. It is not locally interpreted (semantically represented) in toto but in part passed on to the next syntactic-semantic computational cycle. Here, the passed-on meaning is interpreted in the locally customary terms, in the case at hand, as a temporal index where the post-state of the depicted eventuality does not hold.
Speech islands are historically and developmentally unique and will inevitably disappear within the next decades. We urgently need to preserve their remains and exploit what is left in order to make research on language-in-contact and historical as well as current comparative language research possible.
The Archive for Spoken German (AGD) at the Institute for German Language collects, fosters and archives data from completed research projects and makes them available to the wider research community.
Besides large variation corpora and corpora of conversational speech, the archive already contains a range of collections of data on German speech minorities. The latter will be outlined in this chapter. Some speech island data is already made available through the personal service of the AGD, or the database of spoken German (DGD), e.g. data on Australian German, Unserdeutsch, or German in North America. Some corpora are still being prepared for publication, but still important to document for potentially interested research projects. We therefore also explain the current problems and efforts related to the curation of speech island data, from the digitization of recordings and the collection of metadata, to the integration of transcriptions, annotations and other ways of accessing and sharing data.
Morphophonological asymmetries in affixation concern systematic correlations between morphological properties of affixes (e.g. combination with bound versus free stems, position relative to stem (suffixes versus prefixes)) and their phonological properties (e.g. stress behaviour). The arguably most insightful approach to capturing relevant asymmetries invokes a notion of affix coherence, first introduced by Dixon in connection with his work on Yidiɲ, a nearly extinct language spoken in Northern Australia. This notion is based on a categorical division of affixes into ones that integrate into the phonological word of the stem and ones that do not. The integration of affixes is envisioned as being fully determined by phonological and morphological structure in a given language and verifiable by diagnostics relevant to phonological word domains (primarily the syllable and the foot structure). The assumption of two types of prosodic domains characterized by integrated versus non-integrated affixes is manifest in consistent asymmetries that pertain to morphophonological, phonological, and phonetic rules. This consistency constitutes compelling evidence for the structure-based analysis of the impact of various affixes on derived words, as opposed to alternative approaches to capturing these effects by associating affixes with diacritics (morpheme versus word boundary, class 1 versus class 2, stratum 1 versus stratum 2). The present entry aims to demonstrate, mostly on the basis of data from Germanic languages, the breadth of the empirical evidence in support of a fundamental role of affix coherence. Moreover, it aims to draw attention to the various implications of affix coherence for modeling relevant generalizations, in particular the necessary reference to a level of phonological representation characterized by a specific degree of abstractness (‘phonemic’).
In this chapter, we will investigate smartphone-based showing sequences in everyday social encounters, that is, moments in which a personal mobile device is used for presenting (audio-)visual content to co-present participants. Despite a growing interest in object-centred sequences and mundane technology use, detailed accounts of the sequential, multimodal, and material dimensions of showing sequences are lacking. Based on video data of social interactions in different languages and on the framework of multimodal interaction analysis, this chapter will explore the link between mobile device use and social practices. We will analyse how smartphone showers and their recipients coordinate the manipulation of a technological object with multiple courses of action, and reflect upon the fundamental complexity of this by-now routine joint activity.
This chapter will present results of a linguistic landscape (LL) project in the regional centre of Rēzekne in the region of Latgale in Eastern Latvia. Latvia was de facto a part of the Soviet Union until 1991, and this has given it a highly multilingual society. In the essentially post-colonial situation since 1991, strict language policies have been in place, which aim to reverse the language shift from Russian, the dominant language of Soviet times, back to Latvian. Thus, the main interests of the research were how the complex pattern of multilingualism in Latvia is reflected in the LL; how people relate to current language legislation; and what motivations, attitudes and emotions inform their behaviour.
The establishment of Scottish Parliament: What difference does it make for the Gaelic language?
(2004)
After the Labour government takeover in Westminster in 1997, followed by the referendum on establishing a Scottish Parliament, hopes for more support for the Gaelic language in Scotland were nourished. In the election campaign to the Scottish Parliament in 1999, all parties which were elected to Parliament had mentioned Gaelic, and all parties except the Conservatives had promised an increase in support for Gaelic (cf. Scottish parties’ election manifestoes, obtainable from the parties or via their web sites). Now that the new Scottish Executive, formed by Labour and the Liberal Democrats, has been in power for some time, it is interesting to see if these hopes have been fulfilled.
The two core questions of this paper will thus be:
1. What is the status of Scottish Gaelic after the devolution process?
2. What difference does the existence of the Scottish Parliament make for the status of Gaelic?
It is important to note that this paper refers to language status and Gaelic’s position from a mere language policy perspective. The results are mostly based on an analysis of Parliament documents, the method of investigation being strictly philological. Empirical research has not yet been undertaken. The reference time of my paper will be the first year of Scottish Parliament and the new executive. Even though this is an arbitrary time break, the first year is a symbolic point of time. As the first legislation period as a possibly more natural reference point is not over yet, this choice seems legitimate.
This chapter analyses the impact of political decentralization in a state on the position of ethnic and linguistic minorities, in particular with regard to the role of parliamentary assemblies in the political system. It relates a number of typical functions of parliaments to the specific needs of minorities and their languages. The most important of these functions are the representation of the minority and responsiveness to the minority’s needs. The chapter then discusses six examples from the European Union (and Norway) which prototypically represent different types of parliamentary decentralization: the ethnically defined Sameting in Norway and its importance for the Sámi population, the Scottish Parliament and its role for speakers of Scottish Gaelic, the German regional parliaments of the Länder of Schleswig-Holstein and Saxony and their impact on the Frisian and Sorbian minorities respectively, the autonomy of predominantly German-speaking South Tyrol within the Italian state, and finally the situation of the speakers of Latgalian in Latvia, where a decentralized parliament is missing. The chapter also makes suggestions on comparisons of these situations with minorities in Russia. It finally argues that political decentralization may indeed empower minorities to gain a greater voice in their states, even if much ultimately depends on individual factors in each situation and the attitudes by the majority population and the political center.
This chapter investigates policies which shape the role of the German language in contemporary Estonia. Whereas German played for many centuries an important role as the language of the economic and cultural elite in Estonia, it severely declined in importance throughout the twentieth century. Mirrored on this historical background, the paper provides an overview of the current functions of German and attitudes towards it and it discusses how these functions and attitudes are influenced by policies of various actors from inside and outside Estonia. The paper argues that German continues to play a significant role: while German is no longer a lingua franca, it still enjoys a number of functions and prestige in clearly defined niches involving communication within German-speaking circles or between Estonians and Germans. The interplay of language policies of the Estonian and the German-speaking states as well as by semi-state and private institutions succeed in maintaining German as an additional language in contemporary Estonia.
This chapter introduces readers to the context and concept of this volume. It starts by providing an historical overview of languages and multilingualism in Lithuania, Estonia and Latvia, highlighting the 100th anniversary of statehood which the three Baltic states are celebrating in 2018. Then, the chapter briefly presents important strands of research on multilingualism in the region throughout the past decades; in particular, questions about language policies and the status of the national languages (Estonian, Latvian and Lithuanian) and Russian. It also touches on debates about languages in education and the roles of other languages such as the regional languages of Latgalian and Võro and the changing roles of international languages such as English and German. The chapter concludes by providing short summaries of the contributions to this book.
Studies on the Linguistic Landscapes (LLs) investigate frequencies, functions, and power relations between languages and their speakers in public space. Research on the LL thereby aims to understand how the production and perception of signs reflect and simultaneously shape realities. In this sense, the LL is one of the most dynamic places where processes of minoritization take place: the (in)visibility of minority languages and the functional and symbolic relationships to majority languages are in direct relationship with negotiations of minorities’ place in society. This chapter looks at minority languages in the LL from two major perspectives. Firstly, it discusses language policies, focussing on which policy categories and which domains of language use are of particular relevance for understanding minority languages in the LL. Then, it turns to issues of conflict, contestation, and exclusion by providing examples from a range of geographically and typologically prototypical case studies, including Israel, Canada, Belgium, the Basque Country, and Friesland.
This paper investigates synchronic variation in the lexical and grammatical environments of the German lexical verb verdienen ‘earn’, ‘deserve’. In its lexical uses, verdienen co-occurs with an object noun phrase whose head is either concrete (e.g. Geld ‘money’) or, more commonly, abstract (e.g. Beachtung ‘attention’). When it is used more grammatically with deontic modal meaning, verdienen is followed by a passive or active infinitive. This paper uses collostructional analyses to contrast lexical and grammatical uses in terms of the most strongly attracted lexical items, which are grouped into semantic classes. The results reflect different degrees of host-class expansion (cf. Himmelmann 2004), whereby the collexemes of verdienen expand from concrete to abstract and their morpho-syntactic contexts from nominal to infinitival complement and subsequently from passive to active. Synchronic distribution can thus serve as a window on diachronic development (Kuteva 2001), in this case the rise of a deontic modality marker.
Meta-communicative practices are generally reflexive in a fairly obvious sense: Inasmuch as speakers use them to talk about or comment on earlier/subsequent talk, they use language self-reflexively. In this paper, we explore a practice that is reflexive not only in this meta-communicative sense but also in a sequential-interactional one: Prefacing a conversational turn with I was gonna say. We show that the I was gonna say-preface furnishes the following general semantic-pragmatic affordances: (1) It retroactively relates the speaker’s subsequent talk to preceding talk from a co-participant, (2) it embodies a claim to prior, now-preempted, communicative intent with regard to what their co-participant has (just) said/done, (3) it therefore displays its speaker’s orientation to the relevance or the appropriate placement of the action(s) done in their own subsequent talk at an earlier moment in the interaction, and (4) it reflexively re-invokes, or retrieves, this earlier moment as the relevant sequential context for their action(s). We then go on to illustrate how speakers draw on these sequentially reflexive affordances for managing recurrent interactional contingencies in specific sequential environments. The paper ends with a discussion of the role that reflexivity plays in and for the deployment of this practice.
The article addresses Solution-Oriented Questions (SOQs) as an interactional practice for relationship management in psychodiagnostic interviews. Therapeutic alliance results from the concordance of alignment, as willingness to cooperate regarding common goals, and of affiliation, as relationship based upon trust. SOQs particularly allow for both: They are situated at the end of a troublesome topic area, which is linked to low agency on the patient’s side, and they reveal understanding of and interest in the patient. Following the paradigm of Conversation Analysis and German Gesprächsanalyse this paper analyzes the design and functions of SOQs as a means for securing and enhancing the relationship in the process of therapy. Our data comprise 15 videotaped first interviews following the manual of the Operationalized Psychodynamic Diagnostics. The analyses refer to all SOQs found but will be illustrated by means of a single conversation.
Null subjects (NSs) have been a central research topic in generative syntax ever since the 1980s. This chapter considers the situation of German NSs both from a dialectological and from a diachronic perspective and attempts to reconstruct a direct line concerning the licensing conditions of pro-drop from Old High German (OHG) through Middle High German (MHG) and Early New High German (ENHG) to current dialects of New High German (NHG). Particularly, we will argue that German changed from a consistent, yet asymmetric pro-drop language to a partial, but symmetric one. In order to demonstrate that this development took place and the steps involved, we survey the existing empirical evidence and introduce new data.
Based on conference reports and minutes, archive material and official documents, the article seeks to explore the way in which the promotion of women’s sports and of women in leadership positions became an important part of the sport policy of two major organizations involved in European sport cooperation: the Council of Europe and the European Sport Conference. During first and modest discussions in the 1960s and 1970s it constituted a rather paternalistic project. Also, it was based on the assumption of an essential difference between men and women concerning the need for participation in sport. This only changed since the beginning of the 1980s when women took the course in their own hands, challenged the underlying assumptions and created new networks of cooperation.
Since Lerner coined the notion of delayed completion in 1989, this recurrent social practice of continuing one’s speaking turn while disregarding an intermediate co-participant’s utterance has not been investigated with regard to embodied displays and actions. A sequential approach to videotaped mundane conversations in German will explain the occurrence and use of delayed completions. First, especially in multi-party and multi-activity settings, delayed completions can result from reduced monitoring and coordinating activities. Second, recipients can use intra-turn response slots for more extended responsive actions than the current speaker initially projected, leading to delayed completion sequences. Finally, delayed completions are used for blocking possibly misaligned co-participant actions. The investigation of visible action illustrates that delayed completions are a basic practice for retrospectively managing co-participant response slots.
The chapter on formats and models for lexicons deals with different available data formats of lexical resources. It elaborates on their structure and possible uses. Motivated by the restrictions in merging different lexical resources based on widely spread formalisms and international standards, a formal lexicon model for lexical resources is developed which is related to graph structures in annotations. For lexicons this model is termed the Lexicon Graph. Within this model the concepts of lexicon entries and lexical structures frequently described in the literature are formally defined and examples are given. The article addresses the problem of ambiguity in those formal terms. An implementation based on XML and XML technology such as XQuery for the defined structures is given. The relation to international standards is included as well.
Mock fiction is a genre of humorous, fictional narratives. It is pervasive in adolescents’ peer-group interaction. Building on a corpus of informal peer-group interaction among 14 to 17 year-old German adolescents, it is shown how mock fiction is used to sanction identity-claims of peer-group co-members that are taken to be inadequate by the teller of a mock fiction. Mock fiction exposes and ridicules those claims by fictional exaggeration. Mock fiction is an indirect, yet sometimes even highly abusive means for criticizing and negotiating identities and statuses of peer-group members. The analysis shows how mock fiction is collaboratively produced, how it is used to convey criticism and to negotiate social norms indirectly, and how, in addition, it allows for performative self-positioning of the tellers as skilled, entertaining tellers and socio-psychological diagnosticians.
In this paper we examine the composition and interactional deployment of suspended assessments in ordinary German conversation. We define suspended assessments as lexicosyntactically incomplete assessing TCUs that share a distinct cluster of prosodic-phonetic features which auditorily makes them come off as 'left hanging' rather than cut-off (e.g., Schegloff/Jefferson/Sacks 1977; Jasperson 2002) or trailing-off (e.g., Local/Kelly 1986; Walker 2012). Using CA/IL methodology (Couper-Kuhlen/Selting 2018) and drawing on a large body of video-recorded face-to-face conversations, we highlight the verbal, vocal and bodily-visual resources participants use to render such unfinished assessing TCUs recognizably incomplete and identify six recurrent usage types. Overall, the suspension of assessing TCUs appears to either serve as a practice for circumventing the production of assessments that are interactionally inapposite, or as a practice for coping with local contingencies that render the very doing of an assessment problematic for the speaker. Data are in German with English translations.
This paper discusses a theoretical and empirical approach to language fixedness that we have developed at the Institut für Deutsche Sprache (IDS) (‘Institute for German Language’) in Mannheim in the project Usuelle Worterbindungen(UWV) over the last decade. The analysis described is based on the Deutsches Referenzkorpus (‘German Reference Corpus’; DeReKo) which is located at the IDS. The corpus analysis tool used for accessing the corpus data is COSMAS II (CII) and – for statistical analysis – the IDS collocation analysis tool (Belica, 1995; CA). For detecting lexical patterns and describing their semantic and pragmatic nature we use the tool lexpan (or ‘Lexical Pattern Analyzer’) that was developed in our project. We discuss a new corpus-driven pattern dictionary that is relevant not only to the field of phraseology, but also to usage-based linguistics and lexicography as a whole.
In informal interaction, speakers rarely thank a person who has complied with a request. Examining data from British English, German, Italian, Polish, and Telugu, we ask when speakers do thank after compliance. The results show that thanking treats the other’s assistance as going beyond what could be taken for granted in the circumstances. Coupled with the rareness of thanking after requests, this suggests that cooperation is to a great extent governed by expectations of helpfulness, which can be long-standing, or built over the course of a particular interaction. The higher frequency of thanking in some languages (such as English or Italian) suggests that cultures differ in the importance they place on recognizing the other’s agency in doing as requested.
This study investigates the question of whether the processing of complex anaphors require more cognitive effort than the processing of NP-anaphors. Complex anaphors refer to abstract objects which are not introduced as a noun phrase and bring about the creation of a new discourse referent. This creation is called “complexation process”. We describe ERP findings which provide converging support for the assumption that the cognitive cost of this complexation process is higher than the cognitive cost of processing NP-anaphors.
Preface
(2010)
This paper deals with multiword lexemes (MWLs), focussing on two types of verbal MWLs: verbal idioms and support verb constructions. We discuss the characteristic properties of MWLs, namely nonstandard compositionality, restricted substitutability of components, and restricted morpho-syntactic flexibility, and we show how these properties may cause serious problems during the analysis, generation, and transfer steps of machine translation systems. In order to cope with these problems, MT lexicons need to provide detailed descriptions of MWL properties. We list the types of information which we consider the necessary minimum for a successful processing of MWLs, and report on some feasibility studies aimed at the automatic extraction of German verbal multiword lexemes from text corpora and machine-readable dictionaries.
This article discusses questions concerning the creation, annotation and sharing of spoken language corpora. We use the Hamburg Map Task Corpus (HAMATAC), a small corpus in which advanced learners of German were recorded solving a map task, as an example to illustrate our main points. We first give an overview of the corpus creation and annotation process including recording, metadata documentation, transcription and semi-automatic annotation of the data. We then discuss the manual annotation of disfluencies as an example case in which many of the typical and challenging problems for data reuse – in particular the reliability of interpretative annotations – are revealed.
This contribution deals with right-dislocated complement clauses with the subordinating conjunction dass (‘that’) in German talk-in-interaction. The bi-clausal construction we analyze is as follows: The first clause, in which one argument is realized by the demonstrative pronoun das (‘this/that’), is syntactically and semantically complete; the reference of the pronoun is (re-)specified by adding a dass-complement clause after a point of possible completion (e.g., aber das hab ich nich MITbekommen. (0.32) dass es da so YOUtubevideos gab. (‘But I wasn’t aware of that. That there were videos about that on YouTube.’). The first clause always performs a backward-oriented action (e.g., an assessment) and the second clause (re-)specifies the propositional reference of the demonstrative, allowing for a (strategic) perspective shift. Based on a collection of 93 cases from everyday conversations and institutional interactions, we found that the construction is used close to the turn-beginning for referring to and (re-)specifying (parts of) another speaker’s prior turn; turn-internal uses tie together parts of a speaker’s multi-unit turn. The construction thus facilitates an incremental constitution of meaning and reference.
This paper discusses the semi-formal language of mathematics and presents the Naproche CNL, a controlled natural language for mathematical authoring. Proof Representation Structures, an adaptation of Discourse Representation Structures, are used to represent the semantics of texts written in the Naproche CNL. We discuss how the Naproche CNL can be used in formal mathematics, and present our prototypical Naproche system, a computer program for parsing texts in the Naproche CNL and checking the proofs in them for logical correctness.
Introduction
(2010)
The analysis which we present in this paper is part of an ethnographically based sociolinguistic study of various immigrant youth groups and their social style of communication. The study describes the wide variety of migrant groups and their socio-cultural orientation in relation to different migrant worlds as well as to different social worlds of the dominant German society. The development of a social style of communication is grounded in the groups’ socio-cultural orientation as well as in the perception of themselves in relation to relevant others. The main purpose of our study is to analyse the construction of the groups’ social identity in terms of their social style of communication.
The main point of this chapter is to demonstrate how a speaker’s concept of his/her professional role can be inferred from his/her perspectival work (perspective setting and relating different perspectives to one another) in professional encounters. Thereby some risks of complex perspectival work in discourse will become manifest which result - at one point in the talk - in perspectival inconsistency, revealing a deeply grounded social problem for the speaker. This will be examined in the framework of a rhetorical conversation analysis.
Based on the empirical data of 97 fourth-graders from three districts of Braunschweig in Germany, this paper investigates the possibility of changing semantic frames in multilingual communities. The focus of study is the verb field of self-motion. In a free-sorting task involving 52 verbs, Turkish-speaking students, in particular, placed the verbs schleichen (‘to sneak’) and kommen (‘to come’) in the same group. When explaining the perceived similarity they also used the word schleichen (‘to sneak’), in a specific grammatical construction that is not found in Standard German. This paper suggests that semantic frames may change along with grammatical constructions when typologically distinct languages come into close contact.
Consistency of reference structures is an important issue in lexicography and dictionary research, especially with respect to information on sense-related items. In this paper, the systematic challenges of this area (e.g. ‘non-reversed reference’, bidirectional linking being realised as unidirectional structures) will be outlined, and the problems which can be caused by these challenges for both lexicographers and dictionary users will be discussed. The paper also discusses how text-technological Solutions may help to provide Support for the consistency of sense-related pairings during the process of compiling a dictionary.
This study examines what kind of cues and constraints for discourse interpretation can be derived from the logical and generic document structure of complex texts by the example of scientific journal articles. We performed statistical analysis on a corpus of scientific articles annotated on different annotations layers within the framework of XML-based multi-layer annotation. We introduce different discourse segment types that constrain the textual domains in which to identify rhetorical relation spans, and we show how a canonical sequence of text type structure categories is derived from the corpus annotations. Finally, we demonstrate how and which text type structure categories assigned to complex discourse segments of the type “block” statistically constrain the occurrence of rhetorical relation types.
Germany's (single) national official language is German. The dominance of German in schools, politics, the legal system, administration and the entire written public domain is so great that for a long time the lack of a coherent language policy was not seen as a problem. State restraint in this area is due, on the one hand, to historical reasons; on the other hand, it has been promoted by the federal system in Germany, which grants the federal states far-reaching responsibilities in the fields of education and culture. More recently, multilingualism among the population has increased and has resulted in a growing interest in understanding the language situation in Germany and (in particular) taking a closer look at the different minority languages. In 2017, for the first time in about 80 years, there is a question on the language of the population in the German micro census. The Institute for the German Language has also carried out various representative surveys; in the winter of 2017/201, a large representative survey with questions on the language repertoire and language attitudes is in the field.
Content analysis provides a useful and multifaceted, methodological framework for Twitter analysis. CAQDAS tools support the structuring of textual data by enabling categorising and coding. Depending on the research objective, it may be appropriate to choose a mixed-methods approach that combines quantitative and qualitative elements of analysis and plays out their respective advantages to the greatest possible extent while minimising their shortcomings. In this chapter, we will discuss CAQDAS speech act analysis of tweets as an example of software-assisted content analysis. We start with some elementary thoughts on the challenges of the collection and evaluation of Twitter data before we give a brief description of the potentials and limitations of using the software QDA Miner (as one typical example for possible analysis programmes). Our focus will lie on analytical features that can be particularly helpful in speech act analysis of tweets.
Language Change
(2017)
The present chapter outlines a research program for historical linguistics based on the idea that the object of the formal study of language change should be defined as grammar change, that is, a set of discrete differences between the target grammar and the grammar acquired by the learner (Hale 2007). This approach is shown to offer new answers to some classical problems of historical linguistics (Weinreich et al. 1968), concerning, specifically, the actuation of changes and the observation that the transition from one historical state to another proceeds gradually. It is argued that learners are highly sensitive to small fluctuations in the linguistic input they receive, making change inevitable, while the impression of gradualness is linked to independent factors (diffusion in a speech community, and grammar competition). Special attention is paid to grammaticalization phenomena, which offer insights into the nature of functional categories, the building blocks of clause structure.
In spite of the obvious importance that is accorded to the notion grammatical construction in any approach that sees itself as a construction grammar (CxG), there is as yet no generally accepted definition of the term across different variants of the framework. In particular, there are different assumptions about which additional requirements a given structure has to meet in order to be recognized as a construction besides being a ‘form-meaning pair’. Since the choice of a particular definition will determine the range of both relevant phenomena and concrete observations to be considered in empirical research within the framework, the issue is not just a mere terminological quibble but has important methodological repercussions especially for quantitative research in areas such as corpus linguistics. The present study illustrates some problems in identifying and delimiting such patterns in naturally occurring text and presents arguments for a usage-based interpretation of the term grammatical construction.
The lexicography of German
(2020)
This chapter discusses the main dictionaries of the German language as it is spoken and written in Germany, and also German as it is spoken and written in Austria, Switzerland, the eastern fringes of Belgium, and South Tyrol. It also briefly describes Pennsylvania German. Corpora and other language resources used in German dictionary-making are also presented. Finally, there is a discussion of some current issues in German lexicography, as well as future prospects.
The English language has taken advantage of the Digital Revolution to establish itself as the global language; however, only 28.6 %of Internet users speak English as their native language. Machine Trans-lation (MT) is a powerful technology that can bridge this gap. In devel-opment since the mid-20th century, MT has become available to every Internet user in the last decade, due to free online MT services. This paper aims to discuss the implications that these tools may have for the privacy of their users and how they are addressed by EU data protec-tion law. It examines the data-flows in respect of the initial processing (both from the perspective of the user and the MT service provider) and potential further processing that may be undertaken by the MT service provider.
Contemporary studies on the characteristics of natural language benefit enormously from the increasing amount of linguistic corpora. Aside from text and speech corpora, corpora of computer-mediated communication (CMC) Position themselves between orality and literacy, and beyond that provide in- sight into the impact of "new", mainly intemet-based media on language beha- viour. In this paper, we present an empirical attempt to work with annotated CMC corpora for the explanation of linguistic phenomena. In concrete terms, we implement machine leaming algorithms to produce decision trees that reveal rules and tendencies about the use of genitive markers in German.
Discourse parsing of complex text types such as scientific research articles requires the analysis of an input document on linguistic and structural levels that go beyond traditionally employed lexical discourse markers. This chapter describes a text-technological approach to discourse parsing. Discourse parsing with the aim of providing a discourse structure is seen as the addition of a new annotation layer for input documents marked up on several linguistic annotation levels. The discourse parser generates discourse structures according to the Rhetorical Structure Theory. An overview of the knowledge sources and components for parsing scientific joumal articles is given. The parser’s core consists of cascaded applications of the GAP, a Generic Annotation Parser. Details of the chart parsing algorithm are provided, as well as a short evaluation in terms of comparisons with reference annotations from our corpus and with recently developed Systems with a similar task.
Integrated Linguistic Annotation Models and Their Application in the Domain of Antecedent Detection
(2011)
Seamless integration of various, often heterogeneous linguistic resources in terms of their output formats and a combined analysis of the respective annotation layers are crucial tasks for linguistic research. After a decade of concentration on the development of formats to structure single annotations for specific linguistic issues, in the last years a variety of specifications to store multiple annotations over the same primary data has been developed. The paper focuses on the integration of the knowledge resource logical document structure information into a text document to enhance the task of automatic anaphora resolution both for the task of candidate detection and antecedent selection. The paper investigates data structures necessary for knowledge integration and retrieval.
Different Views on Markup
(2010)
In this chapter, two different ways of grouping information represented in document markup are examined: annotation levels, referring to conceptual levels of description, and annotation layers, referring to the technical realisation of markup using e.g. document grammars. In many current XML annotation projects, multiple levels are integrated into one layer, often leading to the problem of having to deal with overlapping hierarchies. As a solution, we propose a framework for XML-based multiple, independent XML annotation layers for one text, based on an abstract representation of XML documents with logical predicates. Two realisations of the abstract representation are presented, a Prolog fact base format together with an application architecture, and a specification for XML native databases. We conclude with a discussion of projects that have currently adopted this framework.
An approach to the unification of XML (Extensible Markup Language) documents with identical textual content and concurrent markup in the framework of XML-based multi-layer annotation is introduced. A Prolog program allows the possible relationships between element instances on two annotation layers that share PCDATA to be explored and also the computing of a target node hierarchy for a well-formed, merged XML document. Special attention is paid to identity conflicts between element instances, for which a default solution that takes into account metarelations that hold between element types on the different annotation layers is provided. In addition, rules can be specified by a user to prescribe how identity conflicts should be solved for certain element types.
This article introduces the topic of ‘‘Multilingual language resources and interoperability’’. We start with a taxonomy and parameters for classifying language resources. Later we provide examples and issues of interoperatability, and resource architectures to solve such issues. Finally we discuss aspects of linguistic formalisms and interoperability.
Researchers in many disciplines, sometimes working in close cooperation, have been concerned with modeling textual data in order to account for texts as the prime information unit of written communication. The list of disciplines includes computer science and linguistics as well as more specialized disciplines like computational linguistics and text technology. What many of these efforts have in common is the aim to model textual data by means of abstract data types or data structures that support at least the semi-automatic processing of texts in any area of written communication.
We report on finished work in a project that is concerned with providing methods, tools, best practice guidelines, and solutions for sustainable linguistic resources. The article discusses several general aspects of sustainability and introduces an approach to normalizing corpus data and metadata records. Moreover, the architecture of the sustainability platform implemented by the authors is described.
This article shows that the TEI tag set for feature structures can be adopted to represent a heterogeneous set of linguistic corpora. The majority of corpora is annotated using markup languages that are based on the Annotation Graph framework, the upcoming Linguistic Annotation Format ISO standard, or according to tag sets defined by or based upon the TEI guidelines. A unified representation comprises the separation of conceptually different annotation layers contained in the original corpus data (e.g. syntax, phonology, and semantics) into multiple XML files. These annotation layers are linked to each other implicitly by the identical textual content of all files. A suitable data structure for the representation of these annotations is a multi-rooted tree that again can be represented by the TEI and ISO tag set for feature structures. The mapping process and representational issues are discussed as well as the advantages and drawbacks associated with the use of the TEI tag set for feature structures as a storage and exchange format for linguistically annotated data.
The multiple gradations of German strong verbs are but manifestations of a rather uncomplicated system. There is a small number of ways to make up ablaut forms; these types of formation are identifiable in formal terms and, what is more, they have definite functions as morphological markers. Using classifications of stem forms according to quality, complexity and quantity of vowels, three types of operations involved in ablaut formation are identified. Ablaut always includes a change of quality type or a change of complexity type, and in addition it may include a change of quantity type. Ablaut forms are clearly distinguished as against bases (and against each other): their vocalism meets a defined standard of dissimilarity. On this basis, gradations are collected into inflectional classes that are defined in strictly synchronic terms. These classes continue the historical seven classes known from reference grammars. For the majority of strong verbs, membership in these classes (and thus ablaut) is predictable.
The paper gives an analysis of productively occurring dative constructions in German, attempting to unify what are known traditionally as Double Object and Experiencer Datives. The datives in question - cipients as we call them - are argued to be licensed under two conditions: One, predicates licensing cipients project a theme and a location argument internally; two, interpretation of the predication as a whole involves reference to two dissociated temporal intervals, or more generally, indexical truth intervals. It is argued that the location argument is needed because it provides the variable that is bound by the cipient argument - the variable in question ranges over superlocations of the location argument referent. Reference to two truth intervals is forced because interpreting the cipient structure involves evaluation of two propositional meanings that would contradict each other in a single context. The first propositional meaning is embedded in the predicate; it encodes that something is at a certain location (in quality space). The second propositional meaning is projected as a presupposition that corresponds just to the negation of the first one. The cipient, functioning as the logical subject of the construction, accommodates this second presuppositional meaning; this makes the construction as a whole interpretable. The analysis applies uniformly to what appear to be the two major contexts licensing cipients: ‘eventive’ and ‘foo-comparative’ predications, thereby accounting for some striking parallels between them.
Discourse segmentation is the division of a text into minimal discourse segments, which form the leaves in the trees that are used to represent discourse structures. A definition of elementary discourse segments in German is provided by adapting widely used segmentation principles for English minimal units, while considering punctuation, morphology, sytax, and aspects of the logical document structure of a complex text type, namely scientific articles. The algorithm and implementation of a discourse segmenter based on these principles is presented, as well an evaluation of test runs.
This paper analyses paramedic emergency interaction as multimodal multiactivity. Based on a corpus of video-recordings of emergency drills performed by professional paramedics during advanced training, the focus is on paramedics’ participation in multiple joint projects which become simultaneously relevant. Simultaneity and fast succession of multiactivity does not only characterise work on the team level, but also the work profile of the individual paramedic. Participants have to coordinate their own participation in more than one joint project intrapersonally. In the data studied, three patterns of allocating multimodal resources stood out as routine ways of coordinating participation in two simultaneous projects intrapersonally:
1. Talk and hearing vs. manual action monitored by gaze,
2. Talk and hearing vs. gazing (and pointing),
3. Manual action vs. gaze (and talk and hearing).
In her overview, Margret Selting makes the case for the claim that dealing with authentic conversation necessarily lies at the heart of an interactionallinguistic approach to prosody (see Selting this volume, Section 3.3). However, collecting and transcribing corpora of authentic interaction is a time-consuming enterprise. This fact often severely restricts what the individual researcher is able to do in terms of analysis within the scope of his or her resources. Still, for dealing with many of the desiderata Margret Selting points out in Section 5 of her extensive overview, the use of larger corpora seems to be required. In this commenting paper, I want to argue that future progress in research on prosody in interaction will essentially rest on the availability and use of large public corpora. After reviewing arguments for and against the use of public corpora, I will discuss some upshots regarding corpus design and issues of transcription of public corpora.
One major issue in the accomplishment of contrasts in conversation is lexical choice of items which carry the semantic Ioad of the two states of affair which are represented as being opposed to one another. These items or expressions are co-selected to be understood as being contrastively related to each other. In this paper, it is argued that the activity of contrasting itself provides them with a specific local opposite meaning which they would not obtain in other contexts. Practices of contrastingare thus seen as an example of conversational activities which creatively and systematically affect situated meanings. Basedon data from various genres, such as meetings, mediation sessions and conversations, the paper discusses two practices of contrasting, their sequential construction and their interpretative effects. It is concluded that the interpretative effects of conversational contrasting rest on the sequential deployment oflinguistic resources and on the cognitive procedures of frame-based interpretation and constructing a maximally contrastive interpretation for the co-selected expressions.