Mock fiction is a genre of humorous, fictional narratives. It is pervasive in adolescents’ peer-group interaction. Building on a corpus of informal peer-group interaction among 14- to 17-year-old German adolescents, the study shows how mock fiction is used to sanction identity claims of peer-group co-members that the teller of a mock fiction takes to be inadequate. Mock fiction exposes and ridicules those claims by fictional exaggeration. It is an indirect, yet sometimes highly abusive means of criticizing and negotiating the identities and statuses of peer-group members. The analysis shows how mock fiction is collaboratively produced, how it is used to convey criticism and to negotiate social norms indirectly, and how, in addition, it allows tellers to position themselves performatively as skilled, entertaining tellers and socio-psychological diagnosticians.
The article first presents the three basic methodological procedures of conversation analysis and of discursive psychology, which has since come to follow its approach: transcription, detailed sequential analysis of single cases, and the (comparative) analysis of data collections. After an overview of basic findings on the organization of interaction, it turns to three areas of psychological research: the constitution of identity in conversation, the role of cognition in social interaction, and the study of psychotherapy sessions.
The lexicography of German
(2020)
This chapter discusses the main dictionaries of the German language as it is spoken and written in Germany, and also German as it is spoken and written in Austria, Switzerland, the eastern fringes of Belgium, and South Tyrol. It also briefly describes Pennsylvania German. Corpora and other language resources used in German dictionary-making are also presented. Finally, there is a discussion of some current issues in German lexicography, as well as future prospects.
Question Answering Systems for retrieving information from Knowledge Graphs (KG) have become a major area of interest in recent years. Current systems search for words and entities but cannot search for grammatical phenomena. The purpose of this paper is to present our research on developing a QA System that answers natural language questions about German grammar.
Our goal is to build a KG that contains facts and rules about German grammar and that can answer specific questions about a concrete grammatical issue. We give an overview of current research on QA systems and ontology design and show how we plan to construct the KG by integrating data from the grammatical information system Grammis, hosted by the Leibniz-Institut für Deutsche Sprache (IDS). In this paper, we describe the construction of the initial KG, sketch the resulting graph, and demonstrate the effectiveness of such an approach. A grammar correction component will be part of a later stage. The paper concludes with potential areas for future research.
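As a rough illustration of what storing grammatical facts in a graph and querying them can look like, the following minimal Python sketch keeps facts as subject-predicate-object triples and answers a lookup by pattern matching. The triples, the predicate names (`is_a`, `governs_case`), and the helper functions are hypothetical and are not taken from Grammis or from the system described in the abstract.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Illustrative facts only; they are NOT taken from Grammis.
TRIPLES: List[Triple] = [
    ("wegen", "is_a", "preposition"),
    ("wegen", "governs_case", "genitive"),
    ("mit", "is_a", "preposition"),
    ("mit", "governs_case", "dative"),
]


def match(triples: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return all triples that agree with the given pattern.

    A value of None acts as a wildcard for that position.
    """
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def governed_case(word: str) -> List[str]:
    """Hypothetical helper: which case(s) does `word` govern in the toy graph?"""
    return [obj for _, _, obj in match(TRIPLES, s=word, p="governs_case")]


if __name__ == "__main__":
    print(governed_case("mit"))
    print(match(TRIPLES, p="is_a", o="preposition"))
```

A real QA system would additionally have to map a natural language question onto such a pattern; that step is left out here.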
German subjectively veridical sicher sein ‘be certain’ can embed ob-clauses in negative contexts, while subjectively veridical glauben ‘believe’ and nonveridical möglich sein ‘be possible’ cannot. The Logical Form of F isn’t certain if M is in Rome is regarded as the negated disjunction of two sentences ¬(cf σ ∨ cf ¬σ) or ¬cf σ ∧ ¬cf ¬σ. Be certain can have this LF because ¬cf σ and ¬cf ¬σ are compatible and nonveridical. Believe excludes this LF because ¬bf σ and ¬bf ¬σ are incompatible in a question-under-discussion context. It follows from this incompatibility and from the incompatibility of bf σ and bf ¬σ that bf ¬σ and ¬bf σ are equivalent. Therefore believe cannot be nonveridical. Be possible doesn’t allow the LF either. Similar to believe, ¬pf σ and ¬pf ¬σ are incompatible. But unlike believe, pf σ and pf ¬σ are compatible.
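The two logical steps in this abstract can be spelled out as follows. This is a sketch that assumes "incompatible" is read classically, that is, two statements cannot both be true; the abstract itself ties the first incompatibility to a question-under-discussion context.

```latex
\begin{align*}
&\text{De Morgan (be certain):}\quad
  \neg(cf\,\sigma \vee cf\,\neg\sigma) \;\Longleftrightarrow\; \neg cf\,\sigma \wedge \neg cf\,\neg\sigma \\[4pt]
&\text{(1) incompatibility of } \neg bf\,\sigma \text{ and } \neg bf\,\neg\sigma:\quad
  \neg(\neg bf\,\sigma \wedge \neg bf\,\neg\sigma) \;\Longleftrightarrow\; bf\,\sigma \vee bf\,\neg\sigma \\
&\text{(2) incompatibility of } bf\,\sigma \text{ and } bf\,\neg\sigma:\quad
  \neg(bf\,\sigma \wedge bf\,\neg\sigma) \\
&\text{from (1):}\quad \neg bf\,\sigma \rightarrow bf\,\neg\sigma \\
&\text{from (2):}\quad bf\,\neg\sigma \rightarrow \neg bf\,\sigma \\
&\text{hence:}\quad bf\,\neg\sigma \leftrightarrow \neg bf\,\sigma
\end{align*}
```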
The demo presents a minimalist, off-the-shelf author name disambiguation (AND) tool which provides a fundamental AND operation, the comparison of two publications with ambiguous authors, as an easily accessible HTTP interface. The tool implements this operation using standard AND functionality, but puts particular emphasis on advanced methods from natural language processing (NLP) for comparing publication title semantics.
The use of digital resources and tools across humanities disciplines is steadily increasing, giving rise to new research paradigms and associated methods that are commonly subsumed under the term digital humanities. Digital humanities does not constitute a new discipline in itself, but rather a new approach to humanities research that cuts across different existing humanities disciplines. While digital humanities extends well beyond language-based research, textual resources and spoken language materials play a central role in most humanities disciplines.
Transferring research data management from one institution to another infrastructural partner is anything but trivial, yet it can become necessary, for instance, when an institution faces reorganization or closure. In a case study, we describe the migration of all research data, identify the challenges we encountered, and discuss how we addressed them. The case study shows that moving research data management to another institution is a feasible but potentially costly enterprise. Being able to demonstrate the feasibility of research data migration supports the stance of data archives that users can expect high levels of trust and reliability when it comes to data safety and sustainability.
Terminological resources play a central role in the organization and retrieval of scientific texts. Both simple keyword lists and advanced modelings of relationships between terminological concepts can make a most valuable contribution to the analysis, classification, and finding of appropriate digital documents, either on the web or within local repositories. This seems especially true for long-established scientific fields with elusive theoretical and historical branches, where the use of terminology within documents from different origins is often far from being consistent. In this paper, we report on the progress of a linguistically motivated project on the onomasiological re-modeling of the terminological resources for the grammatical information system grammis. We present the design principles and the results of their application. In particular, we focus on new features for the authoring backend and discuss how these innovations help to evaluate existing, loosely structured terminological content, as well as to efficiently deal with automatic term extraction. Furthermore, we introduce a transformation to a future SKOS representation. We conclude with a positioning of our resources with regard to the Knowledge Organization discourse and discuss how a highly complex information environment like grammis benefits from the re-designed terminological KOS.
Just like most varieties of West Germanic, virtually all varieties of German use a construction in which a cognate of the English verb 'do' (standard German 'tun') functions as an auxiliary and selects another verb in the bare infinitive, a construction known as 'do'-periphrasis or 'do'-support. The present paper provides an Optimality Theoretic (OT) analysis of this phenomenon. It builds on a previous analysis by Bader and Schmid (An OT-analysis of 'do'-support in Modern German, 2006) but (i) extends it from root clauses to subordinate clauses and (ii) aims to capture all of the major distributional patterns found across (mostly non-standard) varieties of German. In so doing, the data are used as a testing ground for different models of German clause structure. At first sight, the occurrence of 'do' in subordinate clauses, as found in many varieties, appears to support the standard CP-IP-VP analysis of German. In actual fact, however, the full range of data turns out to challenge, rather than support, this model. Instead, I propose an analysis within the IP-less model by Haider (Deutsche Syntax - generativ. Vorstudien zur Theorie einer projektiven Grammatik, Narr, Tübingen, 1993 et seq.). In sum, the 'do'-support data will be shown to have implications not only for the analysis of clause structure but also for the OT constraints commonly assumed to govern the distribution of 'do', for the theory of non-projecting words (Toivonen in Non-projecting words, Kluwer, Dordrecht, 2003), and for research on grammaticalization.
This article analyses the interplay of verbal-audible and visible-kinesic practices used in everyday storytelling. In a conversation-analytic study of video recordings of German everyday conversations, it shows the range of everyday narrative practices in face-to-face communication. This is done by way of two examples in which the opening, elaboration, and closing of the narrative are accomplished under different sequential and multimodal conditions. The study underlines, on the one hand, the indexicality of everyday narrative practices and, on the other, the need for an interactional narratology that analyses and conceptualizes these practices as the product of linguistic, embodied, and spatial resources and of the collaboration of several participants.
In this article, we discuss the implications one faces when studying the smallest forms of situated sociation, which we call communicative minimal forms. Communicative minimal forms are brief, jointly constituted interactional events lasting only a few seconds. Despite their brevity, they exhibit a complex interactional structure. They also carry a clear social implication and a value of their own. In the case examined here, in which passers-by look through an open window into a private room and are caught doing so, this social implicativeness shows itself as moral communication in the sense of the interactive handling of one's own misconduct.
We present a method to identify and document a phenomenon on which there is very little empirical data: German phrasal compounds occurring in the form of a single token (without punctuation between their components). Relying on linguistic criteria, our approach requires an operational notion of compounds which can be systematically applied, as well as (web) corpora which are large and diverse enough to contain rarely seen phenomena. The method is based on word segmentation and morphological analysis and takes advantage of a data-driven learning process. Our results show that coarse-grained identification of phrasal compounds is best performed with empirical data, whereas fine-grained detection could be improved with a combination of rule-based and frequency-based word lists. Along with the characteristics of web texts, the orthographic realizations seem to be linked to the degree of expressivity.
This chapter investigates policies which shape the role of the German language in contemporary Estonia. Whereas German for many centuries played an important role as the language of the economic and cultural elite in Estonia, it declined severely in importance throughout the twentieth century. Against this historical background, the paper provides an overview of the current functions of German and attitudes towards it, and it discusses how these functions and attitudes are influenced by the policies of various actors inside and outside Estonia. The paper argues that German continues to play a significant role: while German is no longer a lingua franca, it still enjoys a number of functions and prestige in clearly defined niches involving communication within German-speaking circles or between Estonians and Germans. The interplay of the language policies of the Estonian and the German-speaking states, as well as of semi-state and private institutions, succeeds in maintaining German as an additional language in contemporary Estonia.
We present a supervised machine learning system for author name disambiguation (AND) which tackles semantic similarity between publication titles by means of word embeddings. Word embeddings are integrated as external components, which keeps the model small and efficient, while allowing for easy extensibility and domain adaptation. Initial experiments show that word embeddings can improve the Recall and F score of the binary classification sub-task of AND. Results for the clustering sub-task are less clear, but also promising and overall show the feasibility of the approach.
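One common way to turn word embeddings into a title-similarity feature is to average the word vectors of each title and compare the averages by cosine similarity. The Python sketch below illustrates that idea only; it is not the system described in the abstract. The three-dimensional vectors are invented toy values, the function names are hypothetical, and the supervised classifier that would consume the feature is omitted.

```python
import math
from typing import Dict, List, Optional

# Toy 3-dimensional vectors, invented for illustration; a real system would
# load pretrained word embeddings instead.
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "neural": [0.9, 0.1, 0.0],
    "network": [0.8, 0.2, 0.1],
    "deep": [0.85, 0.15, 0.05],
    "learning": [0.7, 0.3, 0.1],
    "medieval": [0.0, 0.1, 0.9],
    "poetry": [0.1, 0.0, 0.8],
}


def title_vector(title: str) -> Optional[List[float]]:
    """Average the vectors of all known words; None if no word is known."""
    vecs = [TOY_EMBEDDINGS[w] for w in title.lower().split() if w in TOY_EMBEDDINGS]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def title_similarity(title_1: str, title_2: str) -> float:
    """Similarity feature for a pair of publication titles."""
    v1, v2 = title_vector(title_1), title_vector(title_2)
    if v1 is None or v2 is None:
        return 0.0  # no known words: fall back to "no evidence"
    return cosine(v1, v2)


if __name__ == "__main__":
    print(title_similarity("Neural network", "Deep learning"))
    print(title_similarity("Neural network", "Medieval poetry"))
```

In a pairwise setting, such a score would be one feature among others (for example co-author overlap) passed to a binary classifier that decides whether two publications share an author.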
The English language has taken advantage of the Digital Revolution to establish itself as the global language; however, only 28.6% of Internet users speak English as their native language. Machine Translation (MT) is a powerful technology that can bridge this gap. In development since the mid-20th century, MT has become available to every Internet user in the last decade, due to free online MT services. This paper aims to discuss the implications that these tools may have for the privacy of their users and how they are addressed by EU data protection law. It examines the data-flows in respect of the initial processing (both from the perspective of the user and the MT service provider) and potential further processing that may be undertaken by the MT service provider.
This chapter analyses the impact of political decentralization in a state on the position of ethnic and linguistic minorities, in particular with regard to the role of parliamentary assemblies in the political system. It relates a number of typical functions of parliaments to the specific needs of minorities and their languages. The most important of these functions are the representation of the minority and responsiveness to the minority’s needs. The chapter then discusses six examples from the European Union (and Norway) which prototypically represent different types of parliamentary decentralization: the ethnically defined Sameting in Norway and its importance for the Sámi population, the Scottish Parliament and its role for speakers of Scottish Gaelic, the German regional parliaments of the Länder of Schleswig-Holstein and Saxony and their impact on the Frisian and Sorbian minorities respectively, the autonomy of predominantly German-speaking South Tyrol within the Italian state, and finally the situation of the speakers of Latgalian in Latvia, where a decentralized parliament is missing. The chapter also suggests how these situations might be compared with those of minorities in Russia. It finally argues that political decentralization may indeed empower minorities to gain a greater voice in their states, even if much ultimately depends on individual factors in each situation and on the attitudes of the majority population and the political center.
In this article, we explore the feasibility of extracting suitable and unsuitable food items for particular health conditions from natural language text. We refer to this task as conditional healthiness classification. For that purpose, we annotate a corpus extracted from forum entries of a food-related website. We identify different relation types that hold between food items and health conditions, going beyond a binary distinction of suitability and unsuitability, and devise various supervised classifiers using different types of features. We examine the impact of different task-specific resources, such as a healthiness lexicon that lists the healthiness status of a food item and a sentiment lexicon. Moreover, we consider task-specific linguistic features that disambiguate a context in which mentions of a food item and a health condition co-occur, and compare them with standard features using bag of words, part-of-speech information and syntactic parses. We also investigate to what extent individual food items and health conditions correlate with specific relation types and try to harness this information for classification.
Contemporary studies on the characteristics of natural language benefit enormously from the increasing amount of linguistic corpora. Aside from text and speech corpora, corpora of computer-mediated communication (CMC) position themselves between orality and literacy, and beyond that provide insight into the impact of "new", mainly internet-based media on language behaviour. In this paper, we present an empirical attempt to work with annotated CMC corpora for the explanation of linguistic phenomena. In concrete terms, we implement machine learning algorithms to produce decision trees that reveal rules and tendencies about the use of genitive markers in German.