The successful reuse of spoken corpora must satisfy discipline-specific evaluation criteria and therefore requires a flexible corpus architecture characterized by multi-representational data (availability of an acoustic signal and a transliteration) and multi-situational data (variability of situations or tasks). These criteria are applied and discussed in a case study on /eː/ diphthongization by Polish learners of German. The case study replicates the results on /eː/ diphthongization in picture naming reported by Nimz (2016). Before reuse, further discipline-specific evaluation criteria are checked, such as multi-situationality, recording quality, extensibility, available metadata and available documentation. Following the replication study, the challenges of implementing reuse with regard to data management, workflows and data literacy in research and teaching contexts are discussed.
Albert Verdoodt: Der Stand der deutschen Sprache im Elsaß, Lothringen, Luxemburg und Ostbelgien
(1969)
This paper presents a dictionary writing system developed at the Institute for the German Language in Mannheim (IDS) for an ongoing international lexicographical project that traces the path of German loanwords in the East Slavic languages Russian, Belarusian and Ukrainian that were possibly borrowed via Polish. The results will be published in the Lehnwortportal Deutsch (LWP, lwp.ids-mannheim.de), a web portal for loanword dictionaries with German as the common donor language. The system described here is currently in use for excerpting data from a large range of historical and contemporary East Slavic monolingual dictionaries. The paper focuses on the tools that help in merging excerpts that are etymologically related to one and the same Polish etymon. The merging process involves eliminating redundancies and inconsistencies and, above all, mapping word senses of excerpted entries onto a common cross-language set of ‘metasenses’. This mapping may involve literally hundreds of excerpted East Slavic word senses, including quotations, for one ‘underlying’ Polish etymon.
The English language has taken advantage of the Digital Revolution to establish itself as the global language; however, only 28.6% of Internet users speak English as their native language. Machine Translation (MT) is a powerful technology that can bridge this gap. In development since the mid-20th century, MT has become available to every Internet user in the last decade, due to free online MT services. This paper aims to discuss the implications that these tools may have for the privacy of their users and how they are addressed by EU data protection law. It examines the data-flows in respect of the initial processing (both from the perspective of the user and the MT service provider) and potential further processing that may be undertaken by the MT service provider.
This contribution addresses a somewhat neglected aspect of the otherwise lively debate on didactic grammar: teachers' grammatical knowledge. In this area there is, in my view, a discrepancy between, on the one hand, the demands placed on the teacher's role in the wake of changed research paradigms and concrete circumstances and, on the other, a reality characterized by bodies of knowledge that are very heterogeneous, both quantitatively and qualitatively, and in part deficient. The first part of the contribution supports this claim with arguments. The second part presents the multimedia grammatical information system GRAMMIS as a possible source of grammatical knowledge that meets the needs of teachers of German as a foreign language (DaF).
The aim of this contribution is to analyse the absentive in terms of its semantic and syntactic characteristics and to distinguish it from other constructions, in particular the progressive. In doing so, I draw on, among others, the work of de Groot (2000), Krause (2002), Vogel (2007) and Abraham (2008), which represents the current approaches to analysing the German absentive. To obtain data on the absentive and its use beyond the empirical material provided by the Internet and by overheard examples, a survey of native speakers of German was carried out. A questionnaire described fifteen different scenarios. The participants (30 native speakers of German in total, aged between 21 and 63) were asked either to complete a partially given sentence or to judge which of two sentences seemed more plausible to them. In still other scenarios, respondents were expected to describe how they would react to a given situation. In addition, the results of a corpus study conducted at the Institut für Deutsche Sprache in 2008 are incorporated. In that study, all progressive attestations were extracted from the COSMAS II corpus, along with all absentive attestations for a list of 589 verbs, and stored in a database.
Scholarly research on argumentation still suffers from a glaring lack of empirical evidence. The conversation-analytic study of natural conversations reveals the difficulties of determining the boundaries of argumentation and of identifying its internal structures. Starting from the theoretical conception of the constitution of interaction in the sense of Kallmeyer and Schütze, the contribution attempts to define the object 'argumentation in conversation' on the basis of the constitutive properties of interaction itself. It turns out that participants argue when the handling of overarching action tasks is endangered or blocked by a deficit in presentation. Argumentation is internally organized in five sequential steps, and the argumentation sequence can be expanded and condensed in various ways. Sequential structure and variability ensure interactive control of what is happening and maximum flexibility, which makes argumentation a practicable, solution-oriented interactional procedure.
There are strict formal requirements for the use of a comma. However, there are none regarding the comma’s actual shape. In printed fonts, it is determined by the font’s specification. In hand-written texts though, the shape of the comma is variable; most writers choose from a set of straight, convex and concave shapes. By using a corpus of 1464 commas written by 99 individuals, we will present three case studies of persons whose comma shapes do somehow correlate with linguistic structures. With that, we might identify a few (possibly subconscious) shaping strategies. Some writers might mark a norm insecurity by a different comma form, others might mark the function of the entity which is segmented by the comma, or the comma type itself (sentence boundary, exposition or coordination).
Alles verstehen heißt alles verzeihen ('to understand everything is to forgive everything') is a sentence that in German has taken on the character of a saying, a familiar quotation (geflügeltes Wort), and that is probably based on a quotation from "Corinne ou l'Italie" by Madame de Staël (1807): (tout) comprendre c'est (tout) pardonner. This sentence was translated into German and handed down as Alles verstehen heißt alles verzeihen. The form of a saying or familiar quotation is generally very stable. The tendency towards grammatical variation is low even where such variation would be possible according to common grammatical rules.
Allgemeines: Argumentieren
(2013)
A model of grammar needs to reconcile the undesirability inherent to allomorphy, the apparent extra burden on learning and memory, with its occurrence and possible stability. OT approaches this task by positing an anti-allomorphy constraint, henceforth referred to as "OO-correspondence", which requires leveling (i.e. sameness of sound structure) in related word forms (Benua 1997). The occurrence of allomorphy then indicates crucial domination of OO-correspondence by other constraints. To assess the adequacy of this proposal it is necessary to establish the level of abstractness at which OO-correspondence applies and to examine the consequences of this decision for ranking order. While proponents of OT tacitly assume the level in question to be rather concrete, the notion of allomorphy as originally envisioned in Structuralism was defined by distinctness at a more abstract level referred to as "phonemic" (Harris 1942; Nida 1944). The basic intuition here is that the defining property of subphonemic sound properties, their conditionedness by context, entails that whatever burden they put on learning and memory is of a fundamentally different nature than that entailed by phonemic distinctness. The evidence from German supports that intuition in that leveling can be shown to target phonemic sound structure to the exclusion of subphonemic properties. Allomorphy, defined by phonemic alternation, tends to serve phonological optimization in closed class items (function words, affixes) while serving to express morphological distinctions in open class items. The key to demonstrating the correlations in question lies in the discernment of phonemic structure, which is therefore at the core of the article.
Alltagsgespräche
(2001)
Allusion
(2023)
Almanca tuhfe / Deutsches Geschenk (1916) oder: Wie schreibt man deutsch mit arabischen Buchstaben?
(2022)
Versified dictionaries are bilingual/multilingual glossaries written in verse form to teach essential words in any foreign language. In Islamic culture, versified dictionaries were produced to teach the Arabic language to the young generations of Muslim communities not native in Arabic. In the course of time, many bilingual/multilingual versified dictionaries were written in different languages throughout the Islamic world. The focus of this study is on the Turkish-German versified dictionary titled Almanca Tuhfe / Deutsches Geschenk [German Gift], published by Dr. Sherefeddin Pasha in Istanbul in 1916. This dictionary is the only dictionary in verse ever written combining these two languages. Moreover, the dictionary is one of the few texts containing German words written in Arabic letters (applying Ottoman spelling conventions). The study concentrates on the way German words are spelled and tries to find out whether Sherefeddin Pasha applied something like fixed rules to write the German lexemes.
For the analysis of spoken language, computerized language corpora that permit qualitative and quantitative studies have been made available in recent decades. The next "challenge" for corpus linguistics today is the use of the "World Wide Web" as an inexhaustible database. Theoretical considerations on the potential of the WWW are followed by a practical example: the use of German spoken-language relative constructions in "web forums".
Altern und Identitätsarbeit
(2009)
Ageing is a task that all people have to cope with, albeit in different ways, and in which they actively take part. Ageing is therefore not something that merely happens to or befalls people; rather, it takes place in a social process in which those involved engage with ageing and shape it interactively. As a task, ageing thus also implies reflecting on the changes that occur over the course of a life and working through them interactively and communicatively. In coping with these changes communicatively, identity work is carried out at the same time and aspects of age identity are formed. Engagement with the identity features of the middle generation plays a central role in this. The contribution models these interrelations between ageing, communication and identity work.
In this volume, ageing is examined as a task that all people have to cope with, albeit in different ways, and in which they actively take part. Ageing is therefore not something that merely happens to or befalls a person; rather, it takes place in a social process in which those involved engage with ageing and shape it interactively. As a task, ageing thus also implies reflecting on the changes that occur over the course of a life and working through them interactively and communicatively. In coping with these changes communicatively, identity work is carried out at the same time and aspects of age identity are formed. These interrelations between ageing, communication and identity work are brought out on the basis of excerpts from authentic conversations and examined using conversation-analytic methods. In the appendix, two long transcript excerpts give insight into the ways older people communicate and provide material for further analyses.
Am Anfang ist das Wort
(2017)
Am Anfang war die Lücke
(2012)
When reading, one stumbles over the inconspicuous article den. Shouldn't that be dem? Correct. The locative phrase am Stadioneingang and the temporal phrase am Sonntag are in the dative, as can be seen unambiguously from the definite article dem, which here has fused with the preposition an to form am. And the article that follows the comma and introduces the afterthought known as a 'loose apposition' (lockere or lose Apposition) likewise refers to Stadioneingang or Sonntag and should agree with this head noun, that is, it should also be in the dative and not, as in the examples, in the accusative.
We describe a simple and efficient Java object model and application programming interface (API) for (possibly multi-modal) annotated natural language corpora. Corpora are represented as elements like Sentences, Turns, Utterances, Words, Gestures and Markables. The API allows linguists to access corpora in terms of these discourse-level elements, i.e. at a conceptual level they are familiar with, with the flexibility offered by a general purpose programming language. It is also a contribution to corpus standardization efforts because it is based on a straightforward and easily extensible data model which can serve as a target for conversion of different corpus formats.
Wortgeschichte digital (Digital Word History) is an emerging historical dictionary of the German language that focuses on describing semantic shifts from about 1600 through today. This article provides deeper insight into the dictionary’s “cross-reference clusters,” one of its software tools that performs visualization of its reference network. Hence, the clusters are a part of the project’s macrostructure. They serve as both a means for users to find entries of interest and a tool to elucidate relations among dictionary entries. Rather than delve into technical aspects, this article focuses on the applied logics of the software and discusses the approach in light of the dictionary’s microstructure. The article concludes with some considerations about the clusters’ advantages and limitations.
The National Socialist space of interaction and communication was thus populated by communicatively constructed social figures. These included both positively connoted figures (e.g. Volksgenosse, Nationalsozialist, Parteigenosse, SA-Mann, Alter Kämpfer) and negatively connoted ones (e.g. Asozialer, Judenfreund, Schwarzer, Roter, Freimaurer). These stereotyped social figures, to which manifold positive and negative attributions were in turn attached, represented, as it were, discourse positions that were assigned to others or could be adopted (insofar as individual circumstances permitted) and that were associated with different degrees of inclusion or exclusion. The following remarks concentrate on two of these figures, which can be understood more specifically as boundary figures: Meckerer and Märzgefallene. The study examines how these two boundary figures were constructed linguistically and in which contexts and communicative situations they were appropriated and used. In both cases, the focus is extended beyond the literal expression to contemporaneous similar or closely related designations.
Terminological resources play a central role in the organization and retrieval of scientific texts. Both simple keyword lists and advanced modelings of relationships between terminological concepts can make a most valuable contribution to the analysis, classification, and finding of appropriate digital documents, either on the web or within local repositories. This seems especially true for long-established scientific fields with elusive theoretical and historical branches, where the use of terminology within documents from different origins is often far from being consistent. In this paper, we report on the progress of a linguistically motivated project on the onomasiological re-modeling of the terminological resources for the grammatical information system grammis. We present the design principles and the results of their application. In particular, we focus on new features for the authoring backend and discuss how these innovations help to evaluate existing, loosely structured terminological content, as well as to efficiently deal with automatic term extraction. Furthermore, we introduce a transformation to a future SKOS representation. We conclude with a positioning of our resources with regard to the Knowledge Organization discourse and discuss how a highly complex information environment like grammis benefits from the re-designed terminological KOS.
Taking a usage-based perspective, lexical-semantic relations and other aspects of lexical meaning are characterised as emerging from language use. At the same time, they shape language use and therefore become manifest in corpus data. This paper discusses how this mutual influence can be taken into account in the study of these relations. An empirically driven methodology is proposed that is, as an initial step, based on self-organising clustering of comprehensive collocation profiles. Several examples demonstrate how this methodology may guide linguists in explicating implicit knowledge of complex semantic structures. Although these example analyses are conducted for written German, the overall methodology is language-independent.
The paper presents the results of a joint effort of a group of multimodality researchers and tool developers to improve the interoperability between several tools used for the annotation and analysis of multimodality. Each of the tools has specific strengths so that a variety of different tools, working on the same data, can be desirable for project work. However this usually requires tedious conversion between formats. We propose a common exchange format for multimodal annotation, based on the annotation graph (AG) formalism, which is supported by import and export routines in the respective tools. In the current version of this format the common denominator information can be reliably exchanged between the tools, and additional information can be stored in a standardized way.
This paper presents the results of a joint effort of a group of multimodality researchers and tool developers to improve the interoperability between several tools used for the annotation and analysis of multimodality. Each of the tools has specific strengths so that a variety of different tools, working on the same data, can be desirable for project work. However this usually requires tedious conversion between formats. We propose a common exchange format for multimodal annotation, based on the annotation graph (AG) formalism, which is supported by import and export routines in the respective tools. In the current version of this format the common denominator information can be reliably exchanged between the tools, and additional information can be stored in a standardized way.
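To make the annotation graph (AG) idea concrete, here is a minimal Python sketch of the general formalism: time-anchored nodes connected by labelled arcs, grouped into tiers. It is an illustration under stated assumptions and not the exchange format proposed in the paper; the class names, the `tier` field and the sample data are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Node:
    id: str
    offset: Optional[float] = None  # time anchor in seconds; None if unanchored


@dataclass(frozen=True)
class Arc:
    start: str  # id of the start node
    end: str    # id of the end node
    tier: str   # annotation layer, e.g. "words" or "gesture" (hypothetical names)
    label: str  # annotation value


@dataclass
class AnnotationGraph:
    nodes: dict = field(default_factory=dict)
    arcs: list = field(default_factory=list)

    def add_node(self, node_id, offset=None):
        self.nodes[node_id] = Node(node_id, offset)

    def add_arc(self, start, end, tier, label):
        # An arc may only connect nodes that already exist in the graph.
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("both endpoints must be existing nodes")
        self.arcs.append(Arc(start, end, tier, label))

    def tier(self, name):
        """Return (start_offset, end_offset, label) triples for one tier."""
        return [
            (self.nodes[a.start].offset, self.nodes[a.end].offset, a.label)
            for a in self.arcs
            if a.tier == name
        ]


# Invented sample data: two words and one overlapping gesture.
graph = AnnotationGraph()
graph.add_node("n0", 0.0)
graph.add_node("n1", 0.8)
graph.add_node("n2", 1.5)
graph.add_arc("n0", "n1", "words", "hello")
graph.add_arc("n1", "n2", "words", "there")
graph.add_arc("n0", "n2", "gesture", "wave")
```

Because every tier shares the same nodes, a tool that only understands the "common denominator" (time-aligned labels per tier) can read `graph.tier("words")` and ignore the rest.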
We investigate whether prototypicality or prominence of semantic roles can account for role-related effects in sentence interpretation. We present two acceptability-rating experiments testing three different constructions: active, personal passive and DO-clefts involving the same type of transitive verbs that differ with respect to the agentive role features they select. Our results reveal that there is no cross-constructional advantage for prototypical roles (e.g., agents), hence disconfirming a central tenet of role prototypicality. Rather, acceptability clines depend on the construction under investigation, thereby highlighting different role features. This finding is in line with one core assumption of the prominence account stating that role features are flexibly highlighted depending on the discourse function of the respective construction.
This paper aims to describe different patterns of syntactic extensions of turns-at-talk in mundane conversations in Czech. Within interactional linguistics, same-speaker continuations of possibly complete syntactic structures have been described for typologically diverse languages, but have not yet been investigated for Slavic languages. Based on previously established descriptions of various types of extensions (Vorreiter 2003; Couper-Kuhlen & Ono 2007), our initial description shall therefore contribute to the cross-linguistic exploration of this phenomenon. While all previously described forms for continuing a turn-constructional unit seem to exist in Czech, some grammatical features of this language (especially free word order and strong case morphology) may lead to problems in distinguishing specific types of syntactic extensions. Consequently, this type of language allows for critically evaluating the cross-linguistic validity of the different categories and underlines the necessity of analysing syntactic phenomena within their specific action contexts.
The paper presents the results of a survey on lexicographic practices and lexicographers’ needs across Europe that was conducted in the context of the Horizon 2020 project European Lexicographic Infrastructure (ELEXIS) among the observer institutions of the project. The survey is a revised and upgraded version of the survey which was originally conducted among ELEXIS lexicographic partner institutions in 2018 (Kallas et al. 2019a). The main goal of this new survey was to complement the data from the ELEXIS lexicographic partner institutions in order to get a more complete picture of lexicographic practices both for born-digital and retro-digitised resources in Europe. The results offer a detailed insight into many aspects of the lexicographic process at European institutions, such as funding, training, staff, lexicographic expertise, software and tools. In addition, the survey reflects on current trends in lexicography and reveals what institutions see as the most important emerging trends that will affect lexicography in the short-term and long-term future. Overall, the results provide valuable input informing the development of tools, resources, guidelines and training materials within ELEXIS.
This article describes an English Zulu learners’ dictionary that is part of a larger set of information tools, namely an online Zulu course, an e-dictionary of possessives (which was implemented earlier) accompanied by training software offering translation tasks on several levels, and an ontology of morphemic items categorizing and describing all parts of speech of Zulu. The underlying lexicographic database contains the usual type of lexicographic data, such as translation equivalents and their respective morphosyntactic data, but its entries have been extended with data related to the lessons of the online course in order to enable the learner to link both tools autonomously. The ‘outer matter’ is integrated into the website in the form of several texts on additional web pages (how-to-use, typical outputs, grammar tables, information on morphosyntactic rules, etc.). The dictionary comprises a modular system, where each module fulfils one of the necessary functions.
Lexical chaining has become an important part of many NLP tasks. However, the goodness of a chaining process and hence its annotation output depends on the quality of the chaining resource. Therefore, a framework for chaining is needed which integrates divergent resources in order to balance their deficits and to compare their strengths and weaknesses. In this paper we present an application that incorporates the framework of a meta model of lexical chaining exemplified on three resources and its generalized exchange format.
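The dependence of chaining output on the underlying resource can be illustrated with a small Python sketch. It uses a simple greedy chainer and treats each resource as a function that says whether two words are related; this is an assumption made for illustration, not the meta model or exchange format described in the paper, and the helper names and toy word pairs are hypothetical.

```python
def make_resource(pairs):
    """Build a toy relatedness resource from unordered word pairs."""
    table = {frozenset(p) for p in pairs}
    # Identical words always count as related (simple repetition).
    return lambda a, b: a == b or frozenset((a, b)) in table


def build_chains(words, resources):
    """Greedy chaining: attach each word to the first chain that contains
    a member related to it according to any of the given resources."""
    chains = []
    for word in words:
        for chain in chains:
            if any(res(word, member) for res in resources for member in chain):
                chain.append(word)
                break
        else:
            # No existing chain fits, so the word starts a new chain.
            chains.append([word])
    return chains


# Two invented resources with complementary coverage.
resource_a = make_resource([("car", "wheel"), ("car", "engine")])
resource_b = make_resource([("bank", "money")])

text = ["car", "bank", "wheel", "money", "engine", "car"]
chains_a_only = build_chains(text, [resource_a])
chains_combined = build_chains(text, [resource_a, resource_b])
```

With `resource_a` alone, "bank" and "money" end up in separate chains; combining both resources links them, which is the kind of deficit balancing the abstract refers to.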
The paper reports on a dictionary of German loanwords in the languages of the South Pacific that is being compiled at the Institut für Deutsche Sprache in Mannheim. The loanwords described in this dictionary mainly result from language contact between 1884 and 1914, when the German empire was in possession of large areas of the South Pacific where overall more than 700 indigenous languages were spoken. The dictionary is designed as an electronic XML-based resource from which an internet dictionary and a printed dictionary can be derived. Its printed version is intended as an ‘inverted loanword dictionary’, that is, a dictionary that, in contrast to the usual practice in loanword lexicography, lemmatizes the words of a source language that have been borrowed by other languages. Each of the loanwords will be described with respect to its form and meaning and the contact situation in which it was borrowed. Among the outer texts of the dictionary are (i) a list of all sources with bibliographic and archival information, (ii) a commentary on each source, (iii) a short history of the language contact with German for each target language, and perhaps (iv) facsimiles of source texts. The dictionary is supposed to (i) help to reconstruct the history of language contact of the source language, (ii) provide evidence for the cultural contact between the populations speaking the source and the target languages, (iii) enable linguistic theories about the systematic changes of the semantic, morphosyntactic, or phonological lexical properties of the source language when its words are borrowed into genetically and typologically different languages, and (iv) establish a thoroughly described case for testing typological theories of borrowing.
This paper discusses an investigation of how senses are ordered across eight dictionaries. A dataset of 75 words was used for this purpose, and two senses were examined for each word. The words are divided into three groups of 25 words each according to the relationship between the senses: Homonymy, Metaphor, and Systematic Polysemy. The primary finding is that WordNet differs from the other dictionaries in terms of Metaphor. The order of the senses was more often figurative/literal, and it had the highest percentage of figurative senses that were not found. We discuss leveraging another dictionary, COBUILD, to re-order the senses according to frequency.
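The re-ordering idea in the last sentence can be sketched in a few lines of Python. This is a minimal illustration assuming that a reference dictionary supplies a frequency rank per sense; the function name, the sense labels and the ranks are invented for the example and are not taken from COBUILD, WordNet or the paper's dataset.

```python
def reorder_senses(senses, reference_rank):
    """Sort senses by their rank in a reference dictionary (1 = most frequent).
    Senses missing from the reference keep their original relative order
    and are placed after all ranked senses (sorted() is stable)."""
    return sorted(senses, key=lambda sense: reference_rank.get(sense, float("inf")))


# Hypothetical entry listing the figurative sense first.
original_order = ["intelligent", "emitting light", "cheerful"]

# Hypothetical frequency ranks from a reference dictionary.
reference_rank = {"emitting light": 1, "intelligent": 2}

reordered = reorder_senses(original_order, reference_rank)
```

Placing unranked senses last is one possible design choice; a real procedure would also need a way to align senses across the two dictionaries, which this sketch takes for granted.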
Just like most varieties of West Germanic, virtually all varieties of German use a construction in which a cognate of the English verb 'do' (standard German 'tun') functions as an auxiliary and selects another verb in the bare infinitive, a construction known as 'do'-periphrasis or 'do'-support. The present paper provides an Optimality Theoretic (OT) analysis of this phenomenon. It builds on a previous analysis by Bader and Schmid (An OT-analysis of 'do'-support in Modern German, 2006) but (i) extends it from root clauses to subordinate clauses and (ii) aims to capture all of the major distributional patterns found across (mostly non-standard) varieties of German. In so doing, the data are used as a testing ground for different models of German clause structure. At first sight, the occurrence of 'do' in subordinate clauses, as found in many varieties, appears to support the standard CP-IP-VP analysis of German. In actual fact, however, the full range of data turn out to challenge, rather than support, this model. Instead, I propose an analysis within the IP-less model by Haider (Deutsche Syntax - generativ. Vorstudien zur Theorie einer projektiven Grammatik, Narr, Tübingen, 1993 et seq.). In sum, the 'do'-support data will be shown to have implications not only for the analysis of clause structure but also for the OT constraints commonly assumed to govern the distribution of 'do', for the theory of non-projecting words (Toivonen in Non-projecting words, Kluwer, Dordrecht, 2003) as well as research on grammaticalization.
This contribution presents an XML Schema for annotating a high-level narratological category: speech, thought and writing representation (ST&WR). It focusses on two aspects: firstly, the original Schema is presented as an example of the challenge of encoding a narrative feature in a structured and flexible way; secondly, ways of adapting this Schema to TEI are considered, in order to make it usable for other, TEI-based projects.
Anakoluthe dependenziell
(2008)
Analepses with topic drop are highly frequent linguistic structures in interaction. Alongside the interactional-linguistic investigation of the discourse functions, conditions and restrictions of analepses, this work centres on discourse-semantic perspectives and questions, in particular the detailed description of the semantic relations between analepses and their preceding context. The resolution of analepses must be explained in a situated way, since understanding analepses depends on their contextual embedding as well as on grammatical, semantic and pragmatic features of the utterance.
It is shown that cognitive attributions regarding the participants in an interaction are also possible with interactional-linguistic methods. The study further demonstrates that combining qualitative and quantitative methods is a productive way to work out the specific usage preferences of analeptic as compared with anaphoric utterances.
Analepses with topic-drop are frequent structures in German interaction. While hitherto the focus on analepses was a rather syntactic one, this paper deals with analeptic structures from a semantic perspective. It particularly concentrates on the semantic relations between the referents of the analepses and the prior interactional context. This analysis shows that even for rather simple analepses which just omit a constituent from the prior utterance, conceptual processes are more decisive for its interpretation than syntactic features of the antecedent constituents. This is even more the case for complex analepses that are only indirectly linked to the prior context, and for the interpretation of which hearers need to draw inferences. The paper argues that theoretical approaches like Conversation Analysis and Interactional Linguistics can profit from adopting a semantic and conceptual perspective for the interpretation of interactional structures.
Analyse des Sprachverhaltens im Redekonstellationstyp "Interview" : eine empirische Untersuchung
(1975)
This contribution examines co-constructions of a turn at talk in interaction, more specifically how a co-participant's completion of an utterance is subsequently received by the speaker whose turn was completed. Despite the clear interest that Conversation Analysis and Interactional Linguistics have taken in co-enunciation, the first speaker's evaluation of this practice has not been analysed in depth. In what follows, we focus in particular on the interactional practices that allow participants to validate a co-construction. This work stems from the ANR project SPIM (« L’imitation dans la parole »), in which we examined the function of other-repetition (repeating another speaker's utterance or part of it, as opposed to self-repetition) in sequences of turn co-construction.
This contribution attempts to assess how automatic analysis methods from current computational-linguistic research can be used in cross-linguistic grammar research. For illustration, the results of a computational-linguistic study on the comparative investigation of cleft constructions in several languages are reported and discussed in detail. Corpus access in this setting is based on a fully automatic syntactic analysis, which can additionally be examined contrastively on parallel corpora by means of statistical word alignment. Besides presenting the existing options for automatic annotation, which in my view open up promising paths for linguistic corpus access, the hope is that the concluding discussion will raise awareness that a deeper, more organic connection between the two linguistic disciplines is possible: namely, when corpus access does not rely on static, predefined tools whose behaviour the grammar researcher cannot influence, but on interactive tool use that exploits the many possibilities for adaptation offered by the underlying machine learning methods.
This paper aims to address these problems by dealing with theoretical and methodological questions concerning the national effects of the Bologna Process and the role national factors play in determining the impact of these effects. Altogether, the purpose of the paper is to serve as a starting point for future research, both as a guide for systematic and comparative empirical work on higher education and for further theoretical and methodological reasoning concerning research on (higher) education policy. As higher education research so far particularly lacks an approach allowing for a competitive and systematic falsification of theoretical arguments by clearly indicating testable and specific hypotheses as well as the variables behind the research design (Goedegebuure/Vught 1996), we propose to fall back on neighbouring disciplines, namely social science, to improve and enhance the analysis (Slaughter 2001: 398; Altbach 2002: 154; Teichler 1996a: 433, 2005: 448). Several strands of research have to be considered, namely literature on Europeanization as well as insights and approaches of studies dealing with cross-national policy convergence. Taking into account the non-obligatory and mainly intergovernmental character of the Bologna Process, the main focus of the paper is on factors related to the effects of transnational communication. The inherent goal is to extend the research agenda on higher education (McLendon 2003: 184ff) and to leave behind the restriction of analysing only a few cases by striving for a research design that allows for systematic testing and sufficient explanations of cross-national policy convergence at the interface between the Bologna Process and domestic factors.
Although there is a growing interest of policy makers in higher education issues (especially on an international scale), there is still a lack of theoretically well-grounded comparative analyses of higher education policy. Even broadly discussed topics in higher education research like the potential convergence of European higher education systems in the course of the Bologna Process suffer from a thin empirical and comparative basis. This paper aims to deal with these problems by addressing theoretical questions concerning the domestic impact of the Bologna Process and the role national factors play in determining its effects on cross-national policy convergence. It develops a distinct theoretical approach for the systematic and comparative analysis of cross-national policy convergence. In doing so, it relies upon insights from related research areas — namely literature on Europeanization as well as studies dealing with cross-national policy convergence.
This study investigated whether an analysis of narrative style (word use and cross-clausal syntax) of patients with symptoms of generalised anxiety and depression disorders can help predict the likelihood of successful participation in guided self-help. Texts by 97 people who had made contact with a primary care mental health service were analysed. Outcome measures were completion of the guided self-help programme, and change in symptoms assessed by a standardised scale (CORE-OM). Regression analyses indicated that some aspects of participants' syntax helped to predict completion of the programme, and that aspects of syntax and word use helped to predict improvement of symptoms. Participants using non-finite complement clauses with above-average frequency were four times more likely to complete the programme (95% confidence interval 1.4 to 11.7) than other participants. Among those who completed, the use of causation words and complex syntax (adverbial clauses) predicted improvement, accounting for 50% of the variation in well-being benefit. These results suggest that the analysis of narrative style can provide useful information for assessing the likelihood of success of individuals participating in a mental health guided self-help programme.
MRI data of German vowels and consonants was acquired for 9 speakers. In this paper tongue contours for the vowels were analyzed using the three-mode factor analysis technique PARAFAC. After some difficulties, probably related to what constitutes an adequate speaker sample for this three-mode technique to work, a stable two-factor solution was extracted that explained about 90% of the variance. Factor 1 roughly captured the dimension low back to high front; Factor 2 that from mid front to high back. These factors are compared with earlier models based on PARAFAC. These analyses were based on midsagittal contours; the paper concludes by illustrating from coronal and axial sections how non-midline information could be incorporated into this approach.
Analytikerwissen, Teilnehmerwissen und soziale Wirklichkeit in der ethnographischen Gesprächsanalyse
(2013)
Distributional models of word use constitute an indispensable tool in corpus-based lexicological research for discovering paradigmatic relations and syntagmatic patterns (Belica et al. 2010). Recently, word embeddings (Mikolov et al. 2013) have revived the field by making it possible to construct and analyze distributional models on very large corpora. This is accomplished by reducing the very high dimensionality of word co-occurrence contexts, which equals the size of the vocabulary, to a few dimensions, such as 100-200. However, word use and meaning can vary widely along dimensions such as domain, register, and time, and word embeddings tend to represent only the most prevalent meaning. In this paper we thus construct domain-specific word embeddings to allow for systematically analyzing variations in word use. Moreover, we also demonstrate how to reconstruct domain-specific co-occurrence contexts from the dense word embeddings.
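The idea of domain-specific distributional profiles can be illustrated with a minimal sketch. It uses raw co-occurrence counts and cosine similarity rather than trained word embeddings, so it is not the method of the paper; the function names, the window size, and the two toy "domains" are assumptions made purely for illustration.

```python
from collections import Counter
from math import sqrt


def cooccurrence_vector(sentences, target, window=2):
    """Count the words that occur within `window` tokens of `target`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, token in enumerate(tokens):
            if token != target:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


def cosine(a, b):
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy corpora standing in for two domains.
finance = ["the bank raised interest rates", "she opened an account at the bank"]
nature = ["they sat on the river bank", "the bank of the stream was muddy"]

fin_vec = cooccurrence_vector(finance, "bank")
nat_vec = cooccurrence_vector(nature, "bank")
similarity = cosine(fin_vec, nat_vec)
```

A similarity clearly below 1 between the two profiles of the same word is the kind of domain variation that a single, corpus-wide representation would blur.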
This thesis consists of the following three papers that all have been published in international peer-reviewed journals:
Chapter 3: Koplenig, Alexander (2015c). The Impact of Lacking Metadata for the Measurement of Cultural and Linguistic Change Using the Google Ngram Data Sets—Reconstructing the Composition of the German Corpus in Times of WWII. Published in: Digital Scholarship in the Humanities. Oxford: Oxford University Press. [doi:10.1093/llc/fqv037]
Chapter 4: Koplenig, Alexander (2015b). Why the quantitative analysis of diachronic corpora that does not consider the temporal aspect of time-series can lead to wrong conclusions. Published in: Digital Scholarship in the Humanities. Oxford: Oxford University Press. [doi:10.1093/llc/fqv030]
Chapter 5: Koplenig, Alexander (2015a). Using the parameters of the Zipf–Mandelbrot law to measure diachronic lexical, syntactical and stylistic changes – a large-scale corpus analysis. Published in: Corpus Linguistics and Linguistic Theory. Berlin/Boston: de Gruyter. [doi:10.1515/cllt-2014-0049]
Chapter 1 introduces the topic by describing and discussing several basic concepts relevant to the statistical analysis of corpus linguistic data. Chapter 2 presents a method to analyze diachronic corpus data and a summary of the three publications. Chapters 3 to 5 each represent one of the three publications. All papers are printed in this thesis with the permission of the publishers.
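For readers unfamiliar with the Zipf–Mandelbrot law named in Chapter 5, the following minimal sketch computes the rank probabilities it predicts, with p(r) proportional to 1 / (r + b)^a. The parameter names a and b follow common convention and are an assumption here, as are the example values; the sketch does not reproduce the fitting procedure used in the thesis.

```python
def zipf_mandelbrot_probs(n_ranks, a, b):
    """Return normalised probabilities p(r) ~ 1 / (r + b) ** a for ranks 1..n_ranks."""
    if n_ranks < 1:
        raise ValueError("n_ranks must be at least 1")
    weights = [1.0 / (rank + b) ** a for rank in range(1, n_ranks + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical parameter values, chosen only for illustration.
probs = zipf_mandelbrot_probs(n_ranks=5, a=1.0, b=2.7)
```

With b = 0 the expression reduces to the plain Zipf law; a larger b flattens the curve at the highest-frequency ranks, while a controls how quickly probability falls off with rank.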
The grammatical information system grammis combines descriptive texts on German grammar with dictionaries of specific word classes and grammatical terminology. In this paper, we describe the first attempts at analyzing user behavior for an online grammar of the German language and the implementation of an analysis and data extraction tool based on Matomo, a web analytics tool. We focus on the analysis of the keywords the users search for, either within grammis or via an external search platform like Google, and the analysis of the interaction between the text components within grammis and the integrated dictionaries. The overall results show that about 50% of the searches are for grammatical terms, and that the users shift from texts to dictionaries, mainly by using the integrated links to the dictionary of terminology within the texts. Based on these findings, we aim to improve grammis by extending its integrated dictionaries.