A constructicon, i.e., a structured inventory of constructions, essentially aims at documenting the functions of lexical and grammatical constructions. Among other parameters, so-called constructional collo-profiles, as introduced by Herbst (2018, 2020), are conclusive for determining constructional meanings. They provide information on how relevant individual words are for construction slots, hint at usage preferences of constructions, and serve as a helpful indicator of semantic peculiarities of constructions. However, even though collo-profiles constitute an indispensable component of constructicon entries, they pose major challenges for constructicographers: for a constructicographic enterprise, it is not feasible to conduct collostructional analyses for hundreds or even thousands of constructions. In this article, we introduce a procedure based on the large language model BERT that makes it possible to predict collo-profiles without having to extensively annotate instances of constructions in a given corpus. Specifically, by discussing the constructions X macht Y ADJP (‘X makes Y ADJ’, e.g. he drives him crazy) and N1 PREP N1 (e.g., bumper to bumper, constructions over constructions), we show how the developed automated system generates collo-profiles based on a limited number of annotated instances. Finally, we place collo-profiles alongside other dimensions of constructional meanings included in the German Constructicon.
The internationally renowned conference of the European Association for Lexicography (EURALEX) has taken place every two years for the past 39 years. Last year’s conference, held July 12th–16th, 2022, marked EURALEX’s 20th edition, and more than 200 international participants gathered at Mannheim Palace to discuss current developments, learn about new projects, and present their own work — either in lexicography or in one of the many applied or neighboring disciplines such as corpus and computational linguistics.
What is the subject of German linguistics? This seemingly simple question has no obvious answer. In the ZGL’s first issue, the editors required contributions to cover the whole of the German language and to be theoretically sound but application-orientated, whereas the current ZGL homepage defines its subject matter as the German language, past and present, in all its differentiations.
Looking through the fifty volumes of ZGL, three relationships can be identified that presumably shed light on the role of language, in particular the German language: language and mind; language and language use; language and culture. Though of a different systematic type, language and data should be added as an increasingly important pairing for conceptualizing language. On this basis, I also discuss the position of linguistic studies of the German language, as mirrored in the ZGL volumes, between the social, cultural and natural sciences, as well as the corresponding epistemic approaches, such as explaining vs. understanding.
This article examines person nouns that allow feminine derivation (Movierung) when they are used in a predicative with reference to a female subject (type sie ist Käufer/Käuferin, 'she is a buyer'). In such cases, either the derived feminine form or its masculine base can be used; so far, however, statements about the actual usage of the two variants have been contradictory, and hardly any data were available. The study shows that feminine derivation has been the normal case in the predicative construction since Old High German, and still is. A few niches can nevertheless be identified in which underived forms are somewhat more frequent: by far the highest value is found in female self-reference, whereas masculine forms with female third-person singular subjects are, with one exception, largely uncommon. That exception is the official language use of the former GDR. Texts of the 20th and 21st centuries that are aimed at the public and do not originate from the GDR show a decline, presumably socially conditioned, in the already rare underived forms from the mid-1970s onwards.
Canadian heritage German across three generations: A diary-based study of language shift in action
(2019)
It is well known that migration has an effect on language use and language choice. If the language of origin is maintained after migration, it tends to change in the new contact setting. Often, migrants shift to the new majority language within a few generations. The current paper examines a diary corpus containing data from three generations of one German-Canadian family, ranging from 1867 to 1909, and covering the second to fourth generation after immigration. The paper analyzes changes that can be observed between the generations, with respect to the language system as well as to the individuals’ decisions on language choice. The data not only offer insight into the dynamics of acquiring a written register of a heritage language and the eventual shift to the majority language; they also allow us to identify different linguistic profiles of heritage speakers within one community. The paper discusses how these profiles can be linked to the individuals’ family backgrounds and how the combination of these backgrounds may have contributed to giving up the heritage language in favor of the majority language.
This paper presents observations on the phonetic realisations of the German particles ja – ‘yes’ and naja – approximately ‘well’. As part of a large-scale study on the particle ja, we identified numerous instances in the dataset that had been orthographically transcribed as ja, but were phonetically realised as [nja]. Using phonetic and functional parameters, we explore the question whether these instances can be attributed to either the lexeme ja or naja. While phonetic measurements yield ambivalent results, analyses of pragmatic parameters such as function and turn position seem to indicate that [nja] was predominantly intended to be ja, although some functional differences between ja and [nja] could also be identified.
Gesprochene Lernerkorpora: Methodisch-technische Aspekte der Erhebung, Erschließung und Nutzung
(2022)
This article provides an overview of methodological and technical issues that arise in the collection, indexing and use of spoken learner corpora, i. e. corpora containing spoken utterances of learners of a target language. After an introductory discussion of the most important special features of this type of corpus that distinguish it from written language learner corpora and spoken corpora with L1 speakers, we will go into more detail on questions of corpus design. The main part of the paper is then an overview of the methodological and technical procedures of the individual steps of collecting, indexing, providing and using spoken learner corpora. The main aim of this overview is to highlight practices that can be considered best practices according to the current state of research. Finally, we outline the challenges that still exist for this type of corpus.
For language-based research in the humanities and social sciences, CLARIN provides a research infrastructure that is tailored to the highly heterogeneous research data in these fields. With tools for finding data, preparing them in conformance with standards and preserving them sustainably, and by providing virtual research environments for the collaborative creation and analysis of research data, CLARIN supports all essential aspects of data management and data archiving. These CLARIN services are accompanied by advisory and training measures.
Dictionary usage research views dictionaries primarily as tools for solving linguistic problems. A large proportion of dictionary use now takes place online and can thus be easily monitored using tracking technologies. Using the data gathered through such tracking, we hope to optimize user experiences of dictionaries and other linguistic resources. Usage statistics are also used for the external evaluation of linguistic resources. In this paper, we pursue the following three questions from a quantitative perspective: (1) What new insights can we gain from collecting and analysing usage data? (2) What limitations of the data and/or the collection process do we need to be aware of? (3) How can these insights and limitations inform the development and evaluation of linguistic resources?
In this paper we present the results of a survey conducted among students of German Philology at Adam Mickiewicz University in Poznań in the years 2015–2017. The target group was composed of first-semester students from whom we collected data about their lexicographical competence at the start of the program. The results contain some interesting findings: students prefer online dictionaries, but the number of students using print dictionaries is comparable, and we also observed a rising number of students who use smartphone applications. The aim of the survey is to provide information for university instructors who teach German as a foreign language (DaF) and lexicography.
With their conference on peasant comedies of the 17th century, Markus Denkler (Münster) and Michael Elmentaler (Kiel) pursued an unusual concept that enabled a particularly intensive scholarly exchange: the shared textual basis for all contributors consisted of twelve High and Low German peasant comedies from the 17th century (ca. 1593–1701). These are dramas with peasant characters that have a comedic orientation and are written in prose. All speakers were given access to the collection in advance and subsequently developed the research questions for their talks from it. In terms of content, three blocks emerged. Two contributions from literary studies situated the text type in literary and cultural history. These were followed by an extensive block on historical dialogue research and pragmatics and a somewhat shorter one on historical variety linguistics and grammar.
This article investigates the use of überhaupt and sowieso in German and Dutch. These two words are frequently classified as particles, if only because of their pragmatic functions. The frequent use of particles is considered a specific trait common to German and Dutch, and the description of their semantics and pragmatics is notoriously difficult. It is unclear whether both particles have the same meaning in Dutch (where they are loanwords) and German, whether they can fulfil the same syntactic functions and to what extent the (semantic and pragmatic) functions of überhaupt and sowieso overlap. There has already been linguistic research on überhaupt and sowieso by Fisseni (2009) using the world-wide web and by Bruijnen and Sudhoff (2013) using the EUROPARL corpus. In the present study we critically evaluated the corpus study, integrating information on original utterance language and discussing the adequacy of this corpus. Moreover, we conducted an experimental survey collecting subjective-intuitive judgements in three dimensions, thus gathering more data on sparse and informal constructions.
By using these complementary methods, we obtain a more nuanced picture of the use of überhaupt and sowieso in both languages: On the one hand, the data show where the use of both words is more similar and on the other hand, differences between the languages can also be discerned.
Novel formats of construction-based description hold great potential for phenomena that fall through the cracks in traditional kinds of linguistic reference works. Using the example of German verb argument structure constructions with a prepositional object, we demonstrate that a construction-based description of such phenomena is superior to existing lexicographic and grammaticographic treatments, but that it also poses a number of new problems. The most fundamental of these relates to the fact that construction-based analyses can be proposed on different levels of abstraction. We illustrate pertinent problems relating to the precise identification of constructional form and meaning and suggest a multi-layered descriptive format for web-based electronic reference constructica that can accommodate these challenges. Semantically, the proposed solution integrates both lumping and splitting perspectives on constructional grain size and permits users to flexibly zoom in and out on individual elements in the resource. Formally, it can capture variation in the number and marking of realised arguments as found in e.g. passives and transitivity alternations. Aspects of the theoretical controversy between Construction Grammar and Valency Theory are addressed where relevant, but our focus is on questions of description and the practical implementation of construction-based analyses in a suitable type of linguistic reference work.
This paper investigates emergent pseudo-coordination in spoken German. In a corpus-based study, seven verbs in the first conjunct are analyzed regarding the degree of semantic bleaching and the development of subjective or aspectual meaning components. Moreover, it is shown that each verb shows distinct tendencies for co-occurrences, especially with deictic adverbs in the first conjunct and with specific verbs and verb classes in the second conjunct. It is argued that pseudo-coordination is originally motivated by the need for ‘chunking’ in unplanned speech and that it is still prominently used in this function in German, in contrast to languages in which pseudo-coordination is grammaticalized further.
Combining data from different diachronic corpora poses particular methodological challenges, which are examined in the present studies. These include the alignment of metadata and their categorizations, the behaviour of known phenomena across temporally overlapping corpora, and the formulation of comparable search queries. On the basis of six case studies on graphematic, lexical, morphological and syntactic phenomena in corpora of (Early) New High German, the possibilities and problems of working diachronically across corpora are worked out.
In the first volume of Corpus Linguistics and Linguistic Theory, Gries (2005: 285) asked whether corpus linguists should abandon null-hypothesis significance testing (Gries 2005. Null-hypothesis significance testing of word frequencies: A follow-up on Kilgarriff. Corpus Linguistics and Linguistic Theory 1(2). doi:10.1515/cllt.2005.1.2.277. http://www.degruyter.com/view//cllt.2005.1.issue-2/cllt.2005.1.2.277/cllt.2005.1.2.277.xml). In this paper, I want to revive this discussion by defending the argument that the assumptions that allow inferences about a given population (in this case, the studied languages) based on results observed in a sample (in this case, a collection of naturally occurring language data) are not fulfilled. As a consequence, corpus linguists should indeed abandon null-hypothesis significance testing.
This paper presents types and annotation layers of reply relations in computer-mediated communication (CMC). Reply relations hold between post units in CMC interactions and describe references from one given post to a previous post. We classify three types of reply relations in CMC interactions: first, technical replies, i.e. the possibility to reply directly to a previous post by clicking a ‘reply’ button; second, indentations, e.g. in wiki talk pages, in which users insert their contributions into the existing talk page by indenting them; and third, interpretative reply relations, i.e. cases in which the reply action is not realised formally but signalled by other structural or linguistic means such as address markers (‘@’), greetings, citations and/or Q-A structures. We take a look at existing practices in the description and representation of such relations in corpora and at examples from chat, Wikipedia talk pages, Twitter and blogs. We then provide an annotation proposal that combines the different levels of description and representation of reply relations and adheres to the schemas and practices for encoding CMC corpus documents within the TEI framework as defined by the TEI CMC SIG. It constitutes a prerequisite for correctly identifying higher levels of interactional relations such as dialogue acts or discussion trees.
Lexicographic and lexical resources on German are developed at many different institutions. These include the Dudenverlag, which produces the most frequently consulted dictionaries of contemporary German with the printed dictionaries of the Duden series and with "Duden online", and the Union der deutschen Akademien (Union of the German Academies), under whose umbrella numerous historical as well as synchronic dictionaries of German are compiled at various individual academies (e.g. the "Digitales Wörterbuch der deutschen Sprache", the "Wörterbuchnetz" and the planned information system of the new "Zentrum für digitale Lexikographie der deutschen Sprache"). At the Institut für Deutsche Sprache in Mannheim, too, scholarly vocabulary-related resources on German are developed and presented to the (specialist) public under the umbrella of OWID, the "Online-Wortschatz-Informationssystem Deutsch". Although in OWID we have concentrated on resources for specialized areas of vocabulary, we reach users in a wide variety of countries around the world. We would like to take this opportunity to introduce our resources in OWID and OWIDplus to the readers of ZGL in more detail.
Tourlex: ein deutsch-italienisches Fachwörterbuch zur Tourismussprache für italienische DaF-Lerner
(2019)
Tourlex is a specialized bilingual online dictionary under construction, hosted at the University of Mannheim, with a particular focus on collocations and multi-word units. The languages included are German and Italian, but because of the need for online dictionaries of tourism language (Flinz 2015: 56) the framework is open to the inclusion of other languages. Tourlex is a corpus-based dictionary, i.e. the primary sources will be corpora, in particular a proper bilingual comparable corpus analysed with the tools Sketch Engine and Lexpan, and the freely accessible corpus DeReKo. The aim of this paper is to give an overview of the main actions (both completed and planned), following the phases of the lexicographical process of a dictionary under construction. The description of each phase is enriched by examples taken from the project, which also show how the decisions taken to satisfy the needs of the user, the Italian learner of German as a foreign language, have influenced the microstructure of the entries. We conclude with a final reflection on the data, facts, and ongoing problems.
Plädoyer für die Entwicklung einer digital-lexikografischen Kompetenz im Fremdsprachenunterricht
(2019)
The aim of this paper is to promote an explicit and active development of digital-lexicographical competence in foreign language teaching. The results of two online surveys conducted as part of the research project DICONALE-COMBIDIGILEX in connection with the teaching and learning process of German as a foreign language (= DaF) provide a comparative insight into the behaviour and attitudes of both teachers and learners of DaF on the topic “Use of lexicographical resources in the process of DaF-acquisition”. The evaluation of the surveys shows that digital-lexicographical competences in the process of DaF-acquisition must be promoted more intensively, since the existing lexicographic offer is not optimally used for teaching purposes by either teachers or learners. To this end, the following three main lexicographical competences will be examined from a methodological-didactic and application-oriented perspective: (i) adequate selection of the electronic resource with regard to the communicative situation, (ii) development of disambiguation strategies for reception in L2 or translation from L2 and (iii) development of strategies for production and translation into L2. This research will ultimately lead to a debate on the use of the dictionary in the digital environment in DaF teaching and discuss its actual influence on the learning process.
The article shows how the topic of dictionaries can be dealt with in German language teaching and how this subject has the potential to acquaint learners with a descriptive and data-driven perspective on language. The project Denkwerk, realized as a cooperation among the Institute for German Language, the University of Mannheim and two regional secondary schools, fostered the students’ intellectual curiosity and their interest in discovering linguistic details. Using empirical methods like corpus analysis, pupils learned both how to write wiki-based dictionary articles on their own and how to publish them in the Denktionary, the dictionary of the project. Our contribution describes the didactic and organisational framework of the project, its aims and contents, its schedule of events, as well as the structure of dictionary articles in the Denktionary, and the observed advantages of such a wiki-based system.
In the past two decades, more and more dictionary usage studies have been published, but most of them deal with the questions of what users appreciate about dictionaries, which dictionaries they use and which information they need in specific situations. These studies presuppose that users indeed consult lexicographic resources. However, language teachers and lecturers of linguistics often have the impression that students use too few high-quality dictionaries in their everyday work. Against this background, we started an international cooperation project to collect empirical data evaluating that impression. Our aim was to evaluate what students (here from the Romance language area) actually do when they correct language problems. We used a new methodological setting to do this (screen recording with a thinking-aloud task). The empirical data we gained offer a broad insight into what language users really do when solving language-related tasks today.
In the project LeGeDe ("Lexik des gesprochenen Deutsch"), we are developing a corpus-based lexicographical resource focusing on features of the lexicon of spoken German. To investigate the expectations of future users, two studies were conducted: interviews with a smaller group of experts and a large-scale online survey. We report on selected results, mainly from the online survey and with a focus on the learning perspective. We want to show whether and to what extent the L2 learners’ expectations differ from those of native speakers and in which aspects the two groups agree. We also want to give an outlook on the possibilities that will be available to learners in the planned lexicographical resource.
Brief an die Herausgeber zu den vorgelegten Neuregelungsvorschlägen zur deutschen Rechtschreibung
(1995)
This article evaluates the terminological component (TC) of the grammis portal on German grammar developed by the Institut für Deutsche Sprache. The TC is integrated into grammis to facilitate non-linguists’ access to the main components of the portal: Grammar in questions and answers, and the Systematic Grammar. The TC thus has the potential to be an extremely useful and important grammis component. We discuss to what extent the TC achieves its goals, and make some suggestions as to how it could be improved. The most important aspects considered in the evaluation are: (a) TC completeness and consistency, (b) accessibility and usability of definitions and index, (c) integration of the TC with the overall system.
In multimodal scholarly presentations supported by presentation software, spoken and written language, various visualizations on the projected slides as well as the contributors’ gestures and facial expressions form a meaningful whole. On the one hand, communication scientists as well as linguists have for a relatively long time neglected the presentation as a complex form of communication. On the other hand, since Tufte (2003), columnists of major German newspapers have been dealing with the question of the value, the quality and the place of PowerPoint in science; they have even tried to answer the question of whether PowerPoint is evil or not.
Systematic empirical research on presentation practice is perceived as fundamentally lacking. Grabowski also drew attention to this desideratum in two critical articles (Grabowski 2003, 2008). Various still unanswered questions motivated a number of experiments (in the summer of 2010) analyzing the knowledge and learning effects and the communicative impact of scientific presentations. The general aim of these experiments was to conduct empirical research on selected presentations in order to find out what kind of presentation is successful. The main interest is to find out which model of scholarly presentation produces the best results regarding learning effect and communicative impact.
In recent times, presentations have attracted scholarly attention as a new form of communication. When abstract structures or relationships are visualized in scholarly presentations using diagrams, different medial layers of meaning are conjoined in a very special way. The present paper first examines the multimodal structure of presentations and the mechanisms of establishing cross-modality coherence. It then discusses the results of a reception experiment which give rise to the assumption that multimodality can in fact improve the understanding of scholarly presentations. In the final part of the paper, the production of an abstract visualization in a scholarly presentation is exemplified with regard to the solution of disambiguation and linearization problems. We claim that abstract visualizations in presentations are used by the speaker to produce narratives, and that without such narratives this kind of visualization cannot be understood properly.
Using the Google Ngram Corpora for six different languages (including two varieties of English), a large-scale time series analysis is conducted. It is demonstrated that diachronic changes of the parameters of the Zipf–Mandelbrot law (and the parameter of the Zipf law, all estimated by maximum likelihood) can be used to quantify and visualize important aspects of linguistic change (as represented in the Google Ngram Corpora). The analysis also reveals that there are important cross-linguistic differences. It is argued that the Zipf–Mandelbrot parameters can be used as a first indicator of diachronic linguistic change, but more thorough analyses should make use of the full spectrum of different lexical, syntactical and stylometric measures to fully understand the factors that actually drive those changes.
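As an illustration of what maximum-likelihood estimation of Zipf–Mandelbrot parameters can look like, the following is a minimal sketch, not the procedure used in the study. It assumes the common parameterization in which the probability of the word at rank r is proportional to (r + beta)^(-alpha), normalized over the observed ranks. The function names (`neg_log_likelihood`, `fit_alpha`, `fit_zipf_mandelbrot`), the beta grid and the search bounds are hypothetical choices made for this example. For a fixed beta the negative log-likelihood is convex in alpha, so a ternary search suffices; beta is then chosen from a simple grid.

```python
import math


def neg_log_likelihood(freqs, log_shifted_ranks, alpha):
    """Negative log-likelihood of rank-frequency counts.

    Assumes p(r) is proportional to (r + beta) ** -alpha, where
    log_shifted_ranks[i] = log(rank_i + beta) and freqs[i] is the count
    observed at that rank.
    """
    log_norm = math.log(sum(math.exp(-alpha * lr) for lr in log_shifted_ranks))
    return sum(f * (alpha * lr + log_norm) for f, lr in zip(freqs, log_shifted_ranks))


def fit_alpha(freqs, beta, lo=0.01, hi=10.0, iterations=60):
    """Ternary search for the alpha minimizing the NLL at a fixed beta.

    With beta = 0 this estimates the single parameter of the plain Zipf law.
    The bounds lo and hi are assumptions of this sketch.
    """
    logs = [math.log(rank + beta) for rank in range(1, len(freqs) + 1)]
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if neg_log_likelihood(freqs, logs, m1) < neg_log_likelihood(freqs, logs, m2):
            hi = m2
        else:
            lo = m1
    alpha = (lo + hi) / 2
    return alpha, neg_log_likelihood(freqs, logs, alpha)


def fit_zipf_mandelbrot(freqs, beta_max=5.0, beta_step=0.1):
    """Profile-likelihood fit: grid over beta, ternary search over alpha."""
    best = None
    for j in range(int(round(beta_max / beta_step)) + 1):
        beta = j * beta_step
        alpha, nll = fit_alpha(freqs, beta)
        if best is None or nll < best[0]:
            best = (nll, alpha, beta)
    return best[1], best[2]
```

Calling `fit_zipf_mandelbrot(freqs)` on a list of counts ordered by rank returns an `(alpha, beta)` pair; repeating the fit for each year of a diachronic corpus would yield the kind of parameter time series the abstract describes. The grid over beta keeps the sketch free of external dependencies; a numerical optimizer would be the more usual choice for real data.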
Two empirical studies were carried out in the project "Lexik des gesprochenen Deutsch" (LeGeDe) at the Institute for the German Language (IDS) in Mannheim. The main goal of these studies was to shed light on people’s expectations of the planned lexicographical online resource. In the first study, selected experts were interviewed in the form of a guided interview. In the second study, a broader online survey was conducted, which was intended to reach a wider range of potential users. This contribution introduces the basic concepts of the project LeGeDe, outlines the two studies and presents selected results on four subject blocks: (i) sociodemographic data, (ii) personal use of (online) dictionaries, (iii) individual experience with the lexis of spoken language and (iv) expectations concerning a lexicographical online resource for spoken German.