Refine
Year of publication
Document Type
- Article (30)
- Part of a Book (21)
- Preprint (8)
- Conference Proceeding (6)
- Other (2)
- Working Paper (2)
- Book (1)
- Doctoral Thesis (1)
Keywords
- Korpus <Linguistik> (23)
- Deutsch (22)
- Sprachstatistik (19)
- Wortschatz (17)
- Computerunterstützte Lexikographie (12)
- COVID-19 (10)
- Lexikostatistik (9)
- Sprachwandel (9)
- Online-Medien (8)
- Vielfalt (8)
Publicationstate
- Veröffentlichungsversion (37)
- Zweitveröffentlichung (12)
- Postprint (6)
- Preprint (2)
Reviewstate
Publisher
- De Gruyter (10)
- de Gruyter (8)
- Leibniz-Institut für Deutsche Sprache (IDS) (7)
- Cornell University (4)
- IDS-Verlag (3)
- Institut für Deutsche Sprache (3)
- MDPI (3)
- Erich Schmidt (2)
- Oxford University Press (OUP) (2)
- Springer Nature (2)
Die Benutzung von Onlinewörterbüchern ist bislang wenig erforscht. Am Institut für Deutsche Sprache in Mannheim wurde versucht, diese Forschungslücke mit einem Projekt zur Benutzungsforschung zumindest zum Teil schließen (s. www.benutzungsforschung.de). Die empirischen Studien wurden methodisch sowohl in Form von Onlinefragebögen, die neben befragenden auch experimentelle Elemente enthielten, als auch anhand eines Labortests (mit Eyetracking-Verfahren) durchgeführt. Die erste Studie untersuchte generell die Anlässe und sozialen Situationen der Verwendung von Onlinewörterbüchern sowie die Ansprüche, die Nutzer an Onlinewörterbücher stellen. An der zweisprachigen Onlinestudie (deutsch/englisch) nahmen international fast 700 Probanden teil. Durch die hohe Resonanz auf die erste Studie und den daraus folgenden Wunsch, die gewonnenen Informationen empirisch zu vertiefen, richtet sich auch die die zweite Studie an ein internationales Publikum und schloss inhaltlich an die erste Studie an. Später konzentrierten sich die Studien auf monolinguale deutsche Onlinewörterbücher wie elexiko (Studien 3 und 4), sowie auf das Wörterbuchportal OWID (Studie 5). Im Vortrag werden ausgewählte Ergebnisse der verschiedenen Studien vorgestellt.
This thesis consists of the following three papers that all have been published in international peer-reviewed journals:
Chapter 3: Koplenig, Alexander (2015c). The Impact of Lacking Metadata for the Measurement of Cultural and Linguistic Change Using the Google Ngram Data Sets—Reconstructing the Composition of the German Corpus in Times of WWII. Published in: Digital Scholarship in the Humanities. Oxford: Oxford University Press. [doi:10.1093/llc/fqv037]
Chapter 4: Koplenig, Alexander (2015b). Why the quantitative analysis of dia-chronic corpora that does not consider the temporal aspect of time-series can lead to wrong conclusions. Published in: Digital Scholarship in the Humanities. Oxford: Oxford University Press. [doi:10.1093/llc/fqv030]
Chapter 5: Koplenig, Alexander (2015a). Using the parameters of the Zipf–Mandelbrot law to measure diachronic lexical, syntactical and stylistic changes – a large-scale corpus analysis. Published in: Corpus Linguistics and Linguistic Theory. Berlin/Boston: de Gruyter. [doi:10.1515/cllt-2014-0049]
Chapter 1 introduces the topic by describing and discussing several basic concepts relevant to the statistical analysis of corpus linguistic data. Chapter 2 presents a method to analyze diachronic corpus data and a summary of the three publications. Chapters 3 to 5 each represent one of the three publications. All papers are printed in this thesis with the permission of the publishers.
This paper explores speakers’ notions of the situational appropriacy of linguistic variants. We conducted a web-based survey in which we collected ratings of the appropriacy of variants of linguistic variables in spoken German. A range of quantitative methods (cluster analysis, factor analysis and various forms of visualization techniques) is applied in order to analyze metalinguistic awareness and the differences in the evaluation of written vs. spoken stimuli. First, our data show that speakers’ ratings of the appropriacy of linguistic variants vary reliably with two rough clusters representing formal and informal speech situations and genres. The findings confirm that speakers adhere to a notion of spoken standard German which takes genre and register-related variation into account. Secondly, our analysis reveals a written language bias: metalinguistic awareness is strongly influenced by the physical mode of the presentation of linguistic items (spoken vs. written).
A comparison between morphological complexity measures: typological data vs. language corpora
(2016)
Language complexity is an intriguing phenomenon argued to play an important role in both language learning and processing. The need to compare languages with regard to their complexity resulted in a multitude of approaches and methods, ranging from accounts targeting specific structural features to global quantification of variation more generally. In this paper, we investigate the degree to which morphological complexity measures are mutually correlated in a sample of more than 500 languages of 101 language families. We use human expert judgements from the World Atlas of Language Structures (WALS), and compare them to four quantitative measures automatically calculated from language corpora. These consist of three previously defined corpus-derived measures, which are all monolingual, and one new measure based on automatic word-alignment across pairs of languages. We find strong correlations between all the measures, illustrating that both expert judgements and automated approaches converge to similar complexity ratings, and can be used interchangeably.
We present studies using the 2013 log files from the German version of Wiktionary. We investigate several lexicographically relevant variables and their effect on look-up frequency: Corpus frequency of the headword seems to have a strong effect on the number of visits to a Wiktionary entry. We then consider the question of whether polysemic words are looked up more often than monosemic ones. Here, we also have to take into account that polysemic words are more frequent in most languages. Finally, we present a technique to investigate the time-course of look-up behaviour for specific entries. We exemplify the method by investigating influences of (temporary) social relevance of specific headwords.
cOWIDplus Analyse ist eine kontinuierlich aktualisierte Ressource zu der Frage, ob und wie stark sich der Wortschatz ausgewählter deutscher Online-Pressemeldungen während der Corona-Pandemie systematisch einschränkt und ob bzw. wann sich das Vokabular nach der Krise wieder ausweitet. In diesem Artikel erläutern die Autor*innen die hinter der Ressource stehende Forschungsfrage, die zugrunde gelegten Daten, die Methode sowie die bisherigen Ergebnisse.
In this paper, we present the concept and the results of two studies addressing (potential) users of monolingual German online dictionaries, such as www.elexiko.de. Drawing on the example of elexiko, the aim of those studies was to collect empirical data on possible extensions of the content of monolingual online dictionaries, e.g. the search function, to evaluate how users comprehend the terminology of the user interface, to find out which types of information are expected to be included in each specific lexicographic module and to investigate general questions regarding the function and reception of examples illustrating the use of a word. The design and distribution of the surveys is comparable to the studies described in the chapters 5-8 of this volume. We also explain, how the data obtained in our studies were used for further improvement of the elexiko-dictionary.