This article examines compounds with the relational second constituents Gatte ('husband') and Gattin ('wife') from a gender-linguistic perspective, based on manually annotated newspaper corpus material. In the analyzed corpus, women are referred to in their marital role about 12 times more often than men. Statistical analyses show that women are systematically placed in a possessive relation to the husband (Arztgattin = wife of a doctor), whereas husbands in the compounds studied tend to be doubly individualized (Arztgatte = husband who is a doctor). Besides the second constituents, the grammatical genders of the two constituents also shed light on the encoded semantic relation: gender agreement (Kanzlergatte) leads to a qualifying reading, gender divergence (Kanzleringatte) to a possessive one. The analysis also attests the existence of feminine-derived (moviert) first constituents in compounds; these are in fact the most frequent form for naming female persons in the first constituent. Nevertheless, reference to women shows greater formal variation than reference to men, who are almost exclusively referred to with masculine first constituents. The study thus shows how gender-linguistic perspectives can open up a new analytical approach in the field of word formation as well.
This contribution explores the relationship between English CEFR (Common European Framework of Reference for Languages) vocabulary levels and user interest in English Wiktionary entries. User interest was operationalized as the number of views of these entries in Wikimedia server logs covering a period of four years (2019–2022). Our findings reveal a significant relationship between CEFR levels and user interest: entries classified at lower CEFR levels tend to attract more views, which suggests a greater user interest in more basic vocabulary. A multiple regression model controlling for other known or potential factors affecting interest (corpus frequency, polysemy, word prevalence, and age of acquisition) confirmed that lower CEFR levels attract significantly more views even after the other predictors are taken into account. These findings highlight the importance of CEFR levels in predicting which words users are likely to look up, with implications for lexicography and the development of language learning materials.
We investigate the optional omission of the infinitival marker in a Swedish future tense construction. During the last two decades the frequency of omission has been increasing rapidly, and this process has received considerable attention in the literature. We test whether the accumulated knowledge can yield accurate predictions of language variation and change. We extracted all occurrences of the construction from a very large collection of corpora. The dataset was automatically annotated with language-internal predictors which have previously been shown or hypothesized to affect the variation. We trained several models in order to make two kinds of predictions: whether the marker will be omitted in a specific utterance, and how large the proportion of omissions will be for a given time period. For most of the approaches we tried, we were not able to achieve better-than-baseline performance. The only exception was predicting the proportion of omissions using autoregressive integrated moving average (ARIMA) models for one-step-ahead forecasts, and in this case time was the only predictor that mattered. Our data suggest that most of the language-internal predictors do have some effect on the variation, but the effect is not strong enough to yield reliable predictions.