This replication study investigates a potential bias toward addition in the German language, building on previous findings of Winter and colleagues, who identified a similar bias in English. Our results confirm a bias in word frequencies and binomial expressions, in line with these previous findings. However, the analysis of distributional semantics based on word vectors did not yield consistent results for German. Furthermore, our study emphasizes the crucial role of selecting appropriate translational equivalents, which highlights the importance of language-specific factors when testing for such biases in languages other than English.
In a recent article, Meylan and Griffiths (2021, henceforth M&G) focus on the significant methodological challenges that can arise when using large-scale linguistic corpora. To this end, M&G revisit a well-known result of Piantadosi, Tily, and Gibson (2011, henceforth PT&G), who argue that average information content is a better predictor of word length than word frequency. We applaud M&G for conducting a very important study that should be read by any researcher interested in working with large-scale corpora. The fact that M&G mostly failed to find clear evidence in favor of PT&G's main finding motivated us to test PT&G's idea on a subset of the largest archive of German language texts designed for linguistic research, the German Reference Corpus, consisting of ∼43 billion words. We find very little support for the primary data point reported by PT&G.