Refine
Year of publication
- 2024 (1)
Document Type
- Preprint (1)
Language
- English (1)
Has Fulltext
- yes (1)
Is part of the Bibliography
- yes (1)
Keywords
- Häufigkeitsverteilung (1) (remove)
Publicationstate
- Veröffentlichungsversion (1) (remove)
Publisher
In a previous study, Aceves and Evans present a large-scale quantitative information-theoretic analysis of parallel corpus data in ~1,000 languages to show that there are apparently strong associations between the way languages encode information into words and patterns of communication, e.g. the configuration of semantic information. During the peer review process, one reviewer raised the question of the extent to which the presented results depend on different corpus sizes (see the Peer Review File). This is a very important question given that most, if not all, of the quantities associated with word frequency distributions vary systematically with corpus size. While Aceves and Evans claim that corpus size does not affect the results presented, I challenge this view by presenting reanalyses of the data that clearly suggest that it does.