In a recent article, Meylan and Griffiths (2021, henceforth M&G) focus on the significant methodological challenges that can arise when using large-scale linguistic corpora. To this end, M&G revisit a well-known result of Piantadosi, Tily, and Gibson (2011, henceforth PT&G), who argue that average information content is a better predictor of word length than word frequency. We applaud M&G for conducting a very important study that should be read by any researcher interested in working with large-scale corpora. The fact that M&G mostly failed to find clear evidence in favor of PT&G's main finding motivated us to test PT&G's idea on a subset of the largest archive of German language texts designed for linguistic research, the German Reference Corpus, which consists of ∼43 billion words. We find only very little support for the primary data point reported by PT&G.
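To make the notion of "average information content" in the abstract above concrete, the following is a minimal sketch, not the procedure used by PT&G or M&G. It assumes that a word's information content is its mean surprisal, -log2 P(word | context), over all of its occurrences, and it simplifies the context to the single preceding word (a bigram model) with unsmoothed relative frequencies. The function name `avg_information_content` and the toy token list are hypothetical.

```python
import math
from collections import Counter, defaultdict


def avg_information_content(tokens):
    """Mean bigram surprisal in bits for each word type.

    Assumption: the context is only the immediately preceding word, and
    probabilities are unsmoothed relative frequencies.
    """
    # How often each word occurs as a context (i.e. as a preceding word).
    context_counts = Counter(tokens[:-1])
    # How often each (previous word, word) pair occurs.
    bigram_counts = Counter(zip(tokens[:-1], tokens[1:]))

    surprisal_sum = defaultdict(float)
    occurrences = Counter()
    for (prev, word), n in bigram_counts.items():
        p = n / context_counts[prev]             # P(word | prev)
        surprisal_sum[word] += -math.log2(p) * n
        occurrences[word] += n

    # Average over all occurrences of the word that have a left context.
    return {w: surprisal_sum[w] / occurrences[w] for w in surprisal_sum}


# Toy example with hypothetical tokens; real studies use very large corpora.
info = avg_information_content("a b a b a c".split())
```

In a comparison of the kind described above, these per-word values and plain word frequencies would each be correlated with word length; the sketch stops before that step.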
A frequently replicated finding is that higher-frequency words tend to be shorter and contain more strongly reduced vowels. However, little is known about potential differences in the articulatory gestures for high- vs. low-frequency words. The present study made use of electromagnetic articulography to investigate the production of two German vowels, [i] and [a], embedded in high- and low-frequency words. We found that word frequency affected the production of [i] and [a] differently at the temporal as well as the gestural level. Higher frequency of use predicted greater acoustic durations for long vowels, reduced durations for short vowels, articulatory trajectories with greater tongue height for [i], and more pronounced downward articulatory trajectories for [a]. These results show that the phonological contrast between short and long vowels is learned better with experience, and they challenge both the Smooth Signal Redundancy Hypothesis and current theories of German phonology.
This paper deals with the distribution of word length in short native mythological and historical Eskimo narrative texts. To my knowledge, no Eskimo‐Aleut data have been the object of quantitative linguistic investigation so far. Due to the strong linguistic and stylistic homogeneity of the examined texts, it was assumed that these texts can be subsumed under a single law of word length distribution, if the word length distribution of a text is considered as a function of certain of its properties, such as author, language, and genre. So far, word length distribution in texts of a wide variety of languages and genres has been demonstrated to follow distributions of the compound Poisson family of discrete probability distributions. In view of the morphological idiosyncrasies of the Eskimo language in general, which are responsible for an unusually high mean word length of about 4.5 to 5.2 syllables per word in the texts, it is interesting to see whether Eskimo texts show a significantly different behaviour with respect to word length. The results demonstrate that the Eskimo data employed in this study can be fitted well by the Hyperpoisson distribution. Two further discrete probability distributions are deduced from certain morphology‐based assumptions about Eskimo. It turns out that most of the Eskimo data can be fitted by these two distributions. The question of to what extent these results point to a more grammar‐oriented theory of word length is also discussed.
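For readers unfamiliar with the Hyperpoisson distribution mentioned in the abstract above, the following is a minimal sketch of its probability function. It assumes the 1-displaced form often used for word length (lengths start at 1 syllable), with weights a^(x-1) / b^(x-1), where b^(k) = b(b+1)...(b+k-1) is the rising factorial, normalised by the confluent hypergeometric series 1F1(1; b; a). The function name `hyperpoisson_pmf` and the parameter values are hypothetical and are not taken from the paper.

```python
import math


def hyperpoisson_pmf(a, b, max_len, terms=500):
    """1-displaced Hyperpoisson probabilities for word lengths 1..max_len.

    Assumption: the normalising constant 1F1(1; b; a) is approximated by
    truncating its series after `terms` terms (requires max_len <= terms).
    """
    weights = []
    w = 1.0                      # k = 0 term: a^0 / b^(0) = 1
    for k in range(terms):
        weights.append(w)
        w *= a / (b + k)         # next term: a^(k+1) / (b(b+1)...(b+k))
    norm = sum(weights)          # truncated series for 1F1(1; b; a)
    # Index k corresponds to a word length of k + 1 syllables.
    return [weights[k] / norm for k in range(max_len)]


# Hypothetical parameters, chosen only for illustration.
probs = hyperpoisson_pmf(a=4.0, b=1.2, max_len=15)
mean_length = sum((k + 1) * p for k, p in enumerate(probs))
```

With b = 1 the weights reduce to a^k / k!, so the distribution collapses to a 1-displaced Poisson distribution. Fitting to observed data would additionally require estimating a and b, which this sketch does not attempt.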