Refine
Document Type
- Article (2)
- Part of a Book (2)
- Conference Proceeding (1)
Has Fulltext
- yes (5)
Keywords
- Quantitative Linguistik (5) (remove)
Publicationstate
- Veröffentlichungsversion (4)
- Postprint (1)
- Zweitveröffentlichung (1)
Reviewstate
- (Verlags)-Lektorat (3)
- Peer-Review (2)
Publisher
Diese Fallstudie untersucht die quantitative Verteilung von direkten und nicht-direkten Formen von Redewiedergabe im Vergleich zwischen zwei Literaturtypen: Hochliteratur - definiert als Werke, die auf der Auswahlliste von Literaturpreisen standen - und Heftromanen - massenproduzierten Erzählwerken, die zumeist über den Zeitschriftenhandel vertrieben werden. Die Studie geht von manuell annotierten Daten aus und überprüft daran die Verlässlichkeit automatischer Annotationswerkzeuge, die im Anschluss eingesetzt werden, um eine Untersuchung von insgesamt 250 Volltexten durchzuführen. Es kann nachgewiesen werden, dass sich die Literaturtypen sowie auch unterschiedliche Genres von Heftromanen hinsichtlich der verwendeten Wiedergabeformen unterscheiden.
The understanding of story variation, whether motivated by cultural currents or other factors, is important for applications of formal models of narrative such as story generation or story retrieval. We present the first stage of an experiment to elicit natural narrative variation data suitable for evaluation with respect to story similarity, to qualitative and quantitative analysis of story variation, and also for data processing. We also present few preliminary results from the first stage of the experiment, using Red Riding Hood and Romeo and Juliet as base texts.
In order to demonstrate why it is important to correctly account for the (serial dependent) structure of temporal data, we document an apparently spectacular relationship between population size and lexical diversity: for five out of seven investigated languages, there is a strong relationship between population size and lexical diversity of the primary language in this country. We show that this relationship is the result of a misspecified model that does not consider the temporal aspect of the data by presenting a similar but nonsensical relationship between the global annual mean sea level and lexical diversity. Given the fact that in the recent past, several studies were published that present surprising links between different economic, cultural, political and (socio-)demographical variables on the one hand and cultural or linguistic characteristics on the other hand, but seem to suffer from exactly this problem, we explain the cause of the misspecification and show that it has profound consequences. We demonstrate how simple transformation of the time series can often solve problems of this type and argue that the evaluation of the plausibility of a relationship is important in this context. We hope that our paper will help both researchers and reviewers to understand why it is important to use special models for the analysis of data with a natural temporal ordering.