Textlinguistik / Schriftsprache
We present a simple tool for extracting text and markup information from printouts of (not only) scientific documents. While the heavy-lifting OCR is done by off-the-shelf tesseract, our focus is on detection, extraction, and basic categorization of color-highlighted text sections, as well as on providing a framework for downstream processing of extraction results. The tool can be useful for document analysis tasks that must, or benefit from being able to, use printed paper.
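The categorization of color-highlighted text can be illustrated with a minimal sketch. This is not the tool's actual implementation: the hue ranges, the thresholds, and the function names `classify_pixel`, `dominant_highlight`, and `group_by_highlight` are assumptions made for illustration, and the word texts and background pixel samples are taken as plain inputs that would in practice come from an OCR step.

```python
import colorsys
from collections import Counter

# Assumed hue ranges in degrees for common highlighter colors (illustrative values).
HIGHLIGHT_HUES = {
    "yellow": (40, 70),
    "green": (80, 160),
    "pink": (290, 345),
}


def classify_pixel(r, g, b, min_saturation=0.35, min_value=0.6):
    """Return a highlighter color name for an RGB pixel (0-255), or None."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    if s < min_saturation or v < min_value:
        return None  # paper white, grey, or dark ink
    hue_deg = h * 360
    for name, (low, high) in HIGHLIGHT_HUES.items():
        if low <= hue_deg <= high:
            return name
    return None


def dominant_highlight(pixels, min_share=0.5):
    """Return the most frequent highlight color if it covers enough of the samples."""
    counts = Counter(c for c in (classify_pixel(*p) for p in pixels) if c is not None)
    if not counts:
        return None
    name, n = counts.most_common(1)[0]
    return name if n / len(pixels) >= min_share else None


def group_by_highlight(words):
    """Group word texts by highlight color.

    `words` is a list of (text, pixels) pairs, where `pixels` are RGB samples
    from the background of the word's bounding box (hypothetical input format).
    """
    groups = {}
    for text, pixels in words:
        color = dominant_highlight(pixels)
        if color is not None:
            groups.setdefault(color, []).append(text)
    return groups
```

Working in hue, saturation, and value rather than raw RGB keeps the rule tolerant of uneven lighting in a scan, since white paper and black ink both have near-zero saturation regardless of brightness.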
National Socialism, one could argue, was all about belonging: belonging to the ‘Volk’ or the ‘Volksgemeinschaft’, belonging to the ‘Aryan’ or ‘Non-Aryan race’, belonging to the National Socialist ‘movement’, and so on. These categories of belonging worked in both inclusionary and exclusionary ways, and they were constituted, proclaimed and enacted to a large extent through language. What is more, they had to be performed through communicative acts. For the normative side of National Socialist propaganda and legislation, this seems rather obvious and one-directional. On the side of the general population, however, this entailed a mixture of a communicative need to position oneself vis-à-vis National Socialism (mostly in affirmative ways) and the urge to do so willingly. When we look at the language use of ‘ordinary people’ in different communicative situations and texts during National Socialism, we have to focus on these dimensions of discursive collusion, co-constitution and appropriation. People during National Socialism, such is our hypothesis, navigated through discourses of belonging and thereby made them real and effective. Besides diaries, war letters and autobiographical writings, one way to grasp this phenomenon is to analyse petitions, i.e., letters of complaint and request sent in large numbers by ‘ordinary people’ to public authorities of the party and the state. As I will show by some examples, letter-writers tried to inscribe themselves within (what they took for) National Socialist discourses of belonging in order to legitimate their claims. By doing so, they co-constituted and co-created the discursive realm of National Socialism.
The workshop presents ATHEN (Annotation and Text Highlighting Environment), an extensible desktop-based annotation environment which supports more than just regular annotation. Besides being a general-purpose annotation environment, ATHEN supports indexing and querying of your data as well as automatic preprocessing of your data with meta information. It is especially suited for those who want to extend existing general-purpose annotation tools by implementing their own custom features that other available annotation environments do not offer. On the accompanying GitLab repository, we provide online tutorials which demonstrate the use of specific features of ATHEN.
The question of whether a letter is a grapheme or not is a perennial issue in writing research. The answer depends on which criteria are used to differentiate between letters and graphemes and, ultimately, how the unit ‘grapheme’ is defined. This problem is particularly relevant to complex graphemes, i.e. sequences of letters that behave like a single grapheme in certain respects. A typical German example is ‹ch›. This paper argues for a scalar concept of graphemes, which compares the grapheme status of each of the units under investigation. For this purpose, new criteria for the identification of complex graphemes are used, which originate from handwriting analysis. There, it is shown that the letters of complex graphemes are connected with each other disproportionately often and also show deviating letter forms disproportionately often.
We present a technique called event mapping that makes it possible to project text representations into event lists, produce an event table, and derive quantitative conclusions for comparing the text representations. The main application of the technique is the case where two classes of text representations have been collected in two different settings (e.g., as annotations in two different formal frameworks) and the two classes are to be compared with respect to their systematic differences in the event table. We illustrate how the technique works by applying it to data collected in two experiments (one using annotations in Vladimir Propp’s framework, the other using natural language summaries).
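One possible reading of this pipeline can be sketched as follows. The abstract does not specify the data structures or the statistics used, so everything here is an assumption: the event table is modelled as a presence/absence matrix over a shared event inventory, the comparison is a simple difference in per-class coverage, and the function names and event labels are made up for illustration.

```python
def event_table(event_lists, inventory):
    """Build a presence/absence table: event -> {representation id -> bool}.

    `event_lists` maps each text representation's id to the list of events
    it was projected onto (hypothetical input format).
    """
    event_sets = {rep: set(events) for rep, events in event_lists.items()}
    return {
        event: {rep: event in events for rep, events in event_sets.items()}
        for event in inventory
    }


def coverage(table, reps):
    """Share of the given representations that contain each event."""
    return {event: sum(row[r] for r in reps) / len(reps) for event, row in table.items()}


def class_differences(table, class_a, class_b):
    """Per-event coverage difference between two classes of representations."""
    cov_a = coverage(table, class_a)
    cov_b = coverage(table, class_b)
    return {event: cov_a[event] - cov_b[event] for event in table}


# Toy data with invented event labels, not taken from the experiments.
inventory = ["hero leaves home", "villain appears", "hero returns"]
event_lists = {
    "annotation_1": ["hero leaves home", "villain appears", "hero returns"],
    "annotation_2": ["hero leaves home", "hero returns"],
    "summary_1": ["villain appears"],
    "summary_2": ["villain appears", "hero returns"],
}
table = event_table(event_lists, inventory)
diffs = class_differences(
    table, ["annotation_1", "annotation_2"], ["summary_1", "summary_2"]
)
```

In this toy setup a positive value marks an event that the first class records more often than the second, which is one simple way to surface systematic differences between two collection settings.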
We continue the study of the reproducibility of Propp’s annotations from Bod et al. (2012). We present four experiments in which test subjects were taught Propp’s annotation system; we conclude that Propp’s system needs a significant amount of training, but that with sufficient time investment, it can be reliably taught for simple tales.
A formal narrative representation is a procedure assigning a formal description to a natural language narrative. One of the goals of the computational models of narrative community is to understand this procedure better in order to automatize it. A formal framework fit for automatization should allow for objective and reproducible representations. In this paper, we present empirical work focussing on objectivity and reproducibility of the formal framework by Vladimir Propp (1928). The experiments consider Propp’s formalization of Russian fairy tales and formalizations done by test subjects in the same formal framework; the data show that some features of Propp’s system such as the assignment of the characters to the dramatis personae and some of the functions are not easy to reproduce.
Interested in formally modelling similarity between narratives, we investigate judgements of similarity between narratives in a small corpus of film reviews and book–film comparisons. A main finding is that judgements tend to concern multiple levels of story representation at once. As these texts are pragmatically related to reception contexts, we find many references to reception quality and optimality. We conclude that current formal models of narrative cannot capture the task of naturalistic narrative comparison given in the analysed reviews, and that the development of models containing a more reception-oriented point of view will be necessary.
The understanding of story variation, whether motivated by cultural currents or other factors, is important for applications of formal models of narrative such as story generation or story retrieval. We present the first stage of an experiment to elicit natural narrative variation data suitable for evaluation with respect to story similarity, for qualitative and quantitative analysis of story variation, and also for data processing. We also present a few preliminary results from the first stage of the experiment, using Red Riding Hood and Romeo and Juliet as base texts.