This paper presents a study on the comprehensibility of rephrased syntactic structures in German court decisions. While a number of studies have used psycholinguistic methods to investigate the comprehensibility of original legal texts, we are not aware of any study that examines the effect that resolving complex structures has on comprehensibility. Our study combines three methodological steps. First, we analyse an annotated corpus of court decisions, press releases and newspaper reports on these decisions in order to detect the complex structures that distinguish the decisions from the other text types. Second, these structures are rephrased into two increasingly simple versions. Finally, all versions are subjected to a self-paced reading experiment. The findings suggest that rephrasing greatly enhances comprehensibility for the lay reader.
We present the annotation of information structure in the MULI project. To learn more about the means of information structuring in prosody, syntax and discourse, theory-independent features were defined for each level. We describe the features and illustrate them with an example sentence. To investigate the interplay of features, the representation has to allow all three layers to be inspected at the same time. This is realised by stand-off XML mark-up with the word as the basic unit. The theory-neutral XML stand-off annotation allows this resource to be integrated with other linguistic resources such as the TIGER Treebank for German or the Penn Treebank for English.
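The stand-off idea described in this abstract can be illustrated with a minimal Python sketch. It is not the MULI project's actual schema: the element names, attribute names, feature values and the toy sentence are all invented for illustration. It only shows the general principle that each layer refers to shared word ids, so that the three layers can be inspected together.

```python
import xml.etree.ElementTree as ET

# Base layer: one element per word, each with a unique id.
# The sentence and the ids are invented for this sketch.
WORDS_XML = """
<words>
  <word id="w1">The</word>
  <word id="w2">goal</word>
  <word id="w3">is</word>
  <word id="w4">clear</word>
</words>
"""

# Stand-off layers: they contain no text, only references to word ids.
# Element names, attribute names and feature values are assumptions,
# not the categories actually used in MULI.
LAYERS_XML = {
    "syntax": '<layer><span words="w1 w2" feature="subject"/></layer>',
    "discourse": '<layer><span words="w1 w2" feature="definite"/></layer>',
    "prosody": '<layer><span words="w2" feature="accented"/></layer>',
}


def features_by_word(words_xml, layers_xml):
    """Merge all layers into one table keyed by word id."""
    words = ET.fromstring(words_xml.strip())
    table = {w.get("id"): {"text": w.text} for w in words.iter("word")}
    for layer_name, layer_xml in layers_xml.items():
        layer = ET.fromstring(layer_xml)
        for span in layer.iter("span"):
            # A span may cover several words; attach its feature to each.
            for word_id in span.get("words").split():
                table[word_id].setdefault(layer_name, []).append(
                    span.get("feature")
                )
    return table


table = features_by_word(WORDS_XML, LAYERS_XML)
for word_id, info in table.items():
    print(word_id, info)
```

Because the layers only point at word ids, a further layer (for example one derived from an existing treebank) could be added without touching the word file or the other layers.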
The goal of the MULI (MUltiLingual Information structure) project is to empirically analyse information structure in German and English newspaper texts. In contrast to other projects in which information structure is annotated and investigated (e.g. in the Prague Dependency Treebank, which mirrors the basic information about the topic-focus articulation of the sentence), we do not annotate theory-biased categories like topic-focus or theme-rheme. Trying to be as theory-independent as possible, we annotate those features which are relevant to information structure and on the basis of which typical patterns, co-occurrences or correlations can be determined. We distinguish between three annotation levels: syntax, discourse and prosody. The data is based on the TIGER Corpus for German and the Penn Treebank for English, since the existing information on part-of-speech and syntactic structure can be re-used for our purposes. The actual annotation of an English example sequence illustrates our choice of categories on each level. Their combination makes it possible to investigate how information structure is realised and how it can be interpreted.
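As a rough illustration of how co-occurrences between features on two annotation levels might be counted, consider the following Python sketch. The per-word annotations and the feature names are invented toy data, not MULI categories or results, and the counting scheme is only one simple possibility.

```python
from collections import Counter
from itertools import product

# Hypothetical per-word annotations on the three levels.
# These rows are toy data made up for this sketch.
annotations = [
    {"syntax": ["subject"], "discourse": ["definite"], "prosody": []},
    {"syntax": ["subject"], "discourse": ["definite"], "prosody": ["accented"]},
    {"syntax": [], "discourse": [], "prosody": []},
    {"syntax": ["predicate"], "discourse": [], "prosody": ["accented"]},
]


def cooccurrences(rows, level_a, level_b):
    """Count how often a feature on level_a shares a word with one on level_b."""
    counts = Counter()
    for row in rows:
        for pair in product(row.get(level_a, []), row.get(level_b, [])):
            counts[pair] += 1
    return counts


counts = cooccurrences(annotations, "syntax", "prosody")
for (feature_a, feature_b), n in counts.most_common():
    print(feature_a, feature_b, n)
```

Counting at the word level fits an annotation in which the word is the basic unit; a real analysis would need the project's actual categories and a suitable statistical test.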