Language in the 20th Century. Contemporary Language
Das Bild von der 'Sprache der DDR' in der alten Bundesrepublik oder: Haben sie so gesprochen?
(2004)
Deutsch-türkische Kontaktvarietäten. Am Beispiel der Sprache von deutsch-türkischen Jugendlichen
(2004)
We present the annotation of information structure in the MULI project. To learn more about the means of information structuring in prosody, syntax and discourse, theory-independent features were defined for each level. We describe the features and illustrate them on an example sentence. To investigate the interplay of features, the representation has to allow for inspecting all three layers at the same time. This is realised by a stand-off XML mark-up with the word as the basic unit. The theory-neutral XML stand-off annotation allows this resource to be integrated with other linguistic resources such as the TIGER Treebank for German or the Penn Treebank for English.
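The stand-off idea described in this abstract can be illustrated with a minimal sketch. It is not the MULI format: the element names (`word`, `layer`, `mark`), the attributes (`id`, `from`, `to`, `type`) and the feature values are hypothetical, chosen only to show how three annotation layers that point at shared word ids can be inspected together.

```python
import xml.etree.ElementTree as ET

# Hypothetical base layer: one <word> element per token, each with a unique id.
WORDS_XML = """<words>
  <word id="w1">The</word>
  <word id="w2">senator</word>
  <word id="w3">resigned</word>
</words>"""

# Hypothetical stand-off layers: they hold no text, only references to word ids.
# The feature values are placeholders, not MULI categories.
LAYERS_XML = {
    "syntax": '<layer><mark type="subject" from="w1" to="w2"/>'
              '<mark type="predicate" from="w3" to="w3"/></layer>',
    "discourse": '<layer><mark type="new" from="w1" to="w2"/></layer>',
    "prosody": '<layer><mark type="pitch_accent" from="w2" to="w2"/></layer>',
}


def load_words(xml_text):
    """Return the base layer as an ordered list of (id, token) pairs."""
    root = ET.fromstring(xml_text)
    return [(w.get("id"), w.text) for w in root.iter("word")]


def marks_by_word(words, layers):
    """Project every layer onto the words, so all layers can be read per word."""
    ids = [wid for wid, _ in words]
    position = {wid: i for i, wid in enumerate(ids)}
    table = {wid: {} for wid in ids}
    for layer_name, xml_text in layers.items():
        for mark in ET.fromstring(xml_text).iter("mark"):
            start = position[mark.get("from")]
            end = position[mark.get("to")]
            # A mark covers every word from its first to its last id, inclusive.
            for wid in ids[start:end + 1]:
                table[wid][layer_name] = mark.get("type")
    return table


words = load_words(WORDS_XML)
table = marks_by_word(words, LAYERS_XML)
for wid, token in words:
    print(token, table[wid])
```

Because each layer refers to words only by id, a layer can be added, replaced or merged with an external resource without touching the base text, which is the property the abstract relies on.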
The goal of the MULI (MUltiLingual Information structure) project is to empirically analyse information structure in German and English newspaper texts. In contrast to other projects in which information structure is annotated and investigated (e.g. in the Prague Dependency Treebank, which mirrors the basic information about the topic-focus articulation of the sentence), we do not annotate theory-biased categories like topic-focus or theme-rheme. Trying to be as theory-independent as possible, we annotate those features which are relevant to information structure and on the basis of which typical patterns, co-occurrences or correlations can be determined. We distinguish between three annotation levels: syntax, discourse and prosody. The data is based on the TIGER Corpus for German and the Penn Treebank for English, since the existing information on part-of-speech and syntactic structure can be re-used for our purposes. The actual annotation of an English example sequence illustrates our choice of categories on each level. Combining the levels makes it possible to investigate how information structure is realised and how it can be interpreted.
This paper outlines the generation process of a specific computational linguistic representation termed the Multilingual Time Map, conceptually a multi-tape finite-state transducer encoding linguistic data at different levels of granularity. The first component acquires phonological data from syllable-labeled speech data, the second component defines feature profiles, the third component generates feature hierarchies and augments the acquired data with the defined feature profiles, and the fourth component displays the Multilingual Time Map as a graph.
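The four-component pipeline can be sketched roughly as follows. This is only an illustration under strong assumptions: it uses plain time-aligned tiers rather than a multi-tape finite-state transducer, it omits the feature hierarchies, it splits each syllable evenly among its phonemes, and all data, feature names and function names are hypothetical.

```python
# Hypothetical syllable-labelled input: (start_s, end_s, syllable, phonemes).
SYLLABLES = [
    (0.00, 0.20, "ma", ["m", "a"]),
    (0.20, 0.45, "ta", ["t", "a"]),
]

# Component 2 (assumed form): one feature profile per phoneme.
FEATURE_PROFILES = {
    "m": {"manner": "nasal", "voicing": "voiced"},
    "t": {"manner": "plosive", "voicing": "voiceless"},
    "a": {"manner": "vowel", "voicing": "voiced"},
}


def acquire_phonemes(syllables):
    """Component 1: derive a phoneme tier from the syllable tier.

    Assumption: each syllable interval is divided evenly among its phonemes.
    """
    tier = []
    for start, end, _, phonemes in syllables:
        step = (end - start) / len(phonemes)
        for i, phoneme in enumerate(phonemes):
            tier.append((start + i * step, start + (i + 1) * step, phoneme))
    return tier


def build_time_map(syllables, profiles):
    """Component 3: add one tier per feature, aligned with the phoneme tier."""
    phoneme_tier = acquire_phonemes(syllables)
    tiers = {
        "syllable": [(s, e, label) for s, e, label, _ in syllables],
        "phoneme": phoneme_tier,
    }
    for start, end, phoneme in phoneme_tier:
        for feature, value in profiles[phoneme].items():
            tiers.setdefault(feature, []).append((start, end, value))
    return tiers


def show(tiers):
    """Component 4: a textual stand-in for the graph display."""
    lines = []
    for name, intervals in tiers.items():
        cells = " ".join(f"[{s:.2f}-{e:.2f} {v}]" for s, e, v in intervals)
        lines.append(f"{name}: {cells}")
    return "\n".join(lines)


tiers = build_time_map(SYLLABLES, FEATURE_PROFILES)
print(show(tiers))
```

Each tier plays the role of one tape: the same time span can be read at the syllable, phoneme or feature level.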
The aim of this paper is to highlight the actual need for corpora that have been annotated based on acoustic information. The acoustic information should be coded in features or properties and is needed to inform further processing systems, i.e. to present a basis for a speech recognition system using linguistic information. Feature annotation of existing corpora in combination with segmental annotation can provide powerful training material for speech recognition systems, but it will also pose a challenge for the further processing of features into segments and syllables. We present here the theoretical preliminaries for the multilingual feature extraction system that we are currently working on.
Vorbemerkung
(2004)