The motivation for this article is to describe a methodology for interrelating and analyzing language-specific and theory-specific corpus data from various languages. As an example phenomenon we use information structure (IS, see [3]) in treebanks from three languages: Spanish, Korean and Japanese. Korean and Japanese are typologically close, while both are typologically different from Spanish. The problem in annotating IS is therefore that the formal linguistic means for realizing IS functions (like "topicalization / contrast") diverge from language to language on various levels such as prosody, morphology and word order. Hence, it is necessary to describe the relations between language-specific formal means and functional views on IS, and to show how these relations can be operationalized for corpus analysis.
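Das Bild von der 'Sprache der DDR' in der alten Bundesrepublik oder: Haben sie so gesprochen? [The image of the 'language of the GDR' in the old Federal Republic, or: Did they speak like that?]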
(2004)
The goal of the MULI (MUltiLingual Information structure) project is to empirically analyse information structure in German and English newspaper texts. In contrast to other projects in which information structure is annotated and investigated (e.g. in the Prague Dependency Treebank, which mirrors the basic information about the topic-focus articulation of the sentence), we do not annotate theory-biased categories like topic-focus or theme-rheme. Trying to be as theory-independent as possible, we annotate those features which are relevant to information structure and on the basis of which typical patterns, co-occurrences or correlations can be determined. We distinguish between three annotation levels: syntax, discourse and prosody. The data is based on the TIGER Corpus for German and the Penn Treebank for English, since the existing information on part-of-speech and syntactic structure can be re-used for our purposes. The actual annotation of an English example sequence illustrates our choice of categories on each level. Their combination offers the possibility to investigate how information structure is realised and can be interpreted.
We present the annotation of information structure in the MULI project. To learn more about the means of information structuring in prosody, syntax and discourse, theory-independent features were defined for each level. We describe the features and illustrate them with an example sentence. To investigate the interplay of features, the representation has to allow all three layers to be inspected at the same time. This is realised by stand-off XML mark-up with the word as the basic unit. The theory-neutral XML stand-off annotation allows this resource to be integrated with other linguistic resources such as the TIGER Treebank for German or the Penn Treebank for English.
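The idea of stand-off mark-up with the word as the basic unit can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the MULI project: the element names (`tokens`, `w`, `layer`, `span`), the attributes (`from`, `to`, `cat`, `status`, `accent`), the example sentence and the helper functions. Each layer is kept in a separate XML fragment and points at word ids, so all three layers can be looked up for the same word.

```python
import xml.etree.ElementTree as ET

# Hypothetical base layer: one <w> element per word, each with a unique id.
TOKENS_XML = """<tokens>
  <w id="w1">Peter</w>
  <w id="w2">bought</w>
  <w id="w3">a</w>
  <w id="w4">car</w>
</tokens>"""

# Hypothetical stand-off layers: they contain no text, only references
# to word ids in the base layer plus layer-specific features.
SYNTAX_XML = """<layer name="syntax">
  <span from="w1" to="w1" cat="NP"/>
  <span from="w3" to="w4" cat="NP"/>
</layer>"""

DISCOURSE_XML = """<layer name="discourse">
  <span from="w3" to="w4" status="new"/>
</layer>"""

PROSODY_XML = """<layer name="prosody">
  <span from="w4" to="w4" accent="H*"/>
</layer>"""


def load_tokens(xml_text):
    """Return the words of the base layer as a list of (id, text) pairs."""
    root = ET.fromstring(xml_text)
    return [(w.get("id"), w.text) for w in root.iter("w")]


def annotations_per_word(tokens, layer_xmls):
    """Collect, for every word id, the features of each layer that covers it."""
    position = {wid: i for i, (wid, _) in enumerate(tokens)}
    table = {wid: {} for wid, _ in tokens}
    for xml_text in layer_xmls:
        layer = ET.fromstring(xml_text)
        name = layer.get("name")
        for span in layer.iter("span"):
            start = position[span.get("from")]
            end = position[span.get("to")]
            # Everything except the two pointers is treated as a feature.
            feats = {k: v for k, v in span.attrib.items()
                     if k not in ("from", "to")}
            for wid, _ in tokens[start:end + 1]:
                table[wid].setdefault(name, {}).update(feats)
    return table


tokens = load_tokens(TOKENS_XML)
table = annotations_per_word(tokens, [SYNTAX_XML, DISCOURSE_XML, PROSODY_XML])

# Print all three layers side by side for each word.
for wid, text in tokens:
    print(wid, text, table[wid])
```

Because the layers only reference word ids, a further layer (for instance one derived from an existing treebank) could be added in this sketch without touching the base text or the other layers.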
The second edition of Hermann Paul's Principien der Sprachgeschichte [Principles of the History of Language] of 1886 is considered to be the true version of this milestone in linguistics. I consider why the first edition, which is about 80 pages (ten chapters) shorter, did not receive such a high assessment. What could have been the author's reason for producing a revised version six years later? One must consider not only the inclusion of newer and foreign research literature but also the beginning of Paul's work on his German dictionary, which was published in 1897. The relationship between the second edition of the Principien and Paul's lexicographic procedure resulted in substantial methodological and theoretical innovations, which were either caused or stimulated by that procedure. Altogether, four types of evidence were examined: (1) the meaning of the terminological pair 'usuell' and 'occasionell', (2) explicit references to lexicographic aspects, (3) the kind and context of the discussion of semantics in the newly included literature, and (4) the classification of meaning change based on categories of ancient rhetoric. My general conclusion is that a text-oriented history of linguistics must include intertextual references, regardless of whether the text in question is a theoretical thesis, an empirical analysis, or a presentation addressed to non-linguistic readers.
Around 700 neologisms are presented in this first larger dictionary of neologisms for German. They are new words, new meanings of words and new fixed word combinations that entered the general language in the 1990s. The aim is to meet the generally high demand for information about new vocabulary. The dictionary presents around 700 neologisms, i.e. new lexemes (e.g. Eurowährung), new meanings (e.g. surfen) and new phraseologisms (e.g. im grünen Bereich), that entered general German usage in the 1990s. This first larger dictionary of German neologisms compiled according to the principles of scientific lexicography aims to meet the demand for information that commonly exists with regard to new vocabulary. To this end, data types and forms of presentation that are in part new were chosen for a detailed, descriptive account of the neologisms based on extensive text corpora; they usefully complement and enrich those already established in lexicography.
The authors, experienced lexicographers, have long worked in the field of neologism research.
The administration of electronic publications in the Information Era brings together old and new problems, especially those related to Information Retrieval and Automatic Knowledge Extraction. This article presents an Information Retrieval System that uses Natural Language Processing and an ontology to index the texts of a collection. We describe a system that constructs a domain-specific ontology, starting from syntactic and semantic analyses of the texts that make up the collection. First the texts are tokenized, then a robust syntactic analysis is performed, and finally a semantic analysis is carried out in accordance with a knowledge representation metalanguage based on a basic ontology of 47 classes. The automatically extracted ontology yields richer domain-specific knowledge. Through its semantic net, it helps users find the terms suited to querying the texts more efficiently and quickly. A prototype of this system was built and used to index a collection of 221 electronic texts on Information Science written in Brazilian Portuguese. Rather than relying on statistical theories, the proposed robust Information Retrieval System uses cognitive theories, allowing users' queries to be answered with greater efficiency.
VALBU is a monolingual dictionary of German verbs. It contains a comprehensive semantic and syntactic description of 638 verbs together with their specific environments, as well as information on morphology, word formation, passivizability, phraseology and stylistics, and numerous usage examples.
The selection of headwords follows the verb inventory in the vocabulary list of the "Zertifikat Deutsch" (ZD). In the domain of verbs, VALBU thus largely covers the portion of vocabulary needed to cope linguistically with everyday situations.
The dictionary is aimed primarily at teachers and textbook authors in the field of German as a foreign language. It is intended first and foremost for teachers whose native language is not German. It gives them the opportunity to look up the precise conditions of use for the core inventory of German verbs in teaching and marking situations. The dictionary can also be consulted by advanced learners of German whose native language is not German.