The paper first examines the German sentence bracket (Satzklammer) from the perspective of information distribution. After showing that it functions as an information bracket, the paper investigates its interaction with the parts of split noun phrases. Two interesting findings emerge:
• the sentence bracket and the NP parts support each other in forming the information bracket; in particular, the parts of the split NP can bear accent;
• the parts of the split NP can act as an information bracket on their own, which makes topicalization of the past participle (Partizip II) possible.
This paper discusses the behaviour of German particle verbs formed by two-way prepositions in combination with pleonastic PPs including the verb particle as a preposition. These particle verbs have a characteristic feature: some of them license directional prepositional phrases in the accusative, some only allow for locative PPs in the dative, and some particle verbs can occur with PPs in the accusative and in the dative. Directional particle verbs together with directional PPs present an additional problem: the particle and the preposition in the PP seem to provide redundant information. The paper gives an overview of the semantic verb classes influencing this phenomenon, based on corpus data, and explains the underlying reasons for the behaviour of the particle verbs. We also show how the restrictions on particle verbs and pleonastic PPs can be expressed in a grammar theory like Lexical Functional Grammar (LFG).
Linguistic corpora have been annotated by means of SGML-based markup languages for almost 20 years. We can, very roughly, differentiate between three distinct evolutionary stages of markup technologies. (1) Originally, single SGML tree-based document instances were deemed sufficient for the representation of linguistic structures. (2) Linguists began to realize that alternatives and extensions to the traditional model were needed. Formalisms such as NITE were proposed: the NITE Object Model (NOM) consists of multi-rooted trees. (3) We are now on the threshold of the third evolutionary stage: even NITE's very flexible approach is not suited for all linguistic purposes. As some structures cannot be modeled by multi-rooted trees, an even more flexible approach is needed in order to provide a generic annotation format that is able to represent genuinely arbitrary linguistic data structures.
The paper discusses two topics. First, it sketches an approach that uses multiple layers of annotation; with regard to the XML representation, this approach is similar to standoff annotation. Second, it describes the use of heterogeneous linguistic resources (e.g., XML-annotated documents, taggers, lexical nets) as a source for semi-automatic multi-dimensional markup to resolve typical linguistic issues, with anaphora resolution as a case study.
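The idea of layered, standoff-like annotation can be illustrated with a minimal sketch. It is not the paper's XML format: the `Span` class, the layer names and the sample text are hypothetical, chosen only to show several annotation layers pointing into the same primary text by character offsets instead of being embedded in it.

```python
from dataclasses import dataclass

# Primary data stays untouched; every layer only refers to offsets in it.
TEXT = "Peter sleeps. He snores."

@dataclass(frozen=True)
class Span:
    layer: str   # hypothetical layer name, e.g. "structure" or "pos"
    start: int   # character offset, inclusive
    end: int     # character offset, exclusive
    label: str

# Two independent layers over the same text (illustrative values only).
LAYERS = [
    Span("structure", 0, 13, "sentence"),
    Span("structure", 14, 24, "sentence"),
    Span("pos", 0, 5, "NOUN"),
    Span("pos", 14, 16, "PRON"),
]

def covering(spans, start, end, layer):
    """Return the spans of one layer that fully contain [start, end)."""
    return [s for s in spans
            if s.layer == layer and s.start <= start and end <= s.end]

# Relate the pronoun on the "pos" layer to the "structure" layer.
pronoun = next(s for s in LAYERS if s.label == "PRON")
sentence = covering(LAYERS, pronoun.start, pronoun.end, "structure")[0]
print(TEXT[pronoun.start:pronoun.end], "is in:", TEXT[sentence.start:sentence.end])
```

Because the layers share only offsets, a new layer (for instance, one produced by a tagger) can be added without modifying the existing ones.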
The aim of the paper is twofold. First, an approach is presented for selecting the correct antecedent for an anaphoric element according to the kind of text segments in which both of them occur. Basically, information on logical text structure (e.g. chapters, sections, paragraphs) is used to determine the antecedent life span of a linguistic expression, i.e. some linguistic expressions are more likely than others to be chosen as an antecedent throughout the whole text. In addition, an appropriate search scope for an anaphor realized by a given expression can be defined according to the document-structuring elements that contain that expression. Corpus investigations suggest that logical text structure influences the search scope for antecedent candidates. Second, a solution is presented for integrating the resources used for anaphora resolution. In this approach, multi-layered XML annotation is used to make a set of resources accessible to the anaphora resolution system.
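As a rough illustration of a structure-dependent search scope, the following sketch restricts antecedent candidates to mentions that precede the anaphor within the same paragraph or the same section. The `Mention` class, the `candidates` function and the two scope values are assumptions made for this example; the paper's actual rules and life-span criteria are not given in the abstract.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Mention:
    text: str
    position: int    # linear position in the document
    section: int     # index of the enclosing section
    paragraph: int   # index of the enclosing paragraph within the section

def candidates(anaphor, mentions, scope="paragraph"):
    """Return earlier mentions that share the anaphor's structural unit.

    scope="paragraph" keeps mentions from the same paragraph,
    scope="section" keeps mentions from the same section (hypothetical rule).
    """
    def same_unit(m):
        if scope == "paragraph":
            return (m.section, m.paragraph) == (anaphor.section, anaphor.paragraph)
        if scope == "section":
            return m.section == anaphor.section
        raise ValueError(f"unknown scope: {scope}")

    return [m for m in mentions
            if m.position < anaphor.position and same_unit(m)]

# Illustrative data: three mentions followed by a pronoun.
mentions = [
    Mention("the corpus", 0, section=1, paragraph=1),
    Mention("the tagger", 1, section=1, paragraph=2),
    Mention("the parser", 2, section=2, paragraph=1),
]
pronoun = Mention("it", 3, section=2, paragraph=1)

print([m.text for m in candidates(pronoun, mentions, "paragraph")])
print([m.text for m in candidates(pronoun, mentions, "section")])
```

A real system would presumably choose the scope per expression type and combine it with further constraints such as agreement; this sketch isolates only the structural filter.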
This paper presents the current results of an ongoing research project on the corpus distribution of prepositions and pronouns within Polish preposition-pronoun contractions. The goal of the project is to provide a quantitative description of Polish preposition-pronoun contractions that takes into consideration the morphosyntactic properties of their components. It is expected that the results will provide a basis for revising the traditionally assumed inflectional paradigms of Polish pronouns and, thus, for a possible remodeling of these paradigms. The results of corpus-based investigations of the distribution of prepositions within preposition-pronoun contractions can be used for grammar-theoretical and lexicographic purposes.