Refine
Document Type
- Article (1)
- Conference Proceeding (1)
- Doctoral Thesis (1)
Has Fulltext
- yes (3)
Is part of the Bibliography
- no (3) (remove)
Keywords
- Korpus <Linguistik> (2)
- Deutsch (1)
- Grammatiktheorie (1)
- Head-Driven Phrase Structure Grammar (HPSG) (1)
- Parsing (1)
- Spanisch (1)
- Syntax (1)
- Textsorte (1)
- Topikalisierung (1)
- dependency parsing (1)
Publicationstate
- Veröffentlichungsversion (3) (remove)
Reviewstate
Publisher
In the NLP literature, adapting a parser to new text with properties different from the training data is commonly referred to as domain adaptation. In practice, however, the differences between texts from different sources often reflect a mixture of domain and genre properties, and it is by no means clear what impact each of those has on statistical parsing. In this paper, we investigate how differences between articles in a newspaper corpus relate to the concepts of genre and domain and how they influence parsing performance of a transition-based dependency parser. We do this by applying various similarity measures for data point selection and testing their adequacy for creating genre-aware parsing models.
Dieses Papier diskutiert informationsstrukturelle Aspekte der mehrfachen Vorfeldbesetzung im Deutschen. Auf der Grundlage einer größtenteils aus den IDS-Korpora extrahierten Belegsammlung werden Diskursgegebenheit, Fokus- und Topikstatus (vor allem) des Vorfeldmaterials beschrieben und in Bezug zu entsprechenden Aussagen in der Literatur gesetzt. Neben informationsstrukturellen Faktoren werden im letzten Abschnitt mögliche weitere Faktoren angesprochen, die mehrfache Vorfeldbesetzung favorisieren könnten. Zudem werden für einen begrenzten Ausschnitt des Deutschen erstmals Zahlen vorgelegt, die das Verhältnis von mehrfacher Vorfeldbesetzung zur ähnlichen, aber als „kanonischer“ geltenden Besetzung des Vorfelds mit einer (möglicherweise partiellen) Verbalphrase illustrieren.
This is a study of how aspects of information structure can be captured within a formal grammar of Spanish, couched in the framework of Head-Driven Phrase Structure Grammar (HPSG, Pollard
and Sag 1994). While a large number of morphological, syntactic and semantic aspects in a variety of languages have been successfully analysed in this theory, information structure has not been paid the same attention in the HPSG literature. However, as a theory of signs, HPSG should include all
levels of description without which the structural descriptions offered by the grammar would ultimately remain incomplete. Languages often explicitly mark the information-structural partitioning of utterances. Depending on the particular language, linguistic resources used for this purpose include
prosody (stress/intonation), syntax (e. g. constituent order, special syntactic constructions) and morphology (e. g. special affixes). In HPSG, phonological, syntactic, semantic and pragmatic information is represented in parallel, which would seem to be a well-suited architecture for modelling
the sort of interfaces called for.