We present an implemented system for the resolution of it, this, and that in transcribed multi-party dialog. The system handles NP-anaphoric as well as discourse-deictic anaphors, i.e. pronouns with VP antecedents. Selectional preferences for NP or VP antecedents are determined on the basis of corpus counts. Our results show that the system performs significantly better than a recency-based baseline.
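A recency-based baseline of the kind mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not define the baseline, so the `Candidate` class, its fields, and the rule "pick the closest preceding candidate" are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """Hypothetical antecedent candidate; field names are assumptions."""
    text: str
    position: int  # token offset of the candidate in the transcript
    kind: str      # "NP" or "VP"


def recency_baseline(pronoun_position: int,
                     candidates: List[Candidate]) -> Optional[Candidate]:
    """Return the candidate closest before the pronoun, or None if there is none."""
    preceding = [c for c in candidates if c.position < pronoun_position]
    if not preceding:
        return None
    # The most recent candidate is the one with the largest offset.
    return max(preceding, key=lambda c: c.position)


candidates = [
    Candidate("send the report", 1, "VP"),
    Candidate("the report", 2, "NP"),
    Candidate("the meeting", 9, "NP"),
]
chosen = recency_baseline(7, candidates)
```

Such a baseline ignores whether the pronoun prefers an NP or a VP antecedent, which is the information the described system takes from corpus counts.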
We present an implemented machine learning system for the automatic detection of nonreferential it in spoken dialog. The system builds on shallow features extracted from dialog transcripts. Our experiments indicate a level of performance that makes the system usable as a preprocessing filter for a coreference resolution system. We also report results of an annotation study dealing with the classification of it by naive subjects.
We present an implemented XML data model and a new, simplified query language for multi-level annotated corpora. The new query language involves automatic conversion of queries into the underlying, more complicated MMAXQL query language. It supports queries for sequential and hierarchical, but also associative (e.g. coreferential) relations. The simplified query language has been designed with non-expert users in mind.
In this paper, we investigate the practical applicability of Co-Training for the task of building a classifier for reference resolution. We are concerned with the question of whether Co-Training can significantly reduce the amount of manual labeling work and still produce a classifier with acceptable performance.
We describe a simple and efficient Java object model and application programming interface (API) for (possibly multi-modal) annotated natural language corpora. Corpora are represented as elements like Sentences, Turns, Utterances, Words, Gestures and Markables. The API allows linguists to access corpora in terms of these discourse-level elements, i.e. at a conceptual level they are familiar with, with the flexibility offered by a general purpose programming language. It is also a contribution to corpus standardization efforts because it is based on a straightforward and easily extensible data model which can serve as a target for conversion of different corpus formats.
We present a light-weight tool for the annotation of linguistic data on multiple levels. It is based on the simplification of annotations to sets of markables having attributes and standing in certain relations to each other. We describe the main features of the tool, emphasizing its simplicity, customizability and versatility.
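The idea of reducing annotations to markables with attributes and relations can be sketched as a small data structure. This is an illustrative assumption only: the names `Markable`, `AnnotationLevel`, `relate`, and `related` are hypothetical and do not reflect the tool's actual data model or file format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Markable:
    """Hypothetical markable: a token span plus free-form attributes."""
    markable_id: str
    span: Tuple[int, int]  # inclusive token indices (first, last)
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class AnnotationLevel:
    """One annotation level: a set of markables and relations between them."""
    name: str
    markables: Dict[str, Markable] = field(default_factory=dict)
    # Each relation is stored as (relation_name, source_id, target_id).
    relations: List[Tuple[str, str, str]] = field(default_factory=list)

    def add(self, markable: Markable) -> None:
        self.markables[markable.markable_id] = markable

    def relate(self, relation: str, source_id: str, target_id: str) -> None:
        # Both ends of a relation must already exist on this level.
        for markable_id in (source_id, target_id):
            if markable_id not in self.markables:
                raise KeyError(markable_id)
        self.relations.append((relation, source_id, target_id))

    def related(self, relation: str, source_id: str) -> List[Markable]:
        """Return all markables that source_id points to via the given relation."""
        return [self.markables[target] for name, source, target in self.relations
                if name == relation and source == source_id]


level = AnnotationLevel("coref")
level.add(Markable("m1", (0, 1), {"type": "NP"}))
level.add(Markable("m2", (5, 5), {"type": "pronoun"}))
level.relate("antecedent", "m2", "m1")
```

Keeping each level as its own container is one way to model annotation on multiple levels; other layouts are equally plausible.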
We apply a decision tree based approach to pronoun resolution in spoken dialogue. Our system deals with pronouns with NP- and non-NP-antecedents. We present a set of features designed for pronoun resolution in spoken dialogue and determine the most promising features. We evaluate the system on twenty Switchboard dialogues and show that it compares well to Byron’s (2002) manually tuned system.