Refine
Year of publication
- 2002 (15) (remove)
Document Type
- Conference Proceeding (15) (remove)
Has Fulltext
- yes (15)
Is part of the Bibliography
- no (15)
Keywords
- Korpus <Linguistik> (5)
- XML (4)
- Deutsch (2)
- Fachsprache (2)
- FrameNet (2)
- Hyperlink (2)
- Hypertext (2)
- Terminologie (2)
- Verb (2)
- Wissenserwerb (2)
Publicationstate
- Veröffentlichungsversion (10)
- Postprint (4)
Reviewstate
- Peer-Review (8)
- (Verlags)-Lektorat (4)
Publisher
- European Language Resources Association (ELRA) (2)
- Association for Computational Linguistics (1)
- Berkeley Linguistics Society (1)
- CSLI Publications (1)
- DFKI GmbH (1)
- Euralex (1)
- Gesellschaft für Informatik (1)
- International Speech Communication Association (1)
- Las Palmas (1)
- Universidad de Las Palmas de Gran Canaria (1)
In this paper, we investigate the practical applicability of Co-Training for the task of building a classifier for reference resolution. We are concerned with the question if Co-Training can significantly reduce the amount of manual labeling work and still produce a classifier with an acceptable performance.
We describe a simple and efficient Java object model and application programming interface (API) for (possibly multi-modal) annotated natural language corpora. Corpora are represented as elements like Sentences, Turns, Utterances, Words, Gestures and Markables. The API allows linguists to access corpora in terms of these discourse-level elements, i.e. at a conceptual level they are familiar with, with the flexibility offered by a general purpose programming language. It is also a contribution to corpus standardization efforts because it is based on a straightforward and easily extensible data model which can serve as a target for conversion of different corpus formats.
Der Kurzbeitrag berichtet über ein Projekt ”Hypertextualisierung auf textgrammatischer Grundlage“ (HyTex), in dem erforscht wird, wie sich linear organisierte Dokumente mit semiautomatischen Methoden auf der Grundlage von textgrammatischem Markup und der linguistisch motivierten Modellierung terminologischen Wissens in delinearisierte Hyperdokumente überführen lassen. Ziel ist es, eine Sammlung von Fachtexten so in einen Hypertext zu überführen, dass terminologiebedingte Verständnisschwierigkeiten beim Lesen durch entsprechende Linkangebote aufgelöst werden, so dass die Fachtexte auch von Semi-Experten der Domäne selektiv gelesen werden können. Der Schwerpunkt des Beitrags liegt auf der Modellierung terminologischen Wissens mit XML Topic Maps und dessen Stellenwert für die automatische Erzeugung von Hyperlinks.
Online Access Tools for Spoken German: The Resources of the Deutsches Spracharchiv in a Database
(2002)
This paper shows some details of the modernization of the Deutsches Spracharchiv (DSAv). It explores some future possibilities of linguistical documentation and analysis using the Web. The Institut für Deutsche Sprache (IDS) in Mannheim is the central institution for linguistic research in Germany. The DSAv in the IDS is the center for documentation and research of spoken German. These archives include the largest collection of sound recordings of spoken German (dialects and colloquial speech, including e.g. lots of extinct dialects of former German territories in Eastern Europe) - altogether more than 15,000 sound recordings. The lacking clarification and accessibility of this data material has been felt as an essential deficit. The opportunity to edit the sound signal digitally offers a much easier access to spoken language. Through the integration of the already existing information about the corpora and the transcribed texts in an information- and full text databank, as well as the linking of the data with the acoustic signal (alignment), arises a data-pool with considerably better documentation of the materials and a fast direct grasp of the recorded sounds. Thus, the DSAv initiates totally new research questions for the work at the IDS, as well as for linguistics altogether.
Here we will present a graphical software tool called Morph Moulder (MoMo) for teaching the formal foundations of a language with a denotation in a domain of relational typed feature structures as used in Head-Driven Phrase Structure Grammar. With MoMo, students learn the properties of totally well-typed, sort resolved relational feature structures, the use of formal languages to describe typed feature structures and the notions of constraint satisfaction and models of grammars written in a formal language. MoMo was realized and conceived within the context of a set of courses in the format of web-based training, that focuses on the concept of typed feature structures in a curriculum in grammar formalisms and parsing. The formal language of MoMo amends the constraint language of TRALE (an implementation platform for HPSG grammars based on ALE) to accommodate the expressive power of HPSG.
In the context of the HyTex project, our goal is to convert a corpus into a hypertext, basing conversion strategies on annotations which explicitly mark up the text-grammatical structures and relations between text segments. Domain-specific knowledge is represented in the form of a knowledge net, using topic maps. We use XML as an interchange format. In this paper, we focus on a declarative rule language designed to express conversion strategies in terms of text-grammatical structures and hypertext results. The strategies can be formulated in a concise formal syntax which is independend of the markup, and which can be transformed automatically into executable program code.
Generierung von Linkangeboten zur Rekonstruktion terminologiebedingter Wissensvoraussetzungen
(2002)
Dieser Beitrag skizziert Strategien zur (semi-)automatischen Annotation von definitorischen Textsegmenten und Termverwendungsinstanzen auf der Grundlage grammatisch annotierter Korpora. Ziel unserer Überlegungen ist es, bei der selektiven Rezeption von Fachtexten in einer Hypertextumgebung die je spezifischen Wissensvoraussetzungen, die der Verwendung von Fachtermini unterliegen und die für das Textverständnis eine entscheidende Rolle spielen, über automatisch generierte Linkangebote rekonstruierbar zu machen.
The classification of verbs in Levin's (1993) English Verb Classes and Alternations: A preliminary Investigation, on the basis of both intuitive semantic grouping and their participation in valence alternations, is often used by the NLP community as evidence of the semantic similarity of verbs (Jing & McKeown 1998; Lapata & Brew 1999; Kohl et al. 1998). In this paper, we compare the Levin classification with the work of the FrameNet project (Fillmore & Baker 2001), where words (not just verbs) are grouped according to the conceptual structures (frames) that underlie them and their combinatorial patterns are inductively derived from corpus evidence. This means that verbs grouped together in FrameNet (FN) might be semantically similar but have different (or no) alternations, and that verbs which share the same alternation might be represented in two different semantic frames.