OPUS 4 | Search

19 search hits

1 to 10

OWL ontologies as a resource for discourse parsing (2008)

Bärenfänger, Maja ; Hilbert, Mirco ; Lobin, Henning ; Lüngen, Harald

In the project SemDok (Generic document structures in linearly organised texts), funded by the German Research Foundation (DFG), a discourse parser for a complex text type (scientific articles, as an example) is being developed. Discourse parsing (henceforth DP) according to Rhetorical Structure Theory (RST) (Mann and Taboada, 2005; Marcu, 2000) deals with automatically assigning a text a tree structure in which discourse segments and the rhetorical relations between them, such as Concession, are marked. For identifying the combinable segments, declarative rules are employed. These rules describe linguistic and structural cues and constraints on possible combinations by referring to different XML annotation layers of the input text and to external knowledge bases such as a discourse marker lexicon, a lexico-semantic ontology (later to be combined with a domain ontology), and an ontology of rhetorical relations. In our text-technological environment, the obvious choice of formalism for representing such ontologies is OWL (Smith et al., 2004). In this paper, we describe two OWL ontologies and how they are consulted by the discourse parser to solve certain tasks within DP. The first ontology is a taxonomy of rhetorical relations that was developed in the project. The second is an OWL version of GermaNet, the model of which we designed together with our project partners.
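The abstract describes a parser that consults a taxonomy of rhetorical relations. The following is a minimal sketch of such a lookup, not the SemDok implementation: it uses a plain dictionary rather than OWL, and every relation name except Concession, as well as the helper names `PARENT`, `ancestors`, and `is_subrelation`, is a hypothetical placeholder chosen for illustration.

```python
# Hypothetical child -> parent links standing in for a relation taxonomy.
# Only "Concession" is named in the abstract; all other names are assumptions.
PARENT = {
    "Concession": "ContrastiveRelation",
    "Antithesis": "ContrastiveRelation",
    "ContrastiveRelation": "RhetoricalRelation",
    "Elaboration": "RhetoricalRelation",
}


def ancestors(relation):
    """Return the chain of superclasses of `relation`, nearest first."""
    chain = []
    while relation in PARENT:
        relation = PARENT[relation]
        chain.append(relation)
    return chain


def is_subrelation(relation, candidate_super):
    """True if `relation` equals `candidate_super` or is subsumed by it."""
    return relation == candidate_super or candidate_super in ancestors(relation)
```

With a check of this kind, a rule written for a general relation class could also apply to its more specific subrelations, which is one plausible reason for organizing the relations as a taxonomy.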

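Allgemeine Überlegungen zur Retrodigitalisierung historischer Wörterbücher des Deutschen [English: General considerations on the retrospective digitization of historical dictionaries of German] (2008)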

Lobenstein-Reichmann, Anja

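Jens Kegel: “Wollt ihr den totalen Krieg?”. Eine semiotische und linguistische Gesamtanalyse der Rede Goebbels' im Berliner Sportpalast am 18. Februar 1943 [Rezension] [English: “Do you want total war?” A comprehensive semiotic and linguistic analysis of Goebbels' speech at the Berlin Sportpalast on 18 February 1943 (review)] (2008)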

Lobenstein-Reichmann, Anja

Einstellungen zu Normen aus sprachlicher Sicht [English: Attitudes towards norms from a linguistic perspective] (2008)

Wimmer, Rainer

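Weltansichten aus sprachlicher und rechtlicher Perspektive. Zur Ontisierung von Konzepten des Rechts [English: Worldviews from a linguistic and legal perspective. On the ontologization of legal concepts] (2008)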

Wimmer, Rainer

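Zur semantischen und formalen Differenzierung im Übersetzungvergleich, am Beispiel von über-, auf- und an-Attributen [English: On semantic and formal differentiation in translation comparison, using the example of über-, auf- and an-attributes] (2008)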

Cosma, Ruxandra

Dependenz, Valenz und kategoriale Analyse [English: Dependency, valency and categorial analysis] (2008)

Lobin, Henning

Ulrich Engel. Doctor Honoris Causa (2008)

Vorwort [English: Preface] (2008)

Engel, Ulrich ; Stănescu, Speranţa

Cost-Sensitive Learning in Answer Extraction (2008)

Wiegand, Michael ; Leidner, Jochen L. ; Klakow, Dietrich

One problem of data-driven answer extraction in open-domain factoid question answering is that the class distribution of labeled training data is fairly imbalanced. In an ordinary training set, there are far more incorrect answers than correct answers. The class imbalance is thus inherent to the classification task. It degrades the performance of classifiers trained by standard machine learning algorithms, which usually have a heavy bias towards the majority class, i.e. the class that occurs most often in the training set. In this paper, we propose a method to tackle class imbalance by applying a form of cost-sensitive learning, which is preferable to sampling. We present a simple but effective way of estimating the misclassification costs on the basis of the class distribution. This approach offers three benefits. Firstly, it maintains the distribution of the classes of the labeled training data. Secondly, this form of meta-learning can be applied to a wide range of common learning algorithms. Thirdly, this approach can be easily implemented with the help of state-of-the-art machine learning software.
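The abstract does not give the cost formula, so the following is only a sketch of one common way to derive misclassification costs from the class distribution: each class gets a cost inversely proportional to its frequency. The function names `estimate_costs` and `to_instance_weights` and the 95/5 toy split are assumptions made for illustration, not the authors' method or data.

```python
from collections import Counter


def estimate_costs(labels):
    """Assumed heuristic: cost(c) = n_total / (n_classes * count(c)).

    Rare classes receive a high cost, frequent classes a low one.
    """
    counts = Counter(labels)
    total = len(labels)
    n_classes = len(counts)
    return {c: total / (n_classes * n) for c, n in counts.items()}


def to_instance_weights(labels, costs):
    """Map each training instance to the cost of its class."""
    return [costs[y] for y in labels]


# Toy training set with far more incorrect than correct answer candidates.
labels = ["incorrect"] * 95 + ["correct"] * 5
costs = estimate_costs(labels)
weights = to_instance_weights(labels, costs)
```

The training data itself is left untouched, in line with the first benefit named in the abstract. The resulting weights could be passed to any learner that accepts per-instance weights.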
