In this contribution, we discuss and compare alternative options of modelling the entities and relations of wordnet-like resources in the Web Ontology Language OWL. Based on different modelling options, we developed three models of representing wordnets in OWL, i.e. the instance model, the class model, and the metaclass model. These OWL models mainly differ with respect to the ontological status of lexical units (word senses) and synsets. While in the instance model lexical units and synsets are represented as individuals, in the class model they are represented as classes; both model types can be encoded in the dialect OWL DL. As a third alternative, we developed a metaclass model in OWL Full, in which lexical units and synsets are defined as metaclasses, the individuals of which are classes themselves. We apply the three OWL models to each of three wordnet-style resources: (1) a subset of the German wordnet GermaNet, (2) the wordnet-style domain ontology TermNet, and (3) GermaTermNet, in which TermNet technical terms and GermaNet synsets are connected by means of a set of "plug-in" relations. We report on the results of several experiments in which we evaluated the performance of querying and processing these different models: (1) a comparison of all three OWL models (class, instance, and metaclass model) of TermNet in the context of automatic text-to-hypertext conversion, and (2) an investigation of the potential of the GermaTermNet resource by the example of a wordnet-based semantic relatedness calculation.
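The difference between the instance model and the class model can be sketched in plain Python, using triples as simple tuples. The names (Synset, SynsetHund, containsLexicalUnit, etc.) are illustrative assumptions, not the schema actually used for GermaNet or TermNet:

```python
# Instance model (OWL DL): the synset and its lexical unit are individuals,
# i.e. instances of the classes Synset and LexicalUnit.
instance_model = {
    ("Synset", "rdf:type", "owl:Class"),
    ("LexicalUnit", "rdf:type", "owl:Class"),
    ("synset_Hund", "rdf:type", "Synset"),
    ("lu_Hund", "rdf:type", "LexicalUnit"),
    ("synset_Hund", "containsLexicalUnit", "lu_Hund"),
}

# Class model (OWL DL): synsets are themselves classes, so hyponymy can be
# expressed directly as rdfs:subClassOf.
class_model = {
    ("SynsetHund", "rdf:type", "owl:Class"),
    ("SynsetTier", "rdf:type", "owl:Class"),
    ("SynsetHund", "rdfs:subClassOf", "SynsetTier"),  # Hund (dog) < Tier (animal)
}

def hypernyms(model, cls):
    """Transitive rdfs:subClassOf closure for a class in the class model."""
    out, frontier = set(), {cls}
    while frontier:
        nxt = {o for (s, p, o) in model
               if p == "rdfs:subClassOf" and s in frontier}
        frontier = nxt - out
        out |= nxt
    return out

print(hypernyms(class_model, "SynsetHund"))  # {'SynsetTier'}
```

In the metaclass model (OWL Full), SynsetHund would additionally be typed as an instance of Synset, which is what pushes the model outside OWL DL.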
Discourse parsing of complex text types such as scientific research articles requires the analysis of an input document on linguistic and structural levels that go beyond traditionally employed lexical discourse markers. This chapter describes a text-technological approach to discourse parsing. Discourse parsing with the aim of providing a discourse structure is seen as the addition of a new annotation layer for input documents marked up on several linguistic annotation levels. The discourse parser generates discourse structures according to Rhetorical Structure Theory. An overview of the knowledge sources and components for parsing scientific journal articles is given. The parser's core consists of cascaded applications of the GAP, a Generic Annotation Parser. Details of the chart parsing algorithm are provided, as well as a short evaluation in terms of comparisons with reference annotations from our corpus and with recently developed systems with a similar task.
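To illustrate the kind of chart parsing the chapter refers to, here is a minimal CKY-style chart parser in Python. This is a generic textbook sketch, not the GAP implementation; the toy grammar and lexicon are invented for illustration:

```python
from itertools import product

def cky(words, lexicon, rules):
    """CKY chart parsing over a grammar in Chomsky normal form.
    lexicon: word -> set of nonterminals; rules: (B, C) -> set of A for A -> B C.
    chart[(i, j)] holds all nonterminals spanning words i..j-1."""
    n = len(words)
    chart = {}
    for i, w in enumerate(words):                     # fill terminal cells
        chart[(i, i + 1)] = set(lexicon.get(w, ()))
    for span in range(2, n + 1):                      # widen spans bottom-up
        for i in range(n - span + 1):
            j = i + span
            cell = set()
            for k in range(i + 1, j):                 # try every split point
                for B, C in product(chart[(i, k)], chart[(k, j)]):
                    cell |= rules.get((B, C), set())
            chart[(i, j)] = cell
    return chart

lexicon = {"the": {"Det"}, "parser": {"N"}, "builds": {"V"},
           "structures": {"N", "V"}}
rules = {("Det", "N"): {"NP"}, ("V", "NP"): {"VP"}, ("NP", "VP"): {"S"}}

words = "the parser builds the structures".split()
chart = cky(words, lexicon, rules)
print("S" in chart[(0, len(words))])  # True
```

The GAP operates on annotation layers rather than raw word sequences, but the same chart logic (cells indexed by spans, filled bottom-up over all split points) carries over.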
Integrated Linguistic Annotation Models and Their Application in the Domain of Antecedent Detection
(2011)
Seamless integration of various, often heterogeneous linguistic resources in terms of their output formats, together with a combined analysis of the respective annotation layers, are crucial tasks for linguistic research. After a decade of concentration on the development of formats to structure single annotations for specific linguistic issues, in recent years a variety of specifications for storing multiple annotations over the same primary data has been developed. The paper focuses on integrating logical document structure information, as a knowledge resource, into a text document to enhance automatic anaphora resolution, both for candidate detection and for antecedent selection. The paper investigates the data structures necessary for knowledge integration and retrieval.
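One way logical document structure can feed antecedent selection is as a ranking feature alongside surface distance. The following is a hedged sketch under invented names and a deliberately crude scoring scheme; it is not the paper's actual model:

```python
from dataclasses import dataclass

@dataclass
class Mention:
    text: str
    position: int   # token offset in the document
    section: str    # logical document structure unit, e.g. "section-2"

def rank_candidates(anaphor, candidates):
    """Rank antecedent candidates: prefer mentions in the same structural
    unit as the anaphor, then break ties by recency (smaller distance)."""
    def score(c):
        same_section = 1 if c.section == anaphor.section else 0
        distance = anaphor.position - c.position
        return (same_section, -distance)   # structure first, then recency
    # Only mentions preceding the anaphor can be antecedents.
    preceding = [c for c in candidates if c.position < anaphor.position]
    return sorted(preceding, key=score, reverse=True)

anaphor = Mention("it", 120, "section-2")
candidates = [Mention("the parser", 40, "section-1"),
              Mention("the corpus", 100, "section-2")]
best = rank_candidates(anaphor, candidates)[0]
print(best.text)  # the corpus
```

The point of the sketch is only the data-structure side: once document structure is available as an annotation layer, it becomes an ordinary queryable feature during both candidate detection (filtering) and antecedent selection (ranking).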
Researchers in many disciplines, sometimes working in close cooperation, have been concerned with modeling textual data in order to account for texts as the prime information unit of written communication. The list of disciplines includes computer science and linguistics as well as more specialized disciplines like computational linguistics and text technology. What many of these efforts have in common is the aim to model textual data by means of abstract data types or data structures that support at least the semi-automatic processing of texts in any area of written communication.
As an introduction to the Special Issue on "Formulation, generalization, and abstraction in interaction," this paper discusses key problems of a conversation analytic (CA) approach to semantics in interaction. Prior research in CA and Interactional Linguistics has only rarely dealt with issues of linguistic meaning in interaction. It is argued that this is a consequence of the limitations of sequential analysis in capturing meaning in interaction. While sequential analysis remains the encompassing methodological framework, it is suggested that it needs to be complemented by analyzing semantic relationships between choices of formulation in the interaction, by ethnography, and by structural techniques of comparing selected options with possible alternatives. The paper describes the methodological approach to interactional semantics taken by the papers in the Special Issue, which analyse practices of generalization and abstraction in interaction as they are accomplished by formulations of prior versions of reference and description.
This paper analyses one specific conversational practice of formulation called 'notionalization'. It consists in the transformation of a description by a prior speaker into a categorization by the next speaker. Sequences of this kind are a "natural laboratory" for studying the differences between descriptions and categorizations with regard to their semantic, interactional, and rhetorical properties:
- Descriptive/narrative versions are often vague and tentative multi-unit turns, which are temporalized and episodic, offering a lot of contingent, situational, and indexical detail.
- Notionalizations turn them into condensed, abstract, timeless, and often agentless categorizations expressed by a noun (phrase) within one turn-constructional unit (TCU).
Drawing on audio- and video-taped German data from various types of interaction, the paper focuses on one particular practice of notionalization: the formulation of purportedly common ground by TCUs prefaced with the German connective "also" ('so'). The paper discusses their turn-constructional and morphological properties, pointing out affinities of notionalization with language for special purposes. Notionalizations are used for reducing detail and for topical closure. They provide grounds for emergent keywords, which can be reused to re-contextualize topical issues and interactional histories efficiently. Notionalizations are powerful means for accomplishing intersubjectivity while at the same time pursuing (sometimes one-sided) practical relevancies. Their inevitably perspectival design may thus re-open the very issue they were meant to settle. The paper closes with an outlook on other practices of notionalization, pointing to dimensions of interactionally relevant variation and commonalities.