OPUS 4 | Search

13 search hits

XML in der Praxis: Dokument parsen, validieren und verarbeiten [XML in Practice: Parsing, Validating, and Processing Documents] (2003)

Köhler, Werner ; Witt, Andreas ; Sasaki, Felix ; Milde, Jan-Torsten ; Pönninghaus, Jens

Unification of XML Documents with Concurrent Markup (2004)

Witt, Andreas ; Lüngen, Harald ; Sasaki, Felix ; Goecke, Daniela

Unification of XML Documents with Concurrent Markup (2005)

Witt, Andreas ; Goecke, Daniela ; Sasaki, Felix ; Lüngen, Harald

An approach to the unification of XML (Extensible Markup Language) documents with identical textual content and concurrent markup in the framework of XML-based multi-layer annotation is introduced. A Prolog program allows the possible relationships between element instances on two annotation layers that share PCDATA to be explored and also the computing of a target node hierarchy for a well-formed, merged XML document. Special attention is paid to identity conflicts between element instances, for which a default solution that takes into account metarelations that hold between element types on the different annotation layers is provided. In addition, rules can be specified by a user to prescribe how identity conflicts should be solved for certain element types.

Schema Languages & Internationalization Issues: A survey (2005)

Sasaki, Felix ; Lieske, Christian ; Witt, Andreas

Many XML-related activities (e.g. the creation of a new schema) already address issues with different languages, scripts, and cultures. Nevertheless, a need exists for additional mechanisms and guidelines for more effective internationalization (i18n) and localization (l10n) in XML-related contents and processes. The W3C Internationalization Tag Set Working Group (W3C ITS WG) addresses this need and works on data categories, representation mechanisms and guidelines related to i18n and l10n support in the XML realm. This paper describes initial findings from the W3C ITS WG. Furthermore, the paper discusses how these findings relate to specific schema languages and to complementary technologies like namespace sectioning, schema annotation and the description of processing chains. The paper exemplifies why certain requirements can only be met by a combination of technologies, and discusses these technologies.

Präsentation, Transformation und Analyse: Verarbeitung XML-basierter japanischer Dialoge [Presentation, Transformation, and Analysis: Processing XML-based Japanese Dialogues] (2001)

Sasaki, Felix ; Witt, Andreas

This paper addresses the use of XML for the annotation, presentation, transformation, and analysis of Japanese dialogue data. Different text-technological standards are described and implemented for the different purposes.

Multilingual language resources and interoperability (2009)

Witt, Andreas ; Heid, Ulrich ; Sasaki, Felix ; Sérasset, Gilles

This article introduces the topic of "Multilingual language resources and interoperability". We start with a taxonomy and parameters for classifying language resources. We then provide examples of interoperability issues and of resource architectures that solve them. Finally, we discuss aspects of linguistic formalisms and interoperability.

Linguistische Korpora [Linguistic Corpora] (2004)

Sasaki, Felix ; Witt, Andreas

Interrelating Treebanks with Language-Specific Descriptions of Information Structure (2004)

Storbeck, Daniel ; Kwon, Sanghee ; Sasaki, Felix ; Witt, Andreas

The motivation for this article is to describe a methodology for interrelating and analyzing language- and theory-specific corpus data from various languages. As an example phenomenon we use information structure (IS, see [3]) in treebanks from three languages: Spanish, Korean and Japanese. Korean and Japanese are typologically close, while both are typologically different from Spanish. The problem in annotating IS is therefore that languages diverge in the formal linguistic means they use to realize IS functions (like "topicalization / contrast") on various levels such as prosody, morphology and word order. Hence, it is necessary to describe the relations between language-specific formal means and functional views on IS, and how to operationalize these relations for corpus analysis.

GOLD and Discourse: Domain- and Community-Specific Extensions (2005)

Goecke, Daniela ; Lüngen, Harald ; Sasaki, Felix ; Witt, Andreas ; Farrar, Scott

Declarations of Relations, Differences and Transformations between Theory-specific Treebanks: A New Methodology (2003)

Sasaki, Felix ; Witt, Andreas ; Metzing, Dieter

This paper deals with the problem of how to interrelate theory-specific treebanks and how to transform one treebank format to another. Currently, two approaches to achieve these goals can be differentiated. The first creates a mapping algorithm between treebank formats. Categories of a source format are transformed into a target format via a given set of general or language-specific mapping rules. The second relates treebanks via a transformation to a general model of linguistic categories, for example based on the EAGLES recommendations for syntactic annotations of corpora, or relying on the HPSG framework. This paper proposes a new methodology as a solution for these desiderata.
