OPUS 4 | Search

94 search hits

1 to 10

[tiger2] As a standardized serialisation for ISO 24615 - SynAF (2012)

Bosch, Sonja ; Choi, Key-Sun ; de la Clergerie, Éric Villemonte ; Fang, Alex Chengyu ; Faaß, Gertrud ; Lee, Kiyong ; Pareja-Lora, Antonio ; Romary, Laurent ; Witt, Andreas ; Zeldes, Amir ; Zipser, Florian

This paper presents the application of the <tiger2/> format to various linguistic scenarios with the aim of making it the standard serialisation for the ISO 24615 [1] (SynAF) standard. After outlining the main characteristics of both the SynAF metamodel and the <tiger2/> format, as extended from the initial Tiger XML format [2], we show through a range of different language families how <tiger2/> covers a variety of constituency and dependency based analyses.

A pragmatic approach to XML interoperability – the Component Metadata Infrastructure (CMDI) (2011)

Broeder, Daan ; Schonefeld, Oliver ; Trippel, Thorsten ; Van Uytvanck, Dieter ; Witt, Andreas

XML has been designed for creating structured documents, but the information that is encoded in these structures is, by definition, out of scope for XML. Additional sources that are normally not easily interpretable by computers, such as documentation, are needed to determine the intention of specific tags in a tag-set. The Component Metadata Infrastructure (CMDI) takes a rather pragmatic approach to foster interoperability between XML instances in the domain of metadata descriptions for language resources. This paper gives an overview of this approach.

A standards-related web-based information system (2012)

Stührenberg, Maik ; Schonefeld, Oliver ; Witt, Andreas

This late breaking proposal introduces the prototype of a web-based information system dealing with standards in the field of annotation.

A Web-Platform for Preserving, Exploring, Visualising, and Querying Linguistic Corpora and other Resources (2008)

Rehm, Georg ; Schonefeld, Oliver ; Witt, Andreas ; Chiarcos, Christian ; Lehmberg, Timm

We present SPLICR, the Web-based Sustainability Platform for Linguistic Corpora and Resources. The system is aimed at people who work in Linguistics or Computational Linguistics: a comprehensive database of metadata records can be explored in order to find language resources that could be appropriate for one’s specific research needs. SPLICR also provides a graphical interface that enables users to query and to visualise corpora. The project in which the system is developed aims at sustainably archiving the ca. 60 language resources that have been constructed in three collaborative research centres. Our project has two primary goals: (a) To process and to archive sustainably the resources so that they are still available to the research community in five, ten, or even 20 years’ time. (b) To enable researchers to query the resources both on the level of their metadata and on the level of linguistic annotations. In more general terms, our goal is to enable solutions that leverage the interoperability, reusability, and sustainability of heterogeneous collections of language resources.

Aspects of Sustainability in Digital Humanities (2008)

Rehm, Georg ; Witt, Andreas

Aufbau einer Korpusinfrastruktur für die Beobachtung des Schreibgebrauchs (2016)

Fischer, Peter M. ; Diewald, Nils ; Kupietz, Marc ; Witt, Andreas

Avoiding Data Graveyards : from Heterogeneous Data Collected in Multiple Research Projects to Sustainable Linguistic Resources (2006)

Schmidt, Thomas ; Chiarcos, Christian ; Lehmberg, Timm ; Rehm, Georg ; Witt, Andreas ; Hinrichs, Erhard

This paper describes a new research initiative addressing the issue of sustainability of linguistic resources. The initiative is a cooperation between three collaborative research centres in Germany – the SFB 441 “Linguistic Data Structures” in Tübingen, the SFB 538 “Multilingualism” in Hamburg, and the SFB 632 “Information Structure” in Potsdam/Berlin. The aim of the project is to develop methods for sustainable archiving of the diverse bodies of linguistic data used at the three sites. In the first half of the paper, the data handling solutions developed so far at the three centres are briefly introduced. This is followed by an assessment of their commonalities and differences and of what these entail for the work of the new joint initiative. The second part then sketches seven areas of open questions with respect to sustainable data handling and gives a more detailed account of two of them – integration of linguistic terminologies and development of best practice guidelines.

Beispiel einer komplexen DTD: DocBook (2003)

Witt, Andreas ; Pönninghaus, Jens

Co-reference annotation and resources: a multilingual corpus of typologically diverse languages (2002)

Sasaki, Felix ; Wegener, Claudia ; Witt, Andreas ; Metzing, Dieter ; Pönninghaus, Jens

This article introduces a dialogue corpus containing data from two typologically different languages, Japanese and Kilivila. The corpus is annotated in accordance with language-specific annotation schemes for co-referential and similar relations. The article describes the corpus data, the properties of language-specific co-reference in the two languages and a methodology for its annotation. Examples from the corpus show how this methodology is used in the workflow of the annotation process.

Co-reference in Japanese Task-oriented Dialogues: A Contribution to the Development of Language-specific and Language-general Annotation Schemes and Resources (2004)

Sasaki, Felix ; Witt, Andreas

This paper describes a corpus of Japanese task-oriented dialogues, i.e. its data, annotations, analysis methodology and preliminary results for the modeling of co-referential phenomena. Current corpus based approaches to co-reference concentrate on textual data from English or other European languages. Hence, the emerging language-general models of co-reference miss input from dialogue data of non-European languages. We aim to fill this gap and contribute to a model of co-reference on various language-specific and language-general levels.
