Refine
Document Type
- Article (3)
- Conference Proceeding (2)
- Part of a Book (1)
Language
- English (6)
Has Fulltext
- yes (6)
Is part of the Bibliography
- no (6)
Keywords
- Computerunterstützte Lexikografie (2)
- Datenmanagement (2)
- Forschungsdaten (2)
- Lexikon (2)
- Annotation (1)
- Automatische Sprachanalyse (1)
- Computerlinguistik (1)
- Datenmodell (1)
- Datenstruktur (1)
- Datenverarbeitung (1)
Publicationstate
- Postprint (6) (remove)
Reviewstate
- Peer-Review (5)
- (Verlags)-Lektorat (1)
Publisher
Lexical resources are often represented in table form, e. g., in relational databases, or represented in specially marked up texts, for example, in document based XML models. This paper describes how it is possible to model lexical structures as graphs and how this model can be used to exploit existing lexical resources and even how different types of lexical resources can be combined.
In this contribution we present some work of the R&D European project “LIRICS” and of the ISO/TC 37/SC 4 committee related to the topic of interoperability and re-use of language resources. We introduce some basic mechanisms of the standardization work in ISO and describe in more details the general approach on how to cope with the annotation of language data within ISO.
The use of digital resources and tools across humanities disciplines is steadily increasing, giving rise to new research paradigms and associated methods that are commonly subsumed under the term digital humanities. Digital humanities does not constitute a new discipline in itself, but rather a new approach to humanities research that cuts across different existing humanities disciplines. While digital humanities extends well beyond language-based research, textual resources and spoken language materials play a central role in most humanities disciplines.
The ISOcat registry reloaded
(2012)
The linguistics community is building a metadata-based infrastructure for the description of its research data and tools. At its core is the ISOcat registry, a collaborative platform to hold a (to be standardized) set of data categories (i.e., field descriptors). Descriptors have definitions in natural language and little explicit interrelations. With the registry growing to many hundred entries, authored by many, it is becoming increasingly apparent that the rather informal definitions and their glossary-like design make it hard for users to grasp, exploit and manage the registry’s content. In this paper, we take a large subset of the ISOcat term set and reconstruct from it a tree structure following the footsteps of schema.org. Our ontological re-engineering yields a representation that gives users a hierarchical view of linguistic, metadata-related terminology. The new representation adds to the precision of all definitions by making explicit information which is only implicitly given in the ISOcat registry. It also helps uncovering and addressing potential inconsistencies in term definitions as well as gaps and redundancies in the overall ISOcat term set. The new representation can serve as a complement to the existing ISOcat model, providing additional support for authors and users in browsing, (re-)using, maintaining, and further extending the community’s terminological metadata repertoire.
The transfer of research data management from one institution to another infrastructural partner is all but trivial, but can be required, for instance, when an institution faces reorganization or closure. In a case study, we describe the migration of all research data, identify the challenges we encountered, and discuss how we addressed them. It shows that the moving of research data management to another institution is a feasible, but potentially costly enterprise. Being able to demonstrate the feasibility of research data migration supports the stance of data archives that users can expect high levels of trust and reliability when it comes to data safety and sustainability.
The chapter on formats and models for lexicons deals with different available data formats of lexical resources. It elaborates on their structure and possible uses. Motivated by the restrictions in merging different lexical resources based on widely spread formalisms and international standards, a formal lexicon model for lexical resources is developed which is related to graph structures in annotations. For lexicons this model is termed the Lexicon Graph. Within this model the concepts of lexicon entries and lexical structures frequently described in the literature are formally defined and examples are given. The article addresses the problem of ambiguity in those formal terms. An implementation based on XML and XML technology such as XQuery for the defined structures is given. The relation to international standards is included as well.