Unification of XML Documents with Concurrent Markup
This article introduces an approach to the unification of XML (Extensible Markup Language) documents that have identical textual content but concurrent markup, within the framework of XML-based multi-layer annotation. A Prolog program makes it possible to explore the relationships between element instances on two annotation layers that share PCDATA and to compute a target node hierarchy for a well-formed, merged XML document. Special attention is paid to identity conflicts between element instances. For these, a default solution is provided that takes into account the metarelations holding between element types on the different annotation layers. In addition, users can specify rules that prescribe how identity conflicts should be resolved for certain element types.
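To make the idea of relationships between element instances on two layers concrete, here is a minimal Python sketch. It is not the authors' Prolog program. It assumes that each element instance can be reduced to a pair of character offsets into the shared text, and the relation names (`identity`, `a_includes_b`, `b_includes_a`, `overlap`, `disjoint`), the `Span` type, and the sample layers are all hypothetical choices made for illustration.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """An element instance located by character offsets into the shared text."""
    name: str
    start: int  # inclusive offset
    end: int    # exclusive offset


def relation(a: Span, b: Span) -> str:
    """Classify how two element instances from different layers relate."""
    if a.start == b.start and a.end == b.end:
        return "identity"      # same text range: a potential identity conflict
    if a.end <= b.start or b.end <= a.start:
        return "disjoint"      # no shared characters
    if a.start <= b.start and b.end <= a.end:
        return "a_includes_b"  # b could become a descendant of a
    if b.start <= a.start and a.end <= b.end:
        return "b_includes_a"  # a could become a descendant of b
    return "overlap"           # partial overlap: cannot nest directly


# Hypothetical annotation layers over the same text, "the cat sat".
layer1 = [Span("s", 0, 11), Span("np", 0, 7)]
layer2 = [Span("line", 0, 11), Span("w", 4, 7), Span("phrase", 4, 11)]

for a in layer1:
    for b in layer2:
        print(a.name, b.name, relation(a, b))
```

In this sketch, pairs classified as `identity` are the ones that would need a default or user-specified rule to decide which element becomes the parent in the merged hierarchy, while `overlap` pairs are the ones that cannot be nested in a single well-formed tree without further treatment.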
Author: | Andreas Witt, Daniela Goecke, Felix Sasaki, Harald Lüngen |
---|---|
URN: | urn:nbn:de:bsz:mh39-45269 |
DOI: | https://doi.org/10.1093/llc/fqh046 |
ISSN: | 1477-4615 |
Parent Title (English): | Literary and Linguistic Computing |
Publisher: | Oxford University Press |
Place of publication: | Oxford |
Document Type: | Article |
Language: | English |
Year of first Publication: | 2005 |
Date of Publication (online): | 2016/01/04 |
Publication state: | Postprint |
Review state: | (Publisher's) editorial review |
GND Keyword: | Information Retrieval; XML (Extensible Markup Language) |
Volume: | 20 |
Issue: | 1 |
First Page: | 103 |
Last Page: | 116 |
Note: | This publication is freely accessible with the permission of the rights owner under an Alliance licence or a national licence (funded by the DFG, German Research Foundation). |
DDC classes: | 400 Language / 410 Linguistics |
Open Access?: | yes |
Linguistics-Classification: | Computational linguistics |