Refine
Document Type
- Part of a Book (2)
- Conference Proceeding (1)
Has Fulltext
- yes (3)
Is part of the Bibliography
- no (3)
Keywords
- DSSSL (3) (remove)
Publicationstate
Reviewstate
Publisher
Im Folgenden wird eine texttechnologische Komponente zur Expansion eines XML- annotierten Stammformenlexikons, das auf Einträgen eines Standardwörterbuchs basiert, vorgestellt. Diese Expansion wurde in der Document Style Semantics and Specification Language implementiert. Ihr Ergebnis ist ein Vollformenlexikon, das ebenfalls in XML repräsentiert ist.
We describe a general two-stage procedure for re-using a custom corpus for spoken language system development involving a transformation from character-based markup to XML, and DSSSL stylesheet-driven XML markup enhancement with multiple lexical tag trees. The procedure was used to generate a fully tagged corpus; alternatively with greater economy of computing resources, it can be employed as a parametrised ‘tagging on demand’ filter. The implementation will shortly be released as a public resource together with the corpus (German spoken dialogue, about 500k word form tokens) and lexicon (about 75k word form types).