TY  - CHAP
U1  - Conference publication
A1  - Witt, Andreas
A1  - Lüngen, Harald
A1  - Gibbon, Dafydd
T1  - Enhancing speech corpus resources with multiple lexical tag layers
T2  - Proceedings of the 2nd International Conference on Language Resources and Evaluation (LREC-2000). Athens, Greece
N2  - We describe a general two-stage procedure for re-using a custom corpus for spoken language system development. The first stage transforms character-based markup to XML; the second is a DSSSL stylesheet-driven enhancement of the XML markup with multiple lexical tag trees. The procedure was used to generate a fully tagged corpus; alternatively, with greater economy of computing resources, it can be employed as a parametrised ‘tagging on demand’ filter. The implementation will shortly be released as a public resource together with the corpus (German spoken dialogue, about 500k word form tokens) and lexicon (about 75k word form types).
KW  - DSSSL
KW  - Morphology
KW  - Speech Corpora
KW  - Speech Lexica
KW  - Text Technology
KW  - XML
Y1  - 2000
U6  - https://nbn-resolving.org/urn:nbn:de:bsz:mh39-45517
UN  - https://nbn-resolving.org/urn:nbn:de:bsz:mh39-45517
UR  - http://lrec-conf.org/proceedings/lrec2000/
SP  - 5
S1  - 5
PB  - European Language Resources Association (ELRA)
CY  - Paris
ER  - 