
Enhancing speech corpus resources with multiple lexical tag layers

We describe a general two-stage procedure for re-using a custom corpus for spoken language system development, involving a transformation from character-based markup to XML, and DSSSL stylesheet-driven XML markup enhancement with multiple lexical tag trees. The procedure was used to generate a fully tagged corpus; alternatively, with greater economy of computing resources, it can be employed as a parametrised ‘tagging on demand’ filter. The implementation will shortly be released as a public resource together with the corpus (German spoken dialogue, about 500k word form tokens) and lexicon (about 75k word form types).
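The two stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: Python stands in for the DSSSL stylesheets, and the transcript convention (`<P>` for a pause), the element names (`turn`, `w`, `pause`), the layer names (`pos`, `lemma`) and the toy lexicon are all assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy lexicon: word form -> lexical layers.
# These entries are illustrative and not taken from the released lexicon.
LEXICON = {
    "guten": {"pos": "ADJ", "lemma": "gut"},
    "tag": {"pos": "N", "lemma": "Tag"},
}


def to_xml(line):
    """Stage 1: convert a character-based transcript line to XML.

    Assumed convention: tokens are whitespace-separated and '<P>'
    marks a pause; every other token is treated as a word form.
    """
    turn = ET.Element("turn")
    for tok in line.split():
        if tok == "<P>":
            ET.SubElement(turn, "pause")
        else:
            ET.SubElement(turn, "w").text = tok
    return turn


def tag_on_demand(turn, layers):
    """Stage 2: attach only the requested lexical layers to each <w>."""
    # Materialise the list first so the tree is not modified while iterating.
    for w in list(turn.iter("w")):
        entry = LEXICON.get(w.text.lower(), {})
        for layer in layers:
            if layer in entry:
                ET.SubElement(w, layer).text = entry[layer]
    return turn


# Request a single layer; the 'lemma' layer is never computed or stored.
turn = tag_on_demand(to_xml("guten <P> Tag"), layers=["pos"])
print(ET.tostring(turn, encoding="unicode"))
```

Passing the layer list as a parameter is what makes this a filter rather than a batch job: only the requested layers are looked up and attached, which reflects the economy of computing resources the abstract attributes to tagging on demand.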

Metadata
Author: Andreas Witt, Harald Lüngen, Dafydd Gibbon
URN: urn:nbn:de:bsz:mh39-45517
URL: http://lrec-conf.org/proceedings/lrec2000/
Parent Title (English): Proceedings of the 2nd International Conference on Language Resources and Evaluation (LREC-2000). Athens, Greece
Publisher: European Language Resources Association (ELRA)
Place of publication: Paris
Document Type: Conference Proceeding
Language: English
Year of first Publication: 2000
Date of Publication (online): 2016/01/11
Publication state: Published version
Review state: (Publisher's) editorial review
Tag: DSSSL; Morphology; Speech Corpora; Speech Lexica; Text Technology; XML
Page Number: 5
DDC classes: 400 Language / 410 Linguistics
Open Access?: yes
Linguistics-Classification: Corpus linguistics
Licence (German): Protected by copyright