TY  - CHAP
U1  - Conference publication
A1  - Evert, Stefan
A1  - Hardie, Andrew
ED  - Bański, Piotr
ED  - Biber, Hanno
ED  - Breiteneder, Evelyn
ED  - Kupietz, Marc
ED  - Lüngen, Harald
ED  - Witt, Andreas
T1  - Ziggurat: A new data model and indexing format for large annotated text corpora
T2  - Proceedings of the 3rd Workshop on Challenges in the Management of Large Corpora (CMLC-3), Lancaster, 20 July 2015
N2  - The IMS Open Corpus Workbench (CWB) software currently uses a simple tabular data model with proven limitations. We outline and justify the need for a new data model to underlie the next major version of CWB. This data model, dubbed Ziggurat, defines a series of types of data layer to represent different structures and relations within an annotated corpus; each such layer may contain variables of different types. Ziggurat will allow us to gradually extend and enhance CWB’s existing CQP-syntax for corpus queries, and also make possible more radical departures relative not only to the current version of CWB but also to other contemporary corpus-analysis software.
KW  - Corpus
KW  - Annotation
KW  - Database system
KW  - Large corpora
KW  - Corpus annotation
KW  - Corpus technology
KW  - Corpus linguistics
KW  - Corpus query language
Y1  - 2015
U6  - https://nbn-resolving.org/urn:nbn:de:bsz:mh39-38335
UN  - https://nbn-resolving.org/urn:nbn:de:bsz:mh39-38335
SP  - 21
EP  - 27
PB  - Institut für Deutsche Sprache
CY  - Mannheim
ER  - 