OPUS 4 | CMLC-3 / 3rd Workshop on Challenges in the Management of Large Corpora

Ziggurat: A new data model and indexing format for large annotated text corpora (2015)

The IMS Open Corpus Workbench (CWB) software currently uses a simple tabular data model with proven limitations. We outline and justify the need for a new data model to underlie the next major version of CWB. This data model, dubbed Ziggurat, defines several types of data layer to represent different structures and relations within an annotated corpus; each such layer may contain variables of different types. Ziggurat will allow us to gradually extend and enhance CWB’s existing CQP syntax for corpus queries, and will also make possible more radical departures, not only from the current version of CWB but also from other contemporary corpus-analysis software.

Open Access