TY  - CHAP
U1  - Conference publication
A1  - Schäfer, Roland
ED  - Bański, Piotr
ED  - Biber, Hanno
ED  - Breiteneder, Evelyn
ED  - Kupietz, Marc
ED  - Lüngen, Harald
ED  - Witt, Andreas
T1  - Processing and querying large web corpora with the COW14 architecture
T2  - Proceedings of the 3rd Workshop on Challenges in the Management of Large Corpora (CMLC-3), Lancaster, 20 July 2015
N2  - In this paper, I present the COW14 tool chain, which comprises a web corpus creation tool called texrex, wrappers for existing linguistic annotation tools as well as an online query software called Colibri2. By detailed descriptions of the implementation and systematic evaluations of the performance of the software on different types of systems, I show that the COW14 architecture is capable of handling the creation of corpora of up to at least 100 billion tokens. I also introduce our running demo system which currently serves corpora of up to roughly 20 billion tokens in Dutch, English, French, German, Spanish, and Swedish.
KW  - Corpus
KW  - Annotation
KW  - Database system
KW  - Web corpus
KW  - Large corpora
KW  - Corpus annotation
KW  - Corpus technology
KW  - Corpus query language
Y1  - 2015
U6  - https://nbn-resolving.org/urn:nbn:de:bsz:mh39-38367
UN  - https://nbn-resolving.org/urn:nbn:de:bsz:mh39-38367
SP  - 28
EP  - 34
PB  - Institut für Deutsche Sprache
CY  - Mannheim
ER  - 