TY  - CHAP
U1  - Konferenzveröffentlichung
A1  - Davies, Mark
ED  - Bański, Piotr
ED  - Barbaresi, Adrien
ED  - Biber, Hanno
ED  - Breiteneder, Evelyn
ED  - Clematide, Simon
ED  - Kupietz, Marc
ED  - Lüngen, Harald
ED  - Iliadi, Caroline
T1  - The best of both worlds: Multi-billion word “dynamic” corpora
T2  - Proceedings of the Workshop on Challenges in the Management of Large Corpora (CMLC-7) 2019. Cardiff, 22nd July 2019
N2  - Nearly all of the very large corpora of English are “static”, which allows a wide range of one-time, pre-processed data, such as collocates. The challenge comes with large “dynamic” corpora, which are updated regularly, and where preprocessing is much more difficult. This paper provides an overview of the NOW corpus (News on the Web), which is currently 8.2 billion words in size, and which grows by about 170 million words each month. We discuss the architecture of NOW, and provide many examples that show how data from NOW can (uniquely) be extracted to look at a wide range of ongoing changes in English.
KW  - corpus linguistics
KW  - corpus processing
KW  - web corpora
KW  - Korpus
Y1  - 2019
UN  - https://nbn-resolving.org/urn:nbn:de:bsz:mh39-90234
U6  - https://doi.org/10.14618/ids-pub-9023
DO  - https://doi.org/10.14618/ids-pub-9023
SP  - 23
EP  - 28
PB  - Leibniz-Institut für Deutsche Sprache
CY  - Mannheim
ER  - 