Year of publication
- 2019 (1)
Language
- English (1)
Has Fulltext
- yes (1)
Is part of the Bibliography
- no (1)
Keywords
- Korpus <Linguistik> (1)
- corpus linguistics (1)
- corpus processing (1)
- web corpora (1)
Reviewstate
- Peer-Review (1)
Nearly all of the very large corpora of English are “static”, which allows a wide range of data, such as collocates, to be pre-processed once. The challenge comes with large “dynamic” corpora, which are updated regularly and for which pre-processing is much more difficult. This paper provides an overview of the NOW corpus (News on the Web), which is currently 8.2 billion words in size and grows by about 170 million words each month. We discuss the architecture of NOW and provide many examples that show how data from NOW can (uniquely) be extracted to examine a wide range of ongoing changes in English.