Are web corpora inferior? The Case of Czech and Slovak
- Our paper describes an experiment aimed at assessing lexical coverage in web corpora in comparison with traditional corpora for two closely related Slavic languages, from the lexicographer's perspective. The preliminary results show that web corpora should not be considered "inferior", but rather "different".
Author: | Vladimír Benko |
---|---|
URN: | urn:nbn:de:bsz:mh39-62648 |
Parent Title (English): | Proceedings of the Workshop on Challenges in the Management of Large Corpora and Big Data and Natural Language Processing (CMLC-5+BigNLP) 2017 including the papers from the Web-as-Corpus (WAC-XI) guest section. Birmingham, 24 July 2017 |
Publisher: | Institut für Deutsche Sprache |
Place of publication: | Mannheim |
Editor: | Piotr Bański, Marc Kupietz, Harald Lüngen, Paul Rayson, Hanno Biber, Evelyn Breiteneder, Simon Clematide, John Mariani, Mark Stevenson, Theresa Sick |
Document Type: | Conference Proceeding |
Language: | English |
Year of first Publication: | 2017 |
Date of Publication (online): | 2017/07/05 |
Publicationstate: | Published version |
Reviewstate: | Peer-Review |
Tag: | Corpus linguistics; Czech; Slovak; Web corpora |
GND Keyword: | Internet; Korpus <Linguistik>; Slowakisch; Tschechisch |
Page Number: | 6 |
First Page: | 43 |
Last Page: | 48 |
DDC classes: | 400 Language |
Open Access?: | yes |
Leibniz-Classification: | Language, Linguistics |
Linguistics-Classification: | Corpus linguistics |
Conferences, Workshops: | CMLC-5 + BigNLP / 5th Workshop on Challenges in the Management of Large Corpora and Big Data and Natural Language Processing |
Licence (German): | Creative Commons - Attribution-NonCommercial-NoDerivs 3.0 Germany |