
Proceedings of the 12th Web as Corpus Workshop (ACL SIGWAC). Language Resources and Evaluation Conference (LREC 2020), Marseille, 11–16 May 2020

  • The 12th Web as Corpus workshop (WAC-XII) looks at the past, present, and future of web corpora, given that large web corpora are nowadays provided mostly by a few major initiatives and companies and that the diversity of the early years appears to have faded slightly. We also acknowledge that alternative sources of data have emerged, such as linguistic data from Twitter, other social media platforms, and other forms of the deep web, some of them available only to large companies and their affiliates. At the same time, gathering interesting and relevant web data (web crawling) is becoming an ever more intricate task as the nature of the data offered on the web changes (for example, the death of forums in favour of more closed platforms).

Metadata
URN:urn:nbn:de:bsz:mh39-118271
URL:http://www.lrec-conf.org/proceedings/lrec2020/workshops/WAC-II/index.html
ISBN:979-10-95546-68-9
Publisher:European Language Resources Association
Place of publication:Paris
Editor:Adrien Barbaresi, Felix Bildhauer, Roland Schäfer, Egon Stemle
Document Type:Book
Language:English
Year of first Publication:2020
Date of Publication (online):2023/05/30
Publishing Institution:Leibniz-Institut für Deutsche Sprache (IDS)
Publicationstate:Published version
Reviewstate:Peer-Review
Tag:linguistic data; web corpora; web crawling; web data
GND Keyword:Computerlinguistik; Forschungsdaten; International Conference on Language Resources and Evaluation (12. : 2020 : Marseille); Korpus <Linguistik>
Page Number:v; 65
DDC classes:400 Language / 400 Language, Linguistics
Open Access?:yes
Leibniz-Classification:Language, Linguistics
Linguistics-Classification:Corpus linguistics
Program areas:G1: Beschreibung und Erschließung Grammatischen Wissens
Licence (English):Creative Commons - Attribution-NonCommercial 4.0 International