Reddit Corpus Keyword Search (ReCKS). Ein Tool für die Gewinnung und Auswertung von Sprachdaten aus der Social Media-Plattform Reddit [A tool for collecting and analysing language data from the social media platform Reddit]
- ReCKS (“Reddit Corpus Keyword Search”) is a web application for the linguistic study of Reddit comments. The current underlying dataset comes from the largest German-language subreddit, r/de, and comprises user comments from 2006 to 2023, totalling approximately 41 million tokens. As input, ReCKS accepts both simple fixed-keyword searches and complex queries written as regular expressions (RegEx). Results are returned as an exportable online table and a chart that visualises the normalised frequency of the search term per year. This paper first explains the technical architecture of the application. It then briefly describes various usage scenarios and discusses in detail how the tool can be used for microdiachronic analyses. This is illustrated with an analysis of the use of Genderzeichen (‘gender signs’, i.e., spelling variants that index gender-inclusivity, such as Student:in or Student*in) by r/de users over the last 15 years.
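The kind of query the abstract describes, a regular-expression search whose hits are reported as a normalised frequency per year, can be illustrated with a minimal sketch. This is not the ReCKS implementation: the toy comments, the pattern `GENDER_SIGN`, the whitespace tokenisation, and the function name `normalised_frequency` are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical toy data as (year, comment text) pairs; not taken from the ReCKS dataset.
COMMENTS = [
    (2019, "Die Studenten waren heute nicht da."),
    (2021, "Jede:r Student:in kann sich anmelden."),
    (2021, "Die Student*innen treffen sich morgen."),
    (2023, "Alle Lehrer:innen und Schüler:innen sind eingeladen."),
]

# Assumed pattern: a word followed by ':' or '*' and the suffix 'in' or 'innen',
# e.g. Student:in, Student*innen. Real queries would likely need more variants.
GENDER_SIGN = re.compile(r"\b\w+[:*]in(?:nen)?\b")


def normalised_frequency(comments, pattern, per=1_000_000):
    """Return {year: hits per `per` tokens}, with years in ascending order."""
    hits = Counter()
    tokens = Counter()
    for year, text in comments:
        # Crude assumption: tokens are whitespace-separated strings.
        tokens[year] += len(text.split())
        hits[year] += len(pattern.findall(text))
    return {year: hits[year] * per / tokens[year] for year in sorted(tokens)}


if __name__ == "__main__":
    for year, freq in normalised_frequency(COMMENTS, GENDER_SIGN, per=1000).items():
        print(year, round(freq, 1))
```

Normalising by the number of tokens per year keeps years with very different comment volumes comparable, which matters for a corpus that grows over time.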
| Author: | Jenia Yudytska, Jannis K. Androutsopoulos |
|---|---|
| URN: | urn:nbn:de:bsz:mh39-136973 |
| DOI: | https://doi.org/10.21248/idsopen.16.2026.79 |
| ISBN: | 978-3-948831-80-6 |
| ISSN: | 2749-9855 |
| Parent Title (German): | Deutsch im Wandel. Beiträge zur Methodenmesse der IDS-Jahrestagung 1 |
| Series (Serial Number): | IDSopen: Online-only Publikationen des Leibniz-Instituts für Deutsche Sprache (16) |
| Publisher: | IDS-Verlag |
| Place of publication: | Mannheim |
| Editor: | Annelen Brunner, Sandra Hansen, Christian Lang, Ngoc Duyen Tanja Tu, Sascha Wolfer |
| Document Type: | Part of a Book |
| Language: | German |
| Year of first Publication: | 2026 |
| Date of Publication (online): | 2026/03/17 |
| Publishing Institution: | Leibniz-Institut für Deutsche Sprache (IDS) |
| Publication state: | Published version |
| Review state: | Publisher's copy-editing |
| Tag: | Microdiachronische Analyse; Nativ digitale Sprachkorpora; Nutzerkommentare; Reddit-Korpus-Schlagwortsuche; User-Kommentare; Digitally written language; Microdiachronic analysis of digitally written language; Natively digital language corpora; ReCKS; Reddit Corpus Keyword Search; RegEx; User comments |
| GND Keyword: | Deutsch; Internetsprache; Korpus <Linguistik>; Reddit; Social Media; Sprachdaten |
| First Page: | 81 |
| Last Page: | 95 |
| DDC classes: | 400 Language / 430 German |
| Open Access?: | yes |
| Licence: | Creative Commons Attribution-ShareAlike 3.0 Germany (CC BY-SA 3.0 DE) |


