Reddit Corpus Keyword Search (ReCKS). Ein Tool für die Gewinnung und Auswertung von Sprachdaten aus der Social Media-Plattform Reddit [A tool for collecting and analysing language data from the social media platform Reddit]

  • ReCKS (“Reddit Corpus Keyword Search”) is a web application for linguistic research on Reddit comments. The current underlying dataset comes from the largest German-language subreddit, r/de, and comprises user comments from 2006 to 2023, totalling approximately 41 million tokens. As input, ReCKS accepts both simple fixed-keyword searches and complex queries using regular expressions (RegEx). The output is an exportable online table and a diagram that visualises the normalised frequency of the search term per year. This paper first explains the technical architecture of the application. It then briefly describes various usage scenarios and discusses in detail how the tool can be used for microdiachronic analyses. This is illustrated with an analysis of the use of Genderzeichen (‘gender signs’, i.e., spelling variants that index gender-inclusivity, such as Student:in or Student*in) by r/de users over the last 15 years.
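
The kind of query described in the abstract (a regular-expression search whose hits are reported as a normalised frequency per year) can be sketched roughly as follows. This is a minimal illustration and not the ReCKS implementation: the toy comments, the function name `normalised_frequency`, the whitespace tokenisation, and the per-million-token basis are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical toy data: (year, comment text) pairs standing in for corpus rows.
COMMENTS = [
    (2019, "Jede Student*in kann sich anmelden."),
    (2019, "Die Studenten sind schon da."),
    (2022, "Eine Student:in und ein Student:innen-Rat."),
    (2022, "Kein Treffer in diesem Kommentar."),
]


def normalised_frequency(comments, pattern, per=1_000_000):
    """Return {year: regex matches per `per` tokens} (assumed normalisation)."""
    regex = re.compile(pattern)
    hits, tokens = Counter(), Counter()
    for year, text in comments:
        hits[year] += len(regex.findall(text))
        tokens[year] += len(text.split())  # naive whitespace tokenisation
    return {
        year: hits[year] * per / tokens[year]
        for year in sorted(tokens)
        if tokens[year]  # skip years without any tokens
    }


# Matches Student:in, Student*in and the plural forms Student:innen, Student*innen.
freq = normalised_frequency(COMMENTS, r"\bStudent[:*]in(?:nen)?\b")
for year, value in freq.items():
    print(year, value)
```

Normalising by the yearly token count, rather than reporting raw hits, keeps years with very different comment volumes comparable, which is what a microdiachronic comparison needs.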

Metadata
Author:Jenia Yudytska, Jannis K. Androutsopoulos
URN:urn:nbn:de:bsz:mh39-136973
DOI:https://doi.org/10.21248/idsopen.16.2026.79
ISBN:978-3-948831-80-6
ISSN:2749-9855
Parent Title (German):Deutsch im Wandel. Beiträge zur Methodenmesse der IDS-Jahrestagung 1
Series (Serial Number):IDSopen: Online-only Publikationen des Leibniz-Instituts für Deutsche Sprache (16)
Publisher:IDS-Verlag
Place of publication:Mannheim
Editor:Annelen Brunner, Sandra Hansen, Christian Lang, Ngoc Duyen Tanja Tu, Sascha Wolfer
Document Type:Part of a Book
Language:German
Year of first Publication:2026
Date of Publication (online):2026/03/17
Publishing Institution:Leibniz-Institut für Deutsche Sprache (IDS)
Publicationstate:Published version
Reviewstate:(Publisher's) editorial review
Tag:Microdiachronische Analyse; Nativ digitale Sprachkorpora; Nutzerkommentare; Reddit-Korpus-Schlagwortsuche; User-Kommentare
Digitally written language; Microdiachronic analysis of digitally written language; Natively digital language corpora; ReCKS; Reddit Corpus Keyword Search; RegEx; User comments
GND Keyword:Deutsch; Internetsprache; Korpus <Linguistik>; Reddit; Social Media; Sprachdaten
First Page:81
Last Page:95
DDC classes:400 Language / 430 German
Open Access?:yes
Licence (German):Creative Commons - Namensnennung-Weitergabe unter gleichen Bedingungen 3.0 Deutschland (Attribution-ShareAlike 3.0 Germany, CC BY-SA 3.0 DE)