Converting and Representing Social Media Corpora into TEI: Schema and best practices from CLARIN-D
- The paper presents results from a curation project within CLARIN-D, in which an existing 1M-word corpus of German chat communication has been integrated into the DEREKO and DWDS corpus infrastructures of the CLARIN-D centres at the Institute for the German Language (IDS, Mannheim) and at the Berlin-Brandenburg Academy of Sciences (BBAW, Berlin). The focus is on the solutions developed for converting and representing the corpus in a TEI format.
Author: | Michael Beißwenger, Eric Ehrhardt, Axel Herold, Harald Lüngen, Angelika Storrer |
---|---|
URN: | urn:nbn:de:bsz:mh39-55736 |
URL: | http://tei2016.acdh.oeaw.ac.at/sites/default/files/TEIconf2016_BookOfAbstracts.pdf |
ISBN: | 978-3-200-04689-4 |
Parent Title (English): | TEI Conference and Members' Meeting 2016. Book of Abstracts |
Publisher: | Austrian Centre for Digital Humanities, Austrian Academy of Sciences |
Place of publication: | Wien |
Editor: | Claudia Resch, Vanessa Hannesschläger, Tanja Wissik |
Document Type: | Conference Proceeding |
Language: | English |
Year of first Publication: | 2016 |
Date of Publication (online): | 2016/11/16 |
Publication state: | Published version |
Review state: | Peer review |
Research resource: | http://hdl.handle.net/10932/00-03B0-14FA-A8D0-0F01-F |
GND Keyword: | Chatten <Kommunikation>; Deutsch; Korpus <Linguistik> |
First Page: | 39 |
Last Page: | 41 |
DDC classes: | 400 Language / 400 Language, Linguistics |
Open Access?: | yes |
Leibniz-Classification: | Language, Linguistics |
Linguistics-Classification: | Corpus linguistics |
Licence (English): | Creative Commons - Attribution 4.0 International |