Korpuslinguistik
Refine
Year of publication
- 2019 (5) (remove)
Document Type
- Article (3)
- Part of a Book (2)
Has Fulltext
- yes (5)
Is part of the Bibliography
- yes (5)
Keywords
- Korpus <Linguistik> (5)
- Annotation (2)
- Sprachstatistik (2)
- Antwortrelationen (1)
- Antwortstrukturen (1)
- Automatische Sprachanalyse (1)
- Computerunterstützte Kommunikation (1)
- Datenbank (1)
- Deutsch (1)
- Gesprochene Sprache (1)
Publicationstate
Reviewstate
- Peer-Review (3)
- (Verlags)-Lektorat (2)
Publisher
- de Gruyter (5) (remove)
In the first volume of Corpus Linguistics and Linguistic Theory, Gries (2005. Null-hypothesis significance testing of word frequencies: A follow-up on Kilgarriff. Corpus Linguistics and Linguistic Theory 1(2). doi:10.1515/cllt.2005.1.2.277. http://www.degruyter.com/view//cllt.2005.1.issue-2/cllt.2005.1.2.277/cllt.2005.1.2.277.xml: 285) asked whether corpus linguists should abandon null-hypothesis significance testing. In this paper, I want to revive this discussion by defending the argument that the assumptions that allow inferences about a given population – in this case about the studied languages – based on results observed in a sample – in this case a collection of naturally occurring language data – are not fulfilled. As a consequence, corpus linguists should indeed abandon null-hypothesis significance testing.
This paper presents types and annotation layers of reply relations in computer- mediated communication (CMC). Reply relations hold between post units in CMC interactions and describe references from one given post to a previous post. We classify three types of reply relations in CMC interactions: first, technical replies, i. e. the possibility to reply directly to a previous post by clicking a ‘reply’ button; second, indentations, e. g. in wiki talk pages in which users insert their contributions in the existing talk page by indenting them and third, interpretative reply relations, i. e. the reply action is not realised formally but signalled by other structural or linguistics means such as address markers ‘@’, greetings, citations and/or Q-A structures. We take a look at existing practices in the description and representation of such relations in corpora and examples of chat, Wikipedia talk pages, Twitter and blogs. We then provide an annotation proposal that combines the different levels of description and representation of reply relations and which adheres to the schemas and practices for encoding CMC corpus documents within the TEI framework as defined by the TEI CMC SIG. It constitutes a prerequisite for correctly identifying higher levels of interactional relations such as dialogue acts or discussion trees.
This contribution presents a quantitative approach to speech, thought and writing representation (ST&WR) and steps towards its automatic detection. Automatic detection is necessary for studying ST&WR in a large number of texts and thus identifying developments in form and usage over time and in different types of texts. The contribution summarizes results of a pilot study: First, it describes the manual annotation of a corpus of short narrative texts in relation to linguistic descriptions of ST&WR. Then, two different techniques of automatic detection – a rule-based and a machine learning approach – are described and compared. Evaluation of the results shows success with automatic detection, especially for direct and indirect ST&WR.
Neues von KorAP
(2019)