Refine
Year of publication
- 2014 (13) (remove)
Document Type
- Part of a Book (7)
- Conference Proceeding (5)
- Book (1)
Language
- English (13) (remove)
Has Fulltext
- yes (13)
Is part of the Bibliography
- no (13) (remove)
Keywords
- Korpus <Linguistik> (6)
- Gesprochene Sprache (3)
- Konversationsanalyse (3)
- Kiezdeutsch (2)
- Adjective (1)
- Adjektiv (1)
- Arzt (1)
- British English (1)
- Bulgarian (1)
- Bulgarisch (1)
Publicationstate
- Veröffentlichungsversion (9)
- Postprint (3)
Reviewstate
- (Verlags)-Lektorat (13) (remove)
Publisher
The 2014 issue of KONVENS is even more a forum for exchange: its main topic is the interaction between Computational Linguistics and Information Science, and the synergies such interaction, cooperation and integrated views can produce. This topic at the crossroads of different research traditions which deal with natural language as a container of knowledge, and with methods to extract and manage knowledge that is linguistically represented is close to the heart of many researchers at the Institut für Informationswissenschaft und Sprachtechnologie of Universität Hildesheim: it has long been one of the institute’s research topics, and it has received even more attention over the last few years.
This chapter focuses on the way in which co-present parties in meetings manage language choice and treat it as raising problems of participation - in the sense that participants can orient to the fact that a given language choice may increase or diminish participation for some or all co-present group members. Choosing one language rather than another is approached here as a members' problem (in an ethnomethodological sense), and as a decision the participants make themselves, in situ and within their courses of action, displaying the way in which they orient to its local consequences, and how they justify and legitimize it. In order to explore this link between language choice and participation systematically, in this chapter we focus on a particular and recurrent phenomenon, the announcement of a language change. Within the conversation analysis framework, we analyse these announcements by taking into account the sequential position in which they occur, their format, the way in which they are addressed to a sub-group or to the group as a whole, and the specific action they accomplish. We will also look at how the group receives the announcement, its effects on the participation framework, as well as the categorizations that ensue from it. This chapter therefore highlights the mutual configuration between language choice and participation framework. Our analyses are based on several video- and audio-recorded corpora of international work meetings. These video data call for reflection not only on the linguistic dimension of participation frameworks and language switches, but more broadly on their multimodal organization. This chapter shows that multimodal details are crucial if we aim to understand the relation between multilingualism and participation as occasioned, contingent and emergent dynamics.
Annotating Spoken Language
(2014)
Content analysis provides a useful and multifaceted, methodological framework for Twitter analysis. CAQDAS tools support the structuring of textual data by enabling categorising and coding. Depending on the research objective, it may be appropriate to choose a mixed-methods approach that combines quantitative and qualitative elements of analysis and plays out their respective advantages to the greatest possible extent while minimising their shortcomings. In this chapter, we will discuss CAQDAS speech act analysis of tweets as an example of software-assisted content analysis. We start with some elementary thoughts on the challenges of the collection and evaluation of Twitter data before we give a brief description of the potentials and limitations of using the software QDA Miner (as one typical example for possible analysis programmes). Our focus will lie on analytical features that can be particularly helpful in speech act analysis of tweets.
We discovered several recurring errors in the current version of the Europarl Corpus originating both from the web site of the European Parliament and the corpus compilation based thereon. The most frequent error was incompletely extracted metadata leaving non-textual fragments within the textual parts of the corpus files. This is, on average, the case for every second speaker change. We not only cleaned the Europarl Corpus by correcting several kinds of errors, but also aligned the speakers’ contributions of all available languages and compiled every- thing into a new XML-structured corpus. This facilitates a more sophisticated selection of data, e.g. querying the corpus for speeches by speakers of a particular political group or in particular language combinations.
This study investigates cross-language differences in pitch range and variation in four languages from two language groups: English and German (Germanic) and Bulgarian and Polish (Slavic). The analysis is based on large multi-speaker corpora (48 speakers for Polish, 60 for each of the other three languages). Linear mixed models were computed that include various distributional measures of pitch level, span and variation, revealing characteristic differences across languages and between language groups. A classification experiment based on the relevant parameter measures (span, kurtosis and skewness values for pitch distributions for each speaker) succeeded in separating the language groups.
Following a welcome in Lithuanian and English to the guests and members on the occa- sion of the 10"’ anniversary of EFNIL, the history of this European language Organization is sketched. A brief survey of the sociolinguistic themes treated at previous Conferences and the state of the inajor projects is given, followed by an introduction (in German) to the general topic of the present Conference. The importance that translation and interpretation have for European language diversity and the individual national languages beside foreign language education of all Europeans is being stressed.
This paper presents the first release of the KiezDeutsch Korpus (KiDKo), a new language resource with multiparty spoken dialogues of Kiezdeutsch, a newly emerging language variety spoken by adolescents from multi-ethnic urban areas in Germany. The first release of the corpus includes the transcriptions of the data as well as a normalisation layer and part-of-speech annotations. In the paper, we describe the main features of the new resource and then focus on automatic POS tagging of informal spoken language. Our tagger achieves an accuracy of nearly 97% on KiDKo. While we did not succeed in further improving the tagger using ensemble tagging, we present our approach to using the tagger ensembles for identifying error patterns in the automatically tagged data.
We study the influence of information structure on the salience of subjective expressions for human readers. Using an online survey tool, we conducted an experiment in which we asked users to rate main and relative clauses that contained either a single positive or negative or a neutral adjective. The statistical analysis of the data shows that subjective expressions are more prominent in main clauses where they are asserted than in relative clauses where they are presupposed. A corpus study suggests that speakers are sensitive to this differential salience in their production of subjective expressions.
We compare several different corpus- based and lexicon-based methods for the scalar ordering of adjectives. Among them, we examine for the first time a low- resource approach based on distinctive- collexeme analysis that just requires a small predefined set of adverbial modifiers. While previous work on adjective intensity mostly assumes one single scale for all adjectives, we group adjectives into different scales which is more faithful to human perception. We also apply the methods to both polar and non-polar adjectives, showing that not all methods are equally suitable for both types of adjectives.
Speakers’ dialogical orientation to the particular others they talk to is implemented by practices of recipient-design. One such practice is the use of negation as a means to constrain interpretations of speaker’s actions by the partner. The paper situates this use of negation within the larger context of other recipient-designed uses of negation which negate assumptions the speaker makes about what the addressee holds to be true (second-order assumptions) or what the addressee assumes the speaker holds to be true (third- order assumptions). The focus of the study is on the ways in which speakers use negation to disclaim interpretations of their turns which partners have displayed or may possibly arrive at. Special emphasis is given to the positionally sensitive uses of negation, which may occur before, after or inserted between the nucleus actions whose interpretation is constrained by the negation. Interactional motivations and rhetorical potentials of the practice are pointed out, partly depending on the position of the negation vis-à-vis the nucleus action. The analysis shows that the concept of ‘recipient design’ is in need of distinctions which have not been in focus in prior research.
This paper analyses paramedic emergency interaction as multimodal multiactivity. Based on a corpus of video-recordings of emergency drills performed by professional paramedics during advanced training, the focus is on paramedics’ participation in multiple joint projects which become simultaneously relevant. Simultaneity and fast succession of multiactivity does not only characterise work on the team level, but also the work profile of the individual paramedic. Participants have to coordinate their own participation in more than one joint project intrapersonally. In the data studied, three patterns of allocating multimodal resources stood out as routine ways of coordinating participation in two simultaneous projects intrapersonally:
1. Talk and hearing vs. manual action monitored by gaze,
2. Talk and hearing vs. gazing (and pointing),
3. Manual action vs. gaze (and talk and hearing).