The puzzle we consider in this paper is that Merchant (2004) judges certain elliptical utterances in context to be ungrammatical, while Culicover and Jackendoff (2005) judge similar examples to be grammatical. The main difference between the examples appears to be that Merchant’s are introduced by no, while Culicover and Jackendoff’s are introduced by yes. We propose that the different judgments do not reflect grammaticality, but complexity associated with ambiguity. First, there is an ambiguity with respect to the reference of noun phrases in discourse: the relationship of the fragment to the preceding discourse is ambiguous. Second, there is an ambiguity with respect to the discourse function of an utterance, and in particular, whether it is an affirmation triggered by yes or a denial triggered by no. In the case of the denial, it needs to be established which part of the preceding statement has to be corrected, while in the case of the affirmation, no such ambiguity arises. The interactions between these two interpretive functions may under certain circumstances render particular sentences in discourse difficult to interpret. Interpretive difficulty has the subjective flavor of ‘ungrammaticality’; in the case that we discuss here, these judgments form the basis for a particular linguistic analysis. But, we argue, manipulation of the discourse context can simplify discourse interpretation by resolving the ambiguity, which removes the interpretive difficulty. The conclusion that we draw is that the phenomenon in question is not a matter of linguistic structure, but of discourse interpretation.
Preface
(2015)
We investigate whether non-configurational languages, which display more word order variation than configurational ones, require more training data for a phenomenon to be parsed successfully. We perform a tightly controlled study comparing the dative alternation for English (a configurational language), German, and Russian (both non-configurational). More specifically, we compare the performance of a dependency parser when only canonical word order is present with its performance on data sets in which all word orders are present. Our results show that, for all languages, not only is canonical data easier to parse, but there is also no direct correspondence between the size of training sets containing free(er) word order variation and performance.
Based on specific linguistic landmarks in the speech signal, this study investigates pitch level and pitch span differences in English, German, Bulgarian and Polish. The analysis is based on 22 speakers per language (11 males and 11 females). Linear mixed models were computed that include various linguistic measures of pitch level and span, revealing characteristic differences across languages and between language groups. Pitch level appeared to have significantly higher values for the female speakers in the Slavic group than in the Germanic group. The male speakers showed slightly different results, with only the Polish speakers displaying significantly higher mean values for pitch level than the German males. Overall, the results show that the Slavic speakers tend to have a wider pitch span than the German speakers. For one linguistic measure, however, namely the span between the initial peaks and the non-prominent valleys, we find a difference only between Polish and German speakers. We found a flatter intonation contour in German than in Polish, Bulgarian and English male and female speakers, as well as differences in the frequency of the landmarks between languages. Concerning “speaker liveliness”, we found that the speakers from the Slavic group are significantly livelier than the speakers from the Germanic group.
In recent years, theoretical and computational linguistics have paid much attention to linguistic items that form scales. In NLP, much research has focused on ordering adjectives by intensity (tiny < small). Here, we address the task of automatically ordering English adverbs by their intensifying or diminishing effect on adjectives (e.g. extremely small < very small). We experiment with four different methods: 1) using the association strength between adverbs and adjectives; 2) exploiting scalar patterns (such as not only X but Y); 3) using the metadata of product reviews; 4) clustering. The method that performs best is based on the use of metadata and ranks adverbs by their scaling factor relative to unmodified adjectives.
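The abstract does not spell out how the scaling factor is computed. The following is a minimal sketch under a stated assumption: the review metadata are star ratings, and an adverb's scaling factor is the ratio between the mean rating offset of the modified adjective and that of the bare adjective, both measured from a neutral midpoint. The function name `scaling_factors`, the `neutral` parameter, and the toy reviews are hypothetical illustrations, not taken from the paper.

```python
from collections import defaultdict
from statistics import mean

def scaling_factors(reviews, adjective, adverbs, neutral=3.0):
    """Estimate how strongly each adverb scales one adjective.

    reviews: iterable of (star_rating, tokens) pairs (assumed input format).
    Returns {adverb: factor}, where factor > 1 suggests intensification
    and factor < 1 suggests diminishing (assumed definition).
    """
    plain = []                    # ratings of reviews with the bare adjective
    modified = defaultdict(list)  # adverb -> ratings of "adverb adjective"
    for stars, tokens in reviews:
        for i, tok in enumerate(tokens):
            if tok != adjective:
                continue
            prev = tokens[i - 1] if i > 0 else None
            if prev in adverbs:
                modified[prev].append(stars)
            else:
                plain.append(stars)
    base = mean(plain) - neutral
    if base == 0:
        raise ValueError("bare adjective is rated neutrally; factor undefined")
    return {adv: (mean(r) - neutral) / base for adv, r in modified.items()}

# Toy data, invented for illustration only.
toy_reviews = [
    (4, ["good"]),
    (5, ["very", "good"]),
    (4, ["very", "good"]),
    (5, ["extremely", "good"]),
    (4, ["slightly", "good"]),
    (3, ["slightly", "good"]),
]
factors = scaling_factors(toy_reviews, "good", {"very", "extremely", "slightly"})
ranking = sorted(factors, key=factors.get)
```

On the toy data, `ranking` orders the adverbs from weakest to strongest effect on the adjective; real review data would need tokenization and far larger counts per adverb.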
Centering on German self-motion verbs, this paper demonstrates the advantages of free-sorting over creating and delineating word fields with more traditional methods. In particular, I draw a comparison to Snell-Hornby’s (1983) work on German descriptive verbs, which produces lexical fields with the help of dictionary entries, a thesaurus, a small corpus of written text and limited speaker feedback. While these methods have benefits, they are limited in their ability to represent the average organization of semantic fields in the mind of everyday speakers. Free-sorting, by contrast, does not rely on academic resources, corpora or singular speaker judgments. In sorting, a group of informants creates visible sets of items according to perceived similarity. Psycholinguists have used the method to quantitatively explore the perception of color terms across cultures (cf. Roberson et al. 2005). With a sufficiently large number of informants, one can generate lexical sorting data that is apt for cluster analysis, the results of which are represented by dendrograms. The experiment I conducted involved 33 school children from a middle-class neighborhood in Braunschweig, Northern Germany. My experiment shows that Snell-Hornby’s (1983) representation of the self-motion field can be improved by integrating further dimensions of meaning, such as body-space relations and sound, that young speakers find salient in the grouping procedure.
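The step from sorting data to a dendrogram can be sketched as follows. This is a minimal illustration, assuming that the dissimilarity of two verbs is one minus the share of informants who placed them in the same pile, and that clusters are joined by average linkage; the abstract does not state which distance or linkage the study used. The verbs and piles below are invented toy data, and the function names are hypothetical.

```python
from itertools import combinations

def dissimilarity(sorts, items):
    """sorts: one list of piles (sets of items) per informant."""
    n = len(sorts)
    dist = {}
    for a, b in combinations(items, 2):
        # Count informants who put a and b into the same pile.
        together = sum(
            any(a in pile and b in pile for pile in piles) for piles in sorts
        )
        dist[frozenset((a, b))] = 1 - together / n
    return dist

def average_linkage(items, dist):
    """Agglomerative clustering; returns merges as (left, right, height)."""
    clusters = [frozenset([x]) for x in items]
    merges = []

    def d(c1, c2):
        # Mean pairwise dissimilarity between the members of two clusters.
        total = sum(dist[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: d(*pair))
        merges.append((set(c1), set(c2), d(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return merges

# Toy data: three hypothetical informants sorting four verbs.
verbs = ["gehen", "laufen", "rennen", "kriechen"]
sorts = [
    [{"gehen", "laufen"}, {"rennen"}, {"kriechen"}],
    [{"gehen", "laufen", "rennen"}, {"kriechen"}],
    [{"gehen", "laufen"}, {"rennen"}, {"kriechen"}],
]
merges = average_linkage(verbs, dissimilarity(sorts, verbs))
```

Each entry in `merges` corresponds to one join in the dendrogram, with the height giving the dissimilarity at which the two branches meet. For real data, a library routine such as SciPy's hierarchical clustering would be the more practical choice.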
In this paper, a method for measuring synchronic corpus (dis-)similarity put forward by Kilgarriff (2001) is adapted and extended to identify trends and correlated changes in diachronic text data, using the Corpus of Historical American English (Davies 2010a) and the Google Ngram Corpora (Michel et al. 2010a). This paper shows that this fully data-driven method, which extracts word types that have undergone the most pronounced change in frequency in a given period of time, is computationally very cheap and that it allows interpretations of diachronic trends that are both intuitively plausible and motivated from the perspective of information theory. Furthermore, it demonstrates that the method is able to identify correlated linguistic changes and diachronic shifts that can be linked to historical events. Finally, it can help to improve diachronic POS tagging and complement existing NLP approaches. This indicates that the approach can facilitate an improved understanding of diachronic processes in language change.
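To make the idea of extracting the word types with the most pronounced frequency change concrete, here is a minimal sketch. It assumes a per-word chi-square contribution computed between two time slices, in the spirit of chi-square-based corpus comparison; the statistic the paper actually uses is not stated in the abstract, so this choice, the function name `frequency_shifts`, and the toy token lists are all assumptions made for illustration.

```python
from collections import Counter

def frequency_shifts(tokens_early, tokens_late, top_n=10):
    """Rank word types by how strongly their frequency differs
    between two time slices (assumed per-word chi-square score)."""
    early, late = Counter(tokens_early), Counter(tokens_late)
    n_early, n_late = sum(early.values()), sum(late.values())
    total = n_early + n_late
    scores = {}
    for word in set(early) | set(late):
        o1, o2 = early[word], late[word]
        # Expected counts if the word were equally frequent in both slices.
        e1 = (o1 + o2) * n_early / total
        e2 = (o1 + o2) * n_late / total
        chi2 = (o1 - e1) ** 2 / e1 + (o2 - e2) ** 2 / e2
        rate1, rate2 = o1 / n_early, o2 / n_late
        if rate2 > rate1:
            direction = "rising"
        elif rate2 < rate1:
            direction = "falling"
        else:
            direction = "stable"
        scores[word] = (chi2, direction)
    ranked = sorted(scores.items(), key=lambda kv: kv[1][0], reverse=True)
    return ranked[:top_n]

# Toy token lists, invented for illustration only.
early_slice = ["the"] * 50 + ["telegraph"] * 10 + ["horse"] * 10
late_slice = ["the"] * 50 + ["telegraph"] * 1 + ["internet"] * 19
shifts = frequency_shifts(early_slice, late_slice)
```

The computation needs only one pass over the word counts of each slice, which is consistent with the abstract's point that such a method is computationally cheap. On real corpora, very rare words would usually be filtered out before ranking.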
Word-formation rules differ from syntactic rules in that they, apart from obeying morphological and semantic constraints, can also be, and often are, restricted phonologically. The present article includes an overview of the relevant phenomena in English and discusses the consequences for the representation of words in the mental lexicon and for grammar.
The book investigates the diachronic dimension of contact-induced language change based on empirical data from Pennsylvania German (PG), a variety of German in long-term contact with English. Written data published in local print media from Pennsylvania (USA) between 1868 and 1992 are analyzed with respect to semantic changes in the argument structure of verbs, the use of impersonal constructions, word order changes in subordinate clauses and in prepositional phrase constructions.
The research objective is to trace language change based on diachronic empirical data, and to assess whether existing models of language contact make provisions to cover the long-term developments found in PG. The focus of the study is thus twofold: first, it provides a detailed analysis of selected semantic and syntactic changes in Pennsylvania German, and second, it links the empirical findings to theoretical approaches to language contact.
Previous investigations of PG have drawn a more or less static, rather than dynamic, picture of this contact variety. The present study explores how the dynamics of language contact can bring about language mixing, borrowing, and, eventually, language change, taking into account psycholinguistic processes in (the head of) the bilingual speaker.
Corpus-assisted analyses of public discourse often focus on the level of the lexicon. This article argues in favour of corpus-assisted analyses of discourse, but also in favour of conceptualising salient lexical items in public discourse in a more determined way. It draws partly on non-Anglophone academic traditions in order to promote a conceptualisation of discourse keywords, thereby highlighting how their meaning is determined by their use in discourse contexts. It also argues in favour of emphasising the cognitive and epistemic dimensions of discourse-determined semantic structures. These points will be exemplified by means of a corpus-assisted as well as a frame-based analysis of the discourse keyword financial crisis in British newspaper articles from 2009. Collocations of financial crisis are assigned to a generic matrix frame for ‘event’ which contains slots that specify possible statements about events. By looking at which slots are filled more often and which less often with collocates of financial crisis, we will trace semantic presence as well as absence, and thereby highlight the pragmatic dimensions of lexical semantics in public discourse. The article also advocates the suitability of discourse keyword analyses for systematic contrastive analyses of public/political discourse and for lexicographical projects that could serve to extend the insights drawn from corpus-guided approaches to discourse analysis.