The present paper examines the rise and fall of Modern High German loanwords in English from 1600 until 2000, principally making use of the record of borrowing documented by the Oxford English Dictionary (OED) in its Third Edition (online version, in revision 2000-). Groups of loanwords are analysed by century, with reference to the changing social and cultural landscape characterising relationships between the relevant nations over this period. This is not a simple picture: each language grows over the period in different ways, and the speakers of English look to German at different times for different types of borrowing, as the political and intellectual balance alters.
Basic grammatical categories may carry social meanings irrespective of their semantic content. In a set of four studies, we demonstrate that verbs—a basic linguistic category present and distinguishable in most languages—are related to the perception of agency, a fundamental dimension of social perception. In an archival analysis of actual language use in Polish and German, we found that targets stereotypically associated with high agency (men and young people) are presented in the immediate neighborhood of a verb more often than non-agentic social targets (women and older people). Moreover, in three experiments using a pseudo-word paradigm, verbs (but not adjectives and nouns) were consistently associated with agency (but not with communion). These results provide consistent evidence that verbs, as grammatical vehicles of action, are linguistic markers of agency. In demonstrating meta-semantic effects of language, these studies corroborate the view of language as a social tool and an integral part of social perception.
This chapter investigates policies which shape the role of the German language in contemporary Estonia. Whereas German for many centuries played an important role as the language of the economic and cultural elite in Estonia, its importance declined severely throughout the twentieth century. Against this historical background, the chapter provides an overview of the current functions of German and of attitudes towards it, and discusses how these functions and attitudes are influenced by the policies of various actors inside and outside Estonia. The chapter argues that German continues to play a significant role: while German is no longer a lingua franca, it still enjoys a number of functions and prestige in clearly defined niches involving communication within German-speaking circles or between Estonians and Germans. The interplay of the language policies of the Estonian and the German-speaking states, as well as of semi-state and private institutions, succeeds in maintaining German as an additional language in contemporary Estonia.
In this paper we present work on developing a computerized grammar for the Latin language. The paper demonstrates the principles and challenges of developing a grammar for a natural language in a modern grammar formalism. The grammar presented here provides a useful resource for natural language processing applications in different fields. It can easily be adapted for language learning and for use in language technology for Cultural Heritage, such as translation applications or support for the post-correction of document digitization.
We present a supervised machine learning system for author name disambiguation (AND) which tackles semantic similarity between publication titles by means of word embeddings. Word embeddings are integrated as external components, which keeps the model small and efficient while allowing for easy extensibility and domain adaptation. Initial experiments show that word embeddings can improve the recall and F-score of the binary classification sub-task of AND. Results for the clustering sub-task are less clear, but they are also promising and, overall, show the feasibility of the approach.
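One way to picture a title-similarity feature of this kind is to average the word vectors of each title and compare the averages by cosine similarity. The sketch below is only an illustration under stated assumptions: the embedding table `TOY_EMBEDDINGS`, the helper names, and the averaging strategy are hypothetical and are not taken from the system described above, which would load pretrained embeddings as an external component.

```python
import math

# Hypothetical toy embeddings (assumption); a real system would load
# pretrained vectors from an external resource instead.
TOY_EMBEDDINGS = {
    "neural": [0.9, 0.1, 0.0],
    "network": [0.8, 0.2, 0.1],
    "deep": [0.85, 0.15, 0.05],
    "learning": [0.7, 0.3, 0.1],
    "medieval": [0.0, 0.1, 0.9],
    "poetry": [0.1, 0.0, 0.95],
}


def title_vector(title, embeddings):
    """Average the vectors of all known words; return None if none is known."""
    vectors = [embeddings[w] for w in title.lower().split() if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def title_similarity(title_a, title_b, embeddings=TOY_EMBEDDINGS):
    """Similarity feature for a pair of publication titles."""
    vec_a = title_vector(title_a, embeddings)
    vec_b = title_vector(title_b, embeddings)
    if vec_a is None or vec_b is None:
        return 0.0  # no embedding coverage, so no evidence either way
    return cosine(vec_a, vec_b)
```

A score like this could serve as one input feature to the binary classifier that decides whether two publications belong to the same author. Keeping the embedding table outside the function mirrors the idea of swapping in domain-specific vectors without retraining the rest of the model.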
Syntactic theory has tended to vacillate between implausible methodological extremes. Some linguists hold that our theories are accountable solely for the corpus of attested utterances; others assume our subject matter is unobservable intuitive feelings about sentences. Both extremes should be rejected. The subject matter of syntax is neither past utterance production nor the functioning of inaccessible mental machinery; it is normative: a system of tacitly grasped constraints defining correctness of structure. There are interesting parallels between syntactic and moral systems, modulo the key difference that linguistic systems are diverse whereas morality is universal. The appropriate epistemology for justifying formulations of normative systems is familiar in philosophy: it is known as the method of reflective equilibrium.
While good results have been achieved for named entity recognition (NER) in supervised settings, it remains a problem that little or no labelled data is available for low-resource languages and less studied domains. As NER is a crucial preprocessing step for many natural language processing tasks, finding a way to overcome this lack of data remains of great interest. We propose a distant supervision approach to NER that is both language- and domain-independent, in which we automatically generate labelled training data using gazetteers previously extracted from Wikipedia. We test our approach on English, German and Estonian data sets and contribute further by introducing several successful methods for reducing the noise in the generated training data. The tested models beat baseline systems, and our results show that distant supervision can be a promising approach for NER when no labelled data is available. For the English model, we also show that the distant supervision model is better at generalizing within the same domain of news texts by comparing it against a supervised model on a different test set.
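The core step of gazetteer-based distant supervision, turning unlabelled text into training data, can be sketched as a longest-match lookup. The example below is a minimal illustration under stated assumptions: the gazetteer entries, the BIO tagging scheme, and the function name `label_tokens` are hypothetical and not taken from the paper, and the paper's noise-reduction methods are not reproduced here.

```python
# Hypothetical gazetteer (assumption): lowercased token tuples mapped to an
# entity type. In the approach described above, such lists are extracted
# from Wikipedia.
GAZETTEER = {
    ("tallinn",): "LOC",
    ("new", "york"): "LOC",
    ("angela", "merkel"): "PER",
}


def label_tokens(tokens, gazetteer, max_len=3):
    """Assign BIO labels by greedy longest match against the gazetteer."""
    labels = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in gazetteer:
                entity_type = gazetteer[key]
                labels[i] = "B-" + entity_type
                for j in range(i + 1, i + n):
                    labels[j] = "I-" + entity_type
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return labels


tokens = "Angela Merkel visited New York".split()
print(list(zip(tokens, label_tokens(tokens, GAZETTEER))))
```

Labels produced this way are noisy: ambiguous or missing gazetteer entries yield false positives and false negatives, which is why filtering the generated data matters before a model is trained on it.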