Converting and Representing Social Media Corpora into TEI: Schema and best practices from CLARIN-D
(2016)
The paper presents results from a curation project within CLARIN-D, in which an existing 1M-word corpus of German chat communication has been integrated into the DEREKO and DWDS corpus infrastructures of the CLARIN-D centres at the Institute for the German Language (IDS, Mannheim) and at the Berlin-Brandenburg Academy of Sciences (BBAW, Berlin). The focus is on the solutions developed for converting and representing the corpus in a TEI format.
Pogled u e-leksikografiju
(2015)
The paper gives an overview of the basic concepts and classifications in the field of e-lexicography. It presents a classification of e-dictionaries, outlines the lexicographic process of compiling an e-dictionary, and surveys the most widely used dictionary writing systems (DWS) and corpus query systems (CQS). As an example of good practice, the online dictionary elexiko (Institute for the German Language in Mannheim) is described in more detail: its aims and purpose, its theoretical and methodological foundations, its modules, and its possibilities of use. As a possible basis for compiling a corpus-based e-dictionary of Croatian that would be in line with the most recent lexicographic (and, more generally, linguistic) theories and practices, the paper presents work on an online lexical-semantic repository of Croatian (a database of semantic frames, image schemas, cognitive primitives, and lexical units) within the project Repozitorij metafora hrvatskoga jezika.
The article analyses data from a corpus of email-correspondence and chat protocols that describe the initial steps of romantic contacts. It shows that different types of silences are used strategically in the process of people getting to know each other. Five silence strategies within conversations are described and their functions are illustrated by typical examples.
Overview of the IGGSA 2016 Shared Task on Source and Target Extraction from Political Speeches
(2016)
We present the second iteration of IGGSA’s Shared Task on Sentiment Analysis for German. It resumes the STEPS task of IGGSA’s 2014 evaluation campaign: Source, Subjective Expression and Target Extraction from Political Speeches. As before, the task is focused on fine-grained sentiment analysis, extracting sources and targets with their associated subjective expressions from a corpus of speeches given in the Swiss parliament. The second iteration exhibits some differences, however; mainly the use of an adjudicated gold standard and the availability of training data. The shared task had 2 participants submitting 7 runs for the full task and 3 runs for each of the subtasks. We evaluate the results and compare them to the baselines provided by the previous iteration. The shared task homepage can be found at http://iggsasharedtask2016.github.io/.
We examine different features and classifiers for the categorization of opinion words into actor and speaker view. To our knowledge, this is the first comprehensive work to address sentiment views on the word level taking into consideration opinion verbs, nouns and adjectives. We consider many high-level features requiring only few labeled training data. A detailed feature analysis produces linguistic insights into the nature of sentiment views. We also examine how far global constraints between different opinion words help to increase classification performance. Finally, we show that our (prior) word-level annotation correlates with contextual sentiment views.
We present an approach to the new task of opinion holder and target extraction on opinion compounds. Opinion compounds (e.g. user rating or victim support) are noun compounds whose head is an opinion noun. We do not only examine features known to be effective for noun compound analysis, such as paraphrases and semantic classes of heads and modifiers, but also propose novel features tailored to this new task. Among them, we examine paraphrases that jointly consider holders and targets, a verb detour in which noun heads are replaced by related verbs, a global head constraint allowing inferencing between different compounds, and the categorization of the sentiment view that the head conveys.
The wdlpOst dictionary writing system to be presented in this paper has been developed for the specific purposes of a lexicographical project on German loanwords in the East Slavic languages Russian, Belarusian, and Ukrainian. The project’s main objectives are (i) to document those loanwords for which a cognate lexical borrowing from German is known in Polish and (ii) to establish possible borrowing pathways for these lexical items. In the first phase of the project, the collaborative client/server architecture of the wdlpOst system has been used for excerpting detailed lexicographical information from a large range of historical and contemporary East Slavic dictionaries, taking the entries in a large dictionary of German loanwords in Polish as a common frame of reference. For the project’s second phase, the wdlpOst system provides innovative tooling for compiling entries of the East Slavic loanwords. Most importantly, the numerous word sense definitions for a set of cognate loanwords, as excerpted from different lexicographical sources, are mapped onto a system of newly defined cross-language word senses; in a similar vein, the phonemic and graphemic variation in the loanwords and their derivatives is captured through a tool that abstracts from dictionary-specific idiosyncrasies.
Lexicography of Language Contact: An Internet Dictionary of Words of German Origin in Tok Pisin
(2016)
The paper presents an ongoing project in the domain of lexicography of language contact, namely, the “Internet Dictionary of Words of German Origin in Tok Pisin”. The German influence on the lexicon of the main pidgin language of Papua New Guinea has its roots in the German colonial empire, where Tok Pisin played an important role as a lingua franca in the colony of German New Guinea. Tok Pisin also served as an intermediate language for many borrowing processes; that is, German loans entered many languages in the South Pacific via Tok Pisin. The Internet Dictionary of Words of German Origin in Tok Pisin is based on all available lexicographical sources from the early 20th century to the present. These sources are systematically evaluated within our project; the results will be documented in the dictionary. The microstructure of the dictionary will be presented with respect to its major features: documentation of sources, examples for word usage, audio files, and lexicographic comment.
The Online Bibliography of Electronic Lexicography (OBELEXmeta) is a bibliographic database which is developed for researchers working in the field of dictionary research. The platform is hosted at the Institute for the German Language (Institut für Deutsche Sprache, IDS) in Mannheim. The poster presentation aims at presenting the current status of the ongoing project.
The Shared Task on Source and Target Extraction from Political Speeches (STEPS) first ran in 2014 and is organized by the Interest Group on German Sentiment Analysis (IGGSA). This volume presents the proceedings of the workshop of the second iteration of the shared task. The workshop was held at KONVENS 2016 at Ruhr-University Bochum on September 22, 2016.
There is increasing interest in recognizing opinion inferences in addition to expressions of explicit sentiment. While different formalisms for representing inferential mechanisms are being developed and lexical resources are being built alongside, we here address the need for deeper investigation of the robustness of various aspects of opinion inference, performing crowdsourcing experiments with constructed stimuli as well as a corpus study of attested data.
Sentiment analysis has so far focused on the detection of explicit opinions. However, of late implicit opinions have received broader attention, the key idea being that the evaluation of an event type by a speaker depends on how the participants in the event are valued and how the event itself affects the participants. We present an annotation scheme for adding relevant information, couched in terms of so-called effect functors, to German lexical items. Our scheme synthesizes and extends previous proposals. We report on an inter-annotator agreement study. We also present results of a crowdsourcing experiment to test the utility of some known and some new functors for opinion inference where, unlike in previous work, subjects are asked to reason from event evaluation to participant evaluation.
"Kaum [...] da, wird' ich gedisst!" Funktionale Aspekte des Banter-Prinzips auf dem Online-Prüfstand
(2016)
The article attempts to enrich the theoretical approach of the Banter Principle (Leech 1983) with an online perspective. Examples from Teamspeak conversations and comments on the social network site Facebook reveal different user practices regarding the identifiability of the Banter Principle: on the one hand, nonverbal elements or emoticons are used to make sure that banter is understood correctly in written language; on the other hand, participants in oral communication cope with assigned roles that depend on dynamic group-internal hierarchies. Nevertheless, one question remains: why should one disguise a cordial message as a rude one? My analysis shows two functions of online banter: first, to maximize the entertainment value of a conversation, and second, to establish an accepted online identity.
We present an empirical study addressing the question of whether, and to what extent, lexicographic writing aids improve text revision results. German university students were asked to optimise two German texts using (1) no aids at all, (2) highlighted problems, or (3) highlighted problems accompanied by lexicographic resources that could be used to solve the specific problems. We found that participants from the third group corrected the largest number of problems and introduced the fewest semantic distortions during revision. They also reached the highest overall score and were the most efficient (as measured in points per time). The second group, with highlighted problems only, lies between the two other groups in almost every measure we analysed. We discuss these findings in the context of intelligent writing environments, the effectiveness of writing aids in practical usage situations, and the teaching of dictionary skills.
In sociology and linguistics, the term “genre” (Gattung) is used as a collective term for consolidated, (linguistically) similar patterns that recur frequently and serve to solve related communicative problems (e.g. different moral genres, cf. Bergmann/Luckmann (eds.) 1999). Little attention has so far been paid to the commonalities and differences, that is, the possibilities of demarcation, between prototypical and less prototypical members of individual genre families. In this contribution, we use authentic data to describe so-called “Gassigespräche” (dog-walking conversations) as spontaneous everyday communication among dog owners. Outside linguistics, these are primarily subsumed as a hyponym under the hypernym “small talk”. We first attempt, from a genre-analytic perspective, to group the obligatory and optional units around a prototypical centre of small talk, insofar as such a centre exists at all. On the basis of a paradigmatic case, we describe commonalities and differences in relation to other genres located within the spectrum of everyday conversation, or beyond it. In the discussion, we argue for viewing genre families as more or less consolidated patterns with partly recurring features that can share their properties in form and function.