This paper focuses on the first Slavonic-Romanian lexicons, compiled in the second half of the 17th century, and their use(rs), proposing a method for investigating how the lexical information available in this corpus relates, if at all, to the vocabulary of texts from the same period. We chose to investigate their relation to an anonymous Old Testament translation made from Church Slavonic, also from the second half of the 17th century, which is thought to have been produced in the same geographical area, in the same Church Slavonic school or even by the same author as the lexicons. After applying a lemmatizer to both the Biblical text (Books of Genesis and Daniel) and the Romanian material from the lexicons, we analyse the results and complement the statistical analysis with a series of case studies, focusing on some common lexemes that might indicate the relatedness of the texts. Although the analysis suggests that the lexicons might not have been compiled as a tool for translating religious texts, the method proves useful: it reveals interesting data and provides a basis for more extensive approaches.
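The comparison of lemmatized material described above can be illustrated with a minimal sketch. It assumes that both sources have already been lemmatized into plain lists of lemma strings; the function name `lemma_overlap`, the chosen measures (type coverage, token coverage, Jaccard index) and the placeholder lemmas are illustrative assumptions, not the authors' actual procedure or data.

```python
from collections import Counter


def lemma_overlap(text_lemmas, lexicon_lemmas):
    """Compare the lemmas of a text with the lemmas recorded in a lexicon.

    Both arguments are assumed to be lists of already lemmatized strings.
    """
    text_counts = Counter(text_lemmas)      # lemma -> frequency in the text
    text_types = set(text_counts)           # distinct lemmas in the text
    lexicon_types = set(lexicon_lemmas)     # distinct lemmas in the lexicon
    shared = text_types & lexicon_types     # lemmas attested in both sources
    union = text_types | lexicon_types
    total_tokens = sum(text_counts.values())

    return {
        "shared_types": sorted(shared),
        # share of distinct text lemmas that the lexicon also lists
        "type_coverage": len(shared) / len(text_types) if text_types else 0.0,
        # share of running text tokens whose lemma the lexicon lists
        "token_coverage": (
            sum(text_counts[lemma] for lemma in shared) / total_tokens
            if total_tokens else 0.0
        ),
        # symmetric similarity of the two lemma inventories
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Placeholder lemmas only; these are not data from the study.
text = ["a", "b", "a", "c"]
lexicon = ["a", "c", "d"]
stats = lemma_overlap(text, lexicon)
print(stats)
```

A low coverage value in such a comparison would be compatible with the conclusion that the lexicons were not compiled primarily for translating the text in question, although figures of this kind would still need the qualitative case studies the abstract mentions.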
Using the polyfunctional multi-word unit <was weiß ich> (literally "what do I know") as an example, this paper examines the interplay of pragmatic and phonetic differentiation in pragmaticalization processes. To this end, spontaneous-speech tokens from the corpus "Deutsch heute" are analysed. The observed range of phonetic variation points to a complex relationship with the respective pragmatic functions.
This paper argues that there is a correlation between functional and purely grammatical patterning in language, yet the nature of this correlation has to be explored. This claim is based on the results of a corpus-driven study of the Slavic aspect, drawing on the so-called Distributional Hypothesis. According to the East-West Theory of the Slavic aspect, there is a broad east-west isogloss dividing the Slavic languages into an eastern group and a western group. There are also two transitional zones in the north and south, which share some properties with each group (Dickey 2000; Barentsen 1998, 2008). The East-West Theory uses concepts of cognitive grammar such as totality and temporal definiteness, and is based on various parameters of aspectual usage in discourse, including contexts such as habituals, general factuals, historical (narrative) present, performatives, sequenced events in the past etc. The purpose of the above-mentioned study is to challenge the semantic approach to the Slavic aspect by comparing the perfective and imperfective verbal aspect on the basis of purely grammatical co-occurrence patterns (see also Janda & Lyashevskaya 2011). The study focused on three Slavic languages: Russian, which, following the East-West Theory, belongs to the eastern group; Czech, which belongs to the western group; and Polish, which is considered transitional in its aspectual patterning.
We present a study on gaps in spoken language interaction as potential candidates for syntactic boundaries. On the basis of an online annotation experiment, we show that both the duration and the type of a gap affect its likelihood of being a syntactic boundary. We discuss the potential of these findings for automating the segmentation process.
A syntax-based scheme for the annotation and segmentation of German spoken language interactions
(2018)
Unlike corpora of written language, where segmentation can mainly be derived from orthographic punctuation marks, the basis for segmenting spoken language corpora is not predetermined by the primary data, but rather has to be established by the corpus compilers. This impedes consistent querying and visualization of such data. Several ways of segmenting have been proposed, some of which are based on syntax. In this study, we developed and evaluated annotation and segmentation guidelines with reference to the topological field model for German. We can show that these guidelines are used consistently across annotators. We also investigated the influence of various interactional settings with a rather simple measure, the word count per segment and unit type. We observed that the word count and the distribution of each unit type differ across interactional settings. In conclusion, our syntax-based segmentations reflect interactional properties that are intrinsic to the social interactions that participants are involved in. This can be used for further analysis of social interaction and opens up the possibility of automatic segmentation of transcripts.
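The measure mentioned above, word count per segment and unit type, can be sketched as follows. The representation of a segment as a `(setting, unit_type, text)` tuple, the function name `mean_words_per_segment`, whitespace tokenization, and all labels and tokens in the example are assumptions made for illustration; they do not reproduce the annotation scheme or the data of the study.

```python
from collections import defaultdict
from statistics import mean


def mean_words_per_segment(segments):
    """Average word count per segment, grouped by setting and unit type.

    `segments` is assumed to be an iterable of
    (interactional_setting, unit_type, transcribed_text) tuples.
    Words are counted by splitting the text on whitespace.
    """
    word_counts = defaultdict(list)
    for setting, unit_type, text in segments:
        word_counts[(setting, unit_type)].append(len(text.split()))
    return {key: mean(values) for key, values in word_counts.items()}


# Placeholder segments only; labels and tokens are hypothetical.
segments = [
    ("setting_1", "unit_A", "w1 w2 w3"),
    ("setting_1", "unit_A", "w1"),
    ("setting_1", "unit_B", "w1 w2"),
    ("setting_2", "unit_A", "w1 w2 w3 w4"),
]
result = mean_words_per_segment(segments)
for (setting, unit_type), average in sorted(result.items()):
    print(setting, unit_type, average)
```

Grouping by both setting and unit type keeps the two factors apart, so that differences between interactional settings are not confounded with differences in how often each unit type occurs.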
The paper presents the process of developing the AirFrame database, a specialized lexical resource in which aviation terminology is defined in the form of semantic frames, following the methodology of the Berkeley FrameNet (FN). First, the structure of the database is presented, and then the methodology applied in developing and populating the database is described. The link between specialized aviation frames and general language semantic frames, of which frames defining entities, processes, attributes and events are particularly relevant, is discussed using the example of the semantic frame of Flight and its related frames. The paper concludes by discussing the possibilities of using AirFrame as a model for further developing resources in which general and specialized knowledge are linked.
Almanca tuhfe / Deutsches Geschenk (1916) oder: Wie schreibt man deutsch mit arabischen Buchstaben?
(2022)
Versified dictionaries are bilingual/multilingual glossaries written in verse form to teach essential words in a foreign language. In Islamic culture, versified dictionaries were produced to teach the Arabic language to the younger generations of Muslim communities whose native language was not Arabic. In the course of time, many bilingual/multilingual versified dictionaries were written in different languages throughout the Islamic world. The focus of this study is on the Turkish-German versified dictionary titled Almanca Tuhfe / Deutsches Geschenk [German Gift], published by Dr. Sherefeddin Pasha in Istanbul in 1916. This dictionary is the only dictionary in verse ever written combining these two languages. Moreover, the dictionary is one of the few texts containing German words written in Arabic letters (applying Ottoman spelling conventions). The study concentrates on the way German words are spelled and tries to find out whether Sherefeddin Pasha applied anything like fixed rules in writing the German lexemes.
Wortgeschichte digital (Digital Word History) is an emerging historical dictionary of the German language that focuses on describing semantic shifts from about 1600 through today. This article provides deeper insight into the dictionary’s “cross-reference clusters,” one of its software tools, which visualizes its reference network. The clusters are thus a part of the project’s macrostructure. They serve as both a means for users to find entries of interest and a tool to elucidate relations among dictionary entries. Rather than delve into technical aspects, this article focuses on the logic applied by the software and discusses the approach in light of the dictionary’s microstructure. The article concludes with some considerations about the clusters’ advantages and limitations.
The paper presents the results of a survey on lexicographic practices and lexicographers’ needs across Europe that was conducted in the context of the Horizon 2020 project European Lexicographic Infrastructure (ELEXIS) among the observer institutions of the project. The survey is a revised and upgraded version of the survey which was originally conducted among ELEXIS lexicographic partner institutions in 2018 (Kallas et al. 2019a). The main goal of this new survey was to complement the data from the ELEXIS lexicographic partner institutions in order to get a more complete picture of lexicographic practices both for born-digital and retro-digitised resources in Europe. The results offer a detailed insight into many aspects of the lexicographic process at European institutions, such as funding, training, staff, lexicographic expertise, software and tools. In addition, the survey reflects on current trends in lexicography and reveals what institutions see as the most important emerging trends that will affect lexicography in the short-term and long-term future. Overall, the results provide valuable input informing the development of tools, resources, guidelines and training materials within ELEXIS.
This paper discusses an investigation of how senses are ordered across eight dictionaries. A dataset of 75 words was used for this purpose, and two senses were examined for each word. The words are divided into three groups of 25 words each according to the relationship between the senses: Homonymy, Metaphor, and Systematic Polysemy. The primary finding is that WordNet differs from the other dictionaries in terms of Metaphor: its senses were more often ordered figurative before literal, and it had the highest percentage of figurative senses that were not found. We discuss leveraging another dictionary, COBUILD, to re-order the senses according to frequency.