We report results from an exploratory study of college students’ conceptions of poetry in which we asked them to name three things they expect from a poem. Frequency- and list-based analyses of their responses revealed that they primarily expect poems to rhyme, but they also identified a number of form-, content-, and reception-related genre expectations, which we discuss in relation to relevant previous research. We propose that rhyme’s predominance in college students’ genre expectations reflects its perceptual and cognitive salience during incremental poetry comprehension rather than its frequency in contemporary poetic practice. Our results characterize the genre conceptions of the population that empirical studies of poetry comprehension typically investigate, and thus provide relevant background information for the interpretation of empirical findings in this field.
When we first started the project of looking at minority languages through a linguistic landscape lens, we felt that the visibility of minority languages in public space had been insufficiently dealt with in traditional minority language research. A linguistic landscape approach, as it had developed over the last years, would constitute a valuable path to explore, by looking at the ‘same old issues’ of language contact and language conflict from a specific angle. We were convinced that fresh linguistic landscape data would be able to provide innovative and useful insights into ‘patterns of language […] use, official language policies, prevalent language attitudes, [and] power relations between different linguistic groups’ (Backhaus 2007, p. 11). The linguistic landscape approach, as presented by the different authors in this volume, has clearly proven to be a heuristic appropriate and relevant for a wide range of minority language situations. More specifically, the ideas and analyses in the different chapters do contribute to a further understanding of minority languages and their speakers. They deepen our comprehension of language policies, power relations and ideologies in minority language settings.
The article addresses communicative deviations in Ukrainian and German interviews. It examines deviations both in press interviews and in the most popular video interviews on YouTube. The deviations are classified according to the position of the addresser, the addressee, and the viewer. Attention is drawn to the linguistic and communicative competence of the communicants as the main cause of deviations in the interviews. The deviations are identified as one of the preconditions for successful communication.
Simultaneous interpreting is a complex cognitive activity in which several processes run at the same time. In addition to monolingual text processing, it requires interpreting-specific strategies that have to be acquired. Emergency strategies are applied only once the interpreter’s capacity limit has been reached.
Unserdeutsch (Rabaul Creole German) emerged around 1900 at a Catholic mission station in Vunapope on the island of New Britain in the Bismarck Archipelago. Its dominant substrate language is Tok Pisin, the Melanesian Pidgin English; its superstrate language is German. The article attempts to determine the linguistic superstrate of Unserdeutsch more precisely, i. e. to answer the question of which variety of German was spoken by the missionaries in Vunapope around 1900, at the place and time at which Unserdeutsch emerged. To this end, the regionally marked linguistic features of Unserdeutsch that can be explained as superstrate transfer from German are examined and located geographically within the contiguous German-speaking area. This linguistic evidence is complemented by extra- and metalinguistic evidence from relevant contemporary sources. The results point to a predominantly Northwest German-Westphalian, yet overall heterogeneous, near-standard linguistic superstrate, and thus refute earlier claims on this matter in the relevant literature. They also show that the analysis of colonial and other emigrant varieties, especially those that, like Unserdeutsch, completely lost contact with the linguistic homeland in the course of their later history, can provide valuable data for the reconstruction of historical spoken language.
In this article, we examine the current situation of data dissemination and provision for CMC corpora. In doing so, we aim to provide a guiding grid for future projects that will improve the transparency and replicability of research results as well as the reusability of the created resources. Based on the FAIR guiding principles for research data management, we evaluate the 20 European CMC corpora listed in the CLARIN CMC Resource family, identify successful strategies among the existing corpora and establish best practices for future projects. We give an overview of existing approaches to data referencing, dissemination and provision in European CMC corpora, and discuss the methods, formats and strategies used. Furthermore, we discuss the need for community standards and offer recommendations for best practices when creating a new CMC corpus.
Preface
(2015)
Russia, its languages and its ethnic groups are for many readers of English surprisingly unknown territory. Even among academics and researchers familiar with many ethnolinguistic situations around the globe, there prevails rather unsystematic and fragmented knowledge about Russia. This relates to both the micro level such as the individual situations of specific ethnic or linguistic groups, and to the macro level with regard to the entire interplay of linguistic practices, ideologies, laws, and other policies in Russia. In total, this lack of information about Russia stands in sharp contrast to the abundance of literature on ethnolinguistic situations, minority languages, language revitalization, and ideologies toward languages and multilingualism which has been published throughout the past decades.
In this paper, the author studies the role of the dictionary in first language acquisition, highlighting its didactic value. Based on two Romanian lexicographical works of the 19th century, Lexiconul de la Buda (Buda, 1825) [the Lexicon of Buda] and Vocabularu romano-francesu (Bucarest, 1870) [the Romanian-French Vocabulary], the author analyses the normative information recorded in the articles in order to observe which level of language (i. e. phonetic, morphological, syntactic and lexical) is concerned. Such an approach makes it possible to distinguish between possible changes both at the level of perception and of grammatical, lexical and semantic description, i. e. the settlement of the word in the first language, and at a technical level, i. e. the making of the article and of the dictionary.
This paper presents the decisions behind the design of a maths dictionary for primary school children. We are aware that there has been a considerable problem regarding Mexican children’s performance in maths dragging on for a long time, and far from getting better, it is getting worse. One of the probable causes seems to be the lack of coordination between maths textbooks and teaching methods. Most maths textbooks used in primary schools include lots of activities and problem-solving techniques, but hardly any conceptual information in the form of definitions or explanations. Consequently, many children learn to do things, but have difficulty understanding mathematical concepts and applying them in different contexts. To help solve this problem, at least partially, the project of the dictionary was launched aiming at helping children to grasp and understand maths concepts learned during those first six years of their formal education. The dictionary is a corpus-based terminographical product whose macrostructure, microstructure, typography, and additional information were specifically designed to help children understand mathematical concepts.
This paper deals with the lexicographic treatment of the evidently plentiful and pervasive scatological vocabulary, that is vocabulary concerning the process and products of bodily excretion (especially feces), in the synchronic Early New High German Dictionary (FWB = Frühneuhochdeutsches Wörterbuch) from a dictionary user’s view. Initially, different cultural concepts of scatology by Norbert Elias, Michail Bachtin and Mary Douglas among others and the term taboo are reflected upon. Subsequently, selected lexical items such as words with a primary scatological meaning (e. g. drek, kot, scheisse), concealing expressions (euphemisms, periphrases, metaphors, e. g. sitzen, seine notdurft tun, bauernveiel), and certain aspects within the polysemy of the verb scheissen are discussed, the latter on the one hand referring to a physical process with uncontrollable aspects and on the other hand denoting a deliberate action and functionalized as a fighting word during the Reformation. Focussing on different positions of lexicographical information within the microstructure of the FWB, the survey shows that in a synchronic perspective Early New High German scatological vocabulary is a heterogeneous and complex phenomenon, owing to differences in speaker, context, and semantic and pragmatic purpose.
To effectively design online tools and develop sophisticated programs for the teaching of Ancient Greek language, there is a clear need for lexical resources that provide semantic links with Modern Greek. This paper proposes a microstructure for an online Ancient Greek to Modern Greek thesaurus (AMGthes) that serves educational purposes. The terms of this bilingual thesaurus have been selected from reference Ancient Greek texts, taught and studied during lower and upper secondary education in Greece. The main objective here is to build a semantic map that helps students find relevant and semantically related terms (synonyms and antonyms) in Ancient Greek, and then provide a rich set of suitable translations and definitions in Modern Greek. Designed to be an online resource, the thesaurus is being developed using web technologies, and thus will be available to every school and university student that pursues a degree in digital humanities.
The paper presents the results of empirical research conducted with students from the Faculty of Translation Studies of Ventspils University of Applied Sciences (VUAS) in Latvia. The study investigates the habits and practices of translation students concerning the use of dictionaries, as well as the types of dictionaries used, frequency of use, etc. The study also presents an insight into the evaluation of the usefulness of dictionaries by Latvian students. The research describes the advantages and disadvantages of dictionaries used by the respondents, the importance of the preface and the explanation of the terms and abbreviations used in dictionaries. The research conducted, as well as the insights, results and recommendations presented, will be relevant for the lexicographic community, as it reflects the experience of one Latvian university in improving the teaching of dictionary use and lexicographic culture in this country and complements dictionary use research with the Latvian experience.
Learning from students. On the design and usability of an e-dictionary of mathematical graph theory
(2022)
We created a prototype of an electronic dictionary for the mathematical domain of graph theory. We evaluate our prototype and compare its effectiveness in task-based tests with that of Wikipedia. Our dictionary is based on a corpus; the terms and their definitions were automatically extracted and annotated by experts (cf. Kruse/Heid 2020). The dictionary is bilingual, covering German and English; it gives equivalents, definitions and semantically related terms. For the implementation of the dictionary, we used LexO (Bellandi et al. 2017). The target group of the dictionary are students of mathematics who attend lectures in German and work with English resources. We carried out tests to understand which items the students search for when they work on graph-theoretical tasks. We ran the same test twice, with comparable student groups, either allowing Wikipedia as an information source or our dictionary. The dictionary seems to be especially helpful for students who already have a vague idea of a term because they can use the resource to check if their idea is right.
This paper describes the results of an empirical investigation carried out within the project Lessico Multilingue dei Beni Culturali (LBC), whose aim is to create a multilingual online dictionary of the lexicon of the Italian artistic heritage. The dictionary, whose lexicographic process has already started, is intended for linguists and specialist translators as well as for professionals in the tourism sector and students of Foreign Languages and Literatures. The investigation conducted through a questionnaire submitted to undergraduate students at the University of Milan and at the University of Florence has a double aim: to research the habits in the use of lexicographic tools by possible users of the dictionary (Italian Learners of German Language), and to identify preferences regarding macro-, medio- and microstructural features of the future LBC-dictionary to realize a user-friendly tool. After a brief introduction on the state of the art of the survey in the field of Dictionary Users Studies, the article describes the questionnaire and the results obtained from the pilot study. A summary and a discussion on the future developments of the project conclude the work.
This paper gives an insight into a cross-media publishing process at different stages: from a printed bilingual syntagmatic dictionary for GFL to an online learner’s dictionary of German collocations to a German learner’s dictionary portal. On the basis of an SQL database specially developed for a corpus-guided dictionary of German collocations, the bilingual syntagmatic learner’s dictionary KolleX was published in 2014. The first part of the article describes this lexicographic process, focusing on the most relevant aspects of the dictionary concept, e. g. dictionary type, subject matter, corpus-guided data selection and microstructure. The second part introduces the first online version of KolleX from 2016 and the profound changes in the editing system – from a desktop version (2005) to a web-based editing system (2016) –, which resulted successively in a prototype of a German learner’s dictionary portal, called E-KolleX DaF (2018–). Focusing on the aspects of dynamism and integration of different resources from a learner’s perspective, the paper shows the innovative features of this new online reference work. The contribution presents the solutions for the integration of new datatypes in the database of KolleX and the linking to different data in German monolingual dictionary platforms. The paper outlines the web design, functioning and technical improvements of E-KolleX DaF. The conclusions provide an outlook on the forthcoming challenges.
There is a growing interest in pedagogical lexicography, and more specifically in the study of dictionary users’ abilities and strategies (Prichard 2008; Gavriilidou 2010, 2011; Gavriilidou/Mavrommatidou/Markos 2020; Gavriilidou/Konstantinidou 2021; Chatjipapa et al. 2020). The purpose of this presentation is to investigate dictionary use strategy and the effect of an explicit and integrated dictionary awareness intervention program on upper elementary pupils’ dictionary use strategies according to gender and type of school. A total of 150 students from mainstream and intercultural schools, aged 10–12 years old, participated in the study. Data were collected before and after the intervention through the Strategy Inventory for Dictionary Use (SIDU) (Gavriilidou 2013). The results showed a significant effect of the intervention program on Dictionary Use Strategies employed by the experimental group and support the claim that increased dictionary use can be the outcome of explicit strategy instruction. In addition, the effective application of the program suggests that a direct and clear presentation of DUS is likely to be more successful than an implicit presentation. The present study contributes to the discussion concerning both the ‘teachability’ of dictionary use strategies and skills and the effective forms of intervention programs raising dictionary use awareness and culture.
This chapter investigates differences in language regards in Latvia and Estonia. Based on the results of a survey that had about 1000 respondents in each country, it analyses general views on languages and language-learning motivation, as well as specific regards of Estonian, Latvian, Russian, English, German and other languages. The results show that languages and language learning are generally important for the respondents; language-learning motivation is overwhelmingly instrumental. Besides the obvious value of the titular languages of each country, English and Russian are to differing degrees considered of importance for professional and leisure purposes, ahead of German, Finnish (in Estonia) and French, whereas other languages are of little relevance. In more emotionally related categories, differences are more salient. L1-speakers of Russian differ in their views from L1-speakers of Estonian and Latvian, indicating that the linguistic acculturation of society in Estonia tends to be more monodirectional towards Estonian, whereas in Latvia there are more bidirectional tendencies as both Latvian and Russian L1-speakers regard each other’s languages as at least moderately relevant.
In this paper, we propose a controlled language for authoring technical documents and report the status of its development, while maintaining a specific focus on the Japanese automotive domain. To reduce writing variations, our controlled language not only defines approved and unapproved lexical elements but also prescribes their preferred location in a sentence. It consists of components of a) case frames, b) case elements, c) adverbial modifiers, d) sentence-ending functions, and e) connectives, which have been developed based on the thorough analyses of a large-scale text corpus of automobile repair manuals. We also present our prototype of a writing assistant tool that implements word substitution and reordering functions, incorporating the constructed controlled language.
The focus of this paper will be on lexical information systems and the framework guidelines for the definition of the curricula within the educational system of the Autonomous Province of Bolzano/Bozen (Italy). In Italy, the competences to be achieved at different school levels are published in the form of general guidelines. On this basis each school has to specify the general competency goals and to spell them out in a concrete curriculum. In this paper I will examine to what extent lexical information systems are represented in the framework guidelines within the German and the Italian educational system of the Autonomous Province, these being separate systems. In a second step, I will check the representations of the resources against the “Villa Vigoni Theses on Lexicography”. Finally, I will discuss the results and give an outlook for further research.
Thesauri have long been recognized as valuable structured resources aiding Information Retrieval systems. A thesaurus provides a precise and controlled vocabulary which serves to coordinate data indexing and retrieval. The paper presents a bilingual Greek and English specialized thesaurus that is being developed as the backbone of a platform aimed at enhancing and enriching the cultural experiences of visitors in Eastern Macedonia and Thrace, Greece. The cultural component of the intended platform comprises textual data, images of artifacts and living entities (animals and plants in the area), as well as audio and video. The thesaurus covers the domains of Archaeology, Literature, Mythology, and Travel; therefore, it can be viewed as a set of inter-linked thesauri. Where applicable, terms and names in the database are also geo-referenced.
This chapter starts out by giving a brief overview of the main priorities of international and German studies in the area of linguistic landscape research. The contributions to this volume are then embedded in current debates and developments in the field. Finally, we outline important desiderata of linguistic landscape research that focus on German and address challenges of knowledge transfer and application as well as possible contributions to international lines of research.
Preface
(2021)
Lexicographers working with minority languages face many challenges. When the language in question is also a sign language, circumstances specific to the visual-spatial modality have to be taken into consideration as well. In this paper, we aim to show and discuss which challenges we encounter while compiling the Digitales Wörterbuch der Deutschen Gebärdensprache (DW-DGS), the first corpus-based dictionary of German Sign Language (DGS). Some parallel the challenges minority language lexicographers of spoken languages encounter, e. g. few resources, no written tradition, and having to create one dictionary for all potential user groups, while others are specific to sign languages, e. g. representation of visual-spatial language and creating access structures for the dictionary.
The EMLex Dictionary of Lexicography (= EMLexDictoL) is a plurilingual subject field dictionary (in German, English, Afrikaans, Galician, Italian, Polish and Spanish) that contains the basic subject field terminology of lexicography and dictionary research, in which the dictionary article texts are presented in a sophisticated but comprehensible form. The articles are supplemented by a complex cross-referencing system and the current subject field literature of the respective national languages. Following the lemma position, the dictionary articles contain items regarding morphology, synonymy, the position of the definiens, additional explanations, the cross-reference position, the position for literature, the equivalent terms in the other six languages of the dictionary as well as the names of the authors.
This paper focuses on the first Slavonic-Romanian lexicons, compiled in the second half of the 17th century, and their use(rs), proposing a method of investigating the manner in which lexical information available in the above corpus relates, if at all, to the vocabulary of texts from the same period. We chose to investigate their relation to an anonymous Old Testament translation made from Church Slavonic, also from the second half of the 17th century, which is supposed to have been produced in the same geographical area, in the same Church Slavonic school or even by the same author as the lexicons. After applying a lemmatizer to both the Biblical text (Books of Genesis and Daniel) and the Romanian material from the lexicons, we analyse the results and complement the statistical analysis with a series of case studies, focusing on some common lexemes that might be an indicator of the relatedness of the texts. Even if the analysis points out that the lexicons might not have been compiled as a tool for the translation of religious texts, it proves to be a useful method that reveals interesting data and provides the basis for more extensive approaches.
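The comparison described above, between lemmas found in a text and lemmas recorded in a lexicon, can be illustrated with a small coverage measure. This is only a sketch: the abstract does not name the lemmatizer or the statistics used, so the function name `lexicon_coverage`, the type/token coverage measures, and the toy English lemmas are all assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def lexicon_coverage(
    text_lemmas: Iterable[str], lexicon_lemmas: Iterable[str]
) -> Tuple[float, float, List[str]]:
    """Return (type coverage, token coverage, shared lemmas).

    Hypothetical helper: it assumes both inputs have already been
    lemmatized by some external tool.
    """
    lexicon = set(lexicon_lemmas)
    counts = Counter(text_lemmas)
    if not counts:
        return 0.0, 0.0, []
    shared = sorted(lemma for lemma in counts if lemma in lexicon)
    # Share of distinct text lemmas that the lexicon records.
    type_cov = len(shared) / len(counts)
    # Share of running text covered by lemmas the lexicon records.
    token_cov = sum(counts[lemma] for lemma in shared) / sum(counts.values())
    return type_cov, token_cov, shared


# Toy data standing in for lemmatized Biblical text and lexicon entries.
text = ["god", "make", "heaven", "god", "earth"]
lexicon = ["god", "earth", "king"]
type_cov, token_cov, shared = lexicon_coverage(text, lexicon)
```

Low coverage on both measures would be compatible with the conclusion that a lexicon was not compiled as an aid for translating the text in question, while the list of shared lemmas gives candidates for case studies.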
Given the relevance of interoperability, born-digital lexicographic resources as well as legacy retro-digitised dictionaries have been using structured formats to encode their data, following guidelines such as the Text Encoding Initiative or the newest TEI Lex-0. While this new standard is being defined in a stricter approach than the original TEI dictionary schema, its reuse of element names for several types of annotation as well as the highly detailed structure makes it difficult for lexicographers to efficiently edit resources and focus on the real content. In this paper, we present the approach designed within LeXmart to facilitate the editing of TEI Lex-0 encoded resources, guaranteeing consistency through all editing processes.
An ongoing academic and research program, the “Vocabula Grammatica” lexicon, implemented by the Centre for the Greek Language (Thessaloniki, Greece), aims at lemmatizing all the philological, grammatical, rhetorical, and metrical terms in the written texts of scholars (philologists and scholiasts) who curated the ancient Greek literature from the beginning of the Hellenistic period (4th/3rd c. BC) until the end of the Byzantine era (15th c. AD). In particular, it aspires to fill serious gaps (a) in the study of ancient Greek scholarship and (b) in the lexicography of the ancient Greek language and literature. By providing specific examples, we will highlight the typical and methodological features of the forthcoming dictionary.
Basnage’s revision (1701) of Furetière’s Dictionnaire universel is profoundly different from Furetière’s work in several regards. One of the most noticeable features of the dictionary lies in his increased use of usage labels. Although Furetière already made use of usage labels (see Rey 1990), Basnage gives them a prominent role. As he states in the preface to his edition, a dictionary that aspires to the title of “universal” should teach how to speak politely (“poliment”), correctly (“juste”) and with the specific terminology of each art. He specifies, lemma by lemma, the diaphasic dimension by indicating the word’s register and context of use, the diastratic one by noting the differences in the use of the language within the social strata, the diachronic evolution by indicating both archaisms and neologisms, the diamesic aspect by highlighting the gaps between oral and written language, the diatopic one by specifying either foreign borrowings or regionalisms.
After extracting the entries containing formulas such as “ce mot est...”, “ce terme est...” and similar ones, we compare the number of entries and the type of information provided by the two lexicographers. In this paper, we will focus on Basnage’s innovative contribution. Furthermore, we will try to identify the lexicographer’s sources, i. e. we will try to establish on which grammars, collections of linguistic remarks or contemporary dictionaries Basnage bases his judgements.
Wortgeschichte digital (‘digital word history’) is a new historical dictionary of New High German, the most recent period of German reaching from approximately 1600 AD up to the present. In contrast to many historical dictionaries, Wortgeschichte digital has a narrated text – a “word history” – at the core of its entries. The motivation for choosing this format rather than traditional microstructures is briefly outlined. Special emphasis is put on the way these word histories interact with other components of the dictionary, notably with the quotation section. As Wortgeschichte digital is an online-only project, visualizations play an important role for the design of the dictionary. Two examples are presented: first, the “quotation navigator” which is relevant for the microstructure of the entries, and, second, a timeline (“Zeitstrahl”) which is part of the macrostructure as it gives access to the lemma inventory from a diachronic point of view.
In the ongoing retro-digitization of Serbian dialectal dictionaries, the biggest obstacle is the lack of machine-readable versions of the paper editions. One essential step, lacking until now, is therefore needed before venturing into the dictionary-making process in the digital environment: OCRing the pages with the highest possible accuracy. OCR processing is not a new technology, as many open-source and commercial software solutions can reliably convert scanned images of paper documents into digital documents. Available software solutions are usually efficient enough to process scanned contracts, invoices, financial statements, newspapers, and books. In cases where it is necessary to process documents that contain accented text and precisely extract each character with diacritics, such software solutions are not efficient enough. This paper presents the OCR software called “SCyDia”, developed to overcome this issue. We demonstrate the organizational structure of the OCR software “SCyDia” and the first results. “SCyDia” is a web-based software solution that relies on the open-source software “Tesseract” in the background. “SCyDia” also contains a module for semi-automatic text correction. We have already processed over 15,000 pages, 13 dialectal dictionaries, and five dialectal monographs. At this point in our project, we have analyzed the accuracy of “SCyDia” by processing 13 dialectal dictionaries. The results were analyzed manually by an expert who examined a number of randomly selected pages from each dictionary. The preliminary results show great promise, with accuracy ranging from 97.19% to 99.87%.
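The abstract does not say how the reported accuracy was calculated. One common convention for OCR evaluation, shown here purely as an assumption, is character accuracy: one minus the edit distance between the OCR output and a corrected reference, divided by the reference length. The function names below are illustrative and are not part of “SCyDia” or “Tesseract”.

```python
import unicodedata


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_accuracy(reference: str, ocr_output: str) -> float:
    """Character accuracy of ocr_output against a corrected reference."""
    # NFC normalization makes precomposed and combining spellings of the
    # same accented character compare as equal.
    reference = unicodedata.normalize("NFC", reference)
    ocr_output = unicodedata.normalize("NFC", ocr_output)
    if not reference:
        return 1.0 if not ocr_output else 0.0
    distance = levenshtein(reference, ocr_output)
    return max(0.0, 1.0 - distance / len(reference))
```

Under this convention an accent mark that has no precomposed form stays a separate combining character after normalization, so a lost accent counts as one character error. That is why accented dialectal text is harder to score well on than plain text.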
Almanca tuhfe / Deutsches Geschenk (1916) oder: Wie schreibt man deutsch mit arabischen Buchstaben?
(2022)
Versified dictionaries are bilingual/multilingual glossaries written in verse form to teach essential words in any foreign language. In Islamic culture, versified dictionaries were produced to teach the Arabic language to the young generations of Muslim communities not native in Arabic. In the course of time, many bilingual/multilingual versified dictionaries were written in different languages throughout the Islamic world. The focus of this study is on the Turkish-German versified dictionary titled Almanca Tuhfe / Deutsches Geschenk [German Gift], published by Dr. Sherefeddin Pasha in Istanbul in 1916. This dictionary is the only dictionary in verse ever written combining these two languages. Moreover, the dictionary is one of the few texts containing German words written in Arabic letters (applying Ottoman spelling conventions). The study concentrates on the way German words are spelled and tries to find out whether Sherefeddin Pasha applied something like fixed rules to write the German lexemes.
This article aims to show the influence of doctrines on medical lexicographers’ choices, with the Capuron-Nysten-Littré lineage as a case study. Indeed, the Dictionnaire de médecine has been shaped by several schools of thought such as spiritualism and positivism. While lexical continuity may seem self-evident due to the nature of the work, thus reducing the reprint to a simple lexical increase, this process introduces neologisms and deletions, the effects of which can be examined using text statistics and factorial analysis.
In the present contribution, I investigate if and how the English and French editions of the Wiktionary collaborative dictionary can be used as a corpus for real-time neology watch. This option is envisaged as a stopgap, when no satisfactory corpus is available. Wiktionary can also prove useful in addition to standard corpus analysis, to minimize the risk of overlooking new coinages and new senses. Since the collaborative dictionary’s quest for exhaustiveness makes the manual inspection of the new additions unreasonable (more than 31,000 English lemmas and 11,000 French lemmas entered the nomenclature in 2020), identifying the possibly relevant headwords is an issue. The solution proposed here is to use Wiktionary revision history to detect the (new or existing) entries that received the greatest number of modifications. The underlying hypothesis is that the most heavily edited pages can help identify the vocabulary related to “hot topics”, assuming that, in 2020, the pandemic-related vocabulary ranks high. I used two measures introduced by Lih (2004), whose aim was to estimate the quality of Wikipedia articles: the so-called rigour (number of edits per page) and diversity (number of unique contributors per page). In the present study, I propose to adapt the rigour and diversity metrics to Wiktionary in order to identify the pages that generated a particular stir, rather than to estimate the quality of the articles. I do not subscribe to the idea that – in Wiktionary – more revisions necessarily produce quality articles (more revisions often produce complete articles). I therefore adopt Lih’s notion of diversity to refer to the number of distinct contributors, but leave out the name rigour when it comes to the number of revisions. Wolfer and Müller-Spitzer (2016) used the two metrics to describe the dynamics of the German and English editions of Wiktionary. One of their findings was that the number of edits per page is correlated with corpus word frequencies.
The variation in number of page edits should therefore reflect to some extent the variation of corpus word frequencies. Renouf (2013) established a relationship between the fluctuation of word frequencies in a diachronic corpus and various neological processes. In particular, she illustrated how specific events generate sudden frequency spikes for words previously unseen in the corpus. For instance, Eyjafjallajökull, the – existing – name of an Icelandic glacier, appeared in the corpus when the underlying volcano erupted in 2010 and disrupted air traffic in Europe. In order to check if the same phenomenon occurs when using Wiktionary edits instead of corpus frequencies, I manually annotated the most frequently revised entries (according to various ranking scores) with the binary tag: “related to Covid-19” (yes/no). The annotations were then used to test the ability of various configurations to detect relevant headwords from the English and French Wiktionary, namely Covid-19 neologisms and related existing words that deserve updates.
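The two per-page measures named in this abstract, the number of revisions and the number of distinct contributors (Lih's diversity), can be sketched as follows. This is only a minimal illustration and not the study's pipeline: the `(page, contributor)` input format, the function names `page_metrics` and `rank_pages`, and the toy records are all assumptions, and real revision histories would first have to be extracted from Wiktionary.

```python
from collections import defaultdict


def page_metrics(revisions):
    """Return {page: (edit_count, diversity)} from (page, contributor) pairs.

    edit_count is the number of revisions of a page; diversity is the
    number of distinct contributors to that page.
    """
    edits = defaultdict(int)
    contributors = defaultdict(set)
    for page, contributor in revisions:
        edits[page] += 1
        contributors[page].add(contributor)
    return {page: (edits[page], len(contributors[page])) for page in edits}


def rank_pages(metrics, key="edits"):
    """Rank pages by edit count or by diversity, highest score first."""
    index = 0 if key == "edits" else 1
    # Sort by the chosen score (descending), then alphabetically to break ties.
    return sorted(metrics, key=lambda page: (-metrics[page][index], page))


# Hypothetical toy revision log: (headword, contributor id).
toy_revisions = [
    ("quarantine", "user_a"),
    ("quarantine", "user_b"),
    ("quarantine", "user_a"),
    ("glacier", "user_c"),
    ("lockdown", "user_a"),
    ("lockdown", "user_b"),
    ("lockdown", "user_c"),
]

metrics = page_metrics(toy_revisions)
ranking_by_edits = rank_pages(metrics, key="edits")
ranking_by_diversity = rank_pages(metrics, key="diversity")
```

The top of either ranking is the list that would then be annotated manually, as the abstract describes; how the two scores are combined into the "various ranking scores" is not specified here and is left open in the sketch.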
To leverage the Deaf community’s increasing online presence, the web-based platform NZSL Share was launched in March 2020 to crowdsource new and previously undocumented signs, and to encourage community validation of these signs. The platform allows users to upload sign videos, comment on videos and agree or disagree with (often new) signs being proposed. It is managed by the research team that maintains the ODNZSL, which includes the authors. NZSL Share is being used by individuals as well as Deaf community groups to record and share signs of a specialist nature (e.g., school curriculum signs). NZSL Share now has close to 50 actively contributing members. Its launch coincided with the 2020 COVID-19 outbreak in New Zealand and so some of the first signs contributed were COVID-19-related, which are the focus of this paper.
This paper arises from the communication urgency experienced throughout the pandemic. From its onset, several new lexical units have permeated the overall media discourse, as well as social media and other channels. These units convey information to the public regarding the ‘severe acute respiratory syndrome’, namely COVID-19. In addition to its worldwide impact on health, the pandemic has a noteworthy influence on the linguistic landscape, and as a result, a significant number of neologisms have emerged. Within the scope of our ongoing research, we identify the neologisms in European Portuguese that are related to the term COVID-19 via form or meaning. However, not all the new lexical units identified in our corpus that contain COVID-19 in their formation can unequivocally be regarded as neoterms (terminological neologisms). Accordingly, this article aims not only to reflect on the distinction between neologism and neoterm but also to explore the determinologisation process that several of these new lexical units experience.
This paper examines a certain subset of the vocabulary of Modern Icelandic, namely those words that are labelled as ‘ancient’ in the Dictionary of Contemporary Icelandic (DCI). The words were analysed and grouped into two main categories: 1) words with only ‘ancient’ sense(s) and 2) words that have a modern as well as an obsolete older sense. Several subgroups were identified as well as some lexical characteristics. The words in question were then analysed in two other sources, the Dictionary of Old Norse Prose (ONP) and the Icelandic Gigaword Corpus (IGC). The results show that the words belong to several semantic domains that reflect the types of texts that have survived until modern times. Most of the words are robustly attested in Old Norse sources, although there are a few exceptions. The large majority of the words can be found in Modern Icelandic texts, but to a varying degree. The limits of the corpus material make it difficult to analyse some of the words. The results indicate that the words labelled ‘ancient’ can be divided into three main groups: a) words that are poorly attested and should perhaps not be included in the lexicographic description of Modern Icelandic; b) words that are likely to occur sometimes in Modern Icelandic; c) words that function like other inherited Old Norse words and perhaps do not require a special label or should have an additional sense in the DCI.
This paper presents a multilingual dictionary project of discourse markers. During its first stage, consisting of collecting the list of headwords, we used a parallel corpus to automatically extract units from texts written in Spanish, Catalan, English, French and German. We also applied a method to create a taxonomy structure for automatically organising the markers in clusters. As a result, we obtain an extensive, corpus-driven list of headwords. We present a prototype of the microstructure of the dictionary in the form of a standard XML database and describe the procedure to automatically fill in most of its fields (e.g., the type of DM, the equivalents in other languages, etc.), before human intervention.
This paper describes a method for extracting collocation data from text corpora based on a formal definition of syntactic structures, which takes into account not only the POS-tagging level of annotation but also syntactic parsing (syntactic treebank model) and introduces the possibility of controlling the canonical form of extracted collocations based on statistical data on forms with different properties in the corpus. Specifically, we describe the results of extraction from the syntactically tagged Gigafida 2.1 corpus. Using the new method, 4,002,918 collocation candidates in 81 syntactic structures were extracted. We evaluate the extracted data sample in more detail, mainly in relation to properties that affect the extraction of canonical forms: definiteness in adjectival collocations, grammatical number in noun collocations, comparison in adjectival and adverbial collocations, and letter case (uppercase and lowercase) in canonical forms. The conclusion highlights the potential of the methodology used for the grammatical description of collocation and phrasal syntax and the possibilities for improving the model in the process of compilation of a digital dictionary database for Slovene.
This paper looks at whether, after two decades of corpus building for the Bantu languages, the time is ripe to begin using monitor corpora. As a proof of concept, a Lusoga monitor corpus is tested for lexicographic purposes, in casu for the detection of neologisms, both in terms of new words and new meanings, and found useful.
This paper presents the main issues connected with the creation of a trilingual Hungarian-Italian-English dictionary of the COVID-19 pandemic using Lexonomy. My aim is not only to create a coronacorpus (in Hungarian, I propose my own corona-neologism or ‘coroneologism’: koronakorpusz) and a dictionary of equivalents, but also to understand how the different waves and phases of the COVID-19 pandemic are changing the Hungarian language, detect the Corona-, COVID-, pandemic-, virus-, mask-, quarantine-, and vaccine-related neologisms, and offer an overview of the most frequent or linguistically interesting Hungarian neologisms and multiword units related to COVID-19.
This article has a double objective. First, it seeks to offer an initial approach, with critical notes, to the group of pandemic-related neologisms incorporated into the DLE in the year 2020. To that end, the trends in the academic dictionary’s incorporation of neologisms will be reviewed, focusing in particular on specialized language neologisms. Second, the article presents the design of a research study that allows for the examination of any new words beginning with CORONA- added to the DLE and the DHLE. An assessment will be made of the particularities of the DLE and the DHLE regarding the incorporation of the new words, as well as the degree of correspondence or complementarity between the two works in this sense. This will show the complementary roles that the DLE and the DHLE are currently acquiring. In this sense, the new additions open up a debate on the treatment of neologisms in academic lexicography, in a particularly unique scenario.
This paper focuses on standardological and lexicographical aspects of Coronavirus-related neologisms in Croatian. The presented results are based on corpus analysis. The initial corpus for this analysis consists of terms collected for the Glossary of Coronavirus. This corpus has been supplemented by terms we collected on the Internet and from the media. The general Croatian corpora, Croatian Web Corpus – hrWaC (cf. Ljubešić/Klubička 2016) and Croatian Language Repository (cf. Brozović Rončević/Ćavar 2008: 173–186), were also used, but since they do not include neologisms that entered the language after 2013, they could be used only to check terms in the language before that time. From October 2021, a specialized Corona corpus compiled by Štrkalj Despot and Ostroški Anić (2021) became publicly available on request. The data from these corpora are analyzed with Sketch Engine (cf. Kilgarriff et al. 2004: 105–116), a corpus query system loaded with the corpora, enabling the display of lexeme context through concordances and (differential) word sketches and the extraction of keywords (terms) and N-grams. The most common collocations are sorted into syntactic categories. For English equivalents, in addition to the sources found on the Internet, the enTenTen2020 corpus was consulted. In the second part of the paper, we analyze and compare the presentation of Coronavirus terminology in the descriptive Glossary of Coronavirus and the normative Croatian Web Dictionary – Mrežnik.
Within the scope of the project "Study and dissemination of COVID-19 terminology", the study reported here aims to detect, analyse and discuss the characteristics of COVID-19 terminology, in particular the role of the adjective novo [new] in this terminology, the high recurrence of terms in the plural and the resemantization of some of the terminological units used. The present paper also discusses how these characteristics influenced the choices that have guided the creation of the proposed dictionary. This paper presents, therefore, the results of the analyses of these aspects, starting with a discussion of the relation between terminology and neology and arriving at the characteristic aspects of the macrostructural and microstructural choices about which some considerations were made.
While adjusting to the COVID-19 pandemic, people around the world started to talk about the “new normal” way of life, and they conveyed feelings and thoughts on the topic through social networks and traditional communication channels, resorting to a set of specific linguistic strategies, such as metaphors and neologisms. The vocabulary in different domains and in everyday speech was expanded to accommodate a complex social, cultural, and professional phenomenon of changes. Therefore, this new life gave birth to a new language – the “coronaspeak”. According to Thorne (2020), the “coronaspeak” has three stages: first, it emerged in the way medical aspects were communicated in everyday language; secondly, it occurred when speakers verbalized the experiences they had undergone and “invented their own terms”; finally, this “new” way of speaking emerged in the government and authorities’ jargon, to ensure that the new rules and policies were understood, and that the population adopted socially responsible behaviours.
In this paper, we will focus on the second stage, because we intend to take stock of how speakers communicate and verbalize this new way of living, particularly on social networks. In addition, we are interested in the context in which the neologism – be it a new word, a new meaning, or a new use – emerged, is used, and is understood. We observe the occurrence of the new word(s) either on social networks or in dissemination texts (press) and compare them with the ones that Portuguese digital dictionaries have attested so far. Different criteria regarding the insertion of new units, the inclusion date, and the lexicographic description of the entries in the dictionaries will be debated.
Phonesthemes (Firth 1930) are sublexical constructions that have an effect on the lexico-grammatical continuum: they are recurring form-meaning associations that occur more often than by chance but not systematically (Abramova/Fernandez/Sangati 2013). Phonesthemes have been shown (Bergen 2004) to affect psycholinguistic language processing; they organise the mental lexicon. Phonesthemes appear to emerge over time, driven by language use, as indexical rather than purely iconic constructions in the lexicon (Smith 2016; Bergen 2004; Flaksman 2020). Phonesthemes are acknowledged in construction morphology (Audring/Booij/Jackendoff 2017) as motivational schemas. Some phonesthemes also tend to have lexicographic acknowledgment, as shown by etymologist Liberman (2010), although this relevance and cohesion appear to be highly variable, as we will show in this paper.
This paper first attempts a state-of-the-art overview of what is known about women in the history of lexicography up to the early twentieth century. It then focusses more closely on the German and German-English lexicographical traditions to 1900, examining them from three different perspectives (following Russell’s 2018 study of women in English lexicography): women as users and dedicatees of dictionaries; women as contributors to and compilers of lexicographical works; and (in a very preliminary way) women and female sexuality as represented in German/English bilingual dictionaries of the eighteenth and early nineteenth centuries. Russell (2018) was able to identify some 24 dictionaries invoking women as patrons, dedicatees or potential users before 1700, and some 150 works in English lexicography by women between 1500 and 1900, besides the contribution of hundreds of women as supporters and helpers, not least as unpaid readers and sub-editors for the Oxford English Dictionary. Equivalent research in other languages is lacking, but this paper presents some of the known examples of women as lexicographers. The evidence tends to support Russell’s finding for English, that women were more likely to find a place in lexicography outside the mainstream: sometimes in a more private sphere (like Hester Piozzi); often in bilingual lexicography (such as Margrethe Thiele, working on a Danish-French dictionary), including missionary and/or colonizing activity (such as Cinie Louw in Africa, Daisy Bates in Australia); and in dialect description (Coronedi Berti in Italy, Luisa Lacal and María Moliner in Spain). Within the German-speaking context, women who participated in lexicographical work themselves are hard to identify before the late nineteenth century, though those few women who did have access to education were often engaged in language learning, including translation activity, and they were likely users of bilingual and multilingual dictionaries.
Christian Ludwig’s (1706) English-German dictionary – the first of its kind – was dedicated to the Electoral Princess Sophia of Hanover. Elizabeth Weir may have been the first named female compiler of a German dictionary, with her bilingual New German Dictionary (1888). Rather better known are the cases of Agathe Lasch and Luise Berthold, who, as pioneering women in the field of German linguistics, ultimately led major lexicographical projects documenting German regional varieties in the first half of the twentieth century (Middle Low German and Hamburgish in the case of Lasch; the Hessisch Nassau dialect dictionary in the case of Berthold). In the light of existing research on gender and sexuality in the history of English lexicography (e. g. Iamartino 2010; Turton 2019), I conclude with a preliminary exploration of how women and sexuality have been represented in dictionaries of German and English, taking the words Hure and woman in bilingual German-English dictionaries of the eighteenth and nineteenth centuries as my case studies.
In a multilingual and multicultural society, dictionaries play an important role in enhancing interlingual communication. A diversity of languages and different levels of dictionary culture demand innovative lexicographic approaches to establish a dictionary landscape that responds to the needs of the various speech communities. Focusing on the South African situation, this paper discusses some aspects of a few dictionaries that contributed to an improvement of the local dictionary landscape. Using the metaphors of bridges, dykes and sluice gates, it is shown how lexicographers need a balanced approach in their lemma selection and treatment. Whilst too strong a prescriptive approach can be to the detriment of the macrostructural selection, a lack of regulatory criteria could easily lead to a data overload. The lexicographer should strive to reflect actual language use and enable users to retrieve the information that can satisfy their specific communication and cognitive needs. Such lexicographic products will enrich and improve the dictionary landscape.
Words and their usages are in many cases closely related to or embedded in social, cultural, technical and ideological contexts. This applies not only to individual words and specific senses, but to many vocabulary zones as well. Moreover, the development of words is often related to aspects of socio-cultural evolution in a broad sense. In this paper I will look at traditional dictionaries and digital lexical systems, focussing on the question of how they deal with socio-cultural and discourse-related aspects of word usage. I will also propose a number of suggestions for how future digital lexical systems might be enriched in this respect.
The aim of this paper is to show how lexicographical choices reflect ideological thinking, which Eagleton (2007) breaks down into the strategies of rationalizing, legitimating, action orienting, unifying, naturalizing and universalizing. This is done by examining two twenty-first-century editions of each of the five English monolingual learner’s dictionaries published by Cambridge, Collins, Longman, Macmillan, and Oxford. The synchronic and diachronic analyses of the dictionaries and their different editions at the macrostructural level (the wordlists) and at the microstructural level (the definitional styles) will show how the reduction and change of data, derived from heterogeneous social and cultural contexts of language use, to abstract essential forms involves decisions about the central and peripheral aspects of the lexicon and the meaning of words.
Applying terminological methods to lexicography helps lexicographers deal with the terms occurring in general language dictionaries, especially when it comes to writing the definitions of concepts belonging to special fields. In the context of the lexicographic work on the Dicionário da Língua Portuguesa, an updated digital version of the last Academia das Ciências de Lisboa dictionary, published in 2001, we have assumed that terminology – in its dual dimension, both linguistic and conceptual – and lexicography are complementary in their methodological approaches. Both disciplines deal with lexical items, which can be lexical units or terms. In this paper, we apply terminological methods to improve the treatment of terms in general language dictionaries, to write definitions with more precision and accuracy, and to specify the domains to which the terms belong. Additionally, we highlight the consistent modelling of lexicographic components, namely the hierarchy of domain labels, which serve as term identification markers, instead of a flat list of domains. The need to create and make available structured, organised and interoperable lexicographic resources has led us to follow a path in which the application of standards and best practices for treating and representing specialised lexicographic content is a fundamental requirement.
Mensch-Maschine-Interaktion im lexikographischen Prozess zu lexikalischen Informationssystemen
(2022)
Dictionaries of today and tomorrow are digital products rather than print dictionaries. From the user’s perspective, electronic dictionary applications and in particular “lexical information systems”, also referred to as “digital word information systems”, are coming to the fore alongside Google searches. Given the rapid developments in the automated provision of lexicographic information, more precisely the automatic creation of online dictionaries, the new role of the lexicographer in the modern lexicographic process is open to question. This article addresses this issue.
While there was arguably a need for multi-authored, multi-volume metalexicographic handbooks three decades ago – when the field of metalexicography was still ‘young’ – it is a bit puzzling to make sense of the current flurry of output in this field. Is it simply a matter of ‘every publisher trying to fill its shelves’? Or is there really a need in the scientific community for more and (continuously) updated reference works? And once available, are such works also consulted? Which parts? By whom? How often? For what purposes? In this paper we look at an ongoing, real-world metalexicographic handbook project to answer these questions.
This paper focuses on the treatment of culture-bound lexical items in a novel type of online learner’s dictionary model, the Phrase-Based Active Dictionary (PAD). A PAD has a strong phraseological orientation: each meaning of a word is exclusively defined in a typical phraseological context. After introducing the relevant theory of realia in translation studies, we develop a broader notion of culture-specific lexical items which is more apt to serve the purposes of learner’s lexicography and thus to satisfy the needs of a larger and often undefined target group. We discuss the treatment of such words and expressions in common English learner’s dictionaries and then present various excerpts from PAD entries in English, German, and Italian which display different strategies for coping with cultural contents in the lexicon. Our aim is to demonstrate that the phraseological approach at the core of the PAD model is extremely important for conveying cultural knowledge in a way that lets users fully grasp cultural implications in language.
In foreign language teaching the use of dictionaries, especially bilingual, has always been related to the hypotheses concerning the relationship between the native language (L1) and second language acquisition method. If the bilingual dictionary was an obvious tool in the grammar-translation method, it was banned from the classroom in the direct, audiolingual and audiovisual methods. Also in the communicative method, foreign language learners are discouraged from using a dictionary. Its use should not obstruct the goals of communicatively oriented foreign language learning – a view still held by many foreign language teachers. Nevertheless, the reality has been different: Foreign language learners have always used dictionaries, even if they no longer possess a print dictionary and mainly use online resources and applications. Dictionaries and online resources will continue to play an important role in the future. In the Council of Europe’s language policy, with its emphasis on multilingualism and lifelong learning, the adequate use of reference tools as a strategic skill is highlighted. In several European countries, educational guidelines refer to the use of dictionaries in the context of media literacy, both in mother tongue and foreign language teaching. Not only is their adequate use important, but so too is the comparison, assessment and evaluation of the information presented, in order to develop Language Awareness and Language Learning Awareness. This is good news. However, does this mean that dictionaries are actually used in class? What role do dictionaries play in foreign language teaching in schools and universities? Are foreign language learners in the digital era really competent users? And how competent are their teachers? Are they familiar with the current (online) dictionary landscape? Can they support their students? 
After a more in-depth study of the status quo of dictionary use by foreign language learners and teachers and the gap between their needs and the reality, this contribution discusses the challenges facing lexicographers and meta-lexicographers and what educational policy measures are necessary to make their efforts worthwhile in turning foreign language learners – and their teachers – into competent users in a multilingual and digital world.
Wortgeschichte digital (Digital Word History) is an emerging historical dictionary of the German language that focuses on describing semantic shifts from about 1600 through today. This article provides deeper insight into the dictionary’s “cross-reference clusters,” one of its software tools that performs visualization of its reference network. Hence, the clusters are a part of the project’s macrostructure. They serve as both a means for users to find entries of interest and a tool to elucidate relations among dictionary entries. Rather than delve into technical aspects, this article focuses on the applied logics of the software and discusses the approach in light of the dictionary’s microstructure. The article concludes with some considerations about the clusters’ advantages and limitations.
Looking up an unknown word is the most frequent use of a dictionary. For languages that are both agglutinative and inflectional, such as Georgian, this can be quite challenging because an inflected form can be very far from the lemmas used by the target dictionary. In addition, there is no consensus among Georgian lexicographers on which lemmas represent a verb in dictionaries. This further complicates dictionary access. Kartu-Verbs is a base of inflected forms of Georgian verbs accessible through a logical information system. It currently contains more than 5 million inflected forms related to more than 16,000 verbs for 11 tenses; each form can have 11 properties; there are more than 80 million links in the base. This demonstration shows how, from any inflected form, we can find the relevant lemma to access any dictionary. Kartu-Verbs can thus be used as a front-end to any Georgian dictionary.
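The lookup direction described in this abstract, from an inflected form back to a lemma, can be illustrated with a simple reverse index. This is only a sketch under stated assumptions: Kartu-Verbs itself is accessed through a logical information system whose interface is not described here, so the dictionary layout, the function names `build_reverse_index` and `lemmas_for`, and the two toy paradigms (a few present-tense forms listed under a verbal-noun lemma) are illustrative and not taken from the resource.

```python
from collections import defaultdict


def build_reverse_index(paradigms):
    """Map every inflected form to the set of lemmas whose paradigm contains it."""
    index = defaultdict(set)
    for lemma, forms in paradigms.items():
        for form in forms:
            index[form].add(lemma)
    return index


def lemmas_for(form, index):
    """Return the candidate lemmas for an inflected form, sorted; [] if unknown."""
    return sorted(index.get(form, set()))


# Illustrative toy paradigms (assumption): lemma -> a few inflected forms.
toy_paradigms = {
    "წერა": ["ვწერ", "წერ", "წერს"],
    "კითხვა": ["ვკითხულობ", "კითხულობ", "კითხულობს"],
}

index = build_reverse_index(toy_paradigms)
found = lemmas_for("წერს", index)
missing = lemmas_for("unknown-form", index)
```

Because the abstract notes that lexicographers disagree on which lemma represents a verb, a fuller version would presumably store several lemma conventions per verb; the set-valued index above leaves room for that.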
This paper reports on the restructuring of a bilingual (Greek Sign Language, GSL – Modern Greek) lexicographic database with the use of the WordNet semantic and lexical database. The relevant research was carried out by the Institute for Language and Speech Processing (ILSP) / Athena R.C. team within the framework of the European project Easier. The project will produce a framework for intelligent machine translation to bring down language barriers among several spoken/written and sign languages. This paper describes the experience of the ILSP team in contributing to a multilingual repository of signs and their corresponding translations and, as the main focus of the paper, in organizing and enhancing a bilingual dictionary (GSL – Modern Greek) as a result of this mapping. The methodology followed relies on the use of WordNet and, more specifically, the Open Multilingual WordNet (OMW) tool to map content in GSL to WordNet synsets.
The paper presents the process of developing the AirFrame database, a specialized lexical resource in which aviation terminology is defined in the form of semantic frames, following the methodology of the Berkeley FrameNet (FN). First, the structure of the database is presented, and then the methodology applied in developing and populating the database is described. The link between specialized aviation frames and general language semantic frames, of which frames defining entities, processes, attributes and events are particularly relevant, is discussed using the example of the semantic frame of Flight and its related frames. The paper ends by discussing possibilities of using AirFrame as a model for further developing resources in which general and specialized knowledge are linked.
Many European languages have undergone considerable changes in orthography over the last 150 years. This hampers the application of modern computer-based analysers to older text, and hence computer-based annotation and studies of text collections spanning a long period. As a step towards a functional analyser for Norwegian texts (Nynorsk standard) from the 19th century, funding was granted in 2020 for creating a full form generator for all inflected forms of headwords found in Ivar Aasen’s dictionary published in 1873 (Aasen 1873) and his grammar from 1864 (Aasen 1864). Creating this word bank led to new insights into Aasen (1873), its structure, internal organisation, and ambition level, as well as its link to Aasen (1864). As a test, the full form list generated from this new word bank was used to analyse the word inventory of texts by Aa. O. Vinje, written in the period 1850–1870. The Vinje texts were also analysed using a full form list of modern standard Norwegian, to study the differences in applicability and see how Vinje’s language relates to the written standard of modern Norwegian.
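The comparison described in this abstract, analysing the same texts against two different full form lists, can be approximated by a type-coverage measure. The sketch below is an assumption about how such a comparison might be set up and not the project's actual procedure: the function name `coverage`, the case-folding step, and the placeholder tokens (which stand in for real word forms) are all hypothetical.

```python
def coverage(tokens, full_forms):
    """Share of distinct word types in `tokens` that occur in `full_forms`."""
    types = {token.lower() for token in tokens}
    if not types:
        return 0.0
    known = types & {form.lower() for form in full_forms}
    return len(known) / len(types)


# Placeholder data (assumption): stand-ins for a tokenised text and two lists.
text_tokens = ["alpha", "Beta", "gamma", "delta", "alpha"]
historical_forms = {"alpha", "beta", "gamma"}
modern_forms = {"alpha", "delta"}

historical_coverage = coverage(text_tokens, historical_forms)
modern_coverage = coverage(text_tokens, modern_forms)
```

Counting types rather than running tokens matches the abstract's wording about the "word inventory"; a token-based variant would weight frequent words more heavily.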
In this paper, we present LexMeta, a metadata model for the description of human-readable and computational lexical resources in catalogues. Our initial motivation is the extension of the LexBib knowledge graph with the addition of metadata for dictionaries, making it a catalogue of and about lexicographical works. The scope of the proposed model, however, is broader, aiming at the exchange of metadata with catalogues of Language Resources and Technologies and addressing a wider community of researchers besides lexicographers. For the definition of the LexMeta core classes and properties, we deploy widely used RDF vocabularies, mainly Meta-Share, a metadata model for Language Resources and Technologies, and FRBR, a model for bibliographic records.
In the course of the last years, digital lexicography has opened up a variety of avenues fostering the conceptualisation, application and use of constructicons, a type of lexicographical reference work which has revealed itself highly promising in terms of connectivity and flexibility, at the same time, however, also challenging as to its technical implementation. The present paper takes up the ambitious aim to propose some reflections as well as a first draft for a possible model of a multilingual ‘periphrasticon’ as a subtype of a bigger constructicon focusing on a specific typology-related structural feature, i. e. periphrasticity. Taking periphrastic verbal constructions in French, Italian and Spanish as a starting point, it tries to sketch out a unified constructional network including not only equivalent (or corresponding) constructions within Romance, but also establishing (formal and functional) cross-linguistic connections to German and English. Comprising the major languages available to most language learners in (at least) German-speaking environments, the model is also supposed to pave the way for multilingual constructicography which, on the one hand, is able to account for intra- and cross-linguistic relations and, on the other hand, can also prove a valuable tool for language learning and use.
We describe the status of work aimed at including sign language lexical data within the OntoLex-Lemon framework. Our general goal is to provide a multimodal extension to this framework, which was originally conceived to cover only the written and phonetic representation of lexical data. In the longer term, our aim is to achieve the same type of semantic interoperability between sign language lexical data as is achieved for their spoken or written counterparts. We also want to achieve this goal across modalities: between sign language lexical data and spoken/written lexical data.
The long road to a historical dictionary of Lower Sorbian. Towards a lexical information system
(2022)
The Sorbian Institute has been taking preparatory steps for a historical-documentary vocabulary information system for Lower Sorbian for about 10 years. To this end, the entire extant written material (16th–21st centuries) of this strongly endangered European minority language is to be systematically evaluated. An attempt made a few years ago to organise and finance the project as a long-term scientific project was not successful in the end. Therefore, it can only be advanced step by step and via some detours. The article informs about the interim status of the project, especially with respect to the creation of a reliable database.
The paper presents the results of a survey on lexicographic practices and lexicographers’ needs across Europe that was conducted in the context of the Horizon 2020 project European Lexicographic Infrastructure (ELEXIS) among the observer institutions of the project. The survey is a revised and upgraded version of the survey which was originally conducted among ELEXIS lexicographic partner institutions in 2018 (Kallas et al. 2019a). The main goal of this new survey was to complement the data from the ELEXIS lexicographic partner institutions in order to get a more complete picture of lexicographic practices both for born-digital and retro-digitised resources in Europe. The results offer a detailed insight into many aspects of the lexicographic process at European institutions, such as funding, training, staff, lexicographic expertise, software and tools. In addition, the survey reflects on current trends in lexicography and reveals what institutions see as the most important emerging trends that will affect lexicography in the short-term and long-term future. Overall, the results provide valuable input informing the development of tools, resources, guidelines and training materials within ELEXIS.
This paper aims to verify whether the most important online Brazilian Portuguese dictionaries include some of the neologisms identified in texts published from the 1990s to the 2000s and formed with the elements ciber-, e-, bio-, eco- and narco-, which we refer to as fractomorphemes / fracto-morphèmes. Three online dictionaries were analyzed (Aulete, Houaiss and Michaelis), as well as Vocabulário Ortográfico da Língua Portuguesa (VOLP). We were able to conclude that all three dictionaries and VOLP include neologisms with these elements; Michaelis and VOLP do not include separate entries for bound morphemes, whereas Houaiss includes entries for all of them and Aulete includes entries for bio-, eco- and narco-. Aulete also describes the neological meaning of eco- and narco-, whereas Houaiss does not.
Word Families in Diachrony. An epoch-spanning structure for the word families of older German
(2022)
The ‘Word Families in Diachrony’ project (WoDia), for which a funding application to the DFG is in preparation, aims to provide a database driven online research environment that will enable processes of change in the entire historical vocabulary of German to be investigated by focusing on the changes in word families and the individual means of word formation. WoDia will embed the vocabularies of Old High German (OHG), Middle High German (MHG), Old Saxon (OS), and Middle Low German (MLG) in a database, resulting in a word-family structure for High and Low German from the beginnings up to the 15th century (for High German) and up to the 17th century (for Low German). The basis of the vocabulary is provided by reference dictionaries of the four historical varieties, whereas the word families’ historical structure is based on the word-family dictionary of OHG by Jochen Splett (1992). Each lemma in the database will be assigned, where appropriate, to a word family. The individual word-formation elements and the word-formation hierarchy will be mapped in a structural formula. The etymologically corresponding lemmas and word families of the different periods/varieties of older German will be linked so that an analysis across the varieties will also be possible. The annotations of word families in the database (e. g., relating to word structure) will be supplemented by linking their lemmas to the online dictionaries and to the reference corpora of Old German (OS and OHG), MHG, and MLG.
The digital environment represents a qualitatively new level of service for research work with linguistic information presented in dictionary form. And first of all, this applies to index systems. By dictionary indexing we mean a set of formalized rules and procedures, on the basis of which it is possible to obtain information about certain linguistic facts recorded in the dictionary. These rules are implemented in the form of user interfaces. However, one should take into account the fact that the effectiveness of automatic construction of index schemes for a digital dictionary is possible only in a sufficiently formalized environment. This article describes the method and technology of indexing the Etymological Dictionary of the Ukrainian Language (EDUL). For the language indexing of the dictionary, a special computer instrumental system (VLL – virtual lexicographic laboratory) was developed, and adapted to the structure of the EDUL and focused on the creation of indexes in automatic mode. The digital implementation of the EDUL made it possible to access the entire corpus of the dictionary text regardless of the time of publication of the corresponding volume and opened up opportunities for various digital interpretations of etymological information.
The paper describes an online German-Russian database for phraseological constructions (PhC), or syntactic idioms. It is a linguistic phenomenon representing a stable multi-word form that usually contains some auxiliary words (“anchors”) and partially opens up empty spaces (“slots”) which are filled directly in spoken language by various lexemes or combinations of lexemes (“fillers”, or “slot fillers”). Linguists from several German institutions are currently working on the database. The PhCs selected for the database have to meet special criteria. The database is a manual that combines scientific descriptions, a thesaurus and a bilingual dictionary. The database is designed as an active aid for text production in the respective foreign language; it is also a manual for language researchers and for translators. Apart from that, it can serve as a basis for extensions for other language pairs. The aim of the project is to record and to describe 300 PhC before the database is published. Our objective is to enable foreign language learners to use the syntactic idioms correctly in the texts they produce rather than create a big-sized database. The paper describes some issues related to the creation of the database, namely objectives and target groups, material and methods, microstructure of the database article and some others.
This paper presents the methodology of a research project on the use of specialised German dictionaries. A mixed-methods research approach will help to answer the following main questions, concerning the lexicographic presentation of the data on the one hand and the data collection on the other hand: How do different systems of data organization and presentation affect the likelihood that users will correctly find and select the data they look up? And does the probability of success increase if users are familiar with the system? Which advantages and disadvantages do lexicographers and specialised-language experts see in using quantitative methods to extract terms? And are these methods accepted and considered reliable by the user community?
The purpose of this paper is to present the lexicographic protocol and to report on the progress of compilation of Mikaela_Lex, a free online monolingual Greek school dictionary for upper elementary students with visual impairments, comprising 4,000 lemmata. The dictionary is equipped with new digital tools, such as a “Braille-system” keyboard, a “speech-to-text” tool, a “text-to-speech” tool and also QWERTY accessibility for students without visual impairments.
This paper describes a method for automatic identification of sentences in the Gigafida corpus containing multi-word expressions (MWEs) from the list of 5,242 phraseological units, which was developed on the basis of several existing open-access lexical resources for Slovene. The method is based on a definition of MWEs, which includes information on two levels of corpus annotation: syntax (dependency parsing) and morphology (POS tagging), together with some additional statistical parameters. The resulting lexicon contains 12,358 sentences containing MWEs extracted from the corpus. The extracted sentences were analysed from the lexicographic point of view with the aim of establishing canonical forms of MWEs and semantic relations between them in terms of variation, synonymy, and antonymy.
This paper consists of a short analysis of the sources and the treatment of the legal lexicon in the first dictionary published by the Spanish Royal Academy (1726–1739), followed by a longer commentary on the representation and the treatment of the concept of judge, in which the reflection of the extralinguistic factors in the definitions stands in focus. The results highlight the relevance of the legal context of that era for the treatment of the lexicon related to the legal domain, but they also demonstrate the pattern in which the lexicographic data displays peculiarities of legal matters.
In the etymological information for a word in a dictionary, the first question to be answered is whether the word is a borrowing or the result of word formation. Here, we consider this question for internationalisms ending in -ation in German and in -ácia in Slovak. In German, -ation is a suffix that attaches to verbs in -ieren. For these verbs, it is in competition with -ung. In Slovak, -ácia is a suffix that attaches to bases of Latin or Greek origin. The corresponding verbs are often backformations. Most Slovak verbs also have a nominalization in -nie. In order to investigate to what extent the nouns in -ation or -ácia are borrowings or derived from the corresponding verbs in German and Slovak, we took a random sample of English nouns in -ation for which OED gives a corresponding verb. For this sample, we checked whether the cognate noun in -ation or -ácia is attested in standard dictionaries and in corpora. Then we did the same for the corresponding verbs and the nouns in -ung or -nie. Finally, we checked the frequency of these words in DeReKo for German and SNK for Slovak. On this basis, we found evidence that -ation in German has a slightly different status to -ácia in Slovak. This status affects the relationship to the corresponding verbs and to the nouns in -ung or -nie. Such generalizations are important as background information for specifying etymological information in dictionaries, especially for languages where first attestation dates are not readily available.
Inspired by GWLN 3, we take a look at the new words, meanings, and expressions that have been created during or promoted by the COVID-19 pandemic. The pandemic provides a rare opportunity to follow the rise, spread, and integration of words and expressions in a language that may serve as an illustration of how linguistic innovation in general works. Relevant words were selected from various lists, notably monthly and annual lists of prominent words attested in the corpus of The Danish Dictionary. Analysis of these lists gives an insight into the number of words that stand out month by month and what kinds of words are involved, both in terms of morphological type and of semantic category, with special attention given to neologisms. Finally, we discuss the criteria for selecting which words to include in the dictionary. With this study, Danish is added to the list of languages covered in the GWLN series on COVID-19 neologisms.
This study examines a list of 3,413 neologisms containing one or more borrowed items, which was compiled using the databases built by the Korean Neologism Investigation Project. Etymological and morphological aspects are taken into consideration to show that, besides the overwhelming prevalence of English-based neologisms, particular loans from particular languages play a significant role in the prolific formation of Korean neologisms. Aspects of the lexicographic inclusion of loan-based neologisms demonstrate the need for Korean neologism and lexicography research to broaden its scope in terms of methodology and attitudes, while also providing a glimpse of changes.
One central goal of the project ‘Zentrum für digitale Lexikographie der deutschen Sprache’ (Center for digital lexicography for the German Language, www.zdl.org) is to provide a corpus-based lexicographic component of common German multi-word expressions (MWE), including idioms, for DWDS (www.dwds.de), a general language dictionary of contemporary German. As a central challenge of this task, we have identified an adequate lexicographic representation of such common properties of MWE as variation and modification. To document the variation, we have developed a special entry-clustering model, which we call hub-node entry. This model comprises a core hub entry headed by a short nuclear form of the MWE and several node entries, which represent the most common variants in their full lexical forms.
This paper discusses an investigation of how senses are ordered across eight dictionaries. A dataset of 75 words was used for this purpose, and two senses were examined for each word. The words are divided into three groups of 25 words each according to the relationship between the senses: Homonymy, Metaphor, and Systematic Polysemy. The primary finding is that WordNet differs from the other dictionaries in terms of Metaphor. The order of the senses was more often figurative/literal, and it had the highest percentage of figurative senses that were not found. We discuss leveraging another dictionary, COBUILD, to re-order the senses according to frequency.
The present paper examines the usage of 341 COVID-19 neologisms which appeared in South Korea over a span of eighteen months (from December 2019 to May 2021) and were extracted from a corpus composed of COVID-19-related news articles and comments, the COVID-19 Corpus, in order to address the following research questions: 1) How do the 341 COVID-19 neologisms extracted rank in news articles and comments respectively?, 2) What usage trends do neologisms designating the disease and other high-frequency neologisms show in news articles and comments respectively?, 3) What characteristic differences do comments as a non-expert and subjective language resource and news articles as an expert and objective language resource show and what value may each genre add to the lexicographic description of neologisms?
Since the beginning of 2020, the Covid-19 pandemic has dominated public discourse and introduced a wealth of words and expressions to the general vocabulary of English and other world languages. The lexical adaptation necessitated by this global health crisis has been unprecedented in speed and scope, and in response, the Oxford English Dictionary (OED) has continually revised its coverage, publishing special updates of Covid-19-related words in 2020 outside of its usual quarterly publication cycle. This article describes how OED lexicographers have analysed language corpora and other text databases to monitor the development of pandemic-related words and provide a linguistic and historical context to their usage.
The syntagma gel hidroalcohólico ‘hydroalcoholic gel’ or the noun hidroalcohol ‘hydroalcohol’ cannot be found in Diccionario de la lengua española (DLE) of the Real Academia Española (‘Royal Spanish Academy’) or other general reference dictionaries of the Spanish language. This is so despite the fact that, for well over a year and to this very day, we have not been able to do anything without first sanitising our hands with this product. It is one of the many neologisms that the COVID-19 pandemic has brought us, and these have become commonly used words that dictionaries should consider as candidates for future updates.
By looking at the dictionarisability of these neologisms, in this work we try to set their boundaries on the continuum along which they fall. “Dictionarisability” means, in our context, the greater or lesser interest of these units for the updating of general language dictionaries. At both ends of this continuum, there are surprising nonce words, as well as neologisms that have recently lost their status as such because they have now been incorporated into the dictionary. To identify different groups on the continuum of pandemic neologisms, we take into account the criteria proposed in the current literature and, by so doing, we are able to assess the extent to which they are discriminatory. This will allow us to address the neological process and to reflect on its various stages, from the time a neologism is born until the moment it ceases to be one because it has been dictionarised. Before that, however, we present the framework of our study and refer to the mechanisms available for detecting neologisms in general and pandemic neologisms in particular.
The aim of this work is to describe criteria used in the process of inclusion and treatment of neologisms in dictionaries of Spanish within the framework of pandemic instability. Our starting point will be data obtained by the Antenas Neológicas Network (https://www.upf.edu/web/antenas), whose representation in three different lexicographic tools will be analyzed with the purpose of identifying problems in the methodology used to dictionarize neologisms during the COVID-19 pandemic, that is, which words were selected for inclusion in dictionaries and how they were represented in their entries (sources and corpora of analysis, selection criteria, types of definition, among other aspects). Two of them are monolingual and COVID-19 lexical units were included as part of their updates: the Antenario, a dictionary of neologisms of Spanish varieties, and the Diccionario de la Lengua Española [DLE], a dictionary of general Spanish, published by the Real Academia Española [RAE] (Spanish Royal Academy). The other is a bilingual unidirectional English-Spanish dictionary first published as a glossary, Diccionario de COVID-19 EN-ES [TREMEDICA], entirely made up of neological and non-neological lexical units related to the virus and the pandemic. Thus, the target lexis was either included in existing works or makes up the whole of a new tool located in a portal together with other lexicographic tools. Unlike other collections of COVID-19 vocabulary that kept cropping up as the pandemic unfolded, all three have been designed and written according to well-established lexicographic practices.
Our working hypothesis is that the need to record and define recently created words affects the criteria for inclusion and treatment of neologisms in dictionaries of Spanish, including a certain degree of overlap of some features which are traditionally thought to be specific to each type of dictionary.
This paper presents the project “The first Romanian bilingual dictionaries (17th century). Digitally annotated and aligned corpus” (eRomLex) which deals with the editing of the first bilingual Romanian dictionaries. The aim of the project is to compile an electronic corpus comprising six Slavonic-Romanian lexicons dating from the 17th century, based on their relatedness and the fact that they follow a common model in order to highlight the characteristics of this lexicographical network (the affiliations between the lexicons, the way they relate to the source, the innovations towards it, their potential uses) and to facilitate the access to their content. A digital edition allows exhaustive data extraction and comparison and link with other digitized resources for old Romanian or Church Slavonic, including dictionaries. After presenting the corpus, we point to the necessary stages in achieving this project, the techniques used to access the material and the challenges and obstacles we encountered along the way. We describe how the corpus was created, stored, indexed and can be searched over; we will also present and discuss some statistical analyses highlighting relations between the Romanian lexicons and their Slavonic-Ruthenian source.
In this paper we present Trendi, a monitor corpus of written Slovene, which has been compiled recently as part of the SLED (Monitor corpus and related resources) project. The methodology and the contents of the corpus are presented, as well as the findings of the survey that aimed to identify the needs of potential users related to topical language use. The Trendi corpus currently contains news articles and other web content from 110 different sources, with the texts being collected and linguistically annotated on a daily basis. The corpus complements Gigafida 2.0, a 1.13-billion-word reference corpus of standard written Slovene. Also discussed are the ways in which the corpus will be integrated into various lexicographic projects, helping not only in the identification of neologisms but also in monitoring changes in already identified language phenomena.
This think-aloud study charts the use of online resources by five final-year MA students in Nordic and Literacy Studies based on the analysis of screen and audio recordings of an error-correction task. The article briefly presents some linguistic features of Norwegian Nynorsk that are not common in the context of other European languages, that is, norm optionality with regards to inflection and spelling. While performing the task, the participants were allowed to use all digital aids. This article examines their resource consultation behavior, and it makes use of Laporte/Gilquin’s (2018) annotation protocol. The following research questions are posed: What online resources are used by the students? What characterizes the use? Are online resources helpful? This study provides new insights into an as yet little explored topic within the Norwegian context. The findings demonstrate that the participants relied heavily on the official monolingual dictionary Nynorskordboka. Indeed, the dictionary was helpful in the vast majority of the searches, either resulting in error improvement or the validation of a word; that is, many of the searches considered correct words. The findings suggest severe norm insecurity and emphasize the need to improve norm knowledge and metalinguistic knowledge as prerequisites for better utilization of aids. It is also suggested to include necessary information on norm optionality and other commonly queried issues in the dictionary architecture.
Comparable corpora for multilingual contrastive studies. Challenges and desiderata
(2022)
This contribution aims to show the necessity of working on the development of multilingual corpora and appropriate tools for multilingual contrastive studies. We take the corpus of the lexicographical project COMBIDIGILEX as an example to show how difficult it is to build a suitable data basis to study and compare linguistic phenomena in German, Spanish and Portuguese. Despite the availability of big reference corpora for the three languages (at least for written language), it is not possible to obtain a comparable data basis from them, because these corpora are created according to different requirements and are also powered by disparate information systems and analysis tools. To break the status quo, we plead for expanding research infrastructures by means of compatible language technology and data sharing.
Identity effects in phonology are deviations from regular phonological form (i.e. canonical patterns) which are due to the relatedness between words. More specifically, identity effects are those deviations which have the function to enhance similarity in the surface phonological form of morphologically related words. In rule-based generative phonology the effects in question are described by means of the cycle. For example, the stress on the second syllable in cond[ɛ]nsation as opposed to the stresslessness of the second syllable in comp[ǝ]nsation is described by applying the stress rules initially to the stems, thereby yielding condénse and cómpensàte. Subsequently the stress rules are reapplied to the affixed words with the initial stress assignment (i.e. stress on the second syllable in condense, but not in compensate) leaving its mark in the output form (cf. Chomsky and Halle 1968). A second example are words like lie[p]los 'unloving' in German, which shows the effects of neutralization in coda position (i.e. only voiceless obstruents may occur in coda position) even though the obstruent should 'regularly' be syllabified in head position (i.e. bl is a wellformed syllable head in German). Here the stem is syllabified on an initial cycle, obstruent devoicing applies (i.e. lie[p]) and this structure is left intact when affixation applies (i.e. lie[p]los) (cf. Hall 1992). As a result the stem of lie[p]los is identical to the base lie[p].
Lexicography
(2008)
Esipuhe/Preface
(2020)
The theme of the AFinLA 2020 Yearbook Methodological turns in applied language studies is discussed in this introductory article from three interrelated perspectives, variously addressed in the three plenary presentations at the AFinLA Autumn Symposium 2019 as well as in the thirteen contributions to the yearbook. In the first set of articles presented, the authors examine the role and impact of technological development on the study of multimodal digital and non-digital contexts and discourses and ensuing new methods. The second set of studies in the yearbook revisits issues of language proficiency, critically discussing relevant concepts and approaches. The third set of articles explores participation and participatory research approaches, reflecting on the roles of the researcher and the researched community.
This chapter examines how language switches in meetings are managed by the co-present parties, who treat them as raising problems of participation and orient to the fact that choosing a particular language can increase or decrease the participation of some or all of the members present. The choice of one language rather than another is studied as a response to a members' problem and as a decision taken by them, one that displays how they orient to its consequences and how they build its justification and legitimacy. In this sense, the choice of English, or of several co-existing or even alternating languages, has no intrinsically positive or negative value in terms of participation, adequacy or efficiency; its value is situated and occasioned, depending on the specific participation formats, the locally recognized competences and the way the interaction is organized. To explore this articulation between language choice and participation systematically, we focus on one particular, recurrent phenomenon: the announcement that projects a language switch, which can take a form such as “now we will switch into English so that you can participate”. We analyze it with regard to the sequential position in which it is produced, its format, the way it is addressed to some or all of the co-present parties, and the specific action it accomplishes. We also study how it is received, its effects on the participation framework, and the categorizations that follow from it. We thereby show the mutually configuring relation that is established between language choice and participation framework.
Our analyses draw on several corpora of international professional meetings, audio- and video-recorded in a number of settings. The video data invite us to consider not only the linguistic dimension of participation frameworks and language switches but also their multimodal organization: the embodied organization of code-switching has hardly been explored so far, and embodied participation remains understudied, as does its link with specific interactional spaces. This chapter shows that multimodal details are crucial for understanding the links between plurilingualism and participation as occasioned, contingent and emergent dynamics.
This paper aims at contributing to the analysis of overlaps in turns-at-talk from both a sequential and a multimodal perspective. Overlaps have been studied within Conversation Analysis by focusing mainly on verbal and vocal resources; taking into account multimodal resources such as gesture, bodily posture, and gaze contributes to a better understanding of participants’ orientations to the sequential organization of overlapping talk and their management of speakership. First, we introduce the way in which overlaps have been studied in Conversation Analysis, mainly by Jefferson (1973, 1983, 2004) and Schegloff (2000); then we propose possible implications of their multimodal analysis. In order to demonstrate that speakers systematically orient to the overlap onset and resolution we analyze the multimodal conduct of overlapped speakers. Findings show methodical variations in trajectories of overlap resolution: speakers’ gestures in overlap display themselves as maintaining or withdrawing their turn, thereby exhibiting the speakership achieved and negotiated during overlap.
Cette contribution s'intéresse aux co-constructions d'un tour de parole en interaction, plus spécifiquement, à la manière dont la complétion d'un énoncé de la part d'un co-participant est ensuite réceptionnée par le locuteur dont le tour a été complété. Malgré l'intérét certain porté par l'analyse conversationnelle et la linguistique interactionnelle à la co-énonciation, l'évaluation de cette pratique par le premier locuteur n’a pas fait l’objet d’analyses approfondies. Dans ce qui suit, nous nous focalisons plus particulièrement sur les pratiques interactionnelles qui permettent aux participants de valider une co-construction. Ce travail est issu du projet ANR SPIM (« L'imitation dans la parole »), dans le cadre duquel nous nous sommes interrogée sur la fonction de l'hétéro-répétition (le fait de répéter un énoncé d'un autre locuteur ou une partie de celui-ci, opposée à l'auto-répétition) dans des séquences de co-construction d'un tour de parole. Dans la partie analytique, nous contrastons deux possibilités de validation d'une complétion collaborative, à savoir l'acquiescement simple (« oui ») et l'hétéro-répétition simple. Sur la base d’enregistrements vidéo de conversations naturelles, nous montrons que ces deux pratiques ne valident pas la complétion collaborative de la même manière, mais qu'elles permettent aux locuteurs d’évaluer finement le caractère plus ou moins adéquat des éléments co-construits.
In this chapter, we discuss steps toward extending CMDI’s semantic interoperability beyond the Social Sciences and Humanities: We stress the need for an initial data curation step, in part supported by a relation registry that helps impose some structure on CMDI vocabulary; we describe the use of authority file information and other controlled vocabulary to help connect CMDI-based metadata to existing Linked Data; we show how significant parts of CMDI-based metadata can be converted to bibliographic metadata standards and hence entered into library catalogs; and finally we describe first steps to convert CMDI-based metadata to RDF. The initial grassroots approach of CMDI (meaning that anybody can define metadata descriptors and components) mirrors the AAA slogan of the Semantic Web (“Anyone can say Anything about Any topic”). Ironically, this makes it hard to fully link CMDI-based metadata to other Semantic Web datasets. This paper discusses the challenges of this enterprise.
Particularly where the collection and initial assessment of research data are concerned, there is currently much talk of a transition to citizen science. Having first played a larger role in the life sciences, this concept has recently also appeared in parts of linguistics. Many relevant initiatives build on activities in which professionalized science has traditionally drawn on the help of ‘laypeople’; now, however, they can make use of the possibilities of electronic communication and electronic data management, which have grown to an unforeseen extent. Yet digital interaction expands the possibilities of those regarded as participating “laypeople” so greatly that a qualitatively new relationship develops among those involved in the research process. This contribution discusses the consequences of this change for scientific practice and also for the understanding of the concept of “science”.
Concurrent standardization as a necessity: The genesis of the new official orthographic guidelines
(2009)
The new official orthographic guidelines were brought into force by the state authorities on August 1st, 1998; their principal goals were a standardized representation of the guidelines and a «gentle simplification in respect of content». The regulation was not supported by the public; in fact, it became the starting point for a struggle over conceptual solutions and a search for consensus between different possible norms. Since orthography is an officially codified standard that holds a prominent position among linguistic standards, it is of particular socio-political importance. The foremost task of the Council for German Orthography (Rat für deutsche Rechtschreibung), instituted in December 2004, was to work out a compromise in order to end the «Orthographical war» (Die Zeit), which had been waged enthusiastically for more than a decade. This article places the agreement reached in 2006 in its historical context. Against this background, it can be stated that official guidelines will only be accepted if they are based on actual written usage and take the interests of the reader into account. Both principles characterize the proposal made by the Council for German Orthography. The article concludes with an outlook on the Council's expected future activities in orthographic standardization.
Grußwort
(2018)
Grußwort/Welcome address
(2018)
“To cleanse and at the same time enrich your mother tongue is the task of the brightest people.”
With this quote Goethe, the famous German poet, seemed to have described the work of EFNIL today. But is our task really that easy? Do we “cleanse” our language by deleting superfluous elements? Do we not lose the rich abundance of a language in so doing? Or is Goethe asking for other languages to be prevented from influencing his mother tongue? Would this even be feasible in a globalised world?
Rudi Carrell, a famous entertainer on German TV, once said:
“When I came to Germany I only spoke English. But the German language contains so many English words nowadays that I am now fluent in German!”
His opinion is probably shared by many people learning German.
My daily job is to support around 100,000 schools abroad that offer German as a foreign language. We ask ourselves daily: which German language should we be offering young people today? The classical German of literature? Or practical German which will enable young people to join the workforce of many German companies worldwide? And most of all: how do we motivate young people to learn German? Or any other foreign language?
Yes, English, French, German, Spanish – these languages are in competition in many schools. But the most important fact is: the benefit lies in learning a foreign language, no matter which. Because by learning a foreign language we start to understand foreign cultures and other people. And THAT is what matters.