EFNIL, the European Federation of National Institutions for Language, promotes the standard languages and the linguistic diversity of the European countries as an essential characteristic of their cultural diversity and wealth. The 17th annual conference of EFNIL in Tallinn dealt with the relation between language and economy.
• Language policy often has economic aims, the language use of individuals is embedded in economic conditions, and languages seem to differ in their economic value. In recent years, economists and sociolinguists have developed models for describing these interdependencies.
• Interaction in multilingual settings needs professional handling. There are traditional domains such as language teaching and translation, and new professional fields of the digital age such as multilingual databases. Many economic needs and opportunities arise in this field.
• Digitization and societal diversity are two factors leading to more successful interaction, assisted by automatic everyday translation, the development of plain language, and similar tools.
This volume presents an extensive overview of the interplay of language and economy.
This paper deals with multiword lexemes (MWLs), focussing on two types of verbal MWLs: verbal idioms and support verb constructions. We discuss the characteristic properties of MWLs, namely nonstandard compositionality, restricted substitutability of components, and restricted morpho-syntactic flexibility, and we show how these properties may cause serious problems during the analysis, generation, and transfer steps of machine translation systems. In order to cope with these problems, MT lexicons need to provide detailed descriptions of MWL properties. We list the types of information which we consider the necessary minimum for a successful processing of MWLs, and report on some feasibility studies aimed at the automatic extraction of German verbal multiword lexemes from text corpora and machine-readable dictionaries.
Statistical methods are currently widely used in language technology. One basic idea is to train programs on large amounts of data. For training statistical language models, the current motto is "the more data, the better". In our machine translation system we see an almost constant improvement in quality (measured as BLEU score) with every doubling of the amount of monolingual training data. Even with about 20 billion words of news text and about 200 billion words from web pages, no flattening of the learning curve is in sight.
This article gives brief introductions to statistical machine translation, the evaluation of translations with the BLEU score, and statistical language models. We show how strongly the size of the language model's training data influences translation quality. We then describe the storage of large amounts of data, training in a parallel architecture, and the efficient use of models of up to 1 terabyte in machine translation.
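For readers unfamiliar with the BLEU score mentioned in this abstract, the following is a minimal sketch of the metric, not the implementation used in the article. It assumes a single reference translation, pre-tokenized input, sentence-level scoring, and no smoothing (BLEU is normally computed over a whole test corpus); the function names `ngrams`, `modified_precision`, and `bleu` are illustrative.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: each candidate n-gram is credited
    at most as often as it occurs in the reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU against one reference (assumption: no smoothing,
    so any n-gram order with zero matches yields a score of 0)."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    # Geometric mean of the n-gram precisions with uniform weights.
    log_avg = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(log_avg)


if __name__ == "__main__":
    ref = "the cat sat on the mat".split()
    hyp = "the cat sat on a mat".split()
    print(f"BLEU = {bleu(hyp, ref):.3f}")
```

The clipping step is what keeps a degenerate output that repeats one frequent word from scoring well, and the brevity penalty compensates for the fact that precision alone would reward very short outputs.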
In the past two decades, more and more dictionary usage studies have been published, but most of them deal with questions related to what users appreciate about dictionaries, which dictionaries they use and what type of information they need in specific situations — presupposing that users actually consult lexicographic resources. However, language teachers and lecturers in linguistics often have the impression that students do not use enough high-quality dictionaries in their everyday work. With this in mind, we launched an international cooperation project to collect empirical data to evaluate what it is that students actually do while attempting to solve language problems. To this end, we applied a new methodological setting: screen recording in conjunction with a thinking-aloud task. The collected empirical data offers a broad insight into what users really do while they attempt to solve language-related tasks online.
With his publications on German grammar, verb valency, and contrastive linguistics, Ulrich Engel has had a major impact on international German linguistics. Less well known is that his work has also influenced other linguistic subdisciplines, which still benefit from it today. Dependency-based approaches now play a central role in automatic syntactic analysis, and Engel's work has also left its mark on the development of machine translation systems. The construction of language resources in the form of "treebanks" can draw on Engel's conception of grammar, and there are also clear links to construction grammar, which has recently been flourishing again. The article presents these lesser-known influences of Engel's work on other areas and acknowledges their continuing relevance.
This paper describes the lexical database tool LOLA (Linguistic-Oriented Lexical database Approach), which has been developed for the construction and maintenance of lexicons for the machine translation system LMT. First, the requirements such a tool should meet are discussed; then LMT, the lexical information it requires, and some issues concerning vocabulary acquisition are presented. Afterwards, the architecture and the components of the LOLA system are described, and it is shown how we tried to meet the requirements worked out earlier. Although LOLA was originally designed and implemented for the German-English LMT prototype, it aimed from the beginning at a representation of lexical data that can be reused for other LMT or MT prototypes or even other NLP applications. A special point of discussion will therefore be the adaptability of the tool and its components, as well as the reusability of the lexical data stored in the database for lexicon development for LMT or for other applications.
The English language has taken advantage of the Digital Revolution to establish itself as the global language; however, only 28.6% of Internet users speak English as their native language. Machine Translation (MT) is a powerful technology that can bridge this gap. In development since the mid-20th century, MT has become available to every Internet user in the last decade, due to free online MT services. This paper aims to discuss the implications that these tools may have for the privacy of their users and how they are addressed by EU data protection law. It examines the data-flows in respect of the initial processing (both from the perspective of the user and the MT service provider) and potential further processing that may be undertaken by the MT service provider.
Communication across all language barriers has long been a goal of humankind. In recent years, new technologies have enabled this at least partially. New approaches and methods in the field of Machine Translation (MT) are continuously being improved, modified, and combined. Significant progress has already been achieved in this area; many automatic translation tools, such as Google Translate and Babelfish, can translate not only short texts but also complete web pages in real time. More recently, advances have been made in the mobile area; Google's Translate app for Android and iOS, for example, can recognize and translate words within photographs taken by the mobile device (to translate a restaurant menu, for instance). Despite this progress, a "perfect" machine translation system seems to be an impossibility because a machine translation system, however advanced, will always have some limitations. Human languages contain many irregularities and exceptions and go through a constant process of change, which is difficult to measure or to process automatically. This paper gives a short introduction to the state of the art in MT. It examines the following aspects: types of MT, the most conventional and widely developed approaches, and the advantages and disadvantages of these different paradigms.