OPUS 4 | Search

63 search hits

1 to 10

Data-driven identification of German phrasal compounds (2017)

We present a method to identify and document a phenomenon on which there is very little empirical data: German phrasal compounds occurring in the form of a single token (without punctuation between their components). Relying on linguistic criteria, our approach requires an operational notion of compounds that can be applied systematically, as well as (web) corpora that are large and diverse enough to contain rarely seen phenomena. The method is based on word segmentation and morphological analysis and takes advantage of a data-driven learning process. Our results show that coarse-grained identification of phrasal compounds is best performed with empirical data, whereas fine-grained detection could be improved with a combination of rule-based and frequency-based word lists. Along with the characteristics of web texts, the orthographic realizations seem to be linked to the degree of expressivity.

Reconstruction of separable particle verbs in a corpus of spoken German (2018)

Batinić, Dolores ; Schmidt, Thomas

We present a method for detecting and reconstructing separated particle verbs in a corpus of spoken German by following an approach suggested for written language. Our study shows that the method can be applied successfully to spoken language, compares different ways of dealing with structures that are specific to spoken language corpora, analyses some remaining problems, and discusses ways of optimising precision or recall for the method. The outlook sketches some possibilities for further work in related areas.

The CLARIN infrastructure as an interoperable language technology platform for SSH and beyond (2023)

Branco, António ; Eskevich, Maria ; Frontini, Francesca ; Hajič, Jan ; Hinrichs, Erhard ; de Jong, Franciska ; Kamocki, Paweł ; König, Alexander ; Lindén, Krister ; Navarretta, Constanza ; Piasecki, Maciej ; Piperidis, Stelios ; Pitkänen, Olli ; Simov, Kiril ; Skadiņa, Inguna ; Trippel, Thorsten ; Witt, Andreas ; Zinn, Claus

CLARIN is a European Research Infrastructure Consortium developing and providing a federated and interoperable platform to support scientists in the field of the Social Sciences and Humanities in carrying out language-related research. This contribution provides an overview of the entire infrastructure with a particular focus on tool interoperability, ease of access to research data, tools and services, the importance of sharing knowledge within and across (national) communities, and community building. By taking into account FAIR principles from the very beginning, CLARIN has become a successful example of a research infrastructure that is actively used by its members. The benefits CLARIN members reap from their infrastructure secure a future for their common good that is both sustainable and attractive to partners beyond the original target groups.

Tools for multimodal annotation (2017)

Cassidy, Steve ; Schmidt, Thomas

Researchers interested in the sounds of speech or the physical gestures of speakers make use of audio and video recordings in their work. Annotating these recordings presents a different set of requirements to the annotation of text. Special purpose tools have been developed to display video and audio signals and to allow the creation of time-aligned annotations. This chapter reviews the most widely used of these tools for both manual and automatic generation of annotations on multimodal data.

The Naproche Project. Controlled Natural Language Proof Checking of Mathematical Texts (2010)

Cramer, Marcos ; Fisseni, Bernhard ; Koepke, Peter ; Kühlwein, Daniel ; Schröder, Bernhard ; Veldman, Jip

This paper discusses the semi-formal language of mathematics and presents the Naproche CNL, a controlled natural language for mathematical authoring. Proof Representation Structures, an adaptation of Discourse Representation Structures, are used to represent the semantics of texts written in the Naproche CNL. We discuss how the Naproche CNL can be used in formal mathematics, and present our prototypical Naproche system, a computer program for parsing texts in the Naproche CNL and checking the proofs in them for logical correctness.

Notionalization : the transformation of descriptions into categorizations (2011)

Deppermann, Arnulf

This paper analyses one specific conversational practice of formulation called ‘notionalization’. It consists in the transformation of a description by a prior speaker into a categorization by the next speaker. Sequences of this kind are a “natural laboratory” for studying the differences between descriptions and categorizations regarding their semantic, interactional, and rhetorical properties: Descriptive/narrative versions are often vague and tentative multi-unit turns, which are temporalized and episodic, offering a lot of contingent, situational, and indexical detail. Notionalizations turn them into condensed, abstract, timeless, and often agentless categorizations expressed by a noun (phrase) within one turn-constructional unit (TCU). Drawing on audio- and video-taped German data from various types of interaction, the paper focuses on one particular practice of notionalization, the formulation of purportedly common ground by TCUs prefaced with the connective also. The paper discusses their turn-constructional and morphological properties, pointing out affinities of notionalization with language for special purposes. Notionalizations are used for reducing detail and for topical closure. They provide grounds for emergent keywords, which can be reused to re-contextualize topical issues and interactional histories efficiently. Notionalizations are powerful means for accomplishing intersubjectivity while pursuing (sometimes one-sided) practical relevancies at the same time. Their inevitably perspective design thus may lead to re-opening the issue they were deemed to settle. The paper closes with an outlook to other practices of notionalization, pointing to dimensions of interactionally relevant variation and commonalities.

The study of formulations as a key to an interactional semantics (2011)

Deppermann, Arnulf

As an introduction to the Special Issue on “Formulation, generalization, and abstraction in interaction,” this paper discusses key problems of a conversation analytic (CA) approach to semantics in interaction. Prior research in CA and Interactional Linguistics has only rarely dealt with issues of linguistic meaning in interaction. It is argued that this is a consequence of limitations of sequential analysis to capture meaning in interaction. While sequential analysis remains the encompassing methodological framework, it is suggested that it needs to be complemented by analyzing semantic relationships between choices of formulation in the interaction, ethnography, and structural techniques of comparing selected options with possible alternatives. The paper describes the methodological approach taken to interactional semantics by the papers in the Special Issue, which analyse practices of generalization and abstraction in interaction as they are accomplished by formulations of prior versions of reference and description.

Positioning in adolescents’ peer co-narrations: The case of mock fiction (2021)

Deppermann, Arnulf

Mock fiction is a genre of humorous, fictional narratives. It is pervasive in adolescents’ peer-group interaction. Building on a corpus of informal peer-group interaction among 14- to 17-year-old German adolescents, it is shown how mock fiction is used to sanction identity claims of peer-group co-members that are taken to be inadequate by the teller of a mock fiction. Mock fiction exposes and ridicules those claims by fictional exaggeration. Mock fiction is an indirect, yet sometimes even highly abusive means for criticizing and negotiating identities and statuses of peer-group members. The analysis shows how mock fiction is collaboratively produced, how it is used to convey criticism and to negotiate social norms indirectly, and how, in addition, it allows for performative self-positioning of the tellers as skilled, entertaining tellers and socio-psychological diagnosticians.

Developing a Knowledge Graph for a Question Answering System to Answer Natural Language Questions on German Grammar (2019)

Falke, Stefan

Question Answering Systems for retrieving information from Knowledge Graphs (KG) have become a major area of interest in recent years. Current systems search for words and entities but cannot search for grammatical phenomena. The purpose of this paper is to present our research on developing a QA System that answers natural language questions about German grammar. Our goal is to build a KG which contains facts and rules about German grammar, and is also able to answer specific questions about a concrete grammatical issue. An overview of current research on QA systems and ontology design is given, and we show how we plan to construct the KG by integrating the data in the grammatical information system Grammis, hosted by the Leibniz-Institut für Deutsche Sprache (IDS). In this paper, we describe the construction of the initial KG, sketch our resulting graph, and demonstrate the effectiveness of such an approach. A grammar correction component will be part of a later stage. The paper concludes with the potential areas for future research.

Different Views on Markup (2010)

Goecke, Daniela ; Lüngen, Harald ; Metzing, Dieter ; Stührenberg, Maik ; Witt, Andreas

In this chapter, two different ways of grouping information represented in document markup are examined: annotation levels, referring to conceptual levels of description, and annotation layers, referring to the technical realisation of markup using, e.g., document grammars. In many current XML annotation projects, multiple levels are integrated into one layer, often leading to the problem of having to deal with overlapping hierarchies. As a solution, we propose a framework for multiple, independent XML annotation layers for one text, based on an abstract representation of XML documents with logical predicates. Two realisations of the abstract representation are presented: a Prolog fact base format together with an application architecture, and a specification for native XML databases. We conclude with a discussion of projects that have currently adopted this framework.
