OPUS 4 | 400 Language, Linguistics

400 Language (135)
401 Philosophy of language, theory of language (2)
402 Miscellany
403 Dictionaries, encyclopedias
404 Special topics (1)
405 Serial publications
406 Organizations, management
407 Education, research, related topics (1)
408 Treatment by groups of persons
409 Geographic and persons treatment

36 search hits


MULLE: A grammar-based Latin language learning tool to supplement the classroom setting (2018)

MULLE is a language learning tool that focuses on teaching Latin as a foreign language. It is designed for easy integration into the traditional classroom setting and syllabus, which distinguishes it from other language learning tools that provide a standalone learning experience. It uses grammar-based lessons and gamification methods to improve learner motivation. The main type of exercise in our application is translation practice, but the focus can also be shifted to vocabulary or morphology training.

Applying co-training to reference resolution (2002)

Müller, Mark-Christoph ; Rapp, Stefan ; Strube, Michael

In this paper, we investigate the practical applicability of Co-Training to the task of building a classifier for reference resolution. We are concerned with the question of whether Co-Training can significantly reduce the amount of manual labeling work and still produce a classifier with acceptable performance.

Annotating anaphoric and bridging relations with MMAX (2001)

Müller, Mark-Christoph ; Strube, Michael

We present a tool for the annotation of anaphoric and bridging relations in a corpus of written texts. Based on differences as well as similarities between these phenomena, we define an annotation scheme. We then implement the scheme within an annotation tool and demonstrate its use.

Multi-level annotation in MMAX (2003)

Müller, Mark-Christoph ; Strube, Michael

We present a light-weight tool for the annotation of linguistic data on multiple levels. It is based on the simplification of annotations to sets of markables having attributes and standing in certain relations to each other. We describe the main features of the tool, emphasizing its simplicity, customizability and versatility.

A machine learning approach to pronoun resolution in spoken dialogue (2003)

Strube, Michael ; Müller, Mark-Christoph

We apply a decision-tree-based approach to pronoun resolution in spoken dialogue. Our system handles pronouns with both NP and non-NP antecedents. We present a set of features designed for pronoun resolution in spoken dialogue and determine the most promising features. We evaluate the system on twenty Switchboard dialogues and show that it compares well to Byron’s (2002) manually tuned system.

A flexible stand-off data model with query language for multi-level annotation (2005)

Müller, Mark-Christoph

We present an implemented XML data model and a new, simplified query language for multi-level annotated corpora. The new query language involves automatic conversion of queries into the underlying, more complicated MMAXQL query language. It supports queries for sequential and hierarchical, but also associative (e.g. coreferential) relations. The simplified query language has been designed with non-expert users in mind.

Automatic detection of nonreferential it in spoken multi-party dialog (2006)

Müller, Mark-Christoph

We present an implemented machine learning system for the automatic detection of nonreferential it in spoken dialog. The system builds on shallow features extracted from dialog transcripts. Our experiments indicate a level of performance that makes the system usable as a preprocessing filter for a coreference resolution system. We also report results of an annotation study dealing with the classification of it by naive subjects.

Resolving it, this, and that in unrestricted multi-party dialog (2007)

Müller, Mark-Christoph

We present an implemented system for the resolution of it, this, and that in transcribed multi-party dialog. The system handles NP-anaphoric as well as discourse-deictic anaphors, i.e. pronouns with VP antecedents. Selectional preferences for NP or VP antecedents are determined on the basis of corpus counts. Our results show that the system performs significantly better than a recency-based baseline.

Transparent, efficient, and robust word embedding access with WOMBAT (2018)

Müller, Mark-Christoph ; Strube, Michael

We present WOMBAT, a Python tool which supports NLP practitioners in accessing word embeddings from code. WOMBAT addresses common research problems, including unified access, scaling, and robust and reproducible preprocessing. Code that uses WOMBAT for accessing word embeddings is not only cleaner, more readable, and easier to reuse, but also much more efficient than code using standard in-memory methods: a Python script using WOMBAT for evaluating seven large word embedding collections (8.7M embedding vectors in total) on a simple SemEval sentence similarity task involving 250 raw sentence pairs completes in under ten seconds end-to-end on a standard notebook computer.

Word-level alignment of paper documents with their electronic full-text counterparts (2021)

Müller, Mark-Christoph ; Ghosh, Sucheta ; Wittig, Ulrike ; Rey, Maja

We describe a simple procedure for the automatic creation of word-level alignments between printed documents and their respective full-text versions. The procedure is unsupervised, uses standard, off-the-shelf components only, and reaches an F-score of 85.01 in the basic setup and up to 86.63 when using pre- and post-processing. Potential areas of application are manual database curation (incl. document triage) and biomedical expression OCR.
