OPUS 4 | Search

2 search hits


Do Free Word Order Languages Need More Treebank Data? Investigating Dative Alternation in German, English, and Russian (2015)

Dakota, Daniel ; Gilmanov, Timur ; Li, Wen ; Kuzma, Christopher ; Kim, Evgeny ; Abo Mokh, Noor ; Kübler, Sandra

We investigate whether non-configurational languages, which display more word order variation than configurational ones, require more training data for a phenomenon to be parsed successfully. We perform a tightly controlled study comparing the dative alternation for English (a configurational language), German, and Russian (both non-configurational). More specifically, we compare the performance of a dependency parser when only canonical word order is present with its performance on data sets when all word orders are present. Our results show that, for all languages, canonical data is not only easier to parse, but there is also no direct correspondence between the size of training sets containing free(er) word order variation and performance.

Parsing German: How Much Morphology Do We Need? (2014)

Maier, Wolfgang ; Kübler, Sandra ; Dakota, Daniel ; Whyatt, Daniel

We investigate how the granularity of POS tags influences POS tagging, and furthermore, how POS tagging performance relates to parsing results. For this, we use the standard “pipeline” approach, in which a parser builds its output on previously tagged input. The experiments are performed on two German treebanks, using three POS tagsets of different granularity, and six different POS taggers, together with the Berkeley parser. Our findings show that less granularity of the POS tagset leads to better tagging results. However, both too coarse-grained and too fine-grained distinctions on POS level decrease parsing performance.
