Translate and label! An encoder-decoder approach for cross-lingual semantic role labeling
- We propose a Cross-lingual Encoder-Decoder model that simultaneously translates and generates sentences with Semantic Role Labeling annotations in a resource-poor target language. Unlike annotation projection techniques, our model does not need parallel data at inference time. Our approach can be applied in monolingual, multilingual and cross-lingual settings and is able to produce dependency-based and span-based SRL annotations. We benchmark the labeling performance of our model in different monolingual and multilingual settings using well-known SRL datasets. We then train our model in a cross-lingual setting to generate new SRL-labeled data. Finally, we measure the effectiveness of our method by using the generated data to augment the training basis for resource-poor languages, and we perform a manual evaluation to show that it produces high-quality sentences and assigns accurate semantic role annotations. Our proposed architecture offers a flexible method for leveraging SRL data in multiple languages.
| Author: | Angel Daza, Anette Frank |
|---|---|
| URN: | urn:nbn:de:bsz:mh39-94395 |
| URL: | https://www.aclweb.org/anthology/D19-1056.pdf |
| URL: | https://www.aclweb.org/anthology/volumes/D19-1/ |
| DOI: | https://doi.org/10.18653/v1/D19-1056 |
| ISBN: | 978-1-950737-90-1 |
| Parent Title (English): | Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 3–7 November 2019, Hong Kong, China |
| Publisher: | The Association for Computational Linguistics |
| Place of publication: | Stroudsburg, PA, USA |
| Editor: | Kentaro Inui, Jing Jiang, Vincent Ng, Xiaojun Wan |
| Document Type: | Part of a Book |
| Language: | English |
| Year of first Publication: | 2019 |
| Date of Publication (online): | 2019/12/11 |
| Publication state: | Published version |
| Reviewstate: | Peer-Review |
| GND Keyword: | Annotation; Natural language processing; Computational linguistics; Semantics; Simultaneous interpreting |
| First Page: | 603 |
| Last Page: | 615 |
| DDC classes: | 400 Language / 400 Language, Linguistics |
| Open Access?: | yes |
| Leibniz-Classification: | Language, Linguistics |
| Linguistics-Classification: | Computational Linguistics |
| Program areas: | Digital Linguistics |
| Licence: | Creative Commons - CC BY - Attribution 4.0 International |


