Back-Translation as Strategy to Tackle the Lack of Corpus in Natural Language Generation from Semantic Representations
Marco Antonio Sobrevilla Cabezudo, Simon Mille, Thiago Pardo
Abstract
This paper presents an exploratory study that aims to evaluate the usefulness of back-translation in Natural Language Generation (NLG) from semantic representations for non-English languages. Specifically, Abstract Meaning Representation and Brazilian Portuguese (BP) are chosen as semantic representation and language, respectively. Two methods (focused on Statistical and Neural Machine Translation) are evaluated on two datasets (one automatically generated and another one human-generated) to compare the performance in a real context. Also, several cuts according to quality measures are performed to evaluate the importance (or not) of the data quality in NLG. Results show that there are still many improvements to be made but this is a promising approach.
- Anthology ID:
- D19-6313
- Volume:
- Proceedings of the 2nd Workshop on Multilingual Surface Realisation (MSR 2019)
- Month:
- November
- Year:
- 2019
- Address:
- Hong Kong, China
- Editors:
- Simon Mille, Anja Belz, Bernd Bohnet, Yvette Graham, Leo Wanner
- Venue:
- WS
- SIG:
- SIGGEN
- Publisher:
- Association for Computational Linguistics
- Pages:
- 94–103
- URL:
- https://aclanthology.org/D19-6313
- DOI:
- 10.18653/v1/D19-6313
- Cite (ACL):
- Marco Antonio Sobrevilla Cabezudo, Simon Mille, and Thiago Pardo. 2019. Back-Translation as Strategy to Tackle the Lack of Corpus in Natural Language Generation from Semantic Representations. In Proceedings of the 2nd Workshop on Multilingual Surface Realisation (MSR 2019), pages 94–103, Hong Kong, China. Association for Computational Linguistics.
- Cite (Informal):
- Back-Translation as Strategy to Tackle the Lack of Corpus in Natural Language Generation from Semantic Representations (Sobrevilla Cabezudo et al., 2019)
- PDF:
- https://preview.aclanthology.org/naacl-24-ws-corrections/D19-6313.pdf