Abstract
While multilingual pretrained language models (LMs) fine-tuned on a single language have shown substantial cross-lingual task transfer capabilities, there is still a wide performance gap in semantic parsing tasks when target language supervision is available. In this paper, we propose a novel Translate-and-Fill (TaF) method to produce silver training data for a multilingual semantic parser. This method simplifies the popular Translate-Align-Project (TAP) pipeline and consists of a sequence-to-sequence filler model that constructs a full parse conditioned on an utterance and a view of the same parse. Our filler is trained on English data only but can accurately complete instances in other languages (i.e., translations of the English training utterances), in a zero-shot fashion. Experimental results on three multilingual semantic parsing datasets show that data augmentation with TaF reaches accuracies competitive with similar systems which rely on traditional alignment techniques.
- Anthology ID:
- 2021.findings-emnlp.279
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2021
- Month:
- November
- Year:
- 2021
- Address:
- Punta Cana, Dominican Republic
- Editors:
- Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
- Venue:
- Findings
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 3272–3284
- URL:
- https://aclanthology.org/2021.findings-emnlp.279
- DOI:
- 10.18653/v1/2021.findings-emnlp.279
- Cite (ACL):
- Massimo Nicosia, Zhongdi Qu, and Yasemin Altun. 2021. Translate & Fill: Improving Zero-Shot Multilingual Semantic Parsing with Synthetic Data. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 3272–3284, Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Cite (Informal):
- Translate & Fill: Improving Zero-Shot Multilingual Semantic Parsing with Synthetic Data (Nicosia et al., Findings 2021)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2021.findings-emnlp.279.pdf
- Data:
- MTOP
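
The abstract describes a filler that builds a full parse from an utterance and a "view" of the same parse. The sketch below only illustrates how such a filler input might be assembled; it is not the authors' implementation. It assumes a bracketed parse format with `[IN:...` and `[SL:...` labels, assumes the view is the parse with its utterance tokens removed, and uses a hypothetical `<sep>` separator. The function names `parse_view` and `filler_input` and the example sentences are illustrative, not taken from the paper.

```python
def parse_view(parse: str) -> str:
    """Return a view of a bracketed parse with utterance tokens removed.

    Assumption: tokens are whitespace-separated, labels start with
    "[IN:" or "[SL:", and "]" closes a node.
    """
    kept = [
        tok
        for tok in parse.split()
        if tok == "]" or tok.startswith("[IN:") or tok.startswith("[SL:")
    ]
    return " ".join(kept)


def filler_input(utterance: str, view: str, sep: str = "<sep>") -> str:
    """Join an utterance and a parse view into a single source string.

    Assumption: the separator token is hypothetical; a real filler model
    may expect a different input layout.
    """
    return f"{utterance} {sep} {view}"


# Made-up English training instance.
english_parse = "[IN:GET_WEATHER what is the weather in [SL:LOCATION paris ] ]"
view = parse_view(english_parse)

# Made-up translation of the English utterance; a trained filler would be
# asked to complete the view with tokens from this utterance.
translated = "quel temps fait-il à paris"
print(filler_input(translated, view))
```

Under these assumptions, a filler trained on English pairs would receive the translated utterance together with the language-independent view and be expected to emit a complete parse in the target language, which could then serve as silver training data.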