El Volumen Louder Por Favor: Code-switching in Task-oriented Semantic Parsing
Arash Einolghozati, Abhinav Arora, Lorena Sainz-Maza Lecanda, Anuj Kumar, Sonal Gupta
Abstract
Being able to parse code-switched (CS) utterances, such as Spanish+English or Hindi+English, is essential to democratize task-oriented semantic parsing systems for certain locales. In this work, we focus on Spanglish (Spanish+English) and release a dataset, CSTOP, containing 5800 CS utterances alongside their semantic parses. We examine the CS generalizability of various cross-lingual (XL) models and demonstrate the advantage of pre-trained XL language models when data for only one language is present. As such, we focus on improving the pre-trained models for the case when only an English corpus, alongside either zero or a few CS training instances, is available. We propose two data augmentation methods for the zero-shot and the few-shot settings: fine-tuning using translate-and-align, and augmenting using a generation model followed by match-and-filter. Combining the few-shot setting with the above improvements decreases the initial 30-point accuracy gap between the zero-shot and the full-data settings by two thirds.
- Anthology ID:
- 2021.eacl-main.87
- Volume:
- Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
- Month:
- April
- Year:
- 2021
- Address:
- Online
- Editors:
- Paola Merlo, Jörg Tiedemann, Reut Tsarfaty
- Venue:
- EACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1009–1021
- URL:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2021.eacl-main.87/
- DOI:
- 10.18653/v1/2021.eacl-main.87
- Cite (ACL):
- Arash Einolghozati, Abhinav Arora, Lorena Sainz-Maza Lecanda, Anuj Kumar, and Sonal Gupta. 2021. El Volumen Louder Por Favor: Code-switching in Task-oriented Semantic Parsing. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 1009–1021, Online. Association for Computational Linguistics.
- Cite (Informal):
- El Volumen Louder Por Favor: Code-switching in Task-oriented Semantic Parsing (Einolghozati et al., EACL 2021)
- PDF:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2021.eacl-main.87.pdf