GlobalWoZ: Globalizing MultiWoZ to Develop Multilingual Task-Oriented Dialogue Systems
Bosheng Ding, Junjie Hu, Lidong Bing, Mahani Aljunied, Shafiq Joty, Luo Si, Chunyan Miao
Abstract
Over the last few years, there has been a move towards data curation for multilingual task-oriented dialogue (ToD) systems that can serve people speaking different languages. However, existing multilingual ToD datasets either have a limited coverage of languages due to the high cost of data curation, or ignore the fact that dialogue entities barely exist in countries speaking these languages. To tackle these limitations, we introduce a novel data curation method that generates GlobalWoZ — a large-scale multilingual ToD dataset globalized from an English ToD dataset for three unexplored use cases of multilingual ToD systems. Our method is based on translating dialogue templates and filling them with local entities in the target-language countries. Besides, we extend the coverage of target languages to 20 languages. We will release our dataset and a set of strong baselines to encourage research on multilingual ToD systems for real use cases.- Anthology ID:
- 2022.acl-long.115
- Volume:
- Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- May
- Year:
- 2022
- Address:
- Dublin, Ireland
- Editors:
- Smaranda Muresan, Preslav Nakov, Aline Villavicencio
- Venue:
- ACL
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 1639–1657
- Language:
- URL:
- https://aclanthology.org/2022.acl-long.115
- DOI:
- 10.18653/v1/2022.acl-long.115
- Cite (ACL):
- Bosheng Ding, Junjie Hu, Lidong Bing, Mahani Aljunied, Shafiq Joty, Luo Si, and Chunyan Miao. 2022. GlobalWoZ: Globalizing MultiWoZ to Develop Multilingual Task-Oriented Dialogue Systems. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1639–1657, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal):
- GlobalWoZ: Globalizing MultiWoZ to Develop Multilingual Task-Oriented Dialogue Systems (Ding et al., ACL 2022)
- PDF:
- https://preview.aclanthology.org/add_acl24_videos/2022.acl-long.115.pdf
- Code
- bosheng2020/globalwoz
- Data
- MultiWOZ