Building a Task-oriented Dialog System for Languages with no Training Data: the Case for Basque
Maddalen López de Lacalle, Xabier Saralegi, Iñaki San Vicente
Abstract
This paper presents an approach for developing a task-oriented dialog system for less-resourced languages in scenarios where no training data is available. Both intent classification and slot filling are tackled. We project existing annotations from resource-rich languages by means of Neural Machine Translation (NMT) and subsequent word alignment. We then compare training on the projected monolingual data with direct model transfer alternatives. Intent classifiers and slot-filling sequence taggers are implemented either with a BiLSTM architecture or by fine-tuning BERT transformer models. For slot filling, models learnt exclusively from Basque projected data provide better accuracy. For intent classification, combining Basque projected training data with data from resource-rich languages consistently outperforms models trained solely on projected data. In both tasks we achieve competitive performance, with accuracies of 81% for intent classification and 77% for slot filling.
- Anthology ID:
- 2020.lrec-1.340
- Volume:
- Proceedings of the Twelfth Language Resources and Evaluation Conference
- Month:
- May
- Year:
- 2020
- Address:
- Marseille, France
- Editors:
- Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association
- Pages:
- 2796–2802
- Language:
- English
- URL:
- https://aclanthology.org/2020.lrec-1.340
- Cite (ACL):
- Maddalen López de Lacalle, Xabier Saralegi, and Iñaki San Vicente. 2020. Building a Task-oriented Dialog System for Languages with no Training Data: the Case for Basque. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 2796–2802, Marseille, France. European Language Resources Association.
- Cite (Informal):
- Building a Task-oriented Dialog System for Languages with no Training Data: the Case for Basque (López de Lacalle et al., LREC 2020)
- PDF:
- https://preview.aclanthology.org/emnlp22-frontmatter/2020.lrec-1.340.pdf
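The annotation projection step mentioned in the abstract (translate the annotated sentence, then carry slot labels across word alignments) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `project_slots`, the BIO tagging scheme, the `(source_index, target_index)` alignment format, and the toy inputs are all assumptions made for illustration. The translation and the alignment are assumed to come from external tools.

```python
def project_slots(src_tags, alignment, tgt_len):
    """Project BIO slot tags from source tokens onto target tokens.

    src_tags  : list of BIO tags, one per source token (assumed scheme)
    alignment : iterable of (source_index, target_index) pairs (assumed format)
    tgt_len   : number of tokens in the translated sentence
    """
    # Step 1: record the slot type (tag without its B-/I- prefix) that each
    # target token inherits; the first aligned slot-bearing source token wins.
    tgt_types = [None] * tgt_len
    for s, t in sorted(alignment):
        tag = src_tags[s]
        if tag != "O" and tgt_types[t] is None:
            tgt_types[t] = tag.split("-", 1)[1]

    # Step 2: rebuild a well-formed BIO sequence in target word order.
    # Simplification: adjacent target tokens of the same type become one span.
    tgt_tags = []
    prev = None
    for slot_type in tgt_types:
        if slot_type is None:
            tgt_tags.append("O")
        elif slot_type == prev:
            tgt_tags.append("I-" + slot_type)
        else:
            tgt_tags.append("B-" + slot_type)
        prev = slot_type
    return tgt_tags


# Toy example: a 6-token source sentence whose last two tokens form a
# "datetime" slot, translated into a hypothetical 3-token target sentence.
src_tags = ["O", "O", "O", "O", "B-datetime", "I-datetime"]
alignment = [(4, 0), (5, 0), (2, 1), (0, 2)]  # hypothetical aligner output
projected = project_slots(src_tags, alignment, tgt_len=3)
```

Rebuilding the tags in target order, rather than copying B-/I- prefixes directly, keeps the projected sequence well-formed when translation reorders or merges words. Unaligned target tokens default to `O` in this sketch.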