Abstract
Open-domain automatic dialogue evaluation plays an important role in dialogue systems. While recent work has focused on making learning-based evaluation metrics correlate better with human evaluation, robust metrics for parallel corpora and multiple domains remain unexplored. Parallel corpora are corpora that express the same idea in different ways (e.g., translation, paraphrasing, and back-translation). In this paper, we propose the Parallel Corpora Alignment Framework (PCAF), which improves the consistency and robustness of model evaluation on parallel corpora. First, parallel corpora are aligned in semantic space through parallel-corpora-aligned contrastive learning. Then, parallel-corpora-aligned distillation on multiple datasets is applied to further improve the model’s generalization ability across data domains. Our approach ranks second on the final test data of DSTC11 Track 4 Subtask 1 (“Multilingual Automatic Evaluation Metrics”, turn-level) and third on Subtask 2 (“Robust Automatic Evaluation Metrics”, turn-level), which demonstrates the strong generalization ability and robustness of the proposed approach.
- Anthology ID:
- 2023.dstc-1.15
- Volume:
- Proceedings of The Eleventh Dialog System Technology Challenge
- Month:
- September
- Year:
- 2023
- Address:
- Prague, Czech Republic
- Editors:
- Yun-Nung Chen, Paul Crook, Michel Galley, Sarik Ghazarian, Chulaka Gunasekara, Raghav Gupta, Behnam Hedayatnia, Satwik Kottur, Seungwhan Moon, Chen Zhang
- Venues:
- DSTC | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 123–132
- URL:
- https://aclanthology.org/2023.dstc-1.15
- Cite (ACL):
- Xinglin Wang, Jiayi Shi, Peiwen Yuan, and Kan Li. 2023. Parallel Corpora Alignment Framework for Multilingual and Robust Automatic Dialogue Evaluation. In Proceedings of The Eleventh Dialog System Technology Challenge, pages 123–132, Prague, Czech Republic. Association for Computational Linguistics.
- Cite (Informal):
- Parallel Corpora Alignment Framework for Multilingual and Robust Automatic Dialogue Evaluation (Wang et al., DSTC-WS 2023)
- PDF:
- https://preview.aclanthology.org/dois-2013-emnlp/2023.dstc-1.15.pdf