An Evaluation Dataset and Strategy for Building Robust Multi-turn Response Selection Model

Kijong Han, Seojin Lee, Dong-hun Lee


Abstract
Multi-turn response selection models have recently achieved performance comparable to humans on several benchmark datasets. In real-world settings, however, these models often show weaknesses, such as making incorrect predictions that rely heavily on superficial patterns without a comprehensive understanding of the context. For example, they often assign a high score to a wrong response candidate that contains several keywords related to the context but uses an inconsistent tense. In this study, we analyze the weaknesses of open-domain Korean multi-turn response selection models and publish an adversarial dataset for evaluating these weaknesses. We also suggest a strategy for building a model that is robust in this adversarial environment.
Anthology ID:
2021.emnlp-main.180
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
2338–2344
URL:
https://aclanthology.org/2021.emnlp-main.180
DOI:
10.18653/v1/2021.emnlp-main.180
Cite (ACL):
Kijong Han, Seojin Lee, and Dong-hun Lee. 2021. An Evaluation Dataset and Strategy for Building Robust Multi-turn Response Selection Model. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 2338–2344, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
An Evaluation Dataset and Strategy for Building Robust Multi-turn Response Selection Model (Han et al., EMNLP 2021)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2021.emnlp-main.180.pdf
Video:
https://preview.aclanthology.org/ingestion-script-update/2021.emnlp-main.180.mp4
Code:
kakaoenterprise/koradvmrstestdata