Abstract
This paper proposes a novel training method to improve the robustness of Extractive Question Answering (EQA) models. Previous research has shown that existing models, when trained on EQA datasets that include unanswerable questions, demonstrate a significant lack of robustness against distribution shifts and adversarial attacks. Despite this, the inclusion of unanswerable questions in EQA training datasets is essential for real-world reliability. Our proposed training method introduces a novel loss function for the EQA problem and challenges an implicit assumption present in numerous EQA datasets. Models trained with our method maintain in-domain performance while achieving a notable improvement on out-of-domain datasets, yielding an overall F1 score improvement of 5.7 across all testing sets. Furthermore, our models exhibit significantly enhanced robustness against two types of adversarial attacks, with a performance decrease only about one-third that of the default models.
- Anthology ID: 2024.findings-emnlp.121
- Volume: Findings of the Association for Computational Linguistics: EMNLP 2024
- Month: November
- Year: 2024
- Address: Miami, Florida, USA
- Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 2222–2236
- URL: https://preview.aclanthology.org/add_missing_videos/2024.findings-emnlp.121/
- DOI: 10.18653/v1/2024.findings-emnlp.121
- Cite (ACL): Son Quoc Tran and Matt Kretchmar. 2024. Towards Robust Extractive Question Answering Models: Rethinking the Training Methodology. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 2222–2236, Miami, Florida, USA. Association for Computational Linguistics.
- Cite (Informal): Towards Robust Extractive Question Answering Models: Rethinking the Training Methodology (Tran & Kretchmar, Findings 2024)
- PDF: https://preview.aclanthology.org/add_missing_videos/2024.findings-emnlp.121.pdf