Cross-Lingual Extractive Question Answering with Unanswerable Questions

Yuval Gorodissky, Elior Sulem, Dan Roth


Abstract
Cross-lingual Extractive Question Answering (EQA) extends standard EQA by requiring models to find answers in passages written in a language different from that of the question. The Generalized Cross-Lingual Transfer (G-XLT) task evaluates models’ zero-shot ability to transfer question answering capabilities across languages using only English training data. While previous research has primarily focused on scenarios where answers are always present, real-world applications often encounter situations where no answer exists within the given context. This paper introduces an enhanced G-XLT task definition that explicitly handles unanswerable questions, bridging a critical gap in current research. To address this challenge, we present two new datasets, miXQuAD and MLQA-IDK, which include both answerable and unanswerable questions and cover 12 and 7 language pairs, respectively. Our study evaluates state-of-the-art large language models using fine-tuning, parameter-efficient techniques, and in-context learning approaches, revealing a trade-off between a smaller fine-tuned model’s performance on answerable questions and a larger in-context learning model’s capability on unanswerable questions. We also examine language similarity patterns based on model performance, finding alignments with known language families.
Anthology ID: 2025.starsem-1.8
Volume: Proceedings of the 14th Joint Conference on Lexical and Computational Semantics (*SEM 2025)
Month: November
Year: 2025
Address: Suzhou, China
Editors: Lea Frermann, Mark Stevenson
Venue: *SEM
Publisher: Association for Computational Linguistics
Pages: 100–121
URL: https://preview.aclanthology.org/ingest-emnlp/2025.starsem-1.8/
Cite (ACL): Yuval Gorodissky, Elior Sulem, and Dan Roth. 2025. Cross-Lingual Extractive Question Answering with Unanswerable Questions. In Proceedings of the 14th Joint Conference on Lexical and Computational Semantics (*SEM 2025), pages 100–121, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Cross-Lingual Extractive Question Answering with Unanswerable Questions (Gorodissky et al., *SEM 2025)
PDF: https://preview.aclanthology.org/ingest-emnlp/2025.starsem-1.8.pdf