Open-World Factually Consistent Question Generation
Himanshu Maheshwari, Sumit Shekhar, Apoorv Saxena, Niyati Chhaya
Abstract
Question generation methods based on pre-trained language models often produce questions that are factually inconsistent, contain incorrect entities, or are not answerable from the input paragraph. Domain shift – where the test data is from a different domain than the training data – further exacerbates the problem of hallucination. This is a critical issue for any natural language application doing question generation. In this work, we propose an effective data processing technique based on de-lexicalization for consistent question generation across domains. Unlike existing approaches for remedying hallucination, the proposed approach does not filter training data and is generic across question-generation models. Experimental results across six benchmark datasets show that our model is robust to domain shift and produces entity-level factually consistent questions without significant impact on traditional metrics.
- Anthology ID:
- 2023.findings-acl.151
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2023
- Month:
- July
- Year:
- 2023
- Address:
- Toronto, Canada
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2390–2404
- URL:
- https://aclanthology.org/2023.findings-acl.151
- Cite (ACL):
- Himanshu Maheshwari, Sumit Shekhar, Apoorv Saxena, and Niyati Chhaya. 2023. Open-World Factually Consistent Question Generation. In Findings of the Association for Computational Linguistics: ACL 2023, pages 2390–2404, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal):
- Open-World Factually Consistent Question Generation (Maheshwari et al., Findings 2023)
- PDF:
https://aclanthology.org/2023.findings-acl.151.pdf