ConvRGX: Recognition, Generation, and Extraction for Self-trained Conversational Question Answering

Tianhua Zhang, Liping Tang, Wei Fang, Hongyin Luo, Xixin Wu, Helen Meng, James Glass


Abstract
Collecting and constructing human-annotated corpora for training conversational question-answering (CQA) models has recently been shown to be inefficient and costly. To solve this problem, previous works have proposed training QA models with automatically generated QA data. In this work, we extend earlier studies on QA synthesis and propose an efficient QA data generation algorithm for conversational settings. Our model recognizes potential dialogue topics, generates corresponding questions, and extracts answers from grounding passages. To improve the quality of the generated QA pairs and the downstream self-training of CQA models, we propose dropout and agreement-based QA selection methods. We conduct experiments in both data augmentation and domain adaptation settings. Experiments on the QuAC and Doc2Dial tasks show that the proposed method can significantly improve the quality of generated QA data and also improve the accuracy of self-trained CQA models based on the constructed training corpora.
Anthology ID:
2023.dialdoc-1.10
Volume:
Proceedings of the Third DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Smaranda Muresan, Vivian Chen, Casey Kennington, David Vandyke, Nina Dethlefs, Koji Inoue, Erik Ekstedt, Stefan Ultes
Venue:
dialdoc
Publisher:
Association for Computational Linguistics
Pages:
86–100
URL:
https://aclanthology.org/2023.dialdoc-1.10
DOI:
10.18653/v1/2023.dialdoc-1.10
Cite (ACL):
Tianhua Zhang, Liping Tang, Wei Fang, Hongyin Luo, Xixin Wu, Helen Meng, and James Glass. 2023. ConvRGX: Recognition, Generation, and Extraction for Self-trained Conversational Question Answering. In Proceedings of the Third DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering, pages 86–100, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
ConvRGX: Recognition, Generation, and Extraction for Self-trained Conversational Question Answering (Zhang et al., dialdoc 2023)
PDF:
https://preview.aclanthology.org/naacl-24-ws-corrections/2023.dialdoc-1.10.pdf