Abstract
Conversational question-answer generation is a task that automatically generates a large-scale conversational question answering dataset based on input passages. In this paper, we introduce a novel framework that extracts question-worthy phrases from a passage and then generates corresponding questions considering previous conversations. In particular, our framework revises the extracted answers after generating questions so that answers exactly match paired questions. Experimental results show that our simple answer revision approach leads to significant improvement in the quality of synthetic data. Moreover, we prove that our framework can be effectively utilized for domain adaptation of conversational question answering.
- Anthology ID:
- 2022.coling-1.140
- Volume:
- Proceedings of the 29th International Conference on Computational Linguistics
- Month:
- October
- Year:
- 2022
- Address:
- Gyeongju, Republic of Korea
- Editors:
- Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, Seung-Hoon Na
- Venue:
- COLING
- Publisher:
- International Committee on Computational Linguistics
- Pages:
- 1636–1644
- URL:
- https://aclanthology.org/2022.coling-1.140
- Cite (ACL):
- Seonjeong Hwang and Gary Geunbae Lee. 2022. Conversational QA Dataset Generation with Answer Revision. In Proceedings of the 29th International Conference on Computational Linguistics, pages 1636–1644, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
- Cite (Informal):
- Conversational QA Dataset Generation with Answer Revision (Hwang & Lee, COLING 2022)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2022.coling-1.140.pdf
- Data
- CoQA, DoQA, QuAC
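
The three-stage loop described in the abstract (extract candidate answers, generate a question conditioned on the conversation so far, then revise the answer to fit the generated question) can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the abstract does not specify the models, so every name here (`generate_conversation`, `QATurn`, and the `toy_*` placeholders) is hypothetical, and the placeholder functions stand in for trained components.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Previous (question, answer) turns of the synthetic conversation.
History = List[Tuple[str, str]]


@dataclass
class QATurn:
    question: str
    answer: str


def generate_conversation(
    passage: str,
    extract_answers: Callable[[str], List[str]],
    generate_question: Callable[[str, str, History], str],
    revise_answer: Callable[[str, str, str, History], str],
) -> List[QATurn]:
    """Hypothetical driver: extract -> generate question -> revise answer."""
    history: History = []
    turns: List[QATurn] = []
    for candidate in extract_answers(passage):
        # The question is generated with access to earlier turns.
        question = generate_question(passage, candidate, history)
        # The extracted answer is revised only after the question exists,
        # so that the final answer matches the question.
        answer = revise_answer(passage, question, candidate, history)
        history.append((question, answer))
        turns.append(QATurn(question, answer))
    return turns


# Toy placeholders (assumptions, not the paper's models).
def toy_extract(passage: str) -> List[str]:
    # Treat the last word of each sentence as a candidate answer.
    sentences = [s.strip() for s in passage.split(".") if s.strip()]
    return [s.split()[-1] for s in sentences]


def toy_question(passage: str, answer: str, history: History) -> str:
    # Number the question by turn to show that history is available.
    return f"Q{len(history) + 1}: what does the passage say about '{answer}'?"


def toy_revise(passage: str, question: str, candidate: str, history: History) -> str:
    # Trim stray punctuation; a real reviser would adjust the answer span.
    return candidate.strip(" ,;")


if __name__ == "__main__":
    for turn in generate_conversation(
        "Ada wrote notes. She lived in London.",
        toy_extract, toy_question, toy_revise,
    ):
        print(turn.question, "->", turn.answer)
```

Passing the three stages in as callables keeps the control flow separate from the models, so each placeholder could be swapped for a trained component without changing the loop.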