Liping Tang
2023
ConvRGX: Recognition, Generation, and Extraction for Self-trained Conversational Question Answering
Tianhua Zhang | Liping Tang | Wei Fang | Hongyin Luo | Xixin Wu | Helen Meng | James Glass
Proceedings of the Third DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering
Collecting and constructing human-annotated corpora for training conversational question-answering (CQA) models has recently been shown to be inefficient and costly. To address this problem, previous works have proposed training QA models with automatically generated QA data. In this work, we extend earlier studies on QA synthesis and propose an efficient QA data generation algorithm for conversational settings. Our model recognizes potential dialogue topics, generates corresponding questions, and extracts answers from grounding passages. To improve the quality of the generated QA pairs and the downstream self-training of CQA models, we propose dropout- and agreement-based QA selection methods. We conduct experiments in both data augmentation and domain adaptation settings. Experiments on the QuAC and Doc2Dial tasks show that the proposed method significantly improves the quality of the generated QA data and also improves the accuracy of self-trained CQA models built on the constructed training corpora.
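As a rough illustration of the agreement-based selection idea mentioned in the abstract, the sketch below re-answers each generated question with a QA model and keeps only pairs whose answers agree; the QAPair structure, the qa_model callable, and the 0.8 F1 threshold are hypothetical placeholders, not the paper's exact method.

# Minimal sketch of agreement-based QA filtering (details are assumptions).
from dataclasses import dataclass

@dataclass
class QAPair:
    passage: str
    question: str
    answer: str

def token_f1(pred: str, gold: str) -> float:
    # Token-level F1 between a predicted and a generated answer string.
    p, g = pred.split(), gold.split()
    common = sum(min(p.count(t), g.count(t)) for t in set(p))
    if not common:
        return 0.0
    precision, recall = common / len(p), common / len(g)
    return 2 * precision * recall / (precision + recall)

def filter_by_agreement(pairs, qa_model, threshold=0.8):
    # Keep only generated QA pairs that an independent QA model can
    # reproduce: disagreement suggests a noisy question or answer span.
    kept = []
    for qa in pairs:
        pred = qa_model(passage=qa.passage, question=qa.question)
        if token_f1(pred, qa.answer) >= threshold:
            kept.append(qa)
    return kept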
2022
Grounded Dialogue Generation with Cross-encoding Re-ranker, Grounding Span Prediction, and Passage Dropout
Kun Li | Tianhua Zhang | Liping Tang | Junan Li | Hongyuan Lu | Xixin Wu | Helen Meng
Proceedings of the Second DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering
MultiDoc2Dial presents an important challenge in modeling dialogues grounded in multiple documents. This paper proposes a pipeline system of “retrieve, re-rank, and generate”, where each component is individually optimized. This enables the passage re-ranker and response generator to fully exploit training with ground-truth data. Furthermore, we use a deep cross-encoder trained with localized hard negative passages drawn from the retriever. For the response generator, we use grounding span prediction as an auxiliary task trained jointly with the main task of response generation. We also adopt a passage dropout and regularization technique to improve response generation performance. Experimental results indicate that the system clearly surpasses the competitive baseline, and our team CPII-NLP ranked 1st among the public submissions on all four leaderboards based on the sum of F1, SacreBLEU, METEOR, and ROUGE-L scores.
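To make the pipeline structure concrete, here is a minimal sketch of a “retrieve, re-rank, and generate” flow with passage dropout; the retriever/reranker/generator interfaces, the top-k values, and the dropout rate are illustrative assumptions rather than the paper's actual implementation.

# Minimal sketch of the three-stage pipeline (interfaces are assumptions).
import random

def respond(query, retriever, reranker, generator,
            k_retrieve=100, k_keep=5, passage_dropout=0.0):
    # Stage 1: retrieve a large candidate pool of passages.
    candidates = retriever(query, top_k=k_retrieve)
    # Stage 2: re-score candidates with a cross-encoder; hard negatives
    # for training it would be mined from this same candidate pool.
    scored = sorted(candidates, key=lambda p: reranker(query, p), reverse=True)
    passages = scored[:k_keep]
    # Passage dropout (training-time regularization): randomly drop some
    # grounding passages so the generator does not over-rely on any one.
    # Set passage_dropout=0.0 at inference time.
    if passage_dropout > 0.0:
        passages = [p for p in passages
                    if random.random() > passage_dropout] or passages[:1]
    # Stage 3: generate the response conditioned on the kept passages.
    return generator(query, passages)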