Post-Training Dialogue Summarization using Pseudo-Paraphrasing

Qi Jia, Yizhu Liu, Haifeng Tang, Kenny Zhu


Abstract
Previous dialogue summarization techniques adapt large language models pretrained on narrative text by injecting dialogue-specific features into the models. These features either require additional knowledge to recognize or make the resulting models harder to tune. To bridge the format gap between dialogues and narrative summaries in dialogue summarization tasks, we propose to post-train pretrained language models (PLMs) to rephrase dialogues into narratives. After that, the model is fine-tuned for dialogue summarization as usual. Comprehensive experiments show that our approach significantly improves vanilla PLMs on dialogue summarization and outperforms other SOTA models in both summary quality and implementation cost.
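The two-stage recipe described in the abstract (post-train on dialogue-to-narrative rephrasing, then fine-tune on summaries) can be illustrated with a minimal data-preparation sketch. The abstract does not say how the narrative targets are built, so the `pseudo_narrative` rule below, the helper names, and the toy dialogue are all assumptions made for illustration and are not the authors' implementation.

```python
from typing import Dict, List, Tuple

Turn = Tuple[str, str]  # (speaker, utterance)


def dialogue_to_text(turns: List[Turn]) -> str:
    """Flatten a dialogue into 'Speaker: utterance' lines (the model input)."""
    return "\n".join(f"{speaker}: {utterance}" for speaker, utterance in turns)


def pseudo_narrative(turns: List[Turn]) -> str:
    """Hypothetical rule: rewrite each turn as a third-person reporting sentence.

    This stands in for whatever narrative target the post-training stage uses;
    the real construction may differ.
    """
    return " ".join(f'{speaker} says, "{utterance}"' for speaker, utterance in turns)


def build_stage_pairs(turns: List[Turn], summary: str) -> Dict[str, Tuple[str, str]]:
    """Return (source, target) pairs for the two training stages."""
    source = dialogue_to_text(turns)
    return {
        # Stage 1: post-training, dialogue -> narrative rephrasing
        "post_training": (source, pseudo_narrative(turns)),
        # Stage 2: ordinary fine-tuning, dialogue -> reference summary
        "fine_tuning": (source, summary),
    }


# Toy example (invented, not taken from any corpus)
turns = [("Ana", "I finished the report."), ("Ben", "Great, send it over!")]
pairs = build_stage_pairs(turns, "Ana finished the report and Ben asks her to send it.")
```

Both stages share the same source text; only the target changes, which is why the same sequence-to-sequence model and training loop could in principle be reused for post-training and fine-tuning.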
Anthology ID:
2022.findings-naacl.125
Volume:
Findings of the Association for Computational Linguistics: NAACL 2022
Month:
July
Year:
2022
Address:
Seattle, United States
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
1660–1669
URL:
https://aclanthology.org/2022.findings-naacl.125
DOI:
10.18653/v1/2022.findings-naacl.125
Cite (ACL):
Qi Jia, Yizhu Liu, Haifeng Tang, and Kenny Zhu. 2022. Post-Training Dialogue Summarization using Pseudo-Paraphrasing. In Findings of the Association for Computational Linguistics: NAACL 2022, pages 1660–1669, Seattle, United States. Association for Computational Linguistics.
Cite (Informal):
Post-Training Dialogue Summarization using Pseudo-Paraphrasing (Jia et al., Findings 2022)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2022.findings-naacl.125.pdf
Software:
 2022.findings-naacl.125.software.zip
Video:
 https://preview.aclanthology.org/ingestion-script-update/2022.findings-naacl.125.mp4
Code:
jiaqisjtu/dialsent-pgg
Data:
SAMSum Corpus