Leveraging Non-dialogue Summaries for Dialogue Summarization

Seongmin Park, Dongchan Shin, Jihwa Lee


Abstract
To mitigate the lack of diverse dialogue summarization datasets in academia, we present methods that use non-dialogue summarization data to enhance dialogue summarization systems. We apply transformations to document summarization data pairs to create training data that is better suited to dialogue summarization. The suggested transformations also retain desirable properties of non-dialogue datasets, such as improved faithfulness to the source text. We conduct extensive experiments in both English and Korean to verify our approach. Although absolute gains in ROUGE naturally plateau as more dialogue summarization samples are introduced, training with non-dialogue data significantly improves summarization performance in zero- and few-shot settings and enhances faithfulness across all training regimes.
Anthology ID:
2022.tu-1.1
Volume:
Proceedings of the First Workshop On Transcript Understanding
Month:
Oct
Year:
2022
Address:
Gyeongju, South Korea
Editors:
Franck Dernoncourt, Thien Huu Nguyen, Viet Dac Lai, Amir Pouran Ben Veyseh, Trung H. Bui, David Seunghyun Yoon
Venue:
TU
Publisher:
International Conference on Computational Linguistics
Pages:
1–7
URL:
https://aclanthology.org/2022.tu-1.1
Cite (ACL):
Seongmin Park, Dongchan Shin, and Jihwa Lee. 2022. Leveraging Non-dialogue Summaries for Dialogue Summarization. In Proceedings of the First Workshop On Transcript Understanding, pages 1–7, Gyeongju, South Korea. International Conference on Computational Linguistics.
Cite (Informal):
Leveraging Non-dialogue Summaries for Dialogue Summarization (Park et al., TU 2022)
PDF:
https://preview.aclanthology.org/ingest-2024-clasp/2022.tu-1.1.pdf
Data
DialogSum