Compositional Data Augmentation for Abstractive Conversation Summarization

Siru Ouyang, Jiaao Chen, Jiawei Han, Diyi Yang


Abstract
Recent abstractive conversation summarization systems generally rely on large-scale datasets with annotated summaries. However, collecting and annotating these conversations is time-consuming and labor-intensive. To address this issue, we present Compo, a sub-structure-level compositional data augmentation method for generating diverse, high-quality pairs of conversations and summaries. Specifically, Compo first extracts conversation structures such as topic splits and action triples as basic units, then composes these semantically meaningful conversation snippets into new training instances. Additionally, we explore noise-tolerant settings in both self-training and joint-training paradigms to make the most of the augmented samples. Experiments on the benchmark datasets SAMSum and DialogSum show that Compo substantially outperforms prior baselines, achieving a nearly 10% increase in ROUGE scores with limited data. Code is available at https://github.com/ozyyshr/Compo.
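The compositional idea in the abstract — segment conversations into topic-level snippets paired with summary sentences, then recombine aligned snippets across conversations to form new training pairs — can be illustrated with a minimal sketch. This is a hypothetical simplification, not the paper's implementation: the segmentation, alignment, and data format here are all assumed for illustration.

```python
# Hypothetical sketch of snippet-level compositional augmentation.
# Each conversation is pre-segmented into (snippet_utterances, summary_sentence)
# tuples; the actual paper extracts topic splits and action triples instead.

def compose(conv_a, conv_b, swap_index):
    """Swap the topic snippet at `swap_index` from conv_b into conv_a,
    returning one new (conversation, summary) training pair."""
    new_segments = list(conv_a)
    new_segments[swap_index] = conv_b[swap_index]
    # Flatten snippets back into a single conversation...
    utterances = [u for snippet, _ in new_segments for u in snippet]
    # ...and concatenate the per-snippet summary sentences.
    summary = " ".join(s for _, s in new_segments)
    return utterances, summary

conv_a = [
    (["Amy: Lunch at noon?", "Bob: Sure."],
     "Amy and Bob agree to have lunch at noon."),
    (["Amy: Bring the report.", "Bob: Will do."],
     "Bob will bring the report."),
]
conv_b = [
    (["Cam: Coffee later?", "Dan: Ok."],
     "Cam and Dan plan to get coffee."),
    (["Cam: Send the slides.", "Dan: Sent."],
     "Dan has sent the slides."),
]

utts, summ = compose(conv_a, conv_b, swap_index=1)
```

The resulting pair keeps the first topic of `conv_a` and the second topic of `conv_b`, with a summary assembled from the corresponding summary sentences.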
Anthology ID:
2023.acl-long.82
Volume:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1471–1488
URL:
https://aclanthology.org/2023.acl-long.82
DOI:
10.18653/v1/2023.acl-long.82
Cite (ACL):
Siru Ouyang, Jiaao Chen, Jiawei Han, and Diyi Yang. 2023. Compositional Data Augmentation for Abstractive Conversation Summarization. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1471–1488, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Compositional Data Augmentation for Abstractive Conversation Summarization (Ouyang et al., ACL 2023)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2023.acl-long.82.pdf
Video:
https://preview.aclanthology.org/emnlp-22-attachments/2023.acl-long.82.mp4