TriSum: Learning Summarization Ability from Large Language Models with Structured Rationale

Pengcheng Jiang, Cao Xiao, Zifeng Wang, Parminder Bhatia, Jimeng Sun, Jiawei Han


Abstract
The advent of large language models (LLMs) has significantly advanced natural language processing tasks like text summarization. However, their large size and computational demands, coupled with privacy concerns in data transmission, limit their use in resource-constrained and privacy-centric settings. To overcome this, we introduce TriSum, a framework for distilling LLMs’ text summarization abilities into a compact, local model. First, an LLM extracts a set of aspect-triple rationales and summaries, which are refined using a dual-scoring method for quality. A smaller local model is then trained on these rationale and summarization tasks, employing a curriculum learning strategy that progresses from simple to complex tasks. Our method improves local model performance on three benchmarks (CNN/DailyMail, XSum, and ClinicalTrial), outperforming baselines by 4.5%, 8.5%, and 7.4%, respectively. It also improves interpretability by providing insight into the summarization rationale.
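As a rough illustration of the pipeline the abstract describes (LLM-generated aspect-triple rationales, dual-scoring refinement, and curriculum training of a local model), the Python sketch below encodes the data flow. The Rationale structure, the weighted dual score, and the three-step task ordering are illustrative assumptions for exposition, not the authors' released implementation.

    from dataclasses import dataclass
    from typing import Callable, List, Tuple

    @dataclass
    class Rationale:
        """One aspect-triple rationale plus the summary it supports."""
        aspects: List[str]                   # salient aspects named by the LLM
        triples: List[Tuple[str, str, str]]  # (subject, relation, object) facts
        summary: str                         # the LLM-written candidate summary

    def dual_score(r: Rationale,
                   summary_scorer: Callable[[str], float],
                   rationale_scorer: Callable[[Rationale], float],
                   alpha: float = 0.5) -> float:
        # Weighted blend of a summary-quality score and a rationale-quality
        # score; the 50/50 weighting is an assumption, not the paper's value.
        return alpha * summary_scorer(r.summary) + (1.0 - alpha) * rationale_scorer(r)

    def select_rationales(candidates: List[Rationale],
                          summary_scorer: Callable[[str], float],
                          rationale_scorer: Callable[[Rationale], float],
                          keep: int = 1) -> List[Rationale]:
        # Keep only the highest-scoring rationale(s) as training targets.
        ranked = sorted(
            candidates,
            key=lambda r: dual_score(r, summary_scorer, rationale_scorer),
            reverse=True,
        )
        return ranked[:keep]

    def curriculum_steps(doc: str, r: Rationale) -> List[Tuple[str, str]]:
        # Simple-to-complex ordering of (input, target) training pairs for the
        # local model; the exact task decomposition is an assumption.
        rationale_text = "; ".join(f"{s} | {p} | {o}" for s, p, o in r.triples)
        return [
            (doc, rationale_text),                       # easy: rationale only
            (doc, r.summary),                            # medium: summary only
            (doc, rationale_text + " => " + r.summary),  # hard: joint generation
        ]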
Anthology ID:
2024.naacl-long.154
Volume:
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Kevin Duh, Helena Gomez, Steven Bethard
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
2805–2819
URL:
https://aclanthology.org/2024.naacl-long.154
DOI:
10.18653/v1/2024.naacl-long.154
Cite (ACL):
Pengcheng Jiang, Cao Xiao, Zifeng Wang, Parminder Bhatia, Jimeng Sun, and Jiawei Han. 2024. TriSum: Learning Summarization Ability from Large Language Models with Structured Rationale. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 2805–2819, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
TriSum: Learning Summarization Ability from Large Language Models with Structured Rationale (Jiang et al., NAACL 2024)
PDF:
https://aclanthology.org/2024.naacl-long.154.pdf