Multi-Dimensional Optimization for Text Summarization via Reinforcement Learning

Sangwon Ryu, Heejin Do, Yunsu Kim, Gary Lee, Jungseul Ok


Abstract
The evaluation of summary quality encompasses diverse dimensions such as consistency, coherence, relevance, and fluency. However, existing summarization methods often target a specific dimension and therefore struggle to generate summaries that are well balanced across multiple dimensions. In this paper, we propose multi-objective reinforcement learning tailored to generate balanced summaries across all four dimensions. We introduce two multi-dimensional optimization (MDO) strategies for adaptive learning: 1) MDO_min, which rewards the dimension with the currently lowest score, and 2) MDO_pro, which optimizes multiple dimensions in a manner similar to multi-task learning and resolves conflicting gradients across dimensions through gradient projection. Unlike prior ROUGE-based rewards that rely on reference summaries, we use a QA-based reward model that aligns with human preferences. Further, we find that summary length can be regulated by adjusting the discount factor, which encourages concise yet informative summaries that capture the crucial points. Our approach achieves substantial performance gains over baseline models on representative summarization datasets, particularly in the overlooked dimensions.
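The two strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes per-dimension scores arrive as a plain dictionary and per-dimension gradients as flat lists of floats, and it uses a PCGrad-style projection as a stand-in for the gradient projection the abstract mentions. The function names min_dimension_reward and project_conflicts are hypothetical.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def min_dimension_reward(scores):
    """MDO_min-style reward (illustrative): pick the lowest-scoring dimension.

    `scores` maps a dimension name to its current score. The returned score
    would serve as the scalar reward for the policy update.
    """
    name = min(scores, key=scores.get)
    return name, scores[name]


def project_conflicts(grads, seed=0):
    """PCGrad-style projection (an assumption, not the paper's exact rule).

    For each dimension's gradient g, visit the other gradients in random
    order. Whenever g conflicts with another gradient h (negative inner
    product), remove from g its component along h. The projected gradients
    are then summed into one update direction.
    """
    rng = random.Random(seed)
    projected = []
    for i, g in enumerate(grads):
        g = list(g)
        others = [j for j in range(len(grads)) if j != i]
        rng.shuffle(others)
        for j in others:
            h = grads[j]
            d = dot(g, h)
            if d < 0:  # conflict: g points against h
                scale = d / dot(h, h)
                g = [a - scale * b for a, b in zip(g, h)]
        projected.append(g)
    return [sum(column) for column in zip(*projected)]


if __name__ == "__main__":
    scores = {"consistency": 0.9, "coherence": 0.4, "relevance": 0.7, "fluency": 0.95}
    print(min_dimension_reward(scores))
    print(project_conflicts([[1.0, 0.0], [-1.0, 1.0]]))
```

In the two-gradient example, each gradient loses the component that opposes the other before the two are summed, so the combined update no longer pushes any dimension backwards along a conflicting direction.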
Anthology ID:
2024.acl-long.319
Volume:
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
5858–5871
URL:
https://aclanthology.org/2024.acl-long.319
DOI:
10.18653/v1/2024.acl-long.319
Cite (ACL):
Sangwon Ryu, Heejin Do, Yunsu Kim, Gary Lee, and Jungseul Ok. 2024. Multi-Dimensional Optimization for Text Summarization via Reinforcement Learning. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5858–5871, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Multi-Dimensional Optimization for Text Summarization via Reinforcement Learning (Ryu et al., ACL 2024)
PDF:
https://preview.aclanthology.org/autopr/2024.acl-long.319.pdf