Abstract
Multi-document summarization has gained increasing attention recently and serves as an invaluable tool for obtaining key facts from a large information pool. In this paper, we propose a multi-document hybrid summarization approach, which simultaneously generates a human-readable summary and extracts the corresponding key evidence from multi-doc inputs. To that end, we craft a salient representation learning method to induce latent salient features, which are effective for joint evidence extraction and summary generation. To train this model, we conduct multi-task learning to optimize a composite loss, constructed hierarchically over extractive and abstractive sub-components. We implement the system on a ubiquitously adopted transformer architecture and conduct experimental studies on multiple datasets across two domains, achieving superior performance over the baselines.
- Anthology ID:
- 2023.acl-industry.37
- Volume:
- Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track)
- Month:
- July
- Year:
- 2023
- Address:
- Toronto, Canada
- Editors:
- Sunayana Sitaram, Beata Beigman Klebanov, Jason D Williams
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 379–389
- URL:
- https://aclanthology.org/2023.acl-industry.37
- DOI:
- 10.18653/v1/2023.acl-industry.37
- Cite (ACL):
- Min Xiao. 2023. Multi-doc Hybrid Summarization via Salient Representation Learning. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track), pages 379–389, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal):
- Multi-doc Hybrid Summarization via Salient Representation Learning (Xiao, ACL 2023)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2023.acl-industry.37.pdf