QMSum: A New Benchmark for Query-based Multi-domain Meeting Summarization

Ming Zhong; Da Yin; Tao Yu; Ahmad Zaidi; Mutethia Mutuma; Rahul Jha; Ahmed Hassan Awadallah; Asli Celikyilmaz; Yang Liu; Xipeng Qiu; Dragomir Radev

doi:10.18653/v1/2021.naacl-main.472

Abstract

Meetings are a key component of human collaboration. As increasing numbers of meetings are recorded and transcribed, meeting summaries have become essential to remind those who may or may not have attended the meetings about the key decisions made and the tasks to be completed. However, it is hard to create a single short summary that covers all the content of a long meeting involving multiple people and topics. To satisfy the needs of different types of users, we define a new query-based multi-domain meeting summarization task, where models have to select and summarize relevant spans of meetings in response to a query, and we introduce QMSum, a new benchmark for this task. QMSum consists of 1,808 query-summary pairs over 232 meetings in multiple domains. We also investigate a locate-then-summarize method and evaluate a set of strong summarization baselines on the task. Experimental results and manual analysis reveal that QMSum presents significant challenges in long meeting summarization for future research. The dataset is available at https://github.com/Yale-LILY/QMSum.
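
The locate-then-summarize idea mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the locator here is a simple word-overlap ranker and the summarizer is a placeholder that concatenates and truncates the located turns, both chosen only to show the two-stage shape of the pipeline. The function names, the stopword list, the parameters `top_k` and `max_words`, and the example meeting are all assumptions made for illustration.

```python
import re

# Hypothetical, deliberately tiny stopword list used only for this sketch.
STOPWORDS = {"the", "a", "an", "what", "was", "were", "about", "of", "to", "is"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {tok for tok in tokens if tok not in STOPWORDS}


def locate(query, turns, top_k=2):
    """Stage 1: keep the top_k turns that share the most words with the query.

    The selected turns are returned in their original meeting order.
    """
    query_tokens = tokenize(query)
    scored = [(len(tokenize(turn) & query_tokens), index)
              for index, turn in enumerate(turns)]
    # Highest overlap first; ties are broken by earlier position.
    best = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:top_k]
    keep = sorted(index for overlap, index in best if overlap > 0)
    return [turns[i] for i in keep]


def summarize(spans, max_words=30):
    """Stage 2 placeholder: join the located spans and truncate to max_words.

    A real system would call an abstractive summarization model here.
    """
    words = " ".join(spans).split()
    return " ".join(words[:max_words])


def locate_then_summarize(query, turns, top_k=2, max_words=30):
    """Run the locator, then pass only the located spans to the summarizer."""
    return summarize(locate(query, turns, top_k), max_words)


# Made-up meeting transcript and query, for illustration only.
meeting = [
    "PM: Let's start with the budget for the remote control.",
    "Designer: I think the remote should have a rubber case.",
    "Marketing: Users want fewer buttons on the remote.",
    "PM: The budget is capped at twelve euros per unit.",
]
query = "What was decided about the budget?"
summary = locate_then_summarize(query, meeting)
print(summary)
```

The point of the two stages is that the summarizer only ever sees the query-relevant spans, which keeps its input short even when the full meeting transcript is long.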

Anthology ID: 2021.naacl-main.472
Volume: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month: June
Year: 2021
Address: Online
Editors: Kristina Toutanova, Anna Rumshisky, Luke Zettlemoyer, Dilek Hakkani-Tur, Iz Beltagy, Steven Bethard, Ryan Cotterell, Tanmoy Chakraborty, Yichao Zhou
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 5905–5921
URL: https://aclanthology.org/2021.naacl-main.472
DOI: 10.18653/v1/2021.naacl-main.472
Cite (ACL): Ming Zhong, Da Yin, Tao Yu, Ahmad Zaidi, Mutethia Mutuma, Rahul Jha, Ahmed Hassan Awadallah, Asli Celikyilmaz, Yang Liu, Xipeng Qiu, and Dragomir Radev. 2021. QMSum: A New Benchmark for Query-based Multi-domain Meeting Summarization. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 5905–5921, Online. Association for Computational Linguistics.
Cite (Informal): QMSum: A New Benchmark for Query-based Multi-domain Meeting Summarization (Zhong et al., NAACL 2021)
PDF: https://preview.aclanthology.org/landing_page/2021.naacl-main.472.pdf
Optional supplementary data: 2021.naacl-main.472.OptionalSupplementaryData.zip
Video: https://preview.aclanthology.org/landing_page/2021.naacl-main.472.mp4
Code: Yale-LILY/QMSum
Data: QMSum
