MSAMSum: Towards Benchmarking Multi-lingual Dialogue Summarization

Xiachong Feng, Xiaocheng Feng, Bing Qin


Abstract
Dialogue summarization, which helps users capture salient information from various types of dialogues, has received much attention recently. However, current work mainly focuses on English dialogue summarization, leaving other languages less well explored. Therefore, we present a multi-lingual dialogue summarization dataset, namely MSAMSum, which covers dialogue-summary pairs in six languages. Specifically, we derive MSAMSum from the standard SAMSum using sophisticated translation techniques and further employ two methods to ensure the integral translation quality and summary factual consistency. Given the proposed MSAMSum, we systematically set up five multi-lingual settings for this task, including a novel mix-lingual dialogue summarization setting. To illustrate the utility of our dataset, we benchmark pre-trained models under different settings and report results in both supervised and zero-shot manners. We also discuss directions for future work on this task to motivate further research.
Anthology ID:
2022.dialdoc-1.1
Volume:
Proceedings of the Second DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venue:
dialdoc
Publisher:
Association for Computational Linguistics
Pages:
1–12
URL:
https://aclanthology.org/2022.dialdoc-1.1
DOI:
10.18653/v1/2022.dialdoc-1.1
Cite (ACL):
Xiachong Feng, Xiaocheng Feng, and Bing Qin. 2022. MSAMSum: Towards Benchmarking Multi-lingual Dialogue Summarization. In Proceedings of the Second DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering, pages 1–12, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
MSAMSum: Towards Benchmarking Multi-lingual Dialogue Summarization (Feng et al., dialdoc 2022)
PDF:
https://preview.aclanthology.org/auto-file-uploads/2022.dialdoc-1.1.pdf
Code:
xcfcode/msamsum