MDS: A Fine-Grained Dataset for Multi-Modal Dialogue Summarization

Zhipeng Liu, Xiaoming Zhang, Litian Zhang, Zelong Yu


Abstract
With the rapid growth of dialogue scenarios, summarizing a dialogue into a short message has drawn much attention recently. In multi-modal dialogue, people tend to use tone and body language to convey their intentions. Traditional dialogue summarization has focused predominantly on textual content, an approach that may overlook visual and audio information essential for understanding multi-modal interactions. To broaden the variety and scope of data available for multi-modal dialogue summarization, we develop a new dataset, MDS, which provides a demanding testbed for the task. We then conduct a comparative analysis of various summarization techniques on MDS and find that existing methods tend to produce redundant and incoherent summaries. All of the models generate unfaithful facts to some degree, which suggests directions for future research. MDS is available at https://github.com/R00kkie/MDS.
Anthology ID:
2024.lrec-main.970
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
11123–11137
URL:
https://aclanthology.org/2024.lrec-main.970
Cite (ACL):
Zhipeng Liu, Xiaoming Zhang, Litian Zhang, and Zelong Yu. 2024. MDS: A Fine-Grained Dataset for Multi-Modal Dialogue Summarization. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 11123–11137, Torino, Italia. ELRA and ICCL.
Cite (Informal):
MDS: A Fine-Grained Dataset for Multi-Modal Dialogue Summarization (Liu et al., LREC-COLING 2024)
PDF:
https://preview.aclanthology.org/nschneid-patch-3/2024.lrec-main.970.pdf
Optional supplementary material:
 2024.lrec-main.970.OptionalSupplementaryMaterial.zip