Shi Xiaorui
2023
MCLS: A Large-Scale Multimodal Cross-Lingual Summarization Dataset
Proceedings of the 22nd Chinese National Conference on Computational Linguistics
“Multimodal summarization which aims to generate summaries with multimodal inputs, e.g., text and visual features, has attracted much attention in the research community. However, previous studies only focus on monolingual multimodal summarization and neglect the non-native reader to understand the cross-lingual news in practical applications. It inspires us to present a new task, named Multimodal Cross-Lingual Summarization for news (MCLS), which generates cross-lingual summaries from multi-source information. To this end, we present a large-scale multimodal cross-lingual summarization dataset, which consists of 1.1 million article-summary pairs with 3.4 million images in 44 * 43 language pairs. To generate a summary in any language, we propose a unified framework that jointly trains the multimodal monolingual and cross-lingual summarization tasks, where a bi-directional knowledge distillation approach is designed to transfer knowledge between both tasks. Extensive experiments on many-to-many settings show the effectiveness of the proposed model.”