Shi Xiaorui


2023

“Multimodal summarization which aims to generate summaries with multimodal inputs, e.g., text and visual features, has attracted much attention in the research community. However, previous studies only focus on monolingual multimodal summarization and neglect the non-native reader to understand the cross-lingual news in practical applications. It inspires us to present a new task, named Multimodal Cross-Lingual Summarization for news (MCLS), which generates cross-lingual summaries from multi-source information. To this end, we present a large-scale multimodal cross-lingual summarization dataset, which consists of 1.1 million article-summary pairs with 3.4 million images in 44 * 43 language pairs. To generate a summary in any language, we propose a unified framework that jointly trains the multimodal monolingual and cross-lingual summarization tasks, where a bi-directional knowledge distillation approach is designed to transfer knowledge between both tasks. Extensive experiments on many-to-many settings show the effectiveness of the proposed model.”