Capturing Latent Modal Association For Multimodal Entity Alignment

Yongquan Ji, Jingwei Cheng, Fu Zhang, Chenglong Lu


Abstract
Multimodal entity alignment aims to identify equivalent entities in heterogeneous knowledge graphs by leveraging complementary information from multiple modalities. However, existing methods often overlook the quality of input modality embeddings during modality interaction (such as missing-modality generation, modal information transfer, and modality fusion), which may inadvertently amplify noise propagation while suppressing discriminative feature representations. To address these issues, we propose a novel model, CLAMEA, for capturing latent modal association for multimodal entity alignment. Specifically, we use a self-attention mechanism to enhance salient information while attenuating noise within individual modality embeddings. We design a dynamic modal attention flow fusion module to capture and balance latent intra- and inter-modal associations and generate fused modality embeddings. Based on both the fused and the available modalities, we adopt a variational autoencoder (VAE) to generate high-quality embeddings for the missing modality. We use a cross-modal association extraction module to extract latent modal associations from the completed modality embeddings, further enhancing embedding quality. Experimental results on two real-world datasets demonstrate the effectiveness of our approach, which improves Hits@1 by an absolute 3.1% over the state-of-the-art method.
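For readers unfamiliar with the Hits@1 metric reported above, the following is a minimal sketch of how Hits@k is commonly computed in entity alignment. It is not taken from the paper: the function name `hits_at_k`, the toy similarity matrix, and the tie-handling rule (ties count in favour of the gold target) are all assumptions made for illustration.

```python
from typing import Sequence


def hits_at_k(sim: Sequence[Sequence[float]], gold: Sequence[int], k: int = 1) -> float:
    """Fraction of source entities whose gold target ranks in the top k.

    sim[i][j] is a similarity score between source entity i and target
    entity j; gold[i] is the index of the true counterpart of source i.
    (Hypothetical helper for illustration, not the authors' code.)
    """
    if not sim:
        return 0.0
    hits = 0
    for row, target in zip(sim, gold):
        # Rank = 1 + number of candidates scored strictly higher than the
        # gold target, so ties are resolved optimistically (an assumption).
        rank = 1 + sum(1 for score in row if score > row[target])
        if rank <= k:
            hits += 1
    return hits / len(sim)


# Toy example: three source entities, three candidate targets each.
sim = [
    [0.9, 0.1, 0.3],  # gold target 0 is ranked first
    [0.2, 0.4, 0.8],  # gold target 1 is ranked second
    [0.1, 0.2, 0.7],  # gold target 2 is ranked first
]
gold = [0, 1, 2]

print(hits_at_k(sim, gold, k=1))
print(hits_at_k(sim, gold, k=2))
```

Under this definition, an absolute improvement of 3.1% in Hits@1 means that a further 3.1% of test entities have their true counterpart ranked first.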
Anthology ID:
2025.findings-emnlp.1213
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
22278–22293
URL:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1213/
DOI:
10.18653/v1/2025.findings-emnlp.1213
Cite (ACL):
Yongquan Ji, Jingwei Cheng, Fu Zhang, and Chenglong Lu. 2025. Capturing Latent Modal Association For Multimodal Entity Alignment. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 22278–22293, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Capturing Latent Modal Association For Multimodal Entity Alignment (Ji et al., Findings 2025)
PDF:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1213.pdf
Checklist:
2025.findings-emnlp.1213.checklist.pdf