Trust in Internal or External Knowledge? Generative Multi-Modal Entity Linking with Knowledge Retriever

Xinwei Long, Jiali Zeng, Fandong Meng, Jie Zhou, Bowen Zhou


Abstract
Multi-modal entity linking (MEL) is a challenging task that requires accurate prediction of entities within extensive search spaces, utilizing multi-modal contexts. Existing generative approaches struggle with the knowledge gap between visual entity information and the intrinsic parametric knowledge of LLMs. To address this knowledge gap, we introduce a novel approach called GELR, which incorporates a knowledge retriever to enhance visual entity information by leveraging external sources. Additionally, we devise a prioritization scheme that effectively handles noisy retrieval results and manages conflicts arising from the integration of external and internal knowledge. Moreover, we propose a noise-aware instruction tuning technique during training to finely adjust the model’s ability to leverage retrieved information effectively. Through extensive experiments conducted on three benchmarks, our approach showcases remarkable improvements, ranging from 3.0% to 6.5%, across all evaluation metrics compared to strong baselines. These results demonstrate the effectiveness and superiority of our proposed method in tackling the complexities of multi-modal entity linking.
Anthology ID: 2024.findings-acl.450
Volume: Findings of the Association for Computational Linguistics ACL 2024
Month: August
Year: 2024
Address: Bangkok, Thailand and virtual meeting
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 7559–7569
URL: https://aclanthology.org/2024.findings-acl.450
Cite (ACL): Xinwei Long, Jiali Zeng, Fandong Meng, Jie Zhou, and Bowen Zhou. 2024. Trust in Internal or External Knowledge? Generative Multi-Modal Entity Linking with Knowledge Retriever. In Findings of the Association for Computational Linguistics ACL 2024, pages 7559–7569, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
Cite (Informal): Trust in Internal or External Knowledge? Generative Multi-Modal Entity Linking with Knowledge Retriever (Long et al., Findings 2024)
PDF: https://preview.aclanthology.org/nschneid-patch-4/2024.findings-acl.450.pdf