Ze Li
2025
How do Language Models Reshape Entity Alignment? A Survey of LM-Driven EA Methods: Advances, Benchmarks, and Future
Zerui Chen | Huiming Fan | Qianyu Wang | Tao He | Ming Liu | Heng Chang | Weijiang Yu | Ze Li | Bing Qin
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Entity alignment (EA), critical for knowledge graph (KG) integration, identifies equivalent entities across different KGs. Traditional methods often face challenges in semantic understanding and scalability. The rise of language models (LMs), particularly large language models (LLMs), has provided powerful new strategies. This paper systematically reviews LM-driven EA methods, proposing a novel taxonomy that categorizes methods across three key stages: data preparation, feature embedding, and alignment. We further summarize key benchmarks and evaluation metrics and discuss future directions. This paper aims to provide researchers and practitioners with a clear and comprehensive understanding of how language models reshape the field of entity alignment.
Generating Domain-Specific Knowledge Graphs from Large Language Models
Marinela Parović | Ze Li | Jinhua Du
Findings of the Association for Computational Linguistics: ACL 2025
Knowledge graphs (KGs) have been a cornerstone of search and recommendation due to their ability to store factual knowledge about any domain in a structured form that enables easy search and retrieval. Large language models (LLMs) have shown impressive world knowledge across different benchmarks and domains, but their knowledge is inconveniently scattered across their billions of parameters. In this paper, we propose a prompt-based method to construct domain-specific KGs by extracting knowledge solely from LLMs’ parameters. First, we use an LLM to create a schema for a specific domain, which contains a set of domain-representative entities and relations. After that, we use the schema to guide the LLM through an iterative data generation process equipped with Chain-of-Verification (CoVe) for increased data quality. Using this method, we construct KGs for two domains, books and landmarks, which we then evaluate against Wikidata, an open-source human-created KG. Our results show that LLMs can generate large domain-specific KGs containing tens of thousands of entities and relations. However, due to the increased hallucination rates as the procedure evolves, the utility of large-scale LLM-generated KGs in practical applications could remain limited.