Abstract
With the rapid development of NLP, large language models (LLMs) now excel at a wide range of tasks across many domains. However, existing benchmarks may not adequately measure these models' capabilities, especially when they are faced with new knowledge. In this paper, we address the lack of benchmarks for evaluating LLMs' ability to handle new knowledge, an important and challenging aspect of a rapidly evolving world. We propose an approach called KnowGen that generates new knowledge by altering the attributes and relationships of existing entities, yielding artificial entities that are distinct from any real-world entity. With KnowGen, we introduce a benchmark named ALCUNA to assess LLMs' abilities in knowledge understanding, differentiation, and association. We benchmark several LLMs and find that their performance in the face of new knowledge is unsatisfactory, particularly in reasoning between new and internal knowledge. We also explore the impact of entity similarity on a model's understanding of entity knowledge, as well as the influence of contextual entities. We urge caution when applying LLMs to new scenarios or new knowledge, and hope that our benchmark can help drive the development of LLMs in the face of new knowledge.
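To make the KnowGen idea concrete, here is a minimal Python sketch of how an artificial entity might be fabricated by perturbing a real entity's attributes while keeping its relations. The `Entity` representation, the `make_artificial` helper, and the attribute pool are hypothetical illustrations; the paper's actual construction heuristics are not specified in this abstract.

```python
from dataclasses import dataclass
import random

@dataclass
class Entity:
    name: str
    attributes: dict  # e.g. {"habitat": "freshwater", "diet": "insects"}
    relations: dict   # e.g. {"parent_taxon": "Ranidae"}

def make_artificial(entity: Entity, attribute_pool: dict, rng: random.Random) -> Entity:
    """Create an artificial entity by perturbing attribute values.

    Each attribute value is swapped for a different value observed for that
    attribute across real entities, so the result stays plausible but does
    not match any real-world entity.
    """
    new_attrs = {}
    for key, value in entity.attributes.items():
        candidates = [v for v in attribute_pool.get(key, []) if v != value]
        new_attrs[key] = rng.choice(candidates) if candidates else value
    # Keep relations, but rename the entity so it cannot resolve to a real one.
    return Entity(entity.name + "-X", new_attrs, dict(entity.relations))

rng = random.Random(0)
frog = Entity("Rana amurensis",
              {"habitat": "freshwater", "diet": "insects"},
              {"parent_taxon": "Ranidae"})
pool = {"habitat": ["freshwater", "desert", "rainforest"],
        "diet": ["insects", "fish", "plants"]}
print(make_artificial(frog, pool, rng))
```

Because the fabricated entity shares its relational structure with a real one, probes built on it can separate genuine understanding of presented knowledge from memorized facts about the original entity.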
- Anthology ID:
- 2023.emnlp-main.87
- Volume:
- Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
- Month:
- December
- Year:
- 2023
- Address:
- Singapore
- Editors:
- Houda Bouamor, Juan Pino, Kalika Bali
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1397–1414
- URL:
- https://aclanthology.org/2023.emnlp-main.87
- DOI:
- 10.18653/v1/2023.emnlp-main.87
- Cite (ACL):
- Xunjian Yin, Baizhou Huang, and Xiaojun Wan. 2023. ALCUNA: Large Language Models Meet New Knowledge. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 1397–1414, Singapore. Association for Computational Linguistics.
- Cite (Informal):
- ALCUNA: Large Language Models Meet New Knowledge (Yin et al., EMNLP 2023)
- PDF:
- https://preview.aclanthology.org/dois-2013-emnlp/2023.emnlp-main.87.pdf