Weiyan Zhang


2024

Unexpected Phenomenon: LLMs’ Spurious Associations in Information Extraction
Weiyan Zhang | Wanpeng Lu | Jiacheng Wang | Yating Wang | Lihan Chen | Haiyun Jiang | Jingping Liu | Tong Ruan
Findings of the Association for Computational Linguistics ACL 2024

Information extraction plays a critical role in natural language processing. When applying large language models (LLMs) to this domain, we discover an unexpected phenomenon: LLMs’ spurious associations. In tasks such as relation extraction, LLMs can accurately identify entity pairs even when the given relation (label) is semantically unrelated to the original pre-defined one. To find such labels, we design two strategies: forward label extension and backward label validation. We also leverage the extended labels to improve model performance. Our comprehensive experiments show that spurious associations occur consistently in both Chinese and English datasets and across various LLM sizes. Moreover, using extended labels significantly enhances LLM performance in information extraction tasks, with F1 score increases of 9.55%, 11.42%, and 21.27% on the SciERC, ACE05, and DuEE datasets, respectively.