Unexpected Phenomenon: LLMs’ Spurious Associations in Information Extraction

Weiyan Zhang; Wanpeng Lu; Jiacheng Wang; Yating Wang; Lihan Chen; Haiyun Jiang; Jingping Liu; Tong Ruan

doi:10.18653/v1/2024.findings-acl.545

Abstract

Information extraction plays a critical role in natural language processing. When applying large language models (LLMs) to this domain, we discover an unexpected phenomenon: LLMs’ spurious associations. In tasks such as relation extraction, LLMs can accurately identify entity pairs even when the given relation (label) is semantically unrelated to the pre-defined original one. To find these labels, we design two strategies: forward label extension and backward label validation. We also leverage the extended labels to improve model performance. Our comprehensive experiments show that spurious associations occur consistently in both Chinese and English datasets and across various LLM sizes. Moreover, the use of extended labels significantly enhances LLM performance in information extraction tasks, with F1 score increases of 9.55%, 11.42%, and 21.27% on the SciERC, ACE05, and DuEE datasets, respectively.

Anthology ID: 2024.findings-acl.545
Volume: Findings of the Association for Computational Linguistics: ACL 2024
Month: August
Year: 2024
Address: Bangkok, Thailand
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 9176–9190
URL: https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.findings-acl.545/
DOI: 10.18653/v1/2024.findings-acl.545
Cite (ACL): Weiyan Zhang, Wanpeng Lu, Jiacheng Wang, Yating Wang, Lihan Chen, Haiyun Jiang, Jingping Liu, and Tong Ruan. 2024. Unexpected Phenomenon: LLMs’ Spurious Associations in Information Extraction. In Findings of the Association for Computational Linguistics: ACL 2024, pages 9176–9190, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal): Unexpected Phenomenon: LLMs’ Spurious Associations in Information Extraction (Zhang et al., Findings 2024)
PDF: https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.findings-acl.545.pdf
