MPL: Multiple Programming Languages with Large Language Models for Information Extraction

Bo Li, Gexiang Fang, Wei Ye, Zhenghua Xu, Jinglei Zhang, Hao Cheng, Shikun Zhang


Abstract
Recent research in information extraction (IE) focuses on utilizing code-style inputs to enhance structured output generation. The intuition behind this is that programming languages (PLs) inherently exhibit greater structural organization than natural languages (NLs). This structural advantage makes PLs particularly well suited for IE tasks. Nevertheless, existing work relies primarily on Python for code-style simulation, overlooking the potential of other widely used PLs (e.g., C++ and Java) during the supervised fine-tuning (SFT) phase. In this research, we propose Multiple Programming Languages with large language models for information extraction (abbreviated as MPL), a novel framework that explores the potential of incorporating different PLs in the SFT phase. Additionally, we introduce function-prompt with virtual running to simulate code-style inputs more effectively and efficiently. Experimental results on a wide range of datasets demonstrate the effectiveness of MPL. Furthermore, we conduct extensive experiments to provide a comprehensive analysis. Our code and additional files are in the supplementary materials.
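To illustrate the idea of code-style inputs for IE described in the abstract, the sketch below renders a named-entity-recognition instance as a Python function stub that the model is asked to complete. This is a minimal, hypothetical illustration of the general technique: the names `build_function_prompt`, `extract_entities`, and `Entity` are assumptions for this sketch, not the paper's actual prompt format.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    """Structured output slot the model is asked to fill."""
    text: str
    label: str


def build_function_prompt(sentence: str, labels: list[str]) -> str:
    """Render an NER instance as a Python-style function stub, so the
    model completes structured code rather than free-form text."""
    label_doc = ", ".join(labels)
    return (
        "def extract_entities(sentence: str) -> list[Entity]:\n"
        f'    """Extract entities of types: {label_doc}."""\n'
        f'    sentence = "{sentence}"\n'
        "    entities = []\n"
        "    # model continues: entities.append(Entity(text=..., label=...))\n"
    )


prompt = build_function_prompt("Steve Jobs founded Apple.", ["PER", "ORG"])
print(prompt)
```

The same instance could analogously be rendered as a C++ or Java method stub, which is the kind of multi-PL variation the MPL framework explores during SFT.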
Anthology ID:
2025.findings-acl.122
Volume:
Findings of the Association for Computational Linguistics: ACL 2025
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
2403–2414
URL:
https://preview.aclanthology.org/transition-to-people-yaml/2025.findings-acl.122/
DOI:
10.18653/v1/2025.findings-acl.122
Cite (ACL):
Bo Li, Gexiang Fang, Wei Ye, Zhenghua Xu, Jinglei Zhang, Hao Cheng, and Shikun Zhang. 2025. MPL: Multiple Programming Languages with Large Language Models for Information Extraction. In Findings of the Association for Computational Linguistics: ACL 2025, pages 2403–2414, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
MPL: Multiple Programming Languages with Large Language Models for Information Extraction (Li et al., Findings 2025)
PDF:
https://preview.aclanthology.org/transition-to-people-yaml/2025.findings-acl.122.pdf