Prompt Optimization for Relation Extraction using Reinforcement Learning

Ying Liu, Dong Shuai, Cui Zibo, TengQi Ye, Gang Wu


Abstract
Relation extraction is a fundamental task in information extraction, but existing supervised approaches rely heavily on large-scale annotated data, which limits their applicability in domain-specific and low-resource scenarios. Prompt-based methods with large language models (LLMs) provide a parameter-efficient alternative; however, their performance is highly sensitive to prompt design, which often requires extensive domain expertise and heuristic trial and error. We propose REPO, a reinforcement learning-based automated prompt optimization framework for domain-specific relation extraction. REPO formulates prompt construction as a structured, sequential decision-making problem and optimizes prompt quality through interaction with a black-box LLM. To enable efficient and stable optimization, we introduce a two-stage framework comprising an initial prompt-construction stage that generates semantically grounded candidates and a deep reinforcement learning (DRL)-based refinement stage that iteratively improves prompts within a constrained, domain-aware action space. We further design a composite evaluation metric that integrates extraction accuracy and semantic consistency to serve as a dense reward signal. Extensive experiments on multiple relation extraction datasets across medical, financial, legal, and news domains demonstrate that REPO consistently outperforms existing prompt-based methods and supervised baselines. Ablation studies further confirm the effectiveness and robustness of the proposed DRL-based prompt optimization strategy. Our code is available at https://github.com/dddong2-star/REPO.
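The composite reward mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual formula, so everything below is an assumption made for illustration: the weighting `alpha`, the use of exact-match triple F1 as "extraction accuracy", and a token-overlap (Jaccard) score as a crude stand-in for "semantic consistency". All function names are hypothetical and are not taken from the REPO code base.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def triple_f1(pred: Set[Triple], gold: Set[Triple]) -> float:
    """Exact-match F1 over triples (assumed proxy for extraction accuracy)."""
    true_positives = len(pred & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def token_jaccard(a: str, b: str) -> float:
    """Jaccard overlap between the lowercased whitespace tokens of two strings."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def semantic_consistency(pred: Set[Triple], gold: Set[Triple]) -> float:
    """Mean best token overlap of each predicted triple with any gold triple.

    Assumption: a real system would more likely use an embedding-based
    similarity; token overlap keeps this sketch dependency-free.
    """
    if not pred or not gold:
        return 0.0
    scores = [
        max(token_jaccard(" ".join(p), " ".join(g)) for g in gold)
        for p in pred
    ]
    return sum(scores) / len(scores)


def composite_reward(pred: Set[Triple], gold: Set[Triple], alpha: float = 0.7) -> float:
    """Weighted sum of accuracy and consistency; alpha is a hypothetical weight."""
    return alpha * triple_f1(pred, gold) + (1 - alpha) * semantic_consistency(pred, gold)
```

Under these assumptions, a near-miss prediction still earns partial credit through the consistency term even when exact-match F1 is zero, which is one way a reward signal can be made denser than accuracy alone.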
Anthology ID:
2026.findings-acl.2100
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
42321–42334
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.2100/
Cite (ACL):
Ying Liu, Dong Shuai, Cui Zibo, TengQi Ye, and Gang Wu. 2026. Prompt Optimization for Relation Extraction using Reinforcement Learning. In Findings of the Association for Computational Linguistics: ACL 2026, pages 42321–42334, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Prompt Optimization for Relation Extraction using Reinforcement Learning (Liu et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.2100.pdf
Checklist:
 2026.findings-acl.2100.checklist.pdf