Akang Shi
2025
Towards Robust Universal Information Extraction: Dataset, Evaluation, and Solution
Jizhao Zhu | Akang Shi | Zixuan Li | Long Bai | Xiaolong Jin | Jiafeng Guo | Xueqi Cheng
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
In this paper, we aim to enhance the robustness of Universal Information Extraction (UIE) by introducing a new benchmark dataset, a comprehensive evaluation, and a feasible solution. Existing robust benchmark datasets have two key limitations: 1) they generate only a limited range of perturbations for a single Information Extraction (IE) task, which fails to effectively evaluate the robustness of UIE models; 2) they rely on small models or handcrafted rules to generate perturbations, often resulting in unnatural adversarial examples. Given the powerful generation capabilities of Large Language Models (LLMs), we introduce a new benchmark dataset for Robust UIE, called RUIE-Bench, which utilizes LLMs to generate more diverse and realistic perturbations across different IE tasks. Based on this dataset, we comprehensively evaluate existing UIE models and reveal that both LLM-based and other models suffer significant performance drops. To improve robustness and reduce training costs, we propose a data-augmentation solution that dynamically selects hard samples for iterative training based on the model’s inference loss. Experimental results show that training with only 15% of the data leads to an average 8.1% relative performance improvement across three IE tasks. Our code and dataset are available at: https://github.com/ICT-GoKnow/RobustUIE.
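The hard-sample selection loop described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' released code: it assumes a model that exposes a per-sample inference loss and a training step, and all function and parameter names (`select_hard_samples`, `loss_fn`, `train_step`, `fraction`) are hypothetical.

```python
# Minimal sketch (assumptions, not the paper's implementation) of
# loss-based hard-sample selection for iterative training.
import random
from typing import Callable, List, Sequence


def select_hard_samples(
    pool: Sequence,                      # candidate (e.g., perturbed) training samples
    loss_fn: Callable[[object], float],  # per-sample inference loss under the current model
    fraction: float = 0.15,              # the paper reports training with ~15% of the data
) -> List:
    """Rank the pool by inference loss and keep the hardest fraction."""
    ranked = sorted(pool, key=loss_fn, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    return ranked[:k]


def iterative_hard_sample_training(
    pool: Sequence,
    loss_fn: Callable[[object], float],
    train_step: Callable[[List], None],  # fine-tunes the model on the selected samples
    rounds: int = 3,
    fraction: float = 0.15,
) -> None:
    """Re-select hard samples each round, since training shifts the loss landscape."""
    for _ in range(rounds):
        hard = select_hard_samples(pool, loss_fn, fraction)
        train_step(hard)


# Toy usage with a fake "loss" so the sketch runs end to end.
if __name__ == "__main__":
    data = list(range(100))
    difficulty = {x: random.random() for x in data}
    iterative_hard_sample_training(
        pool=data,
        loss_fn=lambda x: difficulty[x],
        train_step=lambda batch: print(f"training on {len(batch)} hard samples"),
    )
```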