Shengyu Feng
2026
AdvancedIF: Rubric-Based Benchmarking and Reinforcement Learning for Advancing LLM Instruction Following
Yun He | Wenzhe Li | Hejia Zhang | Songlin Li | Karishma Mandyam | Sopan Khosla | Yuanhao Xiong | Nanshu Wang | Xiaoliang Peng | Beibin Li | Shengjie Bi | Shishir G Patil | Qi Qi | Shengyu Feng | Julian Katz-Samuels | Richard Yuanzhe Pang | Sujan Kumar Gonugondla | Hunter Lang | Yue Yu | Yundi Qian | Maryam Fazel-Zarandi | Licheng Yu | Amine Benhalloum | Hany Hassan Awadalla | Manaal Faruqui
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Recent progress in large language models (LLMs) has led to impressive performance on a range of tasks, yet advanced instruction following (IF)—especially for complex, multi-turn, and system-prompted instructions—remains a significant challenge. Rigorous evaluation and effective training for such capabilities are hindered by the lack of high-quality, human-annotated benchmarks and reliable, interpretable reward signals. In this work, we introduce AdvancedIF, a comprehensive benchmark featuring over 1,600 prompts and expert-curated rubrics that assess LLMs’ ability to follow complex, multi-turn, and system-level instructions. We also open-source the evaluation script of AdvancedIF. We further propose RIFL (Rubric-based Instruction-Following Learning), a novel post-training pipeline that leverages rubric generation, a finetuned rubric verifier, and reward shaping to enable effective reinforcement learning for instruction following. Extensive experiments demonstrate that RIFL substantially improves the instruction-following abilities of LLMs, achieving a 6.7% absolute gain on AdvancedIF and strong results on public benchmarks. Our ablation studies confirm the effectiveness of each component in RIFL. This work establishes rubrics as a powerful tool for both training and evaluating advanced IF in LLMs, paving the way for more capable and reliable AI systems.
2021
Coreference by Appearance: Visually Grounded Event Coreference Resolution
Liming Wang | Shengyu Feng | Xudong Lin | Manling Li | Heng Ji | Shih-Fu Chang
Proceedings of the Fourth Workshop on Computational Models of Reference, Anaphora and Coreference
Event coreference resolution is critical to understanding events in the growing amount of online news with multiple modalities, including text, video, and speech. However, the events and entities depicted in different modalities may not be perfectly aligned and can be difficult to annotate, which makes the task especially challenging with little supervision available. To address these issues, we propose a supervised model based on an attention mechanism and an unsupervised model based on statistical machine translation, capable of learning the relative importance of modalities for event coreference resolution. Experiments on a video multimedia event dataset show that our multimodal models outperform text-only systems in event coreference resolution tasks. A careful analysis reveals that the performance gain of the multimodal model, especially under unsupervised settings, comes from better learning of visually salient events.
Co-authors
- Amine Benhalloum 1
- Shengjie Bi 1
- Shih-Fu Chang 1
- Manaal Faruqui 1
- Maryam Fazel-Zarandi 1
- Sujan Kumar Gonugondla 1
- Hany Hassan Awadalla 1
- Yun He 1
- Heng Ji 1
- Julian Katz-Samuels 1
- Sopan Khosla 1
- Hunter Lang 1
- Beibin Li 1
- Manling Li 1
- Songlin Li 1
- Wenzhe Li 1
- Xudong Lin 1
- Karishma Mandyam 1
- Richard Yuanzhe Pang 1
- Shishir G Patil 1
- Xiaoliang Peng 1
- Qi Qi 1
- Yundi Qian 1
- Liming Wang 1
- Nanshu Wang 1
- Yuanhao Xiong 1
- Licheng Yu 1
- Yue Yu 1
- Hejia Zhang 1