Wenbin Duan

2026

uircis at SemEval-2026 Task 8: A Unified Lightweight Pipeline for Multi-Turn RAG Evaluation
Jiaqi Zhang | Wenbin Duan | Yingqi Zhang | Yan Li | Binyang Li
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)

We submit a system description paper for SemEval-2026 Task 8 (MTRAGEval), covering both Subtask A (retrieval) and Subtask B (generation). Our approach is a lightweight, fully reproducible multi-turn RAG pipeline using open-weight models: Qwen2.5-7B-Instruct for query rewriting and grounded answer generation, BGE-M3 for dense retrieval, and BGE-Reranker-v2-M3 for cross-encoder reranking. We report official test performance, conduct ablation experiments to quantify the impact of rewriting and reranking across domains, and provide error analysis using the organizers’ analytics and answerability classes, highlighting key failure modes in multi-turn retrieval specificity and grounded generation.
