uircis at SemEval-2026 Task 8: A Unified Lightweight Pipeline for Multi-Turn RAG Evaluation

Jiaqi Zhang, Wenbin Duan, Yingqi Zhang, Yan Li, Binyang Li


Abstract
This system description paper presents our submission to SemEval-2026 Task 8 (MTRAGEval), covering both Subtask A (retrieval) and Subtask B (generation). Our approach is a lightweight, fully reproducible multi-turn RAG pipeline built on open-weight models: Qwen2.5-7B-Instruct for query rewriting and grounded answer generation, BGE-M3 for dense retrieval, and BGE-Reranker-v2-M3 for cross-encoder reranking. We report official test performance, run ablation experiments to quantify the impact of rewriting and reranking across domains, and provide an error analysis based on the organizers’ analytics and answerability classes, highlighting key failure modes in multi-turn retrieval specificity and grounded generation.
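As a rough illustration of the stage order named in the abstract (query rewriting, retrieval, reranking, grounded generation), the sketch below wires four pluggable stages together for a single conversation turn. It does not call Qwen2.5-7B-Instruct, BGE-M3, or BGE-Reranker-v2-M3: the `Passage` type, the `run_turn` function, the `k_retrieve` and `k_final` parameters, and the token-overlap stand-ins are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Passage:
    doc_id: str
    text: str


# Hypothetical stage signatures; real models would be plugged in behind these.
Rewriter = Callable[[Sequence[str]], str]
Retriever = Callable[[str, int], List[Passage]]
Reranker = Callable[[str, List[Passage]], List[float]]
Generator = Callable[[str, List[Passage]], str]


def run_turn(history: Sequence[str], rewrite: Rewriter, retrieve: Retriever,
             rerank: Reranker, generate: Generator,
             k_retrieve: int = 20, k_final: int = 5) -> Tuple[List[Passage], str]:
    """Run one turn: rewrite -> retrieve -> rerank -> generate."""
    query = rewrite(history)                   # standalone query from the dialogue
    candidates = retrieve(query, k_retrieve)   # first-stage candidate passages
    scores = rerank(query, candidates)         # one relevance score per candidate
    order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    top = [candidates[i] for i in order[:k_final]]
    return top, generate(query, top)           # answer grounded in the top passages


# Toy stand-ins (assumptions for illustration only, based on token overlap).
def _tokens(text: str) -> set:
    return set(text.lower().split())


def toy_rewrite(history: Sequence[str]) -> str:
    # Naive rewrite: concatenate all turns so the last one keeps its context.
    return " ".join(history)


def make_toy_retriever(corpus: List[Passage]) -> Retriever:
    def retrieve(query: str, k: int) -> List[Passage]:
        q = _tokens(query)
        ranked = sorted(corpus, key=lambda p: len(q & _tokens(p.text)), reverse=True)
        return ranked[:k]
    return retrieve


def toy_rerank(query: str, passages: List[Passage]) -> List[float]:
    q = _tokens(query)
    return [len(q & _tokens(p.text)) / max(len(_tokens(p.text)), 1) for p in passages]


def toy_generate(query: str, passages: List[Passage]) -> str:
    # Return the best passage verbatim, or abstain when nothing was retrieved.
    return passages[0].text if passages else "I don't know."
```

Because each stage is passed in as a plain callable, an ablation of the kind the abstract mentions could be approximated in this sketch by swapping `toy_rewrite` for a function that returns only the last turn, or by using a reranker that returns constant scores.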
Anthology ID:
2026.semeval-1.394
Volume:
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Ekaterina Kochmar, Debanjan Ghosh, Kai North, Mamoru Komachi
Venues:
SemEval | WS
Publisher:
Association for Computational Linguistics
Pages:
3143–3148
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.394/
Cite (ACL):
Jiaqi Zhang, Wenbin Duan, Yingqi Zhang, Yan Li, and Binyang Li. 2026. uircis at SemEval-2026 Task 8: A Unified Lightweight Pipeline for Multi-Turn RAG Evaluation. In Proceedings of the 20th International Workshop on Semantic Evaluation (2026), pages 3143–3148, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
uircis at SemEval-2026 Task 8: A Unified Lightweight Pipeline for Multi-Turn RAG Evaluation (Zhang et al., SemEval 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.394.pdf
Supplementary material:
2026.semeval-1.394.SupplementaryMaterial.zip