Yi-Shan Lin


2026

This paper describes our system for SemEval-2026 Task 8: Evaluating Multi-Turn RAG Conversations (MTRAGEval), which evaluates retrieval-augmented generation (RAG) in multi-turn, context-dependent settings. We improve retrieval with history-aware query rewriting and enhance generation faithfulness with a LoRA-adapted model, integrating both into an end-to-end pipeline. Our approach achieves competitive performance across all subtasks, with an nDCG@5 of 0.4855 in Subtask A, a harmonic mean score of 0.6554 in Subtask B, and 0.5159 in Subtask C, outperforming strong baselines in Subtasks A and B while remaining competitive in Subtask C. Our analysis shows that increasing dialogue length introduces cumulative errors in history selection and query formulation, leading to incomplete or drifting retrieval results and increasing the risk of hallucination.
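
For readers unfamiliar with the metrics reported above, the following is a minimal sketch of how nDCG@5 and a harmonic mean are commonly computed. It is not the official task scorer: the function names `dcg_at_k` and `ndcg_at_k` are hypothetical, the linear-gain form of DCG is an assumption, and the component scores passed to the harmonic mean are placeholder values, not the task's actual sub-metrics.

```python
import math
from statistics import harmonic_mean


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k items (linear gain, log2 discount)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances, k=5, all_relevances=None):
    """nDCG@k: DCG of the system ranking divided by the DCG of an ideal ranking.

    ranked_relevances: relevance labels in the order the system returned passages.
    all_relevances: labels of every judged passage for the query (assumption: if
    omitted, the ideal ranking is built from the returned passages only).
    """
    pool = ranked_relevances if all_relevances is None else all_relevances
    ideal = dcg_at_k(sorted(pool, reverse=True), k)
    return dcg_at_k(ranked_relevances, k) / ideal if ideal > 0 else 0.0


# A relevant passage at rank 2 instead of rank 1 is discounted by 1 / log2(3).
score = ndcg_at_k([0, 1, 0, 0, 0], k=5)

# Harmonic mean of placeholder component scores; it penalizes a low component
# more strongly than an arithmetic mean would.
combined = harmonic_mean([1.0, 0.5])
```

Because the harmonic mean is dominated by its smallest input, a system must do reasonably well on every component score to obtain a high combined value.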