Tal Farhan
2026
JCT at SemEval-2026 Task 8: Resource-Efficient Multi-Turn RAG via Nano-LLM Rewriting and Hybrid Reranking
Tal Farhan | Chaya Liebeskind
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
This paper describes our system submission for SemEval-2026 Task A (MTRAGEval), which focuses on multi-turn Retrieval-Augmented Generation (RAG). Conversational queries often suffer from contextual ambiguity, which renders standard retrieval methods ineffective. We propose a highly resource-efficient pipeline that decouples query understanding from retrieval: a 1.5B-parameter Nano-LLM (Qwen) rewrites the query, which is then passed to parallel hybrid retrieval (Qdrant) and Cross-Encoder reranking. During internal development, our optimized system achieved an nDCG@5 score of 0.1991 on answerable queries, outperforming the official BM25 baseline. On the official blind test set, the system achieved a score of 0.1744. While our absolute performance trails baselines that use massive 20B-parameter models, our work establishes a crucial baseline for extreme resource efficiency in conversational RAG. We provide a comprehensive error analysis detailing the impact of domain shifts and retrieval funnels, and we conduct a qualitative analysis of the organizers’ surprise “Underspecified” class to highlight the vulnerabilities of generative query rewriting.
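For readers unfamiliar with the nDCG@5 metric reported above, the following is a minimal sketch of how it is commonly computed for a single query. It assumes the linear-gain form of DCG with a log2 position discount; the task's official scorer may differ (for example, by using exponential gain), and the function names `dcg_at_k` and `ndcg_at_k` are illustrative, not taken from the paper.

```python
import math
from typing import Sequence


def dcg_at_k(gains: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k positions (linear gain)."""
    # Position i (0-based) is discounted by log2(i + 2), so rank 1 has discount 1.
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_gains: Sequence[float],
              all_gains: Sequence[float],
              k: int = 5) -> float:
    """nDCG@k for one query.

    ranked_gains: relevance labels of the retrieved passages, in ranked order.
    all_gains:    relevance labels of every judged-relevant passage for the
                  query, used to build the ideal ranking.
    """
    ideal = dcg_at_k(sorted(all_gains, reverse=True), k)
    if ideal == 0:
        # No relevant passage exists; this sketch scores the query as 0.
        return 0.0
    return dcg_at_k(ranked_gains, k) / ideal


if __name__ == "__main__":
    # One relevant passage, retrieved at rank 2 instead of rank 1.
    print(ndcg_at_k([0, 1, 0, 0, 0], [1], k=5))
```

A system-level score such as those quoted in the abstract would typically be the mean of this per-query value over the evaluated queries.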