Yaxin Bi


2025

Optimizing RAG: Classifying Queries for Dynamic Processing
Kabir Olawore | Michael McTear | Yaxin Bi | David Griol
Proceedings of the 15th International Workshop on Spoken Dialogue Systems Technology

In Retrieval-Augmented Generation (RAG) systems, efficient information retrieval is crucial for enhancing user experience and satisfaction, as response times and computational demands significantly impact performance. RAG can be unnecessarily resource-intensive for frequently asked questions (FAQs) and other simple questions. In this paper we introduce an approach that categorizes user questions, identifying simple queries that do not require RAG processing. Evaluation results show that our proposal reduces latency and improves response efficiency compared to systems that rely solely on RAG.
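
The routing idea described in this abstract can be illustrated with a minimal sketch. Everything below (the `classify_query` function, the FAQ cache, the similarity threshold, and the placeholder RAG pipeline) is a hypothetical illustration of query classification before retrieval, not the authors' implementation.

```python
# Hypothetical sketch: route simple/FAQ-style questions to a cached answer and
# send everything else through the (placeholder) RAG pipeline.
from difflib import SequenceMatcher

FAQ_ANSWERS = {
    "what are your opening hours": "We are open 9am-5pm, Monday to Friday.",
    "how do i reset my password": "Use the 'Forgot password' link on the login page.",
}

def classify_query(question: str, threshold: float = 0.8) -> str:
    """Label a question as 'simple' (answerable from the FAQ cache) or 'rag'."""
    q = question.lower().strip(" ?")
    best = max(SequenceMatcher(None, q, faq).ratio() for faq in FAQ_ANSWERS)
    return "simple" if best >= threshold else "rag"

def run_rag_pipeline(question: str) -> str:
    # Stand-in for retrieval + generation; a real system would query a vector
    # store and call an LLM here.
    return f"[RAG answer for: {question}]"

def answer(question: str) -> str:
    if classify_query(question) == "simple":
        q = question.lower().strip(" ?")
        faq = max(FAQ_ANSWERS, key=lambda f: SequenceMatcher(None, q, f).ratio())
        return FAQ_ANSWERS[faq]        # low-latency path: no retrieval, no LLM call
    return run_rag_pipeline(question)  # full RAG path for everything else

if __name__ == "__main__":
    print(answer("What are your opening hours?"))
    print(answer("Summarise the refund policy for enterprise contracts."))
```

In this sketch the latency saving comes from skipping retrieval and generation entirely for queries the classifier marks as simple; the paper's actual classification method may differ.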

From Complex Word Identification to Substitution: Instruction-Tuned Language Models for Lexical Simplification
Tonghui Han | Xinru Zhang | Yaxin Bi | Maurice D. Mulvenna | Dongqiang Yang
Proceedings of the 14th Joint Conference on Lexical and Computational Semantics (*SEM 2025)

Lexical-level sentence simplification is essential for improving text accessibility, yet traditional methods often struggle to dynamically identify complex terms and generate contextually appropriate substitutions, resulting in limited generalization. While prompt-based approaches with large language models (LLMs) have shown strong performance and adaptability, they often lack interpretability and are prone to hallucination. This study proposes a fine-tuning approach for mid-sized LLMs to emulate the lexical simplification pipeline. We transform complex word identification datasets into an instruction–response format to support instruction tuning. Experimental results show that our method substantially enhances complex word identification accuracy with reduced hallucinations while achieving competitive performance on lexical simplification benchmarks. Furthermore, we find that integrating fine-tuning with prompt engineering reduces dependency on manual prompt optimization, leading to a more efficient simplification framework.
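
The data transformation step mentioned in the abstract can be sketched roughly as follows. The field names, prompt wording, and example record are assumptions made for illustration; the paper's actual instruction–response format is not specified here.

```python
# Hypothetical sketch: convert a complex word identification (CWI) annotation
# into an instruction-response pair suitable for instruction tuning.
from dataclasses import dataclass

@dataclass
class CWIExample:
    sentence: str
    complex_words: list[str]        # words annotated as complex
    substitutions: dict[str, str]   # simpler replacements, where available

def to_instruction_pair(ex: CWIExample) -> dict[str, str]:
    """Build one instruction-tuning record from a CWI annotation."""
    instruction = (
        "Identify the complex words in the sentence and suggest a simpler "
        f"substitute for each.\nSentence: {ex.sentence}"
    )
    response = "\n".join(
        f"{w} -> {ex.substitutions.get(w, '(no substitute)')}" for w in ex.complex_words
    )
    return {"instruction": instruction, "response": response}

example = CWIExample(
    sentence="The committee reached a unanimous verdict.",
    complex_words=["unanimous", "verdict"],
    substitutions={"unanimous": "agreed by all", "verdict": "decision"},
)
print(to_instruction_pair(example))
```

Records in this shape could then be fed to a standard instruction-tuning recipe; the choice of model size, prompt template, and training setup in the paper may differ from this sketch.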