Ziji Zhang
2025
REIC: RAG-Enhanced Intent Classification at Scale
Ziji Zhang | Michael Yang | Zhiyu Chen | Yingying Zhuang | Shu-Ting Pi | Qun Liu | Rajashekar Maragoud | Vy Nguyen | Anurag Beniwal
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
Accurate intent classification is critical for efficient routing in customer service, ensuring customers are connected with the most suitable agents while reducing handling times and operational costs. However, as companies expand their product lines, intent classification faces scalability challenges due to the increasing number of intents and variations in taxonomy across different verticals. In this paper, we introduce REIC, a Retrieval-augmented generation Enhanced Intent Classification approach, which addresses these challenges effectively. REIC leverages retrieval-augmented generation (RAG) to dynamically incorporate relevant knowledge, enabling precise classification without the need for frequent retraining. Through extensive experiments on real-world datasets, we demonstrate that REIC outperforms traditional fine-tuning, zero-shot, and few-shot methods in large-scale customer service settings. Our results highlight its effectiveness in both in-domain and out-of-domain scenarios, underscoring its potential for real-world deployment in adaptive and large-scale intent classification systems.
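The abstract describes retrieving relevant knowledge at inference time so new intents can be classified without retraining. The paper's retriever and prompt format are not given here, so the following is a minimal illustrative sketch under assumed details: bag-of-words cosine similarity over a small set of labeled exemplar utterances, with the top matches assembled into a few-shot classification prompt. The `retrieve` and `build_prompt` helpers and the exemplar data are hypothetical, not from the paper.

```python
from collections import Counter
from math import sqrt

def _vec(text):
    """Bag-of-words vector over lowercase whitespace tokens."""
    return Counter(text.lower().split())

def _cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    num = sum(a[t] * b[t] for t in a)
    den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def retrieve(query, exemplars, k=2):
    """Return the k labeled exemplars most similar to the query utterance."""
    qv = _vec(query)
    return sorted(exemplars,
                  key=lambda ex: _cosine(qv, _vec(ex["text"])),
                  reverse=True)[:k]

def build_prompt(query, retrieved):
    """Assemble a few-shot classification prompt from retrieved pairs."""
    lines = ["Classify the customer utterance into an intent."]
    for ex in retrieved:
        lines.append(f"Utterance: {ex['text']} -> Intent: {ex['intent']}")
    lines.append(f"Utterance: {query} -> Intent:")
    return "\n".join(lines)

# Hypothetical exemplar store; in practice this would hold the full,
# per-vertical intent taxonomy and be updated without retraining the model.
exemplars = [
    {"text": "where is my package", "intent": "track_order"},
    {"text": "i want my money back", "intent": "refund_request"},
    {"text": "cancel my subscription", "intent": "cancel_service"},
]
top = retrieve("my package has not arrived where is it", exemplars, k=1)
prompt = build_prompt("my package has not arrived where is it", top)
```

Because the taxonomy lives in the retrieval index rather than the model weights, adding an intent is a data update rather than a retraining job, which is the scalability property the abstract emphasizes.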
2024
From Bottom to Top: Extending the Potential of Parameter Efficient Fine-Tuning
Jihao Gu | Zelin Wang | Yibo Zhang | Ziji Zhang | Ping Gong
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
With the proliferation of large language models, Parameter Efficient Fine-Tuning (PEFT) methods, which freeze pre-trained parameters and only fine-tune a few task-specific parameters, are playing an increasingly important role. However, previous work primarily applied uniform operations across all layers of the model, overlooking the fact that different layers in a transformer store different information. In the process of exploration, we find that there are significant differences in fine-tuning strategies between different layers, and that fine-tuning only a subset of layers can achieve comparable performance. Based on this, we propose the Hybrid LoRA-Prefix Tuning (HLPT) method, which applies enhanced LoRA and Prefix-tuning with learnable adaptive mechanisms to the bottom and top layers respectively, and the Half Hybrid LoRA-Prefix Tuning (H2LPT) method, which goes a step further, reducing the parameter count to nearly half by omitting fine-tuning in the middle layers. Extensive experiments with large language models on various downstream tasks provide strong evidence for the potential of PEFT focused on different layers' interactions and for the effectiveness of our methods. Furthermore, we validate the robustness of these methods and their advantages in speeding up training convergence and reducing inference time requirements.
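The abstract's core idea is a per-layer assignment of fine-tuning strategies: LoRA for the bottom layers, Prefix-tuning for the top, and (in H2LPT) frozen middle layers. The exact layer boundaries are not stated in the abstract, so the split points below are assumptions for illustration; `assign_strategies` is a hypothetical helper, not the paper's implementation.

```python
def assign_strategies(num_layers, skip_middle=False):
    """Map each transformer layer index to a fine-tuning strategy.

    Bottom half -> "lora", top half -> "prefix" (the HLPT split, with
    the halfway boundary assumed here). With skip_middle=True (the
    H2LPT variant), the middle half of the layers is left frozen,
    roughly halving the trainable-parameter count.
    """
    split = num_layers // 2
    plan = {}
    for i in range(num_layers):
        strategy = "lora" if i < split else "prefix"
        if skip_middle and num_layers // 4 <= i < 3 * num_layers // 4:
            strategy = "frozen"  # middle layers get no adapter at all
        plan[i] = strategy
    return plan

# Example: a 12-layer model under both variants.
hlpt = assign_strategies(12)
h2lpt = assign_strategies(12, skip_middle=True)
```

In a real training loop this plan would decide which adapter module (if any) is attached to each transformer block before freezing the pre-trained weights.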