2025
MAIN-RAG: Multi-Agent Filtering Retrieval-Augmented Generation
Chia-Yuan Chang | Zhimeng Jiang | Vineeth Rakesh | Menghai Pan | Chin-Chia Michael Yeh | Guanchu Wang | Mingzhi Hu | Zhichao Xu | Yan Zheng | Mahashweta Das | Na Zou
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large Language Models (LLMs) are becoming essential tools for various natural language processing tasks but often suffer from generating outdated or incorrect information. Retrieval-Augmented Generation (RAG) addresses this issue by incorporating external, real-time information retrieval to ground LLM responses. However, existing RAG systems frequently struggle with the quality of retrieved documents, as irrelevant or noisy documents degrade performance, increase computational overhead, and undermine response reliability. To tackle this problem, we propose Multi-Agent Filtering Retrieval-Augmented Generation (MAIN-RAG), a training-free RAG framework that leverages multiple LLM agents to collaboratively filter and score retrieved documents. Specifically, MAIN-RAG introduces an adaptive filtering mechanism that dynamically adjusts the relevance filtering threshold based on score distributions, effectively minimizing noise while maintaining high recall of relevant documents. The proposed approach leverages inter-agent consensus to ensure robust document selection without requiring additional training data or fine-tuning. Experimental results across four QA benchmarks demonstrate that MAIN-RAG consistently outperforms traditional RAG approaches, achieving a 2–11% improvement in answer accuracy while reducing the number of irrelevant retrieved documents. Quantitative analysis further reveals that our approach achieves superior response consistency and answer accuracy over baseline methods, offering a competitive and practical alternative to training-based solutions.
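The adaptive filtering idea in the abstract above, a relevance threshold derived from the score distribution rather than fixed in advance, can be illustrated with a minimal sketch. The abstract does not give the exact rule, so everything here is an assumption for illustration: the function name `adaptive_filter`, the `n_std` parameter, and the choice of "mean minus a multiple of the standard deviation" as the threshold are hypothetical, and the scores are assumed to come from some upstream judging step.

```python
from statistics import mean, pstdev


def adaptive_filter(scored_docs, n_std=0.0):
    """Keep documents scoring at or above a distribution-based threshold.

    Assumption (not taken from the paper): the threshold is the mean score
    minus n_std population standard deviations. scored_docs is a list of
    (document, relevance_score) pairs from a hypothetical judging step.
    """
    if not scored_docs:
        return []
    scores = [score for _, score in scored_docs]
    # The threshold moves with the scores of this particular query.
    threshold = mean(scores) - n_std * pstdev(scores)
    kept = [(doc, score) for doc, score in scored_docs if score >= threshold]
    # Return the survivors ordered from most to least relevant.
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical scores for four retrieved documents.
scored = [("a", 0.9), ("b", 0.1), ("c", 0.8), ("d", -0.5)]
strict = adaptive_filter(scored)              # threshold equals the mean score
lenient = adaptive_filter(scored, n_std=1.0)  # threshold lowered by one std
```

With these example scores, the strict setting keeps only the two highest-scoring documents, while lowering the threshold by one standard deviation also admits the third. A larger `n_std` trades noise reduction for recall.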
Enhancing Foundation Models in Transaction Understanding with LLM-based Sentence Embeddings
Xiran Fan | Zhimeng Jiang | Chin-Chia Michael Yeh | Yuzhong Chen | Yingtong Dou | Menghai Pan | Yan Zheng
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
The ubiquity of payment networks generates vast transactional data encoding rich consumer and merchant behavioral patterns. Recent foundation models for transaction analysis process tabular data sequentially but rely on index-based representations for categorical merchant fields, causing substantial semantic information loss by converting rich textual data into discrete tokens. While Large Language Models (LLMs) can address this limitation through superior semantic understanding, their computational overhead challenges real-time financial deployment. We introduce a hybrid framework that uses LLM-generated embeddings as semantic initializations for lightweight transaction models, balancing interpretability with operational efficiency. Our approach employs multi-source data fusion to enrich merchant categorical fields and a one-word constraint principle for consistent embedding generation across LLM architectures. We systematically address data quality through noise filtering and context-aware enrichment. Experiments on large-scale transaction datasets demonstrate significant performance improvements across multiple transaction understanding tasks.
LoRATK: LoRA Once, Backdoor Everywhere in the Share-and-Play Ecosystem
Hongyi Liu | Shaochen Zhong | Xintong Sun | Minghao Tian | Mohsen Hariri | Zirui Liu | Ruixiang Tang | Zhimeng Jiang | Jiayi Yuan | Yu-Neng Chuang | Li Li | Soo-Hyun Choi | Rui Chen | Vipin Chaudhary | Xia Hu
Findings of the Association for Computational Linguistics: EMNLP 2025
Backdoor attacks are powerful and effective, but distributing LLMs without a proven track record like `meta-llama` or `qwen` rarely gains community traction. We identify LoRA sharing as a unique scenario where users are more willing to try unendorsed assets, since such shared LoRAs allow them to enjoy personalized LLMs with negligible investment. However, this convenient share-and-play ecosystem also introduces a new attack surface, where attackers can distribute malicious LoRAs to an undefended community. Despite the high-risk potential, no prior art has comprehensively explored LoRA's attack surface under the downstream-enhancing share-and-play context. In this paper, we investigate how backdoors can be injected into task-enhancing LoRAs and examine the mechanisms of such infections. We find that with a simple, efficient, yet specific recipe, **a backdoor LoRA can be trained once and then seamlessly merged (in a training-free fashion) with multiple task-enhancing LoRAs, retaining both its malicious backdoor and benign downstream capabilities.** This allows attackers to scale the distribution of compromised LoRAs with minimal effort by leveraging the rich pool of existing shared LoRA assets. We note that such merged LoRAs are particularly *infectious*, because their malicious intent is cleverly concealed behind improved downstream capabilities, creating a strong incentive for voluntary download, and *dangerous*, because under local deployment, no safety measures exist to intervene when things go wrong. Our work is among the first to study this new threat model of training-free distribution of downstream-capable-yet-backdoor-injected LoRAs, highlighting the urgent need for heightened security awareness in the LoRA ecosystem. **Warning: This paper contains offensive content and involves a real-life tragedy.**