Jae-Gil Lee

2026

Retrieval-augmented generation (RAG) systems depend on retrieval modules to supply grounding evidence for large language models. While hybrid approaches combining sparse and dense retrievers improve performance, most rely on fixed weights that ignore query-specific and corpus-specific variation. Similarly, query expansion has long been used to enrich recall, but its integration with original queries is usually static and can introduce noise. We present QuDAR, a dual-perspective adaptive retrieval framework that adapts along two perspectives: retriever type (sparse vs. dense) and query format (original vs. expanded). Leveraging margin-derived confidence (e.g., top-1–top-2 score gaps) and blind LLM-based relevance scoring, QuDAR dynamically assigns query-specific weights, fusing lexical specificity with semantic breadth while mitigating noise. QuDAR is lightweight, retriever-agnostic, and broadly applicable. Experiments show consistent gains over static baselines, improving overall retrieval quality and yielding more stable performance across queries.

2025


MONAQ: Multi-Objective Neural Architecture Querying for Time-Series Analysis on Resource-Constrained Devices
Patara Trirat | Jae-Gil Lee
Findings of the Association for Computational Linguistics: EMNLP 2025

The growing use of smartphones and IoT devices necessitates efficient time-series analysis on resource-constrained hardware, which is critical for sensing applications such as human activity recognition and air quality prediction. Recent efforts in hardware-aware neural architecture search (NAS) automate architecture discovery for specific platforms; however, none focus on general time-series analysis with edge deployment. Leveraging the problem-solving and reasoning capabilities of large language models (LLMs), we propose ***MONAQ***, a novel framework that reformulates NAS into ***M***ulti-***O***bjective ***N***eural ***A***rchitecture ***Q***uerying tasks. *MONAQ* is equipped with *multimodal query generation* for processing multimodal time-series inputs and hardware constraints, alongside an *LLM agent-based multi-objective search* to achieve deployment-ready models via code generation. By integrating numerical data, time-series images, and textual descriptions, *MONAQ* improves an LLM’s understanding of time-series data. Experiments on fifteen datasets demonstrate that *MONAQ*-discovered models outperform both handcrafted models and NAS baselines while being more efficient.

Co-authors

Patara Trirat 1

Seunghyouk Yoon 1

Venues

ACL 1
Findings 1
