Jinjin Tian


2025

pdf bib
Learning to Instruct: Fine-Tuning a Task-Aware Instruction Optimizer for Black-Box LLMs
Yunzhe Qi | Jinjin Tian | Tianci Liu | Ruirui Li | Tianxin Wei | Hui Liu | Xianfeng Tang | Monica Xiao Cheng | Jingrui He
Findings of the Association for Computational Linguistics: EMNLP 2025

The performance of Large Language Models (LLMs) critically depends on designing effective instructions, which is particularly challenging for black-box LLMs with inaccessible internal states. To this end, we introduce Learning to Instruct, a novel paradigm that formulates instruction optimization as an LLM fine-tuning objective for a white-box “instruction engineer” LLM, leveraging its rich learning capacity and vast pre-trained knowledge to enable efficient and effective instruction optimization. Within this paradigm, we propose Automatic Instruction Optimizer (AIO), a novel framework that fine-tunes a white-box LLM into a capable instruction engineer. AIO learns to optimize task-aware, human-comprehensible instructions by incorporating task nuances and feedback from the task-solving black-box LLM. To overcome the challenges of inaccessible black-box gradients and high API costs, AIO introduces a novel zeroth-order (ZO) gradient approximation mechanism guided by Thompson Sampling (TS), which reuses informative black-box LLM feedback for improved query efficiency. Extensive experiments show that AIO generally outperforms strong baselines in both effectiveness and efficiency, establishing Learning to Instruct as a promising new direction for black-box LLM instruction optimization.

2024

pdf bib
BlendFilter: Advancing Retrieval-Augmented Large Language Models via Query Generation Blending and Knowledge Filtering
Haoyu Wang | Ruirui Li | Haoming Jiang | Jinjin Tian | Zhengyang Wang | Chen Luo | Xianfeng Tang | Monica Xiao Cheng | Tuo Zhao | Jing Gao
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Retrieval-augmented Large Language Models (LLMs) offer substantial benefits in enhancing performance across knowledge-intensive scenarios. However, these methods often struggle with complex inputs and encounter difficulties due to noisy knowledge retrieval, notably hindering model effectiveness. To address this issue, we introduce BlendFilter, a novel approach that elevates retrieval-augmented LLMs by integrating query generation blending with knowledge filtering. BlendFilter proposes the blending process through its query generation method, which integrates both external and internal knowledge augmentation with the original query, ensuring comprehensive information gathering. Additionally, our distinctive knowledge filtering module capitalizes on the intrinsic capabilities of the LLM, effectively eliminating extraneous data. We conduct extensive experiments on three open-domain question answering benchmarks, and the findings clearly indicate that our innovative BlendFilter surpasses state-of-the-art baselines significantly.

2023

pdf bib
UseClean: learning from complex noisy labels in named entity recognition
Jinjin Tian | Kun Zhou | Meiguo Wang | Yu Zhang | Benjamin Yao | Xiaohu Liu | Chenlei Guo
Proceedings of the 2023 CLASP Conference on Learning with Small Data (LSD)

We investigate and refine denoising methods for NER task on data that potentially contains extremely noisy labels from multi-sources. In this paper, we first summarized all possible noise types and noise generation schemes, based on which we built a thorough evaluation system. We then pinpoint the bottleneck of current state-of-art denoising methods using our evaluation system. Correspondingly, we propose several refinements, including using a two-stage framework to avoid error accumulation; a novel confidence score utilizing minimal clean supervision to increase predictive power; an automatic cutoff fitting to save extensive hyper-parameter tuning; a warm started weighted partial CRF to better learn on the noisy tokens. Additionally, we propose to use adaptive sampling to further boost the performance in long-tailed entity settings. Our method improves F1 score by on average at least 5 10% over current state-of-art across extensive experiments.