TT-SI: Self-Improving LLM Agents with Test-Time Training
Emre Can Acikgoz, Cheng Qian, Heng Ji, Dilek Hakkani-Tür, Gokhan Tur
Abstract
One paradigm of language model (LM) fine-tuning relies on creating large training datasets, under the assumption that high quantity and diversity will enable models to generalize to novel tasks after post-training. In practice, gathering large sets of data is inefficient, and training on them is prohibitively expensive; worse, there is no guarantee that the resulting model will handle complex scenarios or generalize better. Moreover, existing techniques rarely assess whether a training sample provides novel information, resulting in unnecessary costs. In this work, we explore a new Test-Time Self-Improvement (TT-SI) algorithm to create more effective and generalizable agentic LMs on-the-fly. TT-SI can be summarized in three steps: (i) it identifies the samples that the model struggles with (self-awareness), (ii) it generates similar examples from the detected uncertain samples (self-data augmentation), and (iii) it uses these newly generated samples for test-time training (self-improvement). We further explore Test-Time Distillation (TT-D), which leverages a stronger supervisor for targeted data generation. Empirical evaluations across different agent benchmarks demonstrate that TT-SI improves performance by +5.48% absolute accuracy on average across all benchmarks and surpasses other standard learning methods more efficiently. Our findings highlight the promise of TT-SI, demonstrating the potential of self-improvement algorithms at test-time as a new paradigm for building more capable agents toward self-evolution.
- Anthology ID:
- 2026.findings-acl.462
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2026
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 9483–9508
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.462/
- Cite (ACL):
- Emre Can Acikgoz, Cheng Qian, Heng Ji, Dilek Hakkani-Tür, and Gokhan Tur. 2026. TT-SI: Self-Improving LLM Agents with Test-Time Training. In Findings of the Association for Computational Linguistics: ACL 2026, pages 9483–9508, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- TT-SI: Self-Improving LLM Agents with Test-Time Training (Acikgoz et al., Findings 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.462.pdf
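
The three steps named in the abstract (self-awareness, self-data augmentation, self-improvement) can be read as a per-sample control flow. The following is a minimal sketch of that flow only, not the authors' implementation. Every name in it (`tt_si_step`, `uncertainty`, `augment`, `train_on`, `threshold`, `num_variants`) is hypothetical, and the abstract does not specify the uncertainty measure, the threshold, or how the model is updated. The sketch also assumes that training returns a separate adapted predictor and leaves the base model untouched, which is a design choice made here for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List

# A predictor maps a task description to an answer (hypothetical interface).
Predictor = Callable[[str], str]


@dataclass
class StepResult:
    prediction: str      # answer returned for the test sample
    adapted: bool        # True if test-time training was triggered
    num_synthetic: int   # number of generated training examples used


def tt_si_step(
    sample: str,
    base_predict: Predictor,
    uncertainty: Callable[[str], float],
    augment: Callable[[str, int], List[str]],
    train_on: Callable[[List[str]], Predictor],
    threshold: float = 0.5,   # assumed cutoff, not taken from the paper
    num_variants: int = 3,    # assumed number of generated examples
) -> StepResult:
    """Run one test-time self-improvement step on a single sample."""
    # (i) Self-awareness: skip adaptation when the model is confident.
    if uncertainty(sample) < threshold:
        return StepResult(base_predict(sample), False, 0)

    # (ii) Self-data augmentation: generate examples similar to the
    # uncertain sample.
    synthetic = augment(sample, num_variants)

    # (iii) Self-improvement: train on the generated examples and answer
    # with the adapted predictor. The base predictor is not modified.
    adapted_predict = train_on(synthetic)
    return StepResult(adapted_predict(sample), True, len(synthetic))


# Toy stand-ins so the sketch runs without any model.
def toy_predict(sample: str) -> str:
    return "base:" + sample


def toy_uncertainty(sample: str) -> float:
    return 0.9 if "hard" in sample else 0.1


def toy_augment(sample: str, k: int) -> List[str]:
    return [f"{sample} (variant {i})" for i in range(k)]


def toy_train_on(examples: List[str]) -> Predictor:
    n = len(examples)
    return lambda sample: f"adapted[{n}]:" + sample
```

With the toy stand-ins, a sample containing "hard" triggers augmentation and training, while any other sample is answered directly by the base predictor. For the TT-D variant mentioned in the abstract, `augment` would presumably call a stronger supervisor model instead of the model itself; that substitution is likewise an assumption of this sketch.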