TT-SI: Self-Improving LLM Agents with Test-Time Training

Emre Can Acikgoz, Cheng Qian, Heng Ji, Dilek Hakkani-Tür, Gokhan Tur


Abstract
One paradigm of language model (LM) fine-tuning relies on creating large training datasets, under the assumption that high quantity and diversity will enable models to generalize to novel tasks after post-training. In practice, gathering large sets of data is inefficient, and training on them is prohibitively expensive; worse, there is no guarantee that the resulting model will handle complex scenarios or generalize better. Moreover, existing techniques rarely assess whether a training sample provides novel information, resulting in unnecessary costs. In this work, we explore a new Test-Time Self-Improvement (TT-SI) algorithm to create more effective and generalizable agentic LMs on the fly. TT-SI can be summarized in three steps: (i) it first identifies the samples that the model struggles with (self-awareness), (ii) it then generates similar examples from the detected uncertain samples (self-data augmentation), and (iii) it uses these newly generated samples for test-time training (self-improvement). We further explore Test-Time Distillation (TT-D), which leverages a stronger supervisor for targeted data generation. Empirical evaluations across different agent benchmarks demonstrate that TT-SI improves performance, with a +5.48% absolute accuracy gain on average across all benchmarks, and surpasses other standard learning methods while being more efficient. Our findings highlight the promise of TT-SI, demonstrating the potential of self-improvement algorithms at test time as a new paradigm for building more capable agents toward self-evolution.
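The three steps named in the abstract can be read as a per-sample control flow. The sketch below is only an illustration of that flow under stated assumptions, not the authors' implementation: the function `tt_si_predict`, the `threshold` and `num_variants` parameters, the toy dictionary "model", and the choice to leave the original model unchanged after answering are all hypothetical, since the abstract does not specify how uncertainty is measured, how examples are generated, or how training is performed.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types for illustration only: the "model" is a lookup table.
Model = Dict[str, str]
Example = Tuple[str, str]


def tt_si_predict(
    model: Model,
    sample: str,
    uncertainty: Callable[[Model, str], float],
    augment: Callable[[Model, str, int], List[Example]],
    train: Callable[[Model, List[Example]], Model],
    predict: Callable[[Model, str], str],
    threshold: float = 0.5,   # assumed cutoff, not taken from the paper
    num_variants: int = 3,    # assumed number of generated examples
) -> Tuple[str, bool]:
    """Answer one test sample; return (prediction, whether adaptation ran)."""
    # (i) Self-awareness: skip adaptation when the model is not uncertain.
    if uncertainty(model, sample) < threshold:
        return predict(model, sample), False
    # (ii) Self-data augmentation: generate examples similar to the sample.
    variants = augment(model, sample, num_variants)
    # (iii) Self-improvement: train on the generated examples, then answer.
    adapted = train(model, variants)
    return predict(adapted, sample), True


# Toy stand-ins so the sketch runs end to end (all hypothetical).
def toy_uncertainty(model: Model, sample: str) -> float:
    return 0.0 if sample in model else 1.0


def toy_augment(model: Model, sample: str, k: int) -> List[Example]:
    return [(sample, sample.upper()) for _ in range(k)]


def toy_train(model: Model, examples: List[Example]) -> Model:
    adapted = dict(model)      # copy, so the original model is untouched
    adapted.update(examples)
    return adapted


def toy_predict(model: Model, sample: str) -> str:
    return model.get(sample, "unknown")
```

In this toy setup, a sample the model already "knows" is answered directly, while an unseen sample triggers augmentation and a temporary update. Passing the four stages as callables keeps the control flow separate from any particular uncertainty estimator or training procedure.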
Anthology ID:
2026.findings-acl.462
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
9483–9508
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.462/
Cite (ACL):
Emre Can Acikgoz, Cheng Qian, Heng Ji, Dilek Hakkani-Tür, and Gokhan Tur. 2026. TT-SI: Self-Improving LLM Agents with Test-Time Training. In Findings of the Association for Computational Linguistics: ACL 2026, pages 9483–9508, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
TT-SI: Self-Improving LLM Agents with Test-Time Training (Acikgoz et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.462.pdf
Checklist:
 2026.findings-acl.462.checklist.pdf