TT-SI: Self-Improving LLM Agents with Test-Time Training

Emre Can Acikgoz; Cheng Qian; Heng Ji; Dilek Hakkani-Tür; Gokhan Tur

Abstract

One paradigm of language model (LM) fine-tuning relies on creating large training datasets, under the assumption that high quantity and diversity will enable models to generalize to novel tasks after post-training. In practice, gathering large sets of data is inefficient, and training on them is prohibitively expensive; worse, there is no guarantee that the resulting model will handle complex scenarios or generalize better. Moreover, existing techniques rarely assess whether a training sample provides novel information, resulting in unnecessary costs. In this work, we explore a new Test-Time Self-Improvement (TT-SI) algorithm to create more effective and generalizable agentic LMs on-the-fly. TT-SI can be summarized in three steps: (i) it first identifies the samples that the model struggles with (self-awareness), (ii) it then generates similar examples from the detected uncertain samples (self-data augmentation), and (iii) it uses these newly generated samples for test-time training (self-improvement). We further explore Test-Time Distillation (TT-D), which leverages a stronger supervisor for targeted data generation. Empirical evaluations across different agent benchmarks demonstrate that TT-SI improves performance by +5.48% absolute accuracy on average across all benchmarks and surpasses other standard learning methods while being more efficient. Our findings highlight the promise of TT-SI, demonstrating the potential of self-improvement algorithms at test-time as a new paradigm for building more capable agents toward self-evolution.
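
The three steps named in the abstract can be read as a per-sample control flow. The following is a minimal sketch of that flow only, not the authors' implementation: the function `tt_si_predict`, the callables `confidence`, `augment`, `adapt`, and `predict`, and the parameters `threshold` and `num_synthetic` are all hypothetical names introduced here for illustration. It also assumes the adapted model is used only for the current sample and the base model is left unchanged, which the abstract does not state.

```python
from typing import Any, Callable, List


def tt_si_predict(
    model: Any,
    sample: Any,
    confidence: Callable[[Any, Any], float],
    augment: Callable[[Any, Any, int], List[Any]],
    adapt: Callable[[Any, List[Any]], Any],
    predict: Callable[[Any, Any], Any],
    threshold: float = 0.5,      # hypothetical cutoff for "uncertain"
    num_synthetic: int = 4,      # hypothetical number of generated examples
) -> Any:
    """Answer one test sample, adapting first if the model looks uncertain."""
    # (i) Self-awareness: confident samples are answered directly.
    if confidence(model, sample) >= threshold:
        return predict(model, sample)
    # (ii) Self-data augmentation: generate examples similar to the hard sample.
    synthetic = augment(model, sample, num_synthetic)
    # (iii) Self-improvement: train on the generated examples, then answer.
    # Assumption: `adapt` returns a new model and does not mutate `model`.
    adapted = adapt(model, synthetic)
    return predict(adapted, sample)


# Toy stand-ins so the sketch runs end to end: the "model" is a lookup table.
def toy_confidence(model: dict, x: str) -> float:
    return 1.0 if x in model else 0.0


def toy_augment(model: dict, x: str, k: int) -> List[tuple]:
    return [(x, x.upper())] * k  # placeholder for generated training pairs


def toy_adapt(model: dict, examples: List[tuple]) -> dict:
    return {**model, **dict(examples)}  # copy, so the base model is untouched


def toy_predict(model: dict, x: str) -> str:
    return model.get(x, "unknown")


base = {"ping": "PING"}
easy = tt_si_predict(base, "ping", toy_confidence, toy_augment, toy_adapt, toy_predict)
hard = tt_si_predict(base, "pong", toy_confidence, toy_augment, toy_adapt, toy_predict)
```

Under the same assumptions, the TT-D variant described in the abstract would correspond to passing an `augment` callable backed by a stronger supervisor model instead of the model itself; the rest of the flow stays the same.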

Anthology ID: 2026.findings-acl.462
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 9483–9508
URL: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.462/
Cite (ACL): Emre Can Acikgoz, Cheng Qian, Heng Ji, Dilek Hakkani-Tür, and Gokhan Tur. 2026. TT-SI: Self-Improving LLM Agents with Test-Time Training. In Findings of the Association for Computational Linguistics: ACL 2026, pages 9483–9508, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): TT-SI: Self-Improving LLM Agents with Test-Time Training (Acikgoz et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.462.pdf
Checklist: 2026.findings-acl.462.checklist.pdf
