NAT: Enhancing Agent Tuning with Negative Samples

Renxi Wang, Xudong Han, Yixuan Zhang, Timothy Baldwin, Haonan Li


Abstract
Interaction trajectories between agents and environments have proven effective in tuning LLMs into task-specific agents. However, constructing these trajectories, especially successful ones, is often computationally expensive and time-consuming, because even the most advanced LLMs, such as GPT-4 and Claude, achieve relatively low success rates. Additionally, common training paradigms such as supervised fine-tuning (SFT) and reinforcement learning (RL) not only require large volumes of data but also place specific demands on the trajectories used. For instance, existing SFT approaches typically use only positive examples, limiting their efficiency in low-resource scenarios. To address this, we introduce Negative-Aware Training (NAT), a straightforward yet effective method that leverages both successful and failed trajectories for fine-tuning, maximizing the utility of limited resources. Experimental results demonstrate that NAT consistently surpasses existing methods, including SFT, DPO, and PPO, across various tasks.
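The abstract does not spell out how failed trajectories enter the SFT objective. One common way to make standard SFT "negative-aware" is to prefix each trajectory with an indicator prompt marking it as a success or a failure, train on both, and use only the positive prompt at inference. The sketch below illustrates that idea only; the hint strings and field names are hypothetical and are not taken from the paper.

```python
# Minimal sketch of negative-aware SFT data preparation, assuming
# trajectories are conditioned on a success/failure indicator prefix.
# All prompt strings and dictionary fields here are hypothetical.

POSITIVE_HINT = "You are an expert agent. The following attempt succeeds."
NEGATIVE_HINT = "You are a novice agent. The following attempt fails."

def to_sft_example(trajectory: dict) -> dict:
    """Convert one agent trajectory into a (prompt, target) SFT pair,
    prefixing a hint that marks it as a success or a failure."""
    hint = POSITIVE_HINT if trajectory["success"] else NEGATIVE_HINT
    prompt = f"{hint}\nTask: {trajectory['task']}"
    target = trajectory["actions"]  # serialized action/observation steps
    return {"prompt": prompt, "target": target}

# At inference time, only POSITIVE_HINT is used, steering the model
# toward the behavior it has associated with successful trajectories.
```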
Anthology ID:
2025.naacl-long.378
Volume:
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month:
April
Year:
2025
Address:
Albuquerque, New Mexico
Editors:
Luis Chiruzzo, Alan Ritter, Lu Wang
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
7385–7398
URL:
https://preview.aclanthology.org/landing_page/2025.naacl-long.378/
Cite (ACL):
Renxi Wang, Xudong Han, Yixuan Zhang, Timothy Baldwin, and Haonan Li. 2025. NAT: Enhancing Agent Tuning with Negative Samples. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 7385–7398, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal):
NAT: Enhancing Agent Tuning with Negative Samples (Wang et al., NAACL 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.naacl-long.378.pdf