2025
From Selection to Generation: A Survey of LLM-based Active Learning
Yu Xia | Subhojyoti Mukherjee | Zhouhang Xie | Junda Wu | Xintong Li | Ryan Aponte | Hanjia Lyu | Joe Barrow | Hongjie Chen | Franck Dernoncourt | Branislav Kveton | Tong Yu | Ruiyi Zhang | Jiuxiang Gu | Nesreen K. Ahmed | Yu Wang | Xiang Chen | Hanieh Deilamsalehy | Sungchul Kim | Zhengmian Hu | Yue Zhao | Nedim Lipka | Seunghyun Yoon | Ting-Hao Kenneth Huang | Zichao Wang | Puneet Mathur | Soumyabrata Pal | Koyel Mukherjee | Zhehao Zhang | Namyong Park | Thien Huu Nguyen | Jiebo Luo | Ryan A. Rossi | Julian McAuley
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Active Learning (AL) has been a powerful paradigm for improving model efficiency and performance by selecting the most informative data points for labeling and training. In recent active learning frameworks, Large Language Models (LLMs) have been employed not only for selection but also for generating entirely new data instances and providing more cost-effective annotations. Motivated by the increasing importance of high-quality data and efficient model training in the era of LLMs, we present a comprehensive survey on LLM-based Active Learning. We introduce an intuitive taxonomy that categorizes these techniques and discuss the transformative roles LLMs can play in the active learning loop. We further examine the impact of AL on LLM learning paradigms and its applications across various domains. Finally, we identify open challenges and propose future research directions. This survey aims to serve as an up-to-date resource for researchers and practitioners seeking to gain an intuitive understanding of LLM-based AL techniques and deploy them to new applications.
GUI Agents: A Survey
Dang Nguyen | Jian Chen | Yu Wang | Gang Wu | Namyong Park | Zhengmian Hu | Hanjia Lyu | Junda Wu | Ryan Aponte | Yu Xia | Xintong Li | Jing Shi | Hongjie Chen | Viet Dac Lai | Zhouhang Xie | Sungchul Kim | Ruiyi Zhang | Tong Yu | Mehrab Tanjim | Nesreen K. Ahmed | Puneet Mathur | Seunghyun Yoon | Lina Yao | Branislav Kveton | Jihyung Kil | Thien Huu Nguyen | Trung Bui | Tianyi Zhou | Ryan A. Rossi | Franck Dernoncourt
Findings of the Association for Computational Linguistics: ACL 2025
Graphical User Interface (GUI) agents, powered by Large Foundation Models, have emerged as a transformative approach to automating human-computer interaction. These agents autonomously interact with digital systems via GUIs, emulating human actions such as clicking, typing, and navigating visual elements across diverse platforms. Motivated by the growing interest and fundamental importance of GUI agents, we provide a comprehensive survey that categorizes their benchmarks, evaluation metrics, architectures, and training methods. We propose a unified framework that delineates their perception, reasoning, planning, and acting capabilities. Furthermore, we identify important open challenges and discuss key future directions. Finally, this work serves as a basis for practitioners and researchers to gain an intuitive understanding of current progress, techniques, benchmarks, and critical open problems that remain to be addressed.
Latent Factor Models Meets Instructions: Goal-conditioned Latent Factor Discovery without Task Supervision
Zhouhang Xie | Tushar Khot | Bhavana Dalvi Mishra | Harshit Surana | Julian McAuley | Peter Clark | Bodhisattwa Prasad Majumder
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Instruction-following LLMs have recently allowed systems to discover hidden concepts from a collection of unstructured documents based on a natural language description of the purpose of the discovery (i.e., the goal). Still, the quality of the discovered concepts remains mixed: it depends heavily on the LLM’s reasoning ability and drops when the data is noisy or beyond the LLM’s knowledge. We present Instruct-LF, a goal-oriented latent factor discovery system that integrates an LLM’s instruction-following ability with statistical models to handle large, noisy datasets where LLM reasoning alone falls short. Instruct-LF uses LLMs to propose fine-grained, goal-related properties from documents, estimates their presence across the dataset, and applies gradient-based optimization to uncover hidden factors, where each factor is represented by a cluster of co-occurring properties. We evaluate the latent factors produced by Instruct-LF on movie recommendation, text-world navigation, and legal document categorization tasks. These interpretable representations improve downstream task performance by 5-52% over the best baselines and, in human evaluation, were preferred 1.8 times as often as the best alternative on average.
2024
Few-shot Dialogue Strategy Learning for Motivational Interviewing via Inductive Reasoning
Zhouhang Xie | Bodhisattwa Prasad Majumder | Mengjie Zhao | Yoshinori Maeda | Keiichi Yamada | Hiromi Wakaki | Julian McAuley
Findings of the Association for Computational Linguistics: ACL 2024
We consider the task of building a dialogue system for Motivational Interviewing (MI), i.e., one that can motivate users to adopt positive lifestyle changes. Addressing such a task requires a system that can infer how to motivate the user effectively. We propose DIIR, a framework that learns and applies conversation strategies, in the form of natural language inductive rules, from expert demonstrations. Automatic and human evaluations on instruction-following large language models show that the natural language strategy descriptions discovered by DIIR can improve active listening skills, reduce unsolicited advice, and promote more collaborative and less authoritative conversations, outperforming in-context demonstrations that are over 50 times longer.
Mitigating Hallucination in Fictional Character Role-Play
Nafis Sadeq | Zhouhang Xie | Byungkyu Kang | Prarit Lamba | Xiang Gao | Julian McAuley
Findings of the Association for Computational Linguistics: EMNLP 2024
Role-playing has wide-ranging applications in customer support, embodied agents, and computational social science. The influence of parametric world knowledge of large language models (LLMs) often causes role-playing characters to act out of character and to hallucinate about things outside the scope of their knowledge. In this work, we focus on the evaluation and mitigation of hallucination in fictional character role-play. We introduce a dataset with over 2,000 characters and 72,000 interviews, including 18,000 adversarial questions. We propose RoleFact, a role-playing method that mitigates hallucination by modulating the influence of parametric knowledge using a pre-calibrated confidence threshold. Experiments show that the proposed method improves the factual precision of generated responses by 18% for adversarial questions, with a 44% reduction in temporal hallucination for time-sensitive interviews. The code and the dataset are available at https://github.com/NafisSadeq/rolefact.git.
FUTGA: Towards Fine-grained Music Understanding through Temporally-enhanced Generative Augmentation
Junda Wu | Zachary Novack | Amit Namburi | Jiaheng Dai | Hao-Wen Dong | Zhouhang Xie | Carol Chen | Julian McAuley
Proceedings of the 3rd Workshop on NLP for Music and Audio (NLP4MusA)
We propose FUTGA, a model with fine-grained music understanding capabilities learned from generative augmentation with temporal compositions. We leverage existing music caption datasets and large language models (LLMs) to synthesize fine-grained music captions with structural descriptions and time boundaries for full-length songs. Trained on the proposed synthetic dataset, FUTGA can identify a song’s temporal changes at key transition points and their musical functions, as well as generate detailed descriptions for each music segment. We further introduce a full-length music caption dataset generated by FUTGA as an augmentation of the MusicCaps and Song Describer datasets. Experiments demonstrate the higher quality of the generated captions, which capture the time boundaries of long-form music.
2021
What Models Know About Their Attackers: Deriving Attacker Information From Latent Representations
Zhouhang Xie | Jonathan Brophy | Adam Noack | Wencong You | Kalyani Asthana | Carter Perkins | Sabrina Reis | Zayd Hammoudeh | Daniel Lowd | Sameer Singh
Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP
Adversarial attacks crafted against NLP models are increasingly becoming practical threats. Although various methods have been developed to detect adversarial attacks, securing learning-based NLP systems in practice would require more than identifying and evading perturbed instances. To address these issues, we propose a new set of adversary identification tasks, Attacker Attribute Classification via Textual Analysis (AACTA), that attempts to obtain more detailed information about the attackers from adversarial texts. Specifically, given a piece of adversarial text, we aim to accomplish tasks such as localizing perturbed tokens, identifying the attacker’s access level to the target model, determining the evasion mechanism used, and specifying the perturbation type employed by the attacking algorithm. Our contributions are as follows: we formalize the task of classifying attacker attributes, and create a benchmark on various target models from the sentiment classification and abuse detection domains. We show that signals from BERT models and target models can be used to train classifiers that reveal the properties of the attacking algorithms. We demonstrate that adversarial attacks leave interpretable traces in the feature spaces of both pre-trained language models and target models, making AACTA a promising direction towards more trustworthy NLP systems.