Gang Wu
2026
Skill Discovery for Software Scripting Automation via Offline Simulations with LLMs
Paiheng Xu | Gang Wu | Xiang Chen | Tong Yu | Chang Xiao | Franck Dernoncourt | Tianyi Zhou | Wei Ai | Viswanathan Swaminathan
Findings of the Association for Computational Linguistics: EACL 2026
Scripting interfaces enable users to automate tasks and customize software workflows, but creating scripts traditionally requires programming expertise and familiarity with specific APIs, posing barriers for many users. While Large Language Models (LLMs) can generate code from natural language queries, runtime code generation is severely limited by unverified code, security risks, longer response times, and higher computational costs. To bridge this gap, we propose an offline simulation framework to curate a software-specific skillset—a collection of verified scripts—by exploiting LLMs and publicly available scripting guides. Our framework comprises two components: (1) task creation, using top-down functionality guidance and bottom-up API synergy exploration to generate helpful tasks; and (2) skill generation with trials, refining and validating scripts based on execution feedback. To efficiently navigate the extensive API landscape, we introduce a Graph Neural Network (GNN)-based link prediction model to capture API synergy, enabling the generation of skills involving underutilized APIs and expanding the skillset’s diversity. Experiments with Adobe Illustrator demonstrate that our framework significantly improves automation success rates, reduces response time, and saves runtime token costs compared to traditional runtime code generation. This is the first attempt to use software scripting interfaces as a testbed for LLM-based systems, highlighting the advantages of leveraging execution feedback in a controlled environment and offering valuable insights into aligning AI capabilities with user needs in specialized software domains.
2025
GUI-Bee: Align GUI Action Grounding to Novel Environments via Autonomous Exploration
Yue Fan | Handong Zhao | Ruiyi Zhang | Yu Shen | Xin Eric Wang | Gang Wu
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Graphical User Interface (GUI) action grounding, mapping language instructions to actionable elements on GUI screens, is important for assisting users in interactive tutorials, task automation, accessibility support, etc. Most recent work on GUI action grounding uses large GUI datasets to fine-tune Multimodal Large Language Models (MLLMs). However, the fine-tuning data is inherently limited to specific GUI environments, leading to significant performance degradation in novel environments due to the generalization challenges in the GUI domain. Therefore, we argue that GUI action grounding models should be further aligned with novel environments before deployment to optimize their performance. To address this, we first propose GUI-Bee, an MLLM-based autonomous agent, to collect high-quality, environment-specific data through exploration and then continuously fine-tune GUI grounding models with the collected data. To ensure the GUI action grounding models generalize to various screens within the target novel environment after the continuous fine-tuning, we equip GUI-Bee with a novel Q-value-Incentive In-Context Reinforcement Learning (Q-ICRL) algorithm that optimizes exploration efficiency and exploration data quality. In our experiments, we introduce NovelScreenSpot to test how well the data can help align GUI action grounding models to novel environments. Furthermore, we conduct an ablation study to validate the Q-ICRL method in enhancing the efficiency of GUI-Bee.
GUI Agents: A Survey
Dang Nguyen | Jian Chen | Yu Wang | Gang Wu | Namyong Park | Zhengmian Hu | Hanjia Lyu | Junda Wu | Ryan Aponte | Yu Xia | Xintong Li | Jing Shi | Hongjie Chen | Viet Dac Lai | Zhouhang Xie | Sungchul Kim | Ruiyi Zhang | Tong Yu | Mehrab Tanjim | Nesreen K. Ahmed | Puneet Mathur | Seunghyun Yoon | Lina Yao | Branislav Kveton | Jihyung Kil | Thien Huu Nguyen | Trung Bui | Tianyi Zhou | Ryan A. Rossi | Franck Dernoncourt
Findings of the Association for Computational Linguistics: ACL 2025
Graphical User Interface (GUI) agents, powered by Large Foundation Models, have emerged as a transformative approach to automating human-computer interaction. These agents autonomously interact with digital systems via GUIs, emulating human actions such as clicking, typing, and navigating visual elements across diverse platforms. Motivated by the growing interest and fundamental importance of GUI agents, we provide a comprehensive survey that categorizes their benchmarks, evaluation metrics, architectures, and training methods. We propose a unified framework that delineates their perception, reasoning, planning, and acting capabilities. Furthermore, we identify important open challenges and discuss key future directions. Finally, this work serves as a basis for practitioners and researchers to gain an intuitive understanding of current progress, techniques, benchmarks, and critical open problems that remain to be addressed.
A Survey on Small Language Models
Chien Van Nguyen | Xuan Shen | Ryan Aponte | Yu Xia | Samyadeep Basu | Zhengmian Hu | Jian Chen | Mihir Parmar | Sasidhar Kunapuli | Joe Barrow | Junda Wu | Ashish Singh | Yu Wang | Jiuxiang Gu | Nesreen K. Ahmed | Nedim Lipka | Ruiyi Zhang | Xiang Chen | Tong Yu | Sungchul Kim | Hanieh Deilamsalehy | Namyong Park | Michael Rimer | Zhehao Zhang | Huanrui Yang | Puneet Mathur | Gang Wu | Franck Dernoncourt | Ryan A. Rossi | Thien Huu Nguyen
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
Small Language Models (SLMs) have become increasingly important due to their efficiency and their ability to perform various language tasks with minimal computational resources, making them ideal for many settings, including on-device, mobile, and edge deployments, among many others. In this article, we present a comprehensive survey on SLMs, focusing on their architectures, training techniques, and model compression techniques. We propose a novel taxonomy for categorizing the methods used to optimize SLMs, including model compression, pruning, and quantization techniques. We summarize the benchmark datasets that are useful for benchmarking SLMs, along with the evaluation metrics commonly used. Additionally, we highlight key open challenges that remain to be addressed. Our survey aims to serve as a valuable resource for researchers and practitioners interested in developing and deploying small yet efficient language models.
2022
BotSIM: An End-to-End Bot Simulation Framework for Commercial Task-Oriented Dialog Systems
Guangsen Wang | Samson Tan | Shafiq Joty | Gang Wu | Jimmy Au | Steven C.H. Hoi
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
We present BotSIM, a data-efficient end-to-end Bot SIMulation framework for commercial task-oriented dialog (TOD) systems. BotSIM consists of three major components: 1) a Generator that can infer semantic-level dialog acts and entities from bot definitions and generate user queries via model-based paraphrasing; 2) an agenda-based dialog user Simulator (ABUS) to simulate conversations with the dialog agents; 3) a Remediator to analyze the simulated conversations, visualize the bot health reports, and provide actionable remediation suggestions for bot troubleshooting and improvement. We demonstrate BotSIM’s effectiveness in end-to-end evaluation, remediation, and multi-intent dialog generation via case studies on two commercial bot platforms. BotSIM’s “generation-simulation-remediation” paradigm accelerates the end-to-end bot evaluation and iteration process by: 1) reducing manual test case creation effort; 2) enabling a holistic gauge of the bot in terms of NLU and end-to-end performance via extensive dialog simulation; 3) improving the bot troubleshooting process with actionable suggestions. A demo of our system can be found at https://tinyurl.com/mryu74cd and a demo video at https://youtu.be/qLPJm6_UOKY.
Co-authors
- Franck Dernoncourt 3
- Tong Yu 3
- Ruiyi Zhang 3
- Nesreen K. Ahmed 2
- Ryan Aponte 2
- Xiang Chen 2
- Zhengmian Hu 2
- Sungchul Kim 2
- Puneet Mathur 2
- Thien Huu Nguyen 2
- Namyong Park 2
- Ryan A. Rossi 2
- Junda Wu 2
- Yu Xia 2
- Tianyi Zhou 2
- Wei Ai 1
- Jimmy Au 1
- Joe Barrow 1
- Samyadeep Basu 1
- Trung Bui 1
- Jian Chen 1
- Hongjie Chen 1
- Jian Chen 1
- Hanieh Deilamsalehy 1
- Yue Fan 1
- Jiuxiang Gu 1
- Steven C.H. Hoi 1
- Shafiq Joty 1
- Jihyung Kil 1
- Sasidhar Kunapuli 1
- Branislav Kveton 1
- Viet Dac Lai 1
- Xintong Li 1
- Nedim Lipka 1
- Hanjia Lyu 1
- Dang Nguyen 1
- Chien Van Nguyen 1
- Mihir Parmar 1
- Michael Rimer 1
- Yu Shen 1
- Xuan Shen 1
- Jing Shi 1
- Ashish Singh 1
- Viswanathan Swaminathan 1
- Samson Tan 1
- Mehrab Tanjim 1
- Guangsen Wang 1
- Xin Eric Wang 1
- Yu Wang 1
- Yu Wang 1
- Chang Xiao 1
- Zhouhang Xie 1
- Paiheng Xu 1
- Huanrui Yang 1
- Lina Yao 1
- Seunghyun Yoon 1
- Zhehao Zhang 1
- Handong Zhao 1