Xin Peng
2026
Taming System Complexity: Demystifying Software Engineering Agents in Diagnosing Linux Kernel Faults
Zhenhao Zhou | Zhuochen Huang | Yike He | Chong Wang | Jiajun Wang | Yijian Wu | Xin Peng | Yiling Lou
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The Linux kernel is a critical system, serving as the foundation for numerous systems. Bugs in the Linux kernel can cause serious consequences, affecting billions of users. Fault localization (FL), which aims at identifying the buggy code elements in software, plays an essential role in software quality assurance. While recent LLM agents have achieved promising accuracy in FL on recent benchmarks like SWE-bench, it remains unclear how well these methods perform in the Linux kernel, where FL is much more challenging due to the large-scale code base, limited observability, and diverse impact factors. In this paper, we introduce LinuxFLBench, an FL benchmark constructed from real-world Linux kernel bugs. We conduct an empirical study to assess the performance of state-of-the-art LLM agents on the Linux kernel. Our initial results reveal that existing agents struggle with this task, achieving a best top-1 accuracy of only 41.6% at file level. To address this challenge, we propose LinuxFL+, an enhancement framework designed to improve the FL effectiveness of LLM agents for the Linux kernel. LinuxFL+ substantially improves the FL accuracy of all studied agents (e.g., a 7.2% to 11.2% accuracy increase) at minimal cost.
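For readers unfamiliar with the metric, file-level top-k accuracy is commonly computed as the share of bugs for which at least one truly buggy file appears among the first k ranked candidates. The sketch below assumes that common definition; the abstract does not spell out the exact scoring used by LinuxFLBench, and the function name, bug IDs, and file paths are hypothetical.

```python
from typing import Dict, List, Set


def top_k_accuracy(ranked: Dict[str, List[str]],
                   truth: Dict[str, Set[str]],
                   k: int = 1) -> float:
    """Fraction of bugs whose top-k ranked files contain a truly buggy file.

    Assumed definition (not taken from the paper): a bug counts as a hit
    if any ground-truth file appears among the first k predictions.
    """
    if not truth:
        return 0.0
    hits = sum(
        1
        for bug_id, buggy_files in truth.items()
        if buggy_files & set(ranked.get(bug_id, [])[:k])
    )
    return hits / len(truth)


# Hypothetical predictions and ground truth, for illustration only.
ranked = {
    "bug-1": ["drivers/net/a.c", "kernel/sched/core.c"],
    "bug-2": ["fs/ext4/inode.c", "mm/slab.c"],
}
truth = {
    "bug-1": {"kernel/sched/core.c"},
    "bug-2": {"fs/ext4/inode.c"},
}

top1 = top_k_accuracy(ranked, truth, k=1)
top2 = top_k_accuracy(ranked, truth, k=2)
```

With this toy data, only the second bug is localized at rank 1, while both are found within the top 2.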
ToolPRM: Fine-Grained Inference Scaling of Structured Outputs for Function Calling
Jianghao Lin | Yuanyuan Shi | Xin Peng | Renjie Ding | Hairui Wang | Yuxuan Peng | Bizhe Bai | Weixi Song | Fengshuo Bai | Huacan Chai | Weinan Zhang | Fei Huang | Ying Wen
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large language models (LLMs) excel at function calling, but inference scaling has been explored mainly for unstructured generation. We propose an inference-scaling framework for structured outputs that combines fine-grained beam search with ToolPRM, a process reward model scoring each intra-call decision (function name and argument filling). We build the first fine-grained intra-call supervision dataset via function masking, rollout collection, and step-level annotation. ToolPRM outperforms outcome and coarse-grained reward models in predictive accuracy and yields consistent test-time gains on multiple function-calling benchmarks. We further show that structured generation follows “explore more but retain less”, since early JSON errors are unrecoverable.
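The idea of scoring each intra-call decision and pruning a beam accordingly can be illustrated with a toy sketch. This is not the authors' implementation: the `explore` and `retain` parameters, the candidate lists, and the scorer (a stand-in that compares a partial call against a fixed reference) are all assumptions made for illustration, where a real system would use a learned process reward model.

```python
from typing import Callable, List, Sequence, Tuple

Step = str
Scorer = Callable[[Sequence[Step]], float]


def beam_search(candidates_per_step: List[List[Step]],
                score: Scorer,
                explore: int,
                retain: int) -> List[Tuple[float, List[Step]]]:
    """Step-level beam search over a structured function call.

    At each decision (function name, then each argument) every kept beam
    is expanded with up to `explore` candidates, and only the `retain`
    highest-scoring partial calls survive to the next step.
    """
    beams: List[Tuple[float, List[Step]]] = [(0.0, [])]
    for options in candidates_per_step:
        expanded = []
        for _, prefix in beams:
            for option in options[:explore]:
                partial = prefix + [option]
                expanded.append((score(partial), partial))
        expanded.sort(key=lambda item: item[0], reverse=True)
        beams = expanded[:retain]
    return beams


# Hypothetical reference call; a toy substitute for a learned reward model.
REFERENCE = ["get_weather", "city=Paris", "unit=celsius"]


def toy_scorer(partial: Sequence[Step]) -> float:
    """Share of steps so far that match the reference call."""
    matches = sum(a == b for a, b in zip(partial, REFERENCE))
    return matches / len(partial)


candidates = [
    ["get_time", "get_weather"],
    ["city=Paris", "city=Rome"],
    ["unit=celsius", "unit=kelvin"],
]

best = beam_search(candidates, toy_scorer, explore=2, retain=1)[0]
```

Here a wide `explore` with a narrow `retain` mirrors the "explore more but retain less" observation: a wrong function name chosen at the first step cannot be repaired by later argument choices, so weak prefixes are dropped early.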
2024
Unlocking Parameter-Efficient Fine-Tuning for Low-Resource Language Translation
Tong Su | Xin Peng | Sarubi Thillainathan | David Guzmán | Surangika Ranathunga | En-Shiun Lee
Findings of the Association for Computational Linguistics: NAACL 2024
Parameter-efficient fine-tuning (PEFT) methods are increasingly vital in adapting large-scale pre-trained language models for diverse tasks, offering a balance between adaptability and computational efficiency. They are important in Low-Resource Language (LRL) Neural Machine Translation (NMT) to enhance translation accuracy with minimal resources. However, their practical effectiveness varies significantly across different languages. We conducted comprehensive empirical experiments with varying LRL domains and sizes to evaluate the performance of 8 PEFT methods with a total of 15 architectures using the SacreBLEU score. We showed that 6 PEFT architectures outperform the baseline for both in-domain and out-of-domain tests and that the Houlsby+Inversion adapter has the best performance overall, proving the effectiveness of PEFT methods.
ZSEE: A Dataset based on Zeolite Synthesis Event Extraction for Automated Synthesis Platform
Song He | Xin Peng | Yihan Cai | Xin Li | Zhiqing Yuan | WenLi Du | Weimin Yang
Findings of the Association for Computational Linguistics: NAACL 2024
Automated synthesis of zeolite, one of the most important catalysts in chemical industries, holds great significance for attaining economic and environmental benefits. Structural synthesis data extracted through NLP technologies from zeolite experimental procedures can significantly expedite automated synthesis owing to its machine readability. However, the utilization of NLP technologies in information extraction of zeolite synthesis remains restricted due to the lack of annotated datasets. In this paper, we formulate an event extraction task to mine structural synthesis actions from experimental narratives for modular automated synthesis. Furthermore, we introduce ZSEE, a novel dataset containing fine-grained event annotations of zeolite synthesis actions. Our dataset features 16 event types and 13 argument roles which cover all the experimental operational steps of zeolite synthesis. We explore current state-of-the-art event extraction methods on ZSEE, perform error analysis based on the experimental results, and summarize the challenges and corresponding research directions to further facilitate the automated synthesis of zeolites. The code is publicly available at https://github.com/Hi-0317/ZSEE.
Co-authors
- Bizhe Bai 1
- Fengshuo Bai 1
- Yihan Cai 1
- Huacan Chai 1
- Renjie Ding 1
- WenLi Du 1
- David Guzmán 1
- Song He 1
- Yike He 1
- Fei Huang 1
- Zhuochen Huang 1
- En-Shiun Lee 1
- Xin Li 1
- Jianghao Lin 1
- Yiling Lou 1
- Yuxuan Peng 1
- Surangika Ranathunga 1
- Yuanyuan Shi 1
- Weixi Song 1
- Tong Su 1
- Sarubi Thillainathan 1
- Chong Wang 1
- Hairui Wang 1
- Jiajun Wang 1
- Ying Wen 1
- Yijian Wu 1
- Weimin Yang 1
- Zhiqing Yuan 1
- Weinan Zhang 1
- Zhenhao Zhou 1