Sijie Wang
2026
LiveCANNBench: Benchmark SWE AI Coding for Ascend CANN
Sijie Wang | Kai Zhao | Wee Peng Tay | Shuo Zhang | Chengwen Liu | Quanjiang Guo | Ren Junhao | Xin Li | Heng Lian | Jingdi Lei | Rui She | Huacan Wang | Ronghao Chen
Findings of the Association for Computational Linguistics: ACL 2026
AI coding has emerged as a core application of large language models (LLMs), evolving from single-file coding tasks towards complex software engineering (SWE) scenarios. Recent advances in agents have enabled multi-file, multi-language, and dependency-aware AI coding, significantly expanding the scope of AI-assisted software development. While a variety of benchmarks have been proposed to evaluate coding capabilities in general-purpose or GPU coding ecosystems such as CUDA and ROCm, systematic evaluation for Huawei Ascend CANN remains largely underexplored. In this work, we propose LiveCANNBench, an SWE-level benchmark designed for AI coding in the CANN software stack. LiveCANNBench is constructed from real-world CANN repositories and consists of over 400 task instances spanning multi-file, multi-language, and execution-aware coding challenges. Unlike existing static benchmarks that primarily focus on kernel-level code generation, LiveCANNBench adopts a live benchmarking paradigm, effectively mitigating data leakage and enabling more reliable evaluation of modern coding agents.
2025
BANER: Boundary-Aware LLMs for Few-Shot Named Entity Recognition
Quanjiang Guo | Yihong Dong | Ling Tian | Zhao Kang | Yu Zhang | Sijie Wang
Proceedings of the 31st International Conference on Computational Linguistics
Despite the recent success of two-stage prototypical networks in few-shot named entity recognition (NER), challenges such as over/under-detected false spans in the span detection stage and unaligned entity prototypes in the type classification stage persist. Additionally, LLMs have not proven to be effective few-shot information extractors in general. In this paper, we propose an approach called Boundary-Aware LLMs for Few-Shot Named Entity Recognition to address these issues. We introduce a boundary-aware contrastive learning strategy to enhance the LLM’s ability to perceive entity boundaries for generalized entity spans. Additionally, we utilize LoRAHub to align information from the target domain to the source domain, thereby enhancing adaptive cross-domain classification capabilities. Extensive experiments across various benchmarks demonstrate that our framework outperforms prior methods, validating its effectiveness. In particular, the proposed strategies demonstrate effectiveness across a range of LLM architectures. The code and data are released on https://github.com/UESTC-GQJ/BANER.