Zhengyang Zhao
2026
CreativeBench: Benchmarking and Enhancing Machine Creativity via Self-Evolving Challenges
Zihan Wang | Lam Nguyen | Zhengyang Zhao | Mengyue Yang | Chengwei Qin | Yujiu Yang | Linyi Yang
Findings of the Association for Computational Linguistics: ACL 2026
The saturation of high-quality pre-training data has shifted research focus toward evolutionary systems capable of continuously generating novel artifacts, leading to the success of AlphaEvolve. However, the progress of such systems is hindered by the lack of rigorous, quantitative evaluation. To tackle this challenge, we introduce CreativeBench, a benchmark for evaluating machine creativity in code generation, grounded in a classical cognitive framework. Comprising two subsets – CreativeBench-Combo and CreativeBench-Explore – the benchmark targets combinatorial and exploratory creativity through an automated pipeline utilizing reverse engineering and self-play. By leveraging executable code, CreativeBench objectively distinguishes creativity from hallucination via a unified metric defined as the product of quality and novelty. Our analysis of state-of-the-art models reveals distinct behaviors: (1) scaling significantly improves combinatorial creativity but yields diminishing returns for exploration; (2) larger models exhibit “convergence-by-scaling,” becoming more correct but less divergent; and (3) reasoning capabilities primarily benefit constrained exploration rather than combination. Finally, we propose EvoRePE, a plug-and-play inference-time steering strategy that internalizes evolutionary search patterns to consistently enhance machine creativity.
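The abstract describes the unified metric only as the product of quality and novelty. A minimal sketch of that idea follows; the function name `creativity_score` and the assumption that both inputs are normalized to [0, 1] are illustrative and are not taken from the paper.

```python
def creativity_score(quality: float, novelty: float) -> float:
    """Return quality * novelty.

    Assumption (not from the paper): both inputs are normalized to [0, 1].
    """
    if not (0.0 <= quality <= 1.0 and 0.0 <= novelty <= 1.0):
        raise ValueError("quality and novelty are assumed to lie in [0, 1]")
    return quality * novelty


if __name__ == "__main__":
    # Hypothetical inputs: a novel but failing solution, a correct but
    # unoriginal one, and a solution that is both correct and fairly novel.
    for quality, novelty in [(0.0, 0.9), (1.0, 0.1), (0.8, 0.7)]:
        print(quality, novelty, creativity_score(quality, novelty))
```

Because the score is a product, an output with zero quality scores zero regardless of how novel it is, which is one way to read the claim that the metric separates creativity from hallucination.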
2023
CL-QR: Cross-Lingual Enhanced Query Reformulation for Multi-lingual Conversational AI Agents
Zhongkai Sun | Zhengyang Zhao | Sixing Lu | Chengyuan Ma | Xiaohu Liu | Xing Fan | Wei Shen | Chenlei Guo
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track
The growing popularity of conversational AI agents such as Alexa, Google Assistant, and Siri relies on accurate spoken language comprehension. The query reformulation (QR) method, which reformulates defective user queries, has been broadly adopted to mitigate the challenges of understanding a user’s intent from imperfect speech recognition results. However, due to the scarcity of non-English QR labels, providing high-quality QR for non-English users remains a challenge. This work proposes a novel cross-lingual QR framework, CL-QR, which leverages the abundant reformulation resources in English to improve non-English QR performance. It also proposes a Module-wise Mutually-supervised Feedback learning (MMF) algorithm that enables CL-QR to continually self-improve, which alleviates the lack of cross-lingual QR training data and enhances the delivery of high-quality reformulations learned in English for multilingual queries. Both offline evaluation and online A/B testing demonstrate the effectiveness of the proposed method.
2022
Fine-grained Multi-lingual Disentangled Autoencoder for Language-agnostic Representation Learning
Zetian Wu | Zhongkai Sun | Zhengyang Zhao | Sixing Lu | Chengyuan Ma | Chenlei Guo
Proceedings of the Massively Multilingual Natural Language Understanding Workshop (MMNLU-22)
Encoding both language-specific and language-agnostic information into a single high-dimensional space is common practice in pre-trained Multi-lingual Language Models (pMLM). Such encoding has been shown to perform effectively on natural language tasks requiring the semantics of the whole sentence (e.g., translation). However, its effectiveness appears to be limited on tasks requiring only partial information from the utterance (e.g., multi-lingual entity retrieval, template retrieval, and semantic alignment). In this work, a novel Fine-grained Multilingual Disentangled Autoencoder (FMDA) is proposed to disentangle fine-grained semantic information from language-specific information in a multi-lingual setting. FMDA successfully extracts disentangled template semantic and residual semantic representations. Experiments conducted on the MASSIVE dataset demonstrate that the disentangled encodings can boost each other during training, thus consistently outperforming the original pMLM and a strong language disentanglement baseline on monolingual template retrieval and cross-lingual semantic retrieval tasks across multiple languages.