Chenyang Yang
2026
Detecting AI-Generated Content on Social Media with Multi-modal Language Models
Chenyang Yang | Shen Yan | Yibo Yang | Litao Hu | Yuchen Liu | Yuan Zeng | Hanchao Yu | Yinan Zhu | Sumedha Singla | Brian Vanover | Huijun Qian | Zihao Wang | Fujun Liu | Aashu Singh | Jianyu Wang | Xuewen Zhang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Generative AI has enabled the creation of photorealistic images and videos that are increasingly disseminated on social media, often used for spam, misinformation, manipulation, and fraud. Existing AI-generated content (AIGC) detection methods face challenges including poor generalization to new generation models, reliance on single modalities, and lack of interpretable explanations. We present our pipeline that mitigates these issues by continuously curating diverse multi-modal social media data and training a compact vision-language model for detection and explanation. Our model achieves state-of-the-art detection performance on public benchmarks and demonstrates robust detection and explanation capabilities on internal social media datasets across multiple platforms. We deployed our model for post recommendation on social media platforms and observed positive downstream impacts on user engagement, demonstrating that it is feasible to perform effective AIGC detection in dynamic, real-world social media environments.
2025
cAST: Enhancing Code Retrieval-Augmented Generation with Structural Chunking via Abstract Syntax Tree
Yilin Zhang | Xinran Zhao | Zora Zhiruo Wang | Chenyang Yang | Jiayi Wei | Tongshuang Wu
Findings of the Association for Computational Linguistics: EMNLP 2025
Retrieval-Augmented Generation (RAG) has become essential for large-scale code generation, grounding predictions in external code corpora to improve factuality. However, a critical yet underexplored aspect of RAG pipelines is chunking—the process of dividing documents into retrievable units. Existing line-based chunking heuristics often break semantic structures, splitting functions or merging unrelated code, which can degrade generation quality. We propose chunking via Abstract Syntax Trees (cAST), a structure-aware method that recursively breaks large AST nodes into smaller chunks and merges sibling nodes while respecting size limits. This approach generates self-contained, semantically coherent units across programming languages and tasks, improving performance on diverse code generation tasks, e.g., boosting Recall@5 by 4.3 points on RepoEval retrieval and Pass@1 by 2.67 points on SWE-bench generation. Our work highlights the importance of structure-aware chunking for scaling retrieval-enhanced code intelligence.
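The split-then-merge idea described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes Python source parsed with the standard `ast` module (Python 3.8+ for `end_lineno`), measures size as the count of non-whitespace characters, and uses hypothetical names (`ast_chunks`, `chunk_nodes`, `max_size`). Oversized nodes are split into their child statements, and adjacent siblings are greedily merged while the merged span stays within the limit.

```python
import ast


def span_size(lines, start, end):
    """Count non-whitespace characters in lines start..end (1-indexed, inclusive)."""
    return sum(len("".join(line.split())) for line in lines[start - 1:end])


def child_statements(node):
    """Direct child statements of a node, in source order."""
    return [c for c in ast.iter_child_nodes(node) if isinstance(c, ast.stmt)]


def chunk_nodes(nodes, lines, max_size):
    """Return (start_line, end_line) spans for a list of sibling statements."""
    chunks, current = [], None
    for node in nodes:
        start, end = node.lineno, node.end_lineno
        # Merge step: extend the open chunk if the combined span still fits.
        if current is not None and span_size(lines, current[0], end) <= max_size:
            current = (current[0], end)
            continue
        if current is not None:
            chunks.append(current)
            current = None
        if span_size(lines, start, end) <= max_size:
            current = (start, end)  # start a new chunk with this node
            continue
        # Split step: the node alone is too large, so recurse into its children.
        children = child_statements(node)
        if not children:
            chunks.append((start, end))  # cannot split further; keep whole
            continue
        sub = chunk_nodes(children, lines, max_size)
        # Simplification: attach the header (e.g. the `def` line) to the first
        # sub-chunk, which may push that chunk slightly over max_size.
        sub[0] = (start, sub[0][1])
        chunks.extend(sub)
    if current is not None:
        chunks.append(current)
    return chunks


def ast_chunks(source, max_size=200):
    """Split Python source into structure-aligned text chunks."""
    lines = source.splitlines()
    spans = chunk_nodes(ast.parse(source).body, lines, max_size)
    return ["\n".join(lines[s - 1:e]) for s, e in spans]
```

Calling `ast_chunks(source, max_size=...)` on a module returns a list of strings whose boundaries fall on statement boundaries. The sketch ignores decorators and may drop connective lines such as `else:` when a compound statement is split; a fuller version would need to handle those cases and other languages' parsers.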
SPHERE: An Evaluation Card for Human-AI Systems
Dora Zhao | Qianou Ma | Xinran Zhao | Chenglei Si | Chenyang Yang | Ryan Louie | Ehud Reiter | Diyi Yang | Tongshuang Wu
Findings of the Association for Computational Linguistics: ACL 2025
In the era of Large Language Models (LLMs), establishing effective evaluation methods and standards for diverse human-AI interaction systems is increasingly challenging. To encourage more transparent documentation and facilitate discussion on human-AI system evaluation design options, we present an evaluation card SPHERE, which encompasses five key dimensions: 1) What is being evaluated?; 2) How is the evaluation conducted?; 3) Who is participating in the evaluation?; 4) When is evaluation conducted?; 5) How is evaluation validated? We conduct a review of 39 human-AI systems using SPHERE, outlining current evaluation practices and areas for improvement. We provide three recommendations for improving the validity and rigor of evaluation practices.
2023
Beyond Testers’ Biases: Guiding Model Testing with Knowledge Bases using LLMs
Chenyang Yang | Rishabh Rustogi | Rachel Brower-Sinning | Grace Lewis | Christian Kaestner | Tongshuang Wu
Findings of the Association for Computational Linguistics: EMNLP 2023
Current model testing work has mostly focused on creating test cases. Identifying what to test is a step that is largely ignored and poorly supported. We propose Weaver, an interactive tool that supports requirements elicitation for guiding model testing. Weaver uses large language models to generate knowledge bases and recommends concepts from them interactively, allowing testers to elicit requirements for further testing. Weaver provides rich external knowledge to testers and encourages testers to systematically explore diverse concepts beyond their own biases. In a user study, we show that both NLP experts and non-experts identified more, as well as more diverse, concepts worth testing when using Weaver. Collectively, they found more than 200 failing test cases for stance detection with zero-shot ChatGPT. Our case studies further show that Weaver can help practitioners test models in real-world settings, where developers define more nuanced application scenarios (e.g., code understanding and transcript summarization) using LLMs.
Co-authors
- Tongshuang Wu 3
- Xinran Zhao 2
- Rachel Brower-Sinning 1
- Litao Hu 1
- Christian Kaestner 1
- Grace Lewis 1
- Fujun Liu 1
- Yuchen Liu (刘雨辰) 1
- Ryan Louie 1
- Qianou Ma 1
- Huijun Qian 1
- Ehud Reiter 1
- Rishabh Rustogi 1
- Chenglei Si 1
- Aashu Singh 1
- Sumedha Singla 1
- Brian Vanover 1
- Jianyu Wang 1
- Zihao Wang 1
- Zora Zhiruo Wang 1
- Jiayi Wei 1
- Shen Yan 1
- Diyi Yang 1
- Yibo Yang 1
- Hanchao Yu 1
- Yuan Zeng 1
- Xuewen Zhang 1
- Yilin Zhang 1
- Dora Zhao 1
- Yinan Zhu 1