Hanchao Yu
2026
Detecting AI-Generated Content on Social Media with Multi-modal Language Models
Chenyang Yang | Shen Yan | Yibo Yang | Litao Hu | Yuchen Liu | Yuan Zeng | Hanchao Yu | Yinan Zhu | Sumedha Singla | Brian Vanover | Huijun Qian | Zihao Wang | Fujun Liu | Aashu Singh | Jianyu Wang | Xuewen Zhang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Generative AI has enabled the creation of photorealistic images and videos that are increasingly disseminated on social media, often used for spam, misinformation, manipulation, and fraud. Existing AI-generated content (AIGC) detection methods face challenges including poor generalization to new generation models, reliance on single modalities, and lack of interpretable explanations. We present our pipeline that mitigates these issues by continuously curating diverse multi-modal social media data and training a compact vision-language model for detection and explanation. Our model achieves state-of-the-art detection performance on public benchmarks and demonstrates robust detection and explanation capabilities on internal social media datasets across multiple platforms. We deployed our model for post recommendation on social media platforms and observed positive downstream impacts on user engagement, demonstrating that it is feasible to perform effective AIGC detection in dynamic, real-world social media environments.
2025
Inference Compute-Optimal Video Vision Language Models
Peiqi Wang | ShengYun Peng | Xuewen Zhang | Hanchao Yu | Yibo Yang | Lifu Huang | Fujun Liu | Qifan Wang
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
This work investigates the optimal allocation of inference compute across three key scaling factors in video vision language models: language model size, frame count, and the number of visual tokens per frame. While prior work typically focuses on optimizing model efficiency or improving performance without considering resource constraints, we instead identify the optimal model configuration under fixed inference compute budgets. We conduct large-scale training sweeps and careful parametric modeling of task performance to identify the inference compute-optimal frontier. Our experiments reveal how task performance depends on scaling factors and finetuning data size, as well as how changes in data size shift the compute-optimal frontier. These findings translate to practical tips for selecting these scaling factors.
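The idea of choosing a configuration under a fixed inference budget can be illustrated with a small sketch. Everything numeric here is an assumption made for illustration: the search grid, the cost rule of roughly 2 FLOPs per parameter per visual token, and the additive power-law error model in `predicted_error` are hypothetical and are not taken from the paper.

```python
from itertools import product

# Hypothetical search grid; the values are illustrative, not from the paper.
LM_SIZES = [1e9, 3e9, 8e9]          # language model parameters
FRAME_COUNTS = [8, 16, 32, 64]      # frames sampled per video
TOKENS_PER_FRAME = [16, 64, 144]    # visual tokens per frame


def inference_flops(n_params, frames, tokens):
    """Rough cost: about 2 FLOPs per parameter per visual token (assumed)."""
    return 2.0 * n_params * frames * tokens


def predicted_error(n_params, frames, tokens):
    """Made-up additive power law standing in for a fitted model; lower is better."""
    return 0.1 + 5.0 * n_params ** -0.2 + 0.8 * frames ** -0.5 + 0.6 * tokens ** -0.4


def best_config(budget_flops):
    """Return the lowest-error configuration whose cost fits the budget, or None."""
    feasible = [
        cfg for cfg in product(LM_SIZES, FRAME_COUNTS, TOKENS_PER_FRAME)
        if inference_flops(*cfg) <= budget_flops
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda cfg: predicted_error(*cfg))
```

Calling `best_config` over a range of budgets traces out a frontier for this toy model; with a real fitted performance model the same loop would apply, but the chosen configurations would differ.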
2023
RoAST: Robustifying Language Models via Adversarial Perturbation with Selective Training
Jaehyung Kim | Yuning Mao | Rui Hou | Hanchao Yu | Davis Liang | Pascale Fung | Qifan Wang | Fuli Feng | Lifu Huang | Madian Khabsa
Findings of the Association for Computational Linguistics: EMNLP 2023
Fine-tuning pre-trained language models (LMs) has become the de facto standard in many NLP tasks. Nevertheless, fine-tuned LMs are still prone to robustness issues, such as adversarial robustness and model calibration. Several perspectives of robustness for LMs have been studied independently, but a unified consideration across multiple perspectives is lacking. In this paper, we propose Robustifying LMs via Adversarial perturbation with Selective Training (RoAST), a simple yet effective fine-tuning technique to enhance the multi-perspective robustness of LMs in a unified way. RoAST effectively incorporates two important sources of model robustness: robustness on perturbed inputs and generalizable knowledge in pre-trained LMs. To be specific, RoAST introduces adversarial perturbation during fine-tuning while the model parameters are selectively updated based on their relative importance to minimize unnecessary deviation. Under a unified evaluation of fine-tuned LMs that incorporates four representative perspectives of model robustness, we demonstrate the effectiveness of RoAST compared to state-of-the-art fine-tuning methods on six different types of LMs, which indicates its usefulness in practice.
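A toy sketch can make the two ingredients named in the abstract concrete: perturb the input adversarially, then update only a subset of parameters. This is not the RoAST implementation. The logistic-regression model, the sign-of-gradient perturbation, and the use of gradient magnitude as the importance score are all stand-ins assumed for illustration, and `selective_adversarial_step` is a hypothetical name.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gradients(w, x, y):
    """Gradients of the logistic loss w.r.t. weights and input for one example."""
    err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
    grad_w = [err * xi for xi in x]
    grad_x = [err * wi for wi in w]
    return grad_w, grad_x


def selective_adversarial_step(w, x, y, eps=0.1, lr=0.5, keep=2):
    # 1) Perturb the input in the direction that increases the loss
    #    (sign-of-gradient perturbation, assumed here as a simple stand-in).
    _, grad_x = gradients(w, x, y)
    x_adv = [
        xi + eps * math.copysign(1.0, g) if g != 0 else xi
        for xi, g in zip(x, grad_x)
    ]
    # 2) Compute the weight gradient on the perturbed input.
    grad_w, _ = gradients(w, x_adv, y)
    # 3) Update only the `keep` weights with the largest gradient magnitude
    #    (assumed importance score); all other weights stay at their old values.
    ranked = sorted(range(len(w)), key=lambda i: abs(grad_w[i]), reverse=True)
    selected = set(ranked[:keep])
    return [wi - lr * grad_w[i] if i in selected else wi for i, wi in enumerate(w)]
```

Leaving the unselected weights untouched is what keeps the model close to its starting point in this toy setting, which mirrors the stated goal of minimizing unnecessary deviation from the pre-trained parameters.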
APrompt: Attention Prompt Tuning for Efficient Adaptation of Pre-trained Language Models
Qifan Wang | Yuning Mao | Jingang Wang | Hanchao Yu | Shaoliang Nie | Sinong Wang | Fuli Feng | Lifu Huang | Xiaojun Quan | Zenglin Xu | Dongfang Liu
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
With the continuous growth of large language models, the process of fine-tuning these models for new tasks has become increasingly parameter-intensive. Prompt tuning, a method that involves tuning a small set of soft prompts, has emerged as an effective and efficient approach for adapting large pre-trained language models. However, most existing prompt tuning approaches only introduce prompts at the input layer, limiting their performance and leaving considerable room for improvement. In this work, we propose a novel Attention Prompt tuning method, namely APrompt, for efficient adaptation of pre-trained language models. We first demonstrate that existing prompt tuning can be considered as a special case of attention prompt tuning. We then formally introduce APrompt, which incorporates query, key, and value prompts into the attention layer to guide the attention computation during fine-tuning. Experimental results on the SuperGLUE benchmark consistently demonstrate that our proposed approach outperforms state-of-the-art baselines and full fine-tuning with pre-trained models at different scales. In addition, a comprehensive set of ablation studies validates the effectiveness of the prompt design, as well as the efficiency of our approach.
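As a rough illustration of injecting prompts into attention rather than only at the input layer, the sketch below prepends extra vectors to the queries, keys, and values of a single-head attention computation. The abstract does not say how APrompt wires the three prompt types in, so simple concatenation is an assumption here, and `prompted_attention` is a hypothetical helper, not the authors' code.

```python
import math


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Single-head scaled dot-product attention over lists of vectors."""
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
        outputs.append([
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ])
    return outputs


def prompted_attention(queries, keys, values, q_prompts, k_prompts, v_prompts):
    """Prepend prompt vectors to Q, K and V before attending (assumed wiring).

    k_prompts and v_prompts must contain the same number of vectors.
    """
    return attention(q_prompts + queries, k_prompts + keys, v_prompts + values)
```

In a prompt-tuning setup only the prompt vectors would be trained while the backbone weights stay frozen; this sketch shows just the forward computation.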
Generating Hashtags for Short-form Videos with Guided Signals
Tiezheng Yu | Hanchao Yu | Davis Liang | Yuning Mao | Shaoliang Nie | Po-Yao Huang | Madian Khabsa | Pascale Fung | Yi-Chia Wang
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Short-form video hashtag recommendation (SVHR) aims to recommend hashtags to content creators from videos and corresponding descriptions. Most prior studies regard SVHR as a classification or ranking problem and select hashtags from a set of limited candidates. However, in reality, users can create new hashtags, and trending hashtags change rapidly over time on social media. Neither of these properties can be easily modeled with classification approaches. To bridge this gap, we formulate SVHR as a generation task that better represents how hashtags are created naturally. Additionally, we propose the Guided Generative Model (GGM) where we augment the input features by retrieving relevant hashtags from a large-scale hashtag pool as extra guidance signals. Experimental results on two short-form video datasets show that our generative models outperform strong classification baselines, and the guidance signals further boost the performance by 8.11 and 2.17 absolute ROUGE-1 scores on average, respectively. We also perform extensive analyses including human evaluation, demonstrating that our generative model can create meaningful and relevant novel hashtags while achieving state-of-the-art performance on known hashtags.
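The retrieve-then-augment step described for GGM can be sketched on text alone. The word-overlap retriever, the `[GUIDANCE]` separator, and both function names are hypothetical choices made for illustration; the abstract does not specify the retriever or the input format, and the real model also consumes video features.

```python
def retrieve_hashtags(description, hashtag_pool, top_k=3):
    """Rank pool hashtags by word overlap with the description (a stand-in retriever)."""
    words = set(description.lower().split())
    scored = []
    for tag in hashtag_pool:
        tag_words = set(tag.lstrip("#").lower().split("_"))
        overlap = len(words & tag_words)
        if overlap:
            scored.append((overlap, tag))
    # Highest overlap first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [tag for _, tag in scored[:top_k]]


def build_generator_input(description, hashtag_pool, top_k=3):
    """Append retrieved hashtags to the description as guidance for a generator."""
    guidance = " ".join(retrieve_hashtags(description, hashtag_pool, top_k))
    return f"{description} [GUIDANCE] {guidance}".strip()
```

The resulting string would be passed to a sequence-to-sequence generator, which remains free to emit hashtags that are not in the pool; the retrieved tags act only as hints.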
Co-authors
- Lifu Huang 3
- Yuning Mao 3
- Qifan Wang 3
- Fuli Feng 2
- Pascale Fung 2
- Madian Khabsa 2
- Davis Liang 2
- Fujun Liu 2
- Shaoliang Nie 2
- Yibo Yang 2
- Xuewen Zhang 2
- Rui Hou 1
- Litao Hu 1
- Po-Yao Huang 1
- Jaehyung Kim 1
- Dongfang Liu 1
- Yuchen Liu (刘雨辰) 1
- ShengYun Peng 1
- Huijun Qian 1
- Xiaojun Quan 1
- Aashu Singh 1
- Sumedha Singla 1
- Brian Vanover 1
- Jianyu Wang 1
- Jingang Wang 1
- Peiqi Wang 1
- Sinong Wang 1
- Yi-Chia Wang 1
- Zihao Wang 1
- Zenglin Xu 1
- Shen Yan 1
- Chenyang Yang 1
- Tiezheng Yu 1
- Yuan Zeng 1
- Yinan Zhu 1