Li Chen
Papers on this page may belong to the following people: Li Chen, Li Chen
2026
Jailbreak-Zero: A Path to Pareto Optimal Red Teaming for Large Language Models
Kai Hu | Abhinav Aggarwal | Mehran Khodabandeh | David Zhang | Eric Hsin | Li Chen | Ankit Jain | Matt Fredrikson | Akash Bharadwaj
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
This paper presents a novel Automated Red Teaming (ART) framework that shifts from example-based to policy-based evaluation, addressing critical limitations in scalability and validity. We define harmful content through abstract safety policies rather than specific static examples. We also introduce multiple evaluation objectives: risk coverage, semantic diversity, and fidelity, and discover Pareto trade-offs between them. We propose Jailbreak-Zero, a black-box method capable of both zero-shot generation and fine-tuned exploitation of a victim’s vulnerabilities to achieve Pareto optimality. Unlike prior approaches, it does not require expert-designed strategies/prompts, but still achieves superior, human-readable attacks against open-source and proprietary models (attack success rates of 99.5% against GPT-4o and 96.0% against Claude 3.5), even for unseen safety policies. It retains efficacy even after victim models undergo safety alignment, and exposes controls to navigate Pareto trade-offs without retraining. Lastly, we show that Jailbreak-Zero is the best-performing ART method at a given compute budget. Code is available at: https://github.com/hukkai/jailbreak-zero/ .
Red-Teaming NSFW Image Classifiers as Text-to-Image Safeguards
Tinghao Xie | Yueqi Xie | Alireza Zareian | Shuming Hu | Felix Juefei-Xu | Xiaowen Lin | Ankit Jain | Prateek Mittal | Li Chen
Findings of the Association for Computational Linguistics: ACL 2026
Not Safe for Work (NSFW) image classifiers play a critical role in safeguarding text-to-image (T2I) systems. However, a concerning phenomenon has emerged in T2I systems – changes in text prompts that manipulate benign image elements can result in failed detection by NSFW classifiers – dubbed "*context shifts*." For instance, while an NSFW image of "*a nude person in an empty scene*" can be easily blocked by most NSFW classifiers, a stealthier one that depicts "*a nude person blending in a group of dressed people*" may evade detection. We ask: how to systematically reveal NSFW image classifiers’ failure against such context shifts? Towards this end, we present an automated red-teaming framework that leverages a set of generative AI tools. We propose an **exploration-exploitation** approach: **First**, in the *exploration* stage, we synthesize a diverse and massive 36K NSFW image dataset that facilitates our study of context shifts. We find that varying fractions (e.g., 4.1% to 36% nude and sexual content) of the dataset are misclassified by NSFW image classifiers like GPT-4o and Gemini. **Second**, in the *exploitation* stage, we leverage these failure cases to train a specialized LLM that rewrites unseen seed prompts into more evasive versions, increasing the likelihood of detection evasion by up to 6 times. Alarmingly, we show **these failures translate to real-world T2I and even T2V systems** like DALL-E 3, Sora, Nano Banana, and Veo 3 – beyond the open-weight image generators in our main study. For example, querying DALL-E 3 with prompts rewritten by our approach increases the chance of obtaining NSFW images from 0 to over 50%.
2024
Large Language Models for Generative Recommendation: A Survey and Visionary Discussions
Lei Li | Yongfeng Zhang | Dugang Liu | Li Chen
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Large language models (LLMs) have not only revolutionized the field of natural language processing (NLP) but also have the potential to reshape many other fields, e.g., recommender systems (RS). However, most of the related work treats an LLM as a component of the conventional recommendation pipeline (e.g., as a feature extractor), which may not fully leverage the generative power of LLMs. Instead of separating the recommendation process into multiple stages, such as score computation and re-ranking, this process can be simplified to one stage with an LLM: directly generating recommendations from the complete pool of items. This survey reviews the progress, methods, and future directions of LLM-based generative recommendation by examining three questions: 1) What generative recommendation is, 2) Why RS should advance to generative recommendation, and 3) How to implement LLM-based generative recommendation for various RS tasks. We hope that this survey can provide the context and guidance needed to explore this interesting and emerging topic.
2021
Alpha at SemEval-2021 Task 6: Transformer Based Propaganda Classification
Zhida Feng | Jiji Tang | Jiaxiang Liu | Weichong Yin | Shikun Feng | Yu Sun | Li Chen
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)
This paper describes our system that participated in Task 6 of SemEval-2021. The task focuses on multimodal propaganda technique classification and aims to classify a given image and text into 22 classes. In this paper, we propose to use a transformer-based architecture to fuse the clues from both image and text. We explore two branches of techniques: fine-tuning the text pretrained transformer with extended visual features, and fine-tuning the multimodal pretrained transformers. For the visual features, we have tested both grid features based on ResNet and salient region features from a pretrained object detector. Among the pretrained multimodal transformers, we choose ERNIE-ViL, a two-stream cross-attended transformer pretrained on large-scale image-caption aligned data. Fine-tuning ERNIE-ViL for our task produces better performance due to the general joint multimodal representation for text and image learned by ERNIE-ViL. In addition, as the distribution of the classification labels is very unbalanced, we also experiment with the loss function, and the results show that focal loss performs better than cross-entropy loss. Finally, we won first place in subtask C in the final competition.
Personalized Transformer for Explainable Recommendation
Lei Li | Yongfeng Zhang | Li Chen
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Personalization of natural language generation plays a vital role in a large spectrum of tasks, such as explainable recommendation, review summarization and dialog systems. In these tasks, user and item IDs are important identifiers for personalization. Transformer, which is demonstrated with strong language modeling capability, however, is not personalized and fails to make use of the user and item IDs since the ID tokens are not even in the same semantic space as the words. To address this problem, we present a PErsonalized Transformer for Explainable Recommendation (PETER), on which we design a simple and effective learning objective that utilizes the IDs to predict the words in the target explanation, so as to endow the IDs with linguistic meanings and to achieve personalized Transformer. Besides generating explanations, PETER can also make recommendations, which makes it a unified model for the whole recommendation-explanation pipeline. Extensive experiments show that our small unpretrained model outperforms fine-tuned BERT on the generation task, in terms of both effectiveness and efficiency, which highlights the importance and the nice utility of our design.
2020
Xiaomingbot: A Multilingual Robot News Reporter
Runxin Xu | Jun Cao | Mingxuan Wang | Jiaze Chen | Hao Zhou | Ying Zeng | Yuping Wang | Li Chen | Xiang Yin | Xijin Zhang | Songcheng Jiang | Yuxuan Wang | Lei Li
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations
This paper proposes the building of Xiaomingbot, an intelligent, multilingual and multimodal software robot equipped with four integral capabilities: news generation, news translation, news reading and avatar animation. Its system summarizes Chinese news that it automatically generates from data tables. Next, it translates the summary or the full article into multiple languages, and reads the multilingual rendition through synthesized speech. Notably, Xiaomingbot utilizes a voice cloning technology to synthesize the speech trained from a real person’s voice data in one input language. The proposed system enjoys several merits: it has an animated avatar, and is able to generate and read multilingual news. Since it was put into practice, Xiaomingbot has written over 600,000 articles, and gained over 150,000 followers on social media platforms.
2019
Ranking-Based Autoencoder for Extreme Multi-label Classification
Bingyu Wang | Li Chen | Wei Sun | Kechen Qin | Kefeng Li | Hui Zhou
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
Extreme Multi-label classification (XML) is an important yet challenging machine learning task that assigns to each instance its most relevant candidate labels from an extremely large label collection, where the numbers of labels, features and instances could be thousands or millions. XML is increasingly in demand in the Internet industries, accompanied by increasing business scale / scope and data accumulation. The extremely large label collections yield challenges such as computational complexity, inter-label dependency and noisy labeling. Many methods have been proposed to tackle these challenges, based on different mathematical formulations. In this paper, we propose a deep learning XML method, with a word-vector-based self-attention, followed by a ranking-based AutoEncoder architecture. The proposed method has three major advantages: 1) the autoencoder simultaneously considers the inter-label dependencies and the feature-label dependencies, by projecting labels and features onto a common embedding space; 2) the ranking loss not only improves the training efficiency and accuracy but also can be extended to handle noisy labeled data; 3) the efficient attention mechanism improves feature representation by highlighting feature importance. Experimental results on benchmark datasets show the proposed method is competitive with state-of-the-art methods.
2014
Tri-Training for Authorship Attribution with Limited Training Data
Tieyun Qian | Bing Liu | Li Chen | Zhiyong Peng
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
2012
A Preliminary Work on Symptom Name Recognition from Free-Text Clinical Records of Traditional Chinese Medicine using Conditional Random Fields and Reasonable Features
Yaqiang Wang | Yiguang Liu | Zhonghua Yu | Li Chen | Yongguang Jiang
BioNLP: Proceedings of the 2012 Workshop on Biomedical Natural Language Processing
Co-authors
- Lei Li 3
- Ankit Jain 2
- Yongfeng Zhang 2
- Abhinav Aggarwal 1
- Akash Bharadwaj 1
- Jun Cao 1
- Jiaze Chen 1
- Zhida Feng 1
- Shikun Feng 1
- Matt Fredrikson 1
- Eric Hsin 1
- Kai Hu 1
- Shuming Hu 1
- Songcheng Jiang 1
- Yongguang Jiang 1
- Felix Juefei-Xu 1
- Mehran Khodabandeh 1
- Kefeng Li 1
- Xiaowen Lin 1
- Yiguang Liu 1
- Bing Liu 1
- Dugang Liu 1
- Jiaxiang Liu 1
- Prateek Mittal 1
- Zhiyong Peng 1
- Tieyun Qian 1
- Kechen Qin 1
- Wei Sun 1
- Yu Sun 1
- Jiji Tang 1
- Mingxuan Wang 1
- Yuping Wang 1
- Yuxuan Wang 1
- Yaqiang Wang 1
- Bingyu Wang 1
- Tinghao Xie 1
- Yueqi Xie 1
- Runxin Xu 1
- Xiang Yin 1
- Weichong Yin 1
- Zhonghua Yu 1
- Alireza Zareian 1
- Ying Zeng 1
- Xijin Zhang 1
- David Zhang 1
- Hao Zhou 1
- Hui Zhou 1