Junlin Wu
2026
From Personal to Collective: On the Role of Local and Global Knowledge in LLM Personalization
Zehong Wang | Junlin Wu | Zhaoxuan Tan | Bolian Li | Xianrui Zhong | Zheli Liu | Qingkai Zeng
Findings of the Association for Computational Linguistics: ACL 2026
Large language model (LLM) personalization typically relies on modeling each user in isolation, conditioning on their historical interactions to adapt model behavior. However, this user-centric formulation overlooks the collective knowledge shared across users, limiting generalization for users with sparse histories and amplifying overfitting for those with highly skewed behaviors. We argue that effective personalization requires leveraging both individual preferences and population-level patterns. To this end, we propose LoGo, a Local–Global knowledge framework that augments user-specific signals with global knowledge that encodes collective behavioral trends. LoGo models global knowledge through a temporally evolving process that captures how population-wide preferences change over time, and a community-aware structure that organizes users into coherent groups with shared interests. To balance potentially conflicting local and global signals, LoGo employs a mediator module that adaptively fuses the two knowledge sources. Experiments on five personalization benchmarks show that LoGo consistently enhances personalization quality, outperforming existing methods by improving generalization for users with limited histories and mitigating bias for users with abundant histories. These results demonstrate the central role of collective knowledge in advancing LLM personalization. Our code is publicly available at https://github.com/Zehong-Wang/LoGo.
2024
RLHFPoison: Reward Poisoning Attack for Reinforcement Learning with Human Feedback in Large Language Models
Jiongxiao Wang | Junlin Wu | Muhao Chen | Yevgeniy Vorobeychik | Chaowei Xiao
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Reinforcement Learning with Human Feedback (RLHF) is a methodology designed to align Large Language Models (LLMs) with human preferences, playing an important role in LLM alignment. Despite its advantages, RLHF relies on human annotators to rank the text, which can introduce security vulnerabilities if an adversarial annotator (i.e., an attacker) manipulates the ranking by up-ranking malicious text to steer the LLM adversarially. To assess the red-teaming of RLHF against human preference data poisoning, we propose RankPoison, a poisoning attack method that selects candidates for preference rank flipping to elicit certain malicious behaviors (e.g., generating longer sequences, which can increase the computational cost). With the poisoned dataset generated by RankPoison, we can perform poisoning attacks that make LLMs generate longer sequences without hurting the original safety alignment performance. Moreover, applying RankPoison, we also successfully implement a backdoor attack in which LLMs generate longer answers to questions containing the trigger word. Our findings highlight critical security challenges in RLHF, underscoring the necessity for more robust alignment methods for LLMs.
2018
CARER: Contextualized Affect Representations for Emotion Recognition
Elvis Saravia | Hsien-Chi Toby Liu | Yen-Hao Huang | Junlin Wu | Yi-Shin Chen
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Emotions are expressed in nuanced ways, which vary by collective or individual experiences, knowledge, and beliefs. Therefore, to understand emotion as conveyed through text, a robust mechanism capable of capturing and modeling different linguistic nuances and phenomena is needed. We propose a semi-supervised, graph-based algorithm to produce rich structural descriptors which serve as the building blocks for constructing contextualized affect representations from text. The pattern-based representations are further enriched with word embeddings and evaluated through several emotion recognition tasks. Our experimental results demonstrate that the proposed method outperforms state-of-the-art techniques on emotion recognition tasks.