Zihang Fu
2026
Beyond the Crowd: LLM-Augmented Community Notes for Governing Health Misinformation
Jiaying Wu | Zihang Fu | Haonan Wang | Fanxiao Li | Jiafeng Guo | Preslav Nakov | Min-Yen Kan
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Community Notes, the crowd-sourced misinformation governance system on X (formerly Twitter), allows users to flag misleading posts, attach contextual notes, and rate the notes’ helpfulness. However, our empirical analysis of 30.8K health-related notes reveals substantial latency, with a median delay of 17.6 hours before notes receive a helpfulness status. To improve responsiveness during real-world misinformation surges, we propose CrowdNotes+, a unified LLM-based framework that augments Community Notes for faster and more reliable health misinformation governance. CrowdNotes+ integrates two modes: (1) evidence-grounded note augmentation and (2) utility-guided note automation, supported by a hierarchical three-stage evaluation of relevance, correctness, and helpfulness. We instantiate the framework with HealthNotes, a benchmark of 1.2K health notes annotated for helpfulness, and a fine-tuned helpfulness judge. Our analysis first uncovers a key loophole in current crowd-sourced governance: voters frequently conflate stylistic fluency with factual accuracy. Addressing this via our hierarchical evaluation, experiments across 15 representative LLMs demonstrate that CrowdNotes+ significantly outperforms human contributors in note correctness, helpfulness, and evidence utility.
2021
Don’t Change Me! User-Controllable Selective Paraphrase Generation
Mohan Zhang | Luchen Tan | Zihang Fu | Kun Xiong | Jimmy Lin | Ming Li | Zhengkai Tu
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
In the paraphrase generation task, source sentences often contain phrases that should not be altered. Which phrases, however, can be context dependent and can vary by application. Our solution to this challenge is to provide the user with explicit tags that can be placed around any arbitrary segment of text to mean “don’t change me!” when generating a paraphrase; the model learns to explicitly copy these phrases to the output. The contribution of this work is a novel data generation technique using distant supervision that allows us to start with a pretrained sequence-to-sequence model and fine-tune a paraphrase generator that exhibits this behavior, allowing user-controllable paraphrase generation. Additionally, we modify the loss during fine-tuning to explicitly encourage diversity in model output. Our technique is language agnostic, and we report experiments in English and Chinese.