Lingjie Chen

2026

Reasoning Traces Shape Outputs but Models Won’t Say So
Yijie Hao | Lingjie Chen | Ali Emami | Joyce C. Ho
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Can we trust the reasoning traces that large reasoning models (LRMs) produce? We investigate whether these traces faithfully reflect what drives model outputs, and whether models will honestly report their influence. We introduce Thought Injection, a method that injects synthetic reasoning snippets into a model’s reasoning trace, then measures whether the model follows the injected reasoning and acknowledges doing so. Across 45,000 samples from three LRMs, we find that injected hints reliably alter outputs, confirming that reasoning traces causally shape model behavior. However, when asked to explain their changed answers, models overwhelmingly refuse to disclose the influence: non-disclosure exceeds 90% for extreme hints across 30,000 follow-up samples. Instead of acknowledging the injected reasoning, models fabricate aligned-appearing but unrelated explanations. Activation analysis reveals that sycophancy- and deception-related directions are strongly activated during these fabrications, suggesting systematic patterns rather than incidental failures. Our findings reveal a gap between the reasoning LRMs follow and the reasoning they report, raising concern that aligned-appearing explanations may not be equivalent to genuine alignment.

dLLM: Simple Diffusion Language Modeling
Zhanhui Zhou | Lingjie Chen | Hanghang Tong | Dawn Song
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)

Although diffusion language models (DLMs) are evolving quickly, many recent models converge on a set of shared components. These components, however, are distributed across ad-hoc research codebases or lack transparent implementations, making them difficult to reproduce or extend. As the field accelerates, there is a clear need for a unified framework that standardizes these common components while remaining flexible enough to support new methods and architectures. To address this gap, we introduce dLLM, an open-source framework that unifies the core components of diffusion language modeling (training, inference, and evaluation) and makes them easy to customize for new designs. With dLLM, users can reproduce, finetune, deploy, and evaluate open-source large DLMs such as LLaDA and Dream through a standardized pipeline. The framework also provides minimal, reproducible recipes for building small DLMs from scratch with accessible compute, including converting any BERT-style encoder or autoregressive LM into a DLM. We also release the checkpoints of these small DLMs to make DLMs more accessible and accelerate future research.

2024

“A good pun is its own reword”: Can Large Language Models Understand Puns?
Zhijun Xu | Siyu Yuan | Lingjie Chen | Deqing Yang
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Puns play a vital role in academic research due to their distinct structure and clear definition, which aid in the comprehensive analysis of linguistic humor. However, the understanding of puns in large language models (LLMs) has not been thoroughly examined, limiting their use in creative writing and humor creation. In this paper, we leverage three popular tasks, i.e., pun recognition, explanation, and generation, to systematically evaluate the capabilities of LLMs in pun understanding. In addition to adopting the automated evaluation metrics from prior research, we introduce new evaluation methods and metrics that are better suited to the in-context learning paradigm of LLMs. These new metrics offer a more rigorous assessment of an LLM’s ability to understand puns and align more closely with human cognition than previous metrics. Our findings reveal the “lazy pun generation” pattern and identify the primary challenges LLMs encounter in understanding puns.

Venues

ACL: 2
EMNLP: 1
