Zhenyang Li
2026
DiffER: Diffusion Entity-Relation Modeling for Reversal Curse in Diffusion Large Language Models
Shaokai He | Kaiwen Wei | Xinyi Zeng | Xiang Chen | Xue Yang | Zhenyang Li | Jiang Zhong | Yu Tian
Findings of the Association for Computational Linguistics: ACL 2026
The "reversal curse" refers to the phenomenon where large language models (LLMs) exhibit predominantly unidirectional behavior when processing logically bidirectional relationships. Prior work attributed this to autoregressive training—predicting the next token inherently favors left-to-right information flow over genuine bidirectional knowledge associations. However, we observe that Diffusion LLMs (DLLMs), despite being trained bidirectionally, also suffer from the reversal curse. To investigate the root causes, we conduct systematic experiments on DLLMs and identify three key reasons: 1) entity fragmentation during training, 2) data asymmetry, and 3) missing entity relations. Motivated by the analysis of these reasons, we propose Diffusion Entity-Relation Modeling (DiffER), which addresses the reversal curse through entity-aware training and balanced data construction. Specifically, DiffER introduces whole-entity masking, which mitigates entity fragmentation by predicting complete entities in a single step. DiffER further employs distribution-symmetric and relation-enhanced data construction strategies to alleviate data asymmetry and missing relations. Extensive experiments demonstrate that DiffER effectively alleviates the reversal curse in Diffusion LLMs, offering new perspectives for future research. The code is available at https://github.com/CQU-MM-Intelligent-Lab/DiffER.
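The whole-entity masking idea in the abstract above can be illustrated with a minimal sketch. This is not the DiffER implementation (see the linked repository for that); it only assumes that entity spans are known in advance as (start, end) token index pairs and that masking is a per-unit Bernoulli draw. The names `whole_entity_mask`, `entity_spans`, and `MASK` are hypothetical.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol, not taken from the paper


def whole_entity_mask(tokens, entity_spans, mask_prob, rng):
    """Mask tokens so that every entity span is masked all-or-nothing.

    tokens:       list of token strings
    entity_spans: list of (start, end) pairs, end exclusive (assumed given)
    mask_prob:    probability of masking each unit (entity or single token)
    rng:          a random.Random instance, for reproducibility
    """
    masked = list(tokens)
    in_entity = set()
    # One draw per entity: either the whole span is masked or none of it.
    for start, end in entity_spans:
        in_entity.update(range(start, end))
        if rng.random() < mask_prob:
            for i in range(start, end):
                masked[i] = MASK
    # Non-entity tokens are masked independently, one draw per token.
    for i in range(len(tokens)):
        if i not in in_entity and rng.random() < mask_prob:
            masked[i] = MASK
    return masked


tokens = ["Marie", "Curie", "was", "born", "in", "Warsaw"]
spans = [(0, 2), (5, 6)]  # "Marie Curie" and "Warsaw"
example = whole_entity_mask(tokens, spans, 0.5, random.Random(0))
```

Because each entity receives a single draw, a multi-token name such as "Marie Curie" is never split into a masked half and a visible half, which is the fragmentation the abstract describes.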
From Past To Path: Masked History Learning for Next-Item Prediction in Generative Recommendation
Kaiwen Wei | Kejun he | Xiaomian Kang | Jie Zhang | Ymyang | Li Jin | Zhenyang Li | Jiang Zhong | Richard He Bai | Junnan Zhu
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Generative recommendation, which directly generates item identifiers, has emerged as a promising paradigm for recommendation systems. However, this left-to-right paradigm inherently biases the model towards local contexts, failing to capture deeper historical dependencies necessary for understanding complex user intents. To address this limitation, we propose Masked History Learning (MHL), a novel training framework that shifts the objective from simple next-step prediction to deep comprehension of history. MHL augments the standard autoregressive objective with an auxiliary task of reconstructing masked historical items, compelling the model to understand "why" an item path is formed from the user’s past behaviors, rather than just "what" item comes next. We introduce two key contributions to enhance this framework: (1) an entropy-guided masking policy that intelligently targets the most informative historical items for reconstruction, and (2) a curriculum learning scheduler that progressively transitions from history reconstruction to future prediction. Experiments on three public datasets show that our method significantly outperforms state-of-the-art generative models, highlighting that a comprehensive understanding of the past is crucial for accurately predicting a user’s future path. The code is available at https://github.com/CQU-MM-Intelligent-Lab/MHL.
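The two components named in the abstract above, an entropy-guided masking policy and a curriculum scheduler, can be sketched roughly as follows. This is an illustration under stated assumptions, not the MHL implementation (see the linked repository): it assumes that "most informative" means highest predictive entropy, that a probability distribution is available for each history position, and that the curriculum is a linear decay. The function names are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_guided_mask(history_probs, k):
    """Return the indices of the k history positions with highest entropy.

    history_probs: one probability distribution per historical item
                   (assumed to come from the model's own predictions).
    Assumption: high entropy marks the positions worth reconstructing.
    """
    scores = [entropy(p) for p in history_probs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def reconstruction_weight(step, total_steps):
    """Linear curriculum (an assumption): weight on the history
    reconstruction loss decays from 1 to 0 over training, so the
    next-item prediction loss would receive 1 minus this value."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return 1.0 - frac


history_probs = [[1.0, 0.0], [0.5, 0.5], [0.9, 0.1]]
positions_to_mask = entropy_guided_mask(history_probs, k=1)
```

In this toy input the second position has a uniform distribution and therefore the highest entropy, so it is the one selected for masking.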