Changyuan Tian
2025
PIPER: Benchmarking and Prompting Event Reasoning Boundary of LLMs via Debiasing-Distillation Enhanced Tuning
Zhicong Lu | Changyuan Tian | Peiguang Li | Li Jin | Sirui Wang | Wei Jia | Ying Shen | Guangluan Xu
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
While Large Language Models (LLMs) excel in diverse domains, their validity in event reasoning remains underexplored. Most existing work assesses LLMs’ event reasoning with only a single event relation type or reasoning format, and therefore neither evaluates it completely nor provides a practical way to enhance it. In this paper, we propose PIPER, the first comprehensive benchmark for Probing Into the Performance boundary of LLMs in Event Reasoning. Motivated by our evaluation observations and error-pattern analysis, we meticulously craft 10K diverse instruction-tuning demonstrations to alleviate the scarcity of event reasoning-oriented data. Additionally, we present a novel Debiasing and Distillation-Enhanced Supervised Fine-Tuning (D2E-SFT) strategy, which encourages the model to adhere to the context and focus on significant contextual event information, thereby improving its event reasoning capability. Specifically, D2E-SFT removes the given sample’s context to construct an imagined sample and subtracts that sample’s logits, mitigating the bias of neglecting context and improving contextual faithfulness. To guide the model in emphasizing significant contextual event information, D2E-SFT employs a context-refined sample and performs self-distillation by aligning logits. Extensive experimental results demonstrate the effectiveness of our data and strategy in expanding the performance boundary of event reasoning.
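The two logit operations described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the scaling factor `alpha`, and the choice of a KL divergence for the logit alignment are all assumptions, since the abstract does not specify them.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def debias_logits(with_context, without_context, alpha=1.0):
    """Subtract the logits of the context-free ("imagined") sample.

    `alpha` is a hypothetical scaling factor, not taken from the paper.
    """
    return [w - alpha * wo for w, wo in zip(with_context, without_context)]


def distillation_loss(student_logits, teacher_logits):
    """KL(teacher || student) between the two softmax distributions.

    KL divergence is an assumed stand-in for "alignment of logits";
    the teacher would be the model run on the context-refined sample.
    """
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy next-token logits over a three-token vocabulary (made-up numbers).
full_sample = [2.0, 1.0, 0.5]      # question plus context
imagined_sample = [2.0, 0.0, 0.0]  # same question, context removed
debiased = debias_logits(full_sample, imagined_sample)
```

In this toy case the first token is favoured even without any context, so subtracting the context-free logits shifts the preference to the token that the context actually supports.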
2024
Rethinking the Reversal Curse of LLMs: a Prescription from Human Knowledge Reversal
Zhicong Lu | Li Jin | Peiguang Li | Yu Tian | Linhao Zhang | Sirui Wang | Guangluan Xu | Changyuan Tian | Xunliang Cai
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Large Language Models (LLMs) have exhibited exceptional performance across diverse domains. However, recent studies reveal that LLMs are plagued by the “reversal curse”. Most existing methods rely on aggressive sample permutation and pay little attention to the underlying reasons for this issue, resulting in only partial mitigation. In this paper, inspired by human knowledge reversal, we investigate and quantify the individual influence of three potential reasons on the reversal curse: 1) knowledge clarity, 2) entity correlation modeling, and 3) pairwise relationship reasoning capability. Motivated by the analysis of these reasons, we propose a novel **P**airwise entity **O**rder- and **R**elationship-**E**nhanced (**PORE**) data strategy, which facilitates bidirectional entity correlation modeling and pairwise relationship reasoning to overcome the reversal curse. Specifically, PORE augments the samples with entity order-reversed and semantically preserved question-answer pairs, enhancing the encoding of entity correlations in both directions. PORE also employs entity-interleaved pairwise relationship data, which elevates the model’s capability for relationship reasoning. Additionally, to improve the recall of reverse relationships, we leverage knowledge clarity to construct high-clarity data for PORE. Extensive experimental results on existing datasets and two newly assembled ones demonstrate the effectiveness and generalization of our method in both data-sufficient and data-constrained situations.
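As a rough illustration of the entity order-reversal idea only, the sketch below builds a forward and a reversed question-answer pair from one fact. It is not the authors' pipeline: the function name, the question templates, and the placeholder entities are hypothetical, and the entity-interleaved relationship data and knowledge-clarity filtering mentioned in the abstract are not covered.

```python
def make_qa_pairs(head, tail, forward_template, reverse_template):
    """Build a forward and an order-reversed QA pair for one (head, tail) fact.

    Both templates are hypothetical and must contain an `{entity}` slot.
    """
    forward = {"question": forward_template.format(entity=head), "answer": tail}
    reverse = {"question": reverse_template.format(entity=tail), "answer": head}
    return [forward, reverse]


# Placeholder fact: "Alice's mentor is Bob" (entities are illustrative only).
pairs = make_qa_pairs(
    head="Alice",
    tail="Bob",
    forward_template="Who is the mentor of {entity}?",
    reverse_template="Whose mentor is {entity}?",
)
```

Training on both pairs exposes the model to the fact with each entity in the question position, which is the bidirectional exposure the abstract describes.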