Zehua Duo

2026

As large language models (LLMs) are increasingly deployed in dialogue systems and interactive agents, their social adaptation during natural interaction has drawn growing attention. While prior work shows strong social regulation under explicit role or style instructions, it remains unclear whether LLMs can spontaneously perceive and respond to implicit social differences without explicit prompts. Focusing on high-context Chinese interactions, we identify a robust phenomenon termed Social Agnosia, where LLMs fail to adequately perceive and accommodate implicit social power, affective arousal, and epistemic status during natural interaction. To diagnose this behavior, we propose C-ISA, a framework grounded in Communication Accommodation Theory that decomposes social adaptation into three approximately orthogonal dimensions, and conduct controlled comparisons across multiple Chinese LLMs under implicit and explicit conditions. Results show that while models substantially adjust linguistic strategies under explicit conditioning, they exhibit socially insensitive and homogenized responses in natural interaction, revealing a structural gap between spontaneous behavior and conditioned capability. The C-ISA dataset is publicly available at https://github.com/ty373/C-ISA.

Temporal knowledge graph embedding (TKGE) aims to model the temporal evolution of relational facts. However, existing approaches predominantly rely on discrete timestamp lookup tables and high-dimensional embedding spaces, which lack explicit structural constraints for continuous-time dynamics. As a result, temporal patterns are often captured through capacity scaling rather than principled dynamic modeling, leading to limited parameter efficiency and scalability. To address these limitations, we propose a physics-inspired framework that embeds temporal dynamics into a symplectic phase space. Our model introduces a structure-preserving Hamiltonian evolution mechanism based on a pairwise-decoupled Hamiltonian generator and its Cayley transform, ensuring that temporal transformations adhere to the symplectic group Sp(2d) and preserve phase-space volume with linear computational complexity. In addition, we design a Time-Aware Parameter Modulation mechanism that integrates continuous Rotary Time Embeddings via Feature-wise Linear Modulation, enabling smooth temporal evolution while capturing event-driven variations. Theoretical analysis establishes the geometric validity of the proposed framework. Extensive experiments on standard TKGE benchmarks demonstrate that the proposed model achieves competitive performance with substantially lower embedding dimensions. Furthermore, empirical results show that the proposed continuous Hamiltonian evolution facilitates generalization to unseen timestamps by learning transferable temporal dynamics from the underlying geometric structure.
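The claim that a Cayley transform of a Hamiltonian generator stays in the symplectic group can be checked numerically. The following is a minimal NumPy sketch, not the authors' implementation: the 2x2 block parametrization, the function name `cayley_blocks`, and the time step `tau` are assumptions made for illustration. It relies on two standard facts: a traceless 2x2 matrix is Hamiltonian, and the Cayley transform of a Hamiltonian matrix is symplectic whenever the inverse exists.

```python
import numpy as np

def cayley_blocks(a, b, c, tau):
    """Return d independent 2x2 symplectic blocks, shape (d, 2, 2).

    Hypothetical parametrization: each generator
        A_k = (tau / 2) * [[a_k, b_k], [c_k, -a_k]]
    is traceless, hence Hamiltonian for a single (q, p) pair.
    The Cayley transform M_k = (I - A_k)^{-1} (I + A_k) is then symplectic,
    provided I - A_k is invertible (true for small enough generators).
    """
    d = a.shape[0]
    A = np.empty((d, 2, 2))
    A[:, 0, 0] = a
    A[:, 0, 1] = b
    A[:, 1, 0] = c
    A[:, 1, 1] = -a
    A *= 0.5 * tau
    I = np.eye(2)
    # Batched solve: one 2x2 linear system per block, so the cost is linear in d.
    return np.linalg.solve(I - A, I + A)

def evolve(M, state):
    """Apply each 2x2 block to its own (q_k, p_k) pair; state has shape (d, 2)."""
    return np.einsum("kij,kj->ki", M, state)

# Small demonstration with random generators (values kept small so I - A is invertible).
rng = np.random.default_rng(0)
a, b, c = rng.uniform(-0.5, 0.5, size=(3, 8))
M = cayley_blocks(a, b, c, tau=1.0)
J = np.array([[0.0, 1.0], [-1.0, 0.0]])  # canonical symplectic form for one pair
is_symplectic = np.allclose(np.swapaxes(M, 1, 2) @ J @ M, J)
```

Because the blocks are decoupled, evolving a 2d-dimensional state touches each (q, p) pair once, which is consistent with the linear complexity the abstract mentions.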

While Large Language Models (LLMs) excel at capturing communicative intent, this capability introduces a side effect: Pragmatic Hallucination, where models over-interpret literal contexts to generate non-factual inferences. To quantify this, we introduce the PaCE (Pragmatics-as-Context Evaluation) benchmark, comprising over 3,000 manually verified "context-flip" samples. Evaluations across nine mainstream models reveal a significant Context Sensitivity Gap (CSG), with literal accuracy consistently lagging behind pragmatic reasoning. Attribution analysis indicates that Reinforcement Learning from Human Feedback (RLHF) exacerbates this bias, and neither parameter scaling nor Chain-of-Thought (CoT) fully mitigates it. Crucially, "Strict Prompting" effectively reverses the CSG, demonstrating that the phenomenon stems from behavioral lock-in during training rather than inherent capability deficiencies. Furthermore, error patterns exhibit high systematic correlation across diverse architectures. This study highlights that current alignment paradigms lack precise control over pragmatic boundaries, underscoring the necessity for a "Literal Grounding" mechanism in future safety frameworks.

CausalityCheck: A Framework for Evaluating Causal Reasoning in Large Language Models
Jiang Li | Zehua Duo | Guanglai Gao | Xiangdong Su
Findings of the Association for Computational Linguistics: ACL 2026

Causal reasoning is a crucial component of understanding complex phenomena and building intelligent systems. Recent advancements in large language models (LLMs) have demonstrated their strong capabilities in reasoning tasks; however, their true understanding of causal relationships remains limited, particularly when causal chains are misidentified or the model relies on empirical inference. To mitigate the risk that models misclassify data as false positives due to these issues, we introduce CausalityCheck, an automated tool designed to efficiently generate causal reasoning checklists. This checklist enables the creation of multi-task causal reasoning datasets with task generalization and reasoning robustness from a single causal reasoning dataset. Using CausalityCheck, we developed CausalityCheck-CP to assess the causal reasoning abilities of 18 LLMs. This framework also measures the extent to which models misidentify causal chains or rely on empirical inference. Our results indicate that current large language models still face two critical issues when handling complex causal reasoning tasks: incorrect identification of causal chains and reliance on empirical inference. The code and data are available at https://github.com/dzh597/CausalityCheck.

Theory of Mind (ToM) is widely regarded as central to effective persuasion, yet existing evaluations often fail to capture the infer–apply loop that arises in real-world dialogue. We introduce Theory-of-Mind-Guided Elaboration-Likelihood Persuasion (ToMELP), a benchmark that jointly conditions on the audience persona p and the Elaboration Likelihood Model (ELM) route r ∈ {central, peripheral} within persuasive conversations. The benchmark tests whether large language models can perform ToM inference over multi-turn interactions and leverage these inferences for controllable persuasive generation. ToMELP provides a structured interface with evidence annotations, enabling automated evaluation of persuasive effectiveness, route alignment/deviation, evidence quality under the central route, and robustness to perturbations.

2025

Knowledge graph embedding techniques have emerged as a critical approach for addressing the issue of missing relations in knowledge graphs. However, existing methods often suffer from limitations, including high intra-group similarity, loss of semantic information, and insufficient inference capability, particularly in complex relation patterns such as 1-N and N-1 relations. To address these challenges, we introduce a novel KGE framework that leverages mutual information maximization to improve the semantic representation of entities and relations. By maximizing the mutual information between different components of triples, such as (h, r) and t, or (r, t) and h, the proposed method improves the model’s ability to preserve semantic dependencies while maintaining the relational structure of the knowledge graph. Extensive experiments on benchmark datasets demonstrate the effectiveness of our approach, with consistent performance improvements across various baseline models. Additionally, visualization analyses and case studies demonstrate the improved ability of the MI framework to capture complex relation patterns.
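One common way to maximize mutual information between two views, such as an encoding of (h, r) and the embedding of t, is to minimize an InfoNCE contrastive loss, which gives a lower bound on the mutual information. The abstract does not say which estimator the authors use, so the sketch below is an assumption for illustration only; the function name `info_nce`, the cosine similarity, and the `temperature` parameter are all hypothetical choices.

```python
import numpy as np

def info_nce(query, target, temperature=0.1):
    """InfoNCE loss over a batch of N positive pairs.

    query:  (N, dim) array, e.g. an encoding of each (h, r) pair (assumed given).
    target: (N, dim) array, e.g. the embedding of the matching tail t.
    Row i of `query` and row i of `target` form a positive pair; all other
    rows of `target` in the batch act as negatives for row i.
    """
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    t = target / np.linalg.norm(target, axis=1, keepdims=True)
    logits = (q @ t.T) / temperature              # (N, N) scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Negative mean log-probability assigned to the true partner of each row.
    return float(-np.mean(np.diag(log_softmax)))

# Toy comparison: perfectly aligned pairs versus deliberately opposed pairs.
rng = np.random.default_rng(0)
x = rng.normal(size=(16, 8))
loss_aligned = info_nce(x, x)
loss_opposed = info_nce(x, -x)
```

Minimizing this loss pushes each (h, r) encoding toward its own tail and away from the other tails in the batch; the symmetric case, (r, t) against h, would reuse the same function with the roles swapped.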

Co-authors

Yu Tian 2

Xu Liu 1

Venues

Findings 4
ACL 2
