Feng Huang

2026

Self-Reflection Improves Safety of Large Reasoning Models
Qiang Huang | Wei Zhai | Feng Huang | Dejing Dou
Findings of the Association for Computational Linguistics: ACL 2026

Large Reasoning Models (LRMs) have achieved significant breakthroughs over prior large language models (LLMs), but they also entail greater potential safety risks. Existing alignment methods often remain at a shallow level of protection, making them insufficient to address deeper risks and strategic attacks in complex reasoning processes. To bridge this gap, we move beyond the conventional paradigm that treats safety alignment merely as a preventive measure to reduce harmful outputs. Drawing inspiration from human-like introspection and self-correction, we propose Self-Reflection, a technique that introduces a special Self-Reflection token, enabling LRMs to perform Self-Reflection during generation and recover from harmful outputs. Our approach integrates seamlessly into standard post-training paradigms, further enhancing both helpfulness and safety. The experimental results demonstrate that models trained with Self-Reflection not only consistently outperform the baseline in terms of safety (reducing the HCR from 13.8% to 4.1%, nearly a threefold improvement over mainstream approaches), but also achieve substantial advantages in both helpfulness and the safety–helpfulness balance. More importantly, under evaluations involving various adversarial attacks, including a specially designed adaptive attack, the Self-Reflection mechanism significantly enhances model safety without targeted adversarial training. Notice: This paper contains harmful content.

Large language models (LLMs) often hallucinate in question answering (QA) tasks due to a lack of factual knowledge. While integrating knowledge graphs (KGs) with LLMs has alleviated this issue, existing methods suffer from poor generalization or low reasoning efficiency, and critically, they overlook the learning and reuse of reasoning paths from past experiences. To address these challenges, we introduce Thought-Action Graph (TAG), a structured repository of reasoning experiences. TAG decomposes successful LLM-KG interaction trajectories into fine-grained semantic operators, which are stored in a graph composed of a thought layer and an action layer. Building upon TAG, we propose a novel KGQA paradigm, TAG-Reasoning (TAGR). TAGR first retrieves and assembles reasoning blueprints from TAG, and then guides the LLM to execute them efficiently on the KG. Through this approach, TAGR transforms the computationally expensive online exploration process of LLMs into an offline process of TAG retrieval and assembly. Experimental results on multiple KGQA benchmarks demonstrate that TAGR significantly outperforms state-of-the-art methods across key metrics, while drastically reducing the number of LLM calls and generated tokens. This work opens new avenues for building continually learning, efficient, and faithful KGQA systems.

2025

A Novel Chinese-Idiom Automatic Error Correction Method Based on the Hidden Markov Model
Rongbin Zhang | Anlu Gui | Peng Cao | Lingfeng Wu | Feng Huang | Jiahui Li
Proceedings of the 37th Conference on Computational Linguistics and Speech Processing (ROCLING 2025)

Spelling errors in Chinese idioms frequently occur due to various types of misspellings and optical character recognition (OCR) errors in daily learning and usage. Automatic error correction for Chinese idioms is an important natural language processing task, as it helps improve both the quality of Chinese texts and language learning. Existing methods, such as edit distance and custom dictionary approaches, suffer from limited error correction capability, low computational efficiency, and weak flexibility. To address these limitations, this paper proposes a novel automatic error correction method for Chinese idioms based on the hidden Markov model (HMM). Specifically, the generation process of idiom spelling errors is modeled using an HMM, transforming the idiom correction problem into a matching task between erroneous idioms and legitimate idioms. By constructing a legitimate idiom table and a Chinese character confusion set, a prototype system for idiom correction was developed, and performance testing was completed. Experimental results demonstrate that the proposed model is simpler, has fewer parameters and lower computational complexity, and exhibits stronger error correction capability and parameter robustness than existing methods. It can more flexibly correct diverse types of idiom errors, showing high potential application value.
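
To make the matching idea in this abstract concrete, here is a minimal sketch. It is not the paper's model. It assumes that the hidden states are the characters of a legitimate idiom, that each observed character is emitted from the true character at the same position, and that all idioms are equally likely a priori. The idiom table, the confusion set, the probability constants, and the function names are all hypothetical toy values chosen for illustration.

```python
import math

# Hypothetical toy idiom table and confusion set (not taken from the paper).
IDIOMS = ["一心一意", "画蛇添足", "守株待兔"]
CONFUSION = {"画": {"话", "化"}, "株": {"珠", "朱"}, "意": {"义"}}

# Assumed emission probabilities: correct character, confusable character, any other.
P_CORRECT = 0.90
P_CONFUSED = 0.09
P_OTHER = 0.01


def emission_log_prob(true_char, observed_char):
    """Log-probability of observing `observed_char` when `true_char` was intended."""
    if observed_char == true_char:
        return math.log(P_CORRECT)
    if observed_char in CONFUSION.get(true_char, set()):
        return math.log(P_CONFUSED)
    return math.log(P_OTHER)


def correct_idiom(observed, idioms=IDIOMS):
    """Return the same-length legitimate idiom with the highest log-likelihood, or None."""
    best, best_score = None, -math.inf
    for idiom in idioms:
        if len(idiom) != len(observed):
            continue  # this sketch only handles substitution errors
        score = sum(emission_log_prob(t, o) for t, o in zip(idiom, observed))
        if score > best_score:
            best, best_score = idiom, score
    return best


if __name__ == "__main__":
    print(correct_idiom("话蛇添足"))
    print(correct_idiom("守珠待兔"))
```

Because the hidden sequence is restricted to entries of the idiom table, decoding reduces to scoring each candidate idiom against the observed string, which is one way to read the abstract's "matching task". A fuller treatment would estimate the emission probabilities from data and would handle insertions and deletions.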

PKAG-DDI: Pairwise Knowledge-Augmented Language Model for Drug-Drug Interaction Event Text Generation
Ziyan Wang | Zhankun Xiong | Feng Huang | Wen Zhang
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Drug-drug interactions (DDIs) arise when multiple drugs are administered concurrently. Accurately predicting the specific mechanisms underlying DDIs (named DDI events or DDIEs) is critical for the safe clinical use of drugs. DDIEs are typically represented as textual descriptions. However, most computational methods focus on predicting the DDIE class label rather than generating human-readable natural language, which increases clinicians’ interpretation costs. Furthermore, current methods overlook the fact that each drug assumes distinct biological functions in a DDI, which, when used as input context, can enhance the understanding of the DDIE process and benefit DDIE generation by the language model (LM). In this work, we propose a novel pairwise knowledge-augmented generative method (termed PKAG-DDI) for DDIE text generation. It consists of a pairwise knowledge selector that efficiently injects structural information between drugs bidirectionally and simultaneously to select pairwise biological functions from the knowledge set, and a pairwise knowledge integration strategy that matches and integrates the selected biological functions into the LM. Experiments on two professional datasets show that PKAG-DDI outperforms existing methods in DDIE text generation, especially in challenging inductive scenarios, indicating its practicality and generalization.