Sen Song
2025
Improve Decoding Factuality by Token-wise Cross Layer Entropy of Large Language Models
Jialiang Wu | Yi Shen | Sijia Liu | Yi Tang | Sen Song | Xiaoyi Wang | Longjun Cai
Findings of the Association for Computational Linguistics: NAACL 2025
Despite their impressive capacities, large language models (LLMs) often struggle with the hallucination issue of generating inaccurate or fabricated content, even when they possess the correct knowledge. In this paper, we extend the exploration of the correlation between hidden-state prediction changes and output factuality to a deeper, token-wise level. Based on these insights, we propose cross-layer Entropy eNhanced Decoding (END), a decoding method that mitigates hallucinations without requiring extra training. END leverages inner probability changes across layers to individually quantify the factual knowledge required for each candidate token, and adjusts the final prediction distribution to prioritize tokens with higher factuality. Experiments on both hallucination and QA benchmarks demonstrate that END significantly enhances the truthfulness and informativeness of generation while maintaining robust QA accuracy. Moreover, our work provides a deeper perspective on the correlation between inherent knowledge and output factuality.
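The abstract's core idea, that a candidate token whose probability is unstable across layers is a weaker factual bet, can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's actual END formulation: the entropy weighting, the `alpha` hyperparameter, and the function name are all assumptions made for the sketch.

```python
import numpy as np

def cross_layer_entropy_adjust(layer_probs, alpha=1.0):
    """Illustrative sketch (NOT the paper's exact method).

    layer_probs: array of shape (num_layers, vocab_size); each row is the
    model's probability distribution over candidate tokens decoded from one
    layer's hidden state. Each token's probabilities across layers form a
    trajectory; a low-entropy (stable) trajectory is taken here as a proxy
    for the token being backed by factual knowledge, so such tokens are
    boosted in the final distribution.
    """
    eps = 1e-12
    # Normalize each token's cross-layer probability trajectory.
    traj = layer_probs / (layer_probs.sum(axis=0, keepdims=True) + eps)
    # Entropy of each token's trajectory across layers.
    ent = -(traj * np.log(traj + eps)).sum(axis=0)
    # Penalize high cross-layer entropy in the last layer's logits.
    adjusted = np.log(layer_probs[-1] + eps) - alpha * ent
    probs = np.exp(adjusted - adjusted.max())
    return probs / probs.sum()
```

In use, a token whose probability wanders across layers gets demoted relative to one the model commits to early and consistently; the output remains a valid distribution over the vocabulary.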
2020
Unsupervised Paraphrasing by Simulated Annealing
Xianggen Liu | Lili Mou | Fandong Meng | Hao Zhou | Jie Zhou | Sen Song
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
We propose UPSA, a novel approach that accomplishes Unsupervised Paraphrasing by Simulated Annealing. We model paraphrase generation as an optimization problem and propose a sophisticated objective function involving the semantic similarity, expression diversity, and language fluency of paraphrases. UPSA searches the sentence space towards this objective by performing a sequence of local edits. We evaluate our approach on various datasets, namely Quora, Wikianswers, MSCOCO, and Twitter. Extensive results show that UPSA achieves state-of-the-art performance compared with previous unsupervised methods in terms of both automatic and human evaluations. Further, our approach outperforms most existing domain-adapted supervised models, showing the generalizability of UPSA.
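The search procedure the abstract describes, local edits accepted or rejected under a simulated-annealing schedule, follows the standard SA template. The loop below is a generic sketch under assumed interfaces, not UPSA itself: `propose` (one local edit: word substitution, insertion, or deletion) and `score` (higher is better, standing in for the combined similarity/diversity/fluency objective) are hypothetical callables, and the temperature schedule is a simple geometric cooling chosen for illustration.

```python
import math
import random

def paraphrase_sa(sentence, propose, score, steps=200, t0=1.0, cooling=0.98):
    """Generic simulated-annealing search over sentences (a sketch, not UPSA).

    propose(words) -> candidate word list produced by one local edit.
    score(words)   -> objective value; higher is better.
    """
    cur = sentence.split()
    cur_score = score(cur)
    best, best_score = cur, cur_score
    t = t0
    for _ in range(steps):
        cand = propose(cur)
        cand_score = score(cand)
        # Always accept improvements; accept worse candidates with
        # Boltzmann probability exp(delta / t), so early high temperature
        # permits exploration and later low temperature locks in gains.
        delta = cand_score - cur_score
        if delta >= 0 or random.random() < math.exp(delta / max(t, 1e-9)):
            cur, cur_score = cand, cand_score
            if cur_score > best_score:
                best, best_score = cur, cur_score
        t *= cooling  # geometric temperature decay
    return " ".join(best)
```

The occasional acceptance of worse edits is what lets the search escape local optima in the discrete sentence space, which greedy hill-climbing over word edits cannot do.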