Xinyu Pang

2024

pdf abs
Subjective Topic meets LLMs: Unleashing Comprehensive, Reflective and Creative Thinking through the Negation of Negation
Fangrui Lv | Kaixiong Gong | Jian Liang | Xinyu Pang | Changshui Zhang
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Large language models (LLMs) exhibit powerful reasoning capacity, as evidenced by prior studies focusing on objective topics that with unique standard answers such as arithmetic and commonsense reasoning. However, the reasoning to definite answers emphasizes more on logical thinking, and falls short in effectively reflecting the comprehensive, reflective, and creative thinking that is also critical for the overall reasoning prowess of LLMs. In light of this, we build a dataset SJTP comprising diverse SubJective ToPics with free responses, as well as three evaluation indicators to fully explore LLM’s reasoning ability. We observe that a sole emphasis on logical thinking falls short in effectively tackling subjective challenges. Therefore, we introduce a framework grounded in the principle of the Negation of Negation (NeoN) to unleash the potential comprehensive, reflective, and creative thinking abilities of LLMs. Comprehensive experiments on SJTP demonstrate the efficacy of NeoN, and the enhanced performance on various objective reasoning tasks unequivocally underscores the benefits of stimulating LLM’s subjective thinking in augmenting overall reasoning capabilities.

pdf abs
A Closer Look at the Self-Verification Abilities of Large Language Models in Logical Reasoning
Ruixin Hong | Hongming Zhang | Xinyu Pang | Dong Yu | Changshui Zhang
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

Logical reasoning has been an ongoing pursuit in the field of AI. Despite significant advancements made by large language models (LLMs), they still struggle with complex logical reasoning problems. To enhance reasoning performance, one promising direction is scalable oversight, which requires LLMs to identify their own errors and then improve by themselves. Various self-verification methods have been proposed in pursuit of this goal. Nevertheless, whether existing models understand their own errors well is still under investigation. In this paper, we take a closer look at the self-verification abilities of LLMs in the context of logical reasoning, focusing on their ability to identify logical fallacies accurately. We introduce a dataset, FALLACIES, containing 232 types of reasoning fallacies categorized in a hierarchical taxonomy. By conducting exhaustive experiments on FALLACIES, we obtain comprehensive and detailed analyses of a series of models on their verification abilities. Our main findings suggest that existing LLMs could struggle to identify fallacious reasoning steps accurately and may fall short of guaranteeing the validity of self-verification methods. Drawing from these observations, we offer suggestions for future research and practical applications of self-verification methods.

Co-authors

Hongming Zhang 1

Dong Yu 1

Venues

emnlp1
naacl1