Xiaolin Qin
2026
SATQuest: A Verifier for Logical Reasoning Evaluation and Reinforcement Fine-Tuning of LLMs
Yanxiao Zhao | Yaqian Li | Zi-Hao Bo | Rinyoichi Takezoe | Haojia Hui | Mo Guang | Renlei | Xiaolin Qin | Kaiwen Long
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large language models (LLMs) exhibit strong general reasoning, yet the community lacks controllable, scalable, and verifiable tools to analyze and improve these abilities. We present SATQuest, a verifier that generates diverse SAT-based reasoning tasks directly from Conjunctive Normal Form (CNF) instances and checks answers objectively with PySAT. SATQuest factorizes evaluation along three orthogonal dimensions—instance, problem type, and question format—enabling fine-grained, multi-dimensional analysis and reinforcement fine-tuning. Randomized CNF generation mitigates memorization and supports reproducible experiments. Using SATQuest, we benchmark a range of open- and closed-weight LLMs and uncover persistent gaps in logical reasoning, particularly on higher-complexity tasks and in transfer beyond familiar mathematical notation to machine or narrative formats. We further show that reinforcement fine-tuning with SATQuest rewards substantially boosts targeted performance and generalizes to larger instances, while cross-format robustness remains challenging. Collectively, SATQuest provides verifier-backed infrastructure for controlled, scalable, and reproducible empirical research on LLM logical reasoning and its training.
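The generate-then-verify loop described in the abstract can be illustrated with a minimal sketch. It is not SATQuest's implementation: the function names (`random_cnf`, `satisfies`, `brute_force_model`) are hypothetical, the random k-CNF scheme is an assumption, and a brute-force search stands in for the PySAT solver so that the example needs only the standard library.

```python
import itertools
import random


def random_cnf(num_vars, num_clauses, k=3, seed=0):
    """Generate a random k-CNF instance as a list of clauses.

    Each clause is a list of non-zero integers in DIMACS style:
    v means variable v is true, -v means variable v is false.
    A fixed seed makes the instance reproducible (an assumption about
    how reproducibility might be achieved, not the paper's scheme).
    """
    rng = random.Random(seed)
    clauses = []
    for _ in range(num_clauses):
        variables = rng.sample(range(1, num_vars + 1), k)
        clauses.append([v if rng.random() < 0.5 else -v for v in variables])
    return clauses


def satisfies(clauses, assignment):
    """Check whether an assignment (dict: variable -> bool) satisfies every clause."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def brute_force_model(clauses, num_vars):
    """Return a satisfying assignment, or None if the formula is unsatisfiable.

    Exhaustive search over 2**num_vars assignments; only practical for
    small instances. A real verifier would call a SAT solver instead.
    """
    for bits in itertools.product([False, True], repeat=num_vars):
        assignment = {v: bits[v - 1] for v in range(1, num_vars + 1)}
        if satisfies(clauses, assignment):
            return assignment
    return None
```

In this sketch, a model's proposed assignment would be scored with `satisfies`, while a claim of unsatisfiability would be checked by confirming that `brute_force_model` returns `None`. Either check is objective because it depends only on the CNF instance, not on a reference answer string.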
2025
Disentangling Biased Representations: A Causal Intervention Framework for Fairer NLP Models
Yangge Qian | Yilong Hu | Siqi Zhang | Xu Gu | Xiaolin Qin
Proceedings of the 6th Workshop on Gender Bias in Natural Language Processing (GeBNLP)
Natural language processing (NLP) systems often inadvertently encode and amplify social biases through entangled representations of demographic attributes and task-related attributes. To mitigate this, we propose a novel framework that combines causal analysis with practical intervention strategies. The method leverages attribute-specific prompting to isolate sensitive attributes while applying information-theoretic constraints to minimize spurious correlations. Experiments across six language models and two classification tasks demonstrate its effectiveness. We hope this work will provide the NLP community with a causal disentanglement perspective for achieving fairness in NLP systems.