Qi Qi
2026
Modeling and Solving Stable Matching under Probabilistic Preferences with Large Language Models
Yuqi Kong | Shiyu Liu | Jiaxu Li | Hongtao Liu | Qi Qi | Weiran Shen
Findings of the Association for Computational Linguistics: ACL 2026
Large language models (LLMs) have recently demonstrated strong capability in understanding and simulating human decisions, suggesting a new way to use LLMs as tools to study social systems. We study two-sided matching markets, such as dating and job matching. Classical matching models assume deterministic, strict preferences, which do not reflect real-world settings. We focus on stable matching under stochastic decision behavior and use LLMs to simulate human-like preferences and probabilistic choice patterns. Based on this, we introduce Expected Blocking Pairs (EBP), a continuous measure to quantify stability that generalizes the classic blocking pair notion. We further propose a Hybrid GS–LLM matching method that integrates the celebrated Gale–Shapley (GS) algorithm with probabilistic acceptance decisions. Experiments show that the proposed hybrid method outperforms classical baselines in terms of stability, suggesting that LLMs provide a principled tool for modeling human decisions and for improving the robustness of matching under uncertainty.
Towards Trustworthy Smart Contract Synthesis: A Multi-Agent Framework with Lean-Based Verification
Bowei Zhang | Hanbing Liu | Qixin Tian | Siyu Chen | Ziyuan Wang | Qi Qi
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Smart contracts are the foundation of Decentralized Finance (DeFi), executing financial logic without trusted intermediaries. Recent advances in large language models (LLMs) have substantially lowered the barrier to smart contract development by enabling code generation from natural language. However, because smart contracts are immutable and directly manage financial assets, this accessibility introduces a critical trust gap: generated contracts are easy to produce but hard to trust. To bridge this gap, we present LeVer, the first trustworthy smart contract synthesis framework that integrates LLM-based generation with Lean-based auto-formalization and Verification. LeVer employs a closed-loop multi-agent architecture to iteratively generate, verify, attack, and repair contracts, providing both formal guarantees and empirical robustness. To facilitate the adoption of automated formal verification in smart contract generation and auditing, we open-source our framework and datasets at: https://github.com/gl-bowei/LeVer
2025
Uncovering the Impact of Chain-of-Thought Reasoning for Direct Preference Optimization: Lessons from Text-to-SQL
Hanbing Liu | Haoyang Li | Xiaokang Zhang | Ruotong Chen | Haiyong Xu | Tian Tian | Qi Qi | Jing Zhang
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Direct Preference Optimization (DPO) has proven effective in complex reasoning tasks like math word problems and code generation. However, when applied to Text-to-SQL datasets, it often fails to improve performance and can even degrade it. Our investigation reveals the root cause: unlike math and code tasks, which naturally integrate Chain-of-Thought (CoT) reasoning with DPO, Text-to-SQL datasets typically include only final answers (gold SQL queries) without detailed CoT solutions. By augmenting Text-to-SQL datasets with synthetic CoT solutions, we achieve, for the first time, consistent and significant performance improvements using DPO. Our analysis shows that CoT reasoning is crucial for unlocking DPO’s potential, as it mitigates reward hacking, strengthens discriminative capabilities, and improves scalability. These findings offer valuable insights for building more robust Text-to-SQL models. To support further research, we publicly release the code and CoT-enhanced datasets: https://github.com/RUCKBReasoning/DPO_Text2SQL.