Ruitong Liu
2026
Rethinking Text-to-SQL: Dynamic Multi-turn SQL Interaction for Real-world Database Exploration
Linzhuang Sun | Tianyu Guo | Hao Liang | Ruitong Liu | Yuying Li | Qifeng Cai | Jingxuan Wei | Yuchen Wu | Bihui Yu | Xiangxiang Zhang | Wentao Zhang | Bin Cui
Findings of the Association for Computational Linguistics: ACL 2026
Recent advancements in Large Language Models (LLMs) have revolutionized Text-to-SQL parsing, achieving remarkable success in static, single-turn query generation. However, a significant disparity remains between these academic benchmarks and real-world utility. In practical applications, such as financial auditing or business analytics, user intents are rarely static; they evolve dynamically through iterative refinement, necessitating not just information retrieval (SELECT) but continuous state manipulation (INSERT, UPDATE, DELETE). To bridge this gap, we introduce DySQL-Bench, a novel benchmark designed to rigorously evaluate LLMs within a dynamic interaction framework. Unlike varying manual curation efforts, DySQL-Bench employs a two-stage automated synthesis pipeline: transforming raw relational schemas into hierarchical logic trees to generate user-database interactions, followed by a rigorous verify-and-refine protocol that ensures 100% distinct correctness via human expert validation. We further propose an interactive evaluation environment simulating a triadic workflow involving an LLM-simulated user, the agent under test, and an executable database system. Spanning 13 diverse domains with 1,072 complex tasks, our experiments reveal that current powerful models struggle in this realistic setting. Notably, GPT-4o achieves only 58.34% overall accuracy and a meager 23.81% on the strict Pass^5 metric, highlighting the substantial challenges DySQL-Bench poses for the future of database agents.
Evolving Beyond Snapshots: Harmonizing Structure and Sequence via Entity State Tuning for Temporal Knowledge Graph Forecasting
Siyuan Li | Yunjia Wu | Yiyong Xiao | Pingyang Huang | Peize Li | Ruitong Liu | Yan Wen | Te Sun
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Temporal knowledge graph (TKG) forecasting requires predicting future facts by jointly modeling structural dependencies within each snapshot and temporal evolution across snapshots. However, most existing methods are stateless: they recompute entity representations at each timestamp from a limited query window, leading to episodic amnesia and rapid decay of long-term dependencies. To address this limitation, we propose Entity State Tuning (EST), an encoder-agnostic framework that endows TKG forecasters with persistent and continuously evolving entity states. EST maintains a global state buffer and progressively aligns structural evidence with sequential signals via a closed-loop design. Specifically, a topology-aware state perceiver first injects entity-state priors into structural encoding. Then, a unified temporal context module aggregates the state-enhanced events with a pluggable sequence backbone. Subsequently, a dual-track evolution mechanism writes the updated context back to the global entity state memory, balancing plasticity against stability. Experiments on multiple benchmarks show that EST consistently improves diverse backbones and achieves state-of-the-art performance, highlighting the importance of state persistence for long-horizon TKG forecasting.
2025
TueCL at SemEval-2025 Task 1: Image-Augmented Prompting and Multimodal Reasoning for Enhanced Idiom Understanding
Yue Yu | Jiarong Tang | Ruitong Liu
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
This paper presents our approach for SemEval-2025 Task 1, Advancing Multimodal Idiomaticity Representation (AdMIRe), which focuses on idiom image ranking via semantic similarity. We explore multiple strategies, including neural networks on extracted embeddings and Siamese networks with triplet loss. A key component of our methodology is the application of advanced prompt engineering techniques within multimodal in-context learning (ManyICL), leveraging GPT-4o and CLIP. Our experiments demonstrate that structured and optimized prompts significantly enhance the model’s ability to interpret idiomatic expressions in a multimodal setting.
CLEME2.0: Towards Interpretable Evaluation by Disentangling Edits for Grammatical Error Correction
Jingheng Ye | Zishan Xu | Yinghui Li | Linlin Song | Qingyu Zhou | Hai-Tao Zheng | Ying Shen | Wenhao Jiang | Hong-Gee Kim | Ruitong Liu | Xin Su | Zifei Shan
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The paper focuses on the interpretability of Grammatical Error Correction (GEC) evaluation metrics, which received little attention in previous studies. To bridge the gap, we introduce **CLEME2.0**, a reference-based metric describing four fundamental aspects of GEC systems: hit-correction, wrong-correction, under-correction, and over-correction. They collectively contribute to exposing critical qualities and locating drawbacks of GEC systems. Evaluating systems by combining these aspects also leads to superior human consistency over other reference-based and reference-less metrics. Extensive experiments on two human judgment datasets and six reference datasets demonstrate the effectiveness and robustness of our method, achieving a new state-of-the-art result. Our code is released at https://github.com/THUKElab/CLEME.
Position: LLMs Can be Good Tutors in English Education
Jingheng Ye | Shen Wang | Deqing Zou | Yibo Yan | Kun Wang | Hai-Tao Zheng | Ruitong Liu | Zenglin Xu | Irwin King | Philip S. Yu | Qingsong Wen
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
While recent efforts have begun integrating large language models (LLMs) into English education, they often rely on traditional approaches to learning tasks without fully embracing educational methodologies, thus lacking adaptability to language learning. To address this gap, we argue that **LLMs have the potential to serve as effective tutors in English Education**. Specifically, LLMs can play three critical roles: (1) as data enhancers, improving the creation of learning materials or serving as student simulations; (2) as task predictors, supporting learner assessment or optimizing learning pathways; and (3) as agents, enabling personalized and inclusive education. We encourage interdisciplinary research to explore these roles, fostering innovation while addressing challenges and risks, ultimately advancing English Education through the thoughtful integration of LLMs.
Co-authors
- Jingheng Ye 2
- Hai-Tao Zheng 2
- Qifeng Cai 1
- Bin Cui 1
- Tianyu Guo 1
- Pingyang Huang 1
- Wenhao Jiang 1
- Hong-Gee Kim 1
- Irwin King 1
- Yuying Li 1
- Siyuan Li 1
- Peize Li 1
- Yinghui Li 1
- Hao Liang 1
- Zifei Shan 1
- Ying Shen 1
- Linlin Song 1
- Xin Su 1
- Linzhuang Sun 1
- Te Sun 1
- Jiarong Tang 1
- Shen Wang 1
- Kun Wang 1
- Jingxuan Wei 1
- Yan Wen 1
- Qingsong Wen 1
- Yuchen Wu 1
- Yunjia Wu 1
- Yiyong Xiao 1
- Zishan Xu 1
- Zenglin Xu 1
- Yibo Yan 1
- Bihui Yu 1
- Yue Yu 1
- Philip S. Yu 1
- Xiangxiang Zhang 1
- Wentao Zhang 1
- Qingyu Zhou 1
- Deqing Zou 1