Ruitong Liu


2025

CLEME2.0: Towards Interpretable Evaluation by Disentangling Edits for Grammatical Error Correction
Jingheng Ye | Zishan Xu | Yinghui Li | Linlin Song | Qingyu Zhou | Hai-Tao Zheng | Ying Shen | Wenhao Jiang | Hong-Gee Kim | Ruitong Liu | Xin Su | Zifei Shan
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

The paper focuses on the interpretability of Grammatical Error Correction (GEC) evaluation metrics, which has received little attention in previous studies. To bridge this gap, we introduce **CLEME2.0**, a reference-based metric describing four fundamental aspects of GEC systems: hit-correction, wrong-correction, under-correction, and over-correction. Together, these aspects expose critical qualities and locate drawbacks of GEC systems. Combining them for system evaluation also yields superior human consistency over other reference-based and reference-less metrics. Extensive experiments on two human judgment datasets and six reference datasets demonstrate the effectiveness and robustness of our method, achieving a new state-of-the-art result. Our code is released at https://github.com/THUKElab/CLEME.

Position: LLMs Can be Good Tutors in English Education
Jingheng Ye | Shen Wang | Deqing Zou | Yibo Yan | Kun Wang | Hai-Tao Zheng | Ruitong Liu | Zenglin Xu | Irwin King | Philip S. Yu | Qingsong Wen
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

While recent efforts have begun integrating large language models (LLMs) into English education, they often rely on traditional approaches to learning tasks without fully embracing educational methodologies, and thus lack adaptability to language learning. To address this gap, we argue that **LLMs have the potential to serve as effective tutors in English education**. Specifically, LLMs can play three critical roles: (1) as data enhancers, improving the creation of learning materials or serving as student simulators; (2) as task predictors, performing learner assessment or optimizing learning pathways; and (3) as agents, enabling personalized and inclusive education. We encourage interdisciplinary research to explore these roles, fostering innovation while addressing challenges and risks, ultimately advancing English education through the thoughtful integration of LLMs.

TueCL at SemEval-2025 Task 1: Image-Augmented Prompting and Multimodal Reasoning for Enhanced Idiom Understanding
Yue Yu | Jiarong Tang | Ruitong Liu
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

This paper presents our approach for SemEval-2025 Task 1, Advancing Multimodal Idiomaticity Representation (AdMIRe), which focuses on idiom image ranking via semantic similarity. We explore multiple strategies, including neural networks on extracted embeddings and Siamese networks with triplet loss. A key component of our methodology is the application of advanced prompt engineering techniques within multimodal in-context learning (ManyICL), leveraging GPT-4o and CLIP. Our experiments demonstrate that structured and optimized prompts significantly enhance the model's ability to interpret idiomatic expressions in a multimodal setting.