Jiatong Li

The Hong Kong Polytechnic University

Other people with similar names: Jiatong Li (Rutgers)


2025

ReAL: How Can LLMs Simulate the Real Teacher? Retrieval-enhanced Agent for Adaptive Learning
Rui Lv | Qi Liu | Weibo Gao | Jiatong Li | Kai Zhang | Shiwei Tong
Findings of the Association for Computational Linguistics: EMNLP 2025

Adaptive learning focuses on recommending personalized materials (e.g., exercises, courses) tailored to the unique needs of learners. Despite significant research, these methods still lag behind real teachers, with two main limitations: (1) prior methods model learner-item interactions based only on ID sequences, leading to insufficient use of both learner and item information, particularly the inability to leverage semantic content from item text; (2) data-driven reinforcement learning frameworks struggle to maintain stable performance in scenarios with sparse learning logs. To address these challenges, we introduce the Retrieval-enhanced Agent for Adaptive Learning (ReAL), powered by large language models (LLMs), to simulate teacher decision-making with extensive prior knowledge and teaching experience. Specifically, we approach the simulation from both internal and external perspectives. From the internal perspective, we utilize the superior natural language understanding ability of LLMs to analyze item texts and learner profiles, which contributes to the generation of personalized and appropriate item candidates. From the external perspective, we simulate teacher experience by retrieving similar learners, further ensuring the model’s performance on sparse interaction data. Furthermore, we design a reflector based on learners’ feedback to refine the recommendation process. Evaluation on three real-world datasets demonstrates the superiority of ReAL in data utilization, recommendation accuracy, and stability compared to various representative baselines.
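The abstract names three moving parts: LLM analysis of item texts and learner profiles (internal), retrieval of similar learners (external), and a feedback-driven reflector. The sketch below shows one minimal way these pieces could compose; every function, field, and prompt here is hypothetical, as the paper does not publish this interface.

```python
import numpy as np

def retrieve_similar_learners(profile_vec, learner_bank, k=3):
    """Rank stored learners by cosine similarity to the target profile."""
    vecs = np.stack([l["vec"] for l in learner_bank])
    sims = vecs @ profile_vec / (
        np.linalg.norm(vecs, axis=1) * np.linalg.norm(profile_vec) + 1e-9
    )
    return [learner_bank[i] for i in np.argsort(-sims)[:k]]

def recommend(llm, profile_text, item_texts, profile_vec, learner_bank):
    """One ReAL-style step: retrieve peers, generate a candidate, reflect."""
    peers = retrieve_similar_learners(profile_vec, learner_bank)
    prompt = (
        f"Learner profile:\n{profile_text}\n\n"
        "Items that helped similar learners:\n"
        + "\n".join(p["history"] for p in peers)
        + "\n\nCandidate items:\n" + "\n".join(item_texts)
        + "\n\nRecommend the most suitable next item and justify it."
    )
    candidate = llm(prompt)
    # Reflector: critique the candidate, then revise the recommendation.
    critique = llm(f"Critique this recommendation:\n{candidate}")
    return llm(f"Revise the recommendation given this critique:\n{critique}")
```

In this reading, robustness on sparse logs comes from the retrieval step: even a learner with few interactions can borrow evidence from similar learners’ histories.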

Summarize-Exemplify-Reflect: Data-driven Insight Distillation Empowers LLMs for Few-shot Tabular Classification
Yifei Yuan | Jiatong Li | Weijia Zhang | Mohammad Aliannejadi | Evangelos Kanoulas | Renjun Hu
Findings of the Association for Computational Linguistics: EMNLP 2025

Recent studies show the promise of large language models (LLMs) for few-shot tabular classification but highlight challenges due to the variability in structured data. To address this, we propose distilling data into actionable insights to enable robust and effective classification by LLMs. Drawing inspiration from human learning processes, we introduce InsightTab, an insight distillation framework guided by principles of divide-and-conquer, easy-first, and reflective learning. Our approach integrates rule summarization, strategic exemplification, and insight reflection through deep collaboration between LLMs and data modeling techniques. The obtained insights enable LLMs to better align their general knowledge and capabilities with the particular requirements of specific tabular tasks. We extensively evaluate InsightTab on nine datasets. The results demonstrate consistent improvement over state-of-the-art methods. Ablation studies further validate the principle-guided distillation process, while analyses emphasize InsightTab’s effectiveness in leveraging labeled data and managing bias.
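Read as pseudocode, the summarize-exemplify-reflect loop maps onto a short pipeline: distill rules, check which examples the rules already handle, and revise against the mistakes. A hedged sketch under assumed names (format_rows, the prompts, and the loop structure are invented here, not the paper’s API):

```python
def format_rows(rows, labels=None):
    """Render tabular rows as 'column=value' text for prompting."""
    lines = []
    for i, row in enumerate(rows):
        text = ", ".join(f"{k}={v}" for k, v in row.items())
        if labels is not None:
            text += f" -> {labels[i]}"
        lines.append(text)
    return "\n".join(lines)

def insight_tab_predict(llm, train_rows, train_labels, test_row):
    # Summarize: distill per-label rules from the labeled rows.
    rules = llm("Summarize rules that separate these rows by label:\n"
                + format_rows(train_rows, train_labels))
    # Exemplify (easy-first): see which rows the rules already classify.
    preds = [llm(f"Rules:\n{rules}\nClassify: {format_rows([r])}").strip()
             for r in train_rows]
    mistakes = [(r, y) for r, y, p
                in zip(train_rows, train_labels, preds) if p != y]
    # Reflect: revise the rules against the rows they got wrong.
    if mistakes:
        rules = llm(f"Rules:\n{rules}\nRevise them to also cover:\n"
                    + format_rows([r for r, _ in mistakes],
                                  [y for _, y in mistakes]))
    return llm(f"Rules:\n{rules}\nClassify: {format_rows([test_row])}")
```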

LLaMA-Berry: Pairwise Optimization for Olympiad-level Mathematical Reasoning via O1-like Monte Carlo Tree Search
Di Zhang | Jianbo Wu | Jingdi Lei | Tong Che | Jiatong Li | Tong Xie | Xiaoshui Huang | Shufei Zhang | Marco Pavone | Yuqiang Li | Wanli Ouyang | Dongzhan Zhou
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

This paper presents LLaMA-Berry, an advanced mathematical reasoning framework to enhance the problem-solving ability of large language models (LLMs). The framework combines Monte Carlo Tree Search with Self-Refine (SR-MCTS) to optimize reasoning paths and uses a pairwise reward model to evaluate different paths globally. By leveraging the self-critique and rewriting capabilities of LLMs, SR-MCTS overcomes the inefficiencies and limitations of conventional step-wise and greedy search algorithms, enabling more efficient exploration of solution spaces. To guide the search, we propose the Pairwise Preference Reward Model (PPRM), which predicts pairwise preferences between solutions via instruction-following capabilities trained with Reinforcement Learning from Human Feedback (RLHF). Finally, the Enhanced Borda Count (EBC) method synthesizes these pairwise preferences into global quantile scores for evaluation. This approach mitigates the challenges of scoring variability and non-independent distributions in mathematical reasoning tasks. The framework has been tested on general and advanced benchmarks, showing superior search efficiency and performance compared to existing open-source and closed-source methods, particularly on complex Olympiad-level benchmarks such as AIME24 and AMC23.
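The final aggregation step is the most self-contained piece: pairwise preferences from PPRM become a global ranking. The snippet below is plain Borda counting over a preference matrix, normalized to quantile scores; the paper’s Enhanced Borda Count adds refinements beyond this baseline that are not reproduced here.

```python
import numpy as np

def borda_quantiles(pref):
    """Convert a pairwise preference matrix into global quantile scores.

    pref[i, j] = 1 if solution i is preferred over solution j, else 0.
    Plain Borda counting only; LLaMA-Berry's EBC refines this further.
    """
    pref = np.asarray(pref, dtype=float)
    wins = pref.sum(axis=1)               # Borda score: pairwise wins
    ranks = wins.argsort().argsort()      # 0 = worst, n - 1 = best
    return ranks / max(len(wins) - 1, 1)  # normalize to [0, 1]

# Example: solution 0 beats 1 and 2, solution 1 beats 2.
P = [[0, 1, 1],
     [0, 0, 1],
     [0, 0, 0]]
print(borda_quantiles(P))  # -> quantiles 1.0, 0.5, 0.0
```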

2024

MFE-NER: Multi-feature Fusion Embedding for Chinese Named Entity Recognition
Jiatong Li | Kui Meng
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)

In Chinese Named Entity Recognition, character substitution is a complicated linguistic phenomenon. Some Chinese characters are quite similar, as they share the same components or have similar pronunciations. People replace characters in a named entity with similar characters to generate a new collocation that still refers to the same object. As a result, this often leads to unrecognizable entities or mislabeling errors in the NER task. In this paper, we propose a lightweight method, MFE-NER, which fuses glyph and phonetic features to help pre-trained language models handle the character substitution problem in the NER task at limited extra cost. In the glyph domain, we disassemble Chinese characters into Five-Stroke components to represent structure features. In the phonetic domain, we propose an improved phonetic system that makes it reasonable to describe phonetic similarity among Chinese characters. Experiments demonstrate that our method performs especially well in detecting character substitutions while slightly improving the overall performance of Chinese NER.
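Fusion of this kind is commonly implemented as embedding concatenation. The PyTorch sketch below is one hypothetical reading of the abstract, not the authors’ code: the dimensions, the single-id simplification (a real model would pool several Five-Stroke components per character), and the use of nn.Embedding as a stand-in for the pre-trained model are all assumptions.

```python
import torch
import torch.nn as nn

class MultiFeatureEmbedding(nn.Module):
    """Concatenate semantic, glyph (Five-Stroke), and phonetic features."""

    def __init__(self, n_chars, n_strokes, n_phones,
                 d_sem=768, d_glyph=32, d_phon=32):
        super().__init__()
        self.sem = nn.Embedding(n_chars, d_sem)        # stand-in for BERT output
        self.glyph = nn.Embedding(n_strokes, d_glyph)  # Five-Stroke component id
        self.phon = nn.Embedding(n_phones, d_phon)     # phonetic id

    def forward(self, char_ids, stroke_ids, phone_ids):
        # Characters that look or sound alike now share part of their vector,
        # so substituted characters land near the originals in embedding space.
        return torch.cat([self.sem(char_ids),
                          self.glyph(stroke_ids),
                          self.phon(phone_ids)], dim=-1)
```

The fused vectors would then feed a standard tagging head (e.g., a BiLSTM-CRF) exactly as ordinary character embeddings would.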

2023

Conflicts, Villains, Resolutions: Towards models of Narrative Media Framing
Lea Frermann | Jiatong Li | Shima Khanehzar | Gosia Mikolajczak
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Despite increasing interest in the automatic detection of media frames in NLP, the problem is typically simplified to single-label classification with a topic-like view of frames, sidestepping models of the broader document-level narrative. In this work, we revisit a widely used conceptualization of framing from the communication sciences which explicitly captures elements of narratives, including conflict and its resolution, and integrate it with the narrative framing of key entities in the story as heroes, victims, or villains. We adapt an effective annotation paradigm that breaks a complex annotation task into a series of simpler binary questions, and present an annotated data set of English news articles together with a case study on the framing of climate change in articles from news outlets across the political spectrum. Finally, we explore automatic multi-label prediction of our frames with supervised and semi-supervised approaches, and present a novel retrieval-based method which is both effective and transparent in its predictions. We conclude with a discussion of opportunities and challenges for future work on document-level models of narrative framing.
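The retrieval-based predictor invites a compact illustration: label a new article with the frames its nearest labeled neighbours carry, which keeps predictions transparent because the retrieved articles are the explanation. The sketch below is a generic nearest-neighbour baseline, not the paper’s exact method; the TF-IDF representation, k, and the vote threshold are all assumptions.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def retrieval_predict(train_docs, train_labels, query, k=5, threshold=0.4):
    """Multi-label framing prediction by similarity-weighted neighbour vote.

    train_labels: one binary vector per document, one slot per frame.
    Returns a boolean mask over frames for the query document.
    """
    vec = TfidfVectorizer().fit(train_docs + [query])
    X, q = vec.transform(train_docs), vec.transform([query])
    sims = (X @ q.T).toarray().ravel()
    top = np.argsort(-sims)[:k]                     # k nearest articles
    votes = np.zeros(len(train_labels[0]))
    for i in top:
        votes += sims[i] * np.asarray(train_labels[i], dtype=float)
    votes /= sims[top].sum() + 1e-9                 # weighted vote share
    return votes >= threshold
```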