Yifan Xiang


2025

ReSURE: Regularizing Supervision Unreliability for Multi-turn Dialogue Fine-tuning
Yiming Du | Yifan Xiang | Bin Liang | Dahua Lin | Kam-Fai Wong | Fei Tan
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Fine-tuning multi-turn dialogue systems requires high-quality supervision, and performance often degrades when training is exposed to low-quality data. Supervision errors in early turns can propagate across subsequent turns, undermining coherence and response quality. Existing methods typically address data quality via static prefiltering, which decouples quality control from training and fails to mitigate turn-level error propagation. In this context, we propose **ReSURE** (REgularizing Supervision UnREliability), an adaptive learning method that dynamically down-weights unreliable supervision without explicit filtering. ReSURE estimates per-turn loss distributions using Welford’s online statistics and reweights sample losses on the fly accordingly. Experiments on both single-source and mixed-quality datasets show improved stability and response quality. Notably, ReSURE exhibits positive Spearman correlations (0.21 to 1.0 across multiple benchmarks) between response scores and the number of samples regardless of data quality, which potentially paves the way for utilizing large-scale data effectively.
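As a rough illustration of the idea named in the abstract, the sketch below keeps Welford-style running statistics of the loss at each turn position and shrinks the weight of losses that sit far above the running mean. The abstract does not give the paper's actual weighting rule, so `turn_weight`, its `threshold` parameter, the exponential decay, and the `reweight` helper are all assumptions made for this example, not the authors' implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Welford's online mean/variance for the losses seen at one turn position."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def std(self) -> float:
        # Sample standard deviation; undefined (reported as 0.0) before two observations.
        if self.count < 2:
            return 0.0
        return math.sqrt(self.m2 / (self.count - 1))


def turn_weight(stats: RunningStats, loss: float, threshold: float = 1.0) -> float:
    """Hypothetical rule: full weight for typical losses, exponential decay for outliers."""
    if stats.count < 2 or stats.std == 0.0:
        return 1.0  # not enough history to judge reliability
    z = (loss - stats.mean) / stats.std
    if z <= threshold:
        return 1.0
    return math.exp(-(z - threshold))


def reweight(turn_losses, stats_per_turn):
    """Weight each turn's loss using the statistics seen so far, then update them."""
    weighted = []
    for turn_idx, loss in enumerate(turn_losses):
        stats = stats_per_turn.setdefault(turn_idx, RunningStats())
        weighted.append(turn_weight(stats, loss) * loss)
        stats.update(loss)
    return weighted
```

In a training loop, `stats_per_turn` would be a dictionary that persists across batches, so each dialogue's per-turn losses are judged against what has been observed so far at the same turn position.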

Flexibly Utilize Memory for Long-Term Conversation via a Fragment-then-Compose Framework
Cai Ke | Yiming Du | Bin Liang | Yifan Xiang | Lin Gui | Zhongyang Li | Baojun Wang | Yue Yu | Hui Wang | Kam-Fai Wong | Ruifeng Xu
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Large language models (LLMs) have made significant breakthroughs in extracting useful information from conversation history to enhance responses in long-term conversations. Summarizing useful information from historical conversations has achieved remarkable performance, but it may introduce irrelevant or redundant information, making it difficult to flexibly choose and integrate key information from different sessions during memory retrieval. To address this issue, we propose *FraCom*, a Fragment-then-Compose framework that offers a novel memory utilization approach for long-term open-domain conversation. Specifically, inspired by the concept of propositional representation from cognitive psychology, we first represent the conversation history as a series of predicates plus arguments, preserving the key information useful for memory ("**Fragment**"). Then, we compose propositional graphs for the conversation history based on the connections between shared arguments ("**Compose**"). During retrieval, we retrieve relevant propositions from the graph based on arguments from the current query. This allows flexible and effective utilization of related information in long-term memory for better response generation to a query. Experimental results on four long-term open-domain conversation datasets demonstrate the effectiveness of *FraCom* in memory utilization and its ability to enhance response generation for LLMs.
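The Fragment and Compose steps described above can be pictured with a small toy sketch: propositions are stored as a predicate plus arguments, linked implicitly through an index of shared arguments, and retrieved by following those links from the query's arguments. The `Proposition` type, the `build_argument_index` and `retrieve` helpers, and the `hops` parameter are hypothetical names chosen for this illustration; the abstract does not specify how the actual system extracts propositions or traverses the graph.

```python
from collections import defaultdict
from typing import NamedTuple


class Proposition(NamedTuple):
    """A fragment of conversation history: one predicate and its arguments."""
    predicate: str
    arguments: tuple


def build_argument_index(props):
    """Map each argument to the propositions mentioning it.

    Two propositions that share an argument are neighbors in the implied graph.
    """
    index = defaultdict(set)
    for i, prop in enumerate(props):
        for arg in prop.arguments:
            index[arg].add(i)
    return index


def retrieve(props, index, query_args, hops=1):
    """Return propositions reachable from the query arguments within `hops` links."""
    frontier = set()
    for arg in query_args:
        frontier |= index.get(arg, set())
    seen = set(frontier)
    for _ in range(hops):
        reached = set()
        for i in frontier:
            for arg in props[i].arguments:
                reached |= index[arg]
        reached -= seen
        seen |= reached
        frontier = reached
    return [props[i] for i in sorted(seen)]


if __name__ == "__main__":
    history = [
        Proposition("adopt", ("Alice", "dog")),
        Proposition("name", ("dog", "Rex")),
        Proposition("visit", ("Bob", "Paris")),
    ]
    idx = build_argument_index(history)
    for prop in retrieve(history, idx, ["Alice"]):
        print(prop)
```

With one hop, a query mentioning `Alice` reaches the `adopt` proposition directly and the `name` proposition through the shared argument `dog`, while the unrelated `visit` proposition is left out.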