You Li
2025
Multi-Stage LLM Fine-Tuning with a Continual Learning Setting
Changhao Guan | Chao Huang | Hongliang Li | You Li | Ning Cheng | Zihe Liu | Yufeng Chen | Jinan Xu | Jian Liu
Findings of the Association for Computational Linguistics: NAACL 2025
In recent years, large language models (LLMs) have made significant progress in knowledge-intensive applications. However, when adapting them to specific domains, we may encounter a multi-stage continual learning scenario, especially in cases where domain knowledge evolves rapidly. This issue severely limits traditional fine-tuning approaches for LLMs. To overcome this limitation, we propose a new learning paradigm designed specifically for multi-stage continual learning. This paradigm includes a preference-based learning bias to identify potential knowledge conflicts, as well as a self-distillation-based data augmentation strategy to expand and enrich the training corpus, thereby improving the integration of knowledge-compatible information. In our experiments, we show that the proposed method achieves a significant improvement in accuracy after 7 stages of fine-tuning compared to previous methods, while also demonstrating excellent performance in preserving general knowledge. We have released our code and dataset at Multi-Stage-Learning.
2024
Knowledge-Guided Dynamic Modality Attention Fusion Framework for Multimodal Sentiment Analysis
Xinyu Feng | Yuming Lin | Lihua He | You Li | Liang Chang | Ya Zhou
Findings of the Association for Computational Linguistics: EMNLP 2024
Multimodal Sentiment Analysis (MSA) utilizes multimodal data to infer users’ sentiment. Previous methods either treat the contribution of each modality equally or statically use text as the dominant modality for interaction, neglecting the situation where any modality may become dominant. In this paper, we propose a Knowledge-Guided Dynamic Modality Attention Fusion Framework (KuDA) for multimodal sentiment analysis. KuDA uses sentiment knowledge to guide the model in dynamically selecting the dominant modality and adjusting the contribution of each modality. In addition, with the obtained multimodal representation, the model can further highlight the contribution of the dominant modality through the correlation evaluation loss. Extensive experiments on four MSA benchmark datasets indicate that KuDA achieves state-of-the-art performance and is able to adapt to different scenarios of dominant modality.
2023
Rethinking the Construction of Effective Metrics for Understanding the Mechanisms of Pretrained Language Models
You Li | Jinhui Yin | Yuming Lin
Findings of the Association for Computational Linguistics: EMNLP 2023
Pretrained language models are expected to effectively map input text to a set of vectors while preserving the inherent relationships within the text. Consequently, designing a white-box model to compute metrics that reflect the presence of specific internal relations in these vectors has become a common approach for post-hoc interpretability analysis of pretrained language models. However, achieving interpretability in white-box models and ensuring the rigor of metric computation becomes challenging when the source model lacks inherent interpretability. Therefore, in this paper, we discuss striking a balance in this trade-off and propose a novel line of investigation for constructing metrics to understand the mechanisms of pretrained language models. We specifically design a family of metrics along this line, and refer to the model used to compute them as the tree topological probe. We conduct measurements on BERT-large using these metrics. Based on the experimental results, we propose a speculation regarding the working mechanism of BERT-like pretrained language models, as well as a strategy for enhancing fine-tuning performance by leveraging the topological probe to improve specific submodules.