Fanyou Wu
2025
Disentangling Biased Knowledge from Reasoning in Large Language Models via Machine Unlearning
Zheyuan Liu | Suraj Maharjan | Fanyou Wu | Rahil Parikh | Belhassen Bayar | Srinivasan H. Sengamedu | Meng Jiang
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The rapid development of Large Language Models (LLMs) has led to their widespread adoption across various domains, leveraging vast pre-training knowledge and impressive generalization capabilities. However, these models often inherit biased knowledge, resulting in unfair decisions in sensitive applications. Removing this biased knowledge without compromising reasoning abilities is challenging because the learned knowledge within LLMs is deeply entangled. To solve this problem, existing approaches have attempted to mitigate bias using techniques such as fine-tuning with unbiased datasets, model merging, and gradient ascent. While these methods have proven effective in experiments, they can still be sub-optimal at fully disentangling biases from reasoning. To address this gap, we propose Selective Disentanglement Unlearning (SDU), a novel unlearning framework that selectively removes biased knowledge while preserving reasoning capabilities. SDU operates in three stages: identifying biased parameters using a shadow LLM, fine-tuning with unbiased data, and performing selective parameter updates based on weight saliency. Experimental results across multiple LLMs show that SDU improves fairness accuracy by 14.7% and enhances reasoning performance by 62.6% compared to existing baselines.
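The abstract only names the three stages; the exact saliency criterion is defined in the paper itself. As a rough, hypothetical illustration of the third stage, a gradient-magnitude mask over parameters might look like the following sketch (the function names, threshold rule, and update scheme are assumptions for illustration, not the authors' implementation):

```python
import torch

def saliency_mask(model, bias_loss, threshold=1e-3):
    """Mark parameters whose gradient magnitude on a bias-probing loss
    exceeds a threshold; only these are treated as 'biased' and updated."""
    model.zero_grad()
    bias_loss.backward()
    return {
        name: (p.grad.abs() > threshold).float()
        for name, p in model.named_parameters()
        if p.grad is not None
    }

def masked_step(model, masks, lr=1e-5):
    """Apply one gradient step only where the saliency mask is 1,
    leaving the remaining (reasoning-related) weights untouched."""
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name in masks and p.grad is not None:
                p -= lr * masks[name] * p.grad
```

Gating updates through a binary mask like this is one plausible way to edit bias-carrying weights while leaving the rest of the network, and hence its reasoning behavior, intact.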
2024
Synthesizing Conversations from Unlabeled Documents using Automatic Response Segmentation
Fanyou Wu | Weijie Xu | Chandan Reddy | Srinivasan Sengamedu
Findings of the Association for Computational Linguistics: ACL 2024
In this study, we tackle the challenge of inadequate and costly training data that has hindered the development of conversational question answering (ConvQA) systems. Enterprises have large corpora of diverse internal documents. Rather than relying on a search engine, a more compelling way for people to comprehend these documents is through a dialogue system. In this paper, we propose a robust dialogue synthesis method. We learn the segmentation of data for the dialogue task instead of segmenting at sentence boundaries. The synthetic dataset generated by our proposed method achieves superior quality compared to WikiDialog, as assessed through machine and human evaluations. By employing our inpainted data to pre-train a ConvQA retrieval system, we observed a notable improvement in performance on the OR-QuAC benchmark.
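To make the inpainting idea concrete, here is a toy sketch of turning pre-segmented document spans into a synthetic dialogue; both the segmenter output and the question generator are hypothetical stand-ins for the paper's learned components:

```python
def synthesize_dialogue(segments, generate_question):
    """Toy sketch: each learned document segment becomes an agent turn,
    and a generator 'inpaints' the user question that would elicit it.
    `segments` and `generate_question` are assumed interfaces, not the
    paper's actual components."""
    dialogue, history = [], []
    for segment in segments:
        question = generate_question(history, segment)
        dialogue.append({"user": question, "agent": segment})
        history += [question, segment]
    return dialogue
```

The key difference the abstract emphasizes is where `segments` comes from: learned, task-aware boundaries rather than naive sentence splits.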
2023
DeTiME: Diffusion-Enhanced Topic Modeling using Encoder-decoder based LLM
Weijie Xu | Wenxiang Hu | Fanyou Wu | Srinivasan Sengamedu
Findings of the Association for Computational Linguistics: EMNLP 2023
In the burgeoning field of natural language processing, Neural Topic Models (NTMs) and Large Language Models (LLMs) have emerged as areas of significant research interest. Despite this, NTMs primarily utilize contextual embeddings from LLMs, which are neither optimal for clustering nor capable of topic generation. Our study addresses this gap by introducing a novel framework named Diffusion-Enhanced Topic Modeling using Encoder-Decoder-based LLMs (DeTiME). DeTiME leverages encoder-decoder-based LLMs to produce highly clusterable embeddings and generates topics that exhibit both superior clusterability and enhanced semantic coherence compared to existing methods. Additionally, by exploiting the power of diffusion, our framework provides the capability to generate content relevant to the identified topics. This dual functionality allows users to efficiently produce highly clustered topics and related content simultaneously. DeTiME’s potential extends to generating clustered embeddings as well. Notably, our proposed framework is efficient to train and highly adaptable, demonstrating its potential for a wide array of applications.
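As a loose illustration of the embed-then-cluster step only (not DeTiME itself, which additionally uses diffusion for generation), one might mean-pool an encoder-decoder model's encoder states and cluster them into topics; the model choice (t5-base), pooling scheme, and k-means clustering below are assumptions, not the paper's method:

```python
# Rough sketch: clusterable embeddings from an encoder-decoder LLM's encoder.
import torch
from transformers import AutoTokenizer, T5EncoderModel
from sklearn.cluster import KMeans

tokenizer = AutoTokenizer.from_pretrained("t5-base")
encoder = T5EncoderModel.from_pretrained("t5-base")

def embed(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state      # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)         # zero out padding
    return (hidden * mask).sum(1) / mask.sum(1)          # mean-pool to (B, H)

docs = ["solar panels cut energy bills", "transformers power modern NLP"]
topic_ids = KMeans(n_clusters=2, n_init=10).fit_predict(embed(docs).numpy())
```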