Yu Han


2025

RTE-GMoE: A Model-agnostic Approach for Relation Triplet Extraction via Graph-based Mixture-of-Expert Mutual Learning
Aziguli Wulamu | Kaiyuan Gong | Lyu Zhengyu | Yu Han | Zhihong Zhu | Bowen Xing
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Relation Triplet Extraction (RTE) is a fundamental yet challenging task in knowledge acquisition, which identifies and extracts all triplets from unstructured text. Despite recent advancements, the deep integration of entity-, relation- and triplet-specific information remains a challenge. In this paper, we propose a Graph-based Mixture-of-Experts mutual learning framework for RTE, namely RTE-GMoE, to address this limitation. As a model-agnostic framework, RTE-GMoE distinguishes itself by modeling the mutual interactions among three vital task-specific experts: an entity expert, an RTE expert, and a relation expert. The RTE expert corresponds to the main RTE task and can be implemented by any model, while the other two correspond to the auxiliary tasks of entity recognition and relation extraction. We construct an expert graph and achieve comprehensive and adaptive graph-based MoE interactions with a novel mutual learning mechanism. In our framework, these experts perform knowledge extraction collaboratively via dynamic information exchange and knowledge sharing. We conduct extensive experiments on four state-of-the-art backbones and evaluate them on several widely-used benchmarks. The results demonstrate that our framework brings consistent and promising improvements on all backbones and benchmarks. Component studies and model analyses further verify the effectiveness and advantages of our method.
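A minimal, hypothetical PyTorch sketch of the expert-graph idea described above: three task-specific expert representations exchange information through learned, adaptive edge weights before each expert makes its own prediction. All module names, dimensions, and prediction heads are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch only; not the RTE-GMoE code.
import torch
import torch.nn as nn

class ExpertGraphInteraction(nn.Module):
    def __init__(self, dim, n_entity_labels, n_relation_labels, n_triplet_labels):
        super().__init__()
        self.edge_scorer = nn.Linear(2 * dim, 1)   # scores each directed expert-expert edge
        self.update = nn.Linear(dim, dim)
        self.entity_head = nn.Linear(dim, n_entity_labels)
        self.relation_head = nn.Linear(dim, n_relation_labels)
        self.rte_head = nn.Linear(dim, n_triplet_labels)

    def forward(self, entity_feat, relation_feat, rte_feat):
        # stack the three expert representations as nodes of a small expert graph
        nodes = torch.stack([entity_feat, relation_feat, rte_feat], dim=1)  # (B, 3, dim)
        b, n, d = nodes.shape
        src = nodes.unsqueeze(2).expand(b, n, n, d)
        dst = nodes.unsqueeze(1).expand(b, n, n, d)
        attn = self.edge_scorer(torch.cat([src, dst], dim=-1)).squeeze(-1).softmax(dim=-1)
        nodes = nodes + torch.tanh(self.update(attn @ nodes))  # adaptive message passing
        ent, rel, rte = nodes.unbind(dim=1)
        return self.entity_head(ent), self.relation_head(rel), self.rte_head(rte)
```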

HTML: Hierarchical Topology Multi-task Learning for Semantic Parsing in Knowledge Base Question Answering
Aziguli Wulamu | Lyu Zhengyu | Kaiyuan Gong | Yu Han | Zewen Wang | Zhihong Zhu | Bowen Xing
Findings of the Association for Computational Linguistics: ACL 2025

Knowledge base question answering (KBQA) aims to answer natural language questions by reasoning over structured knowledge bases. Existing approaches often struggle with the complexity of mapping questions to precise logical forms, particularly when dealing with diverse entities and relations. In this paper, we propose Hierarchical Topology Multi-task Learning (HTML), a novel framework that leverages a hierarchical multi-task learning paradigm to enhance the performance of logical form generation. Our framework consists of a main task, generating logical forms from questions, and three auxiliary tasks: entity prediction from the input question, relation prediction for the given entities, and logical form generation based on the given entities and relations. Through joint instruction tuning, HTML allows mutual guidance and knowledge transfer among the hierarchical tasks, capturing the subtle dependencies between entities, relations, and logical forms. Extensive experiments on public benchmarks show that HTML markedly outperforms both supervised fine-tuning methods and training-free ones based on powerful large language models (e.g., GPT-4), demonstrating its superiority in question understanding and structural knowledge reasoning.
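A hypothetical sketch of how a single KBQA example might be expanded into the main task plus the three auxiliary tasks for joint instruction tuning. The prompt wording, field names, and toy logical form are assumptions for illustration only.

```python
# Illustrative data construction; not the HTML training pipeline.
def build_multitask_examples(question, entities, relations, logical_form):
    return [
        # auxiliary task 1: entity prediction from the question
        {"instruction": f"List the KB entities mentioned in the question: {question}",
         "output": ", ".join(entities)},
        # auxiliary task 2: relation prediction given the entities
        {"instruction": f"Given entities {entities}, predict the KB relations needed to answer: {question}",
         "output": ", ".join(relations)},
        # auxiliary task 3: logical form generation given entities and relations
        {"instruction": f"Given entities {entities} and relations {relations}, write the logical form for: {question}",
         "output": logical_form},
        # main task: question -> logical form directly
        {"instruction": f"Write the logical form that answers: {question}",
         "output": logical_form},
    ]

examples = build_multitask_examples(
    question="Who directed the film Inception?",
    entities=["Inception"],
    relations=["film.film.directed_by"],
    logical_form="(JOIN (R film.film.directed_by) <Inception>)",  # toy placeholder form
)
```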

2024

Mixture-of-LoRAs: An Efficient Multitask Tuning Method for Large Language Models
Wenfeng Feng | Chuzhan Hao | Yuewei Zhang | Yu Han | Hao Wang
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Instruction tuning has the potential to stimulate or enhance specific capabilities of large language models (LLMs). However, achieving the right balance of data is crucial to prevent catastrophic forgetting and interference between tasks. To address these limitations and enhance training flexibility, we propose the Mixture-of-LoRAs (MoA) architecture, a novel and parameter-efficient tuning method designed for multi-task learning with LLMs. In this paper, we start by individually training multiple domain-specific LoRA modules on the corresponding supervised corpus data. These LoRA modules can be aligned with the expert design principles observed in Mixture-of-Experts (MoE). Subsequently, we combine the multiple LoRAs using an explicit routing strategy and introduce domain labels to facilitate multi-task learning, which helps prevent interference between tasks and ultimately enhances the performance of each individual task. Furthermore, each LoRA model can be iteratively adapted to a new domain, allowing for quick domain-specific adaptation. Experiments on diverse tasks demonstrate superior and robust performance, which can further promote the wide application of domain-specific LLMs.
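A minimal sketch of explicit routing between domain-specific LoRA adapters on a single frozen linear layer, assuming a domain label selects the adapter, as the abstract's explicit routing strategy suggests. The adapter rank, scaling, and class names are illustrative assumptions.

```python
# Illustrative sketch; not the MoA implementation.
import torch
import torch.nn as nn

class LoRAAdapter(nn.Module):
    def __init__(self, in_dim, out_dim, rank=8, alpha=16):
        super().__init__()
        self.down = nn.Linear(in_dim, rank, bias=False)
        self.up = nn.Linear(rank, out_dim, bias=False)
        nn.init.zeros_(self.up.weight)      # start as a no-op delta
        self.scale = alpha / rank

    def forward(self, x):
        return self.up(self.down(x)) * self.scale

class MoALinear(nn.Module):
    """Frozen base projection plus one LoRA adapter per domain, chosen by a domain id."""
    def __init__(self, base_linear, num_domains, rank=8):
        super().__init__()
        self.base = base_linear
        for p in self.base.parameters():
            p.requires_grad_(False)
        self.adapters = nn.ModuleList(
            LoRAAdapter(base_linear.in_features, base_linear.out_features, rank)
            for _ in range(num_domains)
        )

    def forward(self, x, domain_id):
        return self.base(x) + self.adapters[domain_id](x)

layer = MoALinear(nn.Linear(768, 768), num_domains=3)
out = layer(torch.randn(2, 16, 768), domain_id=1)   # route to the adapter for domain 1
```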

2023

Prompt-Based Editing for Text Style Transfer
Guoqing Luo | Yu Han | Lili Mou | Mauajama Firdaus
Findings of the Association for Computational Linguistics: EMNLP 2023

Prompting approaches have recently been explored in text style transfer, where a textual prompt is used to query a pretrained language model (PLM) to generate style-transferred text word by word in an autoregressive manner. However, such a generation process is less controllable, and early prediction errors may affect future word predictions. In this paper, we propose a prompt-based editing approach to text style transfer. Specifically, we prompt a PLM for style classification and use the classification probability to compute a style score. Then, we perform discrete search with word-level editing to maximize a comprehensive scoring function for the style-transfer task. In this way, we transform a prompt-based generation problem into a classification one, which does not suffer from the error accumulation problem and is more controllable than autoregressive sentence generation. In our experiments, we perform both automatic and human evaluations on three style-transfer benchmark datasets, and show that our approach largely outperforms existing systems that have 20 times more parameters. Additional empirical analyses further demonstrate the effectiveness of our approach.
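A schematic, dependency-free sketch of the discrete word-level editing search described above: propose single-word edits and keep the candidate that maximizes a comprehensive score. The scoring function below is a toy placeholder; in the paper, the style score comes from prompting a PLM for classification and is combined with other terms.

```python
# Illustrative hill-climbing edit search; the real scoring function differs.
def candidate_edits(tokens, vocabulary):
    """Yield sentences obtained by deleting or replacing one word (a simplified edit set)."""
    for i in range(len(tokens)):
        yield tokens[:i] + tokens[i + 1:]                   # deletion
        for w in vocabulary:
            yield tokens[:i] + [w] + tokens[i + 1:]         # replacement

def edit_search(tokens, score_fn, vocabulary, max_steps=10):
    best, best_score = tokens, score_fn(tokens)
    for _ in range(max_steps):
        improved = False
        for cand in candidate_edits(best, vocabulary):
            s = score_fn(cand)
            if s > best_score:
                best, best_score, improved = cand, s, True
        if not improved:                                    # local optimum reached
            break
    return best

# toy usage with a placeholder score: prefer sentences containing "great" over "terrible"
toy_score = lambda toks: toks.count("great") - toks.count("terrible") - 0.01 * len(toks)
print(edit_search("the food was terrible".split(), toy_score, ["great", "good"]))
```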