Yushun Dong

2025

Advances in large language models (LLMs) significantly enhance reasoning capabilities, but their deployment is restricted in resource-constrained scenarios. Knowledge distillation addresses this by transferring knowledge from powerful teacher models to compact and transparent students. However, effectively capturing the teacher’s comprehensive reasoning is challenging due to conventional token-level supervision’s limited scope. Using multiple reasoning paths per query alleviates this problem, but treating each path identically is suboptimal as paths vary widely in quality and suitability across tasks and models. We propose Quality-filtered Routing with Cooperative Distillation (QR-Distill), combining path quality filtering, conditional routing, and cooperative peer teaching. First, quality filtering retains only correct reasoning paths scored by an LLM-based evaluation. Second, conditional routing dynamically assigns paths tailored to each student’s current learning state. Finally, cooperative peer teaching enables students to mutually distill diverse insights, addressing knowledge gaps and biases toward specific reasoning styles. Experiments demonstrate QR-Distill’s superiority over traditional single- and multi-path distillation methods. Ablation studies further highlight the importance of each component (quality filtering, conditional routing, and peer teaching) in effective knowledge transfer. Our code is available at https://github.com/LzyFischer/Distill.
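The first two stages named in this abstract, quality filtering and conditional routing, can be illustrated with a minimal Python sketch. It is not taken from the QR-Distill code: the `ReasoningPath` class, the `min_quality` threshold, the per-student routing scores, and the `top_k` selection are all assumptions made for illustration, and cooperative peer teaching is left out.

```python
from dataclasses import dataclass


@dataclass
class ReasoningPath:
    text: str
    answer: str
    quality: float  # hypothetical score in [0, 1] from an LLM-based evaluator


def filter_paths(paths, gold_answer, min_quality=0.5):
    """Keep paths that reach the gold answer and meet an assumed quality threshold."""
    return [p for p in paths if p.answer == gold_answer and p.quality >= min_quality]


def route_paths(paths, student_scores, top_k=1):
    """Give each student its top_k paths, ranked by a per-student routing score.

    student_scores maps a student name to a list of scores aligned with paths.
    In a real system these scores would come from a learned router; here they
    are plain numbers supplied by the caller.
    """
    routed = {}
    for student, scores in student_scores.items():
        ranked = sorted(range(len(paths)), key=lambda i: scores[i], reverse=True)
        routed[student] = [paths[i] for i in ranked[:top_k]]
    return routed


paths = [
    ReasoningPath("3 + 4 = 7, then 7 * 2 = 14", "14", 0.9),
    ReasoningPath("3 * 2 = 6, 4 * 2 = 8, 6 + 8 = 14", "14", 0.8),
    ReasoningPath("3 + 4 * 2 = 11", "11", 0.7),  # wrong answer, filtered out
    ReasoningPath("guess: 14", "14", 0.2),       # correct but low quality, filtered out
]
kept = filter_paths(paths, gold_answer="14")
routed = route_paths(kept, {"student_a": [0.3, 0.6], "student_b": [0.7, 0.1]})
```

In this toy run the two students receive different surviving paths, which is the behavior the abstract attributes to routing by each student's learning state.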

Large Language Models (LLMs) have demonstrated remarkable capabilities across various domains, including their emerging role in mitigating threats to human life, infrastructure, and the environment during natural disasters. Despite increasing research on disaster-focused LLMs, there remains a lack of systematic reviews and in-depth analyses of their applications in natural disaster management. To address this gap, this paper presents a comprehensive survey of LLMs in disaster response, introducing a taxonomy that categorizes existing works based on disaster phases and application scenarios. By compiling public datasets and identifying key challenges and opportunities, this study aims to provide valuable insights for the research community and practitioners in developing advanced LLM-driven solutions to enhance resilience against natural disasters.

2024

Knowledge Graph-Enhanced Large Language Models via Path Selection
Haochen Liu | Song Wang | Yaochen Zhu | Yushun Dong | Jundong Li
Findings of the Association for Computational Linguistics: ACL 2024

Large Language Models (LLMs) have shown unprecedented performance in various real-world applications. However, they are known to generate factually inaccurate outputs, a.k.a. the hallucination problem. In recent years, incorporating external knowledge extracted from Knowledge Graphs (KGs) has become a promising strategy to improve the factual accuracy of LLM-generated outputs. Nevertheless, most existing explorations rely on LLMs themselves to perform KG knowledge extraction, which is highly inflexible, as LLMs can only provide a binary judgment on whether a certain piece of knowledge (e.g., a knowledge path in a KG) should be used. In addition, LLMs tend to pick only knowledge with a direct semantic relationship to the input text, while potentially useful knowledge with indirect semantics can be ignored. In this work, we propose a principled framework, KELP, with three stages to handle the above problems. Specifically, KELP achieves finer-grained, more flexible knowledge extraction by generating scores for knowledge paths with respect to input texts via latent semantic matching. Meanwhile, knowledge paths with indirect semantic relationships to the input text can also be considered via trained encoding between the selected paths in the KG and the input text. Experiments on real-world datasets validate the effectiveness of KELP.
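The idea of scoring knowledge paths against the input text, rather than asking for a yes/no judgment, can be sketched as follows. This is an illustrative assumption, not KELP's implementation: it uses cosine similarity over hand-written toy vectors in place of a trained encoder, and the `select_paths` function, the `top_k` parameter, and the example paths are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_paths(query_vec, path_vecs, top_k=2):
    """Score every knowledge path against the query and return the top_k (path, score) pairs."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in path_vecs.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy embeddings standing in for encoder outputs (assumed values).
query = [1.0, 0.0, 1.0]
path_vecs = {
    "Paris -capital_of-> France": [1.0, 0.0, 0.9],
    "France -member_of-> EU": [0.6, 0.2, 0.7],
    "Tokyo -capital_of-> Japan": [0.0, 1.0, 0.0],
}
top = select_paths(query, path_vecs)
```

Because every path receives a continuous score, a path that is only indirectly related to the query can still be ranked and kept, instead of being dropped by a binary decision.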

Explaining Graph Neural Networks with Large Language Models: A Counterfactual Perspective on Molecule Graphs
Yinhan He | Zaiyi Zheng | Patrick Soga | Yaochen Zhu | Yushun Dong | Jundong Li
Findings of the Association for Computational Linguistics: EMNLP 2024

In recent years, Graph Neural Networks (GNNs) have become successful in molecular property prediction tasks such as toxicity analysis. However, due to the black-box nature of GNNs, their outputs can be concerning in high-stakes decision-making scenarios, e.g., drug discovery. Facing such an issue, Graph Counterfactual Explanation (GCE) has emerged as a promising approach to improve GNN transparency. However, current GCE methods usually fail to take domain-specific knowledge into consideration, which can result in outputs that are not easily comprehensible by humans. To address this challenge, we propose a novel GCE method, LLM-GCE, to unleash the power of large language models (LLMs) in explaining GNNs for molecular property prediction. Specifically, we utilize an autoencoder to generate the counterfactual graph topology from a set of counterfactual text pairs (CTPs) based on an input graph. Meanwhile, we also incorporate a CTP dynamic feedback module to mitigate LLM hallucination, which provides intermediate feedback derived from the generated counterfactuals as an attempt to give more faithful guidance. Extensive experiments demonstrate the superior performance of LLM-GCE. Our code is released on https://github.com/YinhanHe123/new_LLM4GNNExplanation.
