2024
pdf
abs
Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer Merging
Deyuan Liu
|
Zhanyue Qin
|
Hairu Wang
|
Zhao Yang
|
Zecheng Wang
|
Fangying Rong
|
Qingbin Liu
|
Yanchao Hao
|
Bo Li
|
Xi Chen
|
Cunhang Fan
|
Zhao Lv
|
Dianhui Chu
|
Zhiying Tu
|
Dianbo Sui
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
While large language models (LLMs) excel in many domains, their complexity and scale challenge deployment in resource-limited environments. Current compression techniques, such as parameter pruning, often fail to effectively utilize the knowledge from pruned parameters. To address these challenges, we propose Manifold-Based Knowledge Alignment and Layer Merging Compression (MKA), a novel approach that uses manifold learning and the Information Bottleneck (IB) measure to merge similar layers, reducing model size while preserving essential performance. We evaluate MKA on multiple benchmark datasets and various LLMs. Our findings show that MKA not only preserves model performance but also achieves substantial compression ratios, outperforming traditional pruning methods. Moreover, when coupled with quantization, MKA delivers even greater compression. Specifically, on the MMLU dataset using the Llama3-8B model, MKA achieves a compression ratio of 43.75% with a minimal performance decrease of only 2.82%. The proposed MKA method offers a resource-efficient and performance-preserving model compression technique for LLMs. We make our code available at https://github.com/SempraETY/Pruning-via-Merging
pdf
abs
Analyzing Chain-of-thought Prompting in Black-Box Large Language Models via Estimated V-information
Zecheng Wang
|
Chunshan Li
|
Zhao Yang
|
Qingbin Liu
|
Yanchao Hao
|
Xi Chen
|
Dianhui Chu
|
Dianbo Sui
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Chain-of-Thought (CoT) prompting combined with large language models (LLM) has shown great potential in improving performance on challenging reasoning tasks. While understanding why CoT prompting is effective is crucial for the application and improvement of CoT prompting, few studies have addressed this issue. Besides, almost no prior work has conducted theoretical analysis on CoT prompting in the context of black-box models. In this paper, we approach the analysis of CoT prompting in black-box LLMs from an information-theoretic perspective. Specifically, we propose a new metric, EPVI (Estimated Pointwise V-Information), which extends the concept of pointwise V-information to black-box models, quantifying the label-relevant new information introduced by CoT prompting beyond the pre-existing information in the input. Based on this, we conduct a series of experiments at both the task and instance levels to analyze CoT prompting, demonstrating that the effectiveness of CoT prompting can be attributed to its capacity to influence the difficulty of model inference by augmenting or reducing the model-usable information. Furthermore, we show that selecting high-quality demonstrations of CoT reasoning based on EPVI can improve the downstream performance of reasoning tasks.
2023
pdf
abs
Class Lifelong Learning for Intent Detection via Structure Consolidation Networks
Qingbin Liu
|
Yanchao Hao
|
Xiaolong Liu
|
Bo Li
|
Dianbo Sui
|
Shizhu He
|
Kang Liu
|
Jun Zhao
|
Xi Chen
|
Ningyu Zhang
|
Jiaoyan Chen
Findings of the Association for Computational Linguistics: ACL 2023
Intent detection, which estimates diverse intents behind user utterances, is an essential component of task-oriented dialogue systems. Previous intent detection models are usually trained offline, which can only handle predefined intent classes. In the real world, new intents may keep challenging deployed models. For example, with the prevalence of the COVID-19 pandemic, users may pose various issues related to the pandemic to conversational systems, which brings many new intents. A general intent detection model should be intelligent enough to continually learn new data and recognize new arriving intent classes. Therefore, this work explores Class Lifelong Learning for Intent Detection (CLL-ID), where the model continually learns new intent classes from new data while avoiding catastrophic performance degradation on old data. To this end, we propose a novel lifelong learning method, called Structure Consolidation Networks (SCN), which consists of structure-based retrospection and contrastive knowledge distillation to handle the problems of expression diversity and class imbalance in the CLL-ID task. In addition to formulating the new task, we construct 3 benchmarks based on 8 intent detection datasets. Experimental results demonstrate the effectiveness of SCN, which significantly outperforms previous lifelong learning methods on the three benchmarks.
pdf
abs
Novel Relation Detection: Discovering Unknown Relation Types via Multi-Strategy Self-Supervised Learning
Qingbin Liu
|
Yin Kung
|
Yanchao Hao
|
Dianbo Sui
|
Siyuan Cheng
|
Xi Chen
|
Ningyu Zhang
|
Jiaoyan Chen
Findings of the Association for Computational Linguistics: EMNLP 2023
Conventional approaches to relation extraction can only recognize predefined relation types. In the real world, new or out-of-scope relation types may keep challenging the deployed models. In this paper, we formalize such a challenging problem as Novel Relation Detection (NRD), which aims to discover potential new relation types based on training samples of known relations. To this end, we construct two NRD datasets and exhaustively investigate a variety of out-of-scope detection methods. We further propose an effective NRD method that utilizes multi-strategy self-supervised learning to handle the problem of shallow semantic similarity in the NRD task. Experimental results demonstrate the effectiveness of our method, which significantly outperforms previous state-of-the-art methods on both datasets.
2018
pdf
abs
Pattern-revising Enhanced Simple Question Answering over Knowledge Bases
Yanchao Hao
|
Hao Liu
|
Shizhu He
|
Kang Liu
|
Jun Zhao
Proceedings of the 27th International Conference on Computational Linguistics
Question Answering over Knowledge Bases (KB-QA), which automatically answer natural language questions based on the facts contained by a knowledge base, is one of the most important natural language processing (NLP) tasks. Simple questions constitute a large part of questions queried on the web, still being a challenge to QA systems. In this work, we propose to conduct pattern extraction and entity linking first, and put forward pattern revising procedure to mitigate the error propagation problem. In order to learn to rank candidate subject-predicate pairs to enable the relevant facts retrieval given a question, we propose to do joint fact selection enhanced by relation detection. Multi-level encodings and multi-dimension information are leveraged to strengthen the whole procedure. The experimental results demonstrate that our approach sets a new record in this task, outperforming the current state-of-the-art by an absolute large margin.
2017
pdf
abs
An End-to-End Model for Question Answering over Knowledge Base with Cross-Attention Combining Global Knowledge
Yanchao Hao
|
Yuanzhe Zhang
|
Kang Liu
|
Shizhu He
|
Zhanyi Liu
|
Hua Wu
|
Jun Zhao
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
With the rapid growth of knowledge bases (KBs) on the web, how to take full advantage of them becomes increasingly important. Question answering over knowledge base (KB-QA) is one of the promising approaches to access the substantial knowledge. Meanwhile, as the neural network-based (NN-based) methods develop, NN-based KB-QA has already achieved impressive results. However, previous work did not put more emphasis on question representation, and the question is converted into a fixed vector regardless of its candidate answers. This simple representation strategy is not easy to express the proper information in the question. Hence, we present an end-to-end neural network model to represent the questions and their corresponding scores dynamically according to the various candidate answer aspects via cross-attention mechanism. In addition, we leverage the global knowledge inside the underlying KB, aiming at integrating the rich KB information into the representation of the answers. As a result, it could alleviates the out-of-vocabulary (OOV) problem, which helps the cross-attention model to represent the question more precisely. The experimental results on WebQuestions demonstrate the effectiveness of the proposed approach.