Dongliang Xu


2024

pdf
Improving Factual Consistency in Abstractive Summarization with Sentence Structure Pruning
Dingxin Hu | Xuanyu Zhang | Xingyue Zhang | Yiyang Li | Dongsheng Chen | Marina Litvak | Natalia Vanetik | Qing Yang | Dongliang Xu | Yanquan Zhou | Lei Li | Yuze Li | Yingqi Zhu
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

State-of-the-art abstractive summarization models still suffer from the content contradiction between the summaries and the input text, which is referred to as the factual inconsistency problem. Recently, a large number of works have also been proposed to evaluate factual consistency or improve it by post-editing methods. However, these post-editing methods typically focus on replacing suspicious entities, failing to identify and modify incorrect content hidden in sentence structures. In this paper, we first verify that the correctable errors can be enriched by leveraging sentence structure pruning operation, and then we propose a post-editing method based on that. In the correction process, the pruning operation on possible errors is performed on the syntactic dependency tree with the guidance of multiple factual evaluation metrics. Experimenting on the FRANK dataset shows a great improvement in factual consistency compared with strong baselines and, when combined with them, can achieve even better performance. All the codes and data will be released on paper acceptance.

pdf
SmartTrim: Adaptive Tokens and Attention Pruning for Efficient Vision-Language Models
Zekun Wang | Jingchang Chen | Wangchunshu Zhou | Haichao Zhu | Jiafeng Liang | Liping Shan | Ming Liu | Dongliang Xu | Qing Yang | Bing Qin
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Despite achieving remarkable performance on various vision-language tasks, Transformer-based Vision-Language Models (VLMs) suffer from redundancy in inputs and parameters, significantly hampering their efficiency in real-world applications. Moreover, the degree of redundancy in token representations and model parameters, such as attention heads, varies significantly for different inputs. In light of the challenges, we propose SmartTrim, an adaptive acceleration framework for VLMs, which adjusts the computational overhead per instance. Specifically, we integrate lightweight modules into the original backbone to identify and prune redundant token representations and attention heads within each layer. Furthermore, we devise a self-distillation strategy to enhance the consistency between the predictions of the pruned model and its fully-capacity counterpart. Experimental results across various vision-language tasks consistently demonstrate that SmartTrim accelerates the original model by 2-3 times with minimal performance degradation, highlighting the effectiveness and efficiency compared to previous approaches. Code will be available at https://github.com/kugwzk/SmartTrim.

pdf
SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language Models
Weixiang Zhao | Shilong Wang | Yulin Hu | Yanyan Zhao | Bing Qin | Xuanyu Zhang | Qing Yang | Dongliang Xu | Wanxiang Che
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

The continual learning (CL) ability is vital for deploying large language models (LLMs) in the dynamic world. Existing methods devise the learning module to acquire task-specific knowledge with parameter-efficient tuning (PET) block and the selection module to pick out the corresponding one for the testing input, aiming at handling the challenges of catastrophic forgetting and knowledge transfer in CL. However, these methods tend to address only one of the challenges, ignoring the potential of aligning the two modules to effectively address catastrophic forgetting and knowledge transfer simultaneously. To this end, we propose a novel Shared Attention Framework (SAPT), to align the PET learning and selection via the Shared Attentive Learning & Selection module. Extensive Experiments on two CL benchmarks demonstrate the superiority of SAPT. Moreover, SAPT consistently demonstrates its superiority when we scale it to different model sizes (from 770M to 13B), different model architectures (T5 and LLaMA-2) and unseen tasks.

2022

pdf
TranS: Transition-based Knowledge Graph Embedding with Synthetic Relation Representation
Xuanyu Zhang | Qing Yang | Dongliang Xu
Findings of the Association for Computational Linguistics: EMNLP 2022

Knowledge graph embedding (KGE) aims to learn continuous vector representations of relations and entities in knowledge graph (KG). Recently, transition-based KGE methods have become popular and achieved promising performance. However, scoring patterns like TransE are not suitable for complex scenarios where the same entity pair has different relations. Although some models attempt to employ entity-relation interaction or projection to improve entity representation for one-to-many/many-to-one/many-to-many complex relations, they still continue the traditional scoring pattern, where only a single relation vector in the relation part is used to translate the head entity to the tail entity or their variants. And recent research shows that entity representation only needs to consider entities and their interactions to achieve better performance. Thus, in this paper, we propose a novel transition-based method, TranS, for KGE. The single relation vector of the relation part in the traditional scoring pattern is replaced by the synthetic relation representation with entity-relation interactions to solve these issues. And the entity part still retains its independence through entity-entity interactions. Experiments on a large KG dataset, ogbl-wikikg2, show that our model achieves state-of-the-art results.

pdf
Instance-Guided Prompt Learning for Few-Shot Text Matching
Jia Du | Xuanyu Zhang | Siyi Wang | Kai Wang | Yanquan Zhou | Lei Li | Qing Yang | Dongliang Xu
Findings of the Association for Computational Linguistics: EMNLP 2022

Few-shot text matching is a more practical technique in natural language processing (NLP) to determine whether two texts are semantically identical. They primarily design patterns to reformulate text matching into a pre-trained task with uniform prompts across all instances. But they fail to take into account the connection between prompts and instances. This paper argues that dynamically strengthening the correlation between particular instances and the prompts is necessary because fixed prompts cannot adequately fit all diverse instances in inference. We suggest IGATE: Instance-Guided prompt leArning for few-shoT tExt matching, a novel pluggable prompt learning method. The gate mechanism used by IGATE, which is between the embedding and the PLM encoders, makes use of the semantics of instances to regulate the effects of the gate on the prompt tokens. The experimental findings show that IGATE achieves SOTA performance on MRPC and QQP, outperforming strong baselines. GitHub will host the release of codes.