Zheng Lian


2024

NLoPT: N-gram Enhanced Low-Rank Task Adaptive Pre-training for Efficient Language Model Adaption
Hao Gu | Jiangyan Yi | Zheng Lian | Jianhua Tao | Xinrui Yan
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Pre-trained Language Models (PLMs) like BERT have achieved superior performance on a range of downstream tasks, even when trained on a general domain. Moreover, recent studies have shown that continued pre-training on task-specific data, known as task adaptive pre-training (TAPT), can further improve downstream task performance. However, conventional TAPT adjusts all the parameters of the PLM, which distorts the generic knowledge embedded in the original PLM weights, and it is expensive to store a whole model copy for each downstream task. In this paper, we propose NLoPT, a two-step n-gram enhanced low-rank task adaptive pre-training method, to effectively and efficiently customize a PLM to the downstream task. Specifically, we first apply low-rank adaptation (LoRA), a prevalent parameter-efficient technique, for efficient TAPT. We then explicitly incorporate task-specific multi-granularity n-gram information via a cross-attention mechanism. Experimental results on six datasets from four domains illustrate the effectiveness of NLoPT, demonstrating the superiority of LoRA-based TAPT and the necessity of incorporating task-specific n-gram information.
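The abstract describes two separable pieces: LoRA adapters for parameter-efficient continued pre-training, and cross-attention from token representations to n-gram embeddings. Below is a minimal PyTorch sketch of both; the module names, dimensions, and the way n-gram ids are obtained are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch, assuming PyTorch; all names and shapes are hypothetical.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update (LoRA)."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)      # keep PLM weights intact
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)          # low-rank update starts at zero
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))

class NGramCrossAttention(nn.Module):
    """Token states (queries) attend over multi-granularity n-gram embeddings."""
    def __init__(self, hidden: int = 768, heads: int = 8, ngram_vocab: int = 50000):
        super().__init__()
        self.ngram_emb = nn.Embedding(ngram_vocab, hidden)
        self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)

    def forward(self, token_states: torch.Tensor, ngram_ids: torch.Tensor) -> torch.Tensor:
        # token_states: (B, T, H); ngram_ids: (B, N) ids of n-grams found in the text
        ngram_states = self.ngram_emb(ngram_ids)
        fused, _ = self.attn(token_states, ngram_states, ngram_states)
        return token_states + fused                  # residual fusion
```

Zero-initializing the second LoRA projection means the adapted model starts exactly at the original PLM, which is one common way such adapters avoid disturbing pre-trained behavior at the beginning of training.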

2022

Supporting Medical Relation Extraction via Causality-Pruned Semantic Dependency Forest
Yifan Jin | Jiangmeng Li | Zheng Lian | Chengbo Jiao | Xiaohui Hu
Proceedings of the 29th International Conference on Computational Linguistics

The Medical Relation Extraction (MRE) task aims to extract relations between entities in medical texts. Traditional relation extraction methods achieve impressive success by exploiting syntactic information, e.g., dependency trees. However, the quality of the 1-best dependency tree produced for medical texts by an out-of-domain parser is relatively limited, so the performance of medical relation extraction methods may degrade. To this end, we propose a method to jointly model semantic and syntactic information from medical texts based on causal explanation theory. We generate dependency forests consisting of semantic-embedded 1-best dependency trees. A task-specific causal explainer is then adopted to prune the dependency forests, which are further fed into a designed graph convolutional network to learn representations for the downstream task. Empirical comparisons on benchmark medical datasets demonstrate the effectiveness of our model.
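As a rough illustration of the pipeline the abstract outlines (forest, prune, GCN), the sketch below treats a dependency forest as a weighted adjacency matrix, substitutes a toy top-k edge pruner for the paper's task-specific causal explainer, and runs one GCN layer over the pruned forest. All names, shapes, and the pruning rule are assumptions.

```python
# Illustrative sketch only; the causal explainer is replaced by a toy pruner.
import torch
import torch.nn as nn

class ForestGCNLayer(nn.Module):
    """One graph-convolution step over a (pruned) dependency forest."""
    def __init__(self, in_dim: int = 768, out_dim: int = 768):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h: (B, T, H) token states; adj: (B, T, T) forest edge weights
        deg = adj.sum(-1, keepdim=True).clamp(min=1e-6)   # degree normalization
        return torch.relu(self.linear((adj / deg) @ h))

def prune_forest(adj: torch.Tensor, keep_ratio: float = 0.3) -> torch.Tensor:
    """Keep only each node's highest-weight edges (stand-in for the explainer)."""
    k = max(1, int(adj.shape[-1] * keep_ratio))
    cutoff = adj.topk(k, dim=-1).values[..., -1:]          # per-row threshold
    return adj * (adj >= cutoff)
```

The design choice the abstract implies, pruning before message passing, keeps the GCN from propagating information along low-confidence parse edges, which is exactly the failure mode of out-of-domain parsers on medical text.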

AMOA: Global Acoustic Feature Enhanced Modal-Order-Aware Network for Multimodal Sentiment Analysis
Ziming Li | Yan Zhou | Weibo Zhang | Yaxin Liu | Chuanpeng Yang | Zheng Lian | Songlin Hu
Proceedings of the 29th International Conference on Computational Linguistics

In recent years, multimodal sentiment analysis (MSA), which aims to predict the sentiment polarity expressed in a video, has attracted increasing interest. Existing methods typically 1) treat the three modal features (textual, acoustic, visual) equally, without distinguishing the importance of different modalities; and 2) split the video into frames, thereby missing global acoustic information. In this paper, we propose a global Acoustic feature enhanced Modal-Order-Aware network (AMOA) to address these problems. First, a modal-order-aware network is designed to obtain the multimodal fusion feature. This network integrates the three modalities in a fixed order, which makes the modality at the core position matter more. Then, we introduce the global acoustic feature of the whole video into our model. Since the global acoustic feature and the multimodal fusion feature originally reside in their own spaces, contrastive learning is employed to align them before concatenation. Experiments on two public datasets show that our model outperforms state-of-the-art models. In addition, we generalize our model to sentiment with more complex semantics, such as sarcasm detection, and achieve state-of-the-art performance on a widely used sarcasm dataset.
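A hedged sketch of the two mechanisms the abstract names: sequential fusion in a fixed modality order (text as the core modality here, chosen arbitrarily) and an InfoNCE-style contrastive loss aligning the global acoustic feature with the fused feature before concatenation. Everything below is illustrative, not the AMOA implementation.

```python
# Minimal sketch, assuming PyTorch; ordering, dims, and loss form are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class OrderedFusion(nn.Module):
    """Fuse modalities sequentially; the first (core) modality matters most."""
    def __init__(self, dim: int = 128, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, core: torch.Tensor, second: torch.Tensor,
                third: torch.Tensor) -> torch.Tensor:
        h, _ = self.attn(core, second, second)   # core attends to 2nd modality
        h, _ = self.attn(h, third, third)        # then to the 3rd
        return h.mean(dim=1)                     # (B, dim) fusion feature

def contrastive_align(fused: torch.Tensor, global_acoustic: torch.Tensor,
                      temperature: float = 0.07) -> torch.Tensor:
    """InfoNCE over in-batch pairs: matching (fused, acoustic) rows should agree."""
    f = F.normalize(fused, dim=-1)
    a = F.normalize(global_acoustic, dim=-1)
    logits = f @ a.t() / temperature             # (B, B) similarity matrix
    labels = torch.arange(f.size(0), device=f.device)
    return F.cross_entropy(logits, labels)
```

The point of the alignment step is that concatenating features from two unaligned spaces gives the downstream classifier a poorly conditioned input; pulling matched pairs together first makes the concatenation coherent.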