Xu Guo

2024

PairCFR: Enhancing Model Training on Paired Counterfactually Augmented Data through Contrastive Learning
Xiaoqi Qiu | Yongjie Wang | Xu Guo | Zhiwei Zeng | Yu Yue | Yuhong Feng | Chunyan Miao
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Counterfactually Augmented Data (CAD) involves creating new data samples by applying minimal yet sufficient modifications to flip the label of existing data samples to other classes. Training with CAD enhances model robustness against spurious features that happen to correlate with labels by spreading the causal relationships across different classes. Yet, recent research reveals that training with CAD may lead models to overly focus on modified features while ignoring other important contextual information, inadvertently introducing biases that may impair performance on out-of-distribution (OOD) datasets. To mitigate this issue, we employ contrastive learning to promote global feature alignment in addition to learning counterfactual clues. We theoretically prove that contrastive loss can encourage models to leverage a broader range of features beyond the modified ones. Comprehensive experiments on two human-edited CAD datasets demonstrate that our proposed method outperforms the state-of-the-art on OOD datasets.
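A minimal PyTorch sketch of the idea, assuming a supervised-contrastive formulation over pooled sentence embeddings: each original and its label-flipped counterfactual share a batch, so the counterfactual acts as an in-batch negative while other same-label examples serve as positives, encouraging alignment on features beyond the edited spans. The helper names (`encoder`, `classifier`), the temperature, and the weight `lam` are illustrative assumptions, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive(z, labels, temperature=0.1):
    """SupCon-style loss: pull same-label examples together; paired,
    label-flipped counterfactuals act as in-batch negatives."""
    z = F.normalize(z, dim=-1)
    sim = z @ z.t() / temperature                                 # (N, N) similarities
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    pos_mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~self_mask
    logits = sim.masked_fill(self_mask, float("-inf"))            # drop self-pairs
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)
    # Average log-probability over each anchor's positives, then over anchors.
    return -(log_prob.masked_fill(~pos_mask, 0.0)).sum(dim=1).div(pos_counts).mean()

def paired_cad_loss(encoder, classifier, x_orig, x_cf, y_orig, y_cf, lam=0.5):
    """Cross-entropy on originals and counterfactuals plus the contrastive term."""
    z = torch.cat([encoder(x_orig), encoder(x_cf)], dim=0)        # (2B, d) pooled embeddings
    y = torch.cat([y_orig, y_cf], dim=0)
    return F.cross_entropy(classifier(z), y) + lam * supervised_contrastive(z, y)
```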

RevMUX: Data Multiplexing with Reversible Adapters for Efficient LLM Batch Inference
Yige Xu | Xu Guo | Zhiwei Zeng | Chunyan Miao
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Large language models (LLMs) have brought great breakthroughs to the natural language processing (NLP) community, but serving concurrent customer queries under high throughput demands remains a challenge. Data multiplexing addresses this by merging multiple inputs into a single composite input, allowing more efficient inference through a shared forward pass. However, because distinguishing individual inputs within a composite input is challenging, conventional methods typically require training the entire backbone, yet still suffer from performance degradation. In this paper, we introduce RevMUX, a parameter-efficient data multiplexing framework that incorporates a reversible design in the multiplexer, which can be reused by the demultiplexer to perform reverse operations and restore individual samples for classification. Extensive experiments on four datasets and three types of LLM backbones demonstrate the effectiveness of RevMUX in enhancing LLM inference efficiency while retaining satisfactory classification performance.
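The key property is that the multiplexer's mixing is exactly invertible, so the demultiplexer can recover per-sample features by running the same operations in reverse. The PyTorch sketch below shows a generic additive-coupling block with that property; it illustrates the reversibility idea under assumed dimensions and module shapes, not RevMUX's actual multiplexer/demultiplexer architecture.

```python
import torch
import torch.nn as nn

class ReversibleCoupling(nn.Module):
    """Additive coupling over two sample embeddings: invertible by construction,
    so the reverse pass restores each sample exactly (up to float error)."""

    def __init__(self, dim, hidden=256):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))
        self.g = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))

    def forward(self, x1, x2):
        # Multiplex: mix the two streams into a coupled representation.
        y1 = x1 + self.f(x2)
        y2 = x2 + self.g(y1)
        return y1, y2

    def inverse(self, y1, y2):
        # Demultiplex: undo the mixing with the same modules, in reverse order.
        x2 = y2 - self.g(y1)
        x1 = y1 - self.f(x2)
        return x1, x2

# Round-trip check: the reverse operations recover the original embeddings.
block = ReversibleCoupling(dim=768)
x1, x2 = torch.randn(4, 768), torch.randn(4, 768)
r1, r2 = block.inverse(*block(x1, x2))
assert torch.allclose(r1, x1, atol=1e-5) and torch.allclose(r2, x2, atol=1e-5)
```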

A Survey on Natural Language Counterfactual Generation
Yongjie Wang | Xiaoqi Qiu | Yu Yue | Xu Guo | Zhiwei Zeng | Yuhong Feng | Zhiqi Shen
Findings of the Association for Computational Linguistics: EMNLP 2024

Natural language counterfactual generation aims to minimally modify a given text such that the modified text will be classified into a different class. The generated counterfactuals provide insight into the reasoning behind a model’s predictions by highlighting which words significantly influence the outcomes. Additionally, they can be used to detect model fairness issues and augment the training data to enhance the model’s robustness. A substantial amount of research has been conducted to generate counterfactuals for various NLP tasks, employing different models and methodologies. With the rapid growth of studies in this field, a systematic review is crucial to guide future researchers and developers. To bridge this gap, this survey provides a comprehensive overview of textual counterfactual generation methods, particularly those based on Large Language Models. We propose a new taxonomy that systematically categorizes the generation methods into four groups and summarizes the metrics for evaluating the generation quality. Finally, we discuss ongoing research challenges and outline promising directions for future work.

2023

InteMATs: Integrating Granularity-Specific Multilingual Adapters for Cross-Lingual Transfer
Meizhen Liu | Xu Guo | He Jiakai | Jianye Chen | Fengyu Zhou | Siu Hui
Findings of the Association for Computational Linguistics: EMNLP 2023

Multilingual language models (MLLMs) have achieved remarkable success in various cross-lingual transfer tasks. However, they perform poorly on zero-shot low-resource languages, particularly when dealing with longer contexts. Existing research mainly relies on full-model fine-tuning on large parallel datasets to enhance the cross-lingual alignment of MLLMs, which is computationally expensive. In this paper, we propose InteMATs, a novel approach that integrates multilingual adapters trained on texts of different levels of granularity. To achieve this, we curate a multilingual parallel dataset comprising 42 languages to pre-train sentence-level and document-level adapters under the contrastive learning framework. Extensive experiments demonstrate the effectiveness of InteMATs in improving the cross-lingual transfer performance of MLLMs, especially on low-resource languages. Finally, our comprehensive analyses and ablation studies provide a deep understanding of the high-quality representations derived by InteMATs.
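One way to picture the integration step is a learned gate that mixes the outputs of a sentence-level and a document-level bottleneck adapter inside each transformer layer. The PyTorch sketch below is a hypothetical fusion scheme for illustration only; the adapter shape, gate, and bottleneck size are assumptions rather than the paper's design.

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Standard residual bottleneck adapter inserted after a transformer sub-layer."""
    def __init__(self, dim, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.act = nn.GELU()

    def forward(self, h):
        return h + self.up(self.act(self.down(h)))

class GranularityFusion(nn.Module):
    """Mix a sentence-level and a document-level adapter with a per-token gate."""
    def __init__(self, dim, sent_adapter, doc_adapter):
        super().__init__()
        self.sent, self.doc = sent_adapter, doc_adapter
        self.gate = nn.Linear(dim, 1)

    def forward(self, h):                      # h: (batch, seq_len, dim)
        g = torch.sigmoid(self.gate(h))        # per-token mixing weight in [0, 1]
        return g * self.sent(h) + (1 - g) * self.doc(h)
```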

2022

基于多源知识融合的领域情感词典表示学习研究(Domain Sentiment Lexicon Representation Learning Based on Multi-source Knowledge Fusion)
Ruihua Qi (祁瑞华) | Jia Wei (魏佳) | Zhen Shao (邵震) | Xu Guo (郭旭) | Heng Chen (陈恒)
Proceedings of the 21st Chinese National Conference on Computational Linguistics

This paper addresses two problems in domain sentiment lexicon construction: the relative scarcity of annotated data and insufficient sentiment semantic representations. It computes joint weights from domain differences across multiple data sources, fuses prior sentiment knowledge with FastText word embedding learning, and maps sentiment semantic knowledge into a new word vector space, automatically building from unlabeled data a domain sentiment lexicon suited to multi-domain, multilingual big-data settings. Comparative experiments on public Chinese and English multi-domain datasets show that, compared with sentiment lexicon methods and pretrained word embedding methods, the proposed multi-source knowledge fusion representation learning approach achieves clear improvements in classification accuracy and exhibits good robustness across multiple algorithms, languages, domains, and datasets. Ablation experiments further verify the contribution of each module of the proposed model to improving sentiment classification.
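At a high level, the method scores unlabeled words by how their FastText vectors relate to prior sentiment knowledge. The sketch below is a much-simplified, hypothetical version of that idea: it rates candidate words by cosine similarity to positive versus negative seed words. The scoring rule, threshold, and data layout are illustrative assumptions, and the paper's joint domain weighting is omitted.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))

def build_sentiment_lexicon(vectors, pos_seeds, neg_seeds, candidates, threshold=0.05):
    """`vectors` maps word -> FastText embedding (NumPy array); the seed lists
    encode the prior sentiment knowledge. Returns {word: polarity score}."""
    def seed_similarity(word, seeds):
        sims = [cosine(vectors[word], vectors[s]) for s in seeds if s in vectors]
        return float(np.mean(sims)) if sims else 0.0

    lexicon = {}
    for w in candidates:
        if w not in vectors:
            continue
        polarity = seed_similarity(w, pos_seeds) - seed_similarity(w, neg_seeds)
        if abs(polarity) >= threshold:          # keep only clearly polarized words
            lexicon[w] = polarity               # sign = sentiment, magnitude = strength
    return lexicon
```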

Improving the Sample Efficiency of Prompt Tuning with Domain Adaptation
Xu Guo | Boyang Li | Han Yu
Findings of the Association for Computational Linguistics: EMNLP 2022

Prompt tuning, or the conditioning of a frozen pretrained language model (PLM) with soft prompts learned from data, has demonstrated impressive performance on a wide range of NLP tasks. However, prompt tuning requires a large training dataset to be effective and is outperformed by finetuning the entire PLM in data-scarce regimes. Previous work (Gu et al., 2022; Vu et al., 2022) proposed to transfer soft prompts pretrained on the source domain to the target domain. In this paper, we explore domain adaptation for prompt tuning, a problem setting where unlabeled data from the target domain are available during pretraining. We propose bOosting Prompt TunIng with doMain Adaptation (OPTIMA), which regularizes the decision boundary to be smooth around regions where source and target data distributions are similar. Extensive experiments demonstrate that OPTIMA significantly enhances the transferability and sample-efficiency of prompt tuning compared to strong baselines. Moreover, in few-shot settings, OPTIMA exceeds full-model tuning by a large margin.
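The smoothness regularizer can be pictured as a virtual-adversarial-style penalty on unlabeled target examples, weighted by how similar each example's neighborhood is to the source distribution. The PyTorch sketch below is an assumed, simplified reading of that idea; `model`, the similarity weights `sim_weight`, and the hyper-parameters are placeholders, not OPTIMA's exact objective.

```python
import torch
import torch.nn.functional as F

def smoothness_penalty(model, embeds, sim_weight, eps=1e-2, xi=1e-6):
    """Penalize prediction changes under a small worst-case perturbation,
    weighted per example by a source-target similarity score `sim_weight`
    (shape: batch). `embeds` is a (batch, dim) tensor of pooled representations;
    `model` maps embeddings to logits."""
    embeds = embeds.detach()                      # regularize the model, not the features
    with torch.no_grad():
        p = F.softmax(model(embeds), dim=-1)      # reference predictions

    # Estimate the most prediction-changing direction with one power-iteration step.
    d = torch.randn_like(embeds)
    d = xi * F.normalize(d.flatten(1), dim=1).view_as(embeds)
    d.requires_grad_(True)
    kl = F.kl_div(F.log_softmax(model(embeds + d), dim=-1), p, reduction="batchmean")
    grad = torch.autograd.grad(kl, d)[0]
    r_adv = eps * F.normalize(grad.flatten(1), dim=1).view_as(embeds)

    # Per-example KL between reference and perturbed predictions, similarity-weighted.
    kl_per_ex = F.kl_div(F.log_softmax(model(embeds + r_adv), dim=-1),
                         p, reduction="none").sum(dim=-1)
    return (sim_weight * kl_per_ex).mean()
```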

2021

Latent-Optimized Adversarial Neural Transfer for Sarcasm Detection
Xu Guo | Boyang Li | Han Yu | Chunyan Miao
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

The existence of multiple datasets for sarcasm detection prompts us to apply transfer learning to exploit their commonality. The adversarial neural transfer (ANT) framework utilizes multiple loss terms that encourage the source-domain and the target-domain feature distributions to be similar while optimizing for domain-specific performance. However, these objectives may be in conflict, which can lead to optimization difficulties and sometimes diminished transfer. We propose a generalized latent optimization strategy that allows different losses to accommodate each other and improves training dynamics. The proposed method outperforms transfer learning and meta-learning baselines. In particular, we achieve a 10.02% absolute performance gain over the previous state of the art on the iSarcasm dataset.
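A toy way to see "latent optimization" is a one-step lookahead: nudge the shared latent features along the negative gradient of one loss before the other loss is evaluated, so the two objectives are computed on mutually adjusted representations. The PyTorch sketch below is only a simplified illustration of that general idea under these assumptions, not the paper's exact update rule; all names are hypothetical.

```python
import torch

def latent_lookahead(shared_latent, loss_a, loss_b_fn, eta=0.1):
    """`shared_latent` is an intermediate feature tensor on the autograd graph,
    `loss_a` was computed from it, and `loss_b_fn` maps adjusted latents to the
    second loss. Returns a combined objective for a single backward pass."""
    grad = torch.autograd.grad(loss_a, shared_latent, create_graph=True)[0]
    adjusted = shared_latent - eta * grad        # one-step lookahead in latent space
    return loss_a + loss_b_fn(adjusted)

# Usage (names hypothetical):
#   z = encoder(x)
#   total = latent_lookahead(z, domain_loss(z), lambda z_adj: task_loss(classifier(z_adj), y))
#   total.backward()
```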