Yuanhang Yang
2023
Once is Enough: A Light-Weight Cross-Attention for Fast Sentence Pair Modeling
Yuanhang Yang
|
Shiyi Qi
|
Chuanyi Liu
|
Qifan Wang
|
Cuiyun Gao
|
Zenglin Xu
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Transformer-based models have achieved great success on sentence pair modeling tasks, such as answer selection and natural language inference (NLI). These models generally perform cross-attention over input pairs, leading to prohibitive computational cost. Recent studies propose dual-encoder and late interaction architectures for faster computation. However, the balance between the expressive of cross-attention and computation speedup still needs better coordinated. To this end, this paper introduces a novel paradigm TopicAns for efficient sentence pair modeling. TopicAns involves a lightweight cross-attention mechanism. It conducts query encoding only once while modeling the query-candidate interaction in parallel. Extensive experiments conducted on four tasks demonstrate that our TopicAnscan speed up sentence pairing by over 113x while achieving comparable performance as the more expensive cross-attention models.
FedPETuning: When Federated Learning Meets the Parameter-Efficient Tuning Methods of Pre-trained Language Models
Zhuo Zhang
|
Yuanhang Yang
|
Yong Dai
|
Qifan Wang
|
Yue Yu
|
Lizhen Qu
|
Zenglin Xu
Findings of the Association for Computational Linguistics: ACL 2023
With increasing concerns about data privacy, there is an increasing necessity of fine-tuning pre-trained language models (PLMs) for adapting to downstream tasks located in end-user devices or local clients without transmitting data to the central server. This urgent necessity therefore calls the research of investigating federated learning (FL) for PLMs. However, large PLMs bring the curse of prohibitive communication overhead and local model adaptation costs for the FL system. To this end, we investigate the parameter-efficient tuning (PETuning) of PLMs and develop a corresponding federated benchmark for four representative PETuning methods, dubbed FedPETuning. Specifically, FedPETuning provides the first holistic empirical study of representative PLMs tuning methods in FL, covering privacy attacks, performance comparisons, and resource-constrained analysis. Intensive experimental results have indicated that FedPETuning can efficiently defend against privacy attacks and maintains acceptable performance with reducing heavy resource consumption. The open-source code and data are available at https://github.com/SMILELab-FL/FedPETuning.
Search
Co-authors
- Qifan Wang 2
- Zenglin Xu 2
- Shiyi Qi 1
- Chuanyi Liu 1
- Cuiyun Gao 1
- show all...