Abstract
Pre-trained language models (PrLMs) have to carefully manage input units when training on very large texts with vocabularies of millions of words. Previous works have shown that incorporating span-level information over consecutive words in pre-training can further improve the performance of PrLMs. However, because span-level clues are introduced and fixed during pre-training, these methods are time-consuming and lack flexibility. To alleviate this inconvenience, this paper presents a novel span fine-tuning method for PrLMs, which allows the span setting to be determined adaptively by specific downstream tasks during the fine-tuning phase. In detail, any sentence processed by the PrLM is segmented into multiple spans according to a pre-sampled dictionary. The segmentation information is then passed through a hierarchical CNN module together with the representation outputs of the PrLM, ultimately generating a span-enhanced representation. Experiments on the GLUE benchmark show that the proposed span fine-tuning method significantly enhances the PrLM and, at the same time, offers more flexibility in an efficient way.
- Anthology ID: 2021.findings-emnlp.169
- Original: 2021.findings-emnlp.169v1
- Version 2: 2021.findings-emnlp.169v2
- Volume: Findings of the Association for Computational Linguistics: EMNLP 2021
- Month: November
- Year: 2021
- Address: Punta Cana, Dominican Republic
- Editors: Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
- Venue: Findings
- SIG: SIGDAT
- Publisher: Association for Computational Linguistics
- Pages: 1970–1979
- URL: https://aclanthology.org/2021.findings-emnlp.169
- DOI: 10.18653/v1/2021.findings-emnlp.169
- Cite (ACL): Rongzhou Bao, Zhuosheng Zhang, and Hai Zhao. 2021. Span Fine-tuning for Pre-trained Language Models. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 1970–1979, Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Cite (Informal): Span Fine-tuning for Pre-trained Language Models (Bao et al., Findings 2021)
- PDF: https://preview.aclanthology.org/naacl-24-ws-corrections/2021.findings-emnlp.169.pdf
- Data: CoNLL 2003, GLUE, QNLI, SNLI, SST, SST-2
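
To make the pipeline described in the abstract concrete, here is a minimal, hypothetical PyTorch sketch: tokens are greedily segmented into spans against a user-supplied dictionary, an intra-span convolution pools each span, and an inter-span convolution yields the span-enhanced sentence representation. All names (`segment_into_spans`, `SpanEnhancer`), shapes, and the toy dictionary are illustrative assumptions, not the authors' implementation; the two convolution stages simply mirror the hierarchical structure the abstract mentions.

```python
# Hypothetical sketch of span-enhanced fine-tuning (not the paper's code).
# Assumes token_reprs come from any PrLM encoder for a single sentence.
import torch
import torch.nn as nn


def segment_into_spans(tokens, dictionary, max_len=4):
    """Greedy longest-match segmentation; returns (start, end) index pairs."""
    spans, i = [], 0
    while i < len(tokens):
        end = i + 1
        for j in range(min(len(tokens), i + max_len), i + 1, -1):
            if " ".join(tokens[i:j]) in dictionary:
                end = j
                break
        spans.append((i, end))
        i = end
    return spans


class SpanEnhancer(nn.Module):
    """Hierarchical CNN: intra-span conv + max-pool, then inter-span conv."""

    def __init__(self, hidden=768):
        super().__init__()
        self.intra = nn.Conv1d(hidden, hidden, kernel_size=3, padding=1)
        self.inter = nn.Conv1d(hidden, hidden, kernel_size=3, padding=1)
        self.fuse = nn.Linear(2 * hidden, hidden)

    def forward(self, token_reprs, spans):
        # token_reprs: (seq_len, hidden) PrLM output for one sentence.
        span_vecs = []
        for start, end in spans:
            chunk = token_reprs[start:end].t().unsqueeze(0)    # (1, hidden, span_len)
            pooled = self.intra(chunk).max(dim=-1).values      # (1, hidden)
            span_vecs.append(pooled)
        span_seq = torch.cat(span_vecs, dim=0).t().unsqueeze(0)   # (1, hidden, n_spans)
        sent_vec = self.inter(span_seq).max(dim=-1).values        # (1, hidden)
        cls_vec = token_reprs[0].unsqueeze(0)                     # PrLM [CLS]-style vector
        return self.fuse(torch.cat([cls_vec, sent_vec], dim=-1))  # span-enhanced repr


# Toy usage with random stand-in encoder outputs and a tiny dictionary.
tokens = ["new", "york", "is", "a", "big", "city"]
dictionary = {"new york", "big city"}
spans = segment_into_spans(tokens, dictionary)   # [(0, 2), (2, 3), (3, 4), (4, 6)]
token_reprs = torch.randn(len(tokens), 768)      # stand-in for PrLM output
enhanced = SpanEnhancer()(token_reprs, spans)    # (1, 768) task-ready representation
```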