Shuffled-token Detection for Refining Pre-trained RoBERTa
Subhadarshi Panda, Anjali Agrawal, Jeewon Ha, Benjamin Bloch
Abstract
State-of-the-art transformer models have achieved robust performance on a variety of NLP tasks. Many of these approaches employ domain-agnostic pre-training tasks to train models that yield highly generalized sentence representations, which can then be fine-tuned for specific downstream tasks. We propose refining a pre-trained NLP model using the objective of detecting shuffled tokens. We use a sequential approach, starting with the pre-trained RoBERTa model and continuing to train it with our objective. Applying a random shuffling strategy at the word level, we found that our approach enables the RoBERTa model to achieve better performance on 4 out of 7 GLUE tasks. Our results indicate that learning to detect shuffled tokens is a promising approach for learning more coherent sentence representations.
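The abstract only names the objective; as a concrete illustration, the sketch below shows one way word-level shuffled-token detection examples could be constructed. The function name, the 15% shuffle ratio, and the labeling scheme are assumptions for illustration, not the authors' implementation.

```python
import random

def make_shuffled_example(words, shuffle_ratio=0.15, rng=None):
    """Shuffle a random subset of word positions; label a word 1 if it
    moved to a new position, else 0. The 15% ratio is an assumed value
    for illustration, not a number taken from the paper."""
    rng = rng or random.Random()
    n = len(words)
    if n < 2:                       # nothing to shuffle in a 0/1-word input
        return list(words), [0] * n
    k = min(n, max(2, int(n * shuffle_ratio)))
    positions = sorted(rng.sample(range(n), k))
    permuted = positions[:]
    while permuted == positions:    # force a non-identity permutation
        rng.shuffle(permuted)
    shuffled = list(words)
    for src, dst in zip(positions, permuted):
        shuffled[dst] = words[src]
    labels = [int(shuffled[i] != words[i]) for i in range(n)]
    return shuffled, labels

words = "learning to detect shuffled tokens yields coherent sentence representations".split()
x, y = make_shuffled_example(words, rng=random.Random(0))
print(list(zip(x, y)))  # displaced words carry label 1, untouched words label 0
```

A token-level binary classification head on top of RoBERTa would then be trained to predict these per-token labels, analogous in spirit to ELECTRA's replaced-token detection.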
- Anthology ID: 2021.naacl-srw.12
- Volume: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop
- Month: June
- Year: 2021
- Address: Online
- Venue: NAACL
- Publisher: Association for Computational Linguistics
- Pages: 88–93
- URL: https://aclanthology.org/2021.naacl-srw.12
- DOI: 10.18653/v1/2021.naacl-srw.12
- Cite (ACL): Subhadarshi Panda, Anjali Agrawal, Jeewon Ha, and Benjamin Bloch. 2021. Shuffled-token Detection for Refining Pre-trained RoBERTa. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop, pages 88–93, Online. Association for Computational Linguistics.
- Cite (Informal): Shuffled-token Detection for Refining Pre-trained RoBERTa (Panda et al., NAACL 2021)
- PDF: https://preview.aclanthology.org/nodalida-main-page/2021.naacl-srw.12.pdf