JUST-BLUE at SemEval-2021 Task 1: Predicting Lexical Complexity using BERT and RoBERTa Pre-trained Language Models
Tuqa Bani Yaseen, Qusai Ismail, Sarah Al-Omari, Eslam Al-Sobh, Malak Abdullah
Abstract
Predicting the complexity level of a word or a phrase is considered a challenging task. It is even recognized as a crucial step in numerous NLP applications, such as text rearrangement and text simplification. Early research treated the task as binary classification, where systems predicted whether a word is complex (complex versus non-complex). Other studies were designed to assess the level of word complexity using regression models or multi-label classification models. Deep learning models show a significant improvement over machine learning models with the rise of transfer learning and pre-trained language models. This paper presents our approach, which won first place in SemEval-2021 Task 1 (Sub-task 1): predicting the degree of complexity of a word within a text on a scale from 0 to 1. We ranked first in the competition using the pre-trained language models BERT and RoBERTa, with a Pearson correlation score of 0.788.
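The abstract describes fine-tuning BERT and RoBERTa as regressors that score a target word in its sentence context. Below is a minimal sketch of that kind of setup using the Hugging Face transformers library; the checkpoints, the sentence-pair encoding, and the example inputs are illustrative assumptions, not the authors' released code.

```python
# A minimal sketch (assumed, not the authors' released system) of scoring a
# word's complexity in context with a BERT-style regression head.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "bert-base-uncased"  # the paper also uses RoBERTa, e.g. "roberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# num_labels=1 gives the model a single-output head, so after fine-tuning on
# gold complexity labels in [0, 1] it predicts a continuous score.
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=1)
model.eval()

# Assumed input format: encode the target word and its sentence as a
# sentence pair, so the model sees the word together with its context.
word = "ratified"
sentence = "The amendment was ratified by the committee."
inputs = tokenizer(word, sentence, return_tensors="pt", truncation=True)

with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()
print(f"predicted complexity: {score:.3f}")  # meaningful only after fine-tuning
```

Sub-task 1 is evaluated by the Pearson correlation between predicted and gold complexity scores (computable with, e.g., scipy.stats.pearsonr); combining the BERT and RoBERTa predictions, for instance by averaging, is one assumed way a two-model system like this could be ensembled.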
- Anthology ID:
- 2021.semeval-1.85
- Volume:
- Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)
- Month:
- August
- Year:
- 2021
- Address:
- Online
- Venue:
- SemEval
- SIGs:
- SIGLEX | SIGSEM
- Publisher:
- Association for Computational Linguistics
- Pages:
- 661–666
- URL:
- https://aclanthology.org/2021.semeval-1.85
- DOI:
- 10.18653/v1/2021.semeval-1.85
- Cite (ACL):
- Tuqa Bani Yaseen, Qusai Ismail, Sarah Al-Omari, Eslam Al-Sobh, and Malak Abdullah. 2021. JUST-BLUE at SemEval-2021 Task 1: Predicting Lexical Complexity using BERT and RoBERTa Pre-trained Language Models. In Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021), pages 661–666, Online. Association for Computational Linguistics.
- Cite (Informal):
- JUST-BLUE at SemEval-2021 Task 1: Predicting Lexical Complexity using BERT and RoBERTa Pre-trained Language Models (Bani Yaseen et al., SemEval 2021)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2021.semeval-1.85.pdf