CLULEX at SemEval-2021 Task 1: A Simple System Goes a Long Way
Greta Smolenska, Peter Kolb, Sinan Tang, Mironas Bitinis, Héctor Hernández, Elin Asklöv
Abstract
This paper presents the system we submitted to the first Lexical Complexity Prediction (LCP) Shared Task 2021. The Shared Task provides participants with a new English dataset that includes context of the target word. We participate in the single-word complexity prediction sub-task and focus on feature engineering. Our best system is trained on linguistic features and word embeddings (Pearson’s score of 0.7942). We demonstrate, however, that a simpler feature set achieves comparable results and submit a model trained on 36 linguistic features (Pearson’s score of 0.7925).
- Anthology ID:
- 2021.semeval-1.81
- Volume:
- Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)
- Month:
- August
- Year:
- 2021
- Address:
- Online
- Editors:
- Alexis Palmer, Nathan Schneider, Natalie Schluter, Guy Emerson, Aurelie Herbelot, Xiaodan Zhu
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 632–639
- URL:
- https://aclanthology.org/2021.semeval-1.81
- DOI:
- 10.18653/v1/2021.semeval-1.81
- Cite (ACL):
- Greta Smolenska, Peter Kolb, Sinan Tang, Mironas Bitinis, Héctor Hernández, and Elin Asklöv. 2021. CLULEX at SemEval-2021 Task 1: A Simple System Goes a Long Way. In Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021), pages 632–639, Online. Association for Computational Linguistics.
- Cite (Informal):
- CLULEX at SemEval-2021 Task 1: A Simple System Goes a Long Way (Smolenska et al., SemEval 2021)
- PDF:
- https://preview.aclanthology.org/ml4al-ingestion/2021.semeval-1.81.pdf