LangResearchLab NC at SemEval-2021 Task 1: Linguistic Feature Based Modelling for Lexical Complexity

Raksha Agarwal, Niladri Chatterjee


Abstract
The present work aims at assigning a complexity score between 0 and 1 to a target word or phrase in a given sentence. For each Single Word Target, a Random Forest Regressor is trained on a feature set consisting of lexical, semantic, and syntactic information about the target. For each Multiword Target, a set of individual word features is taken along with single word complexities in the feature space. The system yielded the Pearson correlation of 0.7402 and 0.8244 on the test set for the Single and Multiword Targets, respectively.
Anthology ID:
2021.semeval-1.10
Volume:
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)
Month:
August
Year:
2021
Address:
Online
Editors:
Alexis Palmer, Nathan Schneider, Natalie Schluter, Guy Emerson, Aurelie Herbelot, Xiaodan Zhu
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Note:
Pages:
120–125
Language:
URL:
https://aclanthology.org/2021.semeval-1.10
DOI:
10.18653/v1/2021.semeval-1.10
Bibkey:
Cite (ACL):
Raksha Agarwal and Niladri Chatterjee. 2021. LangResearchLab NC at SemEval-2021 Task 1: Linguistic Feature Based Modelling for Lexical Complexity. In Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021), pages 120–125, Online. Association for Computational Linguistics.
Cite (Informal):
LangResearchLab NC at SemEval-2021 Task 1: Linguistic Feature Based Modelling for Lexical Complexity (Agarwal & Chatterjee, SemEval 2021)
Copy Citation:
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2021.semeval-1.10.pdf