CMNEROne at SemEval-2022 Task 11: Code-Mixed Named Entity Recognition by leveraging multilingual data

Suman Dowlagar, Radhika Mamidi


Abstract
Identifying named entities is a practical and challenging task in Natural Language Processing. Named Entity Recognition (NER) on code-mixed text is even more challenging because of the linguistic complexity that language mixing introduces. This paper describes the submission of team CMNEROne to SemEval-2022 Task 11, MultiCoNER. The code-mixed NER task aimed to identify named entities in a code-mixed dataset. Our system performs NER on the code-mixed dataset by leveraging multilingual data. We achieved a weighted average F1 score of 0.7044, which is 6% higher than the NER baseline.
Anthology ID:
2022.semeval-1.214
Volume:
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
Month:
July
Year:
2022
Address:
Seattle, United States
Venue:
SemEval
SIGs:
SIGLEX | SIGSEM
Publisher:
Association for Computational Linguistics
Pages:
1556–1561
URL:
https://aclanthology.org/2022.semeval-1.214
DOI:
10.18653/v1/2022.semeval-1.214
Cite (ACL):
Suman Dowlagar and Radhika Mamidi. 2022. CMNEROne at SemEval-2022 Task 11: Code-Mixed Named Entity Recognition by leveraging multilingual data. In Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022), pages 1556–1561, Seattle, United States. Association for Computational Linguistics.
Cite (Informal):
CMNEROne at SemEval-2022 Task 11: Code-Mixed Named Entity Recognition by leveraging multilingual data (Dowlagar & Mamidi, SemEval 2022)
PDF:
https://preview.aclanthology.org/auto-file-uploads/2022.semeval-1.214.pdf
Data
MultiCoNER