PAI at SemEval-2022 Task 11: Name Entity Recognition with Contextualized Entity Representations and Robust Loss Functions

Long Ma, Xiaorong Jian, Xuan Li


Abstract
This paper describes our system used in the SemEval-2022 Task 11 Multilingual Complex Named Entity Recognition, achieving 3rd for track 1 on the leaderboard. We propose Dictionary-fused BERT, a flexible approach for entity dictionaries integration. The main ideas of our systems are:1) integrating external knowledge (an entity dictionary) into pre-trained models to obtain contextualized word and entity representations 2) designing a robust loss function leveraging a logit matrix 3) adding an auxiliary task, which is an on-top binary classification to decide whether the token is a mention word or not, makes the main task easier to learn. It is worth noting that our system achieves an F1 of 0.914 in the post-evaluation stage by updating the entity dictionary to the one of (CITATION), which is higher than the score of 1st on the leaderboard of the evaluation stage.
Anthology ID:
2022.semeval-1.229
Volume:
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
Month:
July
Year:
2022
Address:
Seattle, United States
Venue:
SemEval
SIGs:
SIGLEX | SIGSEM
Publisher:
Association for Computational Linguistics
Note:
Pages:
1665–1670
Language:
URL:
https://aclanthology.org/2022.semeval-1.229
DOI:
10.18653/v1/2022.semeval-1.229
Bibkey:
Cite (ACL):
Long Ma, Xiaorong Jian, and Xuan Li. 2022. PAI at SemEval-2022 Task 11: Name Entity Recognition with Contextualized Entity Representations and Robust Loss Functions. In Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022), pages 1665–1670, Seattle, United States. Association for Computational Linguistics.
Cite (Informal):
PAI at SemEval-2022 Task 11: Name Entity Recognition with Contextualized Entity Representations and Robust Loss Functions (Ma et al., SemEval 2022)
Copy Citation:
PDF:
https://preview.aclanthology.org/auto-file-uploads/2022.semeval-1.229.pdf
Data
MultiCoNER