Abstract
This paper describes our system for SemEval-2022 Task 11, Multilingual Complex Named Entity Recognition, which ranked 3rd on the track 1 leaderboard. We propose Dictionary-fused BERT, a flexible approach for integrating entity dictionaries. The main ideas of our system are: 1) integrating external knowledge (an entity dictionary) into pre-trained models to obtain contextualized word and entity representations; 2) designing a robust loss function that leverages a logit matrix; and 3) adding an auxiliary task, a binary classifier on top of the model that decides whether each token is a mention word, which makes the main task easier to learn. Notably, after updating the entity dictionary to that of (CITATION), our system achieves an F1 of 0.914 in the post-evaluation stage, which is higher than the 1st-place score on the evaluation-stage leaderboard.
- Anthology ID:
- 2022.semeval-1.229
- Volume:
- Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
- Month:
- July
- Year:
- 2022
- Address:
- Seattle, United States
- Editors:
- Guy Emerson, Natalie Schluter, Gabriel Stanovsky, Ritesh Kumar, Alexis Palmer, Nathan Schneider, Siddharth Singh, Shyam Ratan
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1665–1670
- URL:
- https://aclanthology.org/2022.semeval-1.229
- DOI:
- 10.18653/v1/2022.semeval-1.229
- Cite (ACL):
- Long Ma, Xiaorong Jian, and Xuan Li. 2022. PAI at SemEval-2022 Task 11: Name Entity Recognition with Contextualized Entity Representations and Robust Loss Functions. In Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022), pages 1665–1670, Seattle, United States. Association for Computational Linguistics.
- Cite (Informal):
- PAI at SemEval-2022 Task 11: Name Entity Recognition with Contextualized Entity Representations and Robust Loss Functions (Ma et al., SemEval 2022)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-5/2022.semeval-1.229.pdf
- Code:
- diqiuzhuanzhuan/semeval2022
- Data:
- MultiCoNER
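
The abstract names two ingredients that can be illustrated without a neural model: matching the input against an entity dictionary, and deriving binary "mention word" targets for the auxiliary task. The following is a minimal sketch only, not the authors' implementation. The toy dictionary `ENTITY_DICT`, the greedy longest-match strategy, and the helper names `dictionary_tags` and `mention_labels` are all assumptions made for illustration; the abstract does not say how matches are produced or how they are fused into BERT.

```python
from typing import Dict, List, Tuple

# Hypothetical toy dictionary (an assumption, not the paper's resource):
# lowercased token tuples mapped to an entity type.
ENTITY_DICT: Dict[Tuple[str, ...], str] = {
    ("new", "york"): "LOC",
    ("new", "york", "times"): "CORP",
    ("seattle",): "LOC",
}
MAX_SPAN = max(len(key) for key in ENTITY_DICT)


def dictionary_tags(tokens: List[str]) -> List[str]:
    """Greedy longest-match lookup; returns one BIO-style tag per token."""
    lowered = [t.lower() for t in tokens]
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span starting at position i first.
        for length in range(min(MAX_SPAN, len(tokens) - i), 0, -1):
            span = tuple(lowered[i:i + length])
            if span in ENTITY_DICT:
                label = ENTITY_DICT[span]
                tags[i] = "B-" + label
                for j in range(i + 1, i + length):
                    tags[j] = "I-" + label
                i += length
                matched = True
                break
        if not matched:
            i += 1
    return tags


def mention_labels(bio_tags: List[str]) -> List[int]:
    """Auxiliary binary targets: 1 if the token is part of a mention, else 0."""
    return [0 if tag == "O" else 1 for tag in bio_tags]


tokens = ["She", "reads", "the", "New", "York", "Times", "in", "Seattle"]
gold = ["O", "O", "O", "B-CORP", "I-CORP", "I-CORP", "O", "B-LOC"]
dict_feats = dictionary_tags(tokens)   # dictionary evidence per token
aux_targets = mention_labels(gold)     # targets for the binary auxiliary head
```

In a full system, tags like `dict_feats` would be one possible input to the dictionary-fusion step, and `aux_targets` would supervise the binary classifier alongside the main tagging loss; how the two losses are weighted is not stated in the abstract.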