Abstract
This paper describes Team LingJing’s system for SemEval-2022 Task 1, Comparing Dictionaries and Word Embeddings (CODWOE). The task compares two types of semantic descriptions and comprises two sub-tasks: the definition modeling track and the reverse dictionary track. Our team focuses on the reverse dictionary track and adopts multi-task self-supervised pre-training for multilingual reverse dictionaries. Specifically, a randomly initialized mDeBERTa-base model is pre-trained in a multi-task fashion on the multilingual training datasets. Pre-training is divided into two stages: an MLM pre-training stage and a contrastive pre-training stage. Experimental results show that the proposed method performs well in the reverse dictionary track, ranking first on the SGNS targets for the EN and RU languages. All experimental code is open-sourced at https://github.com/WENGSYX/Semeval.
- Anthology ID:
- 2022.semeval-1.4
- Volume:
- Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
- Month:
- July
- Year:
- 2022
- Address:
- Seattle, United States
- Venue:
- SemEval
- SIGs:
- SIGLEX | SIGSEM
- Publisher:
- Association for Computational Linguistics
- Pages:
- 29–35
- URL:
- https://aclanthology.org/2022.semeval-1.4
- DOI:
- 10.18653/v1/2022.semeval-1.4
- Cite (ACL):
- Bin Li, Yixuan Weng, Fei Xia, Shizhu He, Bin Sun, and Shutao Li. 2022. LingJing at SemEval-2022 Task 1: Multi-task Self-supervised Pre-training for Multilingual Reverse Dictionary. In Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022), pages 29–35, Seattle, United States. Association for Computational Linguistics.
- Cite (Informal):
- LingJing at SemEval-2022 Task 1: Multi-task Self-supervised Pre-training for Multilingual Reverse Dictionary (Li et al., SemEval 2022)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2022.semeval-1.4.pdf
- Code:
- wengsyx/semeval
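
The abstract names a contrastive pre-training stage but does not specify its loss. As a minimal sketch only, the following shows one common formulation, an in-batch InfoNCE-style loss, under the assumption that each encoded definition is pulled toward its own target embedding and pushed away from the other targets in the batch. The function names, the temperature value, and the toy vectors are hypothetical and are not taken from the paper or its repository.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(predicted, targets, temperature=0.1):
    """Mean in-batch contrastive loss.

    Row i of `predicted` (an encoded definition) is treated as the positive
    match for row i of `targets`; all other rows act as negatives.
    The temperature of 0.1 is an illustrative assumption.
    """
    losses = []
    for i, pred in enumerate(predicted):
        logits = [cosine(pred, tgt) / temperature for tgt in targets]
        # Numerically stable log-sum-exp over the batch.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy 2-dimensional vectors standing in for word embeddings (hypothetical data).
targets = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # predictions close to their own targets
swapped = [[0.1, 0.9], [0.9, 0.1]]   # predictions close to the wrong targets

loss_aligned = info_nce_loss(aligned, targets)
loss_swapped = info_nce_loss(swapped, targets)
```

In this toy setting the aligned predictions yield the lower loss, which is the behavior a contrastive objective is meant to reward. A real system would compute the same quantity on batched model outputs in a deep learning framework.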