Abstract
We describe the Uppsala NLP submission to SemEval-2021 Task 2 on multilingual and cross-lingual word-in-context disambiguation. We explore the usefulness of three pre-trained multilingual language models: XLM-RoBERTa (XLMR), Multilingual BERT (mBERT), and multilingual distilled BERT (mDistilBERT). We compare these models in two setups, fine-tuning and feature extraction; in the feature-extraction setup we also experiment with dependency-based information. We find that fine-tuning outperforms feature extraction. XLMR performs better than mBERT in the cross-lingual setting, both with fine-tuning and feature extraction, whereas the two models perform similarly in the multilingual setting. mDistilBERT performs poorly with fine-tuning but gives results similar to the other models when used as a feature extractor. We submitted our two best systems, fine-tuned with XLMR and mBERT.
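To make the feature-extraction setup concrete, below is a minimal sketch, assuming the HuggingFace transformers library and the xlm-roberta-base checkpoint; the character spans, mean-pooling choice, and the 0.5 cosine-similarity threshold are illustrative assumptions, not the authors' exact pipeline. It extracts a contextual vector for the target word in each sentence and predicts "same sense" when the two vectors are close; the fine-tuning setup would instead update all model weights on the task's binary classification objective.

```python
# Minimal feature-extraction sketch for word-in-context disambiguation.
# Assumptions (not from the paper): xlm-roberta-base, mean pooling over
# the target word's subwords, and a 0.5 cosine-similarity threshold.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModel.from_pretrained("xlm-roberta-base")
model.eval()

def target_embedding(sentence: str, start: int, end: int) -> torch.Tensor:
    """Mean-pool the last-layer states of the subwords covering the
    target word at character span [start, end)."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, dim)
    # Map each character of the span to its subword token index.
    token_ids = {enc.char_to_token(i) for i in range(start, end)}
    token_ids.discard(None)  # characters with no token (e.g. whitespace)
    return hidden[sorted(token_ids)].mean(dim=0)

# WiC-style decision: same sense iff the contextual vectors are close.
e1 = target_embedding("He sat on the bank of the river.", 14, 18)
e2 = target_embedding("She deposited money at the bank.", 27, 31)
same_sense = torch.cosine_similarity(e1, e2, dim=0) > 0.5  # illustrative
print(bool(same_sense))
```

In the cross-lingual setting of the task, the two sentences may be in different languages; a shared multilingual encoder keeps the two contextual vectors in the same space, so they remain directly comparable.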
- Anthology ID:
- 2021.semeval-1.15
- Volume:
- Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)
- Month:
- August
- Year:
- 2021
- Address:
- Online
- Venue:
- SemEval
- SIGs:
- SIGLEX | SIGSEM
- Publisher:
- Association for Computational Linguistics
- Pages:
- 150–156
- URL:
- https://aclanthology.org/2021.semeval-1.15
- DOI:
- 10.18653/v1/2021.semeval-1.15
- Cite (ACL):
- Huiling You, Xingran Zhu, and Sara Stymne. 2021. Uppsala NLP at SemEval-2021 Task 2: Multilingual Language Models for Fine-tuning and Feature Extraction in Word-in-Context Disambiguation. In Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021), pages 150–156, Online. Association for Computational Linguistics.
- Cite (Informal):
- Uppsala NLP at SemEval-2021 Task 2: Multilingual Language Models for Fine-tuning and Feature Extraction in Word-in-Context Disambiguation (You et al., SemEval 2021)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2021.semeval-1.15.pdf