Abstract
This paper describes an approach to detecting idiomaticity using only the contextualized representation of a multiword expression (MWE) over multilingual pretrained language models. Our experiments find that larger models are usually more effective at idiomaticity detection. However, using a higher layer of the model does not guarantee better performance. In multilingual scenarios, convergence is not consistent across languages, and rich-resource languages have a large advantage over other languages.
- Anthology ID:
- 2022.semeval-1.23
- Volume:
- Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
- Month:
- July
- Year:
- 2022
- Address:
- Seattle, United States
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 190–196
- URL:
- https://aclanthology.org/2022.semeval-1.23
- DOI:
- 10.18653/v1/2022.semeval-1.23
- Cite (ACL):
- Minghuan Tan. 2022. HiJoNLP at SemEval-2022 Task 2: Detecting Idiomaticity of Multiword Expressions using Multilingual Pretrained Language Models. In Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022), pages 190–196, Seattle, United States. Association for Computational Linguistics.
- Cite (Informal):
- HiJoNLP at SemEval-2022 Task 2: Detecting Idiomaticity of Multiword Expressions using Multilingual Pretrained Language Models (Tan, SemEval 2022)
- PDF:
- https://preview.aclanthology.org/remove-xml-comments/2022.semeval-1.23.pdf
- Code:
- visualjoyce/ciyi
- Data:
- AStitchInLanguageModels