NER4ID at SemEval-2022 Task 2: Named Entity Recognition for Idiomaticity Detection

Simone Tedeschi, Roberto Navigli


Abstract
Idioms are lexically complex phrases whose meaning cannot be derived by compositionally interpreting their components. Although the automatic identification and understanding of idioms is essential for a wide range of Natural Language Understanding tasks, idioms are still largely under-investigated. This motivated the organization of SemEval-2022 Task 2, which is divided into two multilingual subtasks: one on idiomaticity detection and the other on sentence embeddings. In this work, we focus on the first subtask and propose a Transformer-based dual-encoder architecture that computes the semantic similarity between a potentially idiomatic expression and its context and, based on this similarity, predicts idiomaticity. We then show how, and to what extent, Named Entity Recognition can be exploited to reduce the confusion of idiom identification systems and thereby improve performance. Our model achieves 92.1 F1 in the one-shot setting and shows strong robustness to unseen idioms, achieving 77.4 F1 in the zero-shot setting. We release our code at https://github.com/Babelscape/ner4id.
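The decision logic outlined in the abstract can be illustrated with a toy sketch. It is not the authors' implementation (that is in the released repository). It assumes that two encoders have already produced one embedding for the expression and one for its context, that low expression-context similarity signals idiomatic use, and that an expression tagged as a named entity is treated as non-idiomatic. The function names, the `is_named_entity` flag, and the 0.5 threshold are all hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def predict_idiomatic(
    expression_vec: Sequence[float],
    context_vec: Sequence[float],
    is_named_entity: bool,
    threshold: float = 0.5,  # hypothetical value, not taken from the paper
) -> bool:
    """Return True if the expression is predicted to be used idiomatically.

    Assumption: a named entity is never labeled idiomatic, so the NER
    signal short-circuits the similarity check.
    Assumption: an expression whose embedding is dissimilar to its context
    (similarity below the threshold) is labeled idiomatic.
    """
    if is_named_entity:
        return False
    return cosine_similarity(expression_vec, context_vec) < threshold


if __name__ == "__main__":
    # Made-up 3-dimensional embeddings standing in for encoder outputs.
    expression = [0.9, 0.1, 0.0]
    context = [0.1, 0.9, 0.2]
    print(predict_idiomatic(expression, context, is_named_entity=False))
    print(predict_idiomatic(expression, context, is_named_entity=True))
```

In a real system the two vectors would come from Transformer encoders and the NER flag from a tagger; here both are supplied by hand so the example stays self-contained.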
Anthology ID:
2022.semeval-1.25
Volume:
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
Month:
July
Year:
2022
Address:
Seattle, United States
Editors:
Guy Emerson, Natalie Schluter, Gabriel Stanovsky, Ritesh Kumar, Alexis Palmer, Nathan Schneider, Siddharth Singh, Shyam Ratan
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
204–210
URL:
https://aclanthology.org/2022.semeval-1.25
DOI:
10.18653/v1/2022.semeval-1.25
Cite (ACL):
Simone Tedeschi and Roberto Navigli. 2022. NER4ID at SemEval-2022 Task 2: Named Entity Recognition for Idiomaticity Detection. In Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022), pages 204–210, Seattle, United States. Association for Computational Linguistics.
Cite (Informal):
NER4ID at SemEval-2022 Task 2: Named Entity Recognition for Idiomaticity Detection (Tedeschi & Navigli, SemEval 2022)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2022.semeval-1.25.pdf
Video:
https://preview.aclanthology.org/emnlp-22-attachments/2022.semeval-1.25.mp4
Code:
babelscape/ner4id