Named Entity Recognition on Code-Switched Data: Overview of the CALCS 2018 Shared Task
Gustavo Aguilar, Fahad AlGhamdi, Victor Soto, Mona Diab, Julia Hirschberg, Thamar Solorio
Abstract
In the third shared task of the Computational Approaches to Linguistic Code-Switching (CALCS) workshop, we focus on Named Entity Recognition (NER) on code-switched social-media data. We divide the shared task into two competitions based on the English-Spanish (ENG-SPA) and Modern Standard Arabic-Egyptian (MSA-EGY) language pairs. We use Twitter data and 9 entity types to establish a new dataset for code-switched NER benchmarks. In addition to the CS phenomenon, the diversity of the entities and the social media challenges make the task considerably hard to process. As a result, the best scores of the competitions are 63.76% and 71.61% for ENG-SPA and MSA-EGY, respectively. We present the scores of 9 participants and discuss the most common challenges among submissions.
- Anthology ID:
- W18-3219
- Volume:
- Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching
- Month:
- July
- Year:
- 2018
- Address:
- Melbourne, Australia
- Editors:
- Gustavo Aguilar, Fahad AlGhamdi, Victor Soto, Thamar Solorio, Mona Diab, Julia Hirschberg
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 138–147
- URL:
- https://aclanthology.org/W18-3219
- DOI:
- 10.18653/v1/W18-3219
- Cite (ACL):
- Gustavo Aguilar, Fahad AlGhamdi, Victor Soto, Mona Diab, Julia Hirschberg, and Thamar Solorio. 2018. Named Entity Recognition on Code-Switched Data: Overview of the CALCS 2018 Shared Task. In Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching, pages 138–147, Melbourne, Australia. Association for Computational Linguistics.
- Cite (Informal):
- Named Entity Recognition on Code-Switched Data: Overview of the CALCS 2018 Shared Task (Aguilar et al., ACL 2018)
- PDF:
- https://preview.aclanthology.org/emnlp22-frontmatter/W18-3219.pdf
- Data
- IPM NEL