Abstract
This paper presents the system built by the ASIREM team for the Discriminating between Similar Languages (DSL) Shared Task 2016. The system uses character-based and word-based n-grams separately. ASIREM participated in both sub-tasks (sub-task 1 and sub-task 2) and in both the open and closed tracks. In sub-task 1, which deals with discriminating between similar languages and national language varieties, the system achieved an accuracy of 87.79% on the closed track, ranking ninth (the best result being 89.38%). In sub-task 2, which deals with Arabic dialect identification, the system achieved its best performance using character-based n-grams: 49.67% accuracy in the closed track, ranking fourth (the best result being 51.16%), and 53.18% accuracy in the open track, ranking first.

- Anthology ID: W16-4821
- Volume: Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3)
- Month: December
- Year: 2016
- Address: Osaka, Japan
- Editors: Preslav Nakov, Marcos Zampieri, Liling Tan, Nikola Ljubešić, Jörg Tiedemann, Shervin Malmasi
- Venue: VarDial
- Publisher: The COLING 2016 Organizing Committee
- Pages: 163–169
- URL: https://aclanthology.org/W16-4821
- Cite (ACL): Wafia Adouane, Nasredine Semmar, and Richard Johansson. 2016. ASIREM Participation at the Discriminating Similar Languages Shared Task 2016. In Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3), pages 163–169, Osaka, Japan. The COLING 2016 Organizing Committee.
- Cite (Informal): ASIREM Participation at the Discriminating Similar Languages Shared Task 2016 (Adouane et al., VarDial 2016)
- PDF: https://preview.aclanthology.org/naacl-24-ws-corrections/W16-4821.pdf