Abstract
This paper describes our submission to the 2016 Discriminating Similar Languages (DSL) Shared Task. We participated in the closed Sub-task 1 with two separate machine learning techniques. The first approach is a character-based Convolutional Neural Network with an LSTM layer (CLSTM), which achieved an accuracy of 78.45% with minimal tuning. The second approach is a character-based n-gram model of size 7. It achieved an accuracy of 88.45%, which is close to the accuracy of 89.38% achieved by the best submission.
- Anthology ID:
- W16-4831
- Volume:
- Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3)
- Month:
- December
- Year:
- 2016
- Address:
- Osaka, Japan
- Editors:
- Preslav Nakov, Marcos Zampieri, Liling Tan, Nikola Ljubešić, Jörg Tiedemann, Shervin Malmasi
- Venue:
- VarDial
- Publisher:
- The COLING 2016 Organizing Committee
- Pages:
- 243–250
- URL:
- https://aclanthology.org/W16-4831
- Cite (ACL):
- Andre Cianflone and Leila Kosseim. 2016. N-gram and Neural Language Models for Discriminating Similar Languages. In Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3), pages 243–250, Osaka, Japan. The COLING 2016 Organizing Committee.
- Cite (Informal):
- N-gram and Neural Language Models for Discriminating Similar Languages (Cianflone & Kosseim, VarDial 2016)
- PDF:
- https://preview.aclanthology.org/ml4al-ingestion/W16-4831.pdf