Abstract
In this paper, we describe a system (CGLI) for discriminating between similar languages, varieties, and dialects using convolutional neural networks (CNNs) and long short-term memory (LSTM) neural networks. We participated in the Arabic dialect identification sub-task of the DSL 2016 shared task, which involves distinguishing between different Arabic language texts, under the closed submission track. Our proposed approach is language independent and can discriminate between any given set of languages, varieties, and dialects. Using the CNN approach with default network parameters, we obtained a weighted F1 score of 43.29% in this sub-task.
- Anthology ID:
- W16-4824
- Volume:
- Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3)
- Month:
- December
- Year:
- 2016
- Address:
- Osaka, Japan
- Editors:
- Preslav Nakov, Marcos Zampieri, Liling Tan, Nikola Ljubešić, Jörg Tiedemann, Shervin Malmasi
- Venue:
- VarDial
- Publisher:
- The COLING 2016 Organizing Committee
- Pages:
- 185–194
- URL:
- https://aclanthology.org/W16-4824
- Cite (ACL):
- Chinnappa Guggilla. 2016. Discrimination between Similar Languages, Varieties and Dialects using CNN- and LSTM-based Deep Neural Networks. In Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3), pages 185–194, Osaka, Japan. The COLING 2016 Organizing Committee.
- Cite (Informal):
- Discrimination between Similar Languages, Varieties and Dialects using CNN- and LSTM-based Deep Neural Networks (Guggilla, VarDial 2016)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/W16-4824.pdf