Discrimination between Similar Languages, Varieties and Dialects using CNN- and LSTM-based Deep Neural Networks

Chinnappa Guggilla


Abstract
In this paper, we describe a system (CGLI) for discriminating between similar languages, varieties and dialects using convolutional neural networks (CNNs) and long short-term memory (LSTM) neural networks. We participated in the Arabic dialect identification sub-task of the DSL 2016 shared task, for distinguishing texts in different Arabic dialects, under the closed submission track. Our proposed approach is language independent and can be applied to any given set of languages, varieties, and dialects. We obtained a weighted F1 score of 43.29% in this sub-task using the CNN approach with default network parameters.
Anthology ID:
W16-4824
Volume:
Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3)
Month:
December
Year:
2016
Address:
Osaka, Japan
Editors:
Preslav Nakov, Marcos Zampieri, Liling Tan, Nikola Ljubešić, Jörg Tiedemann, Shervin Malmasi
Venue:
VarDial
Publisher:
The COLING 2016 Organizing Committee
Pages:
185–194
URL:
https://aclanthology.org/W16-4824
Cite (ACL):
Chinnappa Guggilla. 2016. Discrimination between Similar Languages, Varieties and Dialects using CNN- and LSTM-based Deep Neural Networks. In Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3), pages 185–194, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Discrimination between Similar Languages, Varieties and Dialects using CNN- and LSTM-based Deep Neural Networks (Guggilla, VarDial 2016)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/W16-4824.pdf