CLUZH at VarDial GDI 2017: Testing a Variety of Machine Learning Tools for the Classification of Swiss German Dialects

Simon Clematide, Peter Makarov


Abstract
Our submissions for the GDI 2017 Shared Task are the results from three different types of classifiers: Naïve Bayes, Conditional Random Fields (CRF), and Support Vector Machine (SVM). Our CRF-based run achieves a weighted F1 score of 65% (third rank) being beaten by the best system by 0.9%. Measured by classification accuracy, our ensemble run (Naïve Bayes, CRF, SVM) reaches 67% (second rank) being 1% lower than the best system. We also describe our experiments with Recurrent Neural Network (RNN) architectures. Since they performed worse than our non-neural approaches we did not include them in the submission.
Anthology ID:
W17-1221
Volume:
Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial)
Month:
April
Year:
2017
Address:
Valencia, Spain
Editors:
Preslav Nakov, Marcos Zampieri, Nikola Ljubešić, Jörg Tiedemann, Shevin Malmasi, Ahmed Ali
Venue:
VarDial
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
170–177
Language:
URL:
https://aclanthology.org/W17-1221
DOI:
10.18653/v1/W17-1221
Bibkey:
Cite (ACL):
Simon Clematide and Peter Makarov. 2017. CLUZH at VarDial GDI 2017: Testing a Variety of Machine Learning Tools for the Classification of Swiss German Dialects. In Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial), pages 170–177, Valencia, Spain. Association for Computational Linguistics.
Cite (Informal):
CLUZH at VarDial GDI 2017: Testing a Variety of Machine Learning Tools for the Classification of Swiss German Dialects (Clematide & Makarov, VarDial 2017)
Copy Citation:
PDF:
https://preview.aclanthology.org/landing_page/W17-1221.pdf