Abstract
Our submissions for the GDI 2017 Shared Task are the results from three different types of classifiers: Naïve Bayes, Conditional Random Fields (CRF), and Support Vector Machine (SVM). Our CRF-based run achieves a weighted F1 score of 65% (third rank), 0.9 percentage points behind the best system. Measured by classification accuracy, our ensemble run (Naïve Bayes, CRF, SVM) reaches 67% (second rank), 1 percentage point below the best system. We also describe our experiments with Recurrent Neural Network (RNN) architectures. Since they performed worse than our non-neural approaches, we did not include them in the submission.
- Anthology ID:
- W17-1221
- Volume:
- Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial)
- Month:
- April
- Year:
- 2017
- Address:
- Valencia, Spain
- Editors:
- Preslav Nakov, Marcos Zampieri, Nikola Ljubešić, Jörg Tiedemann, Shevin Malmasi, Ahmed Ali
- Venue:
- VarDial
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 170–177
- Language:
- URL:
- https://aclanthology.org/W17-1221
- DOI:
- 10.18653/v1/W17-1221
- Cite (ACL):
- Simon Clematide and Peter Makarov. 2017. CLUZH at VarDial GDI 2017: Testing a Variety of Machine Learning Tools for the Classification of Swiss German Dialects. In Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial), pages 170–177, Valencia, Spain. Association for Computational Linguistics.
- Cite (Informal):
- CLUZH at VarDial GDI 2017: Testing a Variety of Machine Learning Tools for the Classification of Swiss German Dialects (Clematide & Makarov, VarDial 2017)
- PDF:
- https://preview.aclanthology.org/naacl-24-ws-corrections/W17-1221.pdf