Improving Multi-Label Classification of Similar Languages by Semantics-Aware Word Embeddings
The Ngo, Thi Anh Nguyen, My Ha, Thi Minh Nguyen, Phuong Le-Hong
Abstract
The VLP team participated in the DSL-ML shared task of the VarDial 2024 workshop, which aims to distinguish texts in similar languages. This paper presents our approach to solving the problem and discusses our experimental and official results. We propose to integrate semantics-aware word embeddings, which are learned from ConceptNet, into a bidirectional long short-term memory network. This approach achieves good performance: our system is ranked in the top two or three of the best performing teams for the task.
- Anthology ID:
- 2024.vardial-1.21
- Volume:
- Proceedings of the Eleventh Workshop on NLP for Similar Languages, Varieties, and Dialects (VarDial 2024)
- Month:
- June
- Year:
- 2024
- Address:
- Mexico City, Mexico
- Editors:
- Yves Scherrer, Tommi Jauhiainen, Nikola Ljubešić, Marcos Zampieri, Preslav Nakov, Jörg Tiedemann
- Venues:
- VarDial | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 235–240
- URL:
- https://aclanthology.org/2024.vardial-1.21
- DOI:
- 10.18653/v1/2024.vardial-1.21
- Cite (ACL):
- The Ngo, Thi Anh Nguyen, My Ha, Thi Minh Nguyen, and Phuong Le-Hong. 2024. Improving Multi-Label Classification of Similar Languages by Semantics-Aware Word Embeddings. In Proceedings of the Eleventh Workshop on NLP for Similar Languages, Varieties, and Dialects (VarDial 2024), pages 235–240, Mexico City, Mexico. Association for Computational Linguistics.
- Cite (Informal):
- Improving Multi-Label Classification of Similar Languages by Semantics-Aware Word Embeddings (Ngo et al., VarDial-WS 2024)
- PDF:
- https://preview.aclanthology.org/landing_page/2024.vardial-1.21.pdf