Thi Anh Nguyen


2024

Improving Multi-Label Classification of Similar Languages by Semantics-Aware Word Embeddings
The Ngo | Thi Anh Nguyen | My Ha | Thi Minh Nguyen | Phuong Le-Hong
Proceedings of the Eleventh Workshop on NLP for Similar Languages, Varieties, and Dialects (VarDial 2024)

The VLP team participated in the DSL-ML shared task of the VarDial 2024 workshop, which aims to distinguish texts in similar languages. This paper presents our approach to the problem and discusses our experimental and official results. We propose integrating semantics-aware word embeddings, learned from ConceptNet, into a bidirectional long short-term memory network. This approach achieves good performance: our system ranks among the top two or three best-performing teams for the task.