Scaling Sustainable Development Goal Predictions across Languages: From English to Finnish
Melany Macias, Lev Kharlashkin, Leo Huovinen, Mika Hämäläinen
Abstract
In this paper, we use an exclusively English classification dataset of UN Sustainable Development Goals (SDGs) in an education context to train various multilingual classifiers, and we examine how well these models adapt to recognizing the same classes in Finnish university course descriptions. Finnish, with only 5 million native speakers, is significantly less resourced than English. The best-performing model in our experiments was mBART, with an F1-score of 0.843.
- Anthology ID:
- 2024.iwclul-1.17
- Volume:
- Proceedings of the 9th International Workshop on Computational Linguistics for Uralic Languages
- Month:
- November
- Year:
- 2024
- Address:
- Helsinki, Finland
- Editors:
- Mika Hämäläinen, Flammie Pirinen, Melany Macias, Mario Crespo Avila
- Venue:
- IWCLUL
- SIG:
- SIGUR
- Publisher:
- Association for Computational Linguistics
- Pages:
- 132–137
- URL:
- https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.iwclul-1.17/
- Cite (ACL):
- Melany Macias, Lev Kharlashkin, Leo Huovinen, and Mika Hämäläinen. 2024. Scaling Sustainable Development Goal Predictions across Languages: From English to Finnish. In Proceedings of the 9th International Workshop on Computational Linguistics for Uralic Languages, pages 132–137, Helsinki, Finland. Association for Computational Linguistics.
- Cite (Informal):
- Scaling Sustainable Development Goal Predictions across Languages: From English to Finnish (Macias et al., IWCLUL 2024)
- PDF:
- https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.iwclul-1.17.pdf