Revisiting Projection-based Data Transfer for Cross-Lingual Named Entity Recognition in Low-Resource Languages

Andrei Politov, Oleh Shkalikov, Rene Jäkel, Michael Färber


Abstract
Cross-lingual Named Entity Recognition (NER) leverages knowledge transfer between languages to identify and classify named entities, making it particularly useful for low-resource languages. We show that data-based cross-lingual transfer is an effective technique for cross-lingual NER and can outperform multilingual language models for low-resource languages. This paper introduces two key enhancements to the annotation projection step in cross-lingual NER for low-resource languages. First, we explore refining word alignments using back-translation to improve accuracy. Second, we present a novel formalized projection approach that matches source entities with extracted target candidates. Through extensive experiments on two datasets spanning 57 languages, we demonstrate that our approach surpasses existing projection-based methods in low-resource settings. These findings highlight the robustness of projection-based data transfer as an alternative to model-based methods for cross-lingual named entity recognition in low-resource languages.
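
To make the annotation projection step mentioned in the abstract concrete, the following is a minimal sketch of generic alignment-based label projection. It is not the authors' formalized matching approach, and it does not include the back-translation refinement; it only illustrates the baseline idea of carrying source entity labels over to target tokens through word alignments. The function name `project_entities`, the span format, and the toy sentence pair are all assumptions made for illustration.

```python
from typing import List, Tuple


def project_entities(
    source_entities: List[Tuple[int, int, str]],
    alignments: List[Tuple[int, int]],
    target_length: int,
) -> List[str]:
    """Project source entity spans onto target tokens as BIO labels.

    source_entities: (start, end_exclusive, label) spans over source tokens.
    alignments: (source_index, target_index) word alignment pairs.
    This is a hypothetical baseline, not the method from the paper.
    """
    labels = ["O"] * target_length
    for start, end, label in source_entities:
        # Target tokens aligned to any source token inside the entity.
        targets = sorted({t for s, t in alignments if start <= s < end})
        if not targets:
            continue  # no aligned target token: the entity is dropped
        # Use the contiguous span from the first to the last aligned token.
        first, last = targets[0], targets[-1]
        # Skip the entity if it would overlap an already projected span.
        if any(labels[i] != "O" for i in range(first, last + 1)):
            continue
        labels[first] = f"B-{label}"
        for i in range(first + 1, last + 1):
            labels[i] = f"I-{label}"
    return labels


# Toy example (assumed data): English source, German target.
source_tokens = ["Angela", "Merkel", "visited", "Paris"]
target_tokens = ["Angela", "Merkel", "besuchte", "Paris"]
source_entities = [(0, 2, "PER"), (3, 4, "LOC")]
alignments = [(0, 0), (1, 1), (2, 2), (3, 3)]

projected = project_entities(source_entities, alignments, len(target_tokens))
print(list(zip(target_tokens, projected)))
```

In this sketch, an alignment error directly becomes a labeling error, which is why the quality of the word alignments matters so much for projection-based transfer.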
Anthology ID:
2025.nodalida-1.54
Volume:
Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025)
Month:
March
Year:
2025
Address:
Tallinn, Estonia
Editors:
Richard Johansson, Sara Stymne
Venue:
NoDaLiDa
Publisher:
University of Tartu Library
Pages:
499–507
URL:
https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.nodalida-1.54/
Cite (ACL):
Andrei Politov, Oleh Shkalikov, Rene Jäkel, and Michael Färber. 2025. Revisiting Projection-based Data Transfer for Cross-Lingual Named Entity Recognition in Low-Resource Languages. In Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025), pages 499–507, Tallinn, Estonia. University of Tartu Library.
Cite (Informal):
Revisiting Projection-based Data Transfer for Cross-Lingual Named Entity Recognition in Low-Resource Languages (Politov et al., NoDaLiDa 2025)
PDF:
https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.nodalida-1.54.pdf