Large Margin Representation Learning for Robust Cross-lingual Named Entity Recognition

Guangcheng Zhu, Ruixuan Xiao, Haobo Wang, Zhen Zhu, Gengyu Lyu, Junbo Zhao


Abstract
Cross-lingual named entity recognition (NER) aims to build an NER model that generalizes to a low-resource target language using labeled data from a high-resource source language. Current state-of-the-art methods typically combine a self-training mechanism with a contrastive learning paradigm to develop discriminative entity clusters for cross-lingual adaptation. Despite the promise of these methods, we identify that they neglect two key problems, distribution skewness and pseudo-label bias, which lead to indistinguishable entity clusters with small margins. To this end, we propose a novel framework, MARAL, which optimizes an adaptively reweighted contrastive loss to handle class skewness and theoretically guarantees the optimal feature arrangement with maximum margin. To further mitigate the adverse effects of unreliable pseudo-labels, MARAL integrates a progressive cross-lingual adaptation strategy, which first selects reliable samples as anchors and then refines the remaining unreliable ones. Extensive experiments demonstrate that MARAL significantly outperforms current state-of-the-art methods on multiple benchmarks, e.g., by +2.04% on the challenging MultiCoNER dataset.
Anthology ID:
2025.acl-long.215
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
4270–4291
URL:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.215/
Cite (ACL):
Guangcheng Zhu, Ruixuan Xiao, Haobo Wang, Zhen Zhu, Gengyu Lyu, and Junbo Zhao. 2025. Large Margin Representation Learning for Robust Cross-lingual Named Entity Recognition. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4270–4291, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Large Margin Representation Learning for Robust Cross-lingual Named Entity Recognition (Zhu et al., ACL 2025)
PDF:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.215.pdf