Uncertainty-Aware Cross-Lingual Transfer with Pseudo Partial Labels
Shuo Lei, Xuchao Zhang, Jianfeng He, Fanglan Chen, Chang-Tien Lu
Abstract
Large-scale multilingual pre-trained language models have achieved remarkable performance in zero-shot cross-lingual tasks. A recent study has demonstrated the effectiveness of a self-learning-based approach to cross-lingual transfer, which requires only unlabeled data in the target languages and no effort to annotate gold labels for them. However, it suffers from noisy training caused by incorrectly pseudo-labeled samples. In this work, we propose an uncertainty-aware Cross-Lingual Transfer framework with Pseudo-Partial-Label (CLTP) to maximize the utilization of unlabeled data by reducing the noise introduced in the training phase. To estimate the pseudo-partial-label for each unlabeled sample, we propose a novel estimation method that considers both the prediction confidence and a limit on the number of similar labels. Extensive experiments are conducted on two cross-lingual tasks, Named Entity Recognition (NER) and Natural Language Inference (NLI), across 40 languages, and show that our method outperforms the baselines on both high-resource and low-resource languages, for example by 6.9 on Kazakh (kk) and 5.2 on Marathi (mr) for NER.
- Anthology ID:
- 2022.findings-naacl.153
- Volume:
- Findings of the Association for Computational Linguistics: NAACL 2022
- Month:
- July
- Year:
- 2022
- Address:
- Seattle, United States
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1987–1997
- URL:
- https://aclanthology.org/2022.findings-naacl.153
- DOI:
- 10.18653/v1/2022.findings-naacl.153
- Cite (ACL):
- Shuo Lei, Xuchao Zhang, Jianfeng He, Fanglan Chen, and Chang-Tien Lu. 2022. Uncertainty-Aware Cross-Lingual Transfer with Pseudo Partial Labels. In Findings of the Association for Computational Linguistics: NAACL 2022, pages 1987–1997, Seattle, United States. Association for Computational Linguistics.
- Cite (Informal):
- Uncertainty-Aware Cross-Lingual Transfer with Pseudo Partial Labels (Lei et al., Findings 2022)
- PDF:
- https://preview.aclanthology.org/starsem-semeval-split/2022.findings-naacl.153.pdf
- Code:
- slei109/cltp
- Data:
- XTREME