Abstract
This paper reports on experiments in cross-lingual transfer using the anchor-based approach of Schuster et al. (2019) for English and a low-resourced language, namely Hindi. For the sake of comparison, we also evaluate the approach on three very different higher-resourced languages, viz. Dutch, Russian and Chinese. Although the approach was initially designed for ELMo embeddings, we analyze it for the more recent BERT family of transformers on a variety of tasks, both mono- and cross-lingual. The results largely show that, like most other cross-lingual transfer approaches, the static anchor approach is underwhelming for the low-resourced language, while performing adequately for the higher-resourced ones. We attempt to provide insights into both the quality of the anchors and the performance of low-shot cross-lingual transfer, to better understand this performance gap. We make the extracted anchors and the modified train and test sets available for future research at https://github.com/pranaydeeps/Vyaapak
- Anthology ID:
- 2022.sigul-1.23
- Volume:
- Proceedings of the 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages
- Month:
- June
- Year:
- 2022
- Address:
- Marseille, France
- Venue:
- SIGUL
- SIG:
- SIGUL
- Publisher:
- European Language Resources Association
- Pages:
- 176–184
- URL:
- https://aclanthology.org/2022.sigul-1.23
- Cite (ACL):
- Pranaydeep Singh, Orphee De Clercq, and Els Lefever. 2022. Investigating the Quality of Static Anchor Embeddings from Transformers for Under-Resourced Languages. In Proceedings of the 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages, pages 176–184, Marseille, France. European Language Resources Association.
- Cite (Informal):
- Investigating the Quality of Static Anchor Embeddings from Transformers for Under-Resourced Languages (Singh et al., SIGUL 2022)
- PDF:
- https://preview.aclanthology.org/starsem-semeval-split/2022.sigul-1.23.pdf
- Code
- pranaydeeps/vyaapak
- Data
- XNLI
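For readers unfamiliar with the anchor method the abstract refers to, the core idea of Schuster et al. (2019) is to average a word's contextual embeddings over many occurrences into one static "anchor" vector per language, and then align the two anchor spaces with an orthogonal map. The sketch below illustrates this with toy vectors in place of real BERT outputs; all function names, shapes, and the synthetic data are illustrative assumptions, not the authors' released code.

```python
# Toy sketch of static anchor extraction + alignment (not the paper's code).
import numpy as np

def extract_anchors(contextual_embs):
    """Average a word's contextual embeddings into one static anchor.

    contextual_embs: dict mapping word -> (n_contexts, dim) array of
    contextual vectors (in practice, BERT outputs for each occurrence).
    """
    return {w: e.mean(axis=0) for w, e in contextual_embs.items()}

def procrustes_align(src, tgt):
    """Orthogonal Procrustes: rotation W minimising ||src @ W - tgt||_F."""
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt

rng = np.random.default_rng(0)
dim = 8
# Fake "contextual embeddings" for three shared dictionary entries.
en = {w: rng.normal(size=(5, dim)) for w in ["dog", "house", "water"]}
anchors_en = extract_anchors(en)

# Pretend the target-language space is a rotated copy of the source space.
rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
anchors_tgt = {w: a @ rotation for w, a in anchors_en.items()}

src = np.stack([anchors_en[w] for w in anchors_en])
tgt = np.stack([anchors_tgt[w] for w in anchors_tgt])
W = procrustes_align(src, tgt)
print(np.allclose(src @ W, tgt, atol=1e-6))
```

In this synthetic setting the learned orthogonal map recovers the planted rotation exactly; with real anchors from two genuinely different languages, the fit is only approximate, which is precisely the quality gap the paper investigates for an under-resourced language like Hindi.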