Abstract
Projecting linguistic annotations through word alignments is one of the most prevalent approaches to cross-lingual transfer learning. Conventional wisdom suggests that annotation projection “just works” regardless of the task at hand. We carefully consider multi-source projection for named entity recognition. Our experiment with 17 languages shows that to detect named entities in true low-resource languages, annotation projection may not be the right way to move forward. On a more positive note, we also uncover the conditions that do favor named entity projection from multiple sources. We argue these are infeasible under noisy low-resource constraints.

- Anthology ID: W18-6125
- Volume: Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text
- Month: November
- Year: 2018
- Address: Brussels, Belgium
- Venue: WNUT
- Publisher: Association for Computational Linguistics
- Pages: 195–201
- URL: https://aclanthology.org/W18-6125
- DOI: 10.18653/v1/W18-6125
- Cite (ACL): Jan Vium Enghoff, Søren Harrison, and Željko Agić. 2018. Low-resource named entity recognition via multi-source projection: Not quite there yet?. In Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text, pages 195–201, Brussels, Belgium. Association for Computational Linguistics.
- Cite (Informal): Low-resource named entity recognition via multi-source projection: Not quite there yet? (Enghoff et al., WNUT 2018)
- PDF: https://preview.aclanthology.org/ingestion-script-update/W18-6125.pdf