Cross-lingual Transfer Learning for Japanese Named Entity Recognition

Andrew Johnson, Penny Karanasou, Judith Gaspers, Dietrich Klakow


Abstract
This work explores cross-lingual transfer learning (TL) for named entity recognition, focusing on bootstrapping Japanese from English. A deep neural network model is adopted and the best combination of weights to transfer is extensively investigated. Moreover, a novel approach is presented that overcomes linguistic differences between this language pair by romanizing a portion of the Japanese input. Experiments are conducted on external datasets, as well as internal large-scale real-world ones. Gains with TL are achieved for all evaluated cases. Finally, the influence on TL of the target dataset size and of the target tagset distribution is further investigated.
Anthology ID:
N19-2023
Volume:
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Industry Papers)
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Editors:
Anastassia Loukina, Michelle Morales, Rohit Kumar
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
182–189
URL:
https://aclanthology.org/N19-2023
DOI:
10.18653/v1/N19-2023
Cite (ACL):
Andrew Johnson, Penny Karanasou, Judith Gaspers, and Dietrich Klakow. 2019. Cross-lingual Transfer Learning for Japanese Named Entity Recognition. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Industry Papers), pages 182–189, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal):
Cross-lingual Transfer Learning for Japanese Named Entity Recognition (Johnson et al., NAACL 2019)
PDF:
https://preview.aclanthology.org/landing_page/N19-2023.pdf
Data:
CoNLL 2003