Word Reordering for Zero-shot Cross-lingual Structured Prediction

Tao Ji, Yong Jiang, Tao Wang, Zhongqiang Huang, Fei Huang, Yuanbin Wu, Xiaoling Wang


Abstract
Adapting word order from one language to another is a key problem in cross-lingual structured prediction. Current sentence encoders (e.g., RNN, Transformer with position embeddings) are usually sensitive to word order. Even with uniform word form representations (MUSE, mBERT), word order discrepancies may hurt the adaptation of models. In this paper, we build structured prediction models with bag-of-words inputs, and introduce a new reordering module that organizes words following the source language order and learns task-specific reordering strategies from a general-purpose order predictor model. Experiments on zero-shot cross-lingual dependency parsing, POS tagging, and morphological tagging show that our model can significantly improve target language performance, especially for languages that are distant from the source language.
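The word-order sensitivity described in the abstract can be illustrated with a toy sketch. This is not the authors' model: `embed`, `bag_of_words_encode`, and `position_aware_encode` are hypothetical helpers, and the random vectors merely stand in for shared cross-lingual embeddings such as MUSE. The sketch only shows that a bag-of-words encoding is unchanged when words are permuted, whereas a position-dependent encoding is not.

```python
import math
import random


def embed(word, dim=4):
    # Hypothetical stand-in for a shared cross-lingual word embedding:
    # a deterministic pseudo-random vector derived from the word string.
    rng = random.Random(word)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def bag_of_words_encode(words, dim=4):
    # Order-insensitive encoding: average the word vectors.
    # math.fsum is exactly rounded, so the result does not depend on
    # the order in which the vectors are summed.
    vecs = [embed(w, dim) for w in words]
    return [math.fsum(v[d] for v in vecs) / len(vecs) for d in range(dim)]


def position_aware_encode(words, dim=4):
    # Order-sensitive encoding (toy assumption): weight each word vector
    # by its 1-based position before summing.
    vecs = [embed(w, dim) for w in words]
    return [math.fsum((i + 1) * v[d] for i, v in enumerate(vecs))
            for d in range(dim)]


svo = ["she", "reads", "books"]  # subject-verb-object order
sov = ["she", "books", "reads"]  # subject-object-verb order

print(bag_of_words_encode(svo) == bag_of_words_encode(sov))
print(position_aware_encode(svo) == position_aware_encode(sov))
```

Under these assumptions, a model fed the bag-of-words encoding sees the same input for both word orders, which is why a separate reordering step is needed to reintroduce order information in a form that matches the source language.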
Anthology ID:
2021.emnlp-main.338
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
4109–4120
URL:
https://aclanthology.org/2021.emnlp-main.338
DOI:
10.18653/v1/2021.emnlp-main.338
Cite (ACL):
Tao Ji, Yong Jiang, Tao Wang, Zhongqiang Huang, Fei Huang, Yuanbin Wu, and Xiaoling Wang. 2021. Word Reordering for Zero-shot Cross-lingual Structured Prediction. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 4109–4120, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Word Reordering for Zero-shot Cross-lingual Structured Prediction (Ji et al., EMNLP 2021)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2021.emnlp-main.338.pdf
Software:
 2021.emnlp-main.338.Software.zip