Abstract
We present Expected Statistic Regularization (ESR), a novel regularization technique that utilizes low-order multi-task structural statistics to shape model distributions for semi-supervised learning on low-resource datasets. We study ESR in the context of cross-lingual transfer for syntactic analysis (POS tagging and labeled dependency parsing) and present several classes of low-order statistic functions that bear on model behavior. Experimentally, we evaluate the proposed statistics with ESR for unsupervised transfer on 5 diverse target languages and show that all statistics, when estimated accurately, yield improvements to both POS and LAS, with the best statistic improving POS by +7.0 and LAS by +8.5 on average. We also present semi-supervised transfer and learning curve experiments that show ESR provides significant gains over strong cross-lingual-transfer-plus-fine-tuning baselines for modest amounts of label data. These results indicate that ESR is a promising and complementary approach to model-transfer approaches for cross-lingual parsing.
- Anthology ID:
- 2023.tacl-1.8
- Volume:
- Transactions of the Association for Computational Linguistics, Volume 11
- Year:
- 2023
- Address:
- Cambridge, MA
- Venue:
- TACL
- Publisher:
- MIT Press
- Pages:
- 122–138
- URL:
- https://aclanthology.org/2023.tacl-1.8
- DOI:
- 10.1162/tacl_a_00537
- Cite (ACL):
- Thomas Effland and Michael Collins. 2023. Improving Low-Resource Cross-lingual Parsing with Expected Statistic Regularization. Transactions of the Association for Computational Linguistics, 11:122–138.
- Cite (Informal):
- Improving Low-Resource Cross-lingual Parsing with Expected Statistic Regularization (Effland & Collins, TACL 2023)
- PDF:
- https://preview.aclanthology.org/emnlp-22-attachments/2023.tacl-1.8.pdf