Abstract
Chinese text has no word delimiters or inflections to indicate segment boundaries or word semantics, which makes it harder to understand and increases the demand for word-level semantic knowledge in Chinese segmenting and labeling tasks. However, in unsupervised cross-domain Chinese segmenting and labeling tasks, a model trained on the source domain frequently lacks word-level semantic knowledge of the target domain. To address this issue, we propose a novel paradigm based on attention augmentation that introduces crucial cross-domain knowledge via a translation system. The proposed paradigm enables the model's attention to draw on cross-domain knowledge indicated by the implicit word-level cross-lingual alignment between the input and its corresponding translation. In addition to the model that requires cross-lingual input, we also establish an off-the-shelf model that avoids the dependency on cross-lingual translations. Experiments demonstrate that our proposal significantly advances the state-of-the-art results on cross-domain Chinese segmenting and labeling tasks.
- Anthology ID:
- 2021.findings-emnlp.163
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2021
- Month:
- November
- Year:
- 2021
- Address:
- Punta Cana, Dominican Republic
- Editors:
- Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
- Venue:
- Findings
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1896–1906
- URL:
- https://aclanthology.org/2021.findings-emnlp.163
- DOI:
- 10.18653/v1/2021.findings-emnlp.163
- Cite (ACL):
- Ruixuan Luo, Yi Zhang, Sishuo Chen, and Xu Sun. 2021. Translation as Cross-Domain Knowledge: Attention Augmentation for Unsupervised Cross-Domain Segmenting and Labeling Tasks. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 1896–1906, Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Cite (Informal):
- Translation as Cross-Domain Knowledge: Attention Augmentation for Unsupervised Cross-Domain Segmenting and Labeling Tasks (Luo et al., Findings 2021)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-3/2021.findings-emnlp.163.pdf
- Code:
- lancopku/attention-augmentation