Adversarial Multi-Criteria Learning for Chinese Word Segmentation

Xinchi Chen; Zhan Shi; Xipeng Qiu (邱锡鹏); Xuan-Jing Huang (黄萱菁)

doi:10.18653/v1/P17-1110

Abstract

Different linguistic perspectives give rise to many diverse segmentation criteria for Chinese word segmentation (CWS). Most existing methods focus on improving performance for each single criterion. However, it is interesting to exploit these different criteria and mine their common underlying knowledge. In this paper, we propose adversarial multi-criteria learning for CWS, which integrates shared knowledge from multiple heterogeneous segmentation criteria. Experiments on eight corpora with heterogeneous segmentation criteria show that performance on each corpus improves significantly compared to single-criterion learning. The source code for this paper is available on GitHub.

Anthology ID: P17-1110
Volume: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2017
Address: Vancouver, Canada
Editors: Regina Barzilay, Min-Yen Kan
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 1193–1203
URL: https://preview.aclanthology.org/nschneid-patch-1/P17-1110/
DOI: 10.18653/v1/P17-1110
Cite (ACL): Xinchi Chen, Zhan Shi, Xipeng Qiu, and Xuanjing Huang. 2017. Adversarial Multi-Criteria Learning for Chinese Word Segmentation. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1193–1203, Vancouver, Canada. Association for Computational Linguistics.
Cite (Informal): Adversarial Multi-Criteria Learning for Chinese Word Segmentation (Chen et al., ACL 2017)
PDF: https://preview.aclanthology.org/nschneid-patch-1/P17-1110.pdf
Video: https://preview.aclanthology.org/nschneid-patch-1/P17-1110.mp4
