Towards Fast and Accurate Neural Chinese Word Segmentation with Multi-Criteria Learning
Weipeng Huang, Xingyi Cheng, Kunlong Chen, Taifeng Wang, Wei Chu
Abstract
Ambiguous annotation criteria cause Chinese Word Segmentation (CWS) datasets to diverge in granularity. Multi-criteria Chinese word segmentation aims to capture the various annotation criteria across datasets and leverage their common underlying knowledge. In this paper, we propose a domain-adaptive segmenter that exploits the diverse criteria of various datasets. Our model is based on Bidirectional Encoder Representations from Transformers (BERT), which introduces open-domain knowledge. Private and shared projection layers are proposed to capture domain-specific knowledge and common knowledge, respectively. We also optimize computational efficiency via distillation, quantization, and compiler optimization. Experiments show that our segmenter outperforms the previous state-of-the-art (SOTA) models on 10 CWS datasets with superior efficiency.
- Anthology ID:
- 2020.coling-main.186
- Volume:
- Proceedings of the 28th International Conference on Computational Linguistics
- Month:
- December
- Year:
- 2020
- Address:
- Barcelona, Spain (Online)
- Editors:
- Donia Scott, Nuria Bel, Chengqing Zong
- Venue:
- COLING
- Publisher:
- International Committee on Computational Linguistics
- Pages:
- 2062–2072
- URL:
- https://aclanthology.org/2020.coling-main.186
- DOI:
- 10.18653/v1/2020.coling-main.186
- Cite (ACL):
- Weipeng Huang, Xingyi Cheng, Kunlong Chen, Taifeng Wang, and Wei Chu. 2020. Towards Fast and Accurate Neural Chinese Word Segmentation with Multi-Criteria Learning. In Proceedings of the 28th International Conference on Computational Linguistics, pages 2062–2072, Barcelona, Spain (Online). International Committee on Computational Linguistics.
- Cite (Informal):
- Towards Fast and Accurate Neural Chinese Word Segmentation with Multi-Criteria Learning (Huang et al., COLING 2020)
- PDF:
- https://preview.aclanthology.org/add_acl24_videos/2020.coling-main.186.pdf
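The abstract describes shared and private projection layers on top of a common encoder. The toy sketch below illustrates that general idea only and is not the authors' implementation. Several things are assumptions made for illustration: the B/M/E/S tagging scheme, combining the two projections by adding their scores, the one-hot "hidden states" standing in for BERT outputs, the hand-picked weights, and the criterion names `coarse` and `fine`.

```python
from typing import Dict, List

Matrix = List[List[float]]

# Assumed tagging scheme (Begin, Middle, End, Single); the abstract does not name one.
TAGS = ["B", "M", "E", "S"]


def project(weights: Matrix, hidden: List[float]) -> List[float]:
    """Multiply a (num_tags x hidden_size) matrix by one hidden vector."""
    return [sum(w * h for w, h in zip(row, hidden)) for row in weights]


def tag_sequence(hidden_states: Matrix, shared: Matrix,
                 private: Dict[str, Matrix], criterion: str) -> List[str]:
    """Score each character with the shared layer plus the private layer
    of the chosen criterion, then pick the highest-scoring tag.
    Summing the two score vectors is an assumption of this sketch."""
    tags = []
    for hidden in hidden_states:
        shared_scores = project(shared, hidden)
        private_scores = project(private[criterion], hidden)
        scores = [s + p for s, p in zip(shared_scores, private_scores)]
        tags.append(TAGS[scores.index(max(scores))])
    return tags


def tags_to_words(chars: str, tags: List[str]) -> List[str]:
    """Turn per-character tags into words: a word ends at an E or S tag."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:  # flush an unfinished word at the end of the input
        words.append(current)
    return words


CHARS = "北京大学"

# One-hot vectors stand in for encoder outputs (hypothetical toy data).
HIDDEN = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

# Shared layer: both criteria agree the first character begins a word
# and the last character ends one. Rows follow the order of TAGS.
SHARED = [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0]]

# Private layers: the criteria disagree only on the two middle characters.
PRIVATE = {
    "coarse": [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
    "fine": [[0, 0, 1, 0], [0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]],
}

for name in ("coarse", "fine"):
    tags = tag_sequence(HIDDEN, SHARED, PRIVATE, name)
    print(name, tags, tags_to_words(CHARS, tags))
```

In this toy setup the same input yields one four-character word under the hypothetical `coarse` criterion and two two-character words under `fine`, which mirrors the granularity divergence the abstract mentions.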