Yanfei Wang


2020

Improving Sentence Classification by Multilingual Data Augmentation and Consensus Learning
Yanfei Wang | Yangdong Chen | Yuejie Zhang
Proceedings of the 19th Chinese National Conference on Computational Linguistics

Neural network-based models have achieved impressive results on the sentence classification task. However, most previous work focuses on designing more sophisticated networks or more effective learning paradigms on monolingual data, which often provides insufficient discriminative knowledge for classification. In this paper, we investigate improving sentence classification through multilingual data augmentation and consensus learning. Compared with previous methods, our model can make use of multilingual data generated by machine translation and mine its language-shared and language-specific knowledge for better representation and classification. We evaluate our model on several sentence classification tasks using English (the source language) and Chinese (the target language) data. Our proposed model achieves strong classification performance on these tasks.