Bing Xu


2021

pdf bib
Multi-Grained Knowledge Distillation for Named Entity Recognition
Xuan Zhou | Xiao Zhang | Chenyang Tao | Junya Chen | Bing Xu | Wei Wang | Jing Xiao
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Although pre-trained big models (e.g., BERT, ERNIE, XLNet, GPT3 etc.) have delivered top performance in Seq2seq modeling, their deployments in real-world applications are often hindered by the excessive computations and memory demand involved. For many applications, including named entity recognition (NER), matching the state-of-the-art result under budget has attracted considerable attention. Drawing power from the recent advance in knowledge distillation (KD), this work presents a novel distillation scheme to efficiently transfer the knowledge learned from big models to their more affordable counterpart. Our solution highlights the construction of surrogate labels through the k-best Viterbi algorithm to distill knowledge from the teacher model. To maximally assimilate knowledge into the student model, we propose a multi-grained distillation scheme, which integrates cross entropy involved in conditional random field (CRF) and fuzzy learning.To validate the effectiveness of our proposal, we conducted a comprehensive evaluation on five NER benchmarks, reporting cross-the-board performance gains relative to competing prior-arts. We further discuss ablation results to dissect our gains.

2020

pdf bib
CAN-GRU: a Hierarchical Model for Emotion Recognition in Dialogue
Ting Jiang | Bing Xu | Tiejun Zhao | Sheng Li
Proceedings of the 19th Chinese National Conference on Computational Linguistics

Emotion recognition in dialogue systems has gained attention in the field of natural language processing recent years, because it can be applied in opinion mining from public conversational data on social media. In this paper, we propose a hierarchical model to recognize emotions in the dialogue. In the first layer, in order to extract textual features of utterances, we propose a convolutional self-attention network(CAN). Convolution is used to capture n-gram information and attention mechanism is used to obtain the relevant semantic information among words in the utterance. In the second layer, a GRU-based network helps to capture contextual information in the conversation. Furthermore, we discuss the effects of unidirectional and bidirectional networks. We conduct experiments on Friends dataset and EmotionPush dataset. The results show that our proposed model(CAN-GRU) and its variants achieve better performance than baselines.

pdf bib
CN-HIT-MI.T at SemEval-2020 Task 8: Memotion Analysis Based on BERT
Zhen Li | Yaojie Zhang | Bing Xu | Tiejun Zhao
Proceedings of the Fourteenth Workshop on Semantic Evaluation

Internet memes emotion recognition is focused by many researchers. In this paper, we adopt BERT and ResNet for evaluation of detecting the emotions of Internet memes. We focus on solving the problem of data imbalance and data contains noise. We use RandAugment to enhance the data of the picture, and use Training Signal Annealing (TSA) to solve the impact of the imbalance of the label. At the same time, a new loss function is designed to ensure that the model is not affected by input noise which will improve the robustness of the model. We participated in sub-task a and our model based on BERT obtains 34.58% macro F1 score, ranking 10/32.

2019

pdf bib
CN-HIT-MI.T at SemEval-2019 Task 6: Offensive Language Identification Based on BiLSTM with Double Attention
Yaojie Zhang | Bing Xu | Tiejun Zhao
Proceedings of the 13th International Workshop on Semantic Evaluation

Offensive language has become pervasive in social media. In Offensive Language Identification tasks, it may be difficult to predict accurately only according to the surface words. So we try to dig deeper semantic information of text. This paper presents use an attention-based two layers bidirectional longshort memory neural network (BiLSTM) for semantic feature extraction. Additionally, a residual connection mechanism is used to synthesize two different deep features, and an emoji attention mechanism is used to extract semantic information of emojis in text. We participated in three sub-tasks of SemEval 2019 Task 6 as CN-HIT-MI.T team. Our macro-averaged F1-score in sub-task A is 0.768, ranking 28/103. We got 0.638 in sub-task B, ranking 30/75. In sub-task C, we got 0.549, ranking 22/65. We also tried some other methods of not submitting results.

2017

pdf bib
MI&T Lab at SemEval-2017 task 4: An Integrated Training Method of Word Vector for Sentiment Classification
Jingjing Zhao | Yan Yang | Bing Xu
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

A CNN method for sentiment classification task in Task 4A of SemEval 2017 is presented. To solve the problem of word2vec training word vector slowly, a method of training word vector by integrating word2vec and Convolutional Neural Network (CNN) is proposed. This training method not only improves the training speed of word2vec, but also makes the word vector more effective for the target task. Furthermore, the word2vec adopts a full connection between the input layer and the projection layer of the Continuous Bag-of-Words (CBOW) for acquiring the semantic information of the original sentence.