Hua Xu


2021

TEXTOIR: An Integrated and Visualized Platform for Text Open Intent Recognition
Hanlei Zhang | Xiaoteng Li | Hua Xu | Panpan Zhang | Kang Zhao | Kai Gao
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations

TEXTOIR is the first integrated and visualized platform for text open intent recognition. It is composed of two main modules: open intent detection and open intent discovery. Each module integrates most of the state-of-the-art algorithms and benchmark intent datasets. It also contains an overall framework that connects the two modules in a pipeline scheme. In addition, the platform provides visualization tools for data and model management, training, evaluation, and analysis of performance from different aspects. TEXTOIR offers useful toolkits and convenient visual interfaces for each sub-module, and designs a framework that implements a complete process to both identify known intents and discover open intents.

2020

CH-SIMS: A Chinese Multimodal Sentiment Analysis Dataset with Fine-grained Annotation of Modality
Wenmeng Yu | Hua Xu | Fanyang Meng | Yilin Zhu | Yixiao Ma | Jiele Wu | Jiyun Zou | Kaicheng Yang
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Previous studies in multimodal sentiment analysis have used limited datasets, which only contain unified multimodal annotations. However, the unified annotations do not always reflect the independent sentiment of single modalities and limit the model's ability to capture the differences between modalities. In this paper, we introduce a Chinese single- and multi-modal sentiment analysis dataset, CH-SIMS, which contains 2,281 refined video segments in the wild with both multimodal and independent unimodal annotations. It allows researchers to study the interaction between modalities or use the independent unimodal annotations for unimodal sentiment analysis. Furthermore, we propose a multi-task learning framework based on late fusion as the baseline. Extensive experiments on CH-SIMS show that our methods achieve state-of-the-art performance and learn more distinctive unimodal representations. The full dataset and code are available at https://github.com/thuiar/MMSA.

2019

Deep Unknown Intent Detection with Margin Loss
Ting-En Lin | Hua Xu
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Identifying unknown (novel) user intents that have never appeared in the training set is a challenging task in dialogue systems. In this paper, we present a two-stage method for detecting unknown intents. We use a bidirectional long short-term memory (BiLSTM) network with the margin loss as the feature extractor. With the margin loss, we can learn discriminative deep features by forcing the network to maximize inter-class variance and minimize intra-class variance. Then, we feed the feature vectors to a density-based novelty detection algorithm, local outlier factor (LOF), to detect unknown intents. Experiments on two benchmark datasets show that our method yields consistent improvements over the baseline methods.

The Strength of the Weakest Supervision: Topic Classification Using Class Labels
Jiatong Li | Kai Zheng | Hua Xu | Qiaozhu Mei | Yue Wang
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop

When developing topic classifiers for real-world applications, we begin by defining a set of meaningful topic labels. Ideally, an intelligent classifier can understand these labels right away and start classifying documents. Indeed, a human can confidently tell if an article is about science, politics, sports, or none of the above, after knowing just the class labels. We study the problem of training an initial topic classifier using only class labels. We investigate existing techniques for solving this problem and propose a simple but effective approach. Experiments on a variety of topic classification data sets show that learning from class labels can save significant initial labeling effort, essentially providing a "free" warm start to the topic classifier.

2016

UTHealth at SemEval-2016 Task 12: an End-to-End System for Temporal Information Extraction from Clinical Notes
Hee-Jin Lee | Hua Xu | Jingqi Wang | Yaoyun Zhang | Sungrim Moon | Jun Xu | Yonghui Wu
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

2015

Clinical Abbreviation Disambiguation Using Neural Word Embeddings
Yonghui Wu | Jun Xu | Yaoyun Zhang | Hua Xu
Proceedings of BioNLP 15

UTH-CCB: The Participation of the SemEval 2015 Challenge – Task 14
Jun Xu | Yaoyun Zhang | Jingqi Wang | Yonghui Wu | Min Jiang | Ergin Soysal | Hua Xu
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

2014

UTH_CCB: A report for SemEval 2014 – Task 7 Analysis of Clinical Text
Yaoyun Zhang | Jingqi Wang | Buzhou Tang | Yonghui Wu | Min Jiang | Yukun Chen | Hua Xu
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

2013

Implicit Feature Detection via a Constrained Topic Model and SVM
Wei Wang | Hua Xu | Xiaoqiu Huang
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

2010

Grouping Product Features Using Semi-Supervised Learning with Soft-Constraints
Zhongwu Zhai | Bing Liu | Hua Xu | Peifa Jia
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

Recognizing Medication related Entities in Hospital Discharge Summaries using Support Vector Machine
Son Doan | Hua Xu
Coling 2010: Posters

Soochow University: Description and Analysis of the Chinese Word Sense Induction System for CLP2010
Hua Xu | Bing Liu | Longhua Qian | Guodong Zhou
CIPS-SIGHAN Joint Conference on Chinese Language Processing

2007

Combining multiple evidence for gene symbol disambiguation
Hua Xu | Jung-Wei Fan | Carol Friedman
Biological, translational, and clinical language processing