Yi-Jie Huang


2017

MONPA: Multi-objective Named-entity and Part-of-speech Annotator for Chinese using Recurrent Neural Network
Yu-Lun Hsieh | Yung-Chun Chang | Yi-Jie Huang | Shu-Hao Yeh | Chun-Hung Chen | Wen-Lian Hsu
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Part-of-speech (POS) tagging and named entity recognition (NER) are crucial steps in natural language processing. In addition, the difficulty of word segmentation places an additional burden on those who work with languages such as Chinese, and pipelined systems often suffer from error propagation. This work proposes an end-to-end model that uses a character-based recurrent neural network (RNN) to jointly perform segmentation, POS tagging, and NER on a Chinese sentence. Experiments on existing word segmentation and NER datasets show that a single model with the proposed architecture is comparable to models trained specifically for each task, and outperforms freely available software. Moreover, we provide a web-based interface so that the public can easily access this resource.

Incorporating Dependency Trees Improve Identification of Pregnant Women on Social Media Platforms
Yi-Jie Huang | Chu Hsien Su | Yi-Chun Chang | Tseng-Hsin Ting | Tzu-Yuan Fu | Rou-Min Wang | Hong-Jie Dai | Yung-Chun Chang | Jitendra Jonnagaddala | Wen-Lian Hsu
Proceedings of the International Workshop on Digital Disease Detection using Social Media 2017 (DDDSM-2017)

The increasing popularity of social media leads users to share an enormous amount of information on the internet. This information has various applications; for example, it can be used to develop models that understand or predict user behavior on social media platforms. A few online retailers, for instance, have studied shopping patterns to predict a shopper’s pregnancy stage. Another interesting application is to use social media platforms to analyze users’ health-related information. For this study, we developed a new corpus from the popular social media platform Twitter. Using this corpus, we developed a tree kernel-based model to classify tweets conveying pregnancy-related information. The developed pregnancy classification model achieved an accuracy of 0.847 and an F-score of 0.565. In the future, we would like to improve this corpus by reducing noise such as retweets.