Chao-Wei Huang


2021

pdf bib
Modeling Diagnostic Label Correlation for Automatic ICD Coding
Shang-Chi Tsai | Chao-Wei Huang | Yun-Nung Chen
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Given the clinical notes written in electronic health records (EHRs), it is challenging to predict the diagnostic codes which is formulated as a multi-label classification task. The large set of labels, the hierarchical dependency, and the imbalanced data make this prediction task extremely hard. Most existing work built a binary prediction for each label independently, ignoring the dependencies between labels. To address this problem, we propose a two-stage framework to improve automatic ICD coding by capturing the label correlation. Specifically, we train a label set distribution estimator to rescore the probability of each label set candidate generated by a base predictor. This paper is the first attempt at learning the label set distribution as a reranking module for ICD coding. In the experiments, our proposed framework is able to improve upon best-performing predictors for medical code prediction on the benchmark MIMIC datasets.

2020

pdf bib
Towards Unsupervised Language Understanding and Generation by Joint Dual Learning
Shang-Yu Su | Chao-Wei Huang | Yun-Nung Chen
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

In modular dialogue systems, natural language understanding (NLU) and natural language generation (NLG) are two critical components, where NLU extracts the semantics from the given texts and NLG is to construct corresponding natural language sentences based on the input semantic representations. However, the dual property between understanding and generation has been rarely explored. The prior work is the first attempt that utilized the duality between NLU and NLG to improve the performance via a dual supervised learning framework. However, the prior work still learned both components in a supervised manner; instead, this paper introduces a general learning framework to effectively exploit such duality, providing flexibility of incorporating both supervised and unsupervised learning algorithms to train language understanding and generation models in a joint fashion. The benchmark experiments demonstrate that the proposed approach is capable of boosting the performance of both NLU and NLG. The source code is available at: https://github.com/MiuLab/DuaLUG.

pdf bib
Learning Spoken Language Representations with Neural Lattice Language Modeling
Chao-Wei Huang | Yun-Nung Chen
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Pre-trained language models have achieved huge improvement on many NLP tasks. However, these methods are usually designed for written text, so they do not consider the properties of spoken language. Therefore, this paper aims at generalizing the idea of language model pre-training to lattices generated by recognition systems. We propose a framework that trains neural lattice language models to provide contextualized representations for spoken language understanding tasks. The proposed two-stage pre-training approach reduces the demands of speech data and has better efficiency. Experiments on intent detection and dialogue act recognition datasets demonstrate that our proposed method consistently outperforms strong baselines when evaluated on spoken inputs. The code is available at https://github.com/MiuLab/Lattice-ELMo.

2019

pdf bib
Dual Supervised Learning for Natural Language Understanding and Generation
Shang-Yu Su | Chao-Wei Huang | Yun-Nung Chen
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Natural language understanding (NLU) and natural language generation (NLG) are both critical research topics in the NLP and dialogue fields. Natural language understanding is to extract the core semantic meaning from the given utterances, while natural language generation is opposite, of which the goal is to construct corresponding sentences based on the given semantics. However, such dual relationship has not been investigated in literature. This paper proposes a novel learning framework for natural language understanding and generation on top of dual supervised learning, providing a way to exploit the duality. The preliminary experiments show that the proposed approach boosts the performance for both tasks, demonstrating the effectiveness of the dual relationship.