Yu-Hsuan Wu

2023

IKM_Lab at BioLaySumm Task 1: Longformer-based Prompt Tuning for Biomedical Lay Summary Generation
Yu-Hsuan Wu | Ying-Jia Lin | Hung-Yu Kao
The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks

This paper describes the entry by the Intelligent Knowledge Management (IKM) Laboratory in BioLaySumm 2023 Task 1. We aim to transform lengthy biomedical articles into concise, reader-friendly summaries that the general public can easily understand. We used a Longformer-based model for long-text abstractive summarization and experimented with several prompting methods for this task. Our entry placed 10th overall, but we were particularly proud to place 3rd on the readability evaluation metric.

2022

A Dimensional Valence-Arousal-Irony Dataset for Chinese Sentence and Context
Sheng-Wei Huang | Wei-Yi Chung | Yu-Hsuan Wu | Chen-Chia Yu | Jheng-Long Wu
Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022)

Chinese multi-dimensional sentiment detection is a challenging task with a considerable impact on semantic understanding. Existing irony datasets annotate only the sentiment type of whole ironic sentences; they do not provide the corresponding valence and arousal intensities for the sentences and their context. However, an ironic statement is defined as a statement whose apparent meaning is the opposite of its actual meaning, so contextual information is needed to understand what a sentence actually means. The dimensional sentiment intensities of ironic sentences and their context are therefore important issues in natural language processing. This paper extends the NTU irony corpus with valence, arousal, and irony intensities at the sentence level and with valence and arousal intensities at the context level; the resulting corpus is called the Chinese Dimensional Valence-Arousal-Irony (CDVAI) dataset. The paper also analyzes the annotation differences between the human annotators and uses a deep learning model such as BERT to evaluate prediction performance on the CDVAI corpus.

SCU-NLP at ROCLING 2022 Shared Task: Experiment and Error Analysis of Biomedical Entity Detection Model
Sung-Ting Chiou | Sheng-Wei Huang | Ying-Chun Lo | Yu-Hsuan Wu | Jheng-Long Wu
Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022)

Named entity recognition generally refers to identifying entities with specific meanings in unstructured text, including names of people, places, organizations, dates, times, quantities, proper nouns, and other words. In the medical field, these may be drug names, organ names, test items, nutritional supplements, etc. The purpose of named entity recognition in this study is to find such items in unstructured input text. Focusing on healthcare, this study predicts the boundaries and categories of named entities in sentences across ten entity types. We explore multiple fundamental NER approaches to this task, including Hidden Markov Models, Conditional Random Fields, a Random Forest classifier, and BERT. The CRF model achieved better prediction results in terms of F-score.

A Chinese Dimensional Valence-Arousal-Irony Detection on Sentence-level and Context-level Using Deep Learning Model
Jheng-Long Wu | Sheng-Wei Huang | Wei-Yi Chung | Yu-Hsuan Wu | Chen-Chia Yu
International Journal of Computational Linguistics & Chinese Language Processing, Volume 27, Number 2, December 2022

2017

Verb Replacer: An English Verb Error Correction System
Yu-Hsuan Wu | Jhih-Jie Chen | Jason Chang
Proceedings of the IJCNLP 2017, System Demonstrations

According to analysis of the Cambridge Learner Corpus, using the wrong verb is the most common type of grammatical error. This paper describes Verb Replacer, a system for detecting and correcting potential verb errors in a given sentence. In our approach, alternative verbs are proposed to replace the original verb based on an error-annotated corpus and verb-object collocations. The method involves applying regression on channel models, parsing the sentence, identifying the verbs, retrieving a small set of alternative verbs, and evaluating each alternative. Our method combines and improves channel and language models, resulting in high recall in detecting and correcting verb misuse.