Sheng-Wei Huang


2022

A Dimensional Valence-Arousal-Irony Dataset for Chinese Sentence and Context
Sheng-Wei Huang | Wei-Yi Chung | Yu-Hsuan Wu | Chen-Chia Yu | Jheng-Long Wu
Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022)

Chinese multi-dimensional sentiment detection is a challenging task with a considerable impact on semantic understanding. Existing irony datasets annotate only the sentiment type of whole ironic sentences; they do not provide the corresponding valence and arousal intensities for the sentences and their context. However, an ironic statement is defined as a statement whose apparent meaning is the opposite of its actual meaning, so contextual information is needed to understand what a sentence actually means. The dimensional sentiment intensities of ironic sentences and their context are therefore an important issue in natural language processing. This paper extends the NTU irony corpus with valence, arousal, and irony intensities at the sentence level and valence and arousal intensities at the context level; the resulting corpus is called the Chinese Dimensional Valence-Arousal-Irony (CDVAI) dataset. The paper also analyzes the annotation differences between human annotators and uses deep learning models such as BERT to evaluate prediction performance on the CDVAI corpus.
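
The abstract does not say which measures were used to compare annotators or to score predictions. As a minimal sketch, assuming mean absolute error and Pearson correlation (two measures commonly reported for dimensional sentiment scores), the comparison between two sets of ratings could look like the following. The function names and the rating values are hypothetical and are not taken from the CDVAI dataset.

```python
from math import sqrt
from typing import Sequence


def mean_absolute_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Average absolute gap between two aligned lists of ratings."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def pearson_r(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation between two aligned lists of ratings."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


# Hypothetical valence ratings from two annotators for four sentences.
annotator_a = [1.0, 2.0, 3.0, 4.0]
annotator_b = [2.0, 3.0, 4.0, 5.0]

mae = mean_absolute_error(annotator_a, annotator_b)
r = pearson_r(annotator_a, annotator_b)
```

The same two functions apply unchanged when one list holds model predictions and the other holds gold ratings. A constant offset between annotators shows up in the absolute error but not in the correlation, which is why the two are often read together.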

SCU-NLP at ROCLING 2022 Shared Task: Experiment and Error Analysis of Biomedical Entity Detection Model
Sung-Ting Chiou | Sheng-Wei Huang | Ying-Chun Lo | Yu-Hsuan Wu | Jheng-Long Wu
Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022)

Named entity recognition (NER) generally refers to identifying entities with specific meanings in unstructured text, including names of people, places, and organizations, dates, times, quantities, proper nouns, and other words. In the medical field, these may be drug names, organ names, test items, nutritional supplements, etc. The purpose of named entity recognition in this study is to find such items in unstructured input text. The study focuses on healthcare and predicts named entity boundaries and categories in sentences based on ten entity types. We explore multiple fundamental NER approaches to this task, including Hidden Markov Models, Conditional Random Fields (CRF), a Random Forest classifier, and BERT. Of these, the CRF model achieves the better F-score.
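
To make "entity boundaries and categories" and the F-score concrete, here is a minimal sketch of entity-level scoring. It assumes BIO-style tags and exact-match scoring (a predicted entity counts only if its start, end, and type all match the gold entity); the abstract does not state the tagging scheme or the exact scoring rule used in the shared task. The tag names `DRUG` and `ORGAN` and both function names are hypothetical.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, entity_type)


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Convert one sentence's BIO tags into a set of typed entity spans."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing entity
        # An open entity continues only on an I- tag of the same type.
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, label))
            start, label = None, None
        # B- always opens an entity; a stray I- is leniently treated as an opener.
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, label = i, tag[2:]
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged F1 over exact-match entity spans."""
    tp = n_gold = n_pred = 0
    for g_tags, p_tags in zip(gold, pred):
        g, p = bio_to_spans(g_tags), bio_to_spans(p_tags)
        tp += len(g & p)
        n_gold += len(g)
        n_pred += len(p)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Hypothetical four-token sentence: the prediction misses the second entity.
gold = [["B-DRUG", "I-DRUG", "O", "B-ORGAN"]]
pred = [["B-DRUG", "I-DRUG", "O", "O"]]
score = entity_f1(gold, pred)
```

In this example the prediction recovers one of the two gold entities with no false positives, so precision is 1, recall is 0.5, and the F1 is 2/3. Scoring whole spans instead of individual tokens penalizes a model that gets the category right but the boundary wrong.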