Youngwook Kim


2022

Generalizable Implicit Hate Speech Detection Using Contrastive Learning
Youngwook Kim | Shinwoo Park | Yo-Sub Han
Proceedings of the 29th International Conference on Computational Linguistics

Hate speech detection has gained increasing attention with the growing prevalence of hateful content. When a text contains an obvious hate word or expression, it is fairly easy to detect. However, it is challenging to identify implicit hate speech, which is conveyed through nuance or context, when there are insufficient lexical cues. Recently, there have been several attempts to detect implicit hate speech by leveraging pre-trained language models such as BERT and HateBERT. Fine-tuning on an implicit hate speech dataset shows satisfactory performance when evaluated on the test set of the dataset used for training. However, we empirically confirm that the performance drops by at least 12.5%p in F1 score when tested on a dataset different from the one used for training. We tackle this cross-dataset underperformance problem using contrastive learning. Based on our observation of common underlying implications in various forms of hate posts, we propose a novel contrastive learning method, ImpCon, that pulls an implication and its corresponding posts close in representation space. We evaluate the effectiveness of ImpCon by running cross-dataset evaluation on three implicit hate speech benchmarks. The cross-dataset experimental results show that ImpCon yields improvements of up to 9.10% on BERT and 8.71% on HateBERT.
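
The idea of pulling a post toward its implication in representation space can be illustrated with a generic InfoNCE-style contrastive loss. The sketch below is not taken from the paper: the function names, the toy 2-dimensional embeddings, the cosine similarity, and the temperature value are all assumptions made for illustration, and the abstract does not specify the exact loss or encoder that ImpCon uses.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(posts, implications, temperature=0.3):
    """Mean InfoNCE loss over a batch (illustrative, not the paper's loss).

    posts[i] is treated as the anchor, implications[i] as its positive,
    and every other implication in the batch as a negative.
    The temperature is an assumed value.
    """
    total = 0.0
    for i, post in enumerate(posts):
        logits = [cosine(post, imp) / temperature for imp in implications]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Negative log-probability of picking the matching implication.
        total += log_denominator - logits[i]
    return total / len(posts)


# Toy embeddings standing in for encoder outputs (hypothetical values).
posts = [[1.0, 0.1], [0.1, 1.0]]
aligned = [[0.9, 0.0], [0.0, 0.9]]   # each implication matches its post
swapped = [aligned[1], aligned[0]]   # implications paired with the wrong post

loss_aligned = info_nce(posts, aligned)
loss_swapped = info_nce(posts, swapped)
print(loss_aligned < loss_swapped)
```

The loss is small when each post already lies close to its own implication and large when the pairing is wrong, so minimizing it moves matching pairs together relative to the other items in the batch.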

Modularized Transfer Learning with Multiple Knowledge Graphs for Zero-shot Commonsense Reasoning
Yu Jin Kim | Beong-woo Kwak | Youngwook Kim | Reinald Kim Amplayo | Seung-won Hwang | Jinyoung Yeo
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Commonsense reasoning systems should be able to generalize to diverse reasoning cases. However, most state-of-the-art approaches depend on expensive data annotations and overfit to a specific benchmark without learning how to perform general semantic reasoning. To overcome these drawbacks, zero-shot QA systems have shown promise as a robust learning scheme by transforming a commonsense knowledge graph (KG) into synthetic QA-form samples for model training. Considering the growing variety of commonsense KGs, this paper aims to extend the zero-shot transfer learning scenario to multi-source settings, where different KGs can be utilized synergistically. Toward this goal, we propose to mitigate the loss of knowledge caused by interference among the different knowledge sources by developing a modular variant of knowledge aggregation as a new zero-shot commonsense reasoning framework. Results on five commonsense reasoning benchmarks demonstrate the efficacy of our framework, which improves performance with multiple KGs.
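
As a rough illustration of turning a KG into synthetic QA-form samples, the sketch below converts one (head, relation, tail) triple into a multiple-choice item, using tails from other triples as distractors. Everything here is an assumption for illustration: the template wording, the `triple_to_qa` helper, the distractor sampling rule, and the toy triples are hypothetical and are not described in the abstract.

```python
import random

# Hypothetical question templates keyed by relation name.
TEMPLATES = {
    "UsedFor": "What is {head} used for?",
}


def triple_to_qa(triple, all_triples, num_distractors=2, rng=None):
    """Build one synthetic multiple-choice sample from a KG triple.

    The correct answer is the triple's tail; distractors are tails of
    triples with a different head (an assumed, simplistic sampling rule).
    """
    rng = rng or random.Random(0)
    head, relation, tail = triple
    question = TEMPLATES[relation].format(head=head)
    # Candidate distractors: tails belonging to other heads.
    pool = sorted({t for h, _, t in all_triples if h != head and t != tail})
    distractors = rng.sample(pool, min(num_distractors, len(pool)))
    options = distractors + [tail]
    rng.shuffle(options)
    return {
        "question": question,
        "options": options,
        "label": options.index(tail),  # index of the correct option
    }


# Toy triples (hypothetical data).
triples = [
    ("a hammer", "UsedFor", "driving nails"),
    ("a pen", "UsedFor", "writing"),
    ("a kettle", "UsedFor", "boiling water"),
    ("a broom", "UsedFor", "sweeping floors"),
]

sample = triple_to_qa(triples[0], triples)
print(sample["question"])
print(sample["options"], sample["label"])
```

In a multi-source setting, each KG would yield its own pool of such samples; how those pools are combined without interference is the part the paper addresses and is not sketched here.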