Jaewon Lee


2023

AIWolfDial 2023: Summary of Natural Language Division of 5th International AIWolf Contest
Yoshinobu Kano | Neo Watanabe | Kaito Kagaminuma | Claus Aranha | Jaewon Lee | Benedek Hauer | Hisaichi Shibata | Soichiro Miki | Yuta Nakamura | Takuya Okubo | Soga Shigemura | Rei Ito | Kazuki Takashima | Tomoki Fukuda | Masahiro Wakutani | Tomoya Hatanaka | Mami Uchida | Mikio Abe | Akihiro Mikami | Takashi Otsuki | Zhiyang Qi | Kei Harada | Michimasa Inaba | Daisuke Katagami | Hirotaka Osawa | Fujio Toriumi
Proceedings of the 16th International Natural Language Generation Conference: Generation Challenges

We held our fifth annual international AIWolf contest for automatically playing the Werewolf game “Mafia”, in which players try to identify liars through conversation. The contest aims to promote the development of agents capable of more natural, higher-level conversation, involving longer contexts, personal relationships, semantics, pragmatics, and logic, and to reveal the capabilities and limits of generative AIs. In the Natural Language Division of the contest, six Japanese-speaking agents from five teams and three English-speaking agents played games against each other. Using the game logs, we performed human subjective evaluations and detailed log analyses. We found that overall system performance has improved substantially over the previous year, owing to recent advances in LLMs. However, it is far from perfect: the generated talks are sometimes inconsistent with the game actions, and it remains doubtful whether the agents infer roles through logic rather than superficial utterance generation. Although not explicitly observed in these logs, it would still be difficult to make an agent tell a lie, that is, to pretend to be a villager while internally pursuing the opposite goal. Our future work includes revealing whether LLMs can maintain the duality of a “liar”, in other words, holding both a “true” and a “false” account of the agent’s circumstances at the same time, as well as modeling how these circumstances appear to other agents.

“Why do I feel offended?” - Korean Dataset for Offensive Language Identification
San-Hee Park | Kang-Min Kim | O-Joun Lee | Youjin Kang | Jaewon Lee | Su-Min Lee | SangKeun Lee
Findings of the Association for Computational Linguistics: EACL 2023

Warning: This paper contains some offensive expressions. Offensive content is an unavoidable issue on social media. Most existing offensive language identification methods rely on the compilation of labeled datasets. However, existing methods rarely consider low-resource languages that have relatively little data available for training (e.g., Korean). To address these issues, we construct a novel KOrean Dataset for Offensive Language Identification (KODOLI). KODOLI comprises more fine-grained offensiveness categories (i.e., not offensive, likely offensive, and offensive) than existing datasets. Likely offensive language refers to text with implicit offensiveness or to abusive language without offensive intent. In addition, we propose two auxiliary tasks to help identify offensive language: abusive language detection and sentiment analysis. We provide experimental results for baselines on KODOLI and observe that language models struggle to identify “LIKELY” offensive statements. Quantitative results and qualitative analysis demonstrate that jointly learning offensive language, abusive language, and sentiment information improves the performance of offensive language identification.
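
The abstract describes joint training of the main offensiveness classifier with two auxiliary tasks. The following is a minimal sketch of such a multi-task setup, assuming a BERT-style encoder shared by three classification heads; the encoder checkpoint, label granularities, and loss weighting are illustrative assumptions, not the paper's actual implementation.

```python
import torch.nn as nn
from transformers import AutoModel

class MultiTaskOffensivenessModel(nn.Module):
    """Sketch: shared encoder with one main head (offensiveness) and
    two auxiliary heads (abusive language detection, sentiment analysis).
    Encoder name and head sizes are assumptions for illustration."""

    def __init__(self, encoder_name="klue/bert-base"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # Main task: 3-way offensiveness (not offensive / likely offensive / offensive)
        self.offensive_head = nn.Linear(hidden, 3)
        # Auxiliary tasks (binary labels assumed here)
        self.abusive_head = nn.Linear(hidden, 2)
        self.sentiment_head = nn.Linear(hidden, 2)

    def forward(self, input_ids, attention_mask):
        # Use the [CLS] representation as the shared sentence embedding
        pooled = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state[:, 0]
        return (self.offensive_head(pooled),
                self.abusive_head(pooled),
                self.sentiment_head(pooled))

def multitask_loss(logits, labels, aux_weight=0.5):
    """Joint objective: main cross-entropy plus weighted auxiliary losses.
    The weight 0.5 is an assumed hyperparameter, not from the paper."""
    ce = nn.CrossEntropyLoss()
    off_logits, abu_logits, sent_logits = logits
    off_y, abu_y, sent_y = labels
    return ce(off_logits, off_y) + aux_weight * (ce(abu_logits, abu_y) + ce(sent_logits, sent_y))
```

The point of sharing the encoder is that the abusive-language and sentiment signals regularize the representation used for the main offensiveness prediction, which is how joint learning can help on the hard “LIKELY” offensive class.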