Wenzhi Wang


2025

Identification of Multiple Logical Interpretations in Counter-Arguments
Wenzhi Wang | Paul Reisert | Shoichi Naito | Naoya Inoue | Machi Shimmei | Surawat Pothong | Jungmin Choi | Kentaro Inui
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Counter-arguments (CAs) are an effective means of improving learners' critical-thinking skills, especially because composing a CA requires thoroughly considering the logic of the initial argument (IA). Although several tasks have been created for identifying the logical structure of CAs, no prior work has focused on capturing multiple interpretations of logical structures, owing to their complexity. In this work, we create CALSA+, a dataset of 134 CAs annotated with 13 logical predicate questions. CALSA+ contains 1,742 instances annotated by 3 expert annotators (5,226 total annotations) with good agreement (Krippendorff's α = 0.46). Using CALSA+, we train a model with Reinforcement Learning with Verifiable Rewards (RLVR) to identify multiple logical interpretations and show that models trained with RLVR can perform on par with much larger proprietary models. Our work is the first to attempt to annotate all interpretations of logical structure on top of CAs. We publicly release our dataset to facilitate research on CA logical structure identification.

2024

Flee the Flaw: Annotating the Underlying Logic of Fallacious Arguments Through Templates and Slot-filling
Irfan Robbani | Paul Reisert | Surawat Pothong | Naoya Inoue | Camélia Guerraoui | Wenzhi Wang | Shoichi Naito | Jungmin Choi | Kentaro Inui
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Prior research in computational argumentation has mainly focused on scoring the quality of arguments, with less attention paid to explicating logical errors. In this work, we introduce four sets of explainable templates for common informal logical fallacies, designed to explicate a fallacy's implicit logic. Using our templates, we conduct an annotation study on top of 400 fallacious arguments taken from the LOGIC dataset and achieve a high agreement score (Krippendorff's α of 0.54) and reasonable coverage (83%). Finally, we conduct an experiment on detecting the structure of fallacies and find that state-of-the-art language models struggle to detect fallacy templates (0.47 accuracy). To facilitate research on fallacies, we make our dataset and guidelines publicly available.

Designing Logic Pattern Templates for Counter-Argument Logical Structure Analysis
Shoichi Naito | Wenzhi Wang | Paul Reisert | Naoya Inoue | Camélia Guerraoui | Kenshi Yamaguchi | Jungmin Choi | Irfan Robbani | Surawat Pothong | Kentaro Inui
Findings of the Association for Computational Linguistics: EMNLP 2024

2023

Teach Me How to Argue: A Survey on NLP Feedback Systems in Argumentation
Camelia Guerraoui | Paul Reisert | Naoya Inoue | Farjana Sultana Mim | Keshav Singh | Jungmin Choi | Irfan Robbani | Shoichi Naito | Wenzhi Wang | Kentaro Inui
Proceedings of the 10th Workshop on Argument Mining

The use of argumentation in education has been shown to improve students' critical-thinking skills, and computational models for argumentation have been developed to further assist this process. Although these models are useful for evaluating the quality of an argument, they often cannot explain why a particular score was predicted, i.e., why the argument is good or bad, which makes it difficult to provide constructive feedback to users (e.g., students) so that they can strengthen their critical-thinking skills. In this survey, we explore current NLP feedback systems by categorizing each along four important dimensions of feedback (Richness, Visualization, Interactivity, and Personalization). We discuss the limitations of each dimension and provide suggestions to enhance the power of feedback and explanations, ultimately improving users' critical-thinking skills.