On Selecting Training Corpora for Cross-Domain Claim Detection

Robin Schaefer, René Knaebel, Manfred Stede


Abstract
Identifying claims in text is a crucial first step in argument mining. In this paper, we investigate factors for the composition of training corpora to improve cross-domain claim detection. To this end, we use four recent argumentation corpora annotated with claims and submit them to several experimental scenarios. Our results indicate that the “ideal” composition of training corpora is characterized by a large corpus size, homogeneous claim proportions, and less formal text domains.
Anthology ID:
2022.argmining-1.17
Volume:
Proceedings of the 9th Workshop on Argument Mining
Month:
October
Year:
2022
Address:
Online and in Gyeongju, Republic of Korea
Editors:
Gabriella Lapesa, Jodi Schneider, Yohan Jo, Sougata Saha
Venue:
ArgMining
Publisher:
International Conference on Computational Linguistics
Pages:
181–186
URL:
https://aclanthology.org/2022.argmining-1.17
Cite (ACL):
Robin Schaefer, René Knaebel, and Manfred Stede. 2022. On Selecting Training Corpora for Cross-Domain Claim Detection. In Proceedings of the 9th Workshop on Argument Mining, pages 181–186, Online and in Gyeongju, Republic of Korea. International Conference on Computational Linguistics.
Cite (Informal):
On Selecting Training Corpora for Cross-Domain Claim Detection (Schaefer et al., ArgMining 2022)
PDF:
https://preview.aclanthology.org/nschneid-patch-1/2022.argmining-1.17.pdf