Abstract
Identifying claims in text is a crucial first step in argument mining. In this paper, we investigate factors for the composition of training corpora to improve cross-domain claim detection. To this end, we use four recent argumentation corpora annotated with claims and submit them to several experimental scenarios. Our results indicate that the “ideal” composition of training corpora is characterized by a large corpus size, homogeneous claim proportions, and less formal text domains.
- Anthology ID:
- 2022.argmining-1.17
- Volume:
- Proceedings of the 9th Workshop on Argument Mining
- Month:
- October
- Year:
- 2022
- Address:
- Online and in Gyeongju, Republic of Korea
- Editors:
- Gabriella Lapesa, Jodi Schneider, Yohan Jo, Sougata Saha
- Venue:
- ArgMining
- Publisher:
- International Conference on Computational Linguistics
- Pages:
- 181–186
- URL:
- https://aclanthology.org/2022.argmining-1.17
- Cite (ACL):
- Robin Schaefer, René Knaebel, and Manfred Stede. 2022. On Selecting Training Corpora for Cross-Domain Claim Detection. In Proceedings of the 9th Workshop on Argument Mining, pages 181–186, Online and in Gyeongju, Republic of Korea. International Conference on Computational Linguistics.
- Cite (Informal):
- On Selecting Training Corpora for Cross-Domain Claim Detection (Schaefer et al., ArgMining 2022)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/2022.argmining-1.17.pdf