Self-trained Pretrained Language Models for Evidence Detection

Mohamed Elaraby, Diane Litman


Abstract
Argument role labeling is a fundamental task in Argument Mining research. However, such research often suffers from a lack of large-scale datasets labeled for argument roles such as evidence, and such datasets are crucial for training neural models. While large pretrained language models have somewhat alleviated the need for massive manually labeled datasets, how much these models can further benefit from self-training techniques has not been widely explored, either in the literature in general or in Argument Mining specifically. In this work, we focus on self-trained language models (particularly BERT) for evidence detection. We provide a thorough investigation of how to utilize pseudo labels effectively in the self-training scheme. We also assess whether adding pseudo labels from an out-of-domain source can be beneficial. Experiments on sentence-level evidence detection show that self-training can complement pretrained language models to provide performance improvements.
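The general shape of a self-training loop with pseudo labels can be illustrated with a minimal sketch. The abstract does not spell out the procedure, so everything here is an assumption made for illustration: a toy nearest-centroid classifier stands in for fine-tuned BERT, sentences are represented by small numeric feature vectors, and the names train_centroids, predict_with_confidence, and self_train, the 0.8 confidence threshold, and the round limit are all hypothetical rather than the authors' method.

```python
import math


def train_centroids(examples):
    """Toy classifier (stand-in for fine-tuning BERT): one mean vector per label."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict_with_confidence(centroids, features):
    """Return (label, confidence); confidence is a softmax over negative distances."""
    scores = {
        label: math.exp(-math.dist(features, centroid))
        for label, centroid in centroids.items()
    }
    total = sum(scores.values())
    best = max(scores, key=scores.get)
    return best, scores[best] / total


def self_train(labeled, unlabeled, threshold=0.8, max_rounds=5):
    """Repeatedly add confident pseudo-labeled examples and retrain.

    `threshold` and `max_rounds` are illustrative values, not taken from the paper.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    model = train_centroids(labeled)
    for _ in range(max_rounds):
        confident, remaining = [], []
        for features in pool:
            label, confidence = predict_with_confidence(model, features)
            if confidence >= threshold:
                confident.append((features, label))  # accept as a pseudo label
            else:
                remaining.append(features)  # keep for a later round
        if not confident:
            break  # nothing new to learn from; stop early
        labeled.extend(confident)
        pool = remaining
        model = train_centroids(labeled)  # retrain on gold plus pseudo labels
    return model, labeled, pool
```

Under these assumptions, self_train takes a small labeled set of (features, label) pairs and a pool of unlabeled feature vectors, and returns the final model, the labeled set extended with accepted pseudo labels, and the examples that never cleared the threshold. The unlabeled pool could equally be drawn from a different domain, which is the setting the abstract mentions for out-of-domain pseudo labels.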
Anthology ID:
2021.argmining-1.14
Volume:
Proceedings of the 8th Workshop on Argument Mining
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic
Editors:
Khalid Al-Khatib, Yufang Hou, Manfred Stede
Venue:
ArgMining
Publisher:
Association for Computational Linguistics
Pages:
142–147
URL:
https://aclanthology.org/2021.argmining-1.14
DOI:
10.18653/v1/2021.argmining-1.14
Cite (ACL):
Mohamed Elaraby and Diane Litman. 2021. Self-trained Pretrained Language Models for Evidence Detection. In Proceedings of the 8th Workshop on Argument Mining, pages 142–147, Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Self-trained Pretrained Language Models for Evidence Detection (Elaraby & Litman, ArgMining 2021)
PDF:
https://preview.aclanthology.org/nschneid-patch-2/2021.argmining-1.14.pdf