Abstract
In this paper we investigate the efficacy of contextual embeddings from multilingual BERT and German BERT for identifying fact-claiming comments in German on social media. Additionally, we examine the impact of formulating the classification problem as a multi-task learning problem, where the model identifies the toxicity and engagement of a comment in addition to whether it is fact-claiming. We provide a thorough comparison of the two BERT-based models against a logistic regression baseline and show that German BERT features trained with a multi-task objective achieve the best F1 score on the test set. This work was done as part of a submission to the GermEval 2021 shared task on the identification of fact-claiming comments.
- Anthology ID: 2021.germeval-1.15
- Volume: Proceedings of the GermEval 2021 Shared Task on the Identification of Toxic, Engaging, and Fact-Claiming Comments
- Month: September
- Year: 2021
- Address: Duesseldorf, Germany
- Venue: GermEval
- Publisher: Association for Computational Linguistics
- Pages: 100–104
- URL: https://aclanthology.org/2021.germeval-1.15
- Cite (ACL): Subhadarshi Panda and Sarah Ita Levitan. 2021. HunterSpeechLab at GermEval 2021: Does Your Comment Claim A Fact? Contextualized Embeddings for German Fact-Claiming Comment Classification. In Proceedings of the GermEval 2021 Shared Task on the Identification of Toxic, Engaging, and Fact-Claiming Comments, pages 100–104, Duesseldorf, Germany. Association for Computational Linguistics.
- Cite (Informal): HunterSpeechLab at GermEval 2021: Does Your Comment Claim A Fact? Contextualized Embeddings for German Fact-Claiming Comment Classification (Panda & Levitan, GermEval 2021)
- PDF: https://preview.aclanthology.org/starsem-semeval-split/2021.germeval-1.15.pdf
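The abstract describes a multi-task objective in which one shared comment representation feeds three binary predictions (fact-claiming, toxic, engaging). The paper's actual architecture, feature dimensions, and training details are not given on this page, so the sketch below is only an illustration of that general setup in plain Python: a fixed vector stands in for a BERT embedding, each task has its own linear head, and the multi-task loss is the sum of the per-task binary cross-entropies. All names, dimensions, and values are assumptions, not the authors' implementation.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def multitask_loss(embedding, weights, biases, labels):
    """Sum of per-task binary cross-entropy losses.

    One shared `embedding` is scored by a separate linear head
    (weight vector + bias) per task; `labels` maps each task name
    to its 0/1 gold label. Task names are illustrative.
    """
    total = 0.0
    for task, y in labels.items():
        logit = sum(e * w for e, w in zip(embedding, weights[task])) + biases[task]
        p = sigmoid(logit)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total

# Toy example: an 8-dimensional random vector stands in for BERT features.
random.seed(0)
tasks = ["fact_claiming", "toxic", "engaging"]
embedding = [random.gauss(0, 1) for _ in range(8)]
weights = {t: [random.gauss(0, 1) for _ in range(8)] for t in tasks}
biases = {t: 0.0 for t in tasks}
labels = {"fact_claiming": 1, "toxic": 0, "engaging": 0}

loss = multitask_loss(embedding, weights, biases, labels)
```

In training, gradients from all three heads would flow back into the shared encoder, which is the usual motivation for a multi-task setup: the auxiliary toxicity and engagement signals regularize the fact-claiming classifier.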