Abstract
This paper describes our approach (ur-iw-hnt) to the GermEval 2021 Shared Task on identifying toxic, engaging, and fact-claiming comments. We submitted three runs using an ensembling strategy based on majority (hard) voting over multiple BERT models of three different types: German-based, Twitter-based, and multilingual. All ensembles outperform the single models, with BERTweet performing best among the individual models in every subtask. Twitter-based models perform better than GermanBERT models, and multilingual models perform worse, though only by a small margin.
- Anthology ID:
- 2021.germeval-1.12
- Volume:
- Proceedings of the GermEval 2021 Shared Task on the Identification of Toxic, Engaging, and Fact-Claiming Comments
- Month:
- September
- Year:
- 2021
- Address:
- Duesseldorf, Germany
- Editors:
- Julian Risch, Anke Stoll, Lena Wilms, Michael Wiegand
- Venue:
- GermEval
- Publisher:
- Association for Computational Linguistics
- Pages:
- 83–87
- URL:
- https://aclanthology.org/2021.germeval-1.12
- Cite (ACL):
- Hoai Nam Tran and Udo Kruschwitz. 2021. ur-iw-hnt at GermEval 2021: An Ensembling Strategy with Multiple BERT Models. In Proceedings of the GermEval 2021 Shared Task on the Identification of Toxic, Engaging, and Fact-Claiming Comments, pages 83–87, Duesseldorf, Germany. Association for Computational Linguistics.
- Cite (Informal):
- ur-iw-hnt at GermEval 2021: An Ensembling Strategy with Multiple BERT Models (Tran & Kruschwitz, GermEval 2021)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2021.germeval-1.12.pdf
- Code:
- hn-tran/germeval2021
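
The majority (hard) voting strategy described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the model names and the toy predictions are hypothetical, and it assumes each model emits one discrete label per comment.

```python
from collections import Counter

def hard_vote(predictions):
    """Majority (hard) vote over per-model label predictions.

    predictions: a list of label sequences, one per model, all of
    equal length. Returns the most frequent label per example.
    """
    ensembled = []
    for labels in zip(*predictions):
        # most_common(1) yields the label with the highest count;
        # ties are broken by first-seen order among the models.
        ensembled.append(Counter(labels).most_common(1)[0][0])
    return ensembled

# Hypothetical binary predictions (1 = toxic) from three models
# (e.g. a German-based, a Twitter-based, and a multilingual BERT)
# on the same five comments:
model_a = [1, 0, 1, 1, 0]
model_b = [1, 1, 1, 0, 0]
model_c = [0, 0, 1, 1, 1]

print(hard_vote([model_a, model_b, model_c]))  # → [1, 0, 1, 1, 0]
```

With an odd number of models and binary labels, hard voting never ties, which is one practical reason to ensemble three models rather than two.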