FNLP412@EEUCA 2026: Understanding Toxic Behavioral Intent in Gaming Chat Logs using Transfer Learning and Synthetic Data Augmentation

Mihai Radu Radulescu


Abstract
Our paper explores several machine learning methods for detecting toxic language in gaming-related chat utterances. We start with the GameTox dataset, preprocess the data, and augment the minority classes with LLM-generated synthetic data. We then set a baseline using a classic Logistic Regression model and explore several approaches to surpassing it by leveraging leading multilingual transformer models (XLM-RoBERTa and DeBERTa-V3) to classify our test data. We achieve a top result of 0.6725 Macro-F1 (2nd place on the shared task leaderboard) using an mDeBERTa-V3 model which we pretrained on the Jigsaw dataset for 1 epoch and then fine-tuned on our GameTox data for 5 epochs.
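The abstract reports results in Macro-F1, which averages per-class F1 scores without weighting by class frequency, so minority classes count as much as the majority class. The following is a minimal sketch of that metric, not the shared task's official scorer; the function name `macro_f1` and the labels `clean` and `toxic` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    # Score every label that appears in either the gold or predicted labels.
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0

    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1
        else:
            fp[pred] += 1  # predicted class gains a false positive
            fn[gold] += 1  # gold class gains a false negative

    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels: one of the two "toxic" utterances is missed.
y_true = ["clean", "clean", "clean", "clean", "toxic", "toxic"]
y_pred = ["clean", "clean", "clean", "clean", "clean", "toxic"]
score = macro_f1(y_true, y_pred)
print(round(score, 4))
```

In this toy example accuracy is 5/6, while Macro-F1 is lower (7/9) because the missed minority-class utterance halves that class's recall. This sensitivity is one reason the metric suits imbalanced toxicity data.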
Anthology ID:
2026.eeuca-1.10
Volume:
Proceedings of the 9th Workshop on Event Extraction and Understanding: Challenges and Applications (EEUCA 2026)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Ali Hürriyetoğlu, Surendrabikram Thapa, Hristo Tanev
Venues:
EEUCA | WS
Publisher:
Association for Computational Linguistics
Pages:
96–103
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.eeuca-1.10/
Cite (ACL):
Mihai Radu Radulescu. 2026. FNLP412@EEUCA 2026: Understanding Toxic Behavioral Intent in Gaming Chat Logs using Transfer Learning and Synthetic Data Augmentation. In Proceedings of the 9th Workshop on Event Extraction and Understanding: Challenges and Applications (EEUCA 2026), pages 96–103, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
FNLP412@EEUCA 2026: Understanding Toxic Behavioral Intent in Gaming Chat Logs using Transfer Learning and Synthetic Data Augmentation (Radulescu, EEUCA 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.eeuca-1.10.pdf