2025
CIC-NLP at GenAI Detection Task 1: Advancing Multilingual Machine-Generated Text Detection
Tolulope Olalekan Abiola
|
Tewodros Achamaleh Bizuneh
|
Fatima Uroosa
|
Nida Hafeez
|
Grigori Sidorov
|
Olga Kolesnikova
|
Olumide Ebenezer Ojo
Proceedings of the 1st Workshop on GenAI Content Detection (GenAIDetect)
Machine-written texts are gradually becoming indistinguishable from human-generated texts, creating a need for sophisticated methods to detect them. Team CIC-NLP presents its work on the GenAI Content Detection Task 1 at the COLING 2025 Workshop. Our work focuses on Subtask B of Task 1, the classification of machine-written versus human-authored text, framed as a multilingual binary classification problem. Using mBERT, we addressed this binary classification task on the dataset provided by the GenAI Detection Task organizers. mBERT achieved a macro-average F1-score of 0.72 and an accuracy of 0.73.
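As an illustration of the approach described in this abstract, the following minimal sketch fine-tunes mBERT for binary human-vs-machine text classification with Hugging Face Transformers. The checkpoint choice (bert-base-multilingual-cased), file names, column names, and hyperparameters are illustrative assumptions rather than the authors' exact configuration.

```python
# Minimal sketch: fine-tuning mBERT for binary human-vs-machine classification.
# File names, label encoding, and hyperparameters are assumed, not the paper's exact setup.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)

MODEL_NAME = "bert-base-multilingual-cased"  # standard mBERT checkpoint (assumed)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

# Hypothetical JSONL files with "text" and "label" fields (0 = human, 1 = machine).
data = load_dataset("json", data_files={"train": "train.jsonl", "dev": "dev.jsonl"})
data = data.map(lambda b: tokenizer(b["text"], truncation=True, max_length=512),
                batched=True)

args = TrainingArguments(
    output_dir="mbert-mgt-detector",
    learning_rate=2e-5,               # assumed value
    num_train_epochs=3,               # assumed value
    per_device_train_batch_size=16,
)

trainer = Trainer(model=model, args=args,
                  train_dataset=data["train"],
                  eval_dataset=data["dev"],
                  tokenizer=tokenizer)
trainer.train()
```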
CIC-NLP at GenAI Detection Task 1: Leveraging DistilBERT for Detecting Machine-Generated Text in English
Tolulope Olalekan Abiola
|
Tewodros Achamaleh Bizuneh
|
Oluwatobi Joseph Abiola
|
Temitope Olasunkanmi Oladepo
|
Olumide Ebenezer Ojo
|
Grigori Sidorov
|
Olga Kolesnikova
Proceedings of the 1st Workshop on GenAI Content Detection (GenAIDetect)
As machine-generated texts (MGT) become increasingly similar to human writing, distinguishing between the two grows harder. In this paper, we, the CIC-NLP team, present our submission to the GenAI Content Detection Workshop at COLING 2025 for Task 1 Subtask A, which involves distinguishing between text generated by LLMs and text authored by humans, with an emphasis on detecting English-only MGT. We applied the DistilBERT model to this binary classification task using the dataset provided by the organizers. Fine-tuning the model effectively differentiated between the classes, resulting in a micro-average F1-score of 0.70 on the evaluation test set. We provide a detailed explanation of the fine-tuning parameters and steps involved in our analysis.
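A minimal sketch of the kind of DistilBERT fine-tuning pipeline this abstract describes is shown below, including a micro-averaged F1 metric as reported. The checkpoint (distilbert-base-uncased), file names, and hyperparameters are illustrative assumptions, not the team's exact settings.

```python
# Minimal sketch: DistilBERT fine-tuning for English-only MGT detection,
# evaluated with micro-averaged F1. Data format and hyperparameters are assumed.
import numpy as np
from sklearn.metrics import f1_score
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)

MODEL_NAME = "distilbert-base-uncased"  # assumed checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

# Hypothetical JSONL files with "text" and "label" fields.
data = load_dataset("json", data_files={"train": "train.jsonl", "dev": "dev.jsonl"})
data = data.map(lambda b: tokenizer(b["text"], truncation=True, max_length=512),
                batched=True)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"micro_f1": f1_score(labels, preds, average="micro")}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="distilbert-mgt", num_train_epochs=3,
                           learning_rate=2e-5, per_device_train_batch_size=16),
    train_dataset=data["train"],
    eval_dataset=data["dev"],
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
trainer.train()
print(trainer.evaluate())
```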
Tewodros at SemEval-2025 Task 11: Multilingual Emotion Intensity Detection using Small Language Models
Mikiyas Eyasu
|
Wendmnew Sitot Abebaw
|
Nida Hafeez
|
Fatima Uroosa
|
Tewodros Achamaleh Bizuneh
|
Grigori Sidorov
|
Alexander Gelbukh
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
Emotions play a fundamental role in the decision-making process, shaping human actions across diverse disciplines. Emotion intensity detection has generated substantial research interest over the last few years, yet efficient multi-label emotion intensity detection remains unsatisfactory even for high-resource languages, with a substantial performance gap between well-resourced and under-resourced languages. Team Tewodros participated in SemEval-2025 Task 11, Track B, focusing on detecting text-based emotion intensity. Our work involved multi-label emotion intensity detection across three languages: Amharic, English, and Spanish, using the afro-xlmr-large-76L, DeBERTa-v3-base, and BERT-base-Spanish-wwm-uncased models. The models achieved an average F1 score of 0.6503 for Amharic, 0.5943 for English, and an accuracy score of 0.6228 for Spanish. These results demonstrate the effectiveness of our models in capturing emotion intensity across multiple languages.
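To make the per-language setup concrete, the sketch below fine-tunes one of the models named in the abstract (afro-xlmr-large-76L for Amharic), treating emotion intensities as a multi-output regression problem. The emotion label set, file names, regression framing, and hyperparameters are illustrative assumptions and not necessarily the team's actual configuration.

```python
# Minimal sketch: per-emotion intensity prediction with afro-xlmr-large-76L,
# framed as multi-output regression. Label set and data format are assumed.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)

MODEL_NAME = "Davlan/afro-xlmr-large-76L"
EMOTIONS = ["anger", "fear", "joy", "sadness", "surprise"]  # assumed label set

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=len(EMOTIONS), problem_type="regression")

# Hypothetical JSONL files with a "text" field and one numeric intensity per emotion.
data = load_dataset("json", data_files={"train": "amh_train.jsonl",
                                        "dev": "amh_dev.jsonl"})

def preprocess(batch):
    enc = tokenizer(batch["text"], truncation=True, max_length=256)
    # One float intensity per emotion for each example.
    enc["labels"] = [[float(batch[e][i]) for e in EMOTIONS]
                     for i in range(len(batch["text"]))]
    return enc

data = data.map(preprocess, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="afroxlmr-intensity", num_train_epochs=3,
                           learning_rate=1e-5, per_device_train_batch_size=8),
    train_dataset=data["train"],
    eval_dataset=data["dev"],
    tokenizer=tokenizer,
)
trainer.train()
```

The same recipe applies to the English and Spanish models by swapping the checkpoint and data files.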