Girma Bade

Also published as: Girma Yohannis Bade


2025

Girma@DravidianLangTech 2025: Detecting AI Generated Product Reviews
Girma Yohannis Bade | Muhammad Tayyab Zamir | Olga Kolesnikova | José Luis Oropeza | Grigori Sidorov | Alexander Gelbukh
Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages

The increasing prevalence of AI-generated content, including fake product reviews, poses significant challenges for maintaining authenticity and trust in e-commerce systems. While much work has focused on detecting such reviews in high-resource languages, limited attention has been given to low-resource languages such as Malayalam and Tamil. This study addresses that gap by developing a framework to identify AI-generated product reviews in these two languages. Our methodology involves fine-tuning a BERT-based model on Malayalam and Tamil datasets. The experiments use labeled datasets that contain a mix of human-written and AI-generated reviews, and performance is evaluated using the macro F1 score. The BERT-based model achieved a macro F1 score of 0.6394 for Tamil and 0.8849 for Malayalam. These preliminary results indicate that the model performs substantially better for Malayalam than for Tamil. The approach leverages BERT's ability to capture the complex linguistic features of these languages. The source code of the implementation is available in the GitHub repository: AI-Generated-Product-Review-Code
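The macro F1 score reported above is the unweighted mean of the per-class F1 scores. The following is a minimal sketch of that computation in plain Python. The function name `macro_f1` and the toy labels `"human"` and `"ai"` are illustrative assumptions and are not taken from the paper or its repository.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    per_class = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the denominator is 0
        per_class.append(2 * tp[label] / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Toy labels for illustration only (hypothetical, not from the shared task data).
y_true = ["human", "human", "ai", "ai"]
y_pred = ["human", "ai", "ai", "ai"]
score = macro_f1(y_true, y_pred)
```

Because every class contributes equally regardless of its size, macro F1 penalizes a model that does well only on the majority class.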

Amado at SemEval-2025 Task 11: Multi-label Emotion Detection in Amharic and English Data
Girma Bade | Olga Kolesnikova | Jose Oropeza | Grigori Sidorov | Mesay Yigezu
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

Amado at SemEval-2025 Task 11: Multi-label Emotion Detection in Amharic and English Data. Girma Yohannis Bade, Olga Kolesnikova, José Luis Oropeza, Grigori Sidorov, Mesay Gemeda Yigezu (Centro de Investigaciones en Computación (CIC), Instituto Politécnico Nacional (IPN), Miguel Othon de Mendizabal, Ciudad de México, 07320, México.)

2024

Social Media Fake News Classification Using Machine Learning Algorithm
Girma Bade | Olga Kolesnikova | Grigori Sidorov | José Oropeza
Proceedings of the Fourth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages

The rise of social media has made communication, information sharing, and following current affairs easier. However, the prevalence of misleading and deceptive content, commonly referred to as fake news, poses a significant challenge. This paper focuses on the classification of fake news in Malayalam, a Dravidian language, using natural language processing (NLP) techniques. To develop a model, we applied a random forest machine learning method to a dataset provided by a shared task (DravidianLangTech@EACL 2024). When evaluated on the separate test dataset, our model achieved a macro F1 score of 0.71.

Social Media Hate and Offensive Speech Detection Using Machine Learning method
Girma Bade | Olga Kolesnikova | Grigori Sidorov | José Oropeza
Proceedings of the Fourth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages

Although the improper use of social media is increasing, technology also offers solutions. Here, improper use means posting hate and offensive speech that might harm an individual or group. Hate speech refers to an insult directed at an individual or group based on their identity, and its spread on social media platforms is a serious problem for society. Natural language processing (NLP) technology is capable of detecting and handling such content. This paper presents the detection of hate and offensive speech on social media in code-mixed Telugu. The task and the gold-standard dataset were provided by the shared task organizers (DravidianLangTech@EACL 2024). We employed the TF-IDF technique for numeric feature extraction and a random forest algorithm for modeling hate speech detection. The resulting model was evaluated on the test dataset and achieved a macro F1 score of 0.492.
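The TF-IDF plus random forest setup described above can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy texts, the label names, the character n-gram settings, and the hyperparameters are all hypothetical placeholders, and the shared task data is not reproduced here.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline

# Hypothetical toy data standing in for the code-mixed shared task dataset.
train_texts = [
    "have a great day friend",
    "thank you for the kind words",
    "you people are worthless",
    "get lost nobody wants you here",
]
train_labels = ["non-hate", "non-hate", "hate", "hate"]
test_texts = ["thank you friend", "you are worthless"]
test_labels = ["non-hate", "hate"]

# TF-IDF turns each text into a numeric feature vector; the random forest
# is trained on those vectors. Character n-grams are an assumption here,
# chosen because they tolerate spelling variation in code-mixed text.
model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(1, 3)),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
model.fit(train_texts, train_labels)

predictions = model.predict(test_texts)
score = f1_score(test_labels, predictions, average="macro")
print(f"macro F1 on toy data: {score:.3f}")
```

Wrapping both steps in one pipeline ensures the TF-IDF vocabulary is fitted on the training texts only and then reused unchanged on the test texts.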