Seyedeh Fatemeh Ebrahimi

2025

Constrained Non-negative Matrix Factorization for Guided Topic Modeling of Minority Topics
Seyedeh Fatemeh Ebrahimi | Jaakko Peltonen
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Topic models often fail to capture low-prevalence, domain-critical themes (so-called minority topics), such as mental health themes in online comments. While some existing methods can incorporate domain knowledge such as expected topical content, methods that allow guidance may require overly detailed expected topics, hindering the discovery of topic divisions and variation. We propose a topic modeling solution via a specially constrained non-negative matrix factorization (NMF). We incorporate a seed word list characterizing minority content of interest, but we do not require experts to pre-specify its division across minority topics. Through prevalence constraints on minority topics and seed word content across topics, we learn distinct data-driven minority topics as well as majority topics. The constrained NMF is fitted via Karush-Kuhn-Tucker (KKT) conditions with multiplicative updates. We outperform several baselines on synthetic data in terms of topic purity and normalized mutual information, and we also evaluate topic quality using Jensen-Shannon divergence (JSD). We conduct a case study on YouTube vlog comments, analyzing viewer discussion of mental health content; our model successfully identifies and reveals this domain-relevant minority content.
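For readers unfamiliar with multiplicative updates, the following is a minimal sketch of the standard, unconstrained NMF updates for the squared Frobenius objective (the classic Lee and Seung scheme). It is not the constrained method of the paper: the prevalence and seed-word constraints described in the abstract are not implemented here, and the function name `nmf_multiplicative` and all parameter names are hypothetical choices made for illustration.

```python
import numpy as np


def nmf_multiplicative(X, n_topics, n_iter=200, eps=1e-10, seed=0):
    """Plain NMF, X ~= W @ H, fitted with multiplicative updates.

    X        : non-negative (documents x words) matrix
    n_topics : number of latent topics (rank of the factorization)
    Returns W (documents x topics) and H (topics x words), both non-negative.
    Illustrative only: no prevalence or seed-word constraints are applied.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = X.shape

    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((n_docs, n_topics)) + eps
    H = rng.random((n_topics, n_words)) + eps

    for _ in range(n_iter):
        # Each factor is rescaled elementwise by a non-negative ratio,
        # so non-negativity is preserved without any projection step.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)

    return W, H


# Toy usage on synthetic low-rank count-like data (an assumed example).
rng = np.random.default_rng(1)
X_toy = rng.random((20, 3)) @ rng.random((3, 15))
W_toy, H_toy = nmf_multiplicative(X_toy, n_topics=3)
rel_error = np.linalg.norm(X_toy - W_toy @ H_toy) / np.linalg.norm(X_toy)
```

In this sketch the rows of `H_toy` play the role of topics over words and the rows of `W_toy` give per-document topic weights; a guided variant would add constraint terms to the objective and modify the update ratios accordingly.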

2024

Sharif-MGTD at SemEval-2024 Task 8: A Transformer-Based Approach to Detect Machine Generated Text
Seyedeh Fatemeh Ebrahimi | Karim Akhavan Azari | Amirmasoud Iravani | Arian Qazvini | Pouya Sadeghi | Zeinab Taghavi | Hossein Sameti
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

This paper addresses the detection of machine-generated text (MGT) in Natural Language Processing (NLP). Our approach fine-tunes a RoBERTa-base Transformer to treat MGT detection as a binary classification task. Focusing on Subtask A (Monolingual - English) of the SemEval-2024 competition, our system achieves 78.9% accuracy on the test dataset, placing us 57th among participants. While our system is proficient at identifying human-written texts, it has difficulty accurately identifying MGTs.

Sharif-STR at SemEval-2024 Task 1: Transformer as a Regression Model for Fine-Grained Scoring of Textual Semantic Relations
Seyedeh Fatemeh Ebrahimi | Karim Akhavan Azari | Amirmasoud Iravani | Hadi Alizadeh | Zeinab Taghavi | Hossein Sameti
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

This paper explores semantic textual relatedness (STR) by fine-tuning the RoBERTa transformer model, focusing on sentence-level STR within Track A (Supervised). The study evaluates the effectiveness of this approach across different languages, reporting promising results in English and Spanish but challenges in Arabic.