Ahmed Fetouh
2026
REGLAT at SemEval-2026 Task 12: Multi-Strategy Ensemble Reasoning for Event Causality Identification
Mariam Francies | Nsrin Ashraf | Ahmed Fetouh | Asad Khalil | Hamada Nayel
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
This paper describes the multi-strategy ensemble approach behind our submission to the Abductive Event Reasoning shared task. The model combines semantic similarity, causal pattern recognition, and Large Language Models (LLMs) to identify the causes of news events. It integrates embedding-based semantic similarity, explicit causal pattern matching, keyword overlap analysis, temporal alignment scoring, and LLM-enhanced reasoning. On the development set, the LLM-enhanced configuration reached an accuracy of 65.4% and the non-LLM ensemble reached 43.2%. The final test-set score on the leaderboard is 0.3.
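As a rough illustration of how several scoring strategies might be combined into one ranking, the sketch below mixes two of the signals named in the abstract (keyword overlap and explicit causal pattern matching) with a weighted average. The cue-phrase list, the weights, and all function names are assumptions made for illustration; the abstract does not specify them, and the embedding, temporal, and LLM components are omitted.

```python
import re

# Hypothetical cue phrases; the system's actual pattern list is not given in the abstract.
CAUSAL_CUES = re.compile(r"\b(because|due to|led to|caused|resulted in)\b", re.I)


def keyword_overlap(event: str, candidate: str) -> float:
    """Jaccard overlap between the lowercase word sets of two texts."""
    a = set(re.findall(r"\w+", event.lower()))
    b = set(re.findall(r"\w+", candidate.lower()))
    return len(a & b) / len(a | b) if a | b else 0.0


def causal_pattern(candidate: str) -> float:
    """Return 1.0 if the candidate contains an explicit causal cue, else 0.0."""
    return 1.0 if CAUSAL_CUES.search(candidate) else 0.0


def ensemble_score(event: str, candidate: str, weights: dict) -> float:
    """Weighted average of the individual strategy scores."""
    scores = {
        "overlap": keyword_overlap(event, candidate),
        "pattern": causal_pattern(candidate),
    }
    total = sum(weights.values())
    return sum(weights[name] * scores[name] for name in weights) / total


def rank_candidates(event: str, candidates: list, weights: dict = None) -> list:
    """Sort candidate causes from most to least plausible under the ensemble."""
    # Assumed weights, chosen only for this example.
    weights = weights or {"overlap": 0.6, "pattern": 0.4}
    return sorted(
        candidates,
        key=lambda c: ensemble_score(event, c, weights),
        reverse=True,
    )
```

Calling `rank_candidates(event, candidates)` returns the candidates ordered by combined score; further strategies could be added as extra entries in `scores` with matching weights.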
REGLAT at SemEval-2026 Task 9: Enhancing Arabic Online Polarization Detection Using AraBERT and Synonym Replacement Augmentation
Ahmed Fetouh | Mariam Francies | Nsrin Ashraf | Hamada Nayel | Rahmath Mohammed
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
In this paper, we present our system submitted to SemEval-2026 Task 9 (Subtask 1: Polarization Detection), which focuses on binary classification of polarized content in Arabic social media text. To address Arabic linguistic variations, we propose a single-model approach that combines fine-tuned AraBERT with synonym-based data augmentation. On the Arabic blind set, our method achieves a macro F1-score of 0.831 and an accuracy of 0.833. Among the 45 participating teams, our system ranked 11th overall, 0.018 macro F1 behind the top-ranked team (0.8488). The results show that a fine-tuned AraBERT with synonym replacement is a strong, simple, and reproducible baseline that outperforms more complex setups in dealing with Arabic attitude polarization nuances.
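Synonym replacement augmentation, as mentioned in the abstract above, can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the function name, the replacement probability `p`, and the toy English lexicon are all assumptions, and a real setup would use an Arabic synonym resource and tokenizer.

```python
import random


def synonym_replace(tokens, lexicon, p=0.3, rng=None):
    """Return a copy of tokens where each token found in the lexicon
    is swapped for one of its synonyms with probability p."""
    # A fixed seed keeps the sketch reproducible; this is an assumed default.
    rng = rng or random.Random(0)
    augmented = []
    for tok in tokens:
        options = lexicon.get(tok)
        if options and rng.random() < p:
            augmented.append(rng.choice(options))
        else:
            augmented.append(tok)
    return augmented


# Toy lexicon for illustration only.
toy_lexicon = {"big": ["large", "huge"], "angry": ["furious"]}
augmented = synonym_replace(["the", "big", "angry", "crowd"], toy_lexicon, p=0.5)
```

Each augmented sentence keeps the label of the original, so the training set grows without extra annotation; the token count is unchanged because replacements are one-for-one.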