Muhammad Haroon
2025
Re-ranking Using Large Language Models for Mitigating Exposure to Harmful Content on Social Media Platforms
Rajvardhan Oak | Muhammad Haroon | Claire Wonjeong Jo | Magdalena Wojcieszak | Anshuman Chhabra
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Social media platforms use Machine Learning (ML)- and Artificial Intelligence (AI)-powered recommendation algorithms to maximize user engagement, which can result in inadvertent exposure to harmful content. Current moderation efforts, which rely on classifiers trained with extensive human-annotated data, struggle to scale and to adapt to new forms of harm. To address these challenges, we propose a novel re-ranking approach using Large Language Models (LLMs) in zero-shot and few-shot settings. Our method dynamically assesses and re-ranks content sequences, effectively mitigating harmful content exposure without requiring extensive labeled data. Alongside traditional ranking metrics, we introduce two new metrics to evaluate how effectively re-ranking reduces exposure to harmful content. Through experiments on three datasets, three models, and three configurations, we demonstrate that our LLM-based approach significantly outperforms existing proprietary moderation approaches, offering a scalable and adaptable solution for harm mitigation.
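A minimal sketch of how such zero-shot LLM re-ranking could look in practice: each recommended post is scored for harmfulness by an LLM, and the feed is re-ordered so lower-harm items surface first. The prompt wording, the model name (gpt-4o-mini), and the 0-1 harm scale are illustrative assumptions, not the prompts, models, or metrics used in the paper.

```python
# Minimal sketch of zero-shot LLM re-ranking of a feed to reduce harmful exposure.
# Prompt wording, model name, and the 0-1 harm scale are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def harm_score(text: str) -> float:
    """Ask the LLM for a 0-1 harmfulness rating of a single post (zero-shot)."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[
            {"role": "system",
             "content": "Rate how harmful the following social media post is on a "
                        "scale from 0 (benign) to 1 (severely harmful). "
                        "Reply with a single number only."},
            {"role": "user", "content": text},
        ],
        temperature=0,
    )
    try:
        return float(response.choices[0].message.content.strip())
    except ValueError:
        return 0.5  # fall back to a neutral score if the reply is not numeric


def rerank(feed: list[str]) -> list[str]:
    """Re-order a recommended feed so lower-harm items appear earlier."""
    return sorted(feed, key=harm_score)


if __name__ == "__main__":
    feed = [
        "A cheerful post about gardening.",
        "A post promoting a dangerous 'challenge'.",
        "A neutral news summary.",
    ]
    for rank, post in enumerate(rerank(feed), start=1):
        print(rank, post)
```

A few-shot variant would simply prepend a handful of labeled example posts to the system prompt before the item being scored.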
Automated Authentication of Quranic Verses Using BERT (Bidirectional Encoder Representations from Transformers) based Language Models
Khubaib Amjad Alam | Maryam Khalid | Syed Ahmed Ali | Haroon Mahmood | Qaisar Shafi | Muhammad Haroon | Zulqarnain Haider
Proceedings of the New Horizons in Computational Linguistics for Religious Texts
The proliferation of Quranic content on digital platforms, including websites and social media, has brought about significant challenges in verifying the authenticity of Quranic verses. The inherent complexity of the Arabic language, with its rich morphology, syntax, and semantics, makes traditional text-processing techniques inadequate for robust authentication. This paper addresses this problem by leveraging state-of-the-art transformer-based language models tailored for Arabic text processing. Our approach involves fine-tuning three transformer architectures (BERT-Base-Arabic, AraBERT, and MarBERT) on a curated dataset containing both authentic and non-authentic verses. Non-authentic examples were created using Sentence-BERT, which applies cosine similarity to introduce subtle modifications. Comprehensive experiments were conducted to evaluate the performance of the models. Among the three candidates, MarBERT, which is specifically designed for handling Arabic dialects, demonstrated superior performance, achieving an F1-score of 93.80%. BERT-Base-Arabic also achieved a competitive F1-score of 92.90%, reflecting its robust understanding of Arabic text. The findings underscore the potential of transformer-based models in addressing the linguistic complexities inherent in Quranic text and pave the way for developing automated, reliable tools for Quranic verse authentication in the digital era.
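As a concrete starting point, the sketch below fine-tunes one of the candidate models as a binary authentic / non-authentic verse classifier with Hugging Face transformers. The checkpoint name (UBC-NLP/MARBERT), hyperparameters, and the two-example placeholder dataset are assumptions for illustration, not the paper's exact configuration.

```python
# Minimal sketch: fine-tune an Arabic BERT checkpoint as a binary
# authentic / non-authentic verse classifier with Hugging Face transformers.
# Checkpoint, hyperparameters, and the toy dataset are illustrative assumptions.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

checkpoint = "UBC-NLP/MARBERT"  # assumed MarBERT checkpoint on the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Tiny placeholder dataset: label 1 = authentic verse, 0 = subtly modified text.
data = Dataset.from_dict({
    "text": ["... authentic verse text ...", "... subtly modified verse text ..."],
    "label": [1, 0],
})
data = data.map(
    lambda batch: tokenizer(batch["text"], truncation=True,
                            padding="max_length", max_length=128),
    batched=True,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="verse-authenticator",
        num_train_epochs=3,
        per_device_train_batch_size=16,
    ),
    train_dataset=data,
)
trainer.train()
```

In a full pipeline, the non-authentic half of the training set would be generated beforehand by perturbing verses and keeping only perturbations whose Sentence-BERT cosine similarity to the original stays high, so the modifications remain subtle.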