Aditya Singh

2026

CultRAG at SemEval-2026 Task 7: Hybrid Sparse-Dense Retrieval with Entity-Centric Knowledge Bases for Cultural MCQ Answering
Aditya Singh | Rickarya Das
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)

We developed CultRAG, a trust-weighted Retrieval-Augmented Generation system for BLEnD Track 2 (SemEval-2026 Task 7), targeting culturally grounded multiple-choice QA across 30 countries. Built on Llama-3.1-8B-Instruct, the six-phase pipeline integrates entity extraction via spaCy, hybrid BM25+FAISS retrieval with Reciprocal Rank Fusion, country-aware filtering, keyword-based intent detection, tiered prompt routing, anti-leak quality filtering to suppress answer-anchoring artifacts, and trust-weighted document reranking with source-credibility tiers. Ablation analysis across eight cumulative configurations and per-country decomposition identify which components contribute and where retrieval helps versus hurts, informing future directions for confidence-conditioned selective retrieval.
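As an illustration of the Reciprocal Rank Fusion step named in the abstract, here is a minimal sketch of how a sparse (BM25) ranking and a dense (FAISS) ranking could be merged. The abstract does not give the implementation, so everything here is an assumption: the function name, the document IDs, and the smoothing constant k=60 (a commonly used default) are hypothetical and not taken from the CultRAG system.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. k=60 is an assumed default, not a value
    reported in the abstract.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending fused score; break ties by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical top-3 results from a sparse and a dense retriever.
bm25_ranking = ["d1", "d2", "d3"]
dense_ranking = ["d3", "d1", "d4"]

fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
print(fused)
```

Documents returned by both retrievers (d1 and d3 here) accumulate two reciprocal-rank terms and so rise above documents that only one retriever found. Because only ranks are used, the BM25 and dense similarity scores do not need to be on a comparable scale.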

2023

Bhattacharya_Lab at SemEval-2023 Task 12: A Transformer-based Language Model for Sentiment Classification for Low Resource African Languages: Nigerian Pidgin and Yoruba
Nathaniel Hughes | Kevan Baker | Aditya Singh | Aryavardhan Singh | Tharalillah Dauda | Sutanu Bhattacharya
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

Sentiment Analysis is an aspect of natural language processing (NLP) that has been a topic of research. While most studies focus on high-resource languages with an extensive amount of available data, the study on low-resource languages with insufficient data needs attention. To address this issue, we propose a transformer-based method for sentiment analysis for low-resource African languages, Nigerian Pidgin and Yoruba. To evaluate the effectiveness of our multilingual language models for monolingual sentiment classification, we participated in the AfriSenti SemEval shared task 2023 competition. On the official evaluation set, our group (named as Bhattacharya_Lab) ranked 1 out of 33 participating groups in the Monolingual Sentiment Classification task (i.e., Task A) for Nigerian Pidgin (i.e., Track 4), and in the Top 5 among 33 participating groups in the Monolingual Sentiment Classification task for Yoruba (i.e., Track 2) respectively, demonstrating the potential for our transformer-based language models to improve sentiment analysis in low-resource languages. Overall, our study highlights the importance of exploring the potential of NLP in low-resource languages and the impact of transformer-based multilingual language models in sentiment analysis for the low-resource African languages, Nigerian Pidgin and Yoruba.

Co-authors

Aryavardhan Singh 1

Venues

SemEval 2
WS 1
