2025
Efficient Environmental Claim Detection with Hyperbolic Graph Neural Networks
Darpan Aswal | Manjira Sinha
The 14th International Joint Conference on Natural Language Processing and The 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Transformer-based models, especially large language models (LLMs), dominate the field of NLP, with mass adoption in tasks such as text generation, summarization, and fake news detection. These models offer ease of deployment and reliability for most applications; however, they require significant computational power for both training and inference. This poses challenges for their adoption in resource-constrained applications, especially in the open-source community, where compute availability is usually scarce. This work proposes a graph-based approach for Environmental Claim Detection, exploring Graph Neural Networks (GNNs) and Hyperbolic Graph Neural Networks (HGNNs) as lightweight yet effective alternatives to transformer-based models. Reframing the task as a graph classification problem, we transform claim sentences into dependency parsing graphs, using a combination of word2vec and learnable part-of-speech (POS) tag embeddings for the node features and encoding syntactic dependencies in the edge relations. Our results show that our graph-based models, particularly HGNNs in the Poincaré space (P-HGNNs), achieve performance superior to the state of the art on environmental claim detection while using up to 30x fewer parameters. We also demonstrate that HGNNs benefit greatly from explicitly modeling data in hierarchical (tree-like) structures, enabling them to significantly improve over their Euclidean counterparts.
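The graph framing in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the parse below is hand-written rather than produced by a parser, the names `PARSE`, `parse_to_graph`, and `poincare_distance` are hypothetical, and the node features are left as (word, POS) pairs instead of word2vec and learned POS embeddings. The distance function is the standard geodesic distance in the Poincaré ball, shown only to indicate the geometry a P-HGNN operates in.

```python
import math

# Hand-written dependency parse (an assumption for illustration; a real
# pipeline would obtain this from a dependency parser).
# Each token: (text, POS tag, index of head token, dependency label).
# The root token points to itself.
PARSE = [
    ("We", "PRON", 1, "nsubj"),
    ("reduce", "VERB", 1, "ROOT"),
    ("emissions", "NOUN", 1, "dobj"),
]


def parse_to_graph(parse):
    """Return (nodes, edges) for one sentence.

    nodes: list of (text, pos) pairs, one per token.
    edges: list of (head_index, child_index, dependency_label) triples.
    """
    nodes = [(text, pos) for text, pos, _, _ in parse]
    edges = [
        (head, child, dep)
        for child, (_, _, head, dep) in enumerate(parse)
        if head != child  # skip the root's self-reference
    ]
    return nodes, edges


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit ball."""
    def sq_norm(x):
        return sum(c * c for c in x)

    diff = sq_norm([a - b for a, b in zip(u, v)])
    denom = (1.0 - sq_norm(u)) * (1.0 - sq_norm(v))
    return math.acosh(1.0 + 2.0 * diff / denom)


nodes, edges = parse_to_graph(PARSE)
```

Distances in the Poincaré ball grow rapidly toward the boundary, which is one common explanation for why hyperbolic embeddings suit tree-like inputs such as dependency parses.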
TartanTritons at SemEval-2025 Task 10: Multilingual Hierarchical Entity Classification and Narrative Reasoning using Instruct-Tuned LLMs
Raghav R | Adarsh Prakash Vemali | Darpan Aswal | Rahul Ramesh | Parth Tusham | Pranaya Rishi
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
In today’s era of abundant online news, tackling the spread of deceptive content and manipulative narratives has become crucial. This paper details our system for SemEval-2025 Task 10, focusing on Subtasks 1 (Entity Framing) and 3 (Narrative Extraction). We instruct-tuned a quantized version of Microsoft’s Phi-4 model, incorporating prompt engineering techniques to enhance performance. Our approach involved experimenting with various models, including LLaMA, Phi-4, RoBERTa, and XLM-R, using both quantized large models and non-quantized small models. To improve accuracy, we employed structured prompts, iterative refinement with retry mechanisms, and integrated label taxonomy information. For Subtask 1, we also fine-tuned a RoBERTa classifier to predict main entity roles before classifying the fine-grained roles with Phi-4 for the English language. For Subtask 3, we instruct-tuned Phi-4 to generate structured explanations, incorporating details about the article and its dominant narrative. Our system achieves competitive results in Hindi and Russian for Subtask 1.
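The combination of structured prompts, taxonomy information, and retries mentioned in this abstract might look roughly like the sketch below. Everything in it is an assumption for illustration: the `TAXONOMY` labels are placeholders rather than the task's official roles, `classify_with_retry` is a hypothetical helper, and the model call is replaced by a plain callable so the example runs without any LLM.

```python
# Placeholder taxonomy (hypothetical labels, not the official task taxonomy):
# each main role maps to the fine-grained roles allowed under it.
TAXONOMY = {
    "RoleA": ["FineA1", "FineA2"],
    "RoleB": ["FineB1"],
}


def classify_with_retry(generate, main_role, max_retries=3):
    """Ask `generate` for a fine-grained label, retrying on invalid output.

    `generate` is any callable that maps a prompt string to a reply string;
    in a real system it would wrap a model call. Returns the first reply
    that is an allowed label, or None if every attempt fails.
    """
    allowed = TAXONOMY[main_role]
    prompt = (
        f"Main role: {main_role}\n"
        f"Allowed fine-grained roles: {allowed}\n"
        "Answer with exactly one label."
    )
    for _ in range(max_retries):
        answer = generate(prompt).strip()
        if answer in allowed:
            return answer
        # Feed the failure back so the next attempt can correct itself.
        prompt += f"\n'{answer}' is not an allowed label. Choose one of {allowed}."
    return None


# Stand-in for a model that answers incorrectly once, then correctly.
replies = iter(["Unknown", "FineA2"])
label = classify_with_retry(lambda prompt: next(replies), "RoleA")
```

Restricting the allowed labels to those under the predicted main role mirrors the two-stage setup the abstract describes for English, where a classifier predicts the main role before the fine-grained role is chosen.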
ScottyPoseidon at SemEval-2025 Task 8: LLM-Driven Code Generation for Zero-Shot Question Answering on Tabular Data
Raghav R | Adarsh Prakash Vemali | Darpan Aswal | Rahul Ramesh | Ayush Bhupal
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
Tabular Question Answering (QA) is crucial for enabling automated reasoning over structured data, facilitating efficient information retrieval and decision-making across domains like finance, healthcare, and scientific research. This paper describes our system for SemEval-2025 Task 8 on Question Answering over Tabular Data, focusing on the DataBench QA and DataBench Lite QA subtasks. Our approach generates Python code with Large Language Models (LLMs) to extract answers from tabular data in a zero-shot setting. We investigate both multi-step Chain-of-Thought (CoT) and unified LLM approaches; the latter demonstrates superior performance by minimizing error propagation and enhancing system stability. Our system prioritizes computational efficiency and scalability by minimizing the input data provided to the LLM, which helps the model contextualize information effectively. We achieve this by sampling a minimal set of rows from the dataset and executing the generated code externally with Python and Pandas. Our system achieved the highest accuracy among all small open-source models, ranking 1st in both subtasks.
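The pipeline described in this abstract (sample a few rows, prompt for code, run the code outside the model) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' system: a list of dicts stands in for a Pandas DataFrame to keep the example dependency-free, `fake_llm` is a hypothetical stub that returns fixed code instead of calling a model, and all function names are invented for the example.

```python
import random


def sample_rows(rows, k=3, seed=0):
    """Pick at most k rows so the prompt stays small."""
    if len(rows) <= k:
        return list(rows)
    return random.Random(seed).sample(rows, k)


def build_prompt(question, sampled):
    """Show the model only the column names and a few sample rows."""
    columns = list(sampled[0].keys())
    return (
        f"Columns: {columns}\n"
        f"Sample rows: {sampled}\n"
        f"Question: {question}\n"
        "Write a Python function answer(rows) that returns the answer."
    )


def fake_llm(prompt):
    """Hypothetical stand-in for an LLM call; always returns the same code."""
    return "def answer(rows):\n    return max(r['price'] for r in rows)\n"


def run_generated_code(code, rows):
    """Execute generated code and apply it to the full table.

    exec is used here only for illustration; untrusted generated code
    should be run in a sandboxed process in any real deployment.
    """
    namespace = {}
    exec(code, namespace)
    return namespace["answer"](rows)


TABLE = [
    {"item": "a", "price": 3},
    {"item": "b", "price": 9},
    {"item": "c", "price": 5},
    {"item": "d", "price": 1},
]

prompt = build_prompt("What is the highest price?", sample_rows(TABLE))
result = run_generated_code(fake_llm(prompt), TABLE)
```

The model sees only the sampled rows, while the generated function runs over the whole table, so prompt size stays constant as the dataset grows.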