Vineet Saravanan
2025
cocoa at SemEval-2025 Task 10: Prompting vs. Fine-Tuning: A Multilevel Approach to Propaganda Classification
Vineet Saravanan | Steven Wilson
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
The increasing sophistication of natural language processing models has facilitated advancements in hierarchical text classification, particularly in the domain of propaganda detection. This paper presents our submission to SemEval 2025 Task 10, Subtask 1, which focuses on multilevel text classification for identifying and categorizing propaganda narratives in online news. We investigate two primary approaches: (1) prompt-based classification using large language models (LLMs) like GPT, which offers flexibility but struggles with hierarchical categorization, and (2) fine-tuning transformer-based models, where we employ a hierarchical structure: one model classifies the main propaganda category, followed by three separate models specializing in subcategory classification. Our results indicate that while LLMs demonstrate some generalization ability, fine-tuned models significantly outperform them in accuracy and reliability, reinforcing the importance of task-specific supervised learning for propaganda detection. Additionally, we discuss challenges related to data sparsity in subclassification and explore potential enhancements such as multi-task learning and hierarchical loss functions. Our findings contribute to the broader field of automated propaganda detection and emphasize the value of structured classification models in combating misinformation. All code and data used in our experiments will be made publicly available on our GitHub.
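To make the hierarchical setup concrete, here is a minimal sketch of the routing idea the abstract describes: one model predicts the main narrative category, then a per-category model predicts the subcategory. The label names and the use of distilbert-base-uncased as an untrained stand-in checkpoint are assumptions for illustration; the paper's fine-tuned weights and exact label sets are not reproduced here.

```python
# Sketch: hierarchical two-stage classification (assumed labels/checkpoints).
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

MAIN = ["URW", "CC", "Other"]  # assumed top-level narrative categories
SUBS = {"URW": ["sub_a", "sub_b"], "CC": ["sub_c", "sub_d"], "Other": ["other"]}

def make_clf(labels):
    # In practice each model would be fine-tuned on its slice of the data;
    # here untrained heads simply demonstrate the routing logic.
    model = AutoModelForSequenceClassification.from_pretrained(
        "distilbert-base-uncased",
        num_labels=len(labels),
        id2label=dict(enumerate(labels)),
        label2id={l: i for i, l in enumerate(labels)},
    )
    tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    return pipeline("text-classification", model=model, tokenizer=tok)

main_clf = make_clf(MAIN)
sub_clfs = {cat: make_clf(labels) for cat, labels in SUBS.items()}

def classify(text: str) -> tuple[str, str]:
    """Return (main category, subcategory) for one news article."""
    main = main_clf(text, truncation=True)[0]["label"]
    sub = sub_clfs[main](text, truncation=True)[0]["label"]
    return main, sub
```

One design consequence of this structure is that each subcategory model only ever sees examples from its own branch, which is why the abstract's point about data sparsity in subclassification matters.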
2024
OUNLP at SemEval-2024 Task 9: Retrieval-Augmented Generation for Solving Brain Teasers with LLMs
Vineet Saravanan | Steven Wilson
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
The advancement of natural language processing has given rise to a variety of large language models (LLMs) with capabilities extending into the realm of complex problem-solving, including brainteasers that challenge not only linguistic fluency but also logical reasoning. This paper documents our submission to the SemEval 2024 Brainteaser task, in which we investigate the performance of state-of-the-art LLMs, such as GPT-3.5, GPT-4, and the Gemini model, on a diverse set of brainteasers using prompt engineering as a tool to enhance the models’ problem-solving abilities. We experimented with a series of structured prompts ranging from basic to those integrating task descriptions and explanations. Through a comparative analysis, we sought to determine which combinations of model and prompt yielded the highest accuracy in solving these puzzles. Our findings provide a snapshot of the current landscape of AI problem-solving and highlight the nuanced nature of LLM performance, influenced by both the complexity of the tasks and the sophistication of the prompts employed.
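The comparison described above amounts to posing the same multiple-choice brainteaser under increasingly structured prompts and measuring accuracy per model/prompt pair. The sketch below shows that pattern; the prompt wording and the `ask` helper are illustrative assumptions, not the authors' exact templates.

```python
# Sketch: comparing prompt variants on one brainteaser (assumed templates).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPTS = {
    "basic": "{question}\nOptions: {options}\nAnswer with one option.",
    "task_desc": (
        "You are solving a lateral-thinking brainteaser; the obvious "
        "reading is usually wrong.\n{question}\nOptions: {options}\n"
        "Answer with one option."
    ),
    "explain": (
        "You are solving a lateral-thinking brainteaser.\n{question}\n"
        "Options: {options}\nExplain your reasoning briefly, then state "
        "the answer."
    ),
}

def ask(model: str, style: str, question: str, options: list[str]) -> str:
    """Query one (model, prompt style) pair for a single puzzle."""
    prompt = PROMPTS[style].format(question=question, options="; ".join(options))
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content
```

Sweeping `ask` over a grid of models and prompt styles, then scoring the returned answers, yields the kind of model-by-prompt accuracy comparison the abstract reports.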
2023
Mr-wallace at SemEval-2023 Task 5: Novel Clickbait Spoiling Algorithm Using Natural Language Processing
Vineet Saravanan | Steven Wilson
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
This paper presents a model for clickbait spoiling, which aims at generating short texts that satisfy the curiosity induced by a clickbait post. The model is split into two tasks: identifying the clickbait type and spoiling the clickbait. The first task is to classify the spoiler type that the clickbait post warrants, and the second task is to generate the spoiler for the clickbait post. The model utilizes the Distilbert-base-uncased model for the first task and the Bert-base-uncased model for the second task. The trained model is optimized through trial and error over different model selections and hyperparameters, and the results are presented in a confusion matrix. The main reason we utilized Distilbert-base-uncased is that it analyzes words in the context of what's around them. The objective of this model is to save readers time and spoil the clickbait of different articles they may see on platforms like Twitter and Reddit.
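A minimal sketch of the two-stage design described above: a Distilbert-base-uncased classifier predicts the spoiler type, and a Bert-base-uncased extractive question-answering head stands in for the spoiling step, treating the clickbait post as the question and the linked article as the context. The spoiler-type labels are the task's, but the untrained stand-in checkpoints and this QA formulation are assumptions; the paper's fine-tuned weights are not shown.

```python
# Sketch: spoiler-type classification followed by spoiler extraction.
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

SPOILER_TYPES = ["phrase", "passage", "multi"]  # the task's spoiler types

clf_model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased",
    num_labels=len(SPOILER_TYPES),
    id2label=dict(enumerate(SPOILER_TYPES)),
)
clf_tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
type_clf = pipeline("text-classification", model=clf_model, tokenizer=clf_tok)

# Extractive QA as the spoiling step: the post is the question, the
# article is the context from which the spoiler span is pulled.
spoiler_qa = pipeline("question-answering", model="bert-base-uncased")

def spoil(post: str, article: str) -> tuple[str, str]:
    """Return (predicted spoiler type, extracted spoiler text)."""
    spoiler_type = type_clf(post, truncation=True)[0]["label"]
    answer = spoiler_qa(question=post, context=article)
    return spoiler_type, answer["answer"]
```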