Parthiban Mohankumar


2025

This paper presents an approach to Task 10 of SemEval 2025, which focuses on summarizing English news articles using a given dominant narrative. The dataset comprises news articles on the Russia-Ukraine war and climate change, introducing challenges related to bias, information compression, and contextual coherence. Transformer-based models, specifically BART variants, are utilized to generate concise and coherent summaries. Our team TechSSN, achieved 4th place on the official test leaderboard with a BERTScore of 0.74203, employing the DistilBART-CNN-12-6 model.

2024

The increase in the popularity of code mixed languages has resulted in the need to engineer language models for the same . Unlike pure languages, code-mixed languages lack clear grammatical structures, leading to ambiguous sentence constructions. This ambiguity presents significant challenges for natural language processing tasks, including syntactic parsing, word sense disambiguation, and language identification. This paper focuses on emotion recognition of conversations in Hinglish, a mix of Hindi and English, as part of Task 10 of SemEval 2024. The proposed approach explores the usage of standard machine learning models like SVM, MNB and RF, and also BERT-based models for Hindi-English code-mixed data- namely, HingBERT, Hing mBERT and HingRoBERTa for subtask A.