Jatin Bedi
2026
Team Paradise at #SMM4H-HeaRD 2026: Multi-Task Approaches for Social Media Health Mining
Dhruv Goyal | Ishita Gupta | Jatin Bedi
Proceedings of the 11th Social Media Mining for Health Research and Applications (SMM4H-HeaRD 2026) Workshop and Shared Tasks
Dhruv Goyal | Ishita Gupta | Jatin Bedi
Proceedings of the 11th Social Media Mining for Health Research and Applications (SMM4H-HeaRD 2026) Workshop and Shared Tasks
We present Team Paradise’s systems for three tasks in the SMM4H-HeaRD 2026 shared task: multilingual adverse drug event detection (Task 1), influenza vaccine effectiveness estimation via two-subtask classification (Task 3), and opioid impact span extraction (Task 7). For Task 1, threshold-only ablation on XLMRoBERTa-large achieves a macro-F1 of 0.597, exceeding the field mean (0.547) by +0.050. For Task 3, a three-stage hybrid pipeline combining twitter-RoBERTa-base-2022 with rule-based post-processing achieves Micro-F1 0.8434 (Subtask 1: vaccination status) and 0.8936 (Subtask 2: test results). For Task 7, RoBERTa-large with CRF decoding and sliding-window inference obtains relaxed F1 0.60 despite severe train-test distributional shift Across tasks, we identify class imbalance, temporal ambiguity, and platform heterogeneity as central challenges.
Team TIET at #SMM4H-HeaRD 2026: Fine-tuned Biomedical Transformers with Language-Balanced Sampling for Patient Metadata and Multilingual ADE Detection
Divrose Kaur | Jatin Bedi | Jasmeet Singh
Proceedings of the 11th Social Media Mining for Health Research and Applications (SMM4H-HeaRD 2026) Workshop and Shared Tasks
Divrose Kaur | Jatin Bedi | Jasmeet Singh
Proceedings of the 11th Social Media Mining for Health Research and Applications (SMM4H-HeaRD 2026) Workshop and Shared Tasks
We present Team TIET’s systems for two shared tasks at #SMM4H-HeaRD 2026: Task 5 (detection of patient metadata in SARS-CoV-2 sequencing papers) and Task 1 (multilingual adverse drug event detection across six languages plus an unseen Farsi subset). For Task 5 we explore iterative LLM prompting followed by fine-tuning BiomedBERT-base with weighted cross-entropy loss and probability threshold optimization, achieving F1 = 0.760 on the official test set (above the competition mean of 0.729). For Task 1 we fine-tune XLM-RoBERTa-base with a combined language- and class-balanced sampling strategy and per-language threshold tuning, achieving macro F1 = 0.497 overall (0.608 excluding the unseen Farsi subset). We report empirical findings on BERT+LLM ensemble failure with bimodal probability distributions, the superiority of base over large model variants under limited data, and the importance of language-balanced gradient contribution in multilingual classification.
blue at SMM4H-HeaRD 2026: Class-Weighted Transformer Ensembles with Structured Decoding and Chain-of-Thought Blending across Six Health NLP Shared Tasks
Krish Sharma | Rhea Singhal | Jatin Bedi
Proceedings of the 11th Social Media Mining for Health Research and Applications (SMM4H-HeaRD 2026) Workshop and Shared Tasks
Krish Sharma | Rhea Singhal | Jatin Bedi
Proceedings of the 11th Social Media Mining for Health Research and Applications (SMM4H-HeaRD 2026) Workshop and Shared Tasks
We describe team blue’s participation across six SMM4H-HeaRD 2026 shared tasks spanning multilingual adverse drug event detection (Task 1), influenza vaccine effectiveness estimation (Task 3), patient metadata classification (Task 5), TNM cancer staging (Task 6), opioid impact span detection (Task 7), and multilingual clinical NER with cross-lingual annotation projection (Task 8). Despite the heterogeneity of these tasks, binary, multi-class, multi-label, and sequence-labelling, our systems share three recurring design principles: (i) inverse-frequency class weighting to handle severe imbalance, (ii) multi-seed and/or multi-backbone ensembling to reduce variance, and (iii) post-hoc calibration of decision boundaries. Key results include micro-F1 of 0.990 on TNM staging (Task 6), 0.872/0.918 on flu vaccination/test classification surpassing the 70B CoT baseline on vaccination (Task 3), F1 of 0.764 on patient metadata approaching the fine-tuning benchmark of 0.776 (Task 5), and competitive performance on ADE detection (Task 1, F1 = 0.580), opioid spans (Task 7, relaxed F1 = 0.59), and multilingual clinical NER (Task 8, strict F1 0.20–0.41 across 7 languages).
blue at SemEval-2026 Task 5: NarrBERT : Narrative-Aware BERT for Word Sense Disambiguation
Rhea Singhal | Krish Sharma | Lakksh Sharma | Jatin Bedi
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
Rhea Singhal | Krish Sharma | Lakksh Sharma | Jatin Bedi
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
This paper outlines the method submitted by team blue for the SemEval-2026 Task 5: Rating Plausibility of Word Senses in Ambiguous Sentences through Narrative (AmbiStory). The task requires predicting reasonable scores that match human thoughts and judgments instead of just picking a single correct sense as the output. This means that contextual reasoning with fine-grain contextual modeling is vital. In order to tackle this problem, we suggest a BERT-based cross-encoder regression model. This model encodes the entire narrative context, which includes the precontext, the ambiguous sentence, and the ending, along with candidate sense definitions and example usages. Unlike bi-encoder sentence-level methods, our model allows for token-level interaction between story cues and sense meanings. This interaction helps capture subtle narrative disambiguation signals. We conduct a systematic exploration of model architectures and training strategies, progressing from a sentence-transformer baseline to an optimised BERT cross-encoder. On the development set, our best configuration achieves a Spearman rank correlation of 0.66. On the official test set, the system achieves a Spearman correlation of 0.4866 and an Accuracy-within-Standard-Deviation of 0.6613, substantially outperforming sentence-transformer bi-encoder baselines.
blue at SemEval-2026 Task 4: Synergizing Long-Context Reranking with Semantic Similarity for Narrative Alignment
Krish Sharma | Lakksh Sharma | Rhea Singhal | Jatin Bedi
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
Krish Sharma | Lakksh Sharma | Rhea Singhal | Jatin Bedi
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
This paper describes the system submitted by team blue for SemEval-2026 Task 4: Narrative Story Similarity and Narrative Representation Learning, with a primary focus on the Pairwise Similarity subtask (Track A). The core challenge of this task lies in identifying deep structural alignments between stories, which is fundamentally hindered by the restricted context windows of standard transformer architecturesthat truncate narratives before reaching critical plot resolutions. To overcome this context bottleneck, we propose a hybrid ensemble architecture designed to capture extended narrative arcs. Our approach synergizes a cross-encoder (Jina Reranker v2), which processes long inputs via a sliding-window strategy over 1,024-token chunks, to evaluate the global "course of action," with a semantic bi-encoder (RoBERTa-Large) to validate local tonal consistency. This dual-stream system achieved a Pearson correlation score of 0.63, demonstrating that processing narrative content beyond the 512-token truncation boundary is strictly necessary for accurate pairwise narrative comparison.
0704mis at SemEval-2026 Task 11: Single-Call Joint Abstraction for Robust Neuro-Symbolic Retrieval
Ishita Gupta | Dhruv Goyal | Jatin Bedi
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
Ishita Gupta | Dhruv Goyal | Jatin Bedi
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
Neuro-symbolic Basis for Robust Syllogistic Reasoning Under Distractors.We present our submission to SemEval-2026 Task 11 Subtasks 2 and 4, on syllogistic premise retrieval with distractors. Our system is based on a robustness-first neuro-symbolic pipeline. The key innovation is single-call joint abstraction: rather than parsing all statements independently, one LLM call jointly abstracts all premises and the conclusion into categorical logical forms (A/E/I/O) where symbolic (X/Y/Z) mappings are globally consistent. This allows reliable detection of the shared middle term needed for syllogistic validation. Parsed forms are passed through an exhaustive O(n²) premise-pair search with deterministic validation against the 24 valid Aristotelian syllogistic forms via constant time lookup. Ablation studies show that more theoretically sophisticated variants degrade performance when logical-form extraction is the primary bottleneck. Our approach achieves competitive rankings in both English and multilingual settings while remaining simple, deterministic, and content-invariant.
Dual-View Consistency Testing for Content-Invariant Multilingual Syllogistic Reasoning
Ishita Gupta | Dhruv Goyal | Jatin Bedi
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
Ishita Gupta | Dhruv Goyal | Jatin Bedi
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
Team 0704mis addressed the SemEval-2026 Task 11 Subtask 3 by building a neuro-symbolic system designed for multilingual syllogistic validity classification across 12 typologically diverse languages. The process involves a neural parser that extracts logical forms from text, which are then validated by a symbolic verifier implementing the full set of 24 valid Aristotelian forms via a hash lookup.Our standout contribution is the dual-view consistency test: the system compares a "native" parse of the original text with a "masked" version where content terms are replaced by abstract symbols (X, Y, Z), only proceeding with high confidence if both views agree. By comparing how the model interprets the same logic in two different formats, the system can detect if the model’s reasoning changes when the context shifts from real-world objects to abstract symbols. The primary goal is to combat belief bias, the human-like tendency of LLMs to accept invalid arguments if the conclusion sounds true, or reject valid arguments if the conclusion sounds false. By enforcing this dual-view check, we found that symbol abstraction (View B) acts as a structural regularizer, forcing the model to ignore semantic interference and focus on the relationship between terms.
Lakksh at SemEval-2026 Task 11(1 2): Neuro-Symbolic Decomposition to Mitigate Content Bias in Syllogistic Reasoning
Lakksh Sharma | Krish Sharma | Jatin Bedi
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
Lakksh Sharma | Krish Sharma | Jatin Bedi
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
Syllogistic reasoning is the ability to distinguish logical validity from semantic plausibility — a setting in which LLMs succumb to frequent content bias by conflating the two. The result is a characteristic failure to recognize logically valid arguments with highly implausible conclusions and logically invalid but semantically plausible arguments. This paper introduces a neuro-symbolic system that avoids this behavior by design: neural structure extraction is strictly separated from symbolic validity checking. A T5-Small parser is trained only on synthetic nonsense-symbol syllogisms, ensuring that the structural parse is learned in the absence of real-world semantics. Validity checking is performed by a deterministic symbolic kernel operating on extracted logical form alone, ensuring that semantic content cannot influence the final call. In binary validity classification, the system achieves 97.38% accuracy with a Total Content Effect of 3.10; in the retrieval setting, it achieves 82.11% accuracy with 99.47% F1 on premise identification. Ablation experiments show that formal theorem proving via NL-to-Z3 translation actually increases content bias due to leakage in intermediate representations. The results recommend architectural separation as a promising content-robustness strategy for syllogistic reasoning.
Paradise at SemEval-2026 Task 5: On the Limitations of Surface-Level Features for Graded Word Sense Plausibility Prediction
Dhruv Goyal | Ishita Gupta | Jatin Bedi
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
Dhruv Goyal | Ishita Gupta | Jatin Bedi
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
This paper introduces a simple approach for predicting how plausible a word sense is in short narratives where meaning is ambiguous. We use 13 hand-crafted features, including text statistics, word-level similarity computed using basic set-based comparisons, and measures of annotator disagreement. Five diverse and largely independent traditional machine learning models are combined using a weighted ensemble with minimal tuning. Despite theoretical grounding in classical disambiguation methods, our system achieves essentially random performance, with Spearman correlation (ρ) of −0.038 and accuracy within standard deviation of 0.542 on the official test set. This result demonstrates that surface-level lexical features, while interpretable, are insufficient for graded sense plausibility prediction without deep semantic representations. By selecting features inspired by classical word sense disambiguation techniques and incorporating signals derived from human disagreement, our model produces plausibility predictions that are largely interpretable. This negative result provides important baselines and insights for future work on graded word sense disambiguation.
Paradise at SemEval-2026 Task 12: Leveraging Instruction-Tuned Large Language Models with Chain-of-Thought Prompting for Abductive Event Reasoning
Dhruv Goyal | Ishita Gupta | Jatin Bedi
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
Dhruv Goyal | Ishita Gupta | Jatin Bedi
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
We present Paradise, our system for SemEval-2026 Task 12: Abductive Event Reasoning, which identifies plausible direct causes of real-world English-language events using retrieved contextual documents. Our approach employs Qwen2.5-7B-Instruct, a 7-billion-parameter instruction-tuned language model combined with carefully engineered chain-of-thought prompting, requiring no task-specific fine-tuning or training-data supervision (prompt components were selected using the development set). The system achieves a score of 0.79 on the official 612-instance test set by integrating explicit causal-inference rules, 4,000-character document context windows, and greedy decoding. Analysis reveals that conservative prediction patterns, 87.1% single-label and 36.9% Option D, effectively exploit the asymmetric scoring metric. Ablation studies confirm that document context contributes +6.4 points, chain-of-thought reasoning +5.3 points, and explicit causal rules +3.1 points to development performance. Our code is publicly available at https://github.com/DhruvGoyal404/semeval2026-task12.
Adapting AutoARGUE for Automatic Report Evaluation under Missing Citation Annotations
Divrose Kaur | Jatin Bedi | Jasmeet Singh
Proceedings of the 1st Workshop on Multilingual Report Generation via Retrieval Augmented Generation (RAG4Reports 2026)
Divrose Kaur | Jatin Bedi | Jasmeet Singh
Proceedings of the 1st Workshop on Multilingual Report Generation via Retrieval Augmented Generation (RAG4Reports 2026)
We adapt the AutoARGUE framework (Walden et al., 2026) for Task A.2 of RAG4Reports 2026, which requires ranking 57 report generation systems across 68 topics using automated evaluation. The RAGTIME-1 corpus poses a fundamental challenge: all nugget annotations use a no-reference-doc sentinel rather than ground-truth document citations, rendering the original citation-relevance gating inoperable. We address this with three adaptations: automatic sentinel detection with forced direct LLM-based nugget matching; a WEAK POSITIVE partial credit mechanism for sentences that correctly answer nuggets but lack attesting citations; and a report-level request alignment check. Our nugget_coverage_weighted metric achieves the highest topic-level Pearson correlation (r=0.599) of any non-coordinator submission, closely approaching the coordinator baseline (r=0.607).
EFSG: Evidence-First Structured Generation for Multilingual RAG Report Generation
Shaurya Gupta | Jatin Bedi
Proceedings of the 1st Workshop on Multilingual Report Generation via Retrieval Augmented Generation (RAG4Reports 2026)
Shaurya Gupta | Jatin Bedi
Proceedings of the 1st Workshop on Multilingual Report Generation via Retrieval Augmented Generation (RAG4Reports 2026)
We describe EFSG (Evidence-First Structured Generation), our submission to Task B of the RAG4Reports@ACL 2026 shared task. Standard retrieval-augmented generation pipelines allow generation models to write from parametric memory and attach citations retroactively: a behaviour we term post-rationalization. EFSG addresses this structurally through a phase boundary: all evidence is retrieved, extracted, and sealed into a fact pool before any generation begins; each sentence then sees only its single committed source passage. Our best run (t5100k doc corpus) achieved sentence_support of 0.612 and nugget_coverage of 0.126 (F1 = 0.182).
2025
Thapar Titan/s : Fine-Tuning Pretrained Language Models with Contextual Augmentation for Mistake Identification in Tutor–Student Dialogues
Harsh Dadwal | Sparsh Rastogi | Jatin Bedi
Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)
Harsh Dadwal | Sparsh Rastogi | Jatin Bedi
Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)
This paper presents Thapar Titan/s’ submission to the BEA 2025 Shared Task on Pedagogical Ability Assessment of AI-powered Tutors. The shared task consists of five subtasks; our team ranked 18th in Mistake Identification, 15th in Mistake Location, and 18th in Actionability. However, in this paper, we focus exclusively on presenting results for Task 1: Mistake Identification, which evaluates a system’s ability to detect student mistakes.Our approach employs contextual data augmentation using a RoBERTa based masked language model to mitigate class imbalance, supplemented by oversampling and weighted loss training. Subsequently, we fine-tune three separate classifiers: RoBERTa, BERT, and DeBERTa for three-way classification aligned with task-specific annotation schemas. This modular and scalable pipeline enables a comprehensive evaluation of tutor feedback quality in educational dialogues.
2024
SMM4H’24 Task6 : Extracting Self-Reported Age with LLM and BERTweet: Fine-Grained Approaches for Social Media Text
Jaskaran Singh | Jatin Bedi | Maninder Kaur
Proceedings of the 9th Social Media Mining for Health Research and Applications (SMM4H 2024) Workshop and Shared Tasks
Jaskaran Singh | Jatin Bedi | Maninder Kaur
Proceedings of the 9th Social Media Mining for Health Research and Applications (SMM4H 2024) Workshop and Shared Tasks
The paper presents two distinct approaches to Task 6 of the SMM4H’24 workshop: extracting self-reported exact age information from social media posts across platforms. This research task focuses on developing methods for automatically extracting self-reported ages from posts on two prominent social media platforms: Twitter (now X) and Reddit. The work leverages two ways, one Mistral-7B-Instruct-v0.2 Large Language Model (LLM) and another pre-trained language model BERTweet, to achieve robust and generalizable age classification, surpassing limitations of existing methods that rely on predefined age groups. The proposed models aim to advance the automatic extraction of self-reported exact ages from social media posts, enabling more nuanced analyses and insights into user demographics across different platforms.
Transformers at #SMM4H 2024: Identification of Tweets Reporting Children’s Medical Disorders And Effects of Outdoor Spaces on Social Anxiety Symptoms on Reddit Using RoBERTa
Kriti Singhal | Jatin Bedi
Proceedings of the 9th Social Media Mining for Health Research and Applications (SMM4H 2024) Workshop and Shared Tasks
Kriti Singhal | Jatin Bedi
Proceedings of the 9th Social Media Mining for Health Research and Applications (SMM4H 2024) Workshop and Shared Tasks
With the widespread increase in the use of social media platforms such as Twitter, Instagram, and Reddit, people are sharing their views on various topics. They have become more vocal on these platforms about their views and opinions on the medical challenges they are facing. This data is a valuable asset of medical insights in the study and research of healthcare. This paper describes our adoption of transformer-based approaches for tasks 3 and 5. For both tasks, we fine-tuned large RoBERTa, a BERT-based architecture, and achieved a highest F1 score of 0.413 and 0.900 in tasks 3 and 5, respectively.
Transformers at SemEval-2024 Task 5: Legal Argument Reasoning Task in Civil Procedure using RoBERTa
Kriti Singhal | Jatin Bedi
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Kriti Singhal | Jatin Bedi
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Legal argument reasoning task in civil procedure is a new NLP task utilizing a dataset from the domain of the U.S. civil procedure. The task aims at identifying whether the solution to a question in the legal domain is correct or not. This paper describes the team “Transformers” submission to the Legal Argument Reasoning Task in Civil Procedure shared task at SemEval-2024 Task 5. We use a BERT-based architecture for the shared task. The highest F1-score score and accuracy achieved was 0.6172 and 0.6531 respectively. We secured the 13th rank in the Legal Argument Reasoning Task in Civil Procedure shared task.
Transformers@LT-EDI-EACL2024: Caste and Migration Hate Speech Detection in Tamil Using Ensembling on Transformers
Kriti Singhal | Jatin Bedi
Proceedings of the Fourth Workshop on Language Technology for Equality, Diversity, Inclusion
Kriti Singhal | Jatin Bedi
Proceedings of the Fourth Workshop on Language Technology for Equality, Diversity, Inclusion
In recent years, there has been a persistent focus on developing systems that can automatically identify the hate speech content circulating on diverse social media platforms. This paper describes the team “Transformers” submission to the Caste and Migration Hate Speech Detection in Tamil shared task by LT-EDI 2024 workshop at EACL 2024. We used an ensemble approach in the shared task, combining various transformer-based pre-trained models using majority voting. The best macro average F1-score achieved was 0.82. We secured the 1st rank in the Caste and Migration Hate Speech in Tamil shared task.
Transformers@DravidianLangTech-EACL2024: Sentiment Analysis of Code-Mixed Tamil Using RoBERTa
Kriti Singhal | Jatin Bedi
Proceedings of the Fourth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
Kriti Singhal | Jatin Bedi
Proceedings of the Fourth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
In recent years, there has been a persistent focus on developing systems that can automatically identify the hate speech content circulating on diverse social media platforms. This paper describes the team Transformers’ submission to the Caste/Immigration Hate Speech Detection in Tamil shared task by LT-EDI 2024 workshop at EACL 2024. We used an ensemble approach in the shared task, combining various transformer-based pre-trained models using majority voting. The best macro average F1-score achieved was 0.82. We secured the 1st rank in the Caste/Immigration Hate Speech in Tamil shared task.
Transformers at HSD-2Lang 2024: Hate Speech Detection in Arabic and Turkish Tweets Using BERT Based Architectures
Kriti Singhal | Jatin Bedi
Proceedings of the 7th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2024)
Kriti Singhal | Jatin Bedi
Proceedings of the 7th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2024)
Over the past years, researchers across the globe have made significant efforts to develop systems capable of identifying the presence of hate speech in different languages. This paper describes the team Transformers’ submission to the subtasks: Hate Speech Detection in Turkish across Various Contexts and Hate Speech Detection with Limited Data in Arabic, organized by HSD-2Lang in conjunction with CASE at EACL 2024. A BERT based architecture was employed in both the subtasks. We achieved an F1 score of 0.63258 using XLM RoBERTa and 0.48101 using mBERT, hence securing the 6th rank and the 5th rank in the first and the second subtask, respectively.
2023
MLModeler5 at SemEval-2023 Task 3: Detecting the Category and the Framing Techniques in Online News in a Multi-lingual Setup
Arjun Khanchandani | Nitansh Jain | Jatin Bedi
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
Arjun Khanchandani | Nitansh Jain | Jatin Bedi
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
System Description Paper for Task 3 Subtask 1 and 2 of Semeval 2023. The paper describes our approach to handling the News Genre Categorisation and Framing Detection using RoBERTa and ALBERT models.
MLModeler5 @ Causal News Corpus 2023: Using RoBERTa for Casual Event Classification
Amrita Bhatia | Ananya Thomas | Nitansh Jain | Jatin Bedi
Proceedings of the 6th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text
Amrita Bhatia | Ananya Thomas | Nitansh Jain | Jatin Bedi
Proceedings of the 6th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text
Identifying cause-effect relations plays an integral role in the understanding and interpretation of natural languages. Furthermore, automated mining of causal relations from news and text about socio-political events is a stepping stone in gaining critical insights, including analyzing the scale, frequency and trends across timelines of events, as well as anticipating future ones. The Shared Task 3, part of the 6th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE @ RANLP 2023), involved the task of Event Causality Identification with Causal News Corpus. We describe our approach to Subtask 1, dealing with causal event classification, a supervised binary classification problem to annotate given event sentences with whether they contained any cause-effect relations. To help achieve this task, a BERT based architecture - RoBERTa was implemented. The results of this model are validated on the dataset provided by the organizers of this task.
2022
ARGUABLY@SMM4H’22: Classification of Health Related Tweets using Ensemble, Zero-Shot and Fine-Tuned Language Model
Prabsimran Kaur | Guneet Kohli | Jatin Bedi
Proceedings of the Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task
Prabsimran Kaur | Guneet Kohli | Jatin Bedi
Proceedings of the Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task
With the increase in the use of social media, people have become more outspoken and are using platforms like Reddit, Facebook, and Twitter to express their views and share the medical challenges they are facing. This data is a valuable source of medical insight and is often used for healthcare research. This paper describes our participation in Task 1a, 2a, 2b, 3, 5, 6, 7, and 9 organized by SMM4H 2022. We have proposed two transformer-based approaches to handle the classification tasks. The first approach is fine-tuning single language models. The second approach is ensembling the results of BERT, RoBERTa, and ERNIE 2.0.
Raccoons at SemEval-2022 Task 11: Leveraging Concatenated Word Embeddings for Named Entity Recognition
Atharvan Dogra | Prabsimran Kaur | Guneet Kohli | Jatin Bedi
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
Atharvan Dogra | Prabsimran Kaur | Guneet Kohli | Jatin Bedi
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
Named Entity Recognition (NER), an essential subtask in NLP that identifies text belonging to predefined semantics such as a person, location, organization, drug, time, clinical procedure, biological protein, etc. NER plays a vital role in various fields such as informationextraction, question answering, and machine translation. This paper describes our participating system run to the Named entity recognitionand classification shared task SemEval-2022. The task is motivated towards detecting semantically ambiguous and complex entities in shortand low-context settings. Our team focused on improving entity recognition by improving the word embeddings. We concatenated the word representations from State-of-the-art language models and passed them to find the best representation through a reinforcement trainer. Our results highlight the improvements achieved by various embedding concatenations.
Adversarial Perturbations Augmented Language Models for Euphemism Identification
Guneet Kohli | Prabsimran Kaur | Jatin Bedi
Proceedings of the 3rd Workshop on Figurative Language Processing (FLP)
Guneet Kohli | Prabsimran Kaur | Jatin Bedi
Proceedings of the 3rd Workshop on Figurative Language Processing (FLP)
Euphemisms are mild words or expressions used instead of harsh or direct words while talking to someone to avoid discussing something unpleasant, embarrassing, or offensive. However, they are often ambiguous, thus making it a challenging task. The Third Workshop on Figurative Language Processing, colocated with EMNLP 2022 organized a shared task on Euphemism Detection to better understand euphemisms. We have used the adversarial augmentation technique to construct new data. This augmented data was then trained using two language models: BERT and longformer. To further enhance the overall performance, various combinations of the results obtained using longformer and BERT were passed through a voting ensembler. We achieved an F1 score of 71.5 using the combination of two adversarial longformers, two adversarial BERT, and one non-adversarial BERT.
ARGUABLY @ Causal News Corpus 2022: Contextually Augmented Language Models for Event Causality Identification
Guneet Kohli | Prabsimran Kaur | Jatin Bedi
Proceedings of the 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE)
Guneet Kohli | Prabsimran Kaur | Jatin Bedi
Proceedings of the 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE)
Causal (a cause-effect relationship between two arguments) has become integral to various NLP domains such as question answering, summarization, and event prediction. To understand causality in detail, Event Causality Identification with Causal News Corpus (CASE-2022) has organized shared tasks. This paper defines our participation in Subtask 1, which focuses on classifying event causality. We used sentence-level augmentation based on contextualized word embeddings of distillBERT to construct new data. This data was then trained using two approaches. The first technique used the DeBERTa language model, and the second used the RoBERTa language model in combination with cross-attention. We obtained the second-best F1 score (0.8610) in the competition with the Contextually Augmented DeBERTa model.
2021
ARGUABLY at ComMA@ICON: Detection of Multilingual Aggressive, Gender Biased, and Communally Charged Tweets Using Ensemble and Fine-Tuned IndicBERT
Guneet Kohli | Prabsimran Kaur | Jatin Bedi
Proceedings of the 18th International Conference on Natural Language Processing: Shared Task on Multilingual Gender Biased and Communal Language Identification
Guneet Kohli | Prabsimran Kaur | Jatin Bedi
Proceedings of the 18th International Conference on Natural Language Processing: Shared Task on Multilingual Gender Biased and Communal Language Identification
The proliferation in Social Networking has increased offensive language, aggression, and hate-speech detection, which has drawn the focus of the NLP community. However, people’s difference in perception makes it difficult to distinguish between acceptable content and aggressive/hateful content, thus making it harder to create an automated system. In this paper, we propose multi-class classification techniques to identify aggressive and offensive language used online. Two main approaches have been developed for the classification of data into aggressive, gender-biased, and communally charged. The first approach is an ensemble-based model comprising of XG-Boost, LightGBM, and Naive Bayes applied on vectorized English data. The data used was obtained using an Indic Transliteration on the original data comprising of Meitei, Bangla, Hindi, and English language. The second approach is a BERT-based architecture used to detect misogyny and aggression. The proposed model employs IndicBERT Embeddings to define contextual understanding. The results of the models are validated on the ComMA v 0.2 dataset.