Md. Shad Akhtar

Also published as: Md Shad Akhtar


2022

When did you become so smart, oh wise one?! Sarcasm Explanation in Multi-modal Multi-party Dialogues
Shivani Kumar | Atharva Kulkarni | Md Shad Akhtar | Tanmoy Chakraborty
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Indirect speech such as sarcasm achieves a constellation of discourse goals in human communication. While the indirectness of figurative language warrants speakers to achieve certain pragmatic goals, it is challenging for AI agents to comprehend such idiosyncrasies of human communication. Though sarcasm identification has been a well-explored topic in dialogue analysis, for conversational systems to truly grasp a conversation’s innate meaning and generate appropriate responses, simply detecting sarcasm is not enough; it is vital to explain its underlying sarcastic connotation to capture its true essence. In this work, we study the discourse structure of sarcastic conversations and propose a novel task – Sarcasm Explanation in Dialogue (SED). Set in a multimodal and code-mixed setting, the task aims to generate natural language explanations of satirical conversations. To this end, we curate WITS, a new dataset to support our task. We propose MAF (Modality Aware Fusion), a multimodal context-aware attention and global information fusion module to capture multimodality and use it to benchmark WITS. The proposed attention module surpasses the traditional multimodal fusion baselines and reports the best performance on almost all metrics. Lastly, we carry out detailed analysis both quantitatively and qualitatively.
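
The MAF module is only described at a high level here; the sketch below shows what a modality-aware, context-aware attention fusion block might look like in PyTorch, with text features attending over acoustic and visual features and a learned gate mixing the result back in. Tensor shapes, head count, and the gating scheme are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn

class ModalityAwareFusion(nn.Module):
    """Illustrative fusion block: text attends over acoustic and visual context,
    and a learned gate decides how much multimodal signal to mix back into the
    textual representation (shapes and gating are assumptions)."""

    def __init__(self, dim: int = 768, heads: int = 8):
        super().__init__()
        self.text_to_audio = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.text_to_video = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.gate = nn.Linear(3 * dim, dim)

    def forward(self, text, audio, video):
        # text/audio/video: (batch, seq_len, dim) utterance-level features
        a_ctx, _ = self.text_to_audio(text, audio, audio)   # text queries audio
        v_ctx, _ = self.text_to_video(text, video, video)   # text queries video
        g = torch.sigmoid(self.gate(torch.cat([text, a_ctx, v_ctx], dim=-1)))
        return text + g * (a_ctx + v_ctx)                   # gated residual fusion

fusion = ModalityAwareFusion()
t, a, v = (torch.randn(2, 10, 768) for _ in range(3))
print(fusion(t, a, v).shape)  # torch.Size([2, 10, 768])
```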

Domain-aware Self-supervised Pre-training for Label-Efficient Meme Analysis
Shivam Sharma | Mohd Khizir Siddiqui | Md. Shad Akhtar | Tanmoy Chakraborty
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Existing self-supervised learning strategies are constrained to either a limited set of objectives or generic downstream tasks that predominantly target uni-modal applications. This has isolated progress for imperative multi-modal applications that are diverse in terms of complexity and domain-affinity, such as meme analysis. Here, we introduce two self-supervised pre-training methods, namely Ext-PIE-Net and MM-SimCLR, which (i) employ off-the-shelf multi-modal hate-speech data during pre-training and (ii) perform self-supervised learning by incorporating multiple specialized pretext tasks, effectively catering to the required complex multi-modal representation learning for meme analysis. We experiment with different self-supervision strategies, including potential variants that could help learn rich cross-modality representations and evaluate using popular linear probing on the Hateful Memes task. The proposed solutions strongly compete with the fully supervised baseline via label-efficient training while distinctly outperforming it on all three tasks of the Memotion challenge, with 0.18%, 23.64%, and 0.93% performance gains, respectively. Further, we demonstrate the generalizability of the proposed solutions by reporting competitive performance on the HarMeme task. Finally, we empirically establish the quality of the learned representations by analyzing task-specific learning, using fewer labeled training samples, and arguing that the complexity of the self-supervision strategy and downstream task at hand are correlated. Our efforts highlight the requirement of better multi-modal self-supervision methods involving specialized pretext tasks for efficient fine-tuning and generalizable performance.
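
The paper describes its pretext tasks only at a high level; below is a minimal sketch of a SimCLR-style (InfoNCE) contrastive objective adapted to paired image/text projections, which is the general idea behind a name like MM-SimCLR. The encoders, projection size, and temperature are placeholders, and this is not the authors' exact set of pretext tasks.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(image_emb, text_emb, temperature: float = 0.5):
    """SimCLR-style loss over a batch of paired image/text projections:
    matching (image_i, text_i) pairs are pulled together, while all other
    pairings in the batch act as negatives."""
    z_i = F.normalize(image_emb, dim=-1)
    z_t = F.normalize(text_emb, dim=-1)
    logits = z_i @ z_t.t() / temperature          # (batch, batch) similarity matrix
    targets = torch.arange(z_i.size(0))           # positives lie on the diagonal
    # symmetric cross-entropy: image->text and text->image directions
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

img = torch.randn(16, 256)   # e.g. projected visual-encoder features (placeholder)
txt = torch.randn(16, 256)   # e.g. projected text-encoder features (placeholder)
print(contrastive_loss(img, txt).item())
```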

DISARM: Detecting the Victims Targeted by Harmful Memes
Shivam Sharma | Md Shad Akhtar | Preslav Nakov | Tanmoy Chakraborty
Findings of the Association for Computational Linguistics: NAACL 2022

Internet memes have emerged as an increasingly popular means of communication on the web. Although memes are typically intended to elicit humour, they have been increasingly used to spread hatred, trolling, and cyberbullying, as well as to target specific individuals, communities, or society on political, socio-cultural, and psychological grounds. While previous work has focused on detecting harmful, hateful, and offensive memes in general, identifying whom these memes attack (i.e., the ‘victims’) remains a challenging and underexplored area. We attempt to address this problem in this paper. To this end, we create a dataset in which we annotate each meme with its victim(s) such as the name of the targeted person(s), organization(s), and community(ies). We then propose DISARM (Detecting vIctimS targeted by hARmful Memes), a framework that uses named-entity recognition and person identification to detect all entities a meme is referring to, and then incorporates a novel contextualized multimodal deep neural network to classify whether the meme intends to harm these entities. We perform several systematic experiments on three different test sets, corresponding to entities that are (i) all seen while training, (ii) not seen as a harmful target while training, and (iii) not seen at all while training. The evaluation shows that DISARM significantly outperforms 10 unimodal and multimodal systems. Finally, we demonstrate that DISARM is interpretable and comparatively more generalizable and that it can reduce the relative error rate of harmful target identification by up to 9% absolute over multimodal baseline systems.

Proceedings of the Workshop on Combating Online Hostile Posts in Regional Languages during Emergency Situations
Tanmoy Chakraborty | Md. Shad Akhtar | Kai Shu | H. Russell Bernard | Maria Liakata | Preslav Nakov | Aseem Srivastava
Proceedings of the Workshop on Combating Online Hostile Posts in Regional Languages during Emergency Situations

Findings of the CONSTRAINT 2022 Shared Task on Detecting the Hero, the Villain, and the Victim in Memes
Shivam Sharma | Tharun Suresh | Atharva Kulkarni | Himanshi Mathur | Preslav Nakov | Md. Shad Akhtar | Tanmoy Chakraborty
Proceedings of the Workshop on Combating Online Hostile Posts in Regional Languages during Emergency Situations

We present the findings of the shared task at the CONSTRAINT 2022 Workshop, ‘Hero, Villain, and Victim: Dissecting harmful memes for Semantic role labeling of entities’. The task aims to delve deeper into the domain of meme comprehension by deciphering the connotations behind the entities present in a meme. In more nuanced terms, the shared task focuses on determining the victimizing, glorifying, and vilifying intentions embedded in meme entities to explicate their connotations. To this end, we curate HVVMemes, a novel meme dataset of about 7000 memes spanning the domains of COVID-19 and US Politics, each containing entities and their associated roles: hero, villain, victim, or none. The shared task attracted 105 participants, but eventually only 6 submissions were made. Most of the successful submissions relied on fine-tuning pre-trained language and multimodal models along with ensembles. The best submission achieved an F1-score of 58.67.

Document Retrieval and Claim Verification to Mitigate COVID-19 Misinformation
Megha Sundriyal | Ganeshan Malhotra | Md Shad Akhtar | Shubhashis Sengupta | Andrew Fano | Tanmoy Chakraborty
Proceedings of the Workshop on Combating Online Hostile Posts in Regional Languages during Emergency Situations

During the COVID-19 pandemic, the spread of misinformation on online social media has grown exponentially. Unverified bogus claims on these platforms regularly mislead people, leading them to believe in half-baked truths. The current vogue is to employ manual fact-checkers to verify claims to combat this avalanche of misinformation. However, establishing such claims’ veracity is becoming increasingly challenging, partly due to the plethora of information available, which is difficult to process manually. Thus, it becomes imperative to verify claims automatically without human intervention. To cope with this issue, we propose an automated claim verification solution encompassing two steps – document retrieval and veracity prediction. For the retrieval module, we employ a hybrid search-based system with BM25 as a base retriever and experiment with recent state-of-the-art transformer-based models for re-ranking. Furthermore, we use a BART-based textual entailment architecture to authenticate the retrieved documents in the later step. We report experimental findings, demonstrating that our retrieval module outperforms the best baseline system by 10.32 NDCG@100 points. We provide a demonstration to assess the efficacy and impact of our suggested solution. As a byproduct of this study, we present an open-source, easily deployable, and user-friendly Python API that the community can adopt.
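
The two-step pipeline (hybrid retrieval, then entailment-based veracity prediction) can be sketched roughly as follows. BM25 via rank_bm25, a cross-encoder re-ranker from sentence-transformers, and an off-the-shelf BART-MNLI checkpoint stand in for the components described in the abstract, so the model names, toy documents, and thresholds here are illustrative assumptions rather than the authors' exact configuration.

```python
import torch
from rank_bm25 import BM25Okapi
from sentence_transformers import CrossEncoder
from transformers import AutoModelForSequenceClassification, AutoTokenizer

docs = [
    "The WHO states that 5G mobile networks do not spread COVID-19.",
    "Vitamin C supplements have not been shown to cure COVID-19.",
    "COVID-19 vaccines underwent clinical trials before approval.",
]
claim = "5G towers spread the coronavirus."

# Step 1a: BM25 as the base retriever over a toy document store.
bm25 = BM25Okapi([d.lower().split() for d in docs])
scores = bm25.get_scores(claim.lower().split())
candidates = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)[:2]

# Step 1b: re-rank the BM25 candidates with a transformer cross-encoder.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
rerank = reranker.predict([(claim, docs[i]) for i in candidates])
evidence = docs[candidates[int(rerank.argmax())]]

# Step 2: veracity prediction via textual entailment with a BART NLI model.
tok = AutoTokenizer.from_pretrained("facebook/bart-large-mnli")
nli = AutoModelForSequenceClassification.from_pretrained("facebook/bart-large-mnli")
with torch.no_grad():
    logits = nli(**tok(evidence, claim, return_tensors="pt")).logits
verdict = nli.config.id2label[int(logits.argmax())]
print(evidence, "->", verdict)  # 'entailment' = supported, 'contradiction' = refuted
```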

2021

Detecting Harmful Memes and Their Targets
Shraman Pramanick | Dimitar Dimitrov | Rituparna Mukherjee | Shivam Sharma | Md. Shad Akhtar | Preslav Nakov | Tanmoy Chakraborty
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

HIT - A Hierarchically Fused Deep Attention Network for Robust Code-mixed Language Representation
Ayan Sengupta | Sourabh Kumar Bhattacharjee | Tanmoy Chakraborty | Md. Shad Akhtar
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

MOMENTA: A Multimodal Framework for Detecting Harmful Memes and Their Targets
Shraman Pramanick | Shivam Sharma | Dimitar Dimitrov | Md. Shad Akhtar | Preslav Nakov | Tanmoy Chakraborty
Findings of the Association for Computational Linguistics: EMNLP 2021

Internet memes have become powerful means to transmit political, psychological, and socio-cultural ideas. Although memes are typically humorous, recent days have witnessed an escalation of harmful memes used for trolling, cyberbullying, and abuse. Detecting such memes is challenging as they can be highly satirical and cryptic. Moreover, while previous work has focused on specific aspects of memes such as hate speech and propaganda, there has been little work on harm in general. Here, we aim to bridge this gap. In particular, we focus on two tasks: (i) detecting harmful memes, and (ii) identifying the social entities they target. We further extend the recently released HarMeme dataset, which covered COVID-19, with additional memes and a new topic: US politics. To solve these tasks, we propose MOMENTA (MultimOdal framework for detecting harmful MemEs aNd Their tArgets), a novel multimodal deep neural network that uses global and local perspectives to detect harmful memes. MOMENTA systematically analyzes the local and the global perspective of the input meme (in both modalities) and relates it to the background context. MOMENTA is interpretable and generalizable, and our experiments show that it outperforms several strong rivaling approaches.

LESA: Linguistic Encapsulation and Semantic Amalgamation Based Generalised Claim Detection from Online Content
Shreya Gupta | Parantak Singh | Megha Sundriyal | Md. Shad Akhtar | Tanmoy Chakraborty
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

The conceptualization of a claim lies at the core of argument mining. The segregation of claims is complex, owing to the divergence in textual syntax and context across different distributions. Another pressing issue is the unavailability of labeled unstructured text for experimentation. In this paper, we propose LESA, a framework which addresses the former issue by assembling a source-independent generalized model that captures syntactic features through part-of-speech and dependency embeddings, as well as contextual features through a fine-tuned language model. We resolve the latter issue by annotating a Twitter dataset which aims at providing a testing ground on a large unstructured dataset. Experimental results show that LESA improves upon the state-of-the-art performance across six benchmark claim datasets by an average of 3 claim-F1 points for in-domain experiments and by 2 claim-F1 points for general-domain experiments. On our dataset too, LESA outperforms existing baselines by 1 claim-F1 point on the in-domain experiments and 2 claim-F1 points on the general-domain experiments. We also release comprehensive data annotation guidelines compiled during the annotation phase (which were missing in the current literature).

2020

STHAL: Location-mention Identification in Tweets of Indian-context
Kartik Verma | Shobhit Sinha | Md. Shad Akhtar | Vikram Goyal
Proceedings of the 17th International Conference on Natural Language Processing (ICON)

We investigate the problem of extracting Indian-locations from a given crowd-sourced textual dataset. The problem of extracting fine-grained Indian-locations has many challenges. One challenge in the task is to collect a relevant, location-bearing dataset from crowd-sourced platforms. The second challenge lies in extracting the location entities from the collected data. We provide an in-depth review of the information collection process and our annotation guidelines such that a reliable dataset annotation is guaranteed. We evaluate many recent algorithms and models, including Conditional Random Fields (CRF), Bi-LSTM-CNN, and BERT (Bidirectional Encoder Representations from Transformers), on our developed dataset, named STHAL. The study shows the best F1-score of 72.49% for BERT, followed by Bi-LSTM-CNN and CRF. As a result of our work, we prepare a publicly-available annotated dataset of Indian geolocations that can be used by the research community. Code and dataset are available at https://github.com/vkartik2k/STHAL.
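
The strongest reported baseline, BERT fine-tuned for token-level location tagging, can be sketched as below. The BIO tag set, the multilingual checkpoint, the toy tweet, and the single training step are illustrative assumptions; for simplicity every sub-word inherits its word's tag.

```python
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

labels = ["O", "B-LOC", "I-LOC"]                       # assumed BIO tag set
tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=len(labels))

# One annotated tweet with word-level tags, purely illustrative.
words = ["Heavy", "rain", "in", "Connaught", "Place", ",", "Delhi"]
tags  = ["O", "O", "O", "B-LOC", "I-LOC", "O", "B-LOC"]

enc = tok(words, is_split_into_words=True, return_tensors="pt")
# Align word-level tags to sub-word tokens; special tokens get -100 (ignored by the loss).
word_ids = enc.word_ids()
label_ids = [-100 if w is None else labels.index(tags[w]) for w in word_ids]
enc["labels"] = torch.tensor([label_ids])

out = model(**enc)                                     # returns loss and per-token logits
out.loss.backward()                                    # one illustrative training step
tokens = tok.convert_ids_to_tokens(enc["input_ids"][0].tolist())
preds = out.logits.argmax(-1)[0].tolist()
print([(t, labels[p]) for t, p, w in zip(tokens, preds, word_ids) if w is not None])
```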

2019

Context-aware Interactive Attention for Multi-modal Sentiment and Emotion Analysis
Dushyant Singh Chauhan | Md Shad Akhtar | Asif Ekbal | Pushpak Bhattacharyya
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

In recent times, multi-modal analysis has been an emerging and highly sought-after field at the intersection of natural language processing, computer vision, and speech processing. The prime objective of such studies is to leverage the diversified information (e.g., textual, acoustic, and visual) for learning a model. The effective interaction among these modalities often leads to a better system in terms of performance. In this paper, we introduce a recurrent neural network based approach for multi-modal sentiment and emotion analysis. The proposed model learns the inter-modal interaction among the participating modalities through an auto-encoder mechanism. We employ a context-aware attention module to exploit the correspondence among the neighboring utterances. We evaluate our proposed approach on five standard multi-modal affect analysis datasets. Experimental results suggest the efficacy of the proposed model for both sentiment and emotion analysis over various existing state-of-the-art systems.

Multi-task Learning for Multi-modal Emotion Recognition and Sentiment Analysis
Md Shad Akhtar | Dushyant Chauhan | Deepanway Ghosal | Soujanya Poria | Asif Ekbal | Pushpak Bhattacharyya
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Related tasks often have inter-dependence on each other and perform better when solved in a joint framework. In this paper, we present a deep multi-task learning framework that jointly performs both sentiment and emotion analysis. The multi-modal inputs (i.e., text, acoustic, and visual frames) of a video convey diverse and distinctive information, and usually do not have equal contribution in the decision making. We propose a context-level inter-modal attention framework for simultaneously predicting the sentiment and expressed emotions of an utterance. We evaluate our proposed approach on the CMU-MOSEI dataset for multi-modal sentiment and emotion analysis. Evaluation results suggest that the multi-task learning framework offers an improvement over the single-task framework. The proposed approach reports new state-of-the-art performance for both sentiment analysis and emotion analysis.
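
The multi-task idea reduces to a shared utterance encoder with one head per task, trained on the sum of both losses; a minimal sketch follows. The GRU encoder, feature dimensions, three-class sentiment labels, and six emotion labels are placeholders, and the paper's context-level inter-modal attention is omitted here.

```python
import torch
import torch.nn as nn

class MultiTaskAffect(nn.Module):
    """Shared GRU encoder over fused utterance features with two task heads."""
    def __init__(self, feat_dim=300, hidden=128, n_sentiments=3, n_emotions=6):
        super().__init__()
        self.encoder = nn.GRU(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.sentiment_head = nn.Linear(2 * hidden, n_sentiments)
        self.emotion_head = nn.Linear(2 * hidden, n_emotions)   # multi-label emotions

    def forward(self, x):
        ctx, _ = self.encoder(x)            # (batch, utterances, 2*hidden)
        return self.sentiment_head(ctx), self.emotion_head(ctx)

model = MultiTaskAffect()
x = torch.randn(4, 20, 300)                 # 4 videos, 20 utterances, fused features
sent_logits, emo_logits = model(x)
sent_y = torch.randint(0, 3, (4, 20))
emo_y = torch.randint(0, 2, (4, 20, 6)).float()
loss = nn.CrossEntropyLoss()(sent_logits.transpose(1, 2), sent_y) \
     + nn.BCEWithLogitsLoss()(emo_logits, emo_y)     # joint objective over both tasks
loss.backward()
print(float(loss))
```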

Language-Agnostic Model for Aspect-Based Sentiment Analysis
Md Shad Akhtar | Abhishek Kumar | Asif Ekbal | Chris Biemann | Pushpak Bhattacharyya
Proceedings of the 13th International Conference on Computational Semantics - Long Papers

In this paper, we propose a language-agnostic deep neural network architecture for aspect-based sentiment analysis. The proposed approach is based on a Bidirectional Long Short-Term Memory (Bi-LSTM) network, which is further assisted by extra hand-crafted features. We define three different architectures for the successful combination of word embeddings and hand-crafted features. We evaluate the proposed approach for six languages (i.e., English, Spanish, French, Dutch, German, and Hindi) and two problems (i.e., aspect term extraction and aspect sentiment classification). Experiments show that the proposed model attains state-of-the-art performance in most of the settings.
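
One plausible instance of the "combine word embeddings with hand-crafted features" idea is late concatenation: the pooled Bi-LSTM output is joined with a feature vector before the classifier. The dimensions, max-pooling, and feature size below are assumptions, and the paper's other two combination architectures are not shown.

```python
import torch
import torch.nn as nn

class BiLSTMWithFeatures(nn.Module):
    """Bi-LSTM sentence encoder whose pooled output is concatenated with a
    hand-crafted feature vector before the aspect-sentiment classifier."""
    def __init__(self, emb_dim=300, hidden=100, n_handcrafted=20, n_classes=3):
        super().__init__()
        self.bilstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden + n_handcrafted, n_classes)

    def forward(self, word_embs, handcrafted):
        states, _ = self.bilstm(word_embs)            # (batch, tokens, 2*hidden)
        pooled = states.max(dim=1).values             # max-pool over tokens
        return self.classifier(torch.cat([pooled, handcrafted], dim=-1))

model = BiLSTMWithFeatures()
logits = model(torch.randn(8, 25, 300), torch.randn(8, 20))
print(logits.shape)  # torch.Size([8, 3])
```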

2018

Solving Data Sparsity for Aspect Based Sentiment Analysis Using Cross-Linguality and Multi-Linguality
Md Shad Akhtar | Palaash Sawant | Sukanta Sen | Asif Ekbal | Pushpak Bhattacharyya
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

Efficient word representations play an important role in solving various problems related to Natural Language Processing (NLP), data mining, text mining, etc. The issue of data sparsity poses a great challenge in creating an efficient word representation model for the underlying problem. The problem is intensified further in resource-poor scenarios due to the absence of a sufficient amount of corpora. In this work, we propose to minimize the effect of data sparsity by leveraging bilingual word embeddings learned through a parallel corpus. We train and evaluate a Long Short-Term Memory (LSTM) based architecture for aspect-level sentiment classification. The neural network architecture is further assisted by hand-crafted features for the prediction. We show the efficacy of the proposed model against state-of-the-art methods in two experimental setups, i.e., multi-lingual and cross-lingual.

IARM: Inter-Aspect Relation Modeling with Memory Networks in Aspect-Based Sentiment Analysis
Navonil Majumder | Soujanya Poria | Alexander Gelbukh | Md. Shad Akhtar | Erik Cambria | Asif Ekbal
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Sentiment analysis has immense implications in e-commerce through user feedback mining. Aspect-based sentiment analysis takes this one step further by enabling businesses to extract aspect-specific sentiment information. In this paper, we present a novel approach that incorporates information about the neighboring aspects into the sentiment classification of the target aspect using memory networks. We show that our method outperforms the state of the art by 1.6% on average in two distinct domains: restaurant and laptop.

Contextual Inter-modal Attention for Multi-modal Sentiment Analysis
Deepanway Ghosal | Md Shad Akhtar | Dushyant Chauhan | Soujanya Poria | Asif Ekbal | Pushpak Bhattacharyya
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Multi-modal sentiment analysis offers various challenges, one being the effective combination of different input modalities, namely text, visual, and acoustic. In this paper, we propose a recurrent neural network based multi-modal attention framework that leverages the contextual information for utterance-level sentiment prediction. The proposed approach applies attention on multi-modal multi-utterance representations and tries to learn the contributing features amongst them. We evaluate our proposed approach on two multi-modal sentiment analysis benchmark datasets, viz. the CMU Multi-modal Opinion-level Sentiment Intensity (CMU-MOSI) corpus and the recently released CMU Multi-modal Opinion Sentiment and Emotion Intensity (CMU-MOSEI) corpus. Evaluation results show the effectiveness of our proposed approach with accuracies of 82.31% and 79.80% for the MOSI and MOSEI datasets, respectively. These represent improvements of approximately 2 and 1 points over the state-of-the-art models for the respective datasets.
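
A simplified reading of contextual inter-modal attention: each utterance representation from one modality attends over all utterances of another modality across the conversation, and the attended context is concatenated back. The shapes and the plain dot-product attention below are assumptions, not the exact published formulation.

```python
import torch
import torch.nn.functional as F

def contextual_intermodal_attention(m1, m2):
    """m1, m2: (batch, utterances, dim) sequences from two modalities.
    Each utterance in m1 attends over all utterances of m2 (and vice versa),
    so the sentiment prediction can draw on cross-modal conversational context."""
    scores = torch.matmul(m1, m2.transpose(1, 2))          # (batch, U, U) affinities
    m1_ctx = torch.matmul(F.softmax(scores, dim=-1), m2)   # m2 context for each m1 utterance
    m2_ctx = torch.matmul(F.softmax(scores.transpose(1, 2), dim=-1), m1)
    return torch.cat([m1, m1_ctx], -1), torch.cat([m2, m2_ctx], -1)

text = torch.randn(2, 30, 100)     # 30 utterances per video (placeholder dims)
audio = torch.randn(2, 30, 100)
t_out, a_out = contextual_intermodal_attention(text, audio)
print(t_out.shape, a_out.shape)    # torch.Size([2, 30, 200]) each
```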

2017

IITP at SemEval-2017 Task 8 : A Supervised Approach for Rumour Evaluation
Vikram Singh | Sunny Narayan | Md Shad Akhtar | Asif Ekbal | Pushpak Bhattacharyya
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

This paper describes our participation in SemEval-2017 Task 8, ‘RumourEval: Determining rumour veracity and support for rumours’. The objective of this task was to predict the stance and veracity of the underlying rumour. We propose a supervised classification approach employing several lexical, content, and Twitter-specific features for learning. Evaluation shows promising results for both problems.

IITPB at SemEval-2017 Task 5: Sentiment Prediction in Financial Text
Abhishek Kumar | Abhishek Sethi | Md Shad Akhtar | Asif Ekbal | Chris Biemann | Pushpak Bhattacharyya
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

This paper reports team IITPB’s participation in the SemEval 2017 Task 5 on ‘Fine-grained sentiment analysis on financial microblogs and news’. We developed 2 systems for the two tracks. One system was based on an ensemble of Support Vector Classifier and Logistic Regression. This system relied on Distributional Thesaurus (DT), word embeddings and lexicon features to predict a floating sentiment value between -1 and +1. The other system was based on Support Vector Regression using word embeddings, lexicon features, and PMI scores as features. The system was ranked 5th in track 1 and 8th in track 2.
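
Both submitted systems reduce to fairly standard scikit-learn pipelines; a compressed sketch of the two is below, with random vectors standing in for the DT, embedding, lexicon, and PMI features (the feature extraction itself is not shown, and the averaging of the two classifiers' scores is an assumption about how the ensemble combines them).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC, SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))                 # placeholder feature vectors
y_cont = np.clip(rng.normal(size=200), -1, 1)  # gold sentiment values in [-1, +1]
y_cls = np.sign(y_cont).astype(int)            # coarse labels for the classifiers

# System 1: average the positive-class probabilities of an SVC and a Logistic
# Regression, then rescale to a floating sentiment value in [-1, +1].
svc = SVC(probability=True).fit(X, y_cls)
lr = LogisticRegression(max_iter=1000).fit(X, y_cls)
pos = svc.classes_.tolist().index(1)
score = 0.5 * (svc.predict_proba(X)[:, pos] + lr.predict_proba(X)[:, pos])
system1 = 2 * score - 1

# System 2: direct regression of the sentiment value with an SVR.
system2 = SVR().fit(X, y_cont).predict(X)
print(system1[:3], system2[:3])
```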

IITP at SemEval-2017 Task 5: An Ensemble of Deep Learning and Feature Based Models for Financial Sentiment Analysis
Deepanway Ghosal | Shobhit Bhatnagar | Md Shad Akhtar | Asif Ekbal | Pushpak Bhattacharyya
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

In this paper, we propose an ensemble-based model which combines state-of-the-art deep learning sentiment analysis algorithms, such as the Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM), with feature-based models to identify optimistic or pessimistic sentiments associated with companies and stocks in financial texts. We built our system to participate in a competition organized by the Semantic Evaluation (SemEval) 2017 International Workshop. We combined predictions from various models using an artificial neural network to determine the opinion towards an entity in (a) Microblog Messages and (b) News Headlines data. Our models achieved cosine similarity scores of 0.751 and 0.697 for the above two tracks, ranking us as the 2nd and 7th best team, respectively.

A Multilayer Perceptron based Ensemble Technique for Fine-grained Financial Sentiment Analysis
Md Shad Akhtar | Abhishek Kumar | Deepanway Ghosal | Asif Ekbal | Pushpak Bhattacharyya
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

In this paper, we propose a novel method for combining deep learning and classical feature based models using a Multi-Layer Perceptron (MLP) network for financial sentiment analysis. We develop various deep learning models based on Convolutional Neural Network (CNN), Long Short Term Memory (LSTM) and Gated Recurrent Unit (GRU). These are trained on top of pre-trained, autoencoder-based financial word embeddings and lexicon features. An ensemble is constructed by combining these deep learning models and a classical supervised model based on Support Vector Regression (SVR). We evaluate our proposed technique on the benchmark dataset of the SemEval-2017 shared task on financial sentiment analysis. The proposed model shows impressive results on the two datasets, i.e., microblogs and news headlines. Comparisons show that our proposed model performs better than the existing state-of-the-art systems for the above two datasets by 2.0 and 4.1 cosine points, respectively.
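
The ensembling step itself is easy to sketch: the continuous predictions of the base models (CNN, LSTM, GRU, SVR) become the input features of a small MLP regressor that produces the final sentiment score. In the sketch below the base-model outputs are simulated with noisy copies of a synthetic gold signal, and the hidden-layer size is a placeholder; the base architectures themselves are not reproduced.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(42)
gold = np.clip(rng.normal(scale=0.5, size=500), -1, 1)   # synthetic gold sentiment scores

# Stand-ins for base-model predictions (CNN, LSTM, GRU, feature-based SVR):
# in the real system these come from separately trained models.
base_preds = np.stack([gold + rng.normal(scale=s, size=500)
                       for s in (0.20, 0.25, 0.30, 0.35)], axis=1)

X_tr, X_te, y_tr, y_te = train_test_split(base_preds, gold, random_state=0)
meta = MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
meta.fit(X_tr, y_tr)                        # learns how to weight/combine the base models
print("ensemble MAE:", np.abs(meta.predict(X_te) - y_te).mean())
print("best single-model MAE:", np.abs(X_te - y_te[:, None]).mean(axis=0).min())
```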

IITP at EmoInt-2017: Measuring Intensity of Emotions using Sentence Embeddings and Optimized Features
Md Shad Akhtar | Palaash Sawant | Asif Ekbal | Jyoti Pawar | Pushpak Bhattacharyya
Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

This paper describes the system that we submitted as part of our participation in the shared task on Emotion Intensity (EmoInt-2017). We propose a Long Short-Term Memory (LSTM) based architecture cascaded with a Support Vector Regressor (SVR) for intensity prediction. We also employ a Particle Swarm Optimization (PSO) based feature selection algorithm for obtaining an optimized feature set for training and evaluation. System evaluation shows interesting results on the four emotion datasets, i.e., anger, fear, joy, and sadness. In comparison to the other participating teams, our system was ranked 5th in the competition.

2016

Aspect based Sentiment Analysis in Hindi: Resource Creation and Evaluation
Md Shad Akhtar | Asif Ekbal | Pushpak Bhattacharyya
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Due to the phenomenal growth of online product reviews, sentiment analysis (SA) has gained huge attention, for example, by online service providers. A number of benchmark datasets for a wide range of domains have been made available for sentiment analysis, especially in resource-rich languages. In this paper, we assess the challenges of SA in Hindi by providing a benchmark setup, where we create an annotated dataset of high quality, build machine learning models for sentiment analysis in order to show the effective usage of the dataset, and finally make the resource available to the community for further advancement of research. The dataset comprises Hindi product reviews crawled from various online sources. Each sentence of a review is annotated with aspect terms and their associated sentiment. As classification algorithms, we use Conditional Random Field (CRF) and Support Vector Machine (SVM) for aspect term extraction and sentiment analysis, respectively. Evaluation results show an average F-measure of 41.07% for aspect term extraction and an accuracy of 54.05% for sentiment classification.
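
The aspect-term extraction side can be sketched as a standard BIO-tagged CRF; sklearn-crfsuite is used here only as a stand-in implementation, and the toy Hindi sentence, tag names, and feature template are illustrative, not the paper's feature set.

```python
import sklearn_crfsuite

def word_features(sent, i):
    """Very small context-window feature template for token i (illustrative)."""
    return {"word": sent[i], "is_first": i == 0, "is_last": i == len(sent) - 1,
            "prev": sent[i - 1] if i else "<BOS>",
            "next": sent[i + 1] if i < len(sent) - 1 else "<EOS>"}

# Toy BIO-tagged review sentence ("the battery of this phone is very good").
sents = [["इस", "फोन", "की", "बैटरी", "बहुत", "अच्छी", "है"]]
tags = [["O", "O", "O", "B-ASPECT", "O", "O", "O"]]

X = [[word_features(s, i) for i in range(len(s))] for s in sents]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
crf.fit(X, tags)
print(crf.predict(X))   # the sentiment of each extracted aspect would then go to a separate SVM
```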

A Hybrid Deep Learning Architecture for Sentiment Analysis
Md Shad Akhtar | Ayush Kumar | Asif Ekbal | Pushpak Bhattacharyya
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

In this paper, we propose a novel hybrid deep learning architecture which is highly efficient for sentiment analysis in resource-poor languages. We learn sentiment embedded vectors from the Convolutional Neural Network (CNN). These are augmented with a set of optimized features selected through a multi-objective optimization (MOO) framework. The sentiment-augmented optimized vector obtained at the end is used to train an SVM for sentiment classification. We evaluate our proposed approach for coarse-grained (i.e. sentence level) as well as fine-grained (i.e. aspect level) sentiment analysis on four Hindi datasets covering varying domains. In order to show that our proposed method is generic in nature, we also evaluate it on two benchmark English datasets. Evaluation shows that the results of the proposed method are consistent across all the datasets and often outperform the state-of-the-art systems. To the best of our knowledge, this is the very first attempt where such a deep learning model is used for less-resourced languages such as Hindi.

2015

IITP: Multiobjective Differential Evolution based Twitter Named Entity Recognition
Md Shad Akhtar | Utpal Kumar Sikdar | Asif Ekbal
Proceedings of the Workshop on Noisy User-generated Text

IITP: Hybrid Approach for Text Normalization in Twitter
Md Shad Akhtar | Utpal Kumar Sikdar | Asif Ekbal
Proceedings of the Workshop on Noisy User-generated Text