Deepak P

Also published as: Deepak Padmanabhan


2023

Multi-task Ensemble Learning for Fake Reviews Detection and Helpfulness Prediction: A Novel Approach
Alimuddin Melleng | Anna Jurek-Loughrey | Deepak P
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing

Research on fake reviews detection and review helpfulness prediction is prevalent, yet most studies tend to focus solely on either fake reviews detection or review helpfulness prediction, considering them separate research tasks. In contrast to this prevailing pattern, we address both challenges concurrently by employing a multi-task learning approach. We posit that undertaking these tasks simultaneously can enhance the performance of each task through shared information among features. We utilize pre-trained RoBERTa embeddings with a document-level data representation. This is coupled with an array of deep learning and neural network models, including Bi-LSTM, LSTM, GRU, and CNN. Additionally, we employ ensemble learning techniques to integrate these models, with the objective of enhancing overall prediction accuracy and mitigating the risk of overfitting. The findings of this study offer valuable insights to the fields of natural language processing and machine learning and present a novel perspective on leveraging multi-task learning for the twin challenges of fake reviews detection and review helpfulness prediction.

Data Fusion for Better Fake Reviews Detection
Alimuddin Melleng | Anna Jurek-Loughrey | Deepak P
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing

Online reviews have become critical in informing purchasing decisions, making the detection of fake reviews a crucial challenge to tackle. Many different Machine Learning based solutions have been proposed, using various data representations such as n-grams or document embeddings. In this paper, we first explore the effectiveness of different data representations, including emotion, document embedding, n-grams, and noun phrases in embedding format, for fake reviews detection. We evaluate these representations with various state-of-the-art deep learning models, such as BILSTM, LSTM, GRU, CNN, and MLP. Following this, we propose to incorporate different data representations and classification models using early and late data fusion techniques in order to improve the prediction performance. The experiments are conducted on four datasets: Hotel, Restaurant, Amazon, and Yelp. The results demonstrate that combinations of different data representations significantly outperform any single data representation.
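
The distinction between early and late fusion mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy feature vectors, the averaging rule for late fusion, and the 0.5 decision threshold are all assumptions made for illustration.

```python
from statistics import mean

def early_fusion(representations):
    """Early fusion: concatenate the feature vectors of one review
    (e.g. emotion features and n-gram features) into a single vector
    that a single classifier would consume."""
    fused = []
    for vector in representations:
        fused.extend(vector)
    return fused

def late_fusion(probabilities):
    """Late fusion: combine the outputs of several classifiers.
    Averaging is only one possible rule and is assumed here."""
    return mean(probabilities)

# Hypothetical feature vectors for one review.
emotion_features = [0.1, 0.7]
ngram_features = [1.0, 0.0, 2.0]
fused_vector = early_fusion([emotion_features, ngram_features])

# Hypothetical "fake" probabilities from three separately trained models.
fused_score = late_fusion([0.9, 0.6, 0.3])
label = "fake" if fused_score >= 0.5 else "genuine"
```

In the early variant one model sees all representations at once; in the late variant each representation gets its own model and only the predictions are merged.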

Multiple Evidence Combination for Fact-Checking of Health-Related Information
Pritam Deka | Anna Jurek-Loughrey | Deepak P
The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks

Fact-checking of health-related claims has become necessary in this digital age, where any information posted online is easily available to everyone. The most effective way to verify such claims is by using evidences obtained from reliable sources of medical knowledge, such as PubMed. Recent advances in the field of NLP have helped automate such fact-checking tasks. In this work, we propose a domain-specific BERT-based model using a transfer learning approach for the task of predicting the veracity of claim-evidence pairs for the verification of health-related facts. We also improvise on a method to combine multiple evidences retrieved for a single claim, taking into consideration conflicting evidences as well. We also show how our model can be exploited when labelled data is available and how back-translation can be used to augment data when there is data scarcity.

2021

Ranking Online Reviews Based on Their Helpfulness: An Unsupervised Approach
Alimuddin Melleng | Anna Jurek-Loughrey | Deepak P
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

Online reviews are an essential aspect of online shopping for both customers and retailers. However, many reviews found on the Internet lack quality, informativeness or helpfulness. In many cases, they lead the customers towards positive or negative opinions without providing any concrete details (e.g., very poor product, I would not recommend it). In this work, we propose a novel unsupervised method for quantifying helpfulness leveraging the availability of a corpus of reviews. In particular, our method exploits three characteristics of the reviews, viz., relevance, emotional intensity and specificity, towards quantifying helpfulness. We perform three rankings (one for each feature above), which are then combined to obtain a final helpfulness ranking. For the purpose of empirically evaluating our method, we use reviews from four product categories of the Amazon review data. The experimental evaluation demonstrates the effectiveness of our method in comparison to a recent and state-of-the-art baseline.
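
The abstract does not say how the three per-feature rankings are combined. One simple possibility, shown here purely as an assumed sketch, is to average each review's rank position across the three rankings. The function names, the toy scores, and the convention that a higher score means a better rank for every feature are all hypothetical.

```python
def rank_positions(scores):
    """Map each review id to its rank (1 = best), sorting by descending score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {review_id: pos for pos, review_id in enumerate(ordered, start=1)}

def combine_rankings(*score_dicts):
    """Order review ids by their mean rank across all given rankings.
    Ties are broken by review id so the result is deterministic."""
    ranks = [rank_positions(scores) for scores in score_dicts]
    review_ids = list(ranks[0])
    mean_rank = {
        rid: sum(r[rid] for r in ranks) / len(ranks) for rid in review_ids
    }
    return sorted(review_ids, key=lambda rid: (mean_rank[rid], rid))

# Hypothetical per-feature scores for three reviews.
relevance = {"r1": 0.9, "r2": 0.4, "r3": 0.7}
emotional_intensity = {"r1": 0.2, "r2": 0.8, "r3": 0.6}
specificity = {"r1": 0.8, "r2": 0.1, "r3": 0.9}

final_ranking = combine_rankings(relevance, emotional_intensity, specificity)
```

Here `r3` comes first because it sits near the top of all three rankings, even though it leads only one of them.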

2019

Sentiment and Emotion Based Representations for Fake Reviews Detection
Alimuddin Melleng | Anna Jurek-Loughrey | Deepak P
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

Fake reviews are increasingly prevalent across the Internet. They can be unethical as well as harmful. They can affect businesses and mislead individual customers. As the opinions on the Web are increasingly used, the detection of fake reviews has become more and more critical. In this study, we explore the effectiveness of sentiment- and emotion-based representations for the task of building machine learning models for fake review detection. We perform empirical studies over three real-world datasets and demonstrate that improved data representation can be achieved by combining sentiment and emotion extraction methods, as well as by performing sentiment and emotion analysis on a part-by-part basis by segmenting the reviews.

2018

Topic-Specific Sentiment Analysis Can Help Identify Political Ideology
Sumit Bhatia | Deepak P
Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

Ideological leanings of an individual can often be gauged by the sentiment one expresses about different issues. We propose a simple framework that represents a political ideology as a distribution of sentiment polarities towards a set of topics. This representation can then be used to detect ideological leanings of documents (speeches, news articles, etc.) based on the sentiments expressed towards different topics. Experiments performed using a widely used dataset show the promise of our proposed approach that achieves comparable performance to other methods despite being much simpler and more interpretable.

2017

Latent Space Embedding for Retrieval in Question-Answer Archives
Deepak P | Dinesh Garg | Shirish Shevade
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Community-driven Question Answering (CQA) systems such as Yahoo! Answers have become valuable sources of reusable information. CQA retrieval enables usage of historical CQA archives to solve new questions posed by users. This task has received much recent attention, with methods building upon literature from translation models, topic models, and deep learning. In this paper, we devise a CQA retrieval technique, LASER-QA, that embeds question-answer pairs within a unified latent space preserving the local neighborhood structure of question and answer spaces. The idea is that such a space mirrors semantic similarity among questions as well as answers, thereby enabling high quality retrieval. Through an empirical analysis on various real-world QA datasets, we illustrate the improved effectiveness of LASER-QA over state-of-the-art methods.

Multi-entity sentiment analysis using entity-level feature extraction and word embeddings approach
Colm Sweeney | Deepak Padmanabhan
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

The sentiment analysis task has traditionally been divided into lexicon and machine learning approaches, but recently word embedding methods have emerged that provide powerful algorithms to allow semantic understanding without the task of creating large amounts of annotated test data. One problem with this type of binary classification is that the sentiment output will be in the form of ‘1’ (positive) or ‘0’ (negative) for the string of text in the tweet, regardless of whether one or more entities are referred to in the text. This paper plans to enhance the word embeddings approach with the deployment of a sentiment lexicon-based technique to assign a total score that indicates the polarity of opinion in relation to a particular entity or entities. This type of sentiment classification is a way of associating a given entity with the adjectives, adverbs, and verbs describing it, and extracting the associated sentiment to try to infer whether the text is positive or negative in relation to the entity or entities.

Unsupervised Separation of Transliterable and Native Words for Malayalam
Deepak P
Proceedings of the 14th International Conference on Natural Language Processing (ICON-2017)

2016

MixKMeans: Clustering Question-Answer Archives
Deepak P
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

2014

Unsupervised Solution Post Identification from Discussion Forums
Deepak P | Karthik Visweswariah
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Generating a Word-Emotion Lexicon from #Emotional Tweets
Anil Bandhakavi | Nirmalie Wiratunga | Deepak P | Stewart Massie
Proceedings of the Third Joint Conference on Lexical and Computational Semantics (*SEM 2014)