Deepak P

Also published as: Deepak Padmanabhan


2023

Fact-checking of health-related claims has become necessary in this digital age, where any information posted online is easily available to everyone. The most effective way to verify such claims is to use evidence obtained from reliable sources of medical knowledge, such as PubMed. Recent advances in NLP have helped automate such fact-checking tasks. In this work, we propose a domain-specific BERT-based model that uses transfer learning to predict the veracity of claim-evidence pairs for the verification of health-related facts. We also improve on a method for combining multiple pieces of evidence retrieved for a single claim, taking conflicting evidence into consideration as well. Finally, we show how our model can be exploited when labelled data is available and how back-translation can be used to augment data when data is scarce.
Research on fake review detection and review helpfulness prediction is prevalent, yet most studies focus solely on one or the other, treating them as separate research tasks. In contrast to this prevailing pattern, we address both challenges concurrently by employing a multi-task learning approach. We posit that undertaking these tasks simultaneously can enhance the performance of each task through shared information among features. We utilize pre-trained RoBERTa embeddings with a document-level data representation. This is coupled with an array of deep learning models, including Bi-LSTM, LSTM, GRU, and CNN. Additionally, we employ ensemble learning techniques to integrate these models, with the objective of enhancing overall prediction accuracy and mitigating the risk of overfitting. The findings of this study offer valuable insights to the fields of natural language processing and machine learning and present a novel perspective on leveraging multi-task learning for the twin challenges of fake review detection and review helpfulness prediction.
Online reviews have become critical in informing purchasing decisions, making the detection of fake reviews a crucial challenge to tackle. Many different machine learning based solutions have been proposed, using various data representations such as n-grams or document embeddings. In this paper, we first explore the effectiveness of different data representations, including emotion, document embedding, n-grams, and noun phrases in embedding format, for fake review detection. We evaluate these representations with various state-of-the-art deep learning models, such as Bi-LSTM, LSTM, GRU, CNN, and MLP. Following this, we propose to incorporate different data representations and classification models using early and late data fusion techniques in order to improve the prediction performance. The experiments are conducted on four datasets: Hotel, Restaurant, Amazon, and Yelp. The results demonstrate that combinations of different data representations significantly outperform any of the single data representations.
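As an illustration of the difference between early and late fusion mentioned in the abstract above, here is a minimal sketch. It is not the authors' implementation: the function names, the toy feature vectors, the use of simple probability averaging for late fusion, and the 0.5 threshold are all assumptions made for this example.

```python
def early_fusion(representations):
    """Early fusion: concatenate the feature vectors of one review
    (e.g. emotion features, document embedding, n-gram features)
    into a single input vector for one classifier."""
    fused = []
    for vector in representations:
        fused.extend(vector)
    return fused


def late_fusion(probabilities, threshold=0.5):
    """Late fusion: each representation has its own classifier; here their
    predicted fake-review probabilities are averaged (an assumed rule)
    and the average is thresholded."""
    average = sum(probabilities) / len(probabilities)
    return average, average >= threshold


# Toy inputs (hypothetical values, for illustration only).
emotion_features = [0.1, 0.2]
embedding_features = [1.0]
ngram_features = [0.5, 0.5]

fused_vector = early_fusion([emotion_features, embedding_features, ngram_features])
average_probability, is_fake = late_fusion([0.9, 0.6, 0.3])
```

In early fusion a single model sees all representations at once, whereas in late fusion the models are trained independently and only their outputs are merged; other merge rules (weighted averaging, voting, a meta-classifier) would fit the same pattern.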

2021

Online reviews are an essential aspect of online shopping for both customers and retailers. However, many reviews found on the Internet lack quality, informativeness or helpfulness. In many cases, they lead customers towards positive or negative opinions without providing any concrete details (e.g., very poor product, I would not recommend it). In this work, we propose a novel unsupervised method for quantifying helpfulness that leverages the availability of a corpus of reviews. In particular, our method exploits three characteristics of the reviews, viz., relevance, emotional intensity and specificity, towards quantifying helpfulness. We perform three rankings (one for each feature above), which are then combined to obtain a final helpfulness ranking. To evaluate our method empirically, we use reviews of four product categories from Amazon. The experimental evaluation demonstrates the effectiveness of our method in comparison to a recent state-of-the-art baseline.
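The abstract above does not say how the three per-feature rankings are combined. The sketch below shows one common option, averaging each review's rank position across features, purely as an illustration. The function names, the toy scores, the tie-breaking rule, and the assumption that every score is oriented so that higher means more helpful are all hypothetical.

```python
from statistics import mean


def ranks_from_scores(scores):
    """Return {review_id: rank} for one feature, where rank 1 is the
    highest-scoring review (scores assumed oriented: higher = more helpful)."""
    ordered = sorted(scores, key=lambda review_id: scores[review_id], reverse=True)
    return {review_id: position for position, review_id in enumerate(ordered, start=1)}


def combine_rankings(feature_scores):
    """Combine per-feature rankings by mean rank (an assumed aggregation rule).
    Ties are broken by review id so the output is deterministic."""
    per_feature_ranks = [ranks_from_scores(scores) for scores in feature_scores.values()]
    review_ids = list(per_feature_ranks[0])
    mean_rank = {
        review_id: mean(ranks[review_id] for ranks in per_feature_ranks)
        for review_id in review_ids
    }
    return sorted(review_ids, key=lambda review_id: (mean_rank[review_id], review_id))


# Toy scores for three reviews (hypothetical values, for illustration only).
feature_scores = {
    "relevance": {"a": 0.9, "b": 0.4, "c": 0.1},
    "emotional_intensity": {"a": 0.6, "b": 0.8, "c": 0.5},
    "specificity": {"a": 0.7, "b": 0.6, "c": 0.1},
}

final_ranking = combine_rankings(feature_scores)
```

Here review "a" ranks first on two of the three features and second on the other, so it ends up at the top of the combined ranking.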