Abhishek Singh

2024

pdf abs
Generalizable and Stable Finetuning of Pretrained Language Models on Low-Resource Texts
Sai Ashish Somayajula | Youwei Liang | Li Zhang | Abhishek Singh | Pengtao Xie
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

Pretrained Language Models (PLMs) have advanced Natural Language Processing (NLP) tasks significantly, but finetuning PLMs on low-resource datasets poses significant challenges such as instability and overfitting. Previous methods tackle these issues by finetuning a strategically chosen subnetwork on a downstream task, while keeping the remaining weights fixed to the pretrained weights. However, they rely on a suboptimal criteria for sub-network selection, leading to suboptimal solutions. To address these limitations, we propose a regularization method based on attention-guided weight mixup for finetuning PLMs. Our approach represents each network weight as a mixup of task-specific weight and pretrained weight, controlled by a learnable attention parameter, providing finer control over sub-network selection. Furthermore, we employ a bi-level optimization (BLO) based framework on two separate splits of the training dataset, improving generalization and combating overfitting. We validate the efficacy of our proposed method through extensive experiments, demonstrating its superiority over previous methods, particularly in the context of finetuning PLMs on low-resource datasets. Our code is available at https://github.com/Sai-Ashish/Attention_guided_weight_mixup_BLO.

2022

pdf abs
Team LEGO at SemEval-2022 Task 4: Machine Learning Methods for PCL Detection
Abhishek Singh
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

In this paper, we present our submission to the SemEval 2022 - Task 4 on Patronizing and Condescending Language (PCL) detection. Weapproach this problem as a traditional text classification problem with machine learning (ML)methods. We experiment and investigate theuse of various ML algorithms for detecting PCL in news articles. Our best methodology achieves an F1- Score of 0.39 for subtask1 witha rank of 63 out of 80, and F1-score of 0.082for subtask2 with a rank of 41 out of 48 on the blind dataset provided in the shared task.

2021

pdf
Joint abstractive and extractive method for long financial document summarization
Nadhem Zmandar | Abhishek Singh | Mahmoud El-Haj | Paul Rayson
Proceedings of the 3rd Financial Narrative Processing Workshop

2020

Clinical notes contain rich information, which is relatively unexploited in predictive modeling compared to structured data. In this work, we developed a new clinical text representation Clinical XLNet that leverages the temporal information of the sequence of the notes. We evaluated our models on prolonged mechanical ventilation prediction problem and our experiments demonstrated that Clinical XLNet outperforms the best baselines consistently. The models and scripts are made publicly available.

pdf abs
PoinT-5: Pointer Network and T-5 based Financial Narrative Summarisation
Abhishek Singh
Proceedings of the 1st Joint Workshop on Financial Narrative Processing and MultiLing Financial Summarisation

Companies provide annual reports to their shareholders at the end of the financial year that de-scribes their operations and financial conditions. The average length of these reports is 80, andit may extend up to 250 pages long. In this paper, we propose our methodology PoinT-5 (thecombination of Pointer Network and T-5 (Test-to-text transfer Transformer) algorithms) that weused in the Financial Narrative Summarisation (FNS) 2020 task. The proposed method usesPointer networks to extract important narrative sentences from the report, and then T-5 is used toparaphrase extracted sentences into a concise yet informative sentence. We evaluate our methodusing Rouge-N (1,2), L, and SU4. The proposed method achieves the highest precision scores inall the metrics and highest F1 scores in three out of four evaluation metrics that are Rouge 1, 2,and LCS and only solution to cross MUSE solution baseline in Rouge-LCS metrics.

pdf abs
Voice@SRIB at SemEval-2020 Tasks 9 and 12: Stacked Ensemblingmethod for Sentiment and Offensiveness detection in Social Media
Abhishek Singh | Surya Pratap Singh Parmar
Proceedings of the Fourteenth Workshop on Semantic Evaluation

In social-media platforms such as Twitter, Facebook, and Reddit, people prefer to use code-mixed language such as Spanish-English, Hindi-English to express their opinions. In this paper, we describe different models we used, using the external dataset to train embeddings, ensembling methods for Sentimix, and OffensEval tasks. The use of pre-trained embeddings usually helps in multiple tasks such as sentence classification, and machine translation. In this experiment, we have used our trained code-mixed embeddings and twitter pre-trained embeddings to SemEval tasks. We evaluate our models on macro F1-score, precision, accuracy, and recall on the datasets. We intend to show that hyper-parameter tuning and data pre-processing steps help a lot in improving the scores. In our experiments, we are able to achieve 0.886 F1-Macro on OffenEval Greek language subtask post-evaluation, whereas the highest is 0.852 during the Evaluation Period. We stood third in Spanglish competition with our best F1-score of 0.756. Codalab username is asking28.

2019

pdf abs
Incorporating Emoji Descriptions Improves Tweet Classification
Abhishek Singh | Eduardo Blanco | Wei Jin
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Tweets are short messages that often include specialized language such as hashtags and emojis. In this paper, we present a simple strategy to process emojis: replace them with their natural language description and use pretrained word embeddings as normally done with standard words. We show that this strategy is more effective than using pretrained emoji embeddings for tweet classification. Specifically, we obtain new state-of-the-art results in irony detection and sentiment analysis despite our neural network is simpler than previous proposals.