Abir Chakraborty


2024

RGAT at SemEval-2024 Task 2: Biomedical Natural Language Inference using Graph Attention Network
Abir Chakraborty
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

In this work, we (team RGAT) describe our approaches for the SemEval 2024 Task 2: Safe Biomedical Natural Language Inference for Clinical Trials (NLI4CT). The objective of this task is multi-evidence natural language inference based on different sections of clinical trial reports. We explored three approaches: (a) using the dependency tree of the input query as additional features in a Graph Attention Network (GAT) along with the token and part-of-speech features, (b) a sequence-to-sequence approach using various models and synthetic data, and (c) in-context learning using large language models (LLMs) like GPT-4. Among these three approaches, the best result is obtained from the LLM, with an F1-score of 0.76 (the highest being 0.78), 0.86 in faithfulness and 0.74 in consistency.

2023

Aspect and Opinion Term Extraction Using Graph Attention Network
Abir Chakraborty
Proceedings of the 20th International Conference on Natural Language Processing (ICON)

In this work we investigate the capability of a Graph Attention Network for extracting aspect and opinion terms. Aspect and opinion term extraction is posed as a token-level classification task akin to named entity recognition. We use the dependency tree of the input query as an additional feature in a Graph Attention Network along with the token and part-of-speech features. We show that the dependency structure is a powerful feature that, in the presence of a CRF layer, substantially improves the performance and generates the best result on the commonly used datasets from SemEval 2014, 2015 and 2016. We experiment with additional layers like BiLSTM and Transformer in addition to the CRF layer. We also show that our approach works well in the presence of multiple aspects or sentiments in the same query, and that it is not necessary to modify the dependency tree based on a single aspect, as was done in the original application for sentiment classification.
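
As a rough illustration of how a dependency tree can restrict attention in a graph attention layer, the following is a minimal NumPy sketch, not the implementation from the paper. It assumes a single attention head, undirected dependency arcs with self-loops, and the commonly used GAT scoring form LeakyReLU(a^T [W h_i || W h_j]). The function names, the hand-written parse in `heads`, and the random weights are all hypothetical.

```python
import numpy as np

def dependency_adjacency(heads):
    """Build a boolean adjacency matrix from dependency head indices.

    heads[i] is the index of token i's syntactic head, or -1 for the root.
    Arcs are treated as undirected and every token gets a self-loop
    (both are assumptions of this sketch).
    """
    n = len(heads)
    adj = np.eye(n, dtype=bool)
    for child, head in enumerate(heads):
        if head >= 0:
            adj[child, head] = True
            adj[head, child] = True
    return adj

def gat_layer(x, adj, w, a, slope=0.2):
    """Single-head graph attention over the neighbours allowed by adj.

    x: (n, d_in) token features, w: (d_in, d_out), a: (2 * d_out,).
    Returns the updated features and the attention matrix.
    """
    h = x @ w
    d_out = h.shape[1]
    # a^T [h_i || h_j] splits into a source term plus a target term.
    scores = (h @ a[:d_out])[:, None] + (h @ a[d_out:])[None, :]
    scores = np.where(scores > 0, scores, slope * scores)   # LeakyReLU
    scores = np.where(adj, scores, -np.inf)                  # mask non-neighbours
    scores = scores - scores.max(axis=1, keepdims=True)      # stable softmax
    alpha = np.exp(scores)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ h, alpha

# Hand-written parse of "The battery life is great" (hypothetical example):
# The -> life, battery -> life, life -> great, is -> great, great = root.
heads = [2, 2, 4, 4, -1]
adj = dependency_adjacency(heads)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))    # stand-in for token + part-of-speech features
w = rng.normal(size=(8, 4))
a = rng.normal(size=(8,))
out, alpha = gat_layer(x, adj, w, a)
```

Each row of `alpha` is a distribution over the token itself and its dependency neighbours only. In a tagging setup the rows of `out` would then feed a downstream classifier such as a CRF layer.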

RGAT at SemEval-2023 Task 2: Named Entity Recognition Using Graph Attention Network
Abir Chakraborty
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

In this paper, we (team RGAT) describe our approach for the SemEval 2023 Task 2: Multilingual Complex Named Entity Recognition (MultiCoNER II). The goal of this task is to locate and classify named entities in unstructured short complex texts in 12 different languages and one multilingual setup. We use the dependency tree of the input query as an additional feature in a Graph Attention Network along with the token and part-of-speech features. We also experiment with additional layers like BiLSTM and Transformer in addition to the CRF layer. However, we have not included any external knowledge base like Wikipedia to enrich our inputs. We evaluated our proposed approach on the English NER dataset, which resulted in a clean-subset F1 of 61.29% and an overall F1 of 56.91%. However, other approaches that used an external knowledge base performed significantly better.

2021

Aspect Based Sentiment Analysis Using Spectral Temporal Graph Neural Network
Abir Chakraborty
Proceedings of the 18th International Conference on Natural Language Processing (ICON)

The objective of Aspect Based Sentiment Analysis is to capture the sentiment of reviewers associated with different aspects. However, the complexity of review sentences, the presence of double negation and the domain-specific usage of words make it difficult to predict the sentiment accurately, and make this a challenging natural language understanding task overall. While recurrent neural networks, attention mechanisms and, more recently, graph attention based models are prevalent, in this paper we propose a graph Fourier transform based network with features created in the spectral domain. While this approach has found considerable success in the forecasting domain, it has not been explored earlier for any natural language processing task. The method relies on creating and learning an underlying graph from the raw data and then using the adjacency matrix to shift to the graph Fourier domain. Subsequently, a Fourier transform is used to switch to the frequency (spectral) domain, where new features are created. This series of transformations proved to be extremely efficient in learning the right representation: our model achieves the best result on both the SemEval-2014 datasets, i.e., the “Laptop” and “Restaurants” domains. Our proposed model also achieved competitive results on two other recently proposed datasets from the e-commerce domain.
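
The two-step transformation described above (graph Fourier domain first, then frequency domain) can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the model from the paper: the adjacency matrix is fixed by hand rather than learned, the graph Fourier basis is taken from the eigenvectors of the symmetric normalized Laplacian, and the function names and toy data are hypothetical.

```python
import numpy as np

def graph_fourier_basis(adj):
    """Eigen-decompose L = I - D^{-1/2} A D^{-1/2} for a symmetric adjacency.

    Assumes every node has at least one edge, so all degrees are positive.
    """
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    lap = np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(lap)   # eigenvalues in ascending order
    return eigvals, eigvecs

def gft(signal, eigvecs):
    """Graph Fourier transform: project node signals onto the eigenvectors."""
    return eigvecs.T @ signal

def igft(coeffs, eigvecs):
    """Inverse graph Fourier transform."""
    return eigvecs @ coeffs

# Toy 4-node path graph; in the paper the graph is learned from the data.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
eigvals, eigvecs = graph_fourier_basis(adj)

rng = np.random.default_rng(0)
signal = rng.normal(size=(4, 6))             # 4 nodes, 6 steps per node

graph_domain = gft(signal, eigvecs)          # step 1: graph Fourier domain
spectral = np.fft.fft(graph_domain, axis=1)  # step 2: frequency domain

# Both steps are invertible, so the original signal can be recovered.
recovered = igft(np.fft.ifft(spectral, axis=1).real, eigvecs)
```

In a learned model, the new features would be computed from `spectral` before inverting; the round trip here only checks that the two transforms are consistent.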

Deep Embedding of Conversation Segments
Abir Chakraborty | Anirban Majumder
Proceedings of the 18th International Conference on Natural Language Processing (ICON)

We introduce a novel conversation embedding by extending the Bidirectional Encoder Representations from Transformers (BERT) framework. Specifically, information related to “turn” and “role”, which is unique to conversations, is added to the word tokens, and the next sentence prediction task predicts a segment of a conversation, possibly spanning multiple roles and turns. It is observed that the addition of role and turn substantially increases the next sentence prediction accuracy. Conversation embeddings obtained in this fashion are applied to (a) conversation clustering, (b) conversation classification and (c) as a context for automated conversation generation on new datasets (unseen by the pre-training model). We found that clustering accuracy is greatly improved if embeddings are used as features, as opposed to conventional tf-idf based features that do not take role or turn information into account. On the classification task, a model fine-tuned on conversation embeddings achieves accuracy comparable to an optimized linear SVM model on tf-idf based features. Finally, we present a way of capturing variable-length context in sequence-to-sequence models by utilizing this conversation embedding and show that the BLEU score improves over a vanilla sequence-to-sequence model without context.
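
One plausible way to attach role and turn information to word tokens is to sum separate embedding lookups, in the same way BERT sums token, segment and position embeddings. The abstract does not specify the exact mechanism, so the sketch below is an assumption for illustration only; the table sizes, names and example ids are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, ROLES, MAX_TURNS, DIM = 100, 2, 16, 8   # hypothetical sizes

# Randomly initialised lookup tables; these would be learned in practice.
token_table = rng.normal(size=(VOCAB, DIM))
role_table = rng.normal(size=(ROLES, DIM))     # e.g. 0 = customer, 1 = agent
turn_table = rng.normal(size=(MAX_TURNS, DIM))

def embed(token_ids, role_ids, turn_ids):
    """Sum token, role and turn embeddings for each position (assumed scheme)."""
    token_ids = np.asarray(token_ids)
    role_ids = np.asarray(role_ids)
    turn_ids = np.asarray(turn_ids)
    if not (token_ids.shape == role_ids.shape == turn_ids.shape):
        raise ValueError("token, role and turn ids must have the same shape")
    return token_table[token_ids] + role_table[role_ids] + turn_table[turn_ids]

# The same token id (5) appears under different roles and turns.
tokens = [5, 17, 5, 42, 5]
roles = [0, 0, 1, 1, 0]
turns = [0, 0, 1, 1, 2]
vecs = embed(tokens, roles, turns)
```

With this scheme the same word receives a different input vector depending on who said it and in which turn, which is the kind of signal plain tf-idf features cannot carry.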