Tapas Nayak

Also published as: Tapas Nayek


2021

pdf bib
A Hierarchical Entity Graph Convolutional Network for Relation Extraction across Documents
Tapas Nayak | Hwee Tou Ng
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

Distantly supervised datasets for relation extraction mostly focus on sentence-level extraction, and they cover very few relations. In this work, we propose cross-document relation extraction, where the two entities of a relation tuple appear in two different documents that are connected via a chain of common entities. Following this idea, we create a dataset for two-hop relation extraction, where each chain contains exactly two documents. Our proposed dataset covers a higher number of relations than the publicly available sentence-level datasets. We also propose a hierarchical entity graph convolutional network (HEGCN) model for this task that improves performance by 1.1% F1 score on our two-hop relation extraction dataset, compared to some strong neural baselines.

pdf bib
Improving Distantly Supervised Relation Extraction with Self-Ensemble Noise Filtering
Tapas Nayak | Navonil Majumder | Soujanya Poria
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

Distantly supervised models are very popular for relation extraction since we can obtain a large amount of training data using the distant supervision method without human annotation. In distant supervision, a sentence is considered as a source of a tuple if the sentence contains both entities of the tuple. However, this condition is too permissive and does not guarantee the presence of relevant relation-specific information in the sentence. As such, distantly supervised training data contains much noise which adversely affects the performance of the models. In this paper, we propose a self-ensemble filtering mechanism to filter out the noisy samples during the training process. We evaluate our proposed framework on the New York Times dataset which is obtained via distant supervision. Our experiments with multiple state-of-the-art neural relation extraction models show that our proposed filtering mechanism improves the robustness of the models and increases their F1 scores.

pdf bib
PASTE: A Tagging-Free Decoding Framework Using Pointer Networks for Aspect Sentiment Triplet Extraction
Rajdeep Mukherjee | Tapas Nayak | Yash Butala | Sourangshu Bhattacharya | Pawan Goyal
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Aspect Sentiment Triplet Extraction (ASTE) deals with extracting opinion triplets, consisting of an opinion target or aspect, its associated sentiment, and the corresponding opinion term/span explaining the rationale behind the sentiment. Existing research efforts are majorly tagging-based. Among the methods taking a sequence tagging approach, some fail to capture the strong interdependence between the three opinion factors, whereas others fall short of identifying triplets with overlapping aspect/opinion spans. A recent grid tagging approach on the other hand fails to capture the span-level semantics while predicting the sentiment between an aspect-opinion pair. Different from these, we present a tagging-free solution for the task, while addressing the limitations of the existing works. We adapt an encoder-decoder architecture with a Pointer Network-based decoding framework that generates an entire opinion triplet at each time step thereby making our solution end-to-end. Interactions between the aspects and opinions are effectively captured by the decoder by considering their entire detected spans while predicting their connecting sentiment. Extensive experiments on several benchmark datasets establish the better efficacy of our proposed approach, especially in recall, and in predicting multiple and aspect/opinion-overlapped triplets from the same review sentence. We report our results both with and without BERT and also demonstrate the utility of domain-specific BERT post-training for the task.

2019

pdf bib
Effective Attention Modeling for Neural Relation Extraction
Tapas Nayak | Hwee Tou Ng
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

Relation extraction is the task of determining the relation between two entities in a sentence. Distantly-supervised models are popular for this task. However, sentences can be long and two entities can be located far from each other in a sentence. The pieces of evidence supporting the presence of a relation between two entities may not be very direct, since the entities may be connected via some indirect links such as a third entity or via co-reference. Relation extraction in such scenarios becomes more challenging as we need to capture the long-distance interactions among the entities and other words in the sentence. Also, the words in a sentence do not contribute equally in identifying the relation between the two entities. To address this issue, we propose a novel and effective attention model which incorporates syntactic information of the sentence and a multi-factor attention mechanism. Experiments on the New York Times corpus show that our proposed model outperforms prior state-of-the-art models.

2016

pdf bib
CATaLog Online: A Web-based CAT Tool for Distributed Translation with Data Capture for APE and Translation Process Research
Santanu Pal | Sudip Kumar Naskar | Marcos Zampieri | Tapas Nayak | Josef van Genabith
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations

We present a free web-based CAT tool called CATaLog Online which provides a novel and user-friendly online CAT environment for post-editors/translators. The goal is to support distributed translation, reduce post-editing time and effort, improve the post-editing experience and capture data for incremental MT/APE (automatic post-editing) and translation process research. The tool supports individual as well as batch mode file translation and provides translations from three engines – translation memory (TM), MT and APE. TM suggestions are color coded to accelerate the post-editing task. The users can integrate their personal TM/MT outputs. The tool remotely monitors and records post-editing activities generating an extensive range of post-editing logs.

pdf bib
CATaLog Online: Porting a Post-editing Tool to the Web
Santanu Pal | Marcos Zampieri | Sudip Kumar Naskar | Tapas Nayak | Mihaela Vela | Josef van Genabith
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper presents CATaLog online, a new web-based MT and TM post-editing tool. CATaLog online is a freeware software that can be used through a web browser and it requires only a simple registration. The tool features a number of editing and log functions similar to the desktop version of CATaLog enhanced with several new features that we describe in detail in this paper. CATaLog online is designed to allow users to post-edit both translation memory segments as well as machine translation output. The tool provides a complete set of log information currently not available in most commercial CAT tools. Log information can be used both for project management purposes as well as for the study of the translation process and translator’s productivity.

2015

pdf bib
CATaLog: New Approaches to TM and Post Editing Interfaces
Tapas Nayek | Sudip Kumar Naskar | Santanu Pal | Marcos Zampieri | Mihaela Vela | Josef van Genabith
Proceedings of the Workshop Natural Language Processing for Translation Memories