Harshit Kumar
2021
VeeAlign: Multifaceted Context Representation Using Dual Attention for Ontology Alignment
Vivek Iyer | Arvind Agarwal | Harshit Kumar
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Ontology Alignment is an important research problem with applications in fields such as data integration, data transfer, and data preparation. State-of-the-art (SOTA) Ontology Alignment systems typically use naive domain-dependent approaches with handcrafted rules or domain-specific architectures, making them unscalable and inefficient. In this work, we propose VeeAlign, a Deep Learning based model that uses a novel dual-attention mechanism to compute the contextualized representation of a concept, which in turn is used to discover alignments. By doing this, our approach is not only able to exploit both the syntactic and semantic information encoded in ontologies, but is also, by design, flexible and scalable to different domains with minimal effort. We evaluate our model on four datasets from different domains and languages, and establish its superiority through these results as well as detailed ablation studies. The code and datasets used are available at https://github.com/Remorax/VeeAlign.
2020
Neural Conversational QA: Learning to Reason vs Exploiting Patterns
Nikhil Verma | Abhishek Sharma | Dhiraj Madan | Danish Contractor | Harshit Kumar | Sachindra Joshi
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Neural Conversational QA tasks such as ShARC require systems to answer questions based on the contents of a given passage. On studying recent state-of-the-art models on the ShARC QA task, we found indications that the models learn spurious clues/patterns in the dataset. Further, a heuristic-based program built to exploit these patterns achieved performance comparable to that of the neural models. In this paper we share our findings about four types of patterns in the ShARC corpus and how the neural models exploit them. Motivated by these findings, we create and share a modified dataset that has fewer spurious patterns than the original, consequently allowing models to learn better.