L. Venkata Subramaniam

Also published as: L Venkata Subramaniam, L. V. Subramaniam, L V Subramaniam


2022

Zero-shot Entity Linking with Less Data
G P Shrivatsa Bhargav | Dinesh Khandelwal | Saswati Dana | Dinesh Garg | Pavan Kapanipathi | Salim Roukos | Alexander Gray | L Venkata Subramaniam
Findings of the Association for Computational Linguistics: NAACL 2022

Entity Linking (EL) maps an entity mention in a natural language sentence to an entity in a knowledge base (KB). Zero-shot Entity Linking (ZEL) extends the scope of EL to entities unseen until test time, without requiring new labeled data. BLINK (BERT-based) is one of the SOTA models for ZEL. Interestingly, we discovered that BLINK exhibits diminishing returns, i.e., it reaches 98% of its performance with just 1% of the training data, and the remaining 99% of the data yields only a marginal increase of 2% in performance. While this extra 2% gain makes a huge difference for downstream tasks, training BLINK on large amounts of data is very resource-intensive and impractical. In this paper, we propose a neuro-symbolic, multi-task learning approach to bridge this gap. Our approach boosts BLINK's performance with much less data by exploiting auxiliary information about entity types. Specifically, we train our model on two tasks simultaneously: entity linking (primary task) and hierarchical entity type prediction (auxiliary task). The auxiliary task exploits the hierarchical structure of entity types. Our approach achieves superior performance on the ZEL task with significantly less training data. On four different benchmark datasets, we show that our approach achieves significantly higher performance than SOTA models when they are trained with just 0.01%, 0.1%, or 1% of the original training data. Our code is available at https://github.com/IBM/NeSLET.

SYGMA: A System for Generalizable and Modular Question Answering Over Knowledge Bases
Sumit Neelam | Udit Sharma | Hima Karanam | Shajith Ikbal | Pavan Kapanipathi | Ibrahim Abdelaziz | Nandana Mihindukulasooriya | Young-Suk Lee | Santosh Srivastava | Cezar Pendus | Saswati Dana | Dinesh Garg | Achille Fokoue | G P Shrivatsa Bhargav | Dinesh Khandelwal | Srinivas Ravishankar | Sairam Gurajada | Maria Chang | Rosario Uceda-Sosa | Salim Roukos | Alexander Gray | Guilherme Lima | Ryan Riegel | Francois Luus | L V Subramaniam
Findings of the Association for Computational Linguistics: EMNLP 2022

Knowledge Base Question Answering (KBQA) involving complex reasoning is emerging as an important research direction. However, most KBQA systems struggle with generalizability, particularly on two dimensions: (a) across multiple knowledge bases, where existing KBQA approaches are typically tuned to a single knowledge base, and (b) across multiple reasoning types, where the majority of datasets and systems have primarily focused on multi-hop reasoning. In this paper, we present SYGMA, a modular KBQA approach developed with the goal of generalization across multiple knowledge bases and multiple reasoning types. To facilitate this, SYGMA is designed as two high-level modules: 1) a KB-agnostic question understanding module that remains common across KBs and generates a logic representation of the question with extensible high-level reasoning constructs, and 2) a KB-specific question mapping and answering module that addresses the KB-specific aspects of answer extraction. We evaluated SYGMA on multiple datasets belonging to distinct knowledge bases (DBpedia and Wikidata) and distinct reasoning types (multi-hop and temporal). The state-of-the-art or competitive performance achieved on these datasets demonstrates its generalization capability.

2017

SemTagger: A Novel Approach for Semantic Similarity Based Hashtag Recommendation on Twitter
Kuntal Dey | Ritvik Shrivastava | Saroj Kaushik | L Venkata Subramaniam
Proceedings of the 14th International Conference on Natural Language Processing (ICON-2017)

2013

An Empirical Assessment of Contemporary Online Media in Ad-Hoc Corpus Creation for Social Events
Kanika Narang | Seema Nagar | Sameep Mehta | L V Subramaniam | Kuntal Dey
Proceedings of the Sixth International Joint Conference on Natural Language Processing

NLP for uncertain data at scale
Sameep Mehta | L. V. Subramaniam
NAACL HLT 2013 Tutorial Abstracts

2011

Using Text Reviews for Product Entity Completion
Mrinmaya Sachan | Tanveer Faruquie | L. V. Subramaniam | Mukesh Mohania
Proceedings of 5th International Joint Conference on Natural Language Processing

2010

Unsupervised cleansing of noisy text
Danish Contractor | Tanveer A. Faruquie | L. Venkata Subramaniam
Coling 2010: Posters

Handling Noisy Queries in Cross Language FAQ Retrieval
Danish Contractor | Govind Kothari | Tanveer Faruquie | L. V. Subramaniam | Sumit Negi
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

Noisy Text Analytics
L. Venkata Subramaniam
NAACL HLT 2010 Tutorial Abstracts

Automatically Generating Term Frequency Induced Taxonomies
Karin Murthy | Tanveer A Faruquie | L Venkata Subramaniam | Hima Prasad K | Mukesh Mohania
Proceedings of the ACL 2010 Conference Short Papers

2009

SMS based Interface for FAQ Retrieval
Govind Kothari | Sumit Negi | Tanveer A. Faruquie | Venkatesan T. Chakaravarthy | L. Venkata Subramaniam
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

2007

Automatic Identification of Important Segments and Expressions for Mining of Business-Oriented Conversations at Contact Centers
Hironori Takeuchi | L Venkata Subramaniam | Tetsuya Nasukawa | Shourya Roy
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

2006

Automatic Generation of Domain Models for Call-Centers from Noisy Transcriptions
Shourya Roy | L Venkata Subramaniam
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics