Jugal Kalita

Also published as: J.K. Kalita, Jugal K. Kalita

2023

Training-free Neural Architecture Search for RNNs and Transformers
Aaron Serianni | Jugal Kalita
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Neural architecture search (NAS) has allowed for the automatic creation of new and effective neural network architectures, offering an alternative to the laborious process of manually designing complex architectures. However, traditional NAS algorithms are slow and require immense amounts of computing power. Recent research has investigated training-free NAS metrics for image classification architectures, drastically speeding up search algorithms. In this paper, we investigate training-free NAS metrics for recurrent neural network (RNN) and BERT-based transformer architectures, targeted towards language modeling tasks. First, we develop a new training-free metric, named hidden covariance, that predicts the trained performance of an RNN architecture and significantly outperforms existing training-free metrics. We experimentally evaluate the effectiveness of the hidden covariance metric on the NAS-Bench-NLP benchmark. Second, we find that the current search space paradigm for transformer architectures is not optimized for training-free neural architecture search. Instead, a simple qualitative analysis can effectively shrink the search space to the best performing architectures. This conclusion is based on our investigation of existing training-free metrics and new metrics developed from recent transformer pruning literature, evaluated on our own benchmark of trained BERT architectures. Ultimately, our analysis shows that the architecture search space and the training-free metric must be developed together in order to achieve effective results. Our source code is available at https://github.com/aaronserianni/training-free-nas.
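
As an illustration of the idea, here is a minimal sketch of a covariance-based, training-free score for an untrained RNN. The exact formulation (timestep choice, normalization, and eigenvalue aggregation) is an assumption on our part; the authors' actual implementation lives in the repository linked above.

```python
import torch

def hidden_covariance_score(rnn: torch.nn.Module, inputs: torch.Tensor) -> float:
    """Score an untrained RNN from the batch covariance of its final hidden states.

    inputs: (batch, seq_len, input_size) minibatch of embedded tokens.
    """
    with torch.no_grad():
        _, hidden = rnn(inputs)
        if isinstance(hidden, tuple):          # an LSTM returns (h_n, c_n)
            hidden = hidden[0]
        h = hidden[-1]                         # last layer: (batch, hidden_size)
        h = h - h.mean(dim=0, keepdim=True)    # center over the batch
        cov = h @ h.t()                        # (batch, batch) covariance
        std = cov.diagonal().clamp(min=1e-9).sqrt()
        corr = cov / (std[:, None] * std[None, :])   # scale-free correlation
        # NASWOT-style aggregation (an assumption): a larger log-determinant
        # indicates more decorrelated hidden states across the batch.
        eigvals = torch.linalg.eigvalsh(corr).clamp(min=1e-9)
        return eigvals.log().sum().item()

# Example: score a random, untrained 2-layer LSTM on a random minibatch.
lstm = torch.nn.LSTM(input_size=64, hidden_size=128, num_layers=2, batch_first=True)
print(hidden_covariance_score(lstm, torch.randn(32, 20, 64)))
```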

Enhancing Translation for Indigenous Languages: Experiments with Multilingual Models
Atnafu Lambebo Tonja | Hellina Hailu Nigatu | Olga Kolesnikova | Grigori Sidorov | Alexander Gelbukh | Jugal Kalita
Proceedings of the Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP)

This paper describes CIC NLP’s submission to the AmericasNLP 2023 Shared Task on machine translation systems for indigenous languages of the Americas. We present system descriptions for three methods. We used two multilingual models, M2M-100 and mBART50, and one bilingual (one-to-one) model, the Helsinki-NLP Spanish-English translation model, and experimented with different transfer learning setups. We worked with 11 languages of the Americas and report the setups we used as well as the results we achieved. Overall, the mBART setup was able to improve upon the baseline for three of the eleven languages.

Abstractive Text Summarization Using the BRIO Training Paradigm
Khang Lam | Thieu Doan | Khang Pham | Jugal Kalita
Findings of the Association for Computational Linguistics: ACL 2023

Summary sentences produced by abstractive summarization models may be coherent and comprehensive, but they lack control and rely heavily on reference summaries. The BRIO training paradigm assumes a non-deterministic distribution to reduce the model’s dependence on reference summaries and improve model performance during inference. This paper presents a straightforward but effective technique to improve abstractive summaries by fine-tuning pre-trained language models and training them with the BRIO paradigm. We build a text summarization dataset for Vietnamese, called VieSum. We perform experiments with abstractive summarization models trained with the BRIO paradigm on the CNNDM and the VieSum datasets. The results show that the models, trained on basic hardware, outperform all existing abstractive summarization models, especially for Vietnamese.
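
For concreteness, here is a minimal sketch of the BRIO-style contrastive ranking term: candidate summaries are sorted best-first by ROUGE against the reference, and a margin loss pushes the model to assign higher length-normalized log-probability to better candidates. This paraphrases the paradigm; it is not the paper's code.

```python
import torch

def brio_ranking_loss(logprobs: torch.Tensor, margin: float = 0.001) -> torch.Tensor:
    """logprobs: (num_candidates,) length-normalized log-probabilities,
    sorted so that index 0 is the highest-ROUGE candidate."""
    loss = logprobs.new_zeros(())
    n = logprobs.size(0)
    for i in range(n):
        for j in range(i + 1, n):
            # A worse-ranked candidate should score lower by a rank-scaled margin.
            loss = loss + torch.clamp(logprobs[j] - logprobs[i] + (j - i) * margin, min=0)
    return loss

# In BRIO this term is added to the usual cross-entropy (MLE) objective:
# total_loss = mle_loss + gamma * brio_ranking_loss(candidate_logprobs)
```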

2021

Towards Multimodal Vision-Language Models Generating Non-Generic Text
Wes Robbins | Zanyar Zohourianshahzadi | Jugal Kalita
Proceedings of the 18th International Conference on Natural Language Processing (ICON)

Vision-language models can assess visual context in an image and generate descriptive text. While the generated text may be accurate and syntactically correct, it is often overly general. To address this, recent work has used optical character recognition to supplement visual information with text extracted from an image. In this work, we contend that vision-language models can benefit from information that can be extracted from an image but is not used by current models. We modify previous multimodal frameworks to accept relevant information from any number of auxiliary classifiers. In particular, we focus on person names as an additional set of tokens and create a novel image-caption dataset to facilitate captioning with person names. The dataset, Politicians and Athletes in Captions (PAC), consists of captioned images of well-known people in context. By fine-tuning pretrained models with this dataset, we demonstrate a model that can naturally integrate facial recognition tokens into generated text by training on limited data. For the PAC dataset, we provide a discussion on collection and baseline benchmark scores.

Using Random Perturbations to Mitigate Adversarial Attacks on Sentiment Analysis Models
Abigail Swenor | Jugal Kalita
Proceedings of the 18th International Conference on Natural Language Processing (ICON)

Attacks on deep learning models are often difficult to identify and therefore difficult to protect against. This problem is exacerbated by the use of public datasets that typically are not manually inspected before use. In this paper, we offer a solution to this vulnerability by applying, during testing, random perturbations such as spelling correction if necessary, substitution by a random synonym, or simply dropping the word. These perturbations are applied to random words in random sentences to defend NLP models against adversarial attacks. Our Random Perturbations Defense and Increased Randomness Defense methods succeed in returning attacked models to accuracy levels close to those of the models before the attacks. The original accuracy of the model used in this work is 80% for sentiment classification. After undergoing attacks, its accuracy drops to between 0% and 44%. After applying our defense methods, the accuracy of the model is restored to the original accuracy within statistical significance.
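
A minimal sketch of the perturbation step follows; the perturbation rates, the spelling-correction option, and the aggregation of predictions over perturbed copies are simplified here relative to the paper.

```python
import random
from nltk.corpus import wordnet  # requires: nltk.download("wordnet")

def perturb(sentence: str, num_words: int = 2) -> str:
    """Randomly drop a few words or swap them for random WordNet synonyms."""
    words = sentence.split()
    for idx in random.sample(range(len(words)), k=min(num_words, len(words))):
        if random.random() < 0.5:
            words[idx] = ""                    # drop the word
        else:
            synonyms = {l.name().replace("_", " ")
                        for s in wordnet.synsets(words[idx]) for l in s.lemmas()}
            synonyms.discard(words[idx])
            if synonyms:
                words[idx] = random.choice(sorted(synonyms))
    return " ".join(w for w in words if w)

# A defended prediction can then, e.g., majority-vote over several perturbed copies:
# votes = [model(perturb(text)) for _ in range(7)]
```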

2020

Solving Arithmetic Word Problems Using Transformer and Pre-processing of Problem Texts
Kaden Griffith | Jugal Kalita
Proceedings of the 17th International Conference on Natural Language Processing (ICON)

This paper outlines the use of Transformer networks trained to translate math word problems to equivalent arithmetic expressions in infix, prefix, and postfix notations. We compare results produced by a large number of neural configurations and find that most configurations outperform previously reported approaches on three of four datasets with significant increases in accuracy of over 20 percentage points. The best neural approaches boost accuracy by 30% on average when compared to the previous state-of-the-art.
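
As a concrete example of the three target notations, the expression (5 + 3) * 2 is written "* + 5 3 2" in prefix and "5 3 + 2 *" in postfix. Postfix output is convenient because a single stack pass evaluates it, as this small evaluator (ours, not the paper's) illustrates:

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def eval_postfix(tokens: list[str]) -> float:
    """Evaluate a postfix token sequence with a single stack pass."""
    stack: list[float] = []
    for tok in tokens:
        if tok in OPS:
            b, a = stack.pop(), stack.pop()   # note the operand order
            stack.append(OPS[tok](a, b))
        else:
            stack.append(float(tok))
    return stack.pop()

assert eval_postfix("5 3 + 2 *".split()) == 16.0
```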

Language Model Metrics and Procrustes Analysis for Improved Vector Transformation of NLP Embeddings
Thomas Conley | Jugal Kalita
Proceedings of the 17th International Conference on Natural Language Processing (ICON)

Artificial neural networks are mathematical models at their core. This truism presents some fundamental difficulty when networks are tasked with Natural Language Processing. A key problem lies in measuring the similarity or distance among vectors in NLP embedding space, since the mathematical concept of distance does not always agree with the linguistic concept. We suggest that the best way to measure linguistic distance among vectors is to employ the Language Model (LM) that created them. We introduce Language Model Distance (LMD) for measuring the accuracy of vector transformations based on the Distributional Hypothesis (LMD Accuracy). We show the efficacy of this metric by applying it to a simple neural network learning the Procrustes algorithm for bilingual word mapping.
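
For reference, the Procrustes step being learned is the classic orthogonal-mapping problem: given paired source and target embeddings X and Y from a seed dictionary, find the orthogonal W minimizing ||XW - Y||_F. A minimal NumPy sketch of its closed-form SVD solution (not the paper's network) follows:

```python
import numpy as np

def procrustes_map(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """X, Y: (n_pairs, dim) source/target embeddings for a seed dictionary.
    Returns the orthogonal W minimizing ||X @ W - Y||_F."""
    u, _, vt = np.linalg.svd(X.T @ Y)
    return u @ vt

# Usage: map a source-language vector into the target space with `vec @ W`,
# then rank target-language neighbors; the paper proposes judging that
# ranking with a language-model-based distance rather than plain cosine.
```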

Generalization to Mitigate Synonym Substitution Attacks
Basemah Alshemali | Jugal Kalita
Proceedings of Deep Learning Inside Out (DeeLIO): The First Workshop on Knowledge Extraction and Integration for Deep Learning Architectures

Studies have shown that deep neural networks (DNNs) are vulnerable to adversarial examples – perturbed inputs that cause DNN-based models to produce incorrect results. One robust adversarial attack in the NLP domain is synonym substitution. In attacks of this variety, the adversary substitutes words with their synonyms. Since synonym substitution perturbations aim to satisfy all lexical, grammatical, and semantic constraints, they are difficult to detect with automatic syntax checks as well as by humans. In this paper, we propose a structure-free defensive method that is capable of improving the performance of DNN-based models with both clean and adversarial data. Our findings show that replacing the embeddings of the important words in the input samples with the average of their synonyms’ embeddings can significantly improve the generalization of DNN-based classifiers. By doing so, we reduce model sensitivity to particular words in the input samples. Our results indicate that the proposed defense is not only capable of defending against adversarial attacks, but is also capable of improving the performance of DNN-based models when tested on benign data. On average, the proposed defense improved the classification accuracy of the CNN and Bi-LSTM models by 41.30% and 55.66%, respectively, when tested under adversarial attacks. Extended investigation shows that our defensive method can improve the robustness of non-neural models, achieving an average classification accuracy increase of 17.62% and 22.93% on the SVM and XGBoost models, respectively. The proposed defensive method has also shown an average classification accuracy improvement of 26.60% when tested with the well-known BERT model. Our algorithm is generic enough to be applied in any NLP domain and to any model trained on any natural language.
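
A minimal sketch of the core defense, with `embed` standing in for the model's own embedding lookup and WordNet as an assumed synonym source:

```python
import numpy as np
from nltk.corpus import wordnet  # requires: nltk.download("wordnet")

def smoothed_embedding(word: str, embed) -> np.ndarray:
    """Replace a word's vector with the mean of its own and its synonyms'
    vectors, blunting the effect of any single synonym substitution.
    embed: callable mapping a word to its vector (e.g., the model's lookup)."""
    synonyms = {l.name() for s in wordnet.synsets(word) for l in s.lemmas()
                if "_" not in l.name()}
    synonyms.add(word)
    return np.mean([embed(w) for w in synonyms], axis=0)
```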

2019

Introducing Aspects of Creativity in Automatic Poetry Generation
Brendan Bena | Jugal Kalita
Proceedings of the 16th International Conference on Natural Language Processing

Poetry Generation involves teaching systems to automatically generate text that resembles poetic work. A deep learning system can learn to generate poetry on its own by training on a corpus of poems and modeling the particular style of language. In this paper, we propose an approach that fine-tunes GPT-2, a pre-trained language model, to our downstream task of poetry generation. We extend prior work on poetry generation by introducing creative elements. Specifically, we generate poems that express emotion and elicit the same in readers, and poems that use the language of dreams—called dream poetry. We are able to produce poems that correctly elicit the emotions of sadness and joy 87.5% and 85% of the time, respectively. We produce dreamlike poetry by training on a corpus of texts that describe dreams. Poems from this model are shown to capture elements of dream poetry, with scores of no less than 3.2 on the Likert scale. We perform crowdsourced human evaluation for all our poems. We also make use of the Coh-Metrix tool, outlining the metrics we use to gauge the quality of the generated text.
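
A minimal sketch of the fine-tune-then-sample recipe with the Hugging Face transformers library; the base "gpt2" checkpoint below is a placeholder for a model fine-tuned on the emotion- or dream-labeled poem corpora described above.

```python
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")  # swap in the fine-tuned checkpoint

prompt = tokenizer("The moon above the sleeping town", return_tensors="pt")
out = model.generate(**prompt, max_length=60, do_sample=True, top_k=50,
                     temperature=0.9, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```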

2018

Genre Identification and the Compositional Effect of Genre in Literature
Joseph Worsham | Jugal Kalita
Proceedings of the 27th International Conference on Computational Linguistics

Recent advances in Natural Language Processing are placing an emphasis on the hierarchical nature of text instead of representing language as a flat sequence or unordered collection of words or letters. A human reader must capture multiple levels of abstraction and meaning in order to formulate an understanding of a document. In this paper, we address the problem of developing approaches which are capable of working with extremely large and complex literary documents to perform Genre Identification. The task is to assign the literary classification to a full-length book belonging to a corpus of literature, where the works on average are well over 200,000 words long and genre is an abstract thematic concept. We introduce the Gutenberg Dataset for Genre Identification. Additionally, we present a study on how current deep learning models compare to traditional methods for this task. The results are presented as a baseline along with findings on how using an ensemble of chapters can significantly improve results in deep learning methods. The motivation behind the ensemble-of-chapters method is the compositionality of the subtexts that make up a larger work and contribute to its overall genre.
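
A minimal sketch of the chapter-ensemble aggregation as we read it (classify chapters independently, then average), with `classify_chapter` as a placeholder for any trained chapter-level genre classifier:

```python
import numpy as np

def book_genre(chapters: list[str], classify_chapter) -> int:
    """Classify each chapter, average the class probabilities, pick the top genre."""
    probs = np.stack([classify_chapter(ch) for ch in chapters])  # (n_chapters, n_genres)
    return int(probs.mean(axis=0).argmax())
```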

Isolated and Ensemble Audio Preprocessing Methods for Detecting Adversarial Examples against Automatic Speech Recognition
Krishan Rajaratnam | Kunal Shah | Jugal Kalita
Proceedings of the 30th Conference on Computational Linguistics and Speech Processing (ROCLING 2018)

2017

Neural Networks for Semantic Textual Similarity
Derek Prijatelj | Jugal Kalita | Jonathan Ventura
Proceedings of the 14th International Conference on Natural Language Processing (ICON-2017)

Open Set Text Classification Using CNNs
Sridhama Prakhya | Vinodini Venkataram | Jugal Kalita
Proceedings of the 14th International Conference on Natural Language Processing (ICON-2017)

2016

Enhancing Automatic Wordnet Construction Using Word Embeddings
Feras Al Tarouti | Jugal Kalita
Proceedings of the Workshop on Multilingual and Cross-lingual Methods in NLP

Integrating WordNet for Multiple Sense Embeddings in Vector Semantics
David Foley | Jugal Kalita
Proceedings of the 13th International Conference on Natural Language Processing

Composition of Compound Nouns Using Distributional Semantics
Kyra Yee | Jugal Kalita
Proceedings of the 13th International Conference on Natural Language Processing

2015

Phrase translation using a bilingual dictionary and n-gram data: A case study from Vietnamese to English
Khang Nhut Lam | Feras Al Tarouti | Jugal Kalita
Proceedings of the 11th Workshop on Multiword Expressions

2014

Creating Lexical Resources for Endangered Languages
Khang Nhut Lam | Feras Al Tarouti | Jugal Kalita
Proceedings of the 2014 Workshop on the Use of Computational Methods in the Study of Endangered Languages

Automatically constructing Wordnet Synsets
Khang Nhut Lam | Feras Al Tarouti | Jugal Kalita
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2013

Better Twitter Summaries?
Joel Judd | Jugal Kalita
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Creating Reverse Bilingual Dictionaries
Khang Nhut Lam | Jugal Kalita
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2012

Summarization of Historical Articles Using Temporal Event Clustering
James Gung | Jugal Kalita
Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Multi-objective Optimization for Efficient Brahmic Keyboards
Albert Brouillette | Devraj Sarmah | Jugal Kalita
Proceedings of the Second Workshop on Advances in Text Input Methods

2010

Summarizing Microblogs Automatically
Beaux Sharifi | Mark-Anthony Hutton | Jugal Kalita
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

2009

Part of Speech Tagger for Assamese Text
Navanath Saharia | Dhrubajyoti Das | Utpal Sharma | Jugal Kalita
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

2002

Unsupervised Learning of Morphology for Building Lexicon for a Highly Inflectional Language
Utpal Sharma | Jugal Kalita | Rajib Das
Proceedings of the ACL-02 Workshop on Morphological and Phonological Learning

1988

Automatically Generating Natural Language Reports in an Office Environment
Jugal Kalita | Sunil Shende
Second Conference on Applied Natural Language Processing

1986

Summarizing Natural Language Database Responses
Jugal K. Kalita | Marlene L. Jones | Gordon I. McCalla
Computational Linguistics, Volume 12, Number 2, April-June 1986

1984

A Response to the Need for Summary Responses
J.K. Kalita | M.J. Colbourn | G.I. McCalla
10th International Conference on Computational Linguistics and 22nd Annual Meeting of the Association for Computational Linguistics