Aparna Garimella


A Neural CRF-based Hierarchical Approach for Linear Text Segmentation
Inderjeet Nair | Aparna Garimella | Balaji Vasan Srinivasan | Natwar Modani | Niyati Chhaya | Srikrishna Karanam | Sumit Shekhar
Findings of the Association for Computational Linguistics: EACL 2023

We consider the problem of segmenting unformatted text and transcripts linearly based on their topical structure. While prior approaches explicitly train to predict segment boundaries, our proposed approach solves this task by inferring the hierarchical segmentation structure associated with the input text fragment. Given the lack of a large annotated dataset for this task, we propose a data curation strategy and create a corpus of over 700K Wikipedia articles with their hierarchical structures. We then propose the first supervised approach to generating hierarchical segmentation structures based on these annotations. Our method, in particular, is based on a neural conditional random field (CRF), which explicitly models the statistical dependency between a node and its constituent child nodes. We introduce a new data augmentation scheme as part of our model training strategy, which involves sampling a variety of node aggregations, permutations, and removals, all of which help capture fine-grained and coarse topical shifts in the data and improve model performance. Extensive experiments show that our model outperforms or achieves competitive performance when compared to previous state-of-the-art algorithms in the following settings: rich-resource, cross-domain transferability, few-shot supervision, and segmentation when topic label annotations are provided.


Demographic-Aware Language Model Fine-tuning as a Bias Mitigation Technique
Aparna Garimella | Rada Mihalcea | Akhash Amarnath
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

BERT-like language models (LMs), when exposed to large unstructured datasets, are known to learn and sometimes even amplify the biases present in such data. These biases generally reflect social stereotypes with respect to gender, race, age, and others. In this paper, we analyze the variations in gender and racial biases in BERT, a large pre-trained LM, when exposed to different demographic groups. Specifically, we investigate the effect of fine-tuning BERT on text authored by historically disadvantaged demographic groups in comparison to that by advantaged groups. We show that simply by fine-tuning BERT-like LMs on text authored by certain demographic groups can result in the mitigation of social biases in these LMs against various target groups.

Graph-based Keyword Planning for Legal Clause Generation from Topics
Sagar Joshi | Sumanth Balaji | Aparna Garimella | Vasudeva Varma
Proceedings of the Natural Legal Language Processing Workshop 2022

Generating domain-specific content such as legal clauses based on minimal user-provided information can be of significant benefit in automating legal contract generation. In this paper, we propose a controllable graph-based mechanism that can generate legal clauses using only the topic or type of the legal clauses. Our pipeline consists of two stages involving a graph-based planner followed by a clause generator. The planner outlines the content of a legal clause as a sequence of keywords in the order of generic to more specific clause information based on the input topic using a controllable graph-based mechanism. The generation stage takes in a given plan and generates a clause. The pipeline consists of a graph-based planner followed by text generation. We illustrate the effectiveness of our proposed two-stage approach on a broad set of clause topics in contracts.

Text Simplification for Legal Domain: {I}nsights and Challenges
Aparna Garimella | Abhilasha Sancheti | Vinay Aggarwal | Ananya Ganesh | Niyati Chhaya | Nandakishore Kambhatla
Proceedings of the Natural Legal Language Processing Workshop 2022

Legal documents such as contracts contain complex and domain-specific jargons, long and nested sentences, and often present with several details that may be difficult to understand for laypeople without domain expertise.In this paper, we explore the problem of text simplification (TS) in legal domain.The main challenge to this is the lack of availability of complex-simple parallel datasets for the legal domain.We investigate some of the existing datasets, methods, and metrics in the TS literature for simplifying legal texts, and perform human evaluation to analyze the gaps.{footnote{The model outputs and human ratings are available at {url{https://bit.ly/3U3ddIl}.}We present some of the challenges involved, and outline a few open questions that need to be addressed for future research in this direction.

Entity Extraction in Low Resource Domains with Selective Pre-training of Large Language Models
Aniruddha Mahapatra | Sharmila Reddy Nangi | Aparna Garimella | Anandhavelu N
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Transformer-based language models trained on large natural language corpora have been very useful in downstream entity extraction tasks. However, they often result in poor performances when applied to domains that are different from those they are pretrained on. Continued pretraining using unlabeled data from target domains can help improve the performances of these language models on the downstream tasks. However, using all of the available unlabeled data for pretraining can be time-intensive; also, it can be detrimental to the performance of the downstream tasks, if the unlabeled data is not aligned with the data distribution for the target tasks. Previous works employed external supervision in the form of ontologies for selecting appropriate data samples for pretraining, but external supervision can be quite hard to obtain in low-resource domains. In this paper, we introduce effective ways to select data from unlabeled corpora of target domains for language model pretraining to improve the performances in target entity extraction tasks. Our data selection strategies do not require any external supervision. We conduct extensive experiments for the task of named entity recognition (NER) on seven different domains and show that language models pretrained on target domain unlabeled data obtained using our data selection strategies achieve better performances compared to those using data selection strategies in previous works that use external supervision. We also show that these pretrained language models using our data selection strategies outperform those pretrained on all of the available unlabeled target domain data.

Agent-Specific Deontic Modality Detection in Legal Language
Abhilasha Sancheti | Aparna Garimella | Balaji Vasan Srinivasan | Rachel Rudinger
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Legal documents are typically long and written in legalese, which makes it particularly difficult for laypeople to understand their rights and duties. While natural language understanding technologies can be valuable in supporting such understanding in the legal domain, the limited availability of datasets annotated for deontic modalities in the legal domain, due to the cost of hiring experts and privacy issues, is a bottleneck. To this end, we introduce, LEXDEMOD, a corpus of English contracts annotatedwith deontic modality expressed with respect to a contracting party or agent along with the modal triggers. We benchmark this dataset on two tasks: (i) agent-specific multi-label deontic modality classification, and (ii) agent-specific deontic modality and trigger span detection using Transformer-based (Vaswani et al., 2017) language models. Transfer learning experiments show that the linguistic diversity of modal expressions in LEXDEMOD generalizes reasonably from lease to employment andrental agreements. A small case study indicates that a model trained on LEXDEMOD can detect red flags with high recall. We believe our work offers a new research direction for deontic modality detection in the legal domain.


DRAG: Director-Generator Language Modelling Framework for Non-Parallel Author Stylized Rewriting
Hrituraj Singh | Gaurav Verma | Aparna Garimella | Balaji Vasan Srinivasan
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Author stylized rewriting is the task of rewriting an input text in a particular author’s style. Recent works in this area have leveraged Transformer-based language models in a denoising autoencoder setup to generate author stylized text without relying on a parallel corpus of data. However, these approaches are limited by the lack of explicit control of target attributes and being entirely data-driven. In this paper, we propose a Director-Generator framework to rewrite content in the target author’s style, specifically focusing on certain target attributes. We show that our proposed framework works well even with a limited-sized target author corpus. Our experiments on corpora consisting of relatively small-sized text authored by three distinct authors show significant improvements upon existing works to rewrite input texts in target author’s style. Our quantitative and qualitative analyses further show that our model has better meaning retention and results in more fluent generations.

EmpathBERT: A BERT-based Framework for Demographic-aware Empathy Prediction
Bhanu Prakash Reddy Guda | Aparna Garimella | Niyati Chhaya
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Affect preferences vary with user demographics, and tapping into demographic information provides important cues about the users’ language preferences. In this paper, we utilize the user demographics and propose EmpathBERT, a demographic-aware framework for empathy prediction based on BERT. Through several comparative experiments, we show that EmpathBERT surpasses traditional machine learning and deep learning models, and illustrate the importance of user demographics, for predicting empathy and distress in user responses to stimulative news articles. We also highlight the importance of affect information in the responses by developing affect-aware models to predict user demographic attributes.

He is very intelligent, she is very beautiful? On Mitigating Social Biases in Language Modelling and Generation
Aparna Garimella | Akhash Amarnath | Kiran Kumar | Akash Pramod Yalla | Anandhavelu N | Niyati Chhaya | Balaji Vasan Srinivasan
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Domain-Aware Dependency Parsing for Questions
Aparna Garimella | Laura Chiticariu | Yunyao Li
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

ClauseRec: A Clause Recommendation Framework for AI-aided Contract Authoring
Vinay Aggarwal | Aparna Garimella | Balaji Vasan Srinivasan | Anandhavelu N | Rajiv Jain
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Contracts are a common type of legal document that frequent in several day-to-day business workflows. However, there has been very limited NLP research in processing such documents, and even lesser in generating them. These contracts are made up of clauses, and the unique nature of these clauses calls for specific methods to understand and generate such documents. In this paper, we introduce the task of clause recommendation, as a first step to aid and accelerate the authoring of contract documents. We propose a two-staged pipeline to first predict if a specific clause type is relevant to be added in a contract, and then recommend the top clauses for the given type based on the contract context. We pre-train BERT on an existing library of clauses with two additional tasks and use it for our prediction and recommendation. We experiment with classification methods and similarity-based heuristics for clause relevance prediction, and generation-based methods for clause recommendation, and evaluate the results from various methods on several clause types. We provide analyses on the results, and further outline the limitations and future directions of this line of research.

AUTOSUMM: Automatic Model Creation for Text Summarization
Sharmila Reddy Nangi | Atharv Tyagi | Jay Mundra | Sagnik Mukherjee | Raj Snehal | Niyati Chhaya | Aparna Garimella
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Recent efforts to develop deep learning models for text generation tasks such as extractive and abstractive summarization have resulted in state-of-the-art performances on various datasets. However, obtaining the best model configuration for a given dataset requires an extensive knowledge of deep learning specifics like model architecture, tuning parameters etc., and is often extremely challenging for a non-expert. In this paper, we propose methods to automatically create deep learning models for the tasks of extractive and abstractive text summarization. Based on the recent advances in Automated Machine Learning and the success of large language models such as BERT and GPT-2 in encoding knowledge, we use a combination of Neural Architecture Search (NAS) and Knowledge Distillation (KD) techniques to perform model search and compression using the vast knowledge provided by these language models to develop smaller, customized models for any given dataset. We present extensive empirical results to illustrate the effectiveness of our model creation methods in terms of inference time and model size, while achieving near state-of-the-art performances in terms of accuracy across a range of datasets.


Understanding and Explicitly Measuring Linguistic and Stylistic Properties of Deception via Generation and Translation
Emily Saldanha | Aparna Garimella | Svitlana Volkova
Proceedings of the 13th International Conference on Natural Language Generation

Massive digital disinformation is one of the main risks of modern society. Hundreds of models and linguistic analyses have been done to compare and contrast misleading and credible content online. However, most models do not remove the confounding factor of a topic or narrative when training, so the resulting models learn a clear topical separation for misleading versus credible content. We study the feasibility of using two strategies to disentangle the topic bias from the models to understand and explicitly measure linguistic and stylistic properties of content from misleading versus credible content. First, we develop conditional generative models to create news content that is characteristic of different credibility levels. We perform multi-dimensional evaluation of model performance on mimicking both the style and linguistic differences that distinguish news of different credibility using machine translation metrics and classification models. We show that even though generative models are able to imitate both the style and language of the original content, additional conditioning on both the news category and the topic leads to reduced performance. In a second approach, we perform deception style “transfer” by translating deceptive content into the style of credible content and vice versa. Extending earlier studies, we demonstrate that, when conditioned on a topic, deceptive content is shorter, less readable, more biased, and more subjective than credible content, and transferring the style from deceptive to credible content is more challenging than the opposite direction.

“Judge me by my size (noun), do you?” YodaLib: A Demographic-Aware Humor Generation Framework
Aparna Garimella | Carmen Banea | Nabil Hossain | Rada Mihalcea
Proceedings of the 28th International Conference on Computational Linguistics

The subjective nature of humor makes computerized humor generation a challenging task. We propose an automatic humor generation framework for filling the blanks in Mad Libs® stories, while accounting for the demographic backgrounds of the desired audience. We collect a dataset consisting of such stories, which are filled in and judged by carefully selected workers on Amazon Mechanical Turk. We build upon the BERT platform to predict location-biased word fillings in incomplete sentences, and we fine-tune BERT to classify location-specific humor in a sentence. We leverage these components to produce YodaLib, a fully-automated Mad Libs style humor generation framework, which selects and ranks appropriate candidate words and sentences in order to generate a coherent and funny story tailored to certain demographics. Our experimental results indicate that YodaLib outperforms a previous semi-automated approach proposed for this task, while also surpassing human annotators in both qualitative and quantitative analyses.


Women’s Syntactic Resilience and Men’s Grammatical Luck: Gender-Bias in Part-of-Speech Tagging and Dependency Parsing
Aparna Garimella | Carmen Banea | Dirk Hovy | Rada Mihalcea
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Several linguistic studies have shown the prevalence of various lexical and grammatical patterns in texts authored by a person of a particular gender, but models for part-of-speech tagging and dependency parsing have still not adapted to account for these differences. To address this, we annotate the Wall Street Journal part of the Penn Treebank with the gender information of the articles’ authors, and build taggers and parsers trained on this data that show performance differences in text written by men and women. Further analyses reveal numerous part-of-speech tags and syntactic relations whose prediction performances benefit from the prevalence of a specific gender in the training data. The results underscore the importance of accounting for gendered differences in syntactic tasks, and outline future venues for developing more accurate taggers and parsers. We release our data to the research community.


Demographic-aware word associations
Aparna Garimella | Carmen Banea | Rada Mihalcea
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Variations of word associations across different groups of people can provide insights into people’s psychologies and their world views. To capture these variations, we introduce the task of demographic-aware word associations. We build a new gold standard dataset consisting of word association responses for approximately 300 stimulus words, collected from more than 800 respondents of different gender (male/female) and from different locations (India/United States), and show that there are significant variations in the word associations made by these groups. We also introduce a new demographic-aware word association model based on a neural net skip-gram architecture, and show how computational methods for measuring word associations that specifically account for writer demographics can outperform generic methods that are agnostic to such information.


pdf bib
Zooming in on Gender Differences in Social Media
Aparna Garimella | Rada Mihalcea
Proceedings of the Workshop on Computational Modeling of People’s Opinions, Personality, and Emotions in Social Media (PEOPLES)

Men are from Mars and women are from Venus - or so the genre of relationship literature would have us believe. But there is some truth in this idea, and researchers in fields as diverse as psychology, sociology, and linguistics have explored ways to better understand the differences between genders. In this paper, we take another look at the problem of gender discrimination and attempt to move beyond the typical surface-level text classification approach, by (1) identifying semantic and psycholinguistic word classes that reflect systematic differences between men and women and (2) finding differences between genders in the ways they use the same words. We describe several experiments and report results on a large collection of blogs authored by men and women.

Identifying Cross-Cultural Differences in Word Usage
Aparna Garimella | Rada Mihalcea | James Pennebaker
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Personal writings have inspired researchers in the fields of linguistics and psychology to study the relationship between language and culture to better understand the psychology of people across different cultures. In this paper, we explore this relation by developing cross-cultural word models to identify words with cultural bias – i.e., words that are used in significantly different ways by speakers from different cultures. Focusing specifically on two cultures: United States and Australia, we identify a set of words with significant usage differences, and further investigate these words through feature analysis and topic modeling, shedding light on the attributes of language that contribute to these differences.