Elena Eneva


2025

Not Your Typical Government Tipline: LLM-Assisted Routing of Environmental Protection Agency Citizen Tips
Sharanya Majumder | Zehua Li | Derek Ouyang | Kit T Rodolfa | Elena Eneva | Julian Nyarko | Daniel E. Ho
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Regulatory agencies often operate with limited resources and rely on tips from the public to identify potential violations. However, processing these tips at scale presents significant operational challenges, as agencies must correctly identify and route relevant tips to the appropriate enforcement divisions. Through a case study, we demonstrate how advances in large language models can be used to support overburdened agencies with limited capacity. In partnership with the U.S. Environmental Protection Agency, we leverage previously unstudied citizen tips data from their “Report a Violation” system to develop an LLM-assisted pipeline for tip routing. Our approach filters out 80.5% of irrelevant tips and increases overall routing accuracy from 31.8% to 82.4% compared to the current routing system. At a time of increased focus on government efficiency, our approach provides a constructive path forward by using technology to empower civil servants.
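The routing pipeline described in the abstract lends itself to a simple two-stage structure: first filter out irrelevant tips, then classify the remainder into an enforcement division. The sketch below illustrates that shape only; it assumes an OpenAI-style chat-completion client, and the model name, prompt wording, and division labels are illustrative placeholders rather than the paper's actual configuration.

```python
# Minimal sketch of a two-stage LLM-assisted routing pipeline (not the paper's code).
# Stage 1 filters out irrelevant tips; Stage 2 routes the remainder to a division.
# Model name, prompts, and division labels below are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

DIVISIONS = ["Air", "Water", "Waste", "Pesticides", "Other"]  # hypothetical labels

def ask(prompt: str) -> str:
    """Send a single-turn prompt and return the model's text reply."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return (resp.choices[0].message.content or "").strip()

def route_tip(tip_text: str) -> str | None:
    """Return a division name for a relevant tip, or None if the tip is filtered out."""
    # Stage 1: relevance filter
    relevant = ask(
        "Does the following citizen tip describe a potential environmental "
        f"violation? Answer yes or no.\n\nTip: {tip_text}"
    )
    if not relevant.lower().startswith("yes"):
        return None
    # Stage 2: route to an enforcement division
    division = ask(
        "Classify this environmental violation tip into exactly one of these "
        f"divisions: {', '.join(DIVISIONS)}.\n\nTip: {tip_text}"
    )
    return division if division in DIVISIONS else "Other"

if __name__ == "__main__":
    print(route_tip("Oily runoff is draining from a factory into the creek behind my house."))
```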

2019

Context-specific Language Modeling for Human Trafficking Detection from Online Advertisements
Saeideh Shahrokh Esfahani | Michael J. Cafarella | Maziyar Baran Pouyan | Gregory DeAngelo | Elena Eneva | Andy E. Fano
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Human trafficking is a worldwide crisis. Traffickers exploit their victims by anonymously offering sexual services through online advertisements. These ads often contain clues that law enforcement can use to separate potential trafficking cases from voluntary sex advertisements. The problem is that the sheer volume of ads is overwhelming for manual processing. Ideally, a centralized semi-automated tool could assist law enforcement agencies with this task. Here, we present an approach using natural language processing to identify trafficking ads on these websites. We propose a classifier that integrates multiple text feature sets, including the publicly available pre-trained language model Bidirectional Encoder Representations from Transformers (BERT). In this paper, we demonstrate that a classifier using this composite feature set has significantly better performance compared to any single feature set alone.
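A composite feature set of the kind described in the abstract can be assembled by concatenating pretrained BERT embeddings with conventional sparse text features before training a classifier. The sketch below illustrates this idea under stated assumptions: sentence-transformers supplies the BERT embeddings and scikit-learn supplies TF-IDF and logistic regression; the specific models, feature choices, and classifier are illustrative, not the paper's exact pipeline.

```python
# Minimal sketch of a composite-feature classifier (not the paper's exact setup):
# BERT sentence embeddings concatenated with TF-IDF features, fed to logistic regression.
# Library and model choices (sentence-transformers, all-MiniLM-L6-v2, scikit-learn)
# are assumptions for illustration.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def build_features(train_texts, test_texts):
    """Concatenate dense BERT embeddings with (densified) sparse TF-IDF features."""
    bert = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative pretrained model
    tfidf = TfidfVectorizer(max_features=5000)

    X_bert_train = bert.encode(train_texts)
    X_bert_test = bert.encode(test_texts)
    X_tfidf_train = tfidf.fit_transform(train_texts).toarray()
    X_tfidf_test = tfidf.transform(test_texts).toarray()

    return (
        np.hstack([X_bert_train, X_tfidf_train]),
        np.hstack([X_bert_test, X_tfidf_test]),
    )

# Toy usage with placeholder ad texts; real labels would mark suspected trafficking vs. not.
train_texts = ["example ad text one", "example ad text two"]
train_labels = [1, 0]
test_texts = ["example ad text three"]

X_train, X_test = build_features(train_texts, test_texts)
clf = LogisticRegression(max_iter=1000).fit(X_train, train_labels)
print(clf.predict(X_test))
```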

2001

Learning Within-Sentence Semantic Coherence
Elena Eneva | Rose Hoberman | Lucian Lita
Proceedings of the 2001 Conference on Empirical Methods in Natural Language Processing