Julian Nyarko
2025
Not Your Typical Government Tipline: LLM-Assisted Routing of Environmental Protection Agency Citizen Tips
Sharanya Majumder | Zehua Li | Derek Ouyang | Kit T Rodolfa | Elena Eneva | Julian Nyarko | Daniel E. Ho
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Regulatory agencies often operate with limited resources and rely on tips from the public to identify potential violations. However, processing these tips at scale presents significant operational challenges, as agencies must correctly identify and route relevant tips to the appropriate enforcement divisions. Through a case study, we demonstrate how advances in large language models can be utilized to support overburdened agencies with limited capacities. In partnership with the U.S. Environmental Protection Agency, we leverage previously unstudied citizen tips data from their “Report a Violation” system to develop an LLM-assisted pipeline for tip routing. Our approach filters out 80.5% of irrelevant tips and increases overall routing accuracy from 31.8% to 82.4% compared to the current routing system. At a time of increased focus on government efficiencies, our approach provides a constructive path forward by using technology to empower civil servants.
Identifying Emerging Concepts in Large Corpora
Sibo Ma | Julian Nyarko
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
We introduce a new method to identify emerging concepts in large text corpora. By analyzing changes in the heatmaps of the underlying embedding space, we are able to detect these concepts with high accuracy shortly after they originate, in turn outperforming common alternatives. We further demonstrate the utility of our approach by analyzing speeches in the U.S. Senate from 1941 to 2015. Our results suggest that the minority party is more active in introducing new concepts into the Senate discourse. We also identify specific concepts that closely correlate with the Senators’ racial, ethnic, and gender identities. An implementation of our method is publicly available.
Co-authors
- Elena Eneva 1
- Daniel E. Ho 1
- Zehua Li 1
- Sibo Ma 1
- Sharanya Majumder 1