Advait Deshmukh
2026
A Structured Clustering Approach for Inducing Media Narratives
Rohan Das | Advait Deshmukh | Alexandria Leto | Zohar Naaman | I-Ta Lee | Maria Leonor Pacheco
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Media narratives wield tremendous power in shaping public opinion, yet computational approaches struggle to capture the nuanced storytelling structures that communication theory emphasizes as central to how meaning is constructed. Existing approaches either miss subtle narrative patterns through coarse-grained analysis or require domain-specific taxonomies that limit scalability. To bridge this gap, we present a framework for inducing rich narrative schemas by jointly modeling events and characters via structured clustering. Our approach produces explainable narrative schemas that align with established framing theory while scaling to large corpora without exhaustive manual annotation.
2025
All Entities are Not Created Equal: Examining the Long Tail for Ultra-Fine Entity Typing
Advait Deshmukh | Ashwin Umadi | Dananjay Srinivas | Maria Leonor Pacheco
Proceedings of the 14th Joint Conference on Lexical and Computational Semantics (*SEM 2025)
Due to their capacity to acquire world knowledge from large corpora, pre-trained language models (PLMs) are extensively used in ultra-fine entity typing tasks where the space of labels is extremely large. In this work, we explore the limitations of the knowledge acquired by PLMs by proposing a novel heuristic to approximate the pre-training distribution of entities when the pre-training data is unknown. Then, we systematically demonstrate that entity-typing approaches that rely solely on the parametric knowledge of PLMs struggle significantly with entities at the long tail of the pre-training distribution, and that knowledge-infused approaches can account for some of these shortcomings. Our findings suggest that we need to go beyond PLMs to produce solutions that perform well for infrequent entities.