Madhavan Seshadri
2026
Reasoners or Translators? Contamination-aware Evaluation and Neuro-Symbolic Robustness on Tax Law
Parisa Kordjamshidi | Samer Aslan | Madhavan Seshadri | Leslie Barrett | Enrico Santus
Proceedings of the First Workshop on Structured Understanding, Retrieval, and Generation in the LLM Era (SURGeLLM 2026)
Recent advances in large language models (LLMs) have significantly enhanced automated legal reasoning. Yet, it remains unclear whether their performance reflects genuine legal reasoning ability or artifacts of data contamination. We present a comprehensive empirical study of tax law reasoning approaches and implement a contamination detection protocol to rigorously assess LLM reliability. We show that performance can be inflated by contamination. Building on this analysis, we conduct a systematic evaluation, comparing monolithic LLMs with hybrid systems that translate statutory text into formal representations and delegate inference to symbolic solvers. We build a novel test suite designed to probe generalization to unseen documents via case and rule variations. Our findings indicate that legal reasoning is inherently compositional and that neuro-symbolic frameworks offer a more reliable and robust foundation for legal AI, as well as improved generalization to unobserved situations.
2022
A Lightweight Yet Robust Approach to Textual Anomaly Detection
Leslie Barrett | Robert Kingan | Alexandra Ortan | Madhavan Seshadri
Proceedings of the Third Workshop on Threat, Aggression and Cyberbullying (TRAC 2022)
Highly imbalanced textual datasets continue to pose a challenge for supervised learning models. However, viewing such imbalanced text data as an anomaly detection (AD) problem has advantages for certain tasks, such as detecting hate speech or inappropriate and/or offensive language in large social media feeds. There, the unwanted content tends to be both rare and non-uniform with respect to its thematic character, and better fits the definition of an anomaly than a class. Several recent approaches to textual AD use transformer models, achieving good results but with trade-offs in pre-training and inflexibility with respect to new domains. In this paper, we compare two linear models within the NMF family, which also have a recent history in textual AD. We introduce a new approach based on an alternative regularization of the NMF objective. Our results surpass other linear AD models and are on par with deep models, performing comparably well even at very small outlier concentrations.