Bamdev Mishra
2026
SAJA: A Simple Approach to Judge Alignment for LLM-as-a-Judge
Sneha Kola | Pankaj Kumar Sharma | Soumyadeep Dey | Bamdev Mishra | Mayur Datar
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
LLM-as-a-Judge systems are increasingly used to evaluate text at scale, yet production deployment demands low latency, minimal cost, and compatibility with closed-source APIs. Current approaches fall short in different ways: some require many LLM calls and per-dataset prompt tuning, others depend on logit access unavailable in commercial APIs, and yet others demand multiple rounds of LLM interaction for iterative feature discovery. We present **SAJA** (**S**imple **A**pproach to **J**udge **A**lignment), built on the principle that task-specific alignment should reside in a lightweight calibration head, not in elaborate prompts or model internals. SAJA makes exactly one LLM call per item using a fixed structured rubric prompt, extracts a multi-dimensional feature vector, and maps it to a human-aligned score via a calibration head trained on a small number of human labels. No iterative prompt search, no logit access, and no multi-round LLM interaction are needed. Yet SAJA matches far more complex systems across four evaluation paradigms: 86% F1 on MT-Bench pairwise preference (vs. 78% uncalibrated), competitive performance on five classification benchmarks with a single call, and +5.71% F1 over prompt-optimized baselines on proprietary data. Ablations confirm that multi-dimensional rubric features outperform one-dimensional calibration (SummEval 𝜌 improves from 0.60 to 0.74) and that coarse rubric outputs recover the same human alignment as full logit distributions (𝜌 = 0.36 vs. 0.37), establishing that logit access is unnecessary for calibrated judge alignment. Moreover, SAJA is model-agnostic: a 9B open-source model with SAJA (𝜌=0.70) surpasses raw GPT-4.1 (𝜌=0.60). Its single-call design yields up to 4.8× cost savings over per-question approaches.
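The calibration step described in this abstract can be illustrated with a minimal sketch. It assumes the calibration head is a plain logistic regression over three rubric dimensions; the abstract does not specify the head's form, and the feature names, toy scores, and labels below are hypothetical, not taken from the paper.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_calibration_head(features, labels, lr=0.5, epochs=2000):
    """Fit a logistic-regression head with batch gradient descent.

    Assumption: logistic regression stands in for the calibration head;
    the abstract only says the head is lightweight.
    """
    dim = len(features[0])
    weights = [0.0] * dim
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = pred - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def calibrated_score(x, weights, bias):
    """Map one rubric feature vector to a score in (0, 1)."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Hypothetical rubric outputs from a single LLM call per item,
# rescaled to [0, 1]: (accuracy, fluency, relevance).
train_features = [
    [0.9, 0.8, 0.9],
    [0.8, 0.9, 0.7],
    [0.2, 0.9, 0.3],
    [0.1, 0.4, 0.2],
    [0.7, 0.6, 0.8],
    [0.3, 0.7, 0.1],
]
# Hypothetical human labels: 1 = acceptable, 0 = not acceptable.
train_labels = [1, 1, 0, 0, 1, 0]

weights, bias = fit_calibration_head(train_features, train_labels)
good = calibrated_score([0.85, 0.7, 0.9], weights, bias)
bad = calibrated_score([0.15, 0.8, 0.2], weights, bias)
```

In this toy data, fluency is high for both classes, so the head has to rely on the other two dimensions to separate them. That is the kind of signal a one-dimensional score cannot provide.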
2022
Generalised Spherical Text Embedding
Souvik Banerjee | Bamdev Mishra | Pratik Jawanpuria | Manish Shrivastava
Proceedings of the 19th International Conference on Natural Language Processing (ICON)
This paper aims to provide an unsupervised modelling approach that allows for a more flexible representation of text embeddings. It jointly encodes the words and the paragraphs as individual matrices of arbitrary column dimension with unit Frobenius norm. The representation is also linguistically motivated with the introduction of a metric for the ambient space in which we train the embeddings that calculates the similarity between matrices with unequal numbers of columns. Thus, the proposed modelling and the novel similarity metric exploit the matrix structure of embeddings. We then go on to show that the same matrices can be reshaped into vectors of unit norm, which transforms our problem into an optimization problem on a spherical manifold and simplifies optimization. Given the total number of matrices we are dealing with, which is equal to the vocabulary size plus the total number of documents in the corpus, this makes the training of an otherwise expensive non-linear model extremely efficient. We also quantitatively verify the quality of our text embeddings by showing that they demonstrate improved results in document classification, document clustering and semantic textual similarity benchmark tests.
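The reshaping step rests on a simple fact: the Frobenius norm of a matrix equals the Euclidean norm of its flattened entries, so a unit-Frobenius-norm matrix becomes a point on the unit sphere. A minimal sketch follows; the 3x2 example matrix and the helper names are hypothetical.

```python
import math


def frobenius_norm(matrix):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(v * v for row in matrix for v in row))


def normalize_to_unit_frobenius(matrix):
    norm = frobenius_norm(matrix)
    return [[v / norm for v in row] for row in matrix]


def flatten(matrix):
    """Reshape a matrix into a vector, row by row."""
    return [v for row in matrix for v in row]


# Hypothetical 3x2 embedding matrix for one word.
word_matrix = normalize_to_unit_frobenius([[1.0, 2.0], [3.0, 4.0], [0.5, -1.0]])

# The flattened vector has unit Euclidean norm, i.e. it lies on the sphere.
vector = flatten(word_matrix)
vector_norm = math.sqrt(sum(v * v for v in vector))
```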
2020
Learning Geometric Word Meta-Embeddings
Pratik Jawanpuria | Satya Dev N T V | Anoop Kunchukuttan | Bamdev Mishra
Proceedings of the 5th Workshop on Representation Learning for NLP
We propose a geometric framework for learning meta-embeddings of words from different embedding sources. Our framework transforms the embeddings into a common latent space, where, for example, simple averaging or concatenation of different embeddings (of a given word) is more amenable. The proposed latent space arises from two particular geometric transformations: source-embedding-specific orthogonal rotations and a common Mahalanobis metric scaling. Empirical results on several word similarity and word analogy benchmarks illustrate the efficacy of the proposed framework.
A Simple Approach to Learning Unsupervised Multilingual Embeddings
Pratik Jawanpuria | Mayank Meghwanshi | Bamdev Mishra
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Recent progress on unsupervised cross-lingual embeddings in the bilingual setting has given the impetus to learning a shared embedding space for several languages. A popular framework for the latter problem is to solve the following two sub-problems jointly: 1) learning unsupervised word alignment between several language pairs, and 2) learning how to map the monolingual embeddings of every language to a shared multilingual space. In contrast, we propose a simple approach by decoupling the above two sub-problems and solving them separately, one after another, using existing techniques. We show that this proposed approach obtains surprisingly good performance in tasks such as bilingual lexicon induction, cross-lingual word similarity, multilingual document classification, and multilingual dependency parsing. When distant languages are involved, the proposed approach shows robust behavior and outperforms existing unsupervised multilingual word embedding approaches.
Geometry-aware domain adaptation for unsupervised alignment of word embeddings
Pratik Jawanpuria | Mayank Meghwanshi | Bamdev Mishra
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
We propose a novel manifold-based geometric approach for learning unsupervised alignment of word embeddings between the source and the target languages. Our approach formulates the alignment learning problem as a domain adaptation problem over the manifold of doubly stochastic matrices. This viewpoint arises from the aim to align the second-order information of the two language spaces. The rich geometry of the doubly stochastic manifold allows us to employ an efficient Riemannian conjugate gradient algorithm for the proposed formulation. Empirically, the proposed approach outperforms a state-of-the-art optimal-transport-based approach on the bilingual lexicon induction task across several language pairs. The performance improvement is more significant for distant language pairs.
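For readers unfamiliar with doubly stochastic matrices (non-negative entries, every row and column summing to one), the sketch below builds one with Sinkhorn-Knopp normalization. This is a different, simpler technique than the Riemannian conjugate gradient algorithm the abstract refers to; it only shows what a point on that manifold looks like. The input matrix is hypothetical.

```python
def sinkhorn(matrix, iterations=200):
    """Alternately normalize rows and columns of a positive square matrix.

    Sinkhorn-Knopp iteration, used here only to produce an (approximately)
    doubly stochastic matrix; it is not the paper's optimization method.
    """
    n = len(matrix)
    m = [row[:] for row in matrix]
    for _ in range(iterations):
        # Make every row sum to one.
        m = [[v / sum(row) for v in row] for row in m]
        # Make every column sum to one.
        col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
        m = [[m[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return m


# Hypothetical positive similarity scores between 3 source and 3 target words.
scores = [
    [4.0, 1.0, 0.5],
    [0.8, 3.0, 1.2],
    [0.3, 0.9, 5.0],
]
alignment = sinkhorn(scores)
row_sums = [sum(row) for row in alignment]
col_sums = [sum(alignment[i][j] for i in range(3)) for j in range(3)]
```

Each row of `alignment` can be read as a soft assignment of one source word over the target words, with the column constraint preventing every source word from mapping to the same target.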
2019
Learning Multilingual Word Embeddings in Latent Metric Space: A Geometric Approach
Pratik Jawanpuria | Arjun Balgovind | Anoop Kunchukuttan | Bamdev Mishra
Transactions of the Association for Computational Linguistics, Volume 7
We propose a novel geometric approach for learning bilingual mappings given monolingual embeddings and a bilingual dictionary. Our approach decouples the source-to-target language transformation into (a) language-specific rotations on the original embeddings to align them in a common, latent space, and (b) a language-independent similarity metric in this common space to better model the similarity between the embeddings. Overall, we pose the bilingual mapping problem as a classification problem on smooth Riemannian manifolds. Empirically, our approach outperforms previous approaches on the bilingual lexicon induction and cross-lingual word similarity tasks. We next generalize our framework to represent multiple languages in a common latent space. Language-specific rotations for all the languages and a common similarity metric in the latent space are learned jointly from bilingual dictionaries for multiple language pairs. We illustrate the effectiveness of joint learning for multiple languages in an indirect word translation setting.
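The two-part decomposition in this abstract, language-specific rotations followed by a shared similarity metric, can be sketched in two dimensions. The bilinear scoring form, the rotation angles, and the metric matrix below are illustrative assumptions; the abstract does not give the exact formula.

```python
import math


def rotation(theta):
    """2x2 rotation matrix (orthogonal) for angle theta in radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def latent_similarity(x, y, rot_src, rot_tgt, metric):
    """Rotate both embeddings into the latent space, then score them
    with a shared symmetric positive-definite metric (assumed form)."""
    zx = matvec(rot_src, x)
    zy = matvec(rot_tgt, y)
    return dot(zx, matvec(metric, zy))


# Hypothetical setup: the target space is the source space rotated by 90 degrees.
rot_src = rotation(0.0)             # source embeddings are left as they are
rot_tgt = rotation(-math.pi / 2.0)  # undo the 90-degree offset of the target space
metric = [[2.0, 0.0], [0.0, 1.0]]   # shared, language-independent metric

source_word = [1.0, 0.0]
translation = [0.0, 1.0]            # same direction after rotating back
unrelated = [1.0, 0.0]              # orthogonal after rotating back

sim_translation = latent_similarity(source_word, translation, rot_src, rot_tgt, metric)
sim_unrelated = latent_similarity(source_word, unrelated, rot_src, rot_tgt, metric)
```

Because the metric is shared across languages, adding a further language in this sketch would require only one more rotation matrix.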