Guy Rotman
2026
Distilling Examples into Task Instructions: Enhanced In-Context Learning for Real-World B2B Conversations
Guy Rotman | Adi Kopilov | Danit Berger Zalmanson | Omri Allouche
Findings of the Association for Computational Linguistics: ACL 2026
Guy Rotman | Adi Kopilov | Danit Berger Zalmanson | Omri Allouche
Findings of the Association for Computational Linguistics: ACL 2026
In-context learning (ICL) is the standard method for low-resource classification, yet its efficacy in specialized domains remains largely unexplored. We address the challenge of classifying semantically complex, multi-party B2B conversations, where traditional ICL encounters significant limitations, especially as context length increases due to the concatenation of multiple few-shot examples. We introduce the Call Playbook dataset, featuring five classification tasks derived from real-world B2B conversations targeting core sales concepts. To bridge the gap between performance and practical utility, we propose novel knowledge extraction methods that distill verbose examples into compact, interpretable representations of structured classification criteria and precise task descriptions. Our approach achieves a 99% reduction in token usage and improves macro-averaged AUC by up to 7% over traditional ICL. Notably, it remains robust as context grows, unlike advanced token compression baselines which degrade by over 9 F1 points. Importantly, our framework enables direct refinement of classification logic, addressing critical needs for transparency, efficiency, and user interaction in real-world NLP applications.
2022
Multi-task Active Learning for Pre-trained Transformer-based Models
Guy Rotman | Roi Reichart
Transactions of the Association for Computational Linguistics, Volume 10
Guy Rotman | Roi Reichart
Transactions of the Association for Computational Linguistics, Volume 10
Multi-task learning, in which several tasks are jointly learned by a single model, allows NLP models to share information from multiple annotations and may facilitate better predictions when the tasks are inter-related. This technique, however, requires annotating the same text with multiple annotation schemes, which may be costly and laborious. Active learning (AL) has been demonstrated to optimize annotation processes by iteratively selecting unlabeled examples whose annotation is most valuable for the NLP model. Yet, multi-task active learning (MT-AL) has not been applied to state-of-the-art pre-trained Transformer-based NLP models. This paper aims to close this gap. We explore various multi-task selection criteria in three realistic multi-task scenarios, reflecting different relations between the participating tasks, and demonstrate the effectiveness of multi-task compared to single-task selection. Our results suggest that MT-AL can be effectively used in order to minimize annotation efforts for multi-task NLP models.1
Designing an Automatic Agent for Repeated Language–based Persuasion Games
Maya Raifer | Guy Rotman | Reut Apel | Moshe Tennenholtz | Roi Reichart
Transactions of the Association for Computational Linguistics, Volume 10
Maya Raifer | Guy Rotman | Reut Apel | Moshe Tennenholtz | Roi Reichart
Transactions of the Association for Computational Linguistics, Volume 10
Persuasion games are fundamental in economics and AI research and serve as the basis for important applications. However, work on this setup assumes communication with stylized messages that do not consist of rich human language. In this paper we consider a repeated sender (expert) – receiver (decision maker) game, where the sender is fully informed about the state of the world and aims to persuade the receiver to accept a deal by sending one of several possible natural language reviews. We design an automatic expert that plays this repeated game, aiming to achieve the maximal payoff. Our expert is implemented within the Monte Carlo Tree Search (MCTS) algorithm, with deep learning models that exploit behavioral and linguistic signals in order to predict the next action of the decision maker, and the future payoff of the expert given the state of the game and a candidate review. We demonstrate the superiority of our expert over strong baselines and its adaptability to different decision makers and potential proposed deals.1
2021
Model Compression for Domain Adaptation through Causal Effect Estimation
Guy Rotman | Amir Feder | Roi Reichart
Transactions of the Association for Computational Linguistics, Volume 9
Guy Rotman | Amir Feder | Roi Reichart
Transactions of the Association for Computational Linguistics, Volume 9
Recent improvements in the predictive quality of natural language processing systems are often dependent on a substantial increase in the number of model parameters. This has led to various attempts of compressing such models, but existing methods have not considered the differences in the predictive power of various model components or in the generalizability of the compressed models. To understand the connection between model compression and out-of-distribution generalization, we define the task of compressing language representation models such that they perform best in a domain adaptation setting. We choose to address this problem from a causal perspective, attempting to estimate the average treatment effect (ATE) of a model component, such as a single layer, on the model’s predictions. Our proposed ATE-guided Model Compression scheme (AMoC), generates many model candidates, differing by the model components that were removed. Then, we select the best candidate through a stepwise regression model that utilizes the ATE to predict the expected performance on the target domain. AMoC outperforms strong baselines on dozens of domain pairs across three text classification and sequence tagging tasks.1
Proceedings of the Second Workshop on Domain Adaptation for NLP
Eyal Ben-David | Shay Cohen | Ryan McDonald | Barbara Plank | Roi Reichart | Guy Rotman | Yftah Ziser
Proceedings of the Second Workshop on Domain Adaptation for NLP
Eyal Ben-David | Shay Cohen | Ryan McDonald | Barbara Plank | Roi Reichart | Guy Rotman | Yftah Ziser
Proceedings of the Second Workshop on Domain Adaptation for NLP
2019
Deep Contextualized Self-training for Low Resource Dependency Parsing
Guy Rotman | Roi Reichart
Transactions of the Association for Computational Linguistics, Volume 7
Guy Rotman | Roi Reichart
Transactions of the Association for Computational Linguistics, Volume 7
Neural dependency parsing has proven very effective, achieving state-of-the-art results on numerous domains and languages. Unfortunately, it requires large amounts of labeled data, which is costly and laborious to create. In this paper we propose a self-training algorithm that alleviates this annotation bottleneck by training a parser on its own output. Our Deep Contextualized Self-training (DCST) algorithm utilizes representation models trained on sequence labeling tasks that are derived from the parser’s output when applied to unlabeled data, and integrates these models with the base parser through a gating mechanism. We conduct experiments across multiple languages, both in low resource in-domain and in cross-domain setups, and demonstrate that DCST substantially outperforms traditional self-training as well as recent semi-supervised training methods.1
2018
Bridging Languages through Images with Deep Partial Canonical Correlation Analysis
Guy Rotman | Ivan Vulić | Roi Reichart
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Guy Rotman | Ivan Vulić | Roi Reichart
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
We present a deep neural network that leverages images to improve bilingual text embeddings. Relying on bilingual image tags and descriptions, our approach conditions text embedding induction on the shared visual information for both languages, producing highly correlated bilingual embeddings. In particular, we propose a novel model based on Partial Canonical Correlation Analysis (PCCA). While the original PCCA finds linear projections of two views in order to maximize their canonical correlation conditioned on a shared third variable, we introduce a non-linear Deep PCCA (DPCCA) model, and develop a new stochastic iterative algorithm for its optimization. We evaluate PCCA and DPCCA on multilingual word similarity and cross-lingual image description retrieval. Our models outperform a large variety of previous methods, despite not having access to any visual signal during test time inference.