Asnat Greenstein-Messica


2025

JSON Whisperer: Efficient JSON Editing with LLMs
Sarel Duanis | Asnat Greenstein-Messica | Eliya Habba
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track

Large language models (LLMs) can modify JSON documents through natural language commands, but current approaches regenerate entire structures for each edit, resulting in computational inefficiency. We present JSON Whisperer, a framework that enables LLMs to generate RFC 6902 diff patches, expressing only the necessary modifications, rather than complete documents. We identify two key challenges in patch-based editing: (1) LLMs often miss related updates when generating isolated patches, and (2) array manipulations require tracking index shifts across operations, which LLMs handle poorly. To address these issues, we introduce EASE (Explicitly Addressed Sequence Encoding), which transforms arrays into dictionaries with stable keys, eliminating index arithmetic complexities. Our evaluation shows that patch generation with EASE reduces token usage by 31% while maintaining edit quality within 5% of full regeneration, with particular gains for complex instructions and list manipulations.
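The idea of replacing array indices with stable keys can be illustrated with a minimal sketch. This is not the authors' implementation: the key scheme (`k0`, `k1`, ...), the `__ease__` marker, and the helper names `ease_encode`, `ease_decode`, and `apply_patch` are all hypothetical, and the patch applier handles only a small subset of RFC 6902 (`add`, `replace`, `remove`, with no JSON Pointer escaping).

```python
import copy

MARKER = "__ease__"  # hypothetical flag marking a dict that stands in for a list


def ease_encode(node):
    """Recursively replace every list with a dict keyed by stable string keys."""
    if isinstance(node, list):
        encoded = {MARKER: True}
        for i, item in enumerate(node):
            encoded[f"k{i}"] = ease_encode(item)  # assumed key scheme: k0, k1, ...
        return encoded
    if isinstance(node, dict):
        return {key: ease_encode(value) for key, value in node.items()}
    return node


def ease_decode(node):
    """Invert ease_encode: marked dicts become lists again, in insertion order."""
    if isinstance(node, dict):
        if node.get(MARKER) is True:
            return [ease_decode(v) for k, v in node.items() if k != MARKER]
        return {key: ease_decode(value) for key, value in node.items()}
    return node


def apply_patch(doc, patch):
    """Apply a tiny subset of RFC 6902 operations to a dict-only document."""
    doc = copy.deepcopy(doc)
    for op in patch:
        *parents, last = op["path"].lstrip("/").split("/")
        target = doc
        for key in parents:
            target = target[key]
        if op["op"] == "remove":
            del target[last]
        elif op["op"] in ("add", "replace"):
            target[last] = op["value"]
        else:
            raise ValueError(f"unsupported op: {op['op']}")
    return doc


original = {"items": ["a", "b", "c", "d"]}

# Both removals name their target by key, so the second path does not need
# to account for the shift caused by the first removal.
patch = [
    {"op": "remove", "path": "/items/k1"},
    {"op": "remove", "path": "/items/k3"},
]

edited = ease_decode(apply_patch(ease_encode(original), patch))
print(edited)
```

With plain array indices, the second operation would have to target `/items/2` rather than `/items/3`, because removing index 1 shifts the later elements down by one. Stable keys remove that bookkeeping from the patch.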

2024

Visual Editing with LLM-based Tool Chaining: An Efficient Distillation Approach for Real-Time Applications
Oren Sultan | Alexander Khasin | Guy Shiran | Asnat Greenstein-Messica | Dafna Shahaf
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track

We present a practical distillation approach to fine-tune LLMs for invoking tools in real-time applications. We focus on visual editing tasks; specifically, we modify images and videos by interpreting user stylistic requests, specified in natural language (“golden hour”), using an LLM to select the appropriate tools and their parameters to achieve the desired visual effect. We found that proprietary LLMs such as GPT-3.5-Turbo show potential in this task, but their high cost and latency make them unsuitable for real-time applications. In our approach, we fine-tune a (smaller) student LLM with guidance from a (larger) teacher LLM and behavioral signals. We introduce offline metrics to evaluate student LLMs. Both online and offline experiments show that our student models manage to match the performance of our teacher model (GPT-3.5-Turbo), significantly reducing costs and latency. Lastly, we show that fine-tuning was improved by 25% in low-data regimes using augmentation.