Ofri Hefetz


2025

pdf bib
Easy as PIE? Identifying Multi-Word Expressions with LLMs
Kai Golan Hashiloni | Ofri Hefetz | Kfir Bar
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

We investigate the identification of idiomatic expressions—a semantically non-compositional subclass of multiword expressions (MWEs)—in running text using large language models (LLMs) without any fine-tuning. Instead, we adopt a prompt-based approach and evaluate a range of prompting strategies, including zero-shot, few-shot, and chain-of-thought variants, across multiple languages, datasets, and model types. Our experiments show that, with well-crafted prompts, LLMs can perform competitively with supervised models trained on annotated data. These findings highlight the potential of prompt-based LLMs as a flexible and effective alternative for idiomatic expression identification.

pdf bib
Not Just a Piece of Cake: Cross-Lingual Fine-Tuning for Idiom Identification
Ofri Hefetz | Kai Golan Hashiloni | Alon Mannor | Kfir Bar
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics

We investigate cross-lingual fine-tuning for idiomatic expression identification, addressing the limited availability of annotated data in many languages. We evaluate encoder and generative decoder models to examine their ability to generalize idiom identification across languages. Additionally, we conduct an explainability study using linear probing and LogitLens to analyze how idiomatic meaning is represented across model layers. Results show consistent cross-lingual transfer, with English emerging as a strong source language. All code and models are released to support future research.