Rishi Ravikumar

2026

ReciFine: Finely Annotated Recipe Dataset for Controllable Recipe Generation
Nuhu Ibrahim | Rishi Ravikumar | Robert Stevens | Riza Batista-Navarro
Findings of the Association for Computational Linguistics: EACL 2026

We introduce ReciFine, the largest human-evaluated, finely annotated recipe dataset to date, designed to advance controllable and trustworthy recipe generation. Existing resources, such as RecipeNLG, extract food items only from ingredient lists, overlooking entities expressed in instructions, including tools, chef actions, food and tool states, and durations, which are crucial for realistic and context-aware generation. To address this limitation, we extend RecipeNLG with finely annotated extraction of over 97 million entities across ten entity types from 2.2 million recipes. We are the first to explore recipe generation with explicit control over multiple entity types, enabling models to generate recipes conditioned not only on ingredients but also on tools, chef actions, cooking durations, and other contextual factors. Large language models fine-tuned or few-shot prompted with ReciFine extractions consistently outperform those trained on ingredient-list data alone across both automatic and human evaluations. ReciFine establishes a foundation for factual, coherent, structured, controllable recipe generation, and we release a human-annotated benchmark to support future evaluation and model development.

When Tasks Share Structure: A Comparative Study of Training Strategies for Generative Event Extraction
Rishi Ravikumar | Riza Batista-Navarro
Proceedings of the 9th Workshop on Event Extraction and Understanding: Challenges and Applications (EEUCA 2026)

Event extraction requires performing two interdependent subtasks: event detection and event argument extraction. While prior work has explored pipelined and joint training approaches, the question of how best to coordinate training across these subtasks in generative LLM-based systems remains open. We present a systematic study comparing three training paradigms: disjoint, fully shared and hybrid weight allocation, instantiated as eight concrete strategies and evaluated on ACE2005 and RichERE across multiple instruction-tuned LLMs. Our findings show that training strategy has a consistent and meaningful effect on extraction accuracy, and that a clear best-performing strategy emerges across models and benchmarks. We believe that these findings could extend beyond event extraction to other information extraction tasks that decompose into interdependent subtasks.

Lost in Formatting: How Output Formats Skew LLM Performance on Information Extraction
Rishi Ravikumar | Nuhu Ibrahim | Riza Batista-Navarro
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)

We investigate how the choice of output format influences the performance of fine-tuned large language models on information extraction tasks. Based on over 280 experiments spanning multiple benchmarks, models and formats, we find that output formatting is a critical yet largely overlooked hyperparameter. Remarkably, in some cases, changing only the output format shifts F1 scores by over 40% despite using the same model. We further observe that no single format consistently dominates across settings, and the optimal choice depends on factors like model family and dataset characteristics. Overall, these results demonstrate that informationally equivalent output formats can produce substantial performance variation, highlighting the need to treat output formatting as a key factor in building accurate and reliable information extraction systems.
