ReciFine: Finely Annotated Recipe Dataset for Controllable Recipe Generation
Nuhu Ibrahim, Rishi Ravikumar, Robert Stevens, Riza Batista-Navarro
Abstract
We introduce ReciFine, the largest human-evaluated, finely annotated recipe dataset to date, designed to advance controllable and trustworthy recipe generation. Existing resources, such as RecipeNLG, extract food items only from ingredient lists, overlooking entities expressed in instructions, including tools, chef actions, food and tool states, and durations, which are crucial for realistic and context-aware generation. To address this limitation, we extend RecipeNLG with finely annotated extraction of over 97 million entities across ten entity types from 2.2 million recipes. We are the first to explore recipe generation with explicit control over multiple entity types, enabling models to generate recipes conditioned not only on ingredients but also on tools, chef actions, cooking durations, and other contextual factors. Large language models fine-tuned or few-shot prompted with ReciFine extractions consistently outperform those trained on ingredient-list data alone across both automatic and human evaluations. ReciFine establishes a foundation for factual, coherent, structured, controllable recipe generation, and we release a human-annotated benchmark to support future evaluation and model development.
- Anthology ID: 2026.findings-eacl.210
- Volume: Findings of the Association for Computational Linguistics: EACL 2026
- Month: March
- Year: 2026
- Address: Rabat, Morocco
- Editors: Vera Demberg, Kentaro Inui, Lluís Marquez
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 4058–4074
- URL: https://preview.aclanthology.org/ingest-eacl/2026.findings-eacl.210/
- Cite (ACL): Nuhu Ibrahim, Rishi Ravikumar, Robert Stevens, and Riza Batista-Navarro. 2026. ReciFine: Finely Annotated Recipe Dataset for Controllable Recipe Generation. In Findings of the Association for Computational Linguistics: EACL 2026, pages 4058–4074, Rabat, Morocco. Association for Computational Linguistics.
- Cite (Informal): ReciFine: Finely Annotated Recipe Dataset for Controllable Recipe Generation (Ibrahim et al., Findings 2026)
- PDF: https://preview.aclanthology.org/ingest-eacl/2026.findings-eacl.210.pdf