Frame2KG: A Benchmark and Evaluation Toolkit for Interpretable Frame-to-Graph Generation

Lewis N. Watson, Carl Strathearn, Kenny Mitchell, Yanchao Yu


Abstract
Interpretable frame-to-knowledge-graph (Frame2KG) generation enables structured visual scene representation while supporting on-device inference to enhance privacy, improve interpretability, and minimise compute. We introduce Frame2KG-YC2, a synthetic, reproducible dataset derived from YouCook2 that pairs keyframes with schema-valid JSON knowledge graphs containing typed, spatially grounded entities and semantic predicates, alongside faithful textual paraphrases. Using this corpus, we fine-tune Qwen2.5-VL models (3B and 7B) with parameter-efficient LoRA adapters on attention layers (QKVO), with and without GateProj/Up/Down MLP projections. For evaluation and benchmarking, we propose a deterministic toolkit featuring two-stage node matching (an IoU gate followed by Hungarian assignment on blended spatial-semantic similarity) and comprehensive metrics spanning node/edge precision-recall-F1, matched-pair IoU, and structural validity. On a held-out test set, our models achieve Node F1μ up to 0.621 and Edge F1μ up to 0.208, with a mean matched IoU of ≈0.61 and >98% schema conformity. We show that MLP gating consistently improves predicate accuracy and spatial grounding, while post-training quantisation maintains accuracy and improves deployability on edge hardware. We release the dataset, code, adapters, and evaluation toolkit to establish an open, interpretable baseline for future temporal and multi-view extensions.
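The two-stage node matching named in the abstract can be illustrated with a minimal sketch. This is not the released toolkit: the node layout (`label`, `bbox`), the gate threshold `iou_gate`, the blend weight `alpha`, the exact-match `label_sim` placeholder, and every function name below are assumptions made for illustration. Candidate pairs that fail the IoU gate receive a large cost, and a standard Hungarian (Kuhn-Munkres) solver then picks the assignment that maximises the blended similarity over the remaining pairs.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def label_sim(a, b):
    """Placeholder semantic similarity (assumption): exact label match."""
    return 1.0 if a.lower() == b.lower() else 0.0

def hungarian(cost):
    """Min-cost assignment on a square matrix; returns (row, col) pairs."""
    n = len(cost)
    inf = float("inf")
    u, v = [0.0] * (n + 1), [0.0] * (n + 1)
    p, way = [0] * (n + 1), [0] * (n + 1)   # p[j] = row assigned to column j
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [inf] * (n + 1), [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:                            # flip the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return [(p[j] - 1, j - 1) for j in range(1, n + 1)]

def match_nodes(pred, gold, iou_gate=0.5, alpha=0.5):
    """Stage 1: IoU gate. Stage 2: Hungarian assignment on blended similarity."""
    size = max(len(pred), len(gold))
    if size == 0:
        return []
    big = 1e6                                # cost for gated-out or padded pairs
    cost = [[big] * size for _ in range(size)]
    kept = {}
    for i, p_node in enumerate(pred):
        for j, g_node in enumerate(gold):
            overlap = iou(p_node["bbox"], g_node["bbox"])
            if overlap >= iou_gate:          # only gated pairs get a real cost
                sem = label_sim(p_node["label"], g_node["label"])
                cost[i][j] = 1.0 - (alpha * overlap + (1 - alpha) * sem)
                kept[(i, j)] = overlap
    pairs = [(i, j) for i, j in hungarian(cost) if (i, j) in kept]
    return sorted((i, j, kept[(i, j)]) for i, j in pairs)

def prf1(n_matched, n_pred, n_gold):
    """Precision, recall and F1 from match counts."""
    prec = n_matched / n_pred if n_pred else 0.0
    rec = n_matched / n_gold if n_gold else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f1

if __name__ == "__main__":
    pred = [{"label": "pan", "bbox": (0, 0, 10, 10)},
            {"label": "knife", "bbox": (20, 20, 30, 30)}]
    gold = [{"label": "knife", "bbox": (21, 20, 31, 30)},
            {"label": "pan", "bbox": (0, 0, 10, 10)},
            {"label": "bowl", "bbox": (50, 50, 60, 60)}]
    matches = match_nodes(pred, gold)
    print(matches, prf1(len(matches), len(pred), len(gold)))
```

Summing the matched, predicted, and gold counts over all frames before calling `prf1` would give a micro-averaged score, which is one plausible reading of the F1μ notation. The mean of the third element of each match tuple corresponds to a matched-pair IoU.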
Anthology ID: 2026.lrec-main.854
Volume: Proceedings of the Fifteenth Language Resources and Evaluation Conference
Month: May
Year: 2026
Address: Palma de Mallorca, Spain
Editors: Stelios Piperidis, Núria Bel, Henk van den Heuvel, Nancy Ide, Simon Krek, Antonio Toral
Venue: LREC
Publisher: ELRA Language Resource Association
Pages: 10912–10926
URL: https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.854/
Cite (ACL): Lewis N. Watson, Carl Strathearn, Kenny Mitchell, and Yanchao Yu. 2026. Frame2KG: A Benchmark and Evaluation Toolkit for Interpretable Frame-to-Graph Generation. International Conference on Language Resources and Evaluation, main:10912–10926.
Cite (Informal): Frame2KG: A Benchmark and Evaluation Toolkit for Interpretable Frame-to-Graph Generation (Watson et al., LREC 2026)
PDF: https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.854.pdf