Suman Saha
2026
Reference-Free Schema Generation for Literature Review Tables via Multi-Faceted Rewards
Sinjoy Saha | Suman Saha | Mahfuza Farooque | Wenpeng Yin
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
To accelerate scientific knowledge acquisition, LLMs are increasingly used to synthesize multiple papers into structured tables by inferring schemas and values. While value generation within a fixed schema can often be reduced to extractive question answering, the schema generation problem, that is, deciding along which dimensions a set of documents should be compared, lacks a formal mapping to standard NLP tasks. In this work, we formulate schema generation as a reinforcement learning problem and investigate whether these dimensions can be induced without access to gold-standard schemas. We design a multi-faceted reward framework capturing schema coverage, non-redundancy, relevance, and format, and train a small language model on a literature review dataset. Our approach yields consistent improvements over the untuned base model across intrinsic, reference-based, and LLM-judge metrics, and remains competitive on structural and diversity dimensions with supervised fine-tuned models that have 5× the parameter count. All code, results, and prompts are available in the GitHub repository: https://github.com/sinjoysaha/rl-schema-generation
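The abstract names four reward facets (coverage, non-redundancy, relevance, and format) but does not define them. The sketch below only illustrates how such a reference-free reward could be assembled. Everything in it is an assumption made for illustration: the JSON-list output format, the lexical-overlap proxies for each facet, the equal weights, and the names `parse_schema` and `schema_reward`. It is not the authors' implementation, which is in the linked repository.

```python
import json
import re


def _tokens(text):
    """Lowercased alphanumeric tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def _jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def parse_schema(output):
    """Return the column names if `output` is a JSON list of non-empty
    strings (an assumed output format), otherwise None."""
    try:
        cols = json.loads(output)
    except ValueError:
        return None
    if isinstance(cols, list) and cols and all(
        isinstance(c, str) and c.strip() for c in cols
    ):
        return cols
    return None


def schema_reward(output, documents, weights=(0.25, 0.25, 0.25, 0.25)):
    """Toy reference-free reward: a weighted sum of four facets in [0, 1].
    No gold schema is consulted; only the input documents are used."""
    columns = parse_schema(output)
    if columns is None or not documents:
        return 0.0  # malformed output earns nothing (format facet fails)

    col_tokens = [_tokens(c) for c in columns]
    doc_tokens = [_tokens(d) for d in documents]

    # Coverage (proxy): share of documents touched by at least one column.
    coverage = sum(
        any(c & d for c in col_tokens) for d in doc_tokens
    ) / len(doc_tokens)

    # Relevance (proxy): share of columns grounded in at least one document.
    relevance = sum(
        any(c & d for d in doc_tokens) for c in col_tokens
    ) / len(col_tokens)

    # Non-redundancy (proxy): 1 minus the largest pairwise column overlap.
    pairs = [
        (a, b) for i, a in enumerate(col_tokens) for b in col_tokens[i + 1:]
    ]
    non_redundancy = 1.0 - max((_jaccard(a, b) for a, b in pairs), default=0.0)

    fmt = 1.0  # output parsed successfully
    w_cov, w_red, w_rel, w_fmt = weights
    return (
        w_cov * coverage
        + w_red * non_redundancy
        + w_rel * relevance
        + w_fmt * fmt
    )
```

In a reinforcement learning loop, a scalar of this kind would be computed for each sampled schema and passed to the policy update. A real system would likely replace the lexical proxies with embedding-based or model-based scores.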