@inproceedings{chaksangchaichot-akarajaradwong-2025-budget,
title = "A Budget Recipe for Finetuning a Long-form Legal Summarization Model",
author = "Chaksangchaichot, Chompakorn and
Akarajaradwong, Pawitsapak",
editor = "Modi, Ashutosh and
Ghosh, Saptarshi and
Ekbal, Asif and
Goyal, Pawan and
Jain, Sarika and
Joshi, Abhinav and
Mishra, Shivani and
Datta, Debtanu and
Paul, Shounak and
Singh, Kshetrimayum Boynao and
Kumar, Sandeep",
booktitle = "Proceedings of the 1st Workshop on NLP for Empowering Justice (JUST-NLP 2025)",
month = dec,
year = "2025",
address = "Mumbai, India",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.justnlp-main.11/",
pages = "113--120",
ISBN = "979-8-89176-312-8",
abstract = "We describe an inexpensive system that ranked first in the JUST-NLP 2025 L-SUMM task, summarizing very long Indian court judgments (up to 857k characters) using a single 80GB GPU and a total budget of about {\$}50. Our pipeline first filters out length{--}summary outliers, then applies two-stage LoRA SFT on Qwen3-4B-Instruct-2507 to learn style and extend context, and finally runs RLVR tuned to BLEU, ROUGE-2, and ROUGE-L, with BLEU upweighted. We showed that two-stage SFT is better than a single-stage run, and RLVR gives the largest gains, reaching 32.71 internal vs. 16.15 base and 29.91 on the test leaderboard. In ablation on prompting, we find that a simple, naive prompt converges faster but saturates earlier, while the curated legal-structured prompt keeps improving with longer training and yields higher final scores, and the finetuned model remains fairly robust to unseen prompts. Our code are fully open-sourced, available for reproducibility."
}