Rationalize and Align: Enhancing Writing Assistance with Rationale via Self-Training for Improved Alignment

Hannan Cao; Hai Ye; Hwee Tou Ng

Abstract

A Writing Assistant (WA) is a system that offers writing suggestions based on user instructions. Existing WAs are typically built by training large language models (LLMs) on domain-specific instruction data using supervised fine-tuning (SFT) alone. However, SFT optimizes a model to match a single reference and so fails to capture the inherent flexibility of text editing, where multiple valid revisions exist. Relying solely on SFT therefore limits WA performance. To address this limitation, we propose the Rationalize and Align framework, which enhances WA performance with rationale (i.e., linguistic explanations) and alignment. Our framework automatically generates the rationale and preference data for writing tasks via distillation and self-training, eliminating the need for human annotation. These data are then used to refine the WA with a novel preference optimization method. Empirical results show that our framework significantly improves WA performance. Our WA outperforms open-source state-of-the-art WAs and the closed-source GPT-4o by 3.9 and 7.1 points on average, respectively, across eight well-established writing-related test sets.

Anthology ID: 2025.findings-acl.1383
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 26967–26982
URL: https://preview.aclanthology.org/display_plenaries/2025.findings-acl.1383/
Cite (ACL): Hannan Cao, Hai Ye, and Hwee Tou Ng. 2025. Rationalize and Align: Enhancing Writing Assistance with Rationale via Self-Training for Improved Alignment. In Findings of the Association for Computational Linguistics: ACL 2025, pages 26967–26982, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Rationalize and Align: Enhancing Writing Assistance with Rationale via Self-Training for Improved Alignment (Cao et al., Findings 2025)
PDF: https://preview.aclanthology.org/display_plenaries/2025.findings-acl.1383.pdf
