Enhancing User-Controlled Text-to-Image Generation with Layout-Aware Personalization

Hongliang Luo; Wei Xi

Abstract

Recent diffusion-based models have advanced text-to-image synthesis, yet they struggle to preserve fine visual details and to enable precise spatial control in personalized content. We propose **LayoutFlex**, a novel framework that combines a Perspective-Adaptive Feature Extraction system with a Spatial Control Mechanism. Our approach captures fine-grained details via cross-modal representation learning and attention refinement, while enabling precise subject placement through coordinate-aware attention and region-constrained optimization. Experiments show that LayoutFlex outperforms prior methods in visual fidelity (DINO ↑10.8%) and spatial accuracy (AP 43.1±1.2 vs. 19.3). LayoutFlex supports both single- and multi-subject personalization, offering a powerful solution for controllable and coherent image generation in creative and interactive applications.

Anthology ID: 2025.acl-long.1556
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 32349–32364
URL: https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.1556/
Cite (ACL): Hongliang Luo and Wei Xi. 2025. Enhancing User-Controlled Text-to-Image Generation with Layout-Aware Personalization. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 32349–32364, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Enhancing User-Controlled Text-to-Image Generation with Layout-Aware Personalization (Luo & Xi, ACL 2025)
PDF: https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.1556.pdf
