Pengyu Zeng
2025
FloorPlan-LLaMa: Aligning Architects’ Feedback and Domain Knowledge in Architectural Floor Plan Generation
Jun Yin
|
Pengyu Zeng
|
Haoyuan Sun
|
Yuqin Dai
|
Han Zheng
|
Miao Zhang
|
Yachao Zhang
|
Shuai Lu
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Floor plans serve as a graphical language through which architects sketch and communicate their design ideas. Actually, in the Architecture, Engineering, and Construction (AEC) design stages, generating floor plans is a complex task requiring domain expertise and alignment with user requirements. However, existing evaluation methods for floor plan generation rely mainly on statistical metrics like FID, GED, and PSNR, which often fail to evaluate using domain knowledge. As a result, even high-performing models on these metrics struggle to generate viable floor plans in practice. To address this, (1) we propose ArchiMetricsNet, the first floor plan dataset that includes functionality, flow, and overall evaluation scores, along with detailed textual analyses. We trained FloorPlan-MPS (Multi-dimensional Preference Score) on it. (2) We develope FloorPlan-LLaMa, a floor plan generation model based on autoregressive framework. To integrate architects’ professional expertise and preferences, FloorPlan-MPS serves as the reward model during the RLHF (Reinforcement Learning from Human Feedback) process, aligning FP-LLaMa with the needs of the architectural community. (3) Comparative experiments demonstrate that our method outperforms baseline models in both text-conditional and class-conditional tasks. Validation by professional architects confirms that our approach yields more rational plans and aligns better with human preferences.
CARD: Cross-modal Agent Framework for Generative and Editable Residential Design
Pengyu Zeng
|
Jun Yin
|
Miao Zhang
|
Yuqin Dai
|
Jizhizi Li
|
ZhanXiang Jin
|
Shuai Lu
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
In recent years, architectural design automation has made significant progress, but the complexity of open-world environments continues to make residential design a challenging task, often requiring experienced architects to perform multiple iterations and human-computer interactions. Therefore, assisting ordinary users in navigating these complex environments to generate and edit residential design is crucial. In this paper, we present the CARD framework, which leverages a system of specialized cross-modal agents to adapt to complex open-world environments. The framework includes a point-based cross-modal information representation (CMI-P) that encodes the geometry and spatial relationships of residential rooms, a cross-modal residential generation model, supported by our customized Text2FloorEdit model, that acts as the lead designer to create standardized floor plans, and an embedded expert knowledge base for evaluating whether the designs meet user requirements and residential codes, providing feedback accordingly. Finally, a 3D rendering module assists users in visualizing and understanding the layout. CARD enables cross-modal residential generation from free-text input, empowering users to adapt to complex environments without requiring specialized expertise.