Haodong Wu


2025

ArchiDocGen: Multi-Agent Framework for Expository Document Generation in the Architectural Industry
Junjie Jiang | Haodong Wu | Yongqi Zhang | Songyue Guo | Bingcen Liu | Caleb Chen Cao | Ruizhe Shao | Chao Guan | Peng Xu | Lei Chen
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track)

The architectural industry produces extensive documents, including method statements—expository documents that integrate multi-source data into actionable guidance. Manual drafting, however, is labor-intensive and time-consuming. This paper introduces ArchiDocGen, a multi-agent framework that automates method statement generation. Unlike traditional approaches relying on static templates or single-pass generation, ArchiDocGen decomposes the task into three steps: outline generation, section-based content generation, and polishing, each handled by specialized agents. To provide domain expertise, ArchiDocGen employs a section-based retriever to fetch and synthesize relevant documents from its custom knowledge base. Each section is generated through the iterative reasoning of a section-based chain-of-thought (SeCoT) scheme, followed by refinement to meet professional standards. To evaluate the generated method statements, we partner with industry to establish a multi-dimensional evaluation system combining automatic and empirical methods. Experiments show that ArchiDocGen achieves a ContentScore of 4.38, excelling in specialization, completeness, organization, and clarity. Additionally, a web-based application for ArchiDocGen has been developed and deployed with industry partners.

2024

Empirical Prior for Text Autoencoders
Yongjing Yin | Wenyang Gao | Haodong Wu | Jianhao Yan | Yue Zhang
Findings of the Association for Computational Linguistics: EMNLP 2024

This paper explores the application of Variational Autoencoders (VAE) in text generation, focusing on overcoming challenges such as posterior collapse and the limitations of simplistic prior distributions. We investigate a transition from VAE to text autoencoders (AE), which model a compact latent space and preserve the capability of the language model itself. Our method involves layer-wise latent vectors regularized by orthogonal constraints to encourage distinct semantic spaces. In particular, we estimate an empirical prior online from the learned latent vectors to support sampling during generation, as in VAE. Experimental results on standard benchmarks demonstrate that the autoencoders generate higher-quality and more diverse text than the VAE-based Transformer baselines, offering an effective alternative for generative language modeling.

1999

Determining the Antecedent of Noun Phrase Containing the Determiner KONO or SONO in Japanese
Toshimasa Koga | Haodong Wu | Teiji Furugori
Proceedings of the 13th Pacific Asia Conference on Language, Information and Computation

1998

Structural Disambiguation Based on Reliable Estimation of Strength of Association
Haodong Wu | Eduardo de Paiva Alves | Teiji Furugori
COLING 1998 Volume 2: The 17th International Conference on Computational Linguistics

Structural Disambiguation Based on Reliable Estimation of Strength of Association
Haodong Wu | Eduardo de Paiva Alves | Teiji Furugori
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 2

A Computational Method for Resolving Ambiguities in Coordinate Structures
Haodong Wu | Teiji Furugori
Proceedings of the 12th Pacific Asia Conference on Language, Information and Computation

1996

Prepositional Phrase Attachment Through A Hybrid Disambiguation Model
Haodong Wu | Teiji Furugori
COLING 1996 Volume 2: The 16th International Conference on Computational Linguistics