PromptFE: Automated Feature Engineering by Prompting

Yufeng Zou, Jean Utke, Diego Klabjan, Han Liu


Abstract
Automated feature engineering (AutoFE) liberates data scientists from the burden of manual feature construction. The semantic information of datasets provides rich context for feature engineering but has been underutilized in many existing AutoFE works. We present PromptFE, a novel AutoFE framework that leverages large language models (LLMs) to automatically construct features in a compact string format and generate semantic explanations based on dataset descriptions. By learning the performance of constructed features in context, the LLM iteratively improves feature construction. Through experiments on real-world datasets, we demonstrate the superior performance of PromptFE over state-of-the-art AutoFE methods. We verify the impact of dataset semantic information and provide a comprehensive study of the LLM-based feature construction process.
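The abstract does not specify the compact string format or the prompt layout, so the following is only a minimal sketch under stated assumptions: features are written as comma-separated postfix expressions over column names, and the best-scoring features so far are shown to the LLM as in-context examples. The operator set and the names `eval_feature` and `build_prompt` are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical operator set; the operators PromptFE actually uses are not given here.
BINARY = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b if b != 0 else 0.0,  # guard against division by zero
}
UNARY = {
    "log": lambda a: math.log(abs(a) + 1.0),
    "sqrt": lambda a: math.sqrt(abs(a)),
}


def eval_feature(expr, row):
    """Evaluate a comma-separated postfix expression on one row (a dict)."""
    stack = []
    for tok in expr.split(","):
        tok = tok.strip()
        if tok in BINARY:
            b, a = stack.pop(), stack.pop()  # right operand is popped first
            stack.append(BINARY[tok](a, b))
        elif tok in UNARY:
            stack.append(UNARY[tok](stack.pop()))
        else:
            stack.append(float(row[tok]))  # any other token is a column name
    if len(stack) != 1:
        raise ValueError(f"malformed expression: {expr!r}")
    return stack[0]


def build_prompt(description, history, k=3):
    """Assemble a prompt showing the k best (expression, score) pairs so far."""
    best = sorted(history, key=lambda h: h[1], reverse=True)[:k]
    lines = [description, "Previously constructed features and scores:"]
    lines += [f"{expr} -> {score:.3f}" for expr, score in best]
    lines.append("Propose a new feature in the same format:")
    return "\n".join(lines)
```

For example, with a row `{"weight": 80.0, "height": 2.0}`, the string `weight,height,height,*,/` evaluates to weight divided by height squared. In an iterative loop, each newly proposed feature would be scored with a downstream model and appended to `history` before the next prompt is built; the LLM call and the scoring step are left out here because the abstract does not describe them.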
Anthology ID:
2026.eacl-long.28
Volume:
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Vera Demberg, Kentaro Inui, Lluís Marquez
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
653–681
URL:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-long.28/
Cite (ACL):
Yufeng Zou, Jean Utke, Diego Klabjan, and Han Liu. 2026. PromptFE: Automated Feature Engineering by Prompting. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 653–681, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
PromptFE: Automated Feature Engineering by Prompting (Zou et al., EACL 2026)
PDF:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-long.28.pdf