Multi-Step Generation of Test Specifications using Large Language Models for System-Level Requirements

Dragan Milchevski, Gordon Frank, Anna Hätty, Bingqing Wang, Xiaowei Zhou, Zhe Feng


Abstract
System-level testing is a critical phase in the development of large, safety-dependent systems, such as those in the automotive industry. However, creating test specifications can be a time-consuming and error-prone process. This paper presents an AI-based assistant to aid users in creating test specifications for system-level requirements. The system mimics the working process of a test developer by utilizing an LLM and an agentic framework, and by introducing intermediate test artifacts, that is, structured intermediate representations derived from input requirements. Our user study demonstrates a 30 to 40% reduction in the effort required for test development. For test specification generation, our quantitative analysis reveals that iteratively providing the model with more targeted information, such as examples of similar test specifications based on comparable requirements and purposes, can boost performance by up to 30% in ROUGE-L. Overall, our approach has the potential to improve the efficiency, accuracy, and reliability of system-level testing and can be applied to various industries where safety and functionality are paramount.
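For readers unfamiliar with the ROUGE-L metric mentioned in the abstract, the following is a minimal sketch of how such a score can be computed between a generated test specification and a reference one. It is based on the longest common subsequence (LCS) of tokens. The whitespace tokenization, the lowercasing, the balanced F1 variant, and the function names `lcs_length` and `rouge_l_f1` are assumptions made for illustration; the abstract does not state which ROUGE implementation or settings the authors used.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Extend the subsequence ending at the previous tokens.
                curr.append(prev[j - 1] + 1)
            else:
                # Carry over the best length seen so far.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L as a balanced F1 over LCS-based precision and recall.

    Assumption: lowercase, whitespace-split tokens; no stemming.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical test-step sentences, not taken from the paper.
    generated = "Set vehicle speed to 50 km/h and activate the brake"
    reference = "Set the vehicle speed to 50 km/h then activate the brake pedal"
    print(round(rouge_l_f1(generated, reference), 3))
```

Because the score rewards in-order token overlap rather than exact matches, it tolerates small wording differences between a generated and a reference specification, which is presumably why it suits free-text test steps.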
Anthology ID:
2025.acl-industry.11
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Georg Rehm, Yunyao Li
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
132–146
URL:
https://preview.aclanthology.org/mtsummit-25-ingestion/2025.acl-industry.11/
DOI:
10.18653/v1/2025.acl-industry.11
Cite (ACL):
Dragan Milchevski, Gordon Frank, Anna Hätty, Bingqing Wang, Xiaowei Zhou, and Zhe Feng. 2025. Multi-Step Generation of Test Specifications using Large Language Models for System-Level Requirements. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track), pages 132–146, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Multi-Step Generation of Test Specifications using Large Language Models for System-Level Requirements (Milchevski et al., ACL 2025)
PDF:
https://preview.aclanthology.org/mtsummit-25-ingestion/2025.acl-industry.11.pdf