@inproceedings{mahlaza-etal-2025-feasibility,
title = "On the Feasibility of {LLM}-based Automated Generation and Filtering of Competency Questions for Ontologies",
author = "Mahlaza, Zola and
Keet, C. Maria and
Chahinian, Nanee and
Haydar, Batoul",
editor = "Alam, Mehwish and
Tchechmedjiev, Andon and
Gracia, Jorge and
Gromann, Dagmar and
di Buono, Maria Pia and
Monti, Johanna and
Ionov, Maxim",
booktitle = "Proceedings of the 5th Conference on Language, Data and Knowledge",
month = sep,
year = "2025",
address = "Naples, Italy",
publisher = "Unior Press",
url = "https://preview.aclanthology.org/ldl-25-ingestion/2025.ldk-1.15/",
pages = "136--146",
ISBN = "978-88-6719-333-2",
abstract = "54 Competency questions for ontologies are used in a number of ontology development tasks. The questions' sentences structure have been analysed to inform ontology authoring and validation. One of the problems to make this a seamless process is the hurdle of writing good CQs manually or offering automated assistance in writing CQs. In this paper, we propose an enhanced and automated pipeline where one can trace meticulously through each step, using a mini-corpus, T5, and the SQuAD dataset to generate questions, and the CLaRO controlled language, semantic similarity, and other steps for filtering. This was evaluated with two corpora of different genre in the same broad domain and evaluated with domain experts. The final output questions across the experiments were around 25{\%} for scope and relevance and 45{\%} of unproblematic quality. Technically, it provided ample insight into trade-offs in generation and filtering, where relaxing filtering increased sentence structure diversity but also led to more spurious sentences that required additional processing"
}
Markdown (Informal)
[On the Feasibility of LLM-based Automated Generation and Filtering of Competency Questions for Ontologies](https://preview.aclanthology.org/ldl-25-ingestion/2025.ldk-1.15/) (Mahlaza et al., LDK 2025)
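To make the generate-then-filter idea from the abstract concrete, the snippet below is a minimal sketch, not the authors' pipeline: it produces candidate questions with a publicly available T5 question-generation model and then discards near-duplicates via sentence-embedding similarity. The model names, example passages, and similarity threshold are illustrative assumptions; the paper's CLaRO-based and other filtering steps are not reproduced here.

```python
# Hedged sketch of LLM-based question generation plus similarity filtering.
# Model choices and the 0.85 threshold are assumptions, not from the paper.
from transformers import pipeline
from sentence_transformers import SentenceTransformer, util

# A T5 model fine-tuned for question generation (assumed example model).
qg = pipeline("text2text-generation", model="valhalla/t5-base-qg-hl")
embedder = SentenceTransformer("all-MiniLM-L6-v2")

def generate_questions(passages, max_new_tokens=64):
    """Generate one candidate question per passage of a mini-corpus."""
    return [qg(p, max_new_tokens=max_new_tokens)[0]["generated_text"]
            for p in passages]

def filter_by_similarity(questions, threshold=0.85):
    """Keep a question only if it is not a near-duplicate of one already kept."""
    kept, kept_embeddings = [], []
    for q in questions:
        emb = embedder.encode(q, convert_to_tensor=True)
        if all(util.cos_sim(emb, e).item() < threshold for e in kept_embeddings):
            kept.append(q)
            kept_embeddings.append(emb)
    return kept

# Illustrative passages (invented, not from the paper's corpora).
passages = [
    "An urban drainage network transports wastewater and stormwater to treatment plants.",
    "Manholes provide access points for the inspection of sewer pipes.",
]
candidates = generate_questions(passages)
print(filter_by_similarity(candidates))
```

In the paper's actual pipeline, the filtering stage additionally checks candidates against the CLaRO controlled-language templates; the sketch above only illustrates the semantic-similarity deduplication step.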