Exploring Medium-Sized LLMs for Knowledge Base Construction

Tomás Cerveira Da Cruz Pinto; Hugo Gonçalo Oliveira; Chris-Bennet Fleger

Abstract

Knowledge base construction (KBC) is one of the great challenges in Natural Language Processing (NLP) and is of fundamental importance to the growth of the Semantic Web. Large Language Models (LLMs) may be useful for extracting structured knowledge, including subject-predicate-object triples. We tackle the LM-KBC 2023 Challenge by leveraging LLMs for KBC, using its dataset and benchmarking our results against those of challenge participants. Prompt engineering and ensemble strategies are tested for object prediction with pretrained LLMs in the 0.5-2B parameter range, which lies between the limits of tracks 1 and 2 of the challenge. Selected models are assessed in zero-shot and few-shot learning approaches when predicting the objects of 21 relations. Results demonstrate that instruction-tuned LLMs outperform generative baselines by up to four times, with relation-adapted prompts playing a crucial role in performance. The ensemble approach further enhances triple extraction, with a relation-based selection strategy achieving the highest F1 score. These findings highlight the potential of medium-sized LLMs and prompt engineering methods for efficient KBC.
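One way to picture a relation-based selection strategy is sketched below. This is only an illustration under stated assumptions, not the authors' implementation: it assumes that, for each relation, the ensemble keeps the single model with the highest mean set-level F1 on held-out examples. The function names, the model names (`m1`, `m2`), and the relation labels (`relA`, `relB`) are hypothetical.

```python
from collections import defaultdict


def f1(predicted, gold):
    """Set-level F1 between predicted and gold object lists.

    Assumption: an empty prediction for an empty gold set counts as a perfect score.
    """
    predicted, gold = set(predicted), set(gold)
    if not predicted and not gold:
        return 1.0
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def select_model_per_relation(dev_rows, predictions):
    """Pick, for every relation, the model with the best mean F1 on dev data.

    dev_rows:    list of (relation, subject, gold_objects) tuples
    predictions: {model_name: {(relation, subject): [predicted objects]}}
    """
    scores = defaultdict(lambda: defaultdict(list))
    for relation, subject, gold in dev_rows:
        for model, model_predictions in predictions.items():
            predicted = model_predictions.get((relation, subject), [])
            scores[relation][model].append(f1(predicted, gold))
    return {
        relation: max(by_model, key=lambda m: sum(by_model[m]) / len(by_model[m]))
        for relation, by_model in scores.items()
    }


# Toy data (hypothetical): two models, two relations.
dev_rows = [("relA", "s1", ["x", "y"]), ("relB", "s2", ["z"])]
predictions = {
    "m1": {("relA", "s1"): ["x", "y"], ("relB", "s2"): []},
    "m2": {("relA", "s1"): ["x"], ("relB", "s2"): ["z"]},
}
chosen = select_model_per_relation(dev_rows, predictions)
```

At prediction time, such an ensemble would route each subject-relation pair to the model stored in `chosen` for that relation. Other selection criteria (for example precision, or voting across models) are equally plausible readings of the abstract.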

Anthology ID: 2025.ldk-1.23
Volume: Proceedings of the 5th Conference on Language, Data and Knowledge
Month: September
Year: 2025
Address: Naples, Italy
Editors: Mehwish Alam, Andon Tchechmedjiev, Jorge Gracia, Dagmar Gromann, Maria Pia di Buono, Johanna Monti, Maxim Ionov
Venues: LDK | WS
Publisher: Unior Press
Pages: 221–232
URL: https://preview.aclanthology.org/ldl-25-ingestion/2025.ldk-1.23/
Cite (ACL): Tomás Cerveira Da Cruz Pinto, Hugo Gonçalo Oliveira, and Chris-Bennet Fleger. 2025. Exploring Medium-Sized LLMs for Knowledge Base Construction. In Proceedings of the 5th Conference on Language, Data and Knowledge, pages 221–232, Naples, Italy. Unior Press.
Cite (Informal): Exploring Medium-Sized LLMs for Knowledge Base Construction (Pinto et al., LDK 2025)
PDF: https://preview.aclanthology.org/ldl-25-ingestion/2025.ldk-1.23.pdf
