Geological Text Summarization Using Generative Large Language Models
Matheus Stein de Aguiar, Rafael Oleques Nunes, Dennis Giovani Balreira
Abstract
Large generative language models have demonstrated impressive performance in various Natural Language Processing (NLP) tasks. However, the geological domain presents unique challenges for NLP due to its specialized language, which is dense with technical terms. Therefore, language models pre-trained on generic corpora may not be suitable for geological domain-specific tasks. This article compares several models to identify those with the best performance on a text summarization task in the Portuguese geological domain. We applied the models to a Revista Geologia USP dataset. The dataset consists of abstracts of scientific texts and their respective titles, which the models are expected to approximate through summarization. We tested the models in several scenarios, with and without examples provided, and at two temperature levels. We then evaluated the models’ performance using quantitative metrics and a brief qualitative analysis comparing the titles proposed by the models with the original titles. The results show that the Gemma3:27b model was better in some scenarios, while the Llama3:8b model performed best in others.
- Anthology ID:
- 2026.propor-1.11
- Volume:
- Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
- Month:
- April
- Year:
- 2026
- Address:
- Salvador, Brazil
- Editors:
- Marlo Souza, Iria de-Dios-Flores, Diana Santos, Larissa Freitas, Jackson Wilke da Cruz Souza, Eugénio Ribeiro
- Venue:
- PROPOR
- Publisher:
- Association for Computational Linguistics
- Pages:
- 111–119
- URL:
- https://preview.aclanthology.org/ingest-dnd/2026.propor-1.11/
- Cite (ACL):
- Matheus Stein de Aguiar, Rafael Oleques Nunes, and Dennis Giovani Balreira. 2026. Geological Text Summarization Using Generative Large Language Models. In Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1, pages 111–119, Salvador, Brazil. Association for Computational Linguistics.
- Cite (Informal):
- Geological Text Summarization Using Generative Large Language Models (Aguiar et al., PROPOR 2026)
- PDF:
- https://preview.aclanthology.org/ingest-dnd/2026.propor-1.11.pdf