Thematic Categorization on Pineapple Production in Costa Rica: An Exploratory Analysis through Topic Modeling

Valentina Tretti Beckles, Adrian Vergara Heidke


Abstract
Costa Rica is one of the largest producers and exporters of pineapple in the world. This status has encouraged multinational companies to use plantations in this Central American country for experimentation and the cultivation of new varieties, such as the Pinkglow pineapple. However, pineapple monoculture has significant socio-environmental impacts on the regions where it is cultivated. In this exploratory study, we aimed to analyze how pineapple production is portrayed on the Internet. To this end, we collected a corpus of Spanish and English texts from online sources in two phases: with the BootCat tool and through manual searches of newspaper websites. The Hierarchical Dirichlet Process (HDP) topic model was then applied to identify the dominant topics within the corpus. These topics were subsequently classified into thematic categories, and the texts were categorized accordingly. The findings indicate that environmental issues related to pineapple cultivation are underrepresented on the Internet, particularly in comparison with the extensive coverage of pineapple production and marketing.
Anthology ID:
2025.nlp4ecology-1.11
Volume:
Proceedings of the 1st Workshop on Ecology, Environment, and Natural Language Processing (NLP4Ecology2025)
Month:
March
Year:
2025
Address:
Tallinn, Estonia
Editors:
Valerio Basile, Cristina Bosco, Francesca Grasso, Muhammad Okky Ibrohim, Maria Skeppstedt, Manfred Stede
Venues:
NLP4Ecology | WS
Publisher:
University of Tartu Library
Pages:
44–55
URL:
https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.nlp4ecology-1.11/
Cite (ACL):
Valentina Tretti Beckles and Adrian Vergara Heidke. 2025. Thematic Categorization on Pineapple Production in Costa Rica: An Exploratory Analysis through Topic Modeling. In Proceedings of the 1st Workshop on Ecology, Environment, and Natural Language Processing (NLP4Ecology2025), pages 44–55, Tallinn, Estonia. University of Tartu Library.
Cite (Informal):
Thematic Categorization on Pineapple Production in Costa Rica: An Exploratory Analysis through Topic Modeling (Beckles & Heidke, NLP4Ecology 2025)
PDF:
https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.nlp4ecology-1.11.pdf