Historical Ink: Exploring Large Language Models for Irony Detection in 19th-Century Spanish

K. Bretonnel Cohen; Laura Manrique-Gómez; Rubén Manrique

Abstract

This study explores the use of large language models (LLMs) to enhance datasets and improve irony detection in 19th-century Latin American newspapers. Two strategies were employed to evaluate the efficacy of BERT and GPT models in capturing the subtle, nuanced nature of irony, through both multi-class and binary classification tasks. First, we implemented dataset enhancements focused on enriching emotional and contextual cues; however, these showed limited impact on historical language analysis. The second strategy, a semi-automated annotation process, effectively addressed class imbalance and augmented the dataset with high-quality annotations. Despite the challenges posed by the complexity of irony, this work advances sentiment analysis through two key contributions: a new historical Spanish dataset tagged for sentiment analysis and irony detection, and a semi-automated annotation methodology in which human expertise is crucial for refining LLM results, enriched by incorporating historical and cultural contexts as core features.

Anthology ID: 2025.nlp4dh-1.48
Volume: Proceedings of the 5th International Conference on Natural Language Processing for Digital Humanities
Month: May
Year: 2025
Address: Albuquerque, USA
Editors: Mika Hämäläinen, Emily Öhman, Yuri Bizzoni, So Miyagawa, Khalid Alnajjar
Venues: NLP4DH | WS
Publisher: Association for Computational Linguistics
Pages: 559–569
URL: https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.nlp4dh-1.48/
Cite (ACL): Kevin Cohen, Laura Manrique-Gómez, and Ruben Manrique. 2025. Historical Ink: Exploring Large Language Models for Irony Detection in 19th-Century Spanish. In Proceedings of the 5th International Conference on Natural Language Processing for Digital Humanities, pages 559–569, Albuquerque, USA. Association for Computational Linguistics.
Cite (Informal): Historical Ink: Exploring Large Language Models for Irony Detection in 19th-Century Spanish (Cohen et al., NLP4DH 2025)
PDF: https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.nlp4dh-1.48.pdf
