2025
Irapuarani at SemEval-2025 Task 10: Evaluating Strategies Combining Small and Large Language Models for Multilingual Narrative Detection
Gabriel Assis | Lívia De Azevedo | Joao De Moraes | Laura Ribeiro | Aline Paes
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
This paper presents the Irapuarani team’s participation in SemEval-2025 Task 10, Subtask 2, which focuses on hierarchical multi-label classification of narratives from online news articles. We explored three distinct strategies: (1) a direct classification approach using a multilingual Small Language Model (SLM), disregarding the hierarchical structure; (2) a translation-based strategy where texts from multiple languages were translated into a single language using a Large Language Model (LLM), followed by classification with a monolingual SLM; and (3) a hybrid strategy leveraging an SLM to filter domains and an LLM to assign labels while accounting for the hierarchy. We conducted experiments on datasets in all available languages, namely Bulgarian, English, Hindi, Portuguese and Russian. Our results show that Strategy 2 is the most generalizable across languages, achieving test set rankings of 21st in English, 9th in Portuguese and Russian, 7th in Bulgarian, and 10th in Hindi.
2024
Analysis of Material Facts on Financial Assets: A Generative AI Approach
Gabriel Assis | Daniela Vianna | Gisele L. Pappa | Alexandre Plastino | Wagner Meira Jr | Altigran Soares da Silva | Aline Paes
Proceedings of the Joint Workshop of the 7th Financial Technology and Natural Language Processing, the 5th Knowledge Discovery from Unstructured Data in Financial Services, and the 4th Workshop on Economics and Natural Language Processing
Material facts (MF) are crucial and obligatory disclosures that can significantly influence asset values. Following their release, financial analysts embark on the meticulous and highly specialized task of crafting analyses to shed light on their impact on company assets, a challenge elevated by the daily amount of MFs released. Generative AI, with its demonstrated power of crafting coherent text, emerges as a promising solution to this task. However, while these analyses must incorporate the MF, they must also transcend it, enhancing it with vital background information, valuable and grounded recommendations, prospects, potential risks, and their underlying reasoning. In this paper, we approach this task as an instance of controllable text generation, aiming to ensure adherence to the MF and other pivotal attributes as control elements. We first explore language models’ capacity to manage this task by embedding those elements into prompts and engaging popular chatbots. A bilingual proof of concept underscores both the potential and the challenges of applying generative AI techniques to this task.
Exploring Portuguese Hate Speech Detection in Low-Resource Settings: Lightly Tuning Encoder Models or In-Context Learning of Large Models?
Gabriel Assis | Annie Amorim | Jonnathan Carvalho | Daniel de Oliveira | Daniela Vianna | Aline Paes
Proceedings of the 16th International Conference on Computational Processing of Portuguese - Vol. 1