Diogo Fernandes Costa Silva
Also published as: Diogo Fernandes
2026
AKCIT at SemEval-2026 Task 13: A Lightweight LightGBM Baseline for Cross-Language Detection of LLM-Generated Code
Rone Brandao Filho | Walcy Santos Rezende Rios | Lucas Neves | Jose Ricardo Fleury Oliveira | Diogo Fernandes | Arlindo Galvão Filho
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
The widespread use of LLMs in software development has made the detection of machine-generated code a pressing challenge, particularly when models must generalize across programming languages and domains. We present a lightweight, LLM-free pipeline that combines stylometric feature extraction with a LightGBM classifier and explicitly prioritizes structural generalization over deep semantic modeling. Despite its simplicity, the method achieves a Macro F1 of 0.70–0.72, more than doubling the CodeBERT baseline (0.30) in SemEval-2026 Task 13 Subtask A, while operating without GPUs or any fine-tuning.
AKCIT-UFG at SemEval-2026 Task 8: Structured Chunking and Optimized Query Reformulation for Efficient Multi-Turn Retrieval
David Ferreira | Wilson Ramos | Priscila Ribeiro | Emanuel Passinato | Diogo Fernandes | Arlindo Filho
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
This submission investigates efficient multi-turn retrieval under constrained computational settings. We analyze how passage granularity and conversational query rewriting affect retrieval effectiveness across four benchmark domains. Using compact, locally deployable components, we show that smaller passage segmentation improves early-rank performance and that lightweight keyword-oriented query reformulation substantially enhances dense retrieval quality. Importantly, we observe that rewriting interacts differently with encoder backbones: some compact models benefit significantly from increased query specificity, while others degrade, indicating sensitivity to rewrite-induced distribution shifts. Our findings demonstrate that competitive multi-turn retrieval does not require large proprietary models, but can emerge from principled structural and preprocessing design choices. The results highlight the importance of aligning chunking strategy, rewriting policy, and encoder characteristics in resource-efficient MT-RAG systems.
Safety Is Not Universal: The Selective Safety Trap in LLM Alignment
Iago Alves Brito | Walcy Rios | Julia Soares Dollis | Diogo Fernandes Costa Silva | Arlindo Rodrigues Galvão Filho
Findings of the Association for Computational Linguistics: ACL 2026
Current safety evaluations of large language models (LLMs) create a dangerous illusion of universal protection by aggregating harms under generic categories such as "Identity Hate", obscuring vulnerabilities toward specific populations. In this work, we expose the Selective Safety Trap: a systemic failure mode where models robustly defend specific populations while leaving underrepresented communities highly vulnerable to identical adversarial attacks. To systematically audit this phenomenon, we introduce MiJaBench, a bilingual (English–Portuguese) adversarial benchmark comprising 43,961 controlled jailbreaking prompts across 16 minority groups. By evaluating 14 state-of-the-art LLMs on MiJaBench, we curate 615,454 prompt-response pairs that compose MiJaBench-Align, revealing that safety alignment is not a uniform semantic capability but a demographic hierarchy, with defense rates fluctuating by up to 42% within the same model solely based on the target group. This disparity persists across architectures and languages and is amplified by scaling, indicating that current alignment methods learn group-specific safeguards rather than a generalized notion of harm. Through targeted direct preference optimization (DPO) on a 1B-parameter baseline, we achieve strong zero-shot safety generalizations to entirely unseen demographics and complex attack strategies. We release all datasets and scripts to provide the community with a concrete pathway toward equitable, transferable safety alignment.
2022
CEIA-NLP at CASE 2022 Task 1: Protest News Detection for Portuguese
Diogo Fernandes | Adalberto Junior | Gabriel Marques | Anderson Soares | Arlindo Galvao Filho
Proceedings of the 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE)
This paper summarizes our work on the document classification subtask of Multilingual Protest News Detection at the CASE @ ACL-IJCNLP 2022 workshop. In this context, we investigate the performance of monolingual and multilingual transformer-based models in low-resource settings, taking Portuguese as an example and evaluating language models on document classification. Our approach was the winning solution for Portuguese document classification, achieving an F1 score of 0.8007 on the test set. The experimental results demonstrate that multilingual models achieve the best results when few samples are available for a specific language, because they can be trained on datasets from other languages that cover the same task and domain.