Franco M. Luque


2022

RoBERTuito: a pre-trained language model for social media text in Spanish
Juan Manuel Pérez | Damián Ariel Furman | Laura Alonso Alemany | Franco M. Luque
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Since BERT appeared, Transformer language models and transfer learning have become state-of-the-art for natural language processing tasks. Recently, some works have focused on pre-training specially crafted models for particular domains, such as scientific papers, medical documents, and user-generated texts, among others. These domain-specific models have been shown to improve performance significantly in most tasks; however, for languages other than English, such models are not widely available. In this work, we present RoBERTuito, a pre-trained language model for user-generated text in Spanish, trained on over 500 million tweets. Experiments on a benchmark of tasks involving user-generated text showed that RoBERTuito outperformed other pre-trained language models in Spanish. In addition to this, our model has some cross-lingual abilities, achieving top results for English-Spanish tasks of the Linguistic Code-Switching Evaluation benchmark (LinCE) and also competitive performance against monolingual models in English Twitter tasks. To facilitate further research, we make RoBERTuito publicly available at the HuggingFace model hub together with the dataset used to pre-train it.

2021

Region under Discussion for visual dialog
Mauricio Mazuecos | Franco M. Luque | Jorge Sánchez | Hernán Maina | Thomas Vadora | Luciana Benotti
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Visual Dialog is assumed to require the dialog history to generate correct responses during a dialog. However, it is not clear from previous work how dialog history is needed for visual dialog. In this paper we define what it means for a visual question to require dialog history and we release a subset of the GuessWhat?! questions whose dialog history completely changes their responses. We propose a novel interpretable representation that visually grounds dialog history: the Region under Discussion. It constrains the image’s spatial features according to a semantic representation of the history inspired by the information structure notion of Question under Discussion. We evaluate the architecture on task-specific multimodal models and the visual transformer model LXMERT.

2019

Atalaya at SemEval 2019 Task 5: Robust Embeddings for Tweet Classification
Juan Manuel Pérez | Franco M. Luque
Proceedings of the 13th International Workshop on Semantic Evaluation

In this article, we describe our participation in HatEval, a shared task aimed at the detection of hate speech against immigrants and women. We focused on the Spanish subtasks, building on our previous experience with sentiment analysis in this language. We trained linear classifiers and Recurrent Neural Networks, using classic features, such as bag-of-words, bag-of-characters, and word embeddings, as well as recent techniques such as contextualized word representations. In particular, we trained robust task-oriented subword-aware embeddings and computed tweet representations using a weighted-averaging strategy. In the final evaluation, our systems showed competitive results for both Spanish subtasks ES-A and ES-B, achieving the first and fourth places respectively.
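
The weighted-averaging strategy mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the weighting scheme, so the frequency-based weight a / (a + p(w)) used here is an assumption, and the function names (weighted_average_embedding, frequency_weights) and toy vectors are hypothetical stand-ins for real subword-aware embeddings.

```python
from collections import Counter


def weighted_average_embedding(tokens, vectors, weights, dim):
    """Return the weighted mean of the vectors of the known tokens.

    Tokens missing from `vectors` are skipped; tokens missing from
    `weights` get weight 1.0. An all-unknown input yields a zero vector.
    """
    total = [0.0] * dim
    weight_sum = 0.0
    for tok in tokens:
        if tok not in vectors:
            continue
        w = weights.get(tok, 1.0)
        for i, x in enumerate(vectors[tok]):
            total[i] += w * x
        weight_sum += w
    if weight_sum == 0.0:
        return total
    return [x / weight_sum for x in total]


def frequency_weights(corpus, a=1e-3):
    """Assumed weighting scheme: a / (a + p(w)).

    p(w) is the relative frequency of token w in `corpus`, so frequent
    tokens receive smaller weights than rare ones.
    """
    counts = Counter(tok for doc in corpus for tok in doc)
    n = sum(counts.values())
    return {tok: a / (a + c / n) for tok, c in counts.items()}


# Toy usage with two-dimensional vectors (illustrative values only).
toy_vectors = {"odio": [1.0, 0.0], "el": [0.0, 1.0]}
toy_weights = {"odio": 3.0, "el": 1.0}
tweet_vector = weighted_average_embedding(["odio", "el"], toy_vectors, toy_weights, 2)
```

With weights like these, rarer and presumably more informative tokens dominate the tweet representation, which could then be fed to a linear classifier.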

2013

Unsupervised Spectral Learning of WCFG as Low-rank Matrix Completion
Raphaël Bailly | Xavier Carreras | Franco M. Luque | Ariadna Quattoni
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

2012

Spectral Learning for Non-Deterministic Dependency Parsing
Franco M. Luque | Ariadna Quattoni | Borja Balle | Xavier Carreras
Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics

2009

Upper Bounds for Unsupervised Parsing with Unambiguous Non-Terminally Separated Grammars
Franco M. Luque | Gabriel Infante-Lopez
Proceedings of the EACL 2009 Workshop on Computational Linguistic Aspects of Grammatical Inference