Anselmo Peñas
Also published as: Anselmo Penas
2025
ViClaim: A Multilingual Multilabel Dataset for Automatic Claim Detection in Videos
Patrick Giedemann | Pius von Däniken | Jan Milan Deriu | Alvaro Rodrigo | Anselmo Peñas | Mark Cieliebak
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
The growing influence of video content as a medium for communication and misinformation underscores the urgent need for effective tools to analyze claims in multilingual and multi-topic settings. Existing efforts in misinformation detection largely focus on written text, leaving a significant gap in addressing the complexity of spoken text in video transcripts. We introduce ViClaim, a dataset of 1,798 annotated video transcripts across three languages (English, German, Spanish) and six topics. Each sentence in the transcripts is labeled with three claim-related categories: fact-check-worthy, fact-non-check-worthy, and opinion. We developed a custom annotation tool to facilitate the highly complex annotation process. Experiments with state-of-the-art multilingual language models demonstrate strong performance in cross-validation (macro F1 up to 0.896) but reveal challenges in generalization to unseen topics, particularly for distinct domains. Our findings highlight the complexity of claim detection in video transcripts. ViClaim offers a robust foundation for advancing misinformation detection in video-based communication, addressing a critical gap in multimodal analysis.
UNEDTeam at SemEval-2025 Task 10: Zero-Shot Narrative Classification
Jesus M. Fraile-Hernandez | Anselmo Peñas
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
In this paper we present our participation in Subtask 2 of SemEval-2025 Task 10, which focuses on the identification and classification of narratives in multilingual news about climate change and the Ukraine-Russia war. To address this task, we employed a Zero-Shot approach using a generative Large Language Model without prior training on the dataset. Our classification strategy proceeds in two steps: first, the system classifies the topic of each news item; subsequently, it identifies the sub-narratives directly at the finer level of granularity. We present a detailed analysis of the performance of our system compared to the best-ranked systems on the leaderboard, highlighting the strengths and limitations of our approach.
2024
UNED team at BEA 2024 Shared Task: Testing different Input Formats for predicting Item Difficulty and Response Time in Medical Exams
Alvaro Rodrigo | Sergio Moreno-Álvarez | Anselmo Peñas
Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024)
This paper presents the description and primary outcomes of our team’s participation in the BEA 2024 shared task. Our primary exploration involved employing transformer-based systems, particularly BERT models, due to their suitability for Natural Language Processing tasks and efficiency with computational resources. We experimented with various input formats, including concatenating all text elements and incorporating only the clinical case. Surprisingly, our results revealed different impacts on predicting difficulty versus response time, with the former favoring clinical text only and the latter benefiting from including the correct answer. Despite moderate performance in difficulty prediction, our models excelled in response time prediction, ranking highest among all participants. This study lays the groundwork for future investigations into more complex approaches and configurations, aiming to advance the automatic prediction of exam difficulty and response time.
HAMiSoN-Generative at ClimateActivism 2024: Stance Detection using generative large language models
Jesus M. Fraile-Hernandez | Anselmo Peñas
Proceedings of the 7th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2024)
CASE at EACL 2024 proposes a shared task on Hate Speech and Stance Detection during Climate Activism. In our participation in the stance detection task, we tested different approaches using LLMs for this classification problem. We first tested a generative model using the classical seq2seq structure, and subsequently improved the results considerably by replacing the last layer of these LLMs with a classification layer. We also studied how performance is affected by the amount of training data, using a partition of the dataset and adding external data from other stance detection tasks.
HAMiSoN-Ensemble at ClimateActivism 2024: Ensemble of RoBERTa, Llama 2, and Multi-task for Stance Detection
Raquel Rodriguez-Garcia | Julio Reyes Montesinos | Jesus M. Fraile-Hernandez | Anselmo Peñas
Proceedings of the 7th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2024)
CASE @ EACL 2024 proposes a shared task on Stance and Hate Event Detection for Climate Activism discourse. For our participation in the stance detection task, we propose an ensemble of different approaches: a transformer-based model (RoBERTa), a generative Large Language Model (Llama 2), and a Multi-Task Learning model. Our main goal is twofold: to study the effect of augmenting the training data with external datasets, and to examine the contribution of several, diverse models through a voting ensemble. The results show that if we take the best configuration during training for each of the three models (RoBERTa, Llama 2 and MTL), the ensemble would have ranked first with the highest F1 on the leaderboard for the stance detection subtask.
2017
Proceedings of the Software Demonstrations of the 15th Conference of the European Chapter of the Association for Computational Linguistics
André Martins | Anselmo Peñas
Proceedings of the Software Demonstrations of the 15th Conference of the European Chapter of the Association for Computational Linguistics
2015
Unsupervised Learning of Coherent and General Semantic Classes for Entity Aggregates
Henry Anaya-Sánchez | Anselmo Peñas
Proceedings of the 11th International Conference on Computational Semantics
2014
“One Entity per Discourse” and “One Entity per Collocation” Improve Named-Entity Disambiguation
Ander Barrena | Eneko Agirre | Bernardo Cabaleiro | Anselmo Peñas | Aitor Soroa
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers
2012
Evaluating Machine Reading Systems through Comprehension Tests
Anselmo Peñas | Eduard Hovy | Pamela Forner | Álvaro Rodrigo | Richard Sutcliffe | Corina Forascu | Caroline Sporleder
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
This paper describes a methodology for testing and evaluating the performance of Machine Reading systems through Question Answering and Reading Comprehension Tests. The methodology is being used in QA4MRE (QA for Machine Reading Evaluation), one of the labs of CLEF. The task was to answer a series of multiple choice tests, each based on a single document. This allows complex questions to be asked but makes evaluation simple and completely automatic. The evaluation architecture is completely multilingual: test documents, questions, and their answers are identical in all the supported languages. Background text collections are comparable collections harvested from the web for a set of predefined topics. Each test received an evaluation score between 0 and 1 using c@1. This measure encourages systems to reduce the number of incorrect answers while maintaining the number of correct ones by leaving some questions unanswered. 12 groups participated in the task, submitting 62 runs in 3 different languages (German, English, and Romanian). All runs were monolingual; no team attempted a cross-language task. We report here the conclusions and lessons learned after the first campaign in 2011.
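The c@1 measure mentioned in the abstract has a simple closed form, detailed in "A Simple Measure to Assess Non-response" (also listed on this page): with n questions in total, n_R answered correctly and n_U left unanswered, c@1 = (n_R + n_U · n_R / n) / n, so unanswered questions are credited at the system's observed accuracy rather than counted as wrong. A minimal sketch in Python (function and variable names are illustrative):

```python
def c_at_1(n_correct: int, n_unanswered: int, n_total: int) -> float:
    """c@1: accuracy-like score in [0, 1] that credits each unanswered
    question at the system's observed accuracy on answered-or-not basis,
    instead of counting it as an error."""
    accuracy = n_correct / n_total
    return (n_correct + n_unanswered * accuracy) / n_total

# Leaving doubtful questions unanswered scores higher than answering
# them incorrectly, for the same number of correct answers:
print(c_at_1(6, 2, 10))  # 0.72 -> 6 correct, 2 unanswered, 2 wrong
print(c_at_1(6, 0, 10))  # 0.6  -> 6 correct, 4 wrong
```

With no unanswered questions, c@1 reduces to plain accuracy, which is why it could be used as a drop-in replacement in the campaign.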
Temporally Anchored Relation Extraction
Guillermo Garrido | Anselmo Peñas | Bernardo Cabaleiro | Álvaro Rodrigo
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
2011
A Simple Measure to Assess Non-response
Anselmo Peñas | Alvaro Rodrigo
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies
Unsupervised Discovery of Domain-Specific Knowledge from Text
Dirk Hovy | Chunliang Zhang | Eduard Hovy | Anselmo Peñas
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies
Detecting Compositionality Using Semantic Vector Space Models Based on Syntactic Context. Shared Task System Description
Guillermo Garrido | Anselmo Peñas
Proceedings of the Workshop on Distributional Semantics and Compositionality
2010
GikiCLEF: Crosscultural Issues in Multilingual Information Access
Diana Santos | Luís Miguel Cabral | Corina Forascu | Pamela Forner | Fredric Gey | Katrin Lamm | Thomas Mandl | Petya Osenova | Anselmo Peñas | Álvaro Rodrigo | Julia Schulz | Yvonne Skalban | Erik Tjong Kim Sang
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
In this paper we describe GikiCLEF, the first evaluation contest that, to our knowledge, was specifically designed to expose and investigate cultural and linguistic issues involved in structured multimedia collections and searching, and which was organized under the scope of CLEF 2009. GikiCLEF evaluated systems that answered questions that are hard for both humans and machines, over ten different Wikipedia collections, namely Bulgarian, Dutch, English, German, Italian, Norwegian (Bokmål and Nynorsk), Portuguese, Romanian, and Spanish. After a short historical introduction, we present the task, together with its motivation, and discuss how the topics were chosen. Then we provide another description from the point of view of the participants. Before disclosing their results, we introduce the SIGA management system, explaining the several tasks which were carried out behind the scenes. We then quantify the GIRA resource, offered to the community for training and further evaluating systems with the help of the 50 topics gathered and the solutions identified. We end the paper with a critical discussion of what was learned, advancing possible ways to reuse the data.
Evaluating Multilingual Question Answering Systems at CLEF
Pamela Forner | Danilo Giampiccolo | Bernardo Magnini | Anselmo Peñas | Álvaro Rodrigo | Richard Sutcliffe
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
The paper offers an overview of the key issues raised during the seven years of activity of the Multilingual Question Answering Track at the Cross Language Evaluation Forum (CLEF). The general aim of the Multilingual Question Answering Track has been to test both monolingual and cross-language Question Answering (QA) systems that process queries and documents in several European languages, also drawing attention to a number of challenging issues for research in multilingual QA. The paper gives a brief description of how the task has evolved over the years and of the way in which the data sets have been created, presenting also a brief summary of the different types of questions developed. The document collections adopted in the competitions are sketched as well, and some data about the participation are provided. Moreover, the main evaluation measures used to evaluate system performance are explained, and an overall analysis of the results achieved is presented.
Semantic Enrichment of Text with Background Knowledge
Anselmo Peñas | Eduard Hovy
Proceedings of the NAACL HLT 2010 First International Workshop on Formalisms and Methodology for Learning by Reading
2007
Experiments of UNED at the Third Recognising Textual Entailment Challenge
Álvaro Rodrigo | Anselmo Peñas | Jesús Herrera | Felisa Verdejo
Proceedings of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing
2006
The Multilingual Question Answering Track at CLEF
Bernardo Magnini | Danilo Giampiccolo | Lili Aunimo | Christelle Ayache | Petya Osenova | Anselmo Peñas | Maarten de Rijke | Bogdan Sacaleanu | Diana Santos | Richard Sutcliffe
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
This paper presents an overview of the Multilingual Question Answering evaluation campaigns which have been organized at CLEF (Cross Language Evaluation Forum) since 2003. Over the years, the competition has registered a steady increase in the number of participants and languages involved. In fact, from the original eight groups which participated in the 2003 QA track, the number of competitors rose to twenty-four in 2005. The performance of the systems has also steadily improved, and the average of the best performances in 2005 saw an increase of 10% with respect to the previous year.
2005
QARLA: A Framework for the Evaluation of Text Summarization Systems
Enrique Amigó | Julio Gonzalo | Anselmo Peñas | Felisa Verdejo
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)
Evaluating DUC 2004 Tasks with the QARLA Framework
Enrique Amigó | Julio Gonzalo | Anselmo Peñas | Felisa Verdejo
Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization
2004
Using syntactic information to extract relevant terms for multi-document summarization
Enrique Amigó | Julio Gonzalo | Víctor Peinado | Anselmo Peñas | Felisa Verdejo
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics
An Empirical Study of Information Synthesis Task
Enrique Amigo | Julio Gonzalo | Victor Peinado | Anselmo Peñas | Felisa Verdejo
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)
Word Sense Disambiguation based on term to term similarity in a context space
Javier Artiles | Anselmo Penas | Felisa Verdejo
Proceedings of SENSEVAL-3, the Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text
2000
Evaluating Wordnets in Cross-language Information Retrieval: the ITEM Search Engine
Felisa Verdejo | Julio Gonzalo | Anselmo Peñas | Fernando López | David Fernández
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)
Co-authors
- Felisa Verdejo 9
- Álvaro Rodrigo 8
- Julio Gonzalo 7
- Enrique Amigó 4
- Eduard Hovy 4
- Pamela Forner 3
- Jesus M. Fraile-Hernandez 3
- Richard Sutcliffe 3
- Bernardo Cabaleiro 2
- Corina Forăscu 2
- Guillermo Garrido 2
- Danilo Giampiccolo 2
- Bernardo Magnini 2
- Petya Osenova 2
- Víctor Peinado 2
- Diana Santos 2
- Eneko Agirre 1
- Henry Anaya-Sánchez 1
- Javier Artiles 1
- Lili Aunimo 1
- Christelle Ayache 1
- Ander Barrena 1
- Luís Miguel Cabral 1
- Mark Cieliebak 1
- Jan Milan Deriu 1
- David Fernández-Amorós 1
- Fredric Gey 1
- Patrick Giedemann 1
- Jesús Herrera 1
- Dirk Hovy 1
- Katrin Lamm 1
- Fernando López 1
- Thomas Mandl 1
- André F. T. Martins 1
- Sergio Moreno-Álvarez 1
- Julio Reyes Montesinos 1
- Raquel Rodriguez-Garcia 1
- Bogdan Sacaleanu 1
- Julia Maria Schulz 1
- Yvonne Skalban 1
- Aitor Soroa 1
- Caroline Sporleder 1
- Erik Tjong Kim Sang 1
- Pius von Däniken 1
- Chunliang Zhang 1
- Maarten de Rijke 1