2024
ELiRF-VRAIN at BioLaySumm: Boosting Lay Summarization Systems Performance with Ranking Models
Vicent Ahuir | Diego Torres | Encarna Segarra | Lluís-F. Hurtado
Proceedings of the 23rd Workshop on Biomedical Natural Language Processing
This paper presents our contribution to the BioLaySumm 2024 shared task of the 23rd BioNLP Workshop. The task is to create a lay summary, given a biomedical research article and its technical summary. As the input to the system could be large, a Longformer Encoder-Decoder (LED) has been used. We continuously pre-trained a general domain LED model with biomedical data to adapt it to this specific domain. In the pre-training phase, several pre-training tasks were aggregated to inject linguistic knowledge and increase the abstractivity of the generated summaries. Since the distribution of samples between the two datasets, eLife and PLOS, is unbalanced, we fine-tuned two models: one for eLife and another for PLOS. To increase the quality of the lay summaries of the system, we developed a regression model that helps us rank the summaries generated by the summarization models. This regression model predicts the quality of the summary in three different aspects: Relevance, Readability, and Factuality. We present the results of our models and a study to measure the ranking capabilities of the regression model.
2023
ELiRF-VRAIN at BioNLP Task 1B: Radiology Report Summarization
Vicent Ahuir Esteve | Encarna Segarra | Lluis Hurtado
The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks
This paper presents our system for the Radiology Report Summarization Shared Task-1B of the 22nd BioNLP Workshop 2023. Inspired by the work of the BioBART model, we continuously pre-trained a general domain BART model with biomedical data to adapt it to this specific domain. In the pre-training phase, several pre-training tasks are aggregated to inject linguistic knowledge and increase the abstractivity of the generated summaries. We present the results of our models and an additional study of the lengths of the generated summaries.
2018
ELiRF-UPV at SemEval-2018 Task 10: Capturing Discriminative Attributes with Knowledge Graphs and Wikipedia
José-Ángel González | Lluís-F. Hurtado | Encarna Segarra | Ferran Pla
Proceedings of the 12th International Workshop on Semantic Evaluation
This paper describes the participation of the ELiRF-UPV team in Task 10, Capturing Discriminative Attributes, of SemEval-2018. Our best approach uses ConceptNet, Wikipedia, and NumberBatch embeddings to establish relationships between concepts and attributes. This system achieves competitive results in the official evaluation.
ELiRF-UPV at SemEval-2018 Task 11: Machine Comprehension using Commonsense Knowledge
José-Ángel González | Lluís-F. Hurtado | Encarna Segarra | Ferran Pla
Proceedings of the 12th International Workshop on Semantic Evaluation
This paper describes the participation of the ELiRF-UPV team in Task 11, Machine Comprehension using Commonsense Knowledge, of SemEval-2018. Our approach is based on the use of word embeddings, NumberBatch embeddings, and a deep learning architecture to find the best answer to the multiple-choice questions based on the narrative text. The results obtained are in line with those obtained by the other participants, and they encourage us to continue working on this problem.
2017
ELiRF-UPV at SemEval-2017 Task 7: Pun Detection and Interpretation
Lluís-F. Hurtado | Encarna Segarra | Ferran Pla | Pascual Carrasco | José-Ángel González
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)
This paper describes the participation of the ELiRF-UPV team in Task 7 (subtask 2: homographic pun detection and subtask 3: homographic pun interpretation) of SemEval-2017. Our approach is based on the use of word embeddings to find related words in a sentence and a version of the Lesk algorithm to establish relationships between synsets. The results obtained are in line with those obtained by the other participants, and they encourage us to continue working on this problem.
2012
The acquisition and dialog act labeling of the EDECAN-SPORTS corpus
Lluís-F. Hurtado | Fernando García | Emilio Sanchis | Encarna Segarra
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
In this paper, we present the acquisition and labeling processes of the EDECAN-SPORTS corpus, a corpus acquired in Spanish and Catalan and oriented to the development of multimodal dialog systems. Two Wizards of Oz were used in order to better simulate the behavior of an actual system in terms of both the information used by the different modules and the communication mechanisms between these modules. User and system dialog-act labeling, as well as other information, have been obtained automatically using this acquisition method. Some preliminary experimental results with the acquired corpus show the appropriateness of the proposed acquisition method for the development of dialog systems.
2008
Acquisition and Evaluation of a Dialog Corpus through WOz and Dialog Simulation Techniques
David Griol | Lluís F. Hurtado | Encarna Segarra | Emilio Sanchis
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
In this paper, we present a comparison between two corpora acquired by means of two different techniques. The first corpus was acquired by means of the Wizard of Oz technique. A dialog simulation technique has been developed for the acquisition of the second corpus. A random selection of the user and system turns has been used, defining stop conditions for automatically deciding if the simulated dialog is successful or not. We use several evaluation measures proposed in previous research to compare the two acquired corpora, and then discuss their similarities and differences with regard to these measures.
2007
Acquiring and Evaluating a Dialog Corpus through a Dialog Simulation Technique
David Griol | Lluis F. Hurtado | Emilio Sanchis | Encarna Segarra
Proceedings of the 8th SIGdial Workshop on Discourse and Dialogue
2004
WSD system based on specialized Hidden Markov Model (upv-shmm-eaw)
Antonio Molina | Ferran Pla | Encarna Segarra
Proceedings of SENSEVAL-3, the Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text
2002
Word Sense Disambiguation using Statistical Models and WordNet
Antonio Molina | Ferran Pla | Encarna Segarra | Lidia Moreno
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)