Yoan Gutierrez
Papers on this page may belong to the following people: Yoan Gutiérrez, Yoan Gutierrez
2023
A Review in Knowledge Extraction from Knowledge Bases
Fabio Yanez | Andrés Montoyo | Yoan Gutierrez | Rafael Muñoz | Armando Suarez
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing
Generative language models achieve the state of the art in many tasks within natural language processing (NLP). Although these models correctly capture syntactic information, they fail to interpret knowledge (semantics). Moreover, the lack of interpretability of these models promotes the use of other technologies as a replacement or complement to generative language models. This is the case with research focused on incorporating knowledge by resorting to knowledge bases mainly in the form of graphs. The generation of large knowledge graphs is carried out with unsupervised or semi-supervised techniques, which promotes the validation of this knowledge with the same type of techniques due to the size of the generated databases. In this review, we will explain the different techniques used to test and infer knowledge from graph structures with machine learning algorithms. The motivation of validating and inferring knowledge is to use correct knowledge in subsequent tasks with improved embeddings.
2021
Knowledge Discovery in COVID-19 Research Literature
Ernesto L. Estevanell-Valladares | Suilan Estevez-Velarde | Alejandro Piad-Morffis | Yoan Gutierrez | Andres Montoyo | Rafael Muñoz | Yudivián Almeida Cruz
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
This paper presents the preliminary results of an ongoing project that analyzes the growing body of scientific research published around the COVID-19 pandemic. In this research, a general-purpose semantic model is used to double annotate a batch of 500 sentences that were manually selected from the CORD-19 corpus. Afterwards, a baseline text-mining pipeline is designed and evaluated via a large batch of 100,959 sentences. We present a qualitative analysis of the most interesting facts automatically extracted and highlight possible future lines of development. The preliminary results show that general-purpose semantic models are a useful tool for discovering fine-grained knowledge in large corpora of scientific documents.