Koldo Gojenola
Also published as: Koldo Gojenola Galletebeitia, K. Gojenola, Koldobika Gojenola
2024
A Virtual Patient Dialogue System Based on Question-Answering on Clinical Records
Janire Arana | Mikel Idoyaga | Maitane Urruela | Elisa Espina | Aitziber Atutxa Salazar | Koldo Gojenola
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
In this work we present two datasets for the development of virtual patients and the first evaluation results. We first introduce a Spanish corpus of medical dialogue questions annotated with intents, built upon prior research in French. We also propose a second dataset of dialogues using a novel annotation approach that involves doctor questions, patient answers, and corresponding clinical records, organized as triples of the form (clinical report, question, patient answer). This way, the doctor-patient conversation is modeled as a question-answering system that tries to find responses to questions taking a clinical record as input. This approach can help to eliminate the need for manually structured patient records, as commonly used in previous studies, thereby expanding the pool of diverse virtual patients available. Leveraging these annotated corpora, we develop and assess an automatic system designed to answer medical dialogue questions posed by medical students to simulated patients in medical exams. Our approach demonstrates robust generalization, relying solely on medical records to generate new patient cases. The two datasets and the code will be freely available to the research community.
2019
IxaMed at PharmacoNER Challenge 2019
Xabier Lahuerta | Iakes Goenaga | Koldo Gojenola | Aitziber Atutxa Salazar | Maite Oronoz
Proceedings of the 5th Workshop on BioNLP Open Shared Tasks
The aim of this paper is to present our approach (IxaMed) in the PharmacoNER 2019 task. The task consists of identifying chemical, drug, and gene/protein mentions from clinical case studies written in Spanish. The evaluation of the task is divided into two scenarios: one corresponding to the detection of named entities and one corresponding to the indexation of named entities that have been previously identified. In order to identify named entities we have made use of a Bi-LSTM with a CRF on top in combination with different types of word embeddings. We have achieved our best result (86.81 F-Score) combining pretrained word embeddings of Wikipedia and Electronic Health Records (50M words) with contextual string embeddings of Wikipedia and Electronic Health Records. On the other hand, for the indexation of the named entities we have used the Levenshtein distance, obtaining an 85.34 F-Score as our best result.
Towards discourse annotation and sentiment analysis of the Basque Opinion Corpus
Jon Alkorta | Koldo Gojenola | Mikel Iruskieta
Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019
Discourse information is crucial for a better understanding of the text structure, and it is also necessary to describe which part of an opinionated text is more relevant or to decide how a text span can change the polarity (strengthen or weaken) of another span by means of coherence relations. This work presents the first results on the annotation of the Basque Opinion Corpus using Rhetorical Structure Theory (RST). Our evaluation results and analysis show us the main avenues for improving a future annotation process. We have also extracted the subjectivity of several rhetorical relations, and the results show the effect of sentiment words in relations and the influence of each relation on the semantic orientation value.
2018
Saying no but meaning yes: negation and sentiment analysis in Basque
Jon Alkorta | Koldo Gojenola | Mikel Iruskieta
Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis
In this work, we have analyzed the effects of negation on the semantic orientation in Basque. The analysis shows that negation markers can strengthen, weaken or have no effect on the sentiment orientation of a word or a group of words. Using the Constraint Grammar formalism, we have designed and evaluated a set of linguistic rules to formalize these three phenomena. The results show that two phenomena, strengthening and no change, have been identified accurately and the third one, weakening, with acceptable results.
2017
Using lexical level information in discourse structures for Basque sentiment analysis
Jon Alkorta | Koldo Gojenola | Mikel Iruskieta | Maite Taboada
Proceedings of the 6th Workshop on Recent Advances in RST and Related Formalisms
2016
The impact of simple feature engineering in multilingual medical NER
Rebecka Weegar | Arantza Casillas | Arantza Diaz de Ilarraza | Maite Oronoz | Alicia Pérez | Koldo Gojenola
Proceedings of the Clinical Natural Language Processing Workshop (ClinicalNLP)
The goal of this paper is to examine the impact of simple feature engineering mechanisms before applying more sophisticated techniques to the task of medical NER. Sometimes papers using scientifically sound techniques present raw baselines that could be improved by adding simple and cheap features. This work focuses on entity recognition for the clinical domain for three languages: English, Swedish and Spanish. The task is tackled using simple features, starting from the window size, capitalization, prefixes, and moving to POS and semantic tags. This work demonstrates that a simple initial step of feature engineering can improve the baseline results significantly. Hence, the contributions of this paper are: first, a short list of guidelines well supported with experimental results on three languages and, second, a detailed description of the relevance of these features for medical NER.
Fully unsupervised low-dimensional representation of adverse drug reaction events through distributional semantics
Alicia Pérez | Arantza Casillas | Koldo Gojenola
Proceedings of the Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining (BioTxtM2016)
Electronic health records show great variability since the same concept is often expressed with different terms, either scientific Latin forms, common or lay variants and even vernacular naming. Deep learning enables distributional representation of terms in a vector space, and therefore, related terms tend to be close in the vector space. Accordingly, embedding words through these vectors opens the way towards accounting for semantic relatedness through classical algebraic operations. In this work we propose a simple though efficient unsupervised characterization of Adverse Drug Reactions (ADRs). This approach exploits the embedding representation of the terms involved in candidate ADR events, that is, drug-disease entity pairs. In brief, the ADRs are represented as vectors that link the drug with the disease in their context through a recursive additive model. We discovered that a low-dimensional representation that makes use of the modulus and argument of the embedded representation of the ADR event shows correlation with the manually annotated class. Thus, it can be derived that this characterization is beneficial for further classification tasks as predictive features.
2014
On WordNet Semantic Classes and Dependency Parsing
Kepa Bengoetxea | Eneko Agirre | Joakim Nivre | Yue Zhang | Koldo Gojenola
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
IxaMed: Applying Freeling and a Perceptron Sequential Tagger at the Shared Task on Analyzing Clinical Texts
Koldo Gojenola | Maite Oronoz | Alicia Pérez | Arantza Casillas
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)
Adverse Drug Event prediction combining shallow analysis and machine learning
Sara Santiso | Arantza Casillas | Alicia Pérez | Maite Oronoz | Koldo Gojenola
Proceedings of the 5th International Workshop on Health Text Mining and Information Analysis (Louhi)
2013
Exploiting the Contribution of Morphological Information to Parsing: the BASQUE TEAM system in the SPMRL’2013 Shared Task
Iakes Goenaga | Koldo Gojenola | Nerea Ezeiza
Proceedings of the Fourth Workshop on Statistical Parsing of Morphologically-Rich Languages
Overview of the SPMRL 2013 Shared Task: A Cross-Framework Evaluation of Parsing Morphologically Rich Languages
Djamé Seddah | Reut Tsarfaty | Sandra Kübler | Marie Candito | Jinho D. Choi | Richárd Farkas | Jennifer Foster | Iakes Goenaga | Koldo Gojenola Galletebeitia | Yoav Goldberg | Spence Green | Nizar Habash | Marco Kuhlmann | Wolfgang Maier | Joakim Nivre | Adam Przepiórkowski | Ryan Roth | Wolfgang Seeker | Yannick Versley | Veronika Vincze | Marcin Woliński | Alina Wróblewska | Eric Villemonte de la Clergerie
Proceedings of the Fourth Workshop on Statistical Parsing of Morphologically-Rich Languages
2012
Combining Rule-Based and Statistical Syntactic Analyzers
Iakes Goenaga | Koldobika Gojenola | María Jesús Aranzabe | Arantza Díaz de Ilarraza | Kepa Bengoetxea
Proceedings of the ACL 2012 Joint Workshop on Statistical Parsing and Semantic Processing of Morphologically Rich Languages
First Approaches on Spanish Medical Record Classification Using Diagnostic Term to Class Transduction
A. Casillas | A. Díaz de Ilarraza | K. Gojenola | M. Oronoz | Alicia Pérez
Proceedings of the 10th International Workshop on Finite State Methods and Natural Language Processing
2011
Improving Dependency Parsing with Semantic Classes
Eneko Agirre | Kepa Bengoetxea | Koldo Gojenola | Joakim Nivre
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies
Using Kybots for Extracting Events in Biomedical Texts
Arantza Casillas | Arantza Díaz de Ilarraza | Koldo Gojenola | Maite Oronoz | German Rigau
Proceedings of BioNLP Shared Task 2011 Workshop
Testing the Effect of Morphological Disambiguation in Dependency Parsing of Basque
Kepa Bengoetxea | Arantza Casillas | Koldo Gojenola
Proceedings of the Second Workshop on Statistical Parsing of Morphologically Rich Languages
2010
Application of Different Techniques to Dependency Parsing of Basque
Kepa Bengoetxea | Koldo Gojenola
Proceedings of the NAACL HLT 2010 First Workshop on Statistical Parsing of Morphologically-Rich Languages
2009
Exploring Treebank Transformations in Dependency Parsing
Kepa Bengoetxea | Koldo Gojenola
Proceedings of the International Conference RANLP-2009
Evaluating the Impact of Morphosyntactic Ambiguity in Grammatical Error Detection
Arantza Díaz de Ilarraza | Koldo Gojenola | Maite Oronoz
Proceedings of the International Conference RANLP-2009
Application of feature propagation to dependency parsing
Kepa Bengoetxea | Koldo Gojenola
Proceedings of the 11th International Conference on Parsing Technologies (IWPT’09)
2008
Detecting Erroneous Uses of Complex Postpositions in an Agglutinative Language
Arantza Díaz de Ilarraza | Koldo Gojenola | Maite Oronoz
Coling 2008: Companion volume: Posters
2004
Exploring Portability of Syntactic Information from English to Basque
Eneko Agirre | Aitziber Atutxa | Koldo Gojenola | Kepa Sarasola
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)
Representation and Treatment of Multiword Expressions in Basque
Iñaki Alegria | Olatz Ansa | Xabier Artola | Nerea Ezeiza | Koldo Gojenola | Ruben Urizar
Proceedings of the Workshop on Multiword Expressions: Integrating Processing
2002
A Class Library for the Integration of NLP Tools: Definition and implementation of an Abstract Data Type Collection for the manipulation of SGML documents in a context of stand-off linguistic annotation
X. Artola | A. Díaz de Ilarraza | N. Ezeiza | K. Gojenola | G. Hernández | A. Soroa
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)
Learning Argument/Adjunct Distinction for Basque
Izaskun Aldezabal | Maxux Aranzabe | Koldo Gojenola | Kepa Sarasola | Aitziber Atutxa
Proceedings of the ACL-02 Workshop on Unsupervised Lexical Acquisition
2000
A Bootstrapping Approach to Parser Development
Izaskun Aldezabal | Koldo Gojenola | Kepa Sarasola
Proceedings of the Sixth International Workshop on Parsing Technologies
This paper presents a robust parsing system for unrestricted Basque texts. It analyzes a sentence in two stages: a unification-based parser builds basic syntactic units such as NPs, PPs, and sentential complements, while a finite-state parser performs syntactic disambiguation and filtering of the results. The system has been applied to the acquisition of verbal subcategorization information, obtaining 66% recall and 87% precision in the determination of verb subcategorization instances. This information will later be incorporated into the parser in order to improve its performance.
Corpus-Based Syntactic Error Detection Using Syntactic Patterns
Koldo Gojenola | Maite Oronoz
Proceedings of the ANLP-NAACL 2000 Student Research Workshop
A word-grammar based morphological analyzer for agglutinative languages
I. Aduriz | E. Agirre | I. Aldezabal | I. Alegria | X. Arregi | J. M. Arriola | X. Artola | K. Gojenola | A. Maritxalar | K. Sarasola | M. Urkia
COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics
A Word-level Morphosyntactic Analyzer for Basque
I. Aduriz | E. Agirre | I. Aldezabal | X. Arregi | J. M. Arriola | X. Artola | K. Gojenola | A. Maritxalar | K. Sarasola | M. Urkia
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)
A Proposal for the Integration of NLP Tools using SGML-Tagged Documents
X. Artola | A. Díaz de Ilarraza | N. Ezeiza | K. Gojenola | A. Maritxalar | A. Soroa
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)
1998
Co-authors
- Maite Oronoz 9
- Arantza Díaz de Ilarraza 8
- Eneko Agirre 7
- Kepa Bengoetxea 7
- Arantza Casillas 7
- Kepa Sarasola 7
- Xabier Artola 5
- Alicia Pérez 5
- Izaskun Aldezabal 4
- Nerea Ezeiza 4
- Iakes Goenaga 4
- Jon Alkorta 3
- Mikel Iruskieta 3
- Alberto Maritxalar 3
- Joakim Nivre 3
- Itziar Aduriz 2
- Iñaki Alegría 2
- Xabier Arregi 2
- Jose Mari Arriola 2
- Aitziber Atutxa 2
- Aitziber Atutxa Salazar 2
- Aitor Soroa 2
- Miriam Urkia 2
- Atro Voutilainen 2
- Olatz Ansa 1
- Janire Arana 1
- Maxux Aranzabe 1
- María Jesús Aranzabe 1
- Marie Candito 1
- Jinho D. Choi 1
- Elisa Espina 1
- Richárd Farkas 1
- Jennifer Foster 1
- Yoav Goldberg 1
- Spence Green 1
- Nizar Habash 1
- Gregorio Hernández 1
- Mikel Idoyaga 1
- Marco Kuhlmann 1
- Sandra Kübler 1
- Xabier Lahuerta 1
- Wolfgang Maier 1
- Adam Przepiórkowski 1
- German Rigau 1
- Ryan Roth 1
- Sara Santiso 1
- Djamé Seddah 1
- Wolfgang Seeker 1
- Maite Taboada 1
- Reut Tsarfaty 1
- Ruben Urizar 1
- Maitane Urruela 1
- Yannick Versley 1
- Éric Villemonte de la Clergerie 1
- Veronika Vincze 1
- Rebecka Weegar 1
- Marcin Woliński 1
- Alina Wróblewska 1
- Yue Zhang 1