Patrick Ruch


Contextualized French Language Models for Biomedical Named Entity Recognition
Jenny Copara | Julien Knafou | Nona Naderi | Claudia Moro | Patrick Ruch | Douglas Teodoro
Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Atelier DÉfi Fouille de Textes

Named entity recognition (NER) is key for biomedical applications as it allows knowledge discovery in free text data. As entities are semantic phrases, their meaning is conditioned on the context to avoid ambiguity. In this work, we explore contextualized language models for NER in French biomedical text as part of the Défi Fouille de Textes challenge. Our best approach achieved an F1-measure of 66% for the symptoms and signs, and pathology categories, ranking first in subtask 1. For the anatomy, dose, exam, mode, moment, substance, treatment, and value categories, it achieved an F1-measure of 75% (subtask 2). When all categories are considered, our model achieved the best result in the challenge, with an F1-measure of 72%. The use of an ensemble of neural language models proved to be very effective, improving a CRF baseline by up to 28% and a single specialised language model by 4%.

BiTeM at WNUT 2020 Shared Task-1: Named Entity Recognition over Wet Lab Protocols using an Ensemble of Contextual Language Models
Julien Knafou | Nona Naderi | Jenny Copara | Douglas Teodoro | Patrick Ruch
Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)

Recent improvements in machine-reading technologies have attracted much attention to automation problems and their possibilities. In this context, WNUT 2020 introduces a Named Entity Recognition (NER) task based on wet laboratory procedures. In this paper, we present a 3-step method based on deep neural language models that achieved the best overall exact match F1-score (77.99%) of the competition. By fine-tuning each of 10 different pretrained language models 10 times, this work shows the advantage of having more models in an ensemble based on a majority-vote strategy. On top of that, having 100 different models allowed us to analyse ensemble combinations, which demonstrated the impact of having multiple pretrained models versus fine-tuning a pretrained model multiple times.


Argumentative Feedback: A Linguistically-Motivated Term Expansion for Information Retrieval
Patrick Ruch | Imad Tbahriti | Julien Gobeill | Alan R. Aronson
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions


An Argumentative Annotation Schema for Meeting Discussions
Vincenzo Pallotta | Hatem Ghorbel | Patrick Ruch | Giovanni Coray
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

Query Translation by Text Categorization
Patrick Ruch
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

Using Argumentation to Retrieve Articles with Similar Citations from MEDLINE
Imad Tbahriti | Christine Chichester | Frédérique Lisacek | Patrick Ruch
Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications (NLPBA/BioNLP)


Using Contextual Spelling Correction to Improve Retrieval Effectiveness in Degraded Text Collections
Patrick Ruch
COLING 2002: The 19th International Conference on Computational Linguistics


Minimal Commitment and Full Lexical Disambiguation: Balancing Rules and Hidden Markov Models
Patrick Ruch | Robert Baud | Pierrette Bouillon | Gilbert Robert
Fourth Conference on Computational Natural Language Learning and the Second Learning Language in Logic Workshop

Comparing corpora and lexical ambiguity
Patrick Ruch | Arnaud Gaudinat
The Workshop on Comparing Corpora