Claire Ibarboure
2026
PRIVaThe: An Annotated Dataset of Multi-Objectives Web Search Sessions
Claire Ibarboure | Ludovic Tanguy | Franck Amadieu | Josiane Mothe
Proceedings of the Fifteenth Language Resources and Evaluation Conference
This paper presents PRIVaThe, a new French-language dataset designed to enable cross-user comparison of session-level search strategies. It consists of 200 web search sessions from 100 participants, each performing two multi-objective, multi-hop tasks. Unlike existing datasets that capture only query sequences or final answers, PRIVaThe provides explicit sub-objective decomposition traces for each session. We automatically annotate 3,162 queries with the sub-objective(s) they address, using open-weight LLMs (Mistral, Llama 3, and Gemma) validated against human gold annotations. This annotation enables systematic analyses of how users distribute and sequence sub-objectives throughout their sessions, revealing distinct search strategies such as logical, global, and exploratory approaches.
2020
LITL at SMM4H: An Old-school Feature-based Classifier for Identifying Adverse Effects in Tweets
Ludovic Tanguy | Lydia-Mai Ho-Dac | Cécile Fabre | Roxane Bois | Touati Mohamed Yacine Haddad | Claire Ibarboure | Marie Joyau | François Le moal | Jade Moiilic | Laura Roudaut | Mathilde Simounet | Irena Stankovic | Mickaela Vandewaetere
Proceedings of the Fifth Social Media Mining for Health Applications Workshop & Shared Task
This paper describes our participation in SMM4H shared task 2. We designed a rule-based classifier that estimates whether a tweet mentions an adverse effect associated with a medication. Our system addresses English and French and is based on a number of specific word lists and features. These cues were mostly obtained through an extensive corpus analysis of the provided training data. Different weighting schemes were tested (manually tuned or based on a logistic regression), the best one achieving an F1 score of 0.31 for English and 0.15 for French.