Salah Ait-Mokhtar
Also published as: Salah Aït-Mokhtar, Salah Ait Mokhtar
2026
StarDrinks: An English and Korean Test Set for SLU Evaluation in a Drink Ordering Scenario
Marcely Zanon Boito | Caroline Brun | Inyoung Kim | Denys M. PROUX | Salah Ait-Mokhtar | Nikolaos Lagos | Jean-Luc Meunier | Ioan Calapodescu
Proceedings of the Fifteenth Language Resources and Evaluation Conference
LLMs and speech assistants are increasingly used for task-oriented interactions, yet their evaluation often relies on controlled scenarios that fail to capture the variability and complexity of real user requests. Drink ordering, for example, involves diverse named entities, drink types, sizes, customizations, and brand-specific terminology, as well as spontaneous speech phenomena such as hesitations and self-corrections. To address this gap, we introduce StarDrinks, a test set in English and Korean containing speech utterance features, transcriptions, and annotated slots. Our dataset supports speech-to-slots SLU, transcription-to-slots NLU, and speech-to-transcription ASR evaluation, providing a realistic benchmark for model robustness and generalization in a linguistically rich, real-world task.
2022
Zero-Shot Aspect-Based Scientific Document Summarization using Self-Supervised Pre-training
Amir Soleimani | Vassilina Nikoulina | Benoit Favre | Salah Ait Mokhtar
Proceedings of the 21st Workshop on Biomedical Language Processing
We study the zero-shot setting for the aspect-based scientific document summarization task. Summarizing scientific documents with respect to an aspect can remarkably improve document assistance systems and readers' experience. However, existing large-scale datasets contain a limited variety of aspects, causing summarization models to over-fit to a small set of aspects and a specific domain. We establish baseline results in zero-shot performance (over unseen aspects and the presence of domain shift), paraphrasing, leave-one-out, and limited supervised samples experimental setups. We propose a self-supervised pre-training approach to enhance the zero-shot performance. We leverage the PubMed structured abstracts to create a biomedical aspect-based summarization dataset. Experimental results on the PubMed and FacetSum aspect-based datasets show promising performance when the model is pre-trained using unlabelled in-domain data.
2021
Semantic Context Path Labeling for Semantic Exploration of User Reviews
Salah Aït-Mokhtar | Caroline Brun | Yves Hoppenot | Agnes Sandor
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
In this paper we present a prototype demonstrator showcasing a novel method to perform semantic exploration of user reviews. The system enables effective navigation in a rich contextual semantic schema with a large number of structured classes indicating relevant information. In order to identify instances of the structured classes in the reviews, we defined a new Information Extraction task called Semantic Context Path (SCP) labeling, which simultaneously assigns types and semantic roles to entity mentions. Reviews can rapidly be explored based on the fine-grained and structured semantic classes. As a proof-of-concept, we have implemented this system for reviews on Points-of-Interest, in English and Korean.
2019
“Sentiment Aware Map” : exploration cartographique de points d’intérêt via l’analyse de sentiments au niveau des aspects
Ioan Calapodescu | Caroline Brun | Vassilina Nikoulina | Salah Aït-Mokhtar
Actes de la Conférence sur le Traitement Automatique des Langues Naturelles (TALN) PFIA 2019. Volume IV : Démonstrations
2013
A Framework to Generate Sets of Terms from Large Scale Medical Vocabularies for Natural Language Processing
Salah Aït-Mokhtar | Caroline Hagège | Pajolma Rupi
Proceedings of the IWCS 2013 Workshop on Computational Semantics in Clinical Text (CSCT 2013)
2001
A Multi-Input Dependency Parser
Salah Aït-Mokhtar | Jean-Pierre Chanod | Claude Roux
Proceedings of the Seventh International Workshop on Parsing Technologies