Niccolo’ Gentile
Also published as: Niccolo' Gentile
2026
Do Large Language Models Grasp the Grammar? Evidence from Grammar-Book-Guided Probing in Luxembourgish
Lujun LI | Yewei Song | Lama Sleem | Yiqun Wang | Yangjie Xu | Cedric LOTHRITZ | Niccolo' Gentile | Radu State | Tegawendé F. Bissyandé | Jacques Klein
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Grammar refers to the system of rules that governs the structural organization and the semantic relations among linguistic units such as sentences, phrases, and words within a given language. In natural language processing, there remains a notable scarcity of grammar-focused evaluation protocols, a gap that is even more pronounced for low-resource languages. Moreover, the extent to which large language models genuinely comprehend grammatical structure, especially the mapping between syntactic structures and meanings, remains under debate. To investigate this issue, we propose a Grammar-Book-Guided evaluation pipeline, consisting of four key stages, intended to provide a systematic and generalizable framework for grammar evaluation; in this work, we take Luxembourgish as a case study. The results show a weak positive correlation between translation performance and grammatical understanding, indicating that strong translations do not necessarily imply deep grammatical competence. Larger models perform well overall due to their semantic strength but remain weak in morphology and syntax, struggling particularly with Minimal Pair tasks, while strong reasoning ability offers a promising way to enhance their grammatical understanding.
Are Small Language Models the Silver Bullet to Low-Resource Languages Machine Translation?
Yewei Song | Lujun Li | Cedric Lothritz | Saad Ezzini | Lama Sleem | Niccolo' Gentile | Radu State | Tegawendé F. Bissyandé | Jacques Klein
Proceedings of the Ninth Workshop on Technologies for Machine Translation of Low Resource Languages (LoResMT 2026)
Small language models (SLMs) offer computationally efficient alternatives to large language models, yet their translation quality for low-resource languages (LRLs) remains severely limited. This work presents the first large-scale evaluation of SLMs across 200 languages, revealing systematic underperformance in LRLs and identifying key sources of linguistic disparity. We show that knowledge distillation from strong teacher models using predominantly monolingual LRL data substantially boosts SLM translation quality—often enabling 2B–3B models to match or surpass systems up to 70B parameters. Our study highlights three core findings: (1) a comprehensive benchmark exposing the limitations of SLMs on 200 languages; (2) evidence that LRL-focused distillation improves translation without inducing catastrophic forgetting, with full-parameter fine-tuning and decoder-only teachers outperforming LoRA and encoder–decoder approaches; and (3) consistent cross-lingual gains demonstrating the scalability and robustness of the method. These results establish an effective, low-cost pathway for improving LRL translation and provide practical guidance for deploying SLMs in truly low-resource settings.
2025
Small Language Models in the Real World: Insights from Industrial Text Classification
Lujun Li | Lama Sleem | Niccolo’ Gentile | Geoffrey Nichil | Radu State
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track)
With the emergence of ChatGPT, Transformer models have significantly advanced text classification and related tasks. Decoder-only models such as Llama exhibit strong performance and flexibility, yet they suffer from inference inefficiency due to token-by-token generation, and their effectiveness in text classification tasks heavily depends on prompt quality. Moreover, their substantial GPU resource requirements often limit widespread adoption. Thus, whether smaller language models can effectively handle text classification tasks has emerged as a topic of significant interest. However, the selection of appropriate models and methodologies remains largely underexplored. In this paper, we conduct a comprehensive evaluation of prompt engineering and supervised fine-tuning methods for transformer-based text classification. Specifically, we focus on practical industrial scenarios, including email classification, legal document categorization, and the classification of extremely long academic texts. We examine the strengths and limitations of smaller models, with particular attention to both their performance and their efficiency in Video Random-Access Memory (VRAM) utilization, thereby providing valuable insights for the local deployment and application of compact models in industrial settings.