Luca Mouchel
2026
Apertus: Democratizing Open and Compliant LLMs for Global Language Environments
Alejandro Hernández-Cano | Alexander Hägele | Allen Hao Huang | Angelika Romanou | Antoni-Joan Solergibert | Barna Pásztor | Bettina Messmer | Dhia Garbaya | Eduard Frank Ďurech | Ido Hakimi | Juan Garcia Giraldo | Mete Ismayilzada | Negar Foroutan | Skander Moalla | Tiancheng Chen | Vinko Sabolčec | Yixuan Xu | Michael Aerni | Badr AlKhamissi | Inés Altemir Marinas | Mohammad Hossein Amani | Matin Ansaripour | Ilia Badanin | Harold Benoit | Emanuela Boros | Nicholas John Browning | Fabian Bösch | Maximilian Böther | Niklas Canova | Camille Challier | Clément Charmillot | Jonathan Coles | Jan Milan Deriu | Arnout Devos | Lukas Drescher | Daniil Dzenhaliou | Maud Ehrmann | Dongyang Fan | Simin Fan | Silin Gao | Miguel Gila | María Grandury | Diba Hashemi | Alexander Miserlis Hoyle | Jiaming Jiang | Mark Klein | Andrei Kucharavy | Anastasiia Kucherenko | Frederike Lübeck | Roman Machacek | Theofilos Ioannis Manitaras | Andreas Marfurt | Kyle Matoba | Simon Matrenok | Henrique Mendonça | Fawzi Roberto Mohamed | Syrielle Montariol | Luca Mouchel | Sven Najem-Meyer | Jingwei Ni | Gennaro Oliva | Matteo Pagliardini | Elia Palme | Andrei Panferov | Léo Paoletti | Marco Passerini | Ivan Pavlov | Auguste Poiroux | Kaustubh Ponkshe | Nathan Ranchin | Javier Rando | Mathieu Sauser | Jakhongir Saydaliev | Mukhammadali Sayfiddinov | Marian Schneider | Stefano Schuppli | Marco Scialanga | Andrei Semenov | Kumar Shridhar | Raghav Singhal | Anna Sotnikova | Alexander Sternfeld | Ayush Kumar Tarun | Paul Teiletche | Jannis Vamvas | Xiaozhe Yao | Hao Zhao | Alexander Ilic | Ana Klimovic | Andreas Krause | Caglar Gulcehre | David Rosenthal | Elliott Ash | Florian Tramèr | Joost VandeVondele | Livio Veraldi | Martin Rajman | Thomas C. Schulthess | Torsten Hoefler | Antoine Bosselut | Martin Jaggi | Imanol Schlag
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Open LLMs enable AI practitioners to control development costs by building on an existing foundation for downstream applications. While offering substantial promise, current models often fail to meet the needs of users who require open solutions aligned with responsible AI principles, including data compliance, transparency, and inclusivity. In this work, we present Apertus, a fully open suite of large language models (LLMs) designed to address responsibility shortcomings in today’s open model ecosystem, namely data responsibility and global representation. Unlike many prior models that release weights without reproducible data pipelines or regard for content-owner rights, Apertus models are pretrained exclusively on openly available data, retroactively respecting robots.txt exclusions and filtering for non-permissive, toxic, and personally identifiable content. To mitigate risks of data memorization, we also adopt the Goldfish objective during pretraining, strongly suppressing verbatim recall of data while retaining downstream task performance. Apertus also drastically expands multilingual coverage, training on 15T tokens from over 1800 languages, with about 40% of pretraining data allocated to non-English content. Released at 8B and 70B scales, Apertus approaches state-of-the-art results among fully open models on multilingual benchmarks, rivaling or surpassing open-weight counterparts.
2025
A Logical Fallacy-Informed Framework for Argument Generation
Luca Mouchel | Debjit Paul | Shaobo Cui | Robert West | Antoine Bosselut | Boi Faltings
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Despite the remarkable performance of large language models (LLMs), they still struggle with generating logically sound arguments, resulting in potential risks such as spreading misinformation. An important factor contributing to LLMs’ suboptimal performance in generating coherent arguments is their oversight of logical fallacies. To address this issue, we introduce fallacy-informed preference optimization (FIPO) that helps steer LLMs toward generating logically sound arguments. FIPO includes a classification loss to capture the fine-grained information on fallacy types. Our results on argument generation tasks show that FIPO reduces the fallacy errors by up to 17.5%. Furthermore, our human evaluation results reveal that the quality of the arguments generated by our method significantly outperforms the fine-tuned baselines and other preference optimization methods, such as DPO. These findings highlight the importance of ensuring models are aware of logical fallacies for effective argument generation.
Uncertainty in Causality: A New Frontier
Shaobo Cui | Luca Mouchel | Boi Faltings
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Understanding uncertainty in causality is vital in various domains, including core NLP tasks like event causality extraction, commonsense reasoning, and counterfactual text generation. However, existing literature lacks a comprehensive examination of this area. This survey aims to fill this gap by thoroughly reviewing uncertainty in causality. We first introduce a novel trichotomy, categorizing causal uncertainty into aleatoric (inherent randomness in causal data), epistemic (causal model limitations), and ontological (existence of causal links) uncertainty. We then survey methods for quantifying uncertainty in causal analysis and highlight the complementary relationship between causal uncertainty and causal strength. Furthermore, we examine the challenges that large language models (LLMs) face in handling causal uncertainty, such as hallucinations and inconsistencies, and propose key traits for an optimal causal LLM. Our paper reviews current approaches and outlines future research directions, aiming to serve as a practical guide for researchers and practitioners in this emerging field.
Co-authors
- Antoine Bosselut 2
- Shaobo Cui 2
- Boi Faltings 2
- Michael Aerni 1
- Badr AlKhamissi 1
- Mohammad Hossein Amani 1
- Matin Ansaripour 1
- Elliott Ash 1
- Fabian Bösch 1
- Maximilian Böther 1
- Ilia Badanin 1
- Harold Benoit 1
- Emanuela Boroş 1
- Nicholas John Browning 1
- Niklas Canova 1
- Camille Challier 1
- Clément Charmillot 1
- Tiancheng Chen 1
- Jonathan Coles 1
- Jan Milan Deriu 1
- Arnout Devos 1
- Lukas Drescher 1
- Daniil Dzenhaliou 1
- Maud Ehrmann 1
- Dongyang Fan 1
- Simin Fan 1
- Negar Foroutan 1
- Silin Gao 1
- Dhia Garbaya 1
- Miguel Gila 1
- Juan Garcia Giraldo 1
- María Grandury 1
- Çağlar Gülçehre 1
- Alexander Hägele 1
- Ido Hakimi 1
- Diba Hashemi 1
- Alejandro Hernández-Cano 1
- Torsten Hoefler 1
- Alexander Miserlis Hoyle 1
- Allen Hao Huang 1
- Alexander Ilic 1
- Mete Ismayilzada 1
- Martin Jaggi 1
- Jiaming Jiang 1
- Mark Klein 1
- Ana Klimovic 1
- Andreas Krause 1
- Andrei Kucharavy 1
- Anastasiia Kucherenko 1
- Frederike Lübeck 1
- Roman Machacek 1
- Theofilos Ioannis Manitaras 1
- Andreas Marfurt 1
- Inés Altemir Marinas 1
- Kyle Matoba 1
- Simon Matrenok 1
- Henrique Mendonça 1
- Bettina Messmer 1
- Skander Moalla 1
- Fawzi Roberto Mohamed 1
- Syrielle Montariol 1
- Sven Najem-Meyer 1
- Jingwei Ni 1
- Gennaro Oliva 1
- Barna Pásztor 1
- Matteo Pagliardini 1
- Elia Palme 1
- Andrei Panferov 1
- Léo Paoletti 1
- Marco Passerini 1
- Debjit Paul 1
- Ivan Pavlov 1
- Auguste Poiroux 1
- Kaustubh Ponkshe 1
- Martin Rajman 1
- Nathan Ranchin 1
- Javier Rando 1
- Angelika Romanou 1
- David Rosenthal 1
- Vinko Sabolčec 1
- Mathieu Sauser 1
- Jakhongir Saydaliev 1
- Mukhammadali Sayfiddinov 1
- Imanol Schlag 1
- Marian Schneider 1
- Thomas C. Schulthess 1
- Stefano Schuppli 1
- Marco Scialanga 1
- Andrei Semenov 1
- Kumar Shridhar 1
- Raghav Singhal 1
- Antoni-Joan Solergibert 1
- Anna Sotnikova 1
- Alexander Sternfeld 1
- Ayush Kumar Tarun 1
- Paul Teiletche 1
- Florian Tramèr 1
- Jannis Vamvas 1
- Joost VandeVondele 1
- Livio Veraldi 1
- Robert West 1
- Yixuan Xu 1
- Xiaozhe Yao 1
- Hao Zhao 1
- Eduard Frank Ďurech 1