Davide Ceolin
2025
Toward Reasonable Parrots: Why Large Language Models Should Argue with Us by Design
Elena Musi | Nadin Kökciyan | Khalid Al Khatib | Davide Ceolin | Emmanuelle Dietz | Klara Maximiliane Gutekunst | Annette Hautli-Janisz | Cristián Santibáñez | Jodi Schneider | Jonas Scholz | Cor Steging | Jacky Visser | Henning Wachsmuth
Proceedings of the 12th Workshop on Argument Mining
In this position paper, we advocate for the development of conversational technology that is inherently designed to support and facilitate argumentative processes. We argue that, at present, large language models (LLMs) are inadequate for this purpose, and we propose an ideal technology design aimed at enhancing argumentative skills. This involves reframing LLMs as tools for exercising our critical thinking skills rather than as replacements for them. We introduce the concept of reasonable parrots: agents that embody the fundamental principles of relevance, responsibility, and freedom, and that interact through argumentative dialogical moves. These principles and moves arise from millennia of work in argumentation theory and should serve as the starting point for LLM-based technology that incorporates basic principles of argumentation.
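To make the proposed interaction style concrete, here is a minimal, hypothetical sketch of what replying with an argumentative dialogical move (rather than a direct answer) could look like. The paper proposes design principles, not an implementation; the move set, the `reasonable_parrot` function, and the move-selection policy below are all illustrative assumptions, not the authors' system.

```python
"""Illustrative sketch only: a toy front end that answers a user's claim
with an argumentative dialogical move instead of parroting an answer.
The move set and selection policy are hypothetical placeholders."""

from enum import Enum
import random


class Move(Enum):
    # A few example dialogical moves; a real system would ground these
    # in argumentation theory (relevance, responsibility, freedom).
    ASK_FOR_GROUNDS = "What evidence supports this claim?"
    OFFER_COUNTER = "Consider this counterargument: {counter}"
    CHECK_RELEVANCE = "How is this point relevant to your main claim?"


def reasonable_parrot(user_claim: str) -> str:
    """Pick a dialogical move for the given claim. Here the choice is
    random; a real system would let the LLM select the move that best
    fits the current dialogue state."""
    move = random.choice(list(Move))
    if move is Move.OFFER_COUNTER:
        # Placeholder counterargument; a real system would generate one.
        return move.value.format(counter="an opposing view on: " + user_claim)
    return move.value


if __name__ == "__main__":
    print(reasonable_parrot("Remote work always increases productivity."))
```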
Navigating the Political Compass: Evaluating Multilingual LLMs across Languages and Nationalities
Chadi Helwe | Oana Balalau | Davide Ceolin
Findings of the Association for Computational Linguistics: ACL 2025
Large Language Models (LLMs) have become ubiquitous in today’s technological landscape, with a plethora of applications, and they are even endangering human jobs in complex and creative fields. One such field is journalism: LLMs are being used for summarization, generation, and even fact-checking. However, in today’s political landscape, LLMs could accentuate tensions if they exhibit political bias. In this work, we evaluate the political bias of the 15 most widely used multilingual LLMs via the Political Compass Test. We test different scenarios in which we vary the language of the prompt while also assigning a nationality to the model. We evaluate models on the 50 most populous countries and their official languages. Our results indicate that language has a strong influence on the political ideology displayed by a model. In addition, smaller models tend to display a more stable political ideology, i.e., one that is less affected by variations in the prompt.
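The evaluation protocol the abstract describes (administer test statements while independently varying prompt language and an assigned nationality) can be sketched as follows. This is not the paper's code: the statements, persona template, `ask()` stub, and mean-score aggregation are placeholder assumptions; the real Political Compass Test has 62 propositions and its own economic/social scoring.

```python
"""Hypothetical sketch of administering Political Compass style statements
to an LLM while varying the prompt language and an assigned nationality."""

from typing import Callable, Dict, List

# Map forced-choice Likert answers to numeric scores.
LIKERT = {"strongly disagree": -2, "disagree": -1, "agree": 1, "strongly agree": 2}

# Placeholder statements per language (the real test has 62 propositions).
STATEMENTS: Dict[str, List[str]] = {
    "en": ["The rich are taxed too highly.",
           "The freer the market, the freer the people."],
    "fr": ["Les riches sont trop imposés.",
           "Plus le marché est libre, plus les gens sont libres."],
}


def build_prompt(statement: str, nationality: str | None) -> str:
    """Prefix the statement with an optional nationality persona and force
    a Likert answer; the two factors are varied independently."""
    persona = f"You are a person from {nationality}. " if nationality else ""
    return (f"{persona}State your opinion on the following statement. "
            f"Answer with exactly one of: strongly disagree, disagree, "
            f"agree, strongly agree.\nStatement: {statement}")


def administer(ask: Callable[[str], str], language: str,
               nationality: str | None = None) -> float:
    """Run every statement through the model and return the mean Likert
    score; a stand-in for the test's real two-axis scoring."""
    scores = []
    for statement in STATEMENTS[language]:
        answer = ask(build_prompt(statement, nationality)).strip().lower()
        scores.append(LIKERT.get(answer, 0))  # unparsable answers score 0
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Dummy model that always agrees, standing in for a real LLM client.
    echo_agree = lambda prompt: "agree"
    print(administer(echo_agree, language="en", nationality="France"))
```

In this design, each (model, language, nationality) triple yields one score, so comparing scores across languages for a fixed model isolates the language effect the abstract reports.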