Asya Zanollo
2026
PharmaQA.IT: an Italian dataset for QA in the pharmaceutical domain
Kamyar Zeinalipour | Andrea Zugarini | Asya Zanollo | Leonardo Rigutini
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 5: Industry Track)
The growing use of Large Language Models (LLMs) for medical Question Answering (QA) requires reliable, evidence-grounded benchmarks beyond English. In Italy, Riassunti delle Caratteristiche del Prodotto (RCP) issued by the Italian Medicines Agency (AIFA) are the main regulatory source on medicines, yet no QA dataset exists on these documents, limiting the development and evaluation of trustworthy Italian QA systems. We introduce PharmaQA.IT, an Italian extractive QA dataset built from RCPs in PharmaER.IT. Using a semi-automatic pipeline, we (i) select informative pages from 1,077 leaflets, (ii) prompt a multimodal LLM on page images with professional personas to generate candidate question–answer pairs, and (iii) validate and normalise them with expert revision. The final dataset contains 861 high-quality question–answer pairs on indications, contraindications, dosage, warnings, interactions, and pharmacological properties. We frame PharmaQA.IT as an extractive QA benchmark with structured JSON outputs and evaluate a range of open and proprietary LLMs. Results show that open models approach closed-source performance under a chunking-and-retrieval setup. PharmaQA.IT, together with all code, prompts, and evaluation scripts, is publicly available on Hugging Face to support research on trustworthy Italian biomedical QA.
2025
Linguistic Units as Tokens: Intrinsic and Extrinsic Evaluation with BabyLM
Achille Fusco | Maria Letizia Piccini Bianchessi | Tommaso Sgrizzi | Asya Zanollo | Cristiano Chesi
Proceedings of the First BabyLM Workshop
Tokenization is often treated as a preprocessing step, yet in data-limited settings it directly shapes what a model can learn. We compare four segmentation strategies in the BabyLM Challenge: frequency-based BPE, morphology-aware MorPiece and ParadigmFinder, and syllable-based SylliTok. Evaluation combines two perspectives. First, an intrinsic test on the SIGMORPHON 2022 segmentation benchmark, adapted to English, measures how closely each tokenizer aligns with morpheme boundaries. Second, extrinsic tests train GPT-2 on the 10M BabyLM corpus and evaluate on the 2025 benchmark. No single tokenizer dominates. BPE remains strong on syntax-heavy tasks. ParadigmFinder excels in semantic composition and age-of-acquisition alignment. MorPiece shows advantages in discourse tracking. Morphology-aware tokenizers achieve the best intrinsic segmentation scores, and these gains translate into more robust generalisation in comprehension tasks. These results highlight tokenization as a core modeling decision, with direct consequences for compression, morphology, and the path to humanlike learning.
Surprisal and Crossword Clues Difficulty: Evaluating Linguistic Processing between LLMs and Humans
Tommaso Iaquinta | Asya Zanollo | Achille Fusco | Kamyar Zeinalipour | Cristiano Chesi
Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025)
Acquisition in Babies and Machines: Comparing the Learning Trajectories of LMs in Terms of Syntactic Structures (ATTracTSS Test Set)
Sarah Rossi | Guido Formichi | Sofia Neri | Tommaso Sgrizzi | Asya Zanollo | Veronica Bressan | Cristiano Chesi
Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025)
Structural Sensitivity Does Not Entail Grammaticality: Assessing LLMs against the Universal Functional Hierarchy
Tommaso Sgrizzi | Asya Zanollo | Cristiano Chesi
Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025)
2024
Harnessing LLMs for Educational Content-Driven Italian Crossword Generation
Kamyar Zeinalipour | Achille Fusco | Asya Zanollo | Marco Maggini | Marco Gori
Proceedings of the Tenth Italian Conference on Computational Linguistics (CLiC-it 2024)
In this work, we unveil a novel tool for generating Italian crossword puzzles from text, utilizing advanced language models such as GPT-4o, Mistral-7B-Instruct-v0.3, and Llama3-8B-Instruct. Crafted specifically for educational applications, this cutting-edge generator makes use of the comprehensive Italian-Clue-Instruct dataset, which comprises over 30,000 entries including diverse text, solutions, and types of clues. This carefully assembled dataset is designed to facilitate the creation of contextually relevant clues in various styles associated with specific texts and keywords. The study delves into four distinctive styles of crossword clues: those without format constraints, those formed as definite determiner phrases, copular sentences, and bare noun phrases. Each style introduces unique linguistic structures to diversify clue presentation. Given the lack of sophisticated educational tools tailored to the Italian language, this project seeks to enhance learning experiences and cognitive development through an engaging, interactive platform. By meshing state-of-the-art AI with contemporary educational strategies, our tool can dynamically generate crossword puzzles from Italian educational materials, thereby providing an enjoyable and interactive learning environment. This technological advancement not only redefines educational paradigms but also sets a new benchmark for interactive and cognitive language learning solutions.
ECWCA - Educational CrossWord Clues Answering: A CALAMITA Challenge
Andrea Zugarini | Kamyar Zeinalipour | Achille Fusco | Asya Zanollo
Proceedings of the Tenth Italian Conference on Computational Linguistics (CLiC-it 2024)
This paper presents ECWCA (Educational CrossWord Clues Answering), a novel challenge designed to evaluate knowledge and reasoning capabilities of large language models through crossword clue-answering. The challenge consists of two tasks: a standard question-answering format where the LLM has to solve crossword clues, and a variation of it, where the model receives hints about the word lengths of the answers, which is expected to help models with reasoning abilities. To construct the ECWCA dataset, synthetic clues were generated based on entities and facts extracted from Italian Wikipedia. Generated clues were then selected manually in order to ensure high-quality examples with factually correct and unambiguous clues.