Anna Colli

2026

We describe the participation of team TFB in SemEval-2026 Task 4 on narrative similarity. We explore ColBERT-inspired sentence-level late interaction to capture event reordering; compare fine-tuning on synthetic data at multiple difficulty tiers, finding that distribution proximity to the target data matters more than volume; and evaluate chain-of-thought prompting. We complement these approaches with a human annotation study (Krippendorff’s alpha = 0.32) that confirms the task’s inherent difficulty, and with an analysis of synthetic data distribution shift that explains why fine-tuning on out-of-distribution data hurts model performance. None of these approaches surpassed sentence-t5-xxl on Track B or Qwen2.5-7B on Track A, so we submitted these two models for the task.

2025

Exploration de la modalité en français parlé et écrit
Anna Colli | Delphine Battistelli
Actes des 32ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN), volume 1 : articles scientifiques originaux

In this article, we present a methodology for comparing the modal profiles of French corpora. We show which differences do or do not emerge between written and spoken language, and we highlight the importance and role of polysemous markers in both cases. We pay particular attention to the polysemy of the verb pouvoir, since it proves to be a very frequent marker across all the corpora.

2024

A Modal Sense Classifier for the French Modal Verb Pouvoir
Anna Colli | Diego Rossini | Delphine Battistelli
Proceedings of the Tenth Italian Conference on Computational Linguistics (CLiC-it 2024)

In this paper, we address the problem of modal sense classification for the French modal verb pouvoir in a transcribed spoken corpus. To the best of our knowledge, no previous study has focused on this task for French. We fine-tuned various BERT-based models for French to determine which one performed best. The Flaubert-base-cased model was the most effective (F1-score of 0.94). The most frequent categories in our corpus were material possibility and ability, both of which belong to the broader alethic category.

Co-authors

Eve Sauvage 1

Julien Tourille 1

Zheng Zhang 1
