Arnaud Delhay

2025

SocialForge: simulating the social internet to provide realistic training against influence operations
Ulysse Oliveri | Guillaume Gadek | Alexandre Dey | Benjamin Costé | Damien Lolive | Arnaud Delhay | Bruno Grilheres
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track)

Social media platforms have enabled large-scale influence campaigns, impacting democratic processes. To fight against these threats, continuous training is needed. A typical training session is based on a fictional scenario describing key elements, which are instantiated in a dedicated platform. Such a platform simulates social networks, which host a huge amount of content aligned with the training scenario. However, directly using Large Language Models to create appropriate content results in low content diversity due to coarse-grained, high-level scenario constraints, which compromises the trainees’ immersion. We address this issue with SocialForge, a system designed to enhance the diversity and realism of the generated content while ensuring its adherence to the original scenario. Specifically, SocialForge refines and augments the initial scenario constraints by generating detailed subnarratives, personas, and events. We assess diversity, realism, and adherence to the scenario through a custom evaluation protocol. We also propose an automatic method to detect erroneous constraint generation, ensuring optimal alignment of the content with the scenario. SocialForge has been used in real training sessions and in several showcases, with great end-user satisfaction. We release an open-source dataset generated with SocialForge for the research community.

Paraphrase Generation Evaluation Powered by an LLM: A Semantic Metric, Not a Lexical One
Quentin Lemesle | Jonathan Chevelu | Philippe Martin | Damien Lolive | Arnaud Delhay | Nelly Barbot
Proceedings of the 31st International Conference on Computational Linguistics

Evaluating automatic paraphrase production systems is a difficult task as it involves, among other things, assessing the semantic proximity between two sentences. Usual measures are based on lexical distances, or at least on semantic embedding alignments. The rise of Large Language Models (LLMs) has provided tools to model relationships within a text thanks to the attention mechanism. In this article, we introduce ParaPLUIE, a new measure based on a log likelihood ratio from an LLM, to assess the quality of a potential paraphrase. This measure is compared with usual measures on two datasets that were already known to the NLP community prior to this study. Three new small datasets have been built to allow metrics to be compared in different scenarios and to avoid data contamination bias. According to the evaluations, the proposed measure is better at sorting pairs of sentences by semantic proximity. In particular, it is much more independent of lexical distance and provides an interpretable classification threshold between paraphrases and non-paraphrases.
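As a rough illustration of how a log likelihood ratio can yield an interpretable threshold, the sketch below assumes the LLM is asked whether two sentences are paraphrases and that the log-probabilities of a "yes" and a "no" answer are available. The prompt design, the answer tokens, the function names and the threshold of 0 are all assumptions made for illustration; the abstract does not specify them.

```python
import math

def log_likelihood_ratio(logp_yes: float, logp_no: float) -> float:
    """Difference of log-probabilities, i.e. log(P(yes) / P(no)).

    Both inputs are hypothetical log-probabilities that an LLM assigns to
    the answers "yes" and "no" for a paraphrase question.
    """
    return logp_yes - logp_no

def is_paraphrase(logp_yes: float, logp_no: float, threshold: float = 0.0) -> bool:
    """Classify a sentence pair; a ratio above the threshold means "paraphrase".

    With threshold 0 (an assumption), the pair is accepted exactly when
    "yes" is more likely than "no".
    """
    return log_likelihood_ratio(logp_yes, logp_no) > threshold

# Example with made-up probabilities: P(yes) = 0.8, P(no) = 0.2
score = log_likelihood_ratio(math.log(0.8), math.log(0.2))
```

Under these assumptions the sign of the score carries the decision, which is one way a threshold between paraphrases and non-paraphrases can be read directly off the measure.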

Set covering algorithms are efficient tools for optimal linguistic corpus reduction. The optimality of such a process is directly related to the descriptive features of the sentences of a reference corpus. This article experimentally examines the behaviour of three algorithms: a greedy approach and a Lagrangian relaxation based one, both giving importance to rare events, and a third one considering the Kullback-Leibler divergence between a reference distribution and the ongoing distribution of events. The analysis of the content of the reduced corpora shows that the first two approaches remain the most effective at compressing a corpus while guaranteeing a minimal content. The variant which minimises the Kullback-Leibler divergence guarantees a distribution of events close to a reference distribution, as expected; however, the price for this solution is a much larger corpus. In the proposed experiments, we have also evaluated a mixed approach that adds a random complement to the smallest coverings.
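The divergence criterion mentioned above can be sketched as follows: given a reference distribution over events and the event counts of the sentences selected so far, compute the Kullback-Leibler divergence from the reference to the current empirical distribution. The function name, the additive smoothing and the toy events are assumptions for illustration, not the authors' implementation.

```python
import math
from collections import Counter

def kl_divergence(reference: dict, counts: Counter, smoothing: float = 1e-9) -> float:
    """KL(reference || empirical) over the events listed in `reference`.

    `reference` maps each event to its target probability.
    `counts` holds event counts in the currently selected sentences.
    A small additive smoothing (an assumption) avoids division by zero
    for events that have not been covered yet.
    """
    total = sum(counts.values())
    denom = total + smoothing * len(reference)
    divergence = 0.0
    for event, p in reference.items():
        if p == 0.0:
            continue  # zero-probability events contribute nothing
        q = (counts.get(event, 0) + smoothing) / denom
        divergence += p * math.log(p / q)
    return divergence

reference = {"a": 0.5, "b": 0.5}
balanced = kl_divergence(reference, Counter({"a": 5, "b": 5}))
skewed = kl_divergence(reference, Counter({"a": 9, "b": 1}))
```

A selection procedure built on this quantity would, at each step, add the sentence that lowers the divergence the most, which explains why it tends to keep more sentences than a pure covering criterion.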

2008

Comparing Set-Covering Strategies for Optimal Corpus Design
Jonathan Chevelu | Nelly Barbot | Olivier Boeffard | Arnaud Delhay
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

This article addresses the problem of the linguistic content of a speech corpus. Depending on the target task, the phonological and linguistic content of the corpus is controlled by collecting a set of sentences which covers a preset description of phonological attributes, under the constraint of an overall duration as small as possible. This goal is classically achieved by greedy algorithms, which, however, do not guarantee the optimality of the desired cover. In recent work, a Lagrangian-based algorithm, called LamSCP, has been used to extract coverings of diphonemes from a large corpus in French, giving better results than a greedy algorithm. We propose to further compare both algorithms in terms of shortest duration, stability and robustness by computing multi-represented diphoneme or triphoneme coverings. These coverings, extracted from a corpus in English, correspond to very large scale optimization problems. For each experiment, LamSCP improves on the greedy results by 3.9 to 9.7 percent.
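The greedy baseline described above can be illustrated with a minimal sketch: at each step, pick the sentence that covers the most still-missing units per second of duration. The data layout, the scoring rule (units gained divided by duration) and the toy sentences are assumptions; this covers each unit once and does not reproduce LamSCP or the multi-represented coverings studied in the paper.

```python
def greedy_cover(sentences: dict, required: set) -> list:
    """Greedy weighted set cover.

    `sentences` maps a sentence id to (duration_in_seconds, set_of_units).
    Returns the ids chosen, in selection order, so that every unit in
    `required` is covered at least once.
    """
    uncovered = set(required)
    chosen = []
    while uncovered:
        best_id, best_score = None, 0.0
        for sid, (duration, units) in sentences.items():
            # Newly covered units per second of added duration.
            score = len(units & uncovered) / duration
            if score > best_score:
                best_id, best_score = sid, score
        if best_id is None:
            raise ValueError("required units cannot be covered")
        chosen.append(best_id)
        uncovered -= sentences[best_id][1]
    return chosen

# Toy corpus: hypothetical sentence ids, durations and diphoneme-like units.
corpus = {
    "s1": (2.0, {"ab", "bc"}),
    "s2": (1.0, {"ab"}),
    "s3": (1.5, {"bc", "cd"}),
    "s4": (4.0, {"ab", "bc", "cd"}),
}
selection = greedy_cover(corpus, {"ab", "bc", "cd"})
total_duration = sum(corpus[sid][0] for sid in selection)
```

Because each choice is only locally best, the result carries no optimality guarantee, which is the gap that a Lagrangian relaxation approach aims to narrow.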