Ludovic Moncla
2026
ATOM: AdapTive and OptiMized dynamic temporal knowledge graph construction using LLMs
Yassir Lairgi | Ludovic Moncla | Khalid Benabdeslem | Rémy Cazabet | Pierre Cléau
Findings of the Association for Computational Linguistics: EACL 2026
In today’s rapidly expanding data landscape, knowledge extraction from unstructured text is vital for real-time analytics, temporal inference, and dynamic memory frameworks. However, traditional static knowledge graph (KG) construction often overlooks the dynamic and time-sensitive nature of real-world data, limiting adaptability to continuous changes. Moreover, recent zero- or few-shot approaches that avoid domain-specific fine-tuning or reliance on prebuilt ontologies often suffer from instability across multiple runs, as well as incomplete coverage of key facts. To address these challenges, we introduce ATOM (AdapTive and OptiMized), a few-shot and scalable approach that builds and continuously updates Temporal Knowledge Graphs (TKGs) from unstructured texts. ATOM splits input documents into minimal, self-contained “atomic” facts, improving extraction exhaustivity and stability. It then constructs atomic TKGs from these facts, employing dual-time modeling that distinguishes between when information is observed and when it is valid. The resulting atomic TKGs are subsequently merged in parallel. Empirical evaluations show that ATOM achieves 18% higher exhaustivity, 33% better stability, and over 90% latency reduction compared to baseline methods, indicating strong scalability potential for dynamic TKG construction.
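The dual-time idea and the merge step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dictionary representation, the helper names `atomic_tkg` and `merge_tkgs`, and the sample facts are all hypothetical. It assumes each atomic TKG holds one triple tagged with a (valid time, observed time) pair, and that merging is a plain set union, which is associative and commutative and could therefore be applied pairwise in parallel.

```python
from functools import reduce

# Hypothetical representation (not from the paper): an atomic TKG maps a
# (subject, relation, object) triple to the set of
# (valid_time, observed_time) pairs attached to it.


def atomic_tkg(subject, relation, obj, valid_time, observed_time):
    """Build a one-fact graph that keeps both timestamps separate."""
    return {(subject, relation, obj): {(valid_time, observed_time)}}


def merge_tkgs(left, right):
    """Union two graphs without mutating either input.

    Set union is associative and commutative, so graphs can be merged
    pairwise in any order, which is what makes a parallel merge possible.
    """
    merged = {triple: set(times) for triple, times in left.items()}
    for triple, times in right.items():
        merged.setdefault(triple, set()).update(times)
    return merged


# Illustrative facts only: the same fact observed on two different dates,
# plus one unrelated fact.
atomic_graphs = [
    atomic_tkg("Alice", "works_at", "Acme", "2020-01", "2024-05-01"),
    atomic_tkg("Alice", "works_at", "Acme", "2020-01", "2024-06-15"),
    atomic_tkg("Alice", "lives_in", "Lyon", "2019-03", "2024-05-01"),
]

merged_graph = reduce(merge_tkgs, atomic_graphs, {})
```

Here the repeated `works_at` fact collapses into a single triple that keeps one validity time and two observation times, which is the distinction the dual-time modeling draws.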
EDDA-Coordinata: An Annotated Dataset of Historical Geographic Coordinates
Ludovic Moncla | Pierre Nugues | Thierry Joliveau | Katherine McDonough
Proceedings of the Fifteenth Language Resources and Evaluation Conference
This paper introduces a dataset of enriched geographic coordinates retrieved from Diderot and d’Alembert’s eighteenth-century Encyclopédie. Automatically recovering geographic coordinates from historical texts is a complex task, as they are expressed in a variety of ways and with varying levels of precision. To improve retrieval of coordinates from similar digitized early modern texts, we have created a gold standard dataset, trained models, published the resulting inferred and normalized coordinate data, and experimented with applying these models to new texts. From 74,000 total articles in each of the digitized versions of the Encyclopédie from ARTFL and ENCCRE, we examined 15,278 geographical entries, manually identifying 4,798 containing coordinates, and 10,480 with descriptive but non-numerical references. Leveraging our gold standard annotations, we trained transformer-based models to retrieve and normalize coordinates. The pipeline presented here combines a classifier to identify coordinate-bearing entries and a second model for retrieval, tested across encoder–decoder and decoder architectures. Cross-validation yielded an 86% EM score. On an out-of-domain eighteenth-century Trévoux dictionary (also in French), our fine-tuned model had a 61% EM score, while for the nineteenth-century 7th edition of the Encyclopædia Britannica in English, the EM was 77%. These findings highlight the gold standard dataset’s usefulness as training data, and our two-step method’s cross-lingual, cross-domain generalizability.
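For readers unfamiliar with the metric, the sketch below shows one common way an EM score is computed, assuming that EM stands for exact match: the fraction of predictions that equal the gold annotation exactly. The function name, the whitespace stripping, and the sample coordinate strings are illustrative assumptions and are not taken from the paper or its dataset.

```python
def exact_match_score(predictions, references):
    """Return the fraction of predictions identical to their reference.

    Assumption: strings are compared after stripping surrounding
    whitespace; the paper's own normalization may differ.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(
        pred.strip() == gold.strip()
        for pred, gold in zip(predictions, references)
    )
    return hits / len(references)


# Made-up normalized coordinates, for illustration only.
gold = ["48.85 N, 2.35 E", "45.76 N, 4.83 E", "43.30 N, 5.37 E"]
predicted = ["48.85 N, 2.35 E", "45.76 N, 4.84 E", "43.30 N, 5.37 E"]

# Two of the three predictions match exactly; the second differs by one digit.
score = exact_match_score(predicted, gold)
```

Because the comparison is all-or-nothing, a prediction that is off by a single digit counts as a miss, which makes EM a strict measure for coordinate retrieval.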