Dorian Midou


2026

Numerical data from sensors and time series are widely used in scientific research fields such as nuclear fusion, where experiments generate vast amounts of complex, high-dimensional data. Efficient tools for numerical data analysis are therefore crucial to accelerating experimental research. Large language models (LLMs) have emerged as a promising way to analyze numerical data through natural language queries. However, LLMs struggle with this type of data because they were designed primarily for text. To overcome this limitation, we propose a model-agnostic and data-agnostic agent that processes numerical data through code generation and multimodal reasoning. Our agent performs competitively with baselines on benchmarks for numerical data tasks such as sensor data classification and time series understanding, while outperforming them on information retrieval benchmarks. We have also successfully applied our agent to nuclear fusion research, where physicists and tokamak operators interact with it to plan and analyze fusion experiments.