Alexander Most


2025

Large Language Models (LLMs) are increasingly being leveraged for generating and translating scientific computer codes by both domain-experts and non-domain experts. Fortran has served as one of the go-to programming languages in legacy high-performance computing (HPC) for scientific discoveries. Despite growing adoption, LLM-based code translation of legacy code-bases has not been thoroughly assessed or quantified for its usability. Here, we studied the applicability of LLM-based translation of Fortran to C++ as a step towards building an agentic-workflow using open-weight LLMs on two different computational platforms. We statistically quantified the compilation accuracy of the translated C++ codes, measured the similarity of the LLM-translated code to the human-translated C++ code, and statistically quantified the output similarity of the Fortran to C++ translation.

2023

Orange Silicon Valley hosted a low-resource machine translation (MT) competition with monetary prizes. The goals of the competition were to raise awareness of the challenges in the low-resource MT domain, improve MT algorithms and data strategies, and support MT expertise development in the regions where people speak Bambara and other low-resource languages. The participants built Bambara to French and French to Bambara machine translation systems using data provided by the organizers and additional data resources shared amongst the competitors. This paper details each team’s different approaches and motivation for ongoing work in Bambara and the broader low-resource machine translation domain.