Marco Zocca
2025
Experience Report: Implementing Machine Translation in a Regulated Industry
Marco Zocca | Per Fallgren | David Buffoni
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
This paper presents lessons learned from implementing Machine Translation systems in the context of a global medical technology company. We describe system challenges, legal and security considerations, and the critical role of human-in-the-loop validation for quality assurance and responsible deployment. Furthermore, based on an experiment involving over 11,000 ranked translations, we report reviewer preferences for outputs from small and large language models under various prompting configurations, using a domain-specific dataset spanning five language pairs.
2023
Natural Language Annotations for Reasoning about Program Semantics
Marco Zocca
Findings of the Association for Computational Linguistics: EMNLP 2023
By grounding natural language inference in code (and vice versa), researchers aim to create programming assistants that explain their work, are “coachable” and can surface any gaps in their reasoning. Can we automatically deduce interesting properties of programs from their syntax and common-sense annotations alone, without resorting to static analysis? How much of program logic and behaviour can be captured in natural language? To stimulate research in this direction and attempt to answer these questions, we propose HTL, a dataset and protocol for annotating programs with natural language predicates at a finer granularity than code comments and without relying on internal compiler representations. The dataset is available at the following address: https://doi.org/10.5281/zenodo.7893113