Guillaume Jacques


2022

Fine-tuning pre-trained models for Automatic Speech Recognition, experiments on a fieldwork corpus of Japhug (Trans-Himalayan family)
Séverine Guillaume | Guillaume Wisniewski | Cécile Macaire | Guillaume Jacques | Alexis Michaud | Benjamin Galliot | Maximin Coavoux | Solange Rossato | Minh-Châu Nguyên | Maxime Fily
Proceedings of the Fifth Workshop on the Use of Computational Methods in the Study of Endangered Languages

This is a report on results obtained in the development of speech recognition tools intended to support linguistic documentation efforts. The test case is an extensive fieldwork corpus of Japhug, an endangered language of the Trans-Himalayan (Sino-Tibetan) family. The goal is to reduce the transcription workload of field linguists. The method used is a deep learning approach based on the language-specific tuning of a generic pre-trained representation model, XLS-R, using a Transformer architecture. We note difficulties in implementation, in terms of learning stability, but this approach nonetheless brings significant improvements. The quality of phonemic transcription is improved over earlier experiments and, most significantly, the new approach makes it possible to reach the stage of automatic word recognition. Subjective evaluation of the tool by the author of the training data confirms the usefulness of this approach.

2021

User-friendly Automatic Transcription of Low-resource Languages: Plugging ESPnet into Elpis
Oliver Adams | Benjamin Galliot | Guillaume Wisniewski | Nicholas Lambourne | Ben Foley | Rahasya Sanders-Dwyer | Janet Wiles | Alexis Michaud | Séverine Guillaume | Laurent Besacier | Christopher Cox | Katya Aplonova | Guillaume Jacques | Nathan Hill
Proceedings of the 4th Workshop on the Use of Computational Methods in the Study of Endangered Languages Volume 1 (Papers)

2016

Contribuer au progrès solidaire des recherches et de la documentation : la Collection Pangloss et la Collection AuCo (Contributing to joint progress in documentation and research: some achievements and future perspectives of the Pangloss Collection and the AuCo Collection)
Alexis Michaud | Séverine Guillaume | Guillaume Jacques | Đăng-Khoa Mạc | Michel Jacobson | Thu-Hà Phạm | Matthew Deo
Actes de la conférence conjointe JEP-TALN-RECITAL 2016. volume 1 : JEP

This paper presents the scientific projects and achievements of two collections hosted by the Cocoon oral resources platform: the Pangloss Collection, which mainly covers languages of oral tradition (unwritten languages) from around the world; and the AuCo Collection, dedicated to the languages of Vietnam and neighbouring countries. The goal is joint progress in research and linguistic documentation. Emphasis is placed on the perspectives opened up for research in phonetics/phonology by some recent achievements within these two Collections.