Yaya Sy
Also published as: Yaya SY
2026
Cross-lingual Matryoshka Representation Learning across Speech and Text
Yaya SY | Dioula Doucouré | Christophe Cerisara | Irina Illina
Findings of the Association for Computational Linguistics: ACL 2026
Speakers of under-represented languages face both a language barrier, as most online knowledge is in a few dominant languages, and a modality barrier, since information is largely text-based while many languages are primarily oral. We address this for French-Wolof by training the first bilingual speech-text Matryoshka embedding model, enabling efficient retrieval of French text from Wolof speech queries without relying on costly ASR-translation pipelines. We introduce large-scale data curation pipelines and new benchmarks, compare modeling strategies, and show that modality fusion within a frozen text Matryoshka model performs best. Although trained only for retrieval, the model generalizes well to other tasks, such as speech intent detection, indicating that it learns general semantic representations. Finally, we analyze cost-accuracy trade-offs across Matryoshka dimensions and ranks, showing that information is concentrated in only a few components, suggesting potential for efficiency improvements.
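The cost-accuracy trade-off across Matryoshka dimensions mentioned in this abstract can be illustrated with a minimal sketch. It assumes only the general Matryoshka property that the first k components of an embedding are themselves a usable embedding. The vectors, the function names (truncate_and_normalize, cosine_rank), and the 4-dimensional size are hypothetical and do not come from the paper or its model.

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)
    return [x / norm for x in head]


def cosine_rank(query, docs, dim):
    """Rank document indices by cosine similarity using only `dim` components."""
    q = truncate_and_normalize(query, dim)
    scored = []
    for idx, doc in enumerate(docs):
        d = truncate_and_normalize(doc, dim)
        score = sum(a * b for a, b in zip(q, d))
        scored.append((score, idx))
    # Highest score first; ties broken by document index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored]


# Toy stand-ins: one "speech query" embedding and three "text" embeddings.
query = [0.9, 0.1, 0.3, -0.2]
docs = [
    [0.8, 0.2, -0.5, 0.4],
    [-0.7, 0.6, 0.1, 0.0],
    [0.1, 0.9, 0.3, -0.2],
]

full_ranking = cosine_rank(query, docs, dim=4)
cheap_ranking = cosine_rank(query, docs, dim=2)
print(full_ranking, cheap_ranking)
```

Comparing the ranking at a small dimension against the full-dimension ranking is one simple way to measure how much retrieval quality is lost when storage and compute are reduced.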
2025
Efficient One-shot Compression via Low-Rank Local Feature Distillation
Yaya Sy | Christophe Cerisara | Irina Illina
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Current structured pruning approaches for large language models typically involve two steps: (1) compression using calibration data and (2) costly continued pretraining on billions of tokens to recover lost performance. This second step is necessary as the first significantly impacts model accuracy. Moreover, prior research suggests that pretrained Transformer weights are not necessarily low-rank, unlike their activations, making one-shot structured pruning challenging. Based on this observation, we propose Lillama, a compression method that locally distills activations with low-rank weights. Using SVD for initialization and a joint loss combining teacher and student activations, we accelerate convergence and reduce memory use with local gradient updates. Lillama compresses Mixtral-8x7B within minutes on a single A100 GPU, removing 10 billion parameters while retaining over 95% of its original performance. Phi-2 3B can be compressed by 40% with just 13 million calibration tokens, resulting in a small model that competes with recent models of similar size. The method generalizes well to non-transformer architectures, compressing Mamba-3B by 20% while maintaining 99% performance.
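To make the parameter savings of low-rank weights concrete, the following sketch works through the arithmetic of replacing a dense d_out x d_in matrix with two factors of rank r. It is generic low-rank bookkeeping, not the Lillama procedure itself; the 4096 x 4096 layer size and the function names are hypothetical and chosen only for illustration.

```python
def dense_params(d_out, d_in):
    """Parameter count of a dense weight matrix W of shape (d_out, d_in)."""
    return d_out * d_in


def low_rank_params(d_out, d_in, rank):
    """Parameter count of W approximated as A @ B, A: (d_out, rank), B: (rank, d_in)."""
    return rank * (d_out + d_in)


def max_saving_rank(d_out, d_in):
    """Largest rank for which the factorized form has strictly fewer parameters."""
    return (d_out * d_in - 1) // (d_out + d_in)


# Hypothetical square projection layer, not taken from any model in the paper.
d_out, d_in = 4096, 4096
rank = 1024

dense = dense_params(d_out, d_in)
factored = low_rank_params(d_out, d_in, rank)
print(f"dense: {dense}, rank-{rank}: {factored}, ratio: {factored / dense:.2f}")
print("largest rank that still saves parameters:", max_saving_rank(d_out, d_in))
```

The factorization only reduces the parameter count when r < d_out * d_in / (d_out + d_in), which is why the chosen rank directly sets the compression ratio.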
2023
Findings from the Bambara - French Machine Translation Competition (BFMT 2023)
Ninoh Agostinho Da Silva | Tunde Oluwaseyi Ajayi | Alexander Antonov | Panga Azazia Kamate | Moussa Coulibaly | Mason Del Rio | Yacouba Diarra | Sebastian Diarra | Chris Emezue | Joel Hamilcaro | Christopher M. Homan | Alexander Most | Joseph Mwatukange | Peter Ohue | Michael Pham | Abdoulaye Sako | Sokhar Samb | Yaya Sy | Tharindu Cyril Weerasooriya | Yacine Zahidi | Sarah Luger
Proceedings of the Sixth Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2023)
Orange Silicon Valley hosted a low-resource machine translation (MT) competition with monetary prizes. The goals of the competition were to raise awareness of the challenges in the low-resource MT domain, improve MT algorithms and data strategies, and support MT expertise development in the regions where people speak Bambara and other low-resource languages. The participants built Bambara to French and French to Bambara machine translation systems using data provided by the organizers and additional data resources shared amongst the competitors. This paper details each team’s different approaches and motivation for ongoing work in Bambara and the broader low-resource machine translation domain.
Co-authors
- Christophe Cerisara 2
- Irina Illina 2
- Ninoh Agostinho Da Silva 1
- Tunde Oluwaseyi Ajayi 1
- Alexander Antonov 1
- Moussa Coulibaly 1
- Mason Del Rio 1
- Sebastian Diarra 1
- Yacouba Diarra 1
- Dioula Doucouré 1
- Chris Chinenye Emezue 1
- Joel Hamilcaro 1
- Christopher M. Homan 1
- Panga Azazia Kamaté 1
- Sarah Luger 1
- Alexander Most 1
- Joseph Mwatukange 1
- Peter Ohue 1
- Michael Pham 1
- Abdoulaye Sako 1
- Sokhar Samb 1
- Tharindu Cyril Weerasooriya 1
- Yacine Zahidi 1