2025
pdf
bib
abs
Statement-Tuning Enables Efficient Cross-lingual Generalization in Encoder-only Models
Ahmed Elshabrawy
|
Thanh-Nhi Nguyen
|
Yeeun Kang
|
Lihan Feng
|
Annant Jain
|
Faadil Abdullah Shaikh
|
Jonibek Mansurov
|
Mohamed Fazli Mohamed Imam
|
Jesus-German Ortiz-Barajas
|
Rendi Chevi
|
Alham Fikri Aji
Findings of the Association for Computational Linguistics: ACL 2025
Large Language Models (LLMs) excel in zero-shot and few-shot tasks, but achieving similar performance with encoder-only models like BERT and RoBERTa has been challenging due to their architecture. However, encoders offer advantages such as lower computational and memory costs. Recent work adapts them for zero-shot generalization using Statement Tuning, which reformulates tasks into finite templates. We extend this approach to multilingual NLP, exploring whether encoders can achieve zero-shot cross-lingual generalization and serve as efficient alternatives to memory-intensive LLMs for low-resource languages. Our results show that state-of-the-art encoder models generalize well across languages, rivaling multilingual LLMs while being more efficient. We also analyze multilingual Statement Tuning dataset design, efficiency gains, and language-specific generalization, contributing to more inclusive and resource-efficient NLP models. We release our code and models.
2024
pdf
bib
abs
MBZUAI-UNAM at SemEval-2024 Task 1: Sentence-CROBI, a Simple Cross-Bi-Encoder-Based Neural Network Architecture for Semantic Textual Relatedness
Jesus German Ortiz Barajas
|
Gemma Bel-enguix
|
Helena Goméz-adorno
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
The Semantic Textual Relatedness (STR) shared task aims at detecting the degree of semantic relatedness between pairs of sentences on low-resource languages from Afroasiatic, Indoeuropean, Austronesian, Dravidian, and Nigercongo families. We use the Sentence-CROBI architecture to tackle this problem. The model is adapted from its original purpose of paraphrase detection to explore its capacities in a related task with limited resources and in multilingual and monolingual settings. Our approach combines the vector representation of cross-encoders and bi-encoders and possesses high adaptable capacity by combining several pre-trained models. Our system obtained good results on the low-resource languages of the dataset using a multilingual fine-tuning approach.
2020
pdf
bib
abs
Enhancing Job Searches in Mexico City with Language Technologies
Gerardo Sierra Martínez
|
Gemma Bel-Enguix
|
Helena Gómez-Adorno
|
Juan Manuel Torres Moreno
|
Tonatiuh Hernández-García
|
Julio V Guadarrama-Olvera
|
Jesús-Germán Ortiz-Barajas
|
Ángela María Rojas
|
Tomas Damerau
|
Soledad Aragón Martínez
Proceedings of the 1st Workshop on Language Technologies for Government and Public Administration (LT4Gov)
In this paper, we show the enhancing of the Demanded Skills Diagnosis (DiCoDe: Diagnóstico de Competencias Demandadas), a system developed by Mexico City’s Ministry of Labor and Employment Promotion (STyFE: Secretaría de Trabajo y Fomento del Empleo de la Ciudad de México) that seeks to reduce information asymmetries between job seekers and employers. The project uses webscraping techniques to retrieve job vacancies posted on private job portals on a daily basis and with the purpose of informing training and individual case management policies as well as labor market monitoring. For this purpose, a collaboration project between STyFE and the Language Engineering Group (GIL: Grupo de Ingeniería Lingüística) was established in order to enhance DiCoDe by applying NLP models and semantic analysis. By this collaboration, DiCoDe’s job vacancies system’s macro-structure and its geographic referencing at the city hall (municipality) level were improved. More specifically, dictionaries were created to identify demanded competencies, skills and abilities (CSA) and algorithms were developed for dynamic classifying of vacancies and identifying terms for searches on free text, in order to improve the results and processing time of queries.