False Friends or Cognates? A Cross-lingual Semantic Ambiguity Evaluation for Galician, Portuguese and Spanish

Marta Vázquez Abuín, Jose Camacho-Collados, Marcos Garcia


Abstract
The linguistic proximity between Galician, Portuguese, and Spanish results in a lexical overlap that often conceals semantic interference. This is particularly evident in false friends, which pose a challenge for NLP systems. In this work, we assess whether state-of-the-art language models can identify and process false friends among these languages. We introduce six cross-lingual datasets covering cognates and false friends, created manually or with semi-automatic methods, with all instances carefully verified. We evaluate a broad range of encoder and decoder models of varying sizes in zero-shot and few-shot settings. Our results highlight the challenging nature of the task, but also show the clear progress made by LLMs in recent years, particularly larger ones, while smaller language models struggle on the task. Notably, unlike other tasks where language distance poses additional challenges, we find that linguistic proximity itself introduces errors: models tend to perform worse on closely related language pairs, reflecting the difficulty of semantic discrimination under lexical overlap.
Anthology ID:
2026.acl-long.1818
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
39199–39214
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.1818/
Cite (ACL):
Marta Vázquez Abuín, Jose Camacho-Collados, and Marcos Garcia. 2026. False Friends or Cognates? A Cross-lingual Semantic Ambiguity Evaluation for Galician, Portuguese and Spanish. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 39199–39214, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
False Friends or Cognates? A Cross-lingual Semantic Ambiguity Evaluation for Galician, Portuguese and Spanish (Abuín et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.1818.pdf
Checklist:
 2026.acl-long.1818.checklist.pdf