INGEOTEC at SemEval-2024 Task 1: Bag of Words and Transformers

Daniela Moctezuma, Eric Tellez, Mario Graff


Abstract
Understanding the meaning of a written message is crucial to solving Natural Language Processing problems, and measuring the relatedness of two or more messages is a semantic problem that can be tackled with supervised and unsupervised learning. This paper outlines our submissions to the Semantic Textual Relatedness (STR) challenge at SemEval 2024, which evaluates the degree of semantic similarity and relatedness between two sentences across multiple languages. Our submissions follow two main strategies: the first is based on the Bag-of-Words scheme, while the second uses pre-trained Transformers for text representation. The results are encouraging; in particular, different models fit some languages better than others.
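As an illustration of the first strategy named in the abstract, the following is a minimal sketch of how a Bag-of-Words representation can score the relatedness of two sentences. It is not the authors' system: the tokenizer, the raw term counts, and the function names `bag_of_words` and `cosine_relatedness` are assumptions made only for this example, and cosine similarity is one common choice of score rather than the method reported in the paper.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its word tokens (hypothetical tokenizer)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_relatedness(sentence_a, sentence_b):
    """Cosine similarity between the term-count vectors of two sentences."""
    va, vb = bag_of_words(sentence_a), bag_of_words(sentence_b)
    # Only terms present in both sentences contribute to the dot product.
    dot = sum(va[term] * vb[term] for term in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(count * count for count in va.values()))
    norm_b = math.sqrt(sum(count * count for count in vb.values()))
    # Define the score as 0.0 when either sentence has no tokens.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(cosine_relatedness("The cat sat on the mat", "A cat is sitting on a mat"))
```

A score near 1 means the two sentences share most of their vocabulary, and a score of 0 means they share none. A Transformer-based variant would replace `bag_of_words` with dense sentence embeddings while keeping the same comparison step.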
Anthology ID:
2024.semeval-1.168
Volume:
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
1155–1159
URL:
https://aclanthology.org/2024.semeval-1.168
Cite (ACL):
Daniela Moctezuma, Eric Tellez, and Mario Graff. 2024. INGEOTEC at SemEval-2024 Task 1: Bag of Words and Transformers. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pages 1155–1159, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
INGEOTEC at SemEval-2024 Task 1: Bag of Words and Transformers (Moctezuma et al., SemEval 2024)
PDF:
https://preview.aclanthology.org/ingestion-checklist/2024.semeval-1.168.pdf
Supplementary material:
 2024.semeval-1.168.SupplementaryMaterial.txt