Leixin Zhang


2024

Unveiling Semantic Information in Sentence Embeddings
Leixin Zhang | David Burian | Vojtěch John | Ondřej Bojar
Proceedings of the Fifth International Workshop on Designing Meaning Representations @ LREC-COLING 2024

This study evaluates the extent to which semantic information is preserved within sentence embeddings generated by two state-of-the-art sentence embedding models: SBERT and LaBSE. Specifically, we analyze 13 semantic attributes in sentence embeddings. Our findings indicate that some semantic features (such as tense-related classes) can be decoded from sentence embeddings. We also identify a limitation of current sentence embedding models: inferring meaning beyond the lexical level proves difficult.

Tübingen-CL at SemEval-2024 Task 1: Ensemble Learning for Semantic Relatedness Estimation
Leixin Zhang | Çağrı Çöltekin
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

This paper introduces our system for SemEval-2024 Task 1, which aims to predict the relatedness of sentence pairs. Operating under the hypothesis that semantic relatedness is a broader concept that extends beyond mere sentence similarity, our approach seeks to identify useful features for relatedness estimation. We employ an ensemble approach that integrates various systems, including statistical textual features and the outputs of deep learning models, to predict relatedness scores. The findings suggest that semantic relatedness can be inferred from various sources and that ensemble models outperform many individual systems in estimating semantic relatedness.