Less Mature is More Adaptable for Sentence-level Language Modeling

Abhilasha Sancheti, David Dale, Artyom Kozhevnikov, Maha Elbayad


Abstract
This work investigates sentence-level models (i.e., models that operate at the sentence level) to study how sentence representations from various encoders influence downstream task performance, and which syntactic, semantic, and discourse-level properties are essential for strong performance. Our experiments encompass encoders with diverse training regimes and pretraining domains, as well as various pooling strategies applied to multi-sentence input tasks (including sentence ordering, sentiment classification, and natural language inference) requiring coarse-to-fine-grained reasoning. We find that “less mature” representations (e.g., mean-pooled representations from BERT’s first or last layer, or representations from encoders with limited fine-tuning) exhibit greater generalizability and adaptability to downstream tasks compared to representations from extensively fine-tuned models (e.g., SBERT or SimCSE). These findings are consistent across different pretraining seed initializations for BERT. Our probing analysis reveals that syntactic and discourse-level properties are stronger indicators of downstream performance than MTEB scores or decodability. Furthermore, sentence-level models are more data- and time-efficient than token-level models, often outperforming them, underscoring their potential for future research.
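As a point of reference for the “mean-pooled representations from BERT’s first or last layer” mentioned above, the following is a minimal, hypothetical Python sketch using Hugging Face transformers; it is not the authors’ code, and the model name, layer indexing, and masking details are assumptions for illustration only.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Assumed model; the paper's exact encoder checkpoints may differ.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

def mean_pooled_sentence_embedding(sentence: str, layer: int) -> torch.Tensor:
    """Mean-pool token states from one BERT layer, ignoring padding tokens."""
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # hidden_states is a tuple: index 0 = embedding layer,
    # index 1 = first encoder layer, index -1 = last encoder layer.
    states = outputs.hidden_states[layer]          # (1, seq_len, hidden_dim)
    mask = inputs["attention_mask"].unsqueeze(-1)  # (1, seq_len, 1)
    return (states * mask).sum(dim=1) / mask.sum(dim=1)

first_layer_emb = mean_pooled_sentence_embedding("A sentence to encode.", layer=1)
last_layer_emb = mean_pooled_sentence_embedding("A sentence to encode.", layer=-1)
```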
Anthology ID:
2025.acl-long.573
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
11680–11695
URL:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.573/
Cite (ACL):
Abhilasha Sancheti, David Dale, Artyom Kozhevnikov, and Maha Elbayad. 2025. Less Mature is More Adaptable for Sentence-level Language Modeling. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 11680–11695, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Less Mature is More Adaptable for Sentence-level Language Modeling (Sancheti et al., ACL 2025)
PDF:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.573.pdf