Abstract
“Self-supervised learning has been widely used to learn effective sentence representations. Previous evaluation of sentence representations mainly focuses on the limited combination of tasks and paradigms while failing to evaluate their effectiveness in a wider range of application scenarios. Such divergences prevent us from understanding the limitations of current sentence representations, as well as the connections between learning approaches and downstream applications. In this paper, we propose SentBench, a new comprehensive benchmark to evaluate sentence representations. SentBench covers 12 kinds of tasks and evaluates sentence representations with three types of different downstream application paradigms. Based on SentBench, we re-evaluate several frequently used self-supervised sentence representation learning approaches. Experiments show that SentBench can effectively evaluate sentence representations from multiple perspectives, and the performance on SentBench leads to some novel findings which enlighten future researches.”
- Anthology ID:
- 2023.ccl-1.69
- Volume:
- Proceedings of the 22nd Chinese National Conference on Computational Linguistics
- Month:
- August
- Year:
- 2023
- Address:
- Harbin, China
- Venue:
- CCL
- Publisher:
- Chinese Information Processing Society of China
- Pages:
- 813–823
- Language:
- English
- URL:
- https://aclanthology.org/2023.ccl-1.69
- Cite (ACL):
- Liu Xiaoming, Lin Hongyu, Han Xianpei, and Sun Le. 2023. SentBench: Comprehensive Evaluation of Self-Supervised Sentence Representation with Benchmark Construction. In Proceedings of the 22nd Chinese National Conference on Computational Linguistics, pages 813–823, Harbin, China. Chinese Information Processing Society of China.
- Cite (Informal):
- SentBench: Comprehensive Evaluation of Self-Supervised Sentence Representation with Benchmark Construction (Xiaoming et al., CCL 2023)
- PDF:
- https://preview.aclanthology.org/remove-xml-comments/2023.ccl-1.69.pdf