SNLP at TextGraphs 2022 Shared Task: Unsupervised Natural Language Premise Selection in Mathematical Texts Using Sentence-MPNet

Paul Trust, Provia Kadusabe, Haseeb Younis, Rosane Minghim, Evangelos Milios, Ahmed Zahran


Abstract
This paper describes our submission to the TextGraphs 2022 shared task at COLING 2022: Natural Language Premise Selection (NLPS) from mathematical texts. NLPS is the task of selecting, from a knowledge base written in natural language and mathematical formulae, the mathematical statements (called premises) that are most likely to be used in a particular mathematical proof. We formulated this task as an unsupervised semantic similarity task: we first obtained contextualized embeddings of both the premises and the mathematical proofs using sentence transformers, then computed the cosine similarity between the premise and proof embeddings, and finally selected the premises with the highest cosine scores as the most probable. In terms of mean average precision (MAP), our system improves by about 23.5% (0.1516 versus 0.1228) over the baseline system, which uses bag-of-words models based on term frequency-inverse document frequency (TF-IDF).
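The ranking and scoring steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the three-dimensional vectors are made-up stand-ins for sentence-transformer embeddings, and the identifiers (`p1`, `t1`, `rank_premises`, and so on) are hypothetical names chosen for this example. Only the two MAP values in the last line come from the abstract.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_premises(query_vec, premise_vecs):
    """Return premise ids ordered by descending cosine similarity to the query."""
    scored = [(cosine(query_vec, vec), pid) for pid, vec in premise_vecs.items()]
    # Sort by score (highest first); break ties by id for a deterministic order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [pid for _, pid in scored]


def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked list against its set of gold premises."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, pid in enumerate(ranked_ids, start=1):
        if pid in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(rankings, gold):
    """Mean of the per-query average precision over all queries in `gold`."""
    return sum(average_precision(rankings[q], gold[q]) for q in gold) / len(gold)


def relative_improvement(system_score, baseline_score):
    """Relative gain of the system over the baseline, as a fraction."""
    return (system_score - baseline_score) / baseline_score


# Toy embeddings (assumed values; real ones would come from a sentence encoder).
premise_vecs = {"p1": [1.0, 0.0, 0.0], "p2": [0.6, 0.8, 0.0], "p3": [0.0, 0.0, 1.0]}
query_vecs = {"t1": [1.0, 0.1, 0.0], "t2": [0.0, 0.2, 1.0]}
gold = {"t1": ["p2"], "t2": ["p3"]}  # hypothetical gold premises per query

rankings = {q: rank_premises(vec, premise_vecs) for q, vec in query_vecs.items()}
print(rankings["t1"])                          # ranked premise ids for query t1
print(mean_average_precision(rankings, gold))  # MAP on the toy data
print(round(relative_improvement(0.1516, 0.1228) * 100, 1))  # abstract's MAP values
```

The last line reproduces the arithmetic behind the reported gain: (0.1516 - 0.1228) / 0.1228 is roughly 0.2345, which is the "about 23.5%" quoted above.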
Anthology ID:
2022.textgraphs-1.13
Volume:
Proceedings of TextGraphs-16: Graph-based Methods for Natural Language Processing
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Editors:
Dmitry Ustalov, Yanjun Gao, Alexander Panchenko, Marco Valentino, Mokanarangan Thayaparan, Thien Huu Nguyen, Gerald Penn, Arti Ramesh, Abhik Jana
Venue:
TextGraphs
Publisher:
Association for Computational Linguistics
Pages:
119–123
URL:
https://aclanthology.org/2022.textgraphs-1.13
Cite (ACL):
Paul Trust, Provia Kadusabe, Haseeb Younis, Rosane Minghim, Evangelos Milios, and Ahmed Zahran. 2022. SNLP at TextGraphs 2022 Shared Task: Unsupervised Natural Language Premise Selection in Mathematical Texts Using Sentence-MPNet. In Proceedings of TextGraphs-16: Graph-based Methods for Natural Language Processing, pages 119–123, Gyeongju, Republic of Korea. Association for Computational Linguistics.
Cite (Informal):
SNLP at TextGraphs 2022 Shared Task: Unsupervised Natural Language Premise Selection in Mathematical Texts Using Sentence-MPNet (Trust et al., TextGraphs 2022)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2022.textgraphs-1.13.pdf