Similar, but why? A Toolkit for Explaining Text Similarity

Juri Opitz; Andrianos Michail; Lucas Moeller; Sebastian Padó; Simon Clematide

Abstract

Explaining text similarity and developing interpretable models are emerging research challenges (Opitz et al., 2025). We release XPLAINSIM, a Python package that unifies three complementary approaches for explaining textual similarity in an easily accessible way: (1) a token attribution method that explains how individual word interactions contribute to the predicted similarity of any embedding model; (2) a method for inferring structured neural embedding spaces that capture explainable aspects of text; and (3) a symbolic approach that explains textual similarity transparently through parsed meaning representations. We demonstrate the value of our package through intuitive examples and three focused empirical research studies. The first study evaluates interpretability methods for constructing cross-lingual token alignments. The second investigates how modern information retrieval methods handle stop words. The third sheds more light on a long-standing question in computational linguistics: the distinction between relatedness and similarity. XPLAINSIM is available at https://github.com/flipz357/XPLAINSIM.
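
To give a rough intuition for what "word interactions contributing to a predicted similarity" can mean, here is a minimal sketch. It is not XPLAINSIM's API and not the attribution method from the paper. It assumes a simplified setting in which a text embedding is the plain mean of its token vectors, so that the cosine similarity splits exactly into one term per token pair. The function name `pairwise_contributions` and the 2-d token vectors are hypothetical and made up for illustration.

```python
import math

def pairwise_contributions(tokens_a, tokens_b):
    """Split the cosine similarity of two mean-pooled texts into token-pair terms.

    tokens_a, tokens_b: non-empty lists of token vectors (lists of floats) of
    equal dimension, with non-zero mean vectors (assumption of this sketch).
    Returns (matrix, cosine), where matrix[i][j] is the share of the cosine
    similarity attributed to token i of text A and token j of text B.
    """
    n, m = len(tokens_a), len(tokens_b)
    dim = len(tokens_a[0])
    # Mean-pool each text into a single vector.
    mean_a = [sum(v[k] for v in tokens_a) / n for k in range(dim)]
    mean_b = [sum(v[k] for v in tokens_b) / m for k in range(dim)]
    norm_a = math.sqrt(sum(x * x for x in mean_a))
    norm_b = math.sqrt(sum(x * x for x in mean_b))
    scale = n * m * norm_a * norm_b
    # The dot product of two means is bilinear, so it splits over token pairs.
    matrix = [
        [sum(x * y for x, y in zip(va, vb)) / scale for vb in tokens_b]
        for va in tokens_a
    ]
    cosine = sum(x * y for x, y in zip(mean_a, mean_b)) / (norm_a * norm_b)
    return matrix, cosine

# Hypothetical 2-d token vectors, invented for illustration only.
text_a = {"cat": [1.0, 0.0], "sleeps": [0.0, 1.0]}
text_b = {"kitten": [0.9, 0.1], "naps": [0.1, 0.9]}
matrix, cosine = pairwise_contributions(list(text_a.values()), list(text_b.values()))

for word, row in zip(text_a, matrix):
    print(word, [round(value, 3) for value in row])
print("cosine:", round(cosine, 3))
```

In this toy setting the entries of `matrix` sum to the cosine similarity, so each entry can be read as the share contributed by one token pair. Real embedding models are nonlinear, which is why dedicated attribution methods such as the one in the package are needed there.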

Anthology ID: 2026.eacl-demo.16
Volume: Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Month: March
Year: 2026
Address: Rabat, Morocco
Editors: Danilo Croce, Jochen Leidner, Nafise Sadat Moosavi
Venue: EACL
Publisher: Association for Computational Linguistics
Pages: 203–214
URL: https://preview.aclanthology.org/ingest-eacl/2026.eacl-demo.16/
Cite (ACL): Juri Opitz, Andrianos Michail, Lucas Moeller, Sebastian Padó, and Simon Clematide. 2026. Similar, but why? A Toolkit for Explaining Text Similarity. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 3: System Demonstrations), pages 203–214, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): Similar, but why? A Toolkit for Explaining Text Similarity (Opitz et al., EACL 2026)
PDF: https://preview.aclanthology.org/ingest-eacl/2026.eacl-demo.16.pdf
