Similar, but why? A Toolkit for Explaining Text Similarity
Juri Opitz, Andrianos Michail, Lucas Moeller, Sebastian Padó, Simon Clematide
Abstract
Explaining text similarity and developing interpretable models are emerging research challenges (Opitz et al., 2025). We release XPLAINSIM, a Python package that unifies three complementary approaches for explaining textual similarity in an easily accessible way: (1) a token attribution method that explains how individual word interactions contribute to the predicted similarity of any embedding model; (2) a method for inferring structured neural embedding spaces that capture explainable aspects of text; and (3) a symbolic approach that explains textual similarity transparently through parsed meaning representations. We demonstrate the value of our package through intuitive examples and three focused empirical research studies. The first study evaluates interpretability methods for constructing cross-lingual token alignments. The second investigates how modern information retrieval methods handle stop words. The third sheds more light on a long-standing question in computational linguistics: the distinction between relatedness and similarity. XPLAINSIM is available at https://github.com/flipz357/XPLAINSIM.
- Anthology ID:
- 2026.eacl-demo.16
- Volume:
- Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 3: System Demonstrations)
- Month:
- March
- Year:
- 2026
- Address:
- Rabat, Morocco
- Editors:
- Danilo Croce, Jochen Leidner, Nafise Sadat Moosavi
- Venue:
- EACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 203–214
- URL:
- https://preview.aclanthology.org/ingest-eacl/2026.eacl-demo.16/
- Cite (ACL):
- Juri Opitz, Andrianos Michail, Lucas Moeller, Sebastian Padó, and Simon Clematide. 2026. Similar, but why? A Toolkit for Explaining Text Similarity. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 3: System Demonstrations), pages 203–214, Rabat, Morocco. Association for Computational Linguistics.
- Cite (Informal):
- Similar, but why? A Toolkit for Explaining Text Similarity (Opitz et al., EACL 2026)
- PDF:
- https://preview.aclanthology.org/ingest-eacl/2026.eacl-demo.16.pdf