Textagon: Boosting Language Models with Theory-guided Parallel Representations

John P. Lalor, Ruiyang Qin, David Dobolyi, Ahmed Abbasi


Abstract
Pretrained language models have significantly advanced the state of the art in generating distributed representations of text. However, they do not account for the wide variety of available expert-generated language resources and lexicons that explicitly encode linguistic/domain knowledge. Such lexicons can be paired with learned embeddings to further enhance NLP prediction and linguistic inquiry. In this work we present Textagon, a Python package for generating parallel representations for text based on predefined lexicons and for selecting the representations that provide the most information. We discuss the motivation behind the software and its implementation, and present two case studies demonstrating its operational utility.
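To illustrate the core idea behind lexicon-based parallel representations, the following is a minimal sketch in Python. This is not Textagon's actual API; the function name, the lexicon contents, and the OOV tag are hypothetical and purely illustrative. The idea is that each token in the input is mapped to its category in a predefined lexicon, yielding a position-aligned parallel sequence that can be used alongside the original text.

    def parallel_representation(tokens, lexicon, oov_tag="<OOV>"):
        """Map each token to its lexicon category, keeping positions aligned."""
        return [lexicon.get(token.lower(), oov_tag) for token in tokens]

    # Hypothetical sentiment lexicon (illustrative only).
    sentiment_lexicon = {"great": "POSITIVE", "terrible": "NEGATIVE"}

    tokens = "The food was great but service was terrible".split()
    print(parallel_representation(tokens, sentiment_lexicon))
    # ['<OOV>', '<OOV>', '<OOV>', 'POSITIVE', '<OOV>', '<OOV>', '<OOV>', 'NEGATIVE']

In practice, many such lexicons (sentiment, affect, part of speech, domain vocabularies, etc.) would each yield their own parallel view of the same text, and a selection step would retain the most informative views.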
Anthology ID:
2025.acl-demo.9
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Pushkar Mishra, Smaranda Muresan, Tao Yu
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
82–92
URL:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-demo.9/
Cite (ACL):
John P. Lalor, Ruiyang Qin, David Dobolyi, and Ahmed Abbasi. 2025. Textagon: Boosting Language Models with Theory-guided Parallel Representations. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), pages 82–92, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Textagon: Boosting Language Models with Theory-guided Parallel Representations (Lalor et al., ACL 2025)
PDF:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-demo.9.pdf
Copyright agreement:
2025.acl-demo.9.copyright_agreement.pdf