The Classical Language Toolkit: An NLP Framework for Pre-Modern Languages
Kyle P. Johnson, Patrick J. Burns, John Stewart, Todd Cook, Clément Besnier, William J. B. Mattingly
Abstract
This paper announces version 1.0 of the Classical Language Toolkit (CLTK), an NLP framework for pre-modern languages. The vast majority of NLP, its algorithms and software, is created with assumptions particular to living languages, thus neglecting certain important characteristics of largely non-spoken historical languages. Further, scholars of pre-modern languages often have different goals than those of living-language researchers. To fill this void, the CLTK adapts ideas from several leading NLP frameworks to create a novel software architecture that satisfies the unique needs of pre-modern languages and their researchers. Its centerpiece is a modular processing pipeline that balances the competing demands of algorithmic diversity with pre-configured defaults. The CLTK currently provides pipelines, including models, for almost 20 languages.
- Anthology ID: 2021.acl-demo.3
- Volume: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations
- Month: August
- Year: 2021
- Address: Online
- Editors: Heng Ji, Jong C. Park, Rui Xia
- Venues: ACL | IJCNLP
- Publisher: Association for Computational Linguistics
- Pages: 20–29
- URL: https://preview.aclanthology.org/ingest_wac_2008/2021.acl-demo.3/
- DOI: 10.18653/v1/2021.acl-demo.3
- Cite (ACL): Kyle P. Johnson, Patrick J. Burns, John Stewart, Todd Cook, Clément Besnier, and William J. B. Mattingly. 2021. The Classical Language Toolkit: An NLP Framework for Pre-Modern Languages. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations, pages 20–29, Online. Association for Computational Linguistics.
- Cite (Informal): The Classical Language Toolkit: An NLP Framework for Pre-Modern Languages (Johnson et al., ACL-IJCNLP 2021)
- PDF: https://preview.aclanthology.org/ingest_wac_2008/2021.acl-demo.3.pdf
- Code: cltk/cltk
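
The abstract describes a modular processing pipeline that ships with pre-configured defaults but still lets researchers choose among algorithms. The sketch below illustrates that general idea only and is not the CLTK's actual API: the names `Doc`, `Pipeline`, `DEFAULT_PIPELINES`, and the three process functions are hypothetical, and the Latin j/v folding is merely an assumed example of a language-specific step.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Doc:
    """Hypothetical container that each process reads from and writes to."""
    raw: str
    normalized: str = ""
    tokens: List[str] = field(default_factory=list)


# A process is any callable that takes a Doc and returns a Doc.
Process = Callable[[Doc], Doc]


def lowercase_normalize(doc: Doc) -> Doc:
    doc.normalized = doc.raw.lower()
    return doc


def latin_normalize(doc: Doc) -> Doc:
    # Illustrative assumption: lowercase, then fold j/v into i/u.
    doc.normalized = doc.raw.lower().replace("j", "i").replace("v", "u")
    return doc


def whitespace_tokenize(doc: Doc) -> Doc:
    doc.tokens = (doc.normalized or doc.raw).split()
    return doc


# Pre-configured defaults, keyed by ISO 639-3 language code (hypothetical mapping).
DEFAULT_PIPELINES: Dict[str, List[Process]] = {
    "lat": [latin_normalize, whitespace_tokenize],
    "grc": [lowercase_normalize, whitespace_tokenize],
}


class Pipeline:
    """Runs an ordered list of processes; falls back to the language default."""

    def __init__(self, language: str, processes: Optional[List[Process]] = None):
        if processes is None:
            processes = DEFAULT_PIPELINES[language]
        self.language = language
        self.processes = list(processes)

    def analyze(self, text: str) -> Doc:
        doc = Doc(raw=text)
        for process in self.processes:
            doc = process(doc)
        return doc


# Default pipeline for Latin.
default_doc = Pipeline("lat").analyze("Gallia est omnis divisa")

# Same language, with the normalizer swapped for a different algorithm.
custom_doc = Pipeline(
    "lat", [lowercase_normalize, whitespace_tokenize]
).analyze("Gallia est omnis divisa")
```

Because every process shares the same `Doc -> Doc` signature, swapping one algorithm for another only changes the list passed to `Pipeline`, while users who pass nothing get the language's default sequence.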