Reducing tokenizer’s tokens per word ratio in Financial domain with T-MuFin BERT Tokenizer
Braulio Blanco Lambruschini, Patricia Becerra-Sanchez, Mats Brorsson, Maciej Zurad
- Anthology ID:
- 2023.finnlp-1.9
- Volume:
- Proceedings of the Fifth Workshop on Financial Technology and Natural Language Processing and the Second Multimodal AI For Financial Forecasting
- Month:
- 20 August
- Year:
- 2023
- Address:
- Macao
- Editors:
- Chung-Chi Chen, Hiroya Takamura, Puneet Mathur, Ramit Sawhney, Hen-Hsen Huang, Hsin-Hsi Chen
- Venues:
- FinNLP | WS
- Pages:
- 94–103
- URL:
- https://aclanthology.org/2023.finnlp-1.9
- Cite (ACL):
- Braulio Blanco Lambruschini, Patricia Becerra-Sanchez, Mats Brorsson, and Maciej Zurad. 2023. Reducing tokenizer’s tokens per word ratio in Financial domain with T-MuFin BERT Tokenizer. In Proceedings of the Fifth Workshop on Financial Technology and Natural Language Processing and the Second Multimodal AI For Financial Forecasting, pages 94–103, Macao.
- Cite (Informal):
- Reducing tokenizer’s tokens per word ratio in Financial domain with T-MuFin BERT Tokenizer (Lambruschini et al., FinNLP-WS 2023)
- PDF:
- https://preview.aclanthology.org/revert-3132-ingestion-checklist/2023.finnlp-1.9.pdf