Punctuation Restoration in Spanish Customer Support Transcripts using Transfer Learning
Xiliang Zhu, Shayna Gardiner, David Rossouw, Tere Roldán, Simon Corston-Oliver
Abstract
Automatic Speech Recognition (ASR) systems typically produce unpunctuated transcripts that have poor readability. In addition, building a punctuation restoration system is challenging for low-resource languages, especially for domain-specific applications. In this paper, we propose a Spanish punctuation restoration system designed for a real-time customer support transcription service. To address the data sparsity of Spanish transcripts in the customer support domain, we introduce two transfer-learning-based strategies: 1) domain adaptation using out-of-domain Spanish text data; 2) cross-lingual transfer learning leveraging in-domain English transcript data. Our experimental results show that these strategies improve the accuracy of the Spanish punctuation restoration system.
- Anthology ID:
- 2022.deeplo-1.9
- Volume:
- Proceedings of the Third Workshop on Deep Learning for Low-Resource Natural Language Processing
- Month:
- July
- Year:
- 2022
- Address:
- Hybrid
- Editors:
- Colin Cherry, Angela Fan, George Foster, Gholamreza (Reza) Haffari, Shahram Khadivi, Nanyun (Violet) Peng, Xiang Ren, Ehsan Shareghi, Swabha Swayamdipta
- Venue:
- DeepLo
- Publisher:
- Association for Computational Linguistics
- Pages:
- 80–89
- URL:
- https://aclanthology.org/2022.deeplo-1.9
- DOI:
- 10.18653/v1/2022.deeplo-1.9
- Cite (ACL):
- Xiliang Zhu, Shayna Gardiner, David Rossouw, Tere Roldán, and Simon Corston-Oliver. 2022. Punctuation Restoration in Spanish Customer Support Transcripts using Transfer Learning. In Proceedings of the Third Workshop on Deep Learning for Low-Resource Natural Language Processing, pages 80–89, Hybrid. Association for Computational Linguistics.
- Cite (Informal):
- Punctuation Restoration in Spanish Customer Support Transcripts using Transfer Learning (Zhu et al., DeepLo 2022)
- PDF:
- https://preview.aclanthology.org/ingest-acl-2023-videos/2022.deeplo-1.9.pdf