An Effective Automated Speaking Assessment Approach to Mitigating Data Scarcity and Imbalanced Distribution
Tien-Hong Lo, Fu-An Chao, Tzu-i Wu, Yao-Ting Sung, Berlin Chen
Abstract
Automated speaking assessment (ASA) typically involves automatic speech recognition (ASR) and hand-crafted feature extraction from the ASR transcript of a learner’s speech. Recently, self-supervised learning (SSL) has shown stellar performance compared to traditional methods. However, SSL-based ASA systems are faced with at least three data-related challenges: limited annotated data, uneven distribution of learner proficiency levels and non-uniform score intervals between different CEFR proficiency levels. To address these challenges, we explore the use of two novel modeling strategies: metric-based classification and loss re-weighting, leveraging distinct SSL-based embedding features. Extensive experimental results on the ICNALE benchmark dataset suggest that our approach can outperform existing strong baselines by a sizable margin, achieving a significant improvement of more than 10% in CEFR prediction accuracy.
- Anthology ID:
- 2024.findings-naacl.86
- Volume:
- Findings of the Association for Computational Linguistics: NAACL 2024
- Month:
- June
- Year:
- 2024
- Address:
- Mexico City, Mexico
- Editors:
- Kevin Duh, Helena Gomez, Steven Bethard
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1352–1362
- URL:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2024.findings-naacl.86/
- DOI:
- 10.18653/v1/2024.findings-naacl.86
- Cite (ACL):
- Tien-Hong Lo, Fu-An Chao, Tzu-i Wu, Yao-Ting Sung, and Berlin Chen. 2024. An Effective Automated Speaking Assessment Approach to Mitigating Data Scarcity and Imbalanced Distribution. In Findings of the Association for Computational Linguistics: NAACL 2024, pages 1352–1362, Mexico City, Mexico. Association for Computational Linguistics.
- Cite (Informal):
- An Effective Automated Speaking Assessment Approach to Mitigating Data Scarcity and Imbalanced Distribution (Lo et al., Findings 2024)
- PDF:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2024.findings-naacl.86.pdf