Anna Sugian


2026

Automatic classification of aphasia severity remains challenging, particularly for languages with limited clinical speech resources such as Russian. This paper explores a multimodal approach to severity estimation that combines acoustic and semantic representations of pathological speech. Acoustic features are extracted with pretrained Wav2Vec 2.0 models, while semantic information is obtained from the encoder of the Whisper model. The two representations are integrated via early feature fusion and evaluated with gradient boosting classifiers in a speaker-independent cross-validation setting. Experiments are conducted on a newly collected dataset of Russian speech recordings from patients with aphasia and from neurotypical speakers (RuAphasiaBank). The results suggest that the combined use of acoustic and semantic embeddings can provide more stable severity estimates than unimodal baselines. This study provides empirical evidence for the applicability of multimodal representation learning to aphasia severity classification under data-scarce conditions.
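
To make the described pipeline concrete, the sketch below shows one possible implementation of early feature fusion with speaker-independent evaluation. It is illustrative only: the checkpoint names (facebook/wav2vec2-base, openai/whisper-base), the mean-pooling of hidden states, and the use of scikit-learn's GradientBoostingClassifier with GroupKFold are assumptions for the sketch, not the paper's exact configuration.

    # Illustrative sketch of the early-fusion pipeline; model checkpoints,
    # pooling, and classifier choice are assumptions, not the paper's setup.
    import numpy as np
    import torch
    from transformers import (
        Wav2Vec2Model, Wav2Vec2FeatureExtractor,
        WhisperModel, WhisperFeatureExtractor,
    )
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import GroupKFold, cross_val_score

    SR = 16_000  # both models expect 16 kHz mono input

    w2v_fe = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
    w2v = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base").eval()
    whisper_fe = WhisperFeatureExtractor.from_pretrained("openai/whisper-base")
    whisper = WhisperModel.from_pretrained("openai/whisper-base").eval()

    @torch.no_grad()
    def acoustic_embedding(wave: np.ndarray) -> np.ndarray:
        """Mean-pool Wav2Vec 2.0 hidden states into one acoustic vector."""
        inputs = w2v_fe(wave, sampling_rate=SR, return_tensors="pt")
        hidden = w2v(**inputs).last_hidden_state  # shape (1, T, 768)
        return hidden.mean(dim=1).squeeze(0).numpy()

    @torch.no_grad()
    def semantic_embedding(wave: np.ndarray) -> np.ndarray:
        """Mean-pool Whisper encoder states into one semantic vector."""
        inputs = whisper_fe(wave, sampling_rate=SR, return_tensors="pt")
        hidden = whisper.encoder(inputs.input_features).last_hidden_state
        return hidden.mean(dim=1).squeeze(0).numpy()  # (512,) for whisper-base

    def fused_features(waves) -> np.ndarray:
        """Early fusion: concatenate both embeddings per recording."""
        return np.stack([
            np.concatenate([acoustic_embedding(w), semantic_embedding(w)])
            for w in waves
        ])

    # Hypothetical data: `waves` is a list of 16 kHz arrays, `y` the severity
    # labels, `speakers` the speaker ID of each recording.
    # X = fused_features(waves)
    # cv = GroupKFold(n_splits=5)  # no speaker shared between train and test
    # scores = cross_val_score(GradientBoostingClassifier(), X, y,
    #                          groups=speakers, cv=cv, scoring="f1_macro")

Grouping the folds by speaker ID ensures that no speaker's recordings appear in both the training and test partitions, which is what a speaker-independent cross-validation setting requires.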