Andreas Scherbakov

2021

pdf abs
Anlirika: An LSTM–CNN Flow Twister for Spoken Language Identification
Andreas Scherbakov | Liam Whittle | Ritesh Kumar | Siddharth Singh | Matthew Coleman | Ekaterina Vylomova
Proceedings of the Third Workshop on Computational Typology and Multilingual NLP

The paper presents Anlirika’s submission to SIGTYP 2021 Shared Task on Robust Spoken Language Identification. The task aims at building a robust system that generalizes well across different domains and speakers. The training data is limited to a single domain only with predominantly single speaker per language while the validation and test data samples are derived from diverse dataset and multiple speakers. We experiment with a neural system comprising a combination of dense, convolutional, and recurrent layers that are designed to perform better generalization and obtain speaker-invariant representations. We demonstrate that the task in its constrained form (without making use of external data or augmentation the train set with samples from the validation set) is still challenging. Our best system trained on the data augmented with validation samples achieves 29.9% accuracy on the test data.

2020

pdf abs
The UniMelb Submission to the SIGMORPHON 2020 Shared Task 0: Typologically Diverse Morphological Inflection
Andreas Scherbakov
Proceedings of the 17th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology

The paper describes the University of Melbourne’s submission to the SIGMORPHON 2020 Shared Task 0: Typologically Diverse Morphological Inflection. Our team submitted three systems in total, two neural and one non-neural. Our analysis of systems’ performance shows positive effects of newly introduced data hallucination technique that we employed in one of neural systems, especially in low-resource scenarios. A non-neural system based on observed inflection patterns shows optimistic results even in its simple implementation (>75% accuracy for 50% of languages). With possible improvement within the same modeling principle, accuracy might grow to values above 90%.

2016

pdf
VectorWeavers at SemEval-2016 Task 10: From Incremental Meaning to Semantic Unit (phrase by phrase)
Andreas Scherbakov | Ekaterina Vylomova | Fei Liu | Timothy Baldwin
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

Co-authors

Fei Liu 1

Timothy Baldwin 1