Imen Laouirine
2024
TunArTTS: Tunisian Arabic Text-To-Speech Corpus
Imen Laouirine | Rami Kammoun | Fethi Bougares
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Being a low-resource language, the Tunisian dialect has no prior TTS research. In this paper, we present a speech corpus for Tunisian Arabic Text-to-Speech (TunArTTS) to initiate the development of end-to-end TTS systems for the Tunisian dialect. Our speech corpus is extracted from an online English and Tunisian Arabic dictionary. We were able to extract a mono-speaker speech corpus of more than 3 hours from a male speaker, sampled at 44.1 kHz. The corpus is processed and manually diacritized. Furthermore, we develop various TTS systems based on two approaches: training from scratch and transfer learning. Both Tacotron2 and FastSpeech2 were used and evaluated with subjective and objective metrics. The experimental results show that our best results are obtained with transfer learning from a model pre-trained on the English LJSpeech dataset. This model obtained a mean opinion score (MOS) of 3.88. TunArTTS will be publicly available for research purposes along with a demo of the baseline TTS system.
Keywords: Tunisian Dialect, Text-To-Speech, Low-resource, Transfer Learning, TunArTTS
2023
ELYADATA at WojoodNER Shared Task: Data and Model-centric Approaches for Arabic Flat and Nested NER
Imen Laouirine | Haroun Elleuch | Fethi Bougares
Proceedings of ArabicNLP 2023
This paper describes our submissions to the WojoodNER shared task organized during the first ArabicNLP conference. We participated in the two proposed sub-tasks of flat and nested Named Entity Recognition (NER). Our systems ranked first out of eight in nested NER and third out of eleven in flat NER. All our primary submissions are based on DiffusionNER models (Shen et al., 2023), in which the NER task is formulated as a boundary-denoising diffusion process. Our experiments on nested WojoodNER achieve the best results, with a micro F1-score of 93.73%. For the flat sub-task, our primary system was the third-best system, with a micro F1-score of 91.92%.