Yusser Al Ghussin


2025

Saarland-Groningen at NADI 2025 Shared Task: Effective Dialectal Arabic Speech Processing under Data Constraints
Badr M. Abdullah | Yusser Al Ghussin | Zena Al-Khalili | Ömer Tarik Özyilmaz | Matias Valdenegro-Toro | Simon Ostermann | Dietrich Klakow
Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks

TenseLoC: Tense Localization and Control in a Multilingual LLM
Ariun-Erdene Tumurchuluun | Yusser Al Ghussin | David Mareček | Josef van Genabith | Koel Dutta Chowdhury
Proceedings of the 5th Workshop on Multilingual Representation Learning (MRL 2025)

Multilingual language models excel across languages, yet how they internally encode grammatical tense remains largely unclear. We investigate how decoder-only transformers represent, transfer, and control tense across eight typologically diverse languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. We construct a synthetic tense-annotated dataset and apply a combination of probing, causal analysis, feature disentanglement, and model steering to LLaMA-3.1 8B. We show that tense emerges as a distinct signal from early layers onward and transfers most strongly within the same language family. Causal tracing reveals that attention outputs around layer 16 consistently carry cross-lingually transferable tense information. Leveraging sparse autoencoders in this subspace, we isolate and steer English tense-related features, improving target-tense prediction accuracy by up to 11% in a downstream cloze task.
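
A minimal sketch of the steering step described in the abstract, under assumptions not taken from the paper: it presumes a unit-norm tense feature direction has already been extracted from a sparse autoencoder trained on layer-16 attention outputs and saved to a file, and adds it to the attention output via a forward hook. The file name, hook placement, and steering strength below are illustrative, not the authors' code.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL = "meta-llama/Llama-3.1-8B"  # model studied in the paper
    LAYER = 16                         # layer where tense info was localized
    ALPHA = 4.0                        # hypothetical steering strength

    tok = AutoTokenizer.from_pretrained(MODEL)
    model = AutoModelForCausalLM.from_pretrained(MODEL, torch_dtype=torch.bfloat16)

    # Hypothetical artifact: a unit-norm tense direction, e.g. one SAE decoder row.
    tense_dir = torch.load("past_tense_feature.pt")  # shape: (hidden_size,)

    def steer(module, inputs, output):
        # Add the tense direction to the attention output at every position.
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + ALPHA * tense_dir.to(hidden)
        return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

    handle = model.model.layers[LAYER].self_attn.register_forward_hook(steer)
    ids = tok("Yesterday she ___ to the market.", return_tensors="pt")
    print(tok.decode(model.generate(**ids, max_new_tokens=5)[0]))
    handle.remove()  # restore unsteered behavior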

2023

Exploring Paracrawl for Document-level Neural Machine Translation
Yusser Al Ghussin | Jingyi Zhang | Josef van Genabith
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics

Document-level neural machine translation (NMT) has outperformed sentence-level NMT on a number of datasets. However, document-level NMT is still not widely adopted in real-world translation systems, mainly due to the lack of large-scale general-domain training data for document-level NMT. We examine the effectiveness of using Paracrawl for learning document-level translation. Paracrawl is a large-scale parallel corpus crawled from the Internet and contains data from various domains. The official Paracrawl corpus was released as parallel sentences (extracted from parallel webpages), and previous work therefore used Paracrawl only for learning sentence-level translation. In this work, we extract parallel paragraphs from Paracrawl parallel webpages using automatic sentence alignments, and we use the extracted parallel paragraphs as parallel documents for training document-level translation models. We show that document-level NMT models trained with only parallel paragraphs from Paracrawl can be used to translate real documents from TED, News, and Europarl, outperforming sentence-level NMT models. We also perform a targeted pronoun evaluation and show that document-level models trained with Paracrawl data can help with context-aware pronoun translation.
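
A hedged sketch of the paragraph-extraction idea, under an assumed input format: each webpage pair is a document-ordered list of automatically aligned sentence pairs, each carrying the paragraph index its source and target sentences came from. Consecutive pairs whose paragraph indices match on both sides are merged into one parallel paragraph. The data layout is an illustration, not the paper's actual pipeline.

    from itertools import groupby

    def parallel_paragraphs(aligned_pairs):
        """aligned_pairs: list of (src_sent, tgt_sent, src_para_id, tgt_para_id),
        in document order, from automatic sentence alignment of a webpage pair."""
        paragraphs = []
        # Group consecutive sentence pairs that stay within the same
        # paragraph on both the source and the target side.
        for _, group in groupby(aligned_pairs, key=lambda p: (p[2], p[3])):
            group = list(group)
            src = " ".join(p[0] for p in group)
            tgt = " ".join(p[1] for p in group)
            paragraphs.append((src, tgt))
        return paragraphs

    # Toy example with two paragraphs on each side:
    pairs = [
        ("Hello.", "Hallo.", 0, 0),
        ("How are you?", "Wie geht es dir?", 0, 0),
        ("Goodbye.", "Auf Wiedersehen.", 1, 1),
    ]
    for src, tgt in parallel_paragraphs(pairs):
        print(src, "|||", tgt)

The merged (src, tgt) pairs can then serve as parallel documents for training a document-level model, in the spirit of the approach the abstract describes.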