Abdullah Alrajeh


Fixing paper assignments

  1. Please select all papers that belong to the same person.
  2. Indicate below which author they should be assigned to.
Provide a valid ORCID iD here. This will be used to match future papers to this author.
Provide the name of the school or the university where the author has received or will receive their highest degree (e.g., Ph.D. institution for researchers, or current affiliation for students). This will be used to form the new author page ID, if needed.

TODO: "submit" and "cancel" buttons here


2025

pdf bib
Octopus: Towards Building the Arabic Speech LLM Suite
Sara Althubaiti | Vasista Sai Lodagala | Tjad Clark | Yousseif Ahmed Elshahawy | Daniel Izham | Abdullah Alrajeh | Aljawahrah Bin Tamran | Ahmed Ali
Proceedings of The Third Arabic Natural Language Processing Conference

We present Octopus, a first family of modular speech-language models designed for Arabic-English ASR, dialect identification, and speech translation. Built on Whisper-V3 and enhanced with large language models like ALLaM, LLaMA, and DeepSeek, Octopus bridges speech and text through a lightweight projection layer and Q-Former. To broaden its scope beyond speech, Octopus integrates BEATs, a general-purpose audio encoder allowing it to understand both linguistic and acoustic events. Despite its simplicity, this dual-encoder design supports robust performance across multilingual and code-switched scenarios. We also introduce TinyOctopus, a distilled variant using smaller models (Distil-Whisper + LLaMA3-1B / DeepSeek-1.5B), achieving competitive results with just a fraction of the parameters. Fine-tuning on synthetic code-switched data further boosts its performance. Octopus demonstrates the power of compact, extensible architectures in Arabic-centric speech modeling and sets the stage for unified multilingual audio-language understanding.

2014

pdf bib
Large-scale Reordering Model for Statistical Machine Translation using Dual Multinomial Logistic Regression
Abdullah Alrajeh | Mahesan Niranjan
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
Bayesian Reordering Model with Feature Selection
Abdullah Alrajeh | Mahesan Niranjan
Proceedings of the Ninth Workshop on Statistical Machine Translation