Mohab El-karef

Also published as: Mohab Elkaref

2023

NLPeople at SemEval-2023 Task 2: A Staged Approach for Multilingual Named Entity Recognition
Mohab Elkaref | Nathan Herr | Shinnosuke Tanaka | Geeth De Mel
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

The MultiCoNER II shared task aims at detecting complex, ambiguous named entities with fine-grained types in a low-context setting. Previous winning systems incorporated external knowledge bases to retrieve helpful contexts. In our submission we additionally propose splitting the NER task into two stages: a span extraction step and an entity classification step. Our results show that the former does not suffer comparably from the low-context setting, which in turn leads to higher overall performance for an external KB-assisted system. We achieve 3rd place on the multilingual track and an average of 6th place overall.

NLPeople at NADI 2023 Shared Task: Arabic Dialect Identification with Augmented Context and Multi-Stage Tuning
Mohab Elkaref | Movina Moses | Shinnosuke Tanaka | James Barry | Geeth Mel
Proceedings of ArabicNLP 2023

This paper presents the approach of the NLPeople team to the Nuanced Arabic Dialect Identification (NADI) 2023 shared task. Subtask 1 involves identifying the dialect of a source text at the country level. Our approach to Subtask 1 makes use of language-specific language models, a clustering and retrieval method to provide additional context to a target sentence, a fine-tuning strategy that makes use of the provided data from the 2020 and 2021 shared tasks, and finally, ensembling over the predictions of multiple models. Our submission achieves a macro-averaged F1 score of 87.27, ranking 1st among the participants in the task.

El-Kawaref at WojoodNER shared task: StagedNER for Arabic Named Entity Recognition
Nehal Elkaref | Mohab Elkaref
Proceedings of ArabicNLP 2023

Named Entity Recognition (NER) is the task of identifying word-units that correspond to mentions such as location, organization, person, or currency. In this shared task we tackle flat-entity classification for Arabic, where for each word-unit a single entity should be identified. To resolve the classification problem we propose StagedNER, a novel technique for fine-tuning NER downstream tasks that divides the learning process of a transformer model into two phases, where a model is tasked to learn sequence tags and then entity tags rather than learning both simultaneously for an input sequence. We create an ensemble of two base models using this method that yields an F1 performance of 90.03% on the validation set and 91.95% on the test set.

2021

A Joint Training Approach to Tweet Classification and Adverse Effect Extraction and Normalization for SMM4H 2021
Mohab Elkaref | Lamiece Hassan
Proceedings of the Sixth Social Media Mining for Health (#SMM4H) Workshop and Shared Task

In this work we describe our submissions to the Social Media Mining for Health (SMM4H) 2021 Shared Task. We investigated the effectiveness of a joint training approach to Task 1, specifically classification, extraction and normalization of Adverse Drug Effect (ADE) mentions in English tweets. Our approach performed well on the normalization task, achieving an above-average F1 score of 24%, but less so on classification and extraction, with F1 scores of 22% and 37% respectively. Our experiments also showed that a larger dataset with more negative examples led to stronger results than a smaller, more balanced dataset, even when both datasets have the same positive examples. Finally, we also submitted a tuned BERT model for Task 6: Classification of Covid-19 tweets containing symptoms, which achieved an above-average F1 score of 96%.
