Yassine Benajiba

2021

pdf abs
ODIST: Open World Classification via Distributionally Shifted Instances
Lei Shu | Yassine Benajiba | Saab Mansour | Yi Zhang
Findings of the Association for Computational Linguistics: EMNLP 2021

In this work, we address the open-world classification problem with a method called ODIST, open world classification via distributionally shifted instances. This novel and straightforward method can create out-of-domain instances from the in-domain training instances with the help of a pre-trained generative language model. Experimental results show that ODIST performs better than state-of-the-art decision boundary finding method.

2020

pdf abs
Aspect On: an Interactive Solution for Post-Editing the Aspect Extraction based on Online Learning
Mara Chinea-Rios | Marc Franco-Salvador | Yassine Benajiba
Proceedings of the Twelfth Language Resources and Evaluation Conference

The task of aspect extraction is an important component of aspect-based sentiment analysis. However, it usually requires an expensive human post-processing to ensure quality. In this work we introduce Aspect On, an interactive solution based on online learning that allows users to post-edit the aspect extraction with little effort. The Aspect On interface shows the aspects extracted by a neural model and, given a dataset, annotates its words with the corresponding aspects. Thanks to the online learning, Aspect On updates the model automatically and continuously improves the quality of the aspects displayed to the user. Experimental results show that Aspect On dramatically reduces the number of user clicks and effort required to post-edit the aspects extracted by the model.

2019

pdf abs
SymantoResearch at SemEval-2019 Task 3: Combined Neural Models for Emotion Classification in Human-Chatbot Conversations
Angelo Basile | Marc Franco-Salvador | Neha Pawar | Sanja Štajner | Mara Chinea Rios | Yassine Benajiba
Proceedings of the 13th International Workshop on Semantic Evaluation

In this paper, we present our participation to the EmoContext shared task on detecting emotions in English textual conversations between a human and a chatbot. We propose four neural systems and combine them to further improve the results. We show that our neural ensemble systems can successfully distinguish three emotions (SAD, HAPPY, and ANGRY) and separate them from the rest (OTHERS) in a highly-imbalanced scenario. Our best system achieved a 0.77 F1-score and was ranked fourth out of 165 submissions.

2017

pdf abs
MainiwayAI at IJCNLP-2017 Task 2: Ensembles of Deep Architectures for Valence-Arousal Prediction
Yassine Benajiba | Jin Sun | Yong Zhang | Zhiliang Weng | Or Biran
Proceedings of the IJCNLP 2017, Shared Tasks

This paper introduces Mainiway AI Labs submitted system for the IJCNLP 2017 shared task on Dimensional Sentiment Analysis of Chinese Phrases (DSAP), and related experiments. Our approach consists of deep neural networks with various architectures, and our best system is a voted ensemble of networks. We achieve a Mean Absolute Error of 0.64 in valence prediction and 0.68 in arousal prediction on the test set, both placing us as the 5th ranked team in the competition.

pdf abs
The Sentimental Value of Chinese Sub-Character Components
Yassine Benajiba | Or Biran | Zhiliang Weng | Yong Zhang | Jin Sun
Proceedings of the 9th SIGHAN Workshop on Chinese Language Processing

Sub-character components of Chinese characters carry important semantic information, and recent studies have shown that utilizing this information can improve performance on core semantic tasks. In this paper, we hypothesize that in addition to semantic information, sub-character components may also carry emotional information, and that utilizing it should improve performance on sentiment analysis tasks. We conduct a series of experiments on four Chinese sentiment data sets and show that we can significantly improve the performance in various tasks over that of a character-level embeddings baseline. We then focus on qualitatively assessing multiple examples and trying to explain how the sub-character components affect the results in each case.

2012

pdf
Grading the Quality of Medical Evidence
Binod Gyawali | Thamar Solorio | Yassine Benajiba
BioNLP: Proceedings of the 2012 Workshop on Biomedical Natural Language Processing

2010

pdf
Arabic Named Entity Recognition: Using Features Extracted from Noisy Data
Yassine Benajiba | Imed Zitouni | Mona Diab | Paolo Rosso
Proceedings of the ACL 2010 Conference Short Papers

pdf
Enhancing Mention Detection Using Projection via Aligned Corpora
Yassine Benajiba | Imed Zitouni
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

pdf abs
Arabic Word Segmentation for Better Unit of Analysis
Yassine Benajiba | Imed Zitouni
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

The Arabic language has a very rich morphology where a word is composed of zero or more prefixes, a stem and zero or more suffixes. This makes Arabic data sparse compared to other languages, such as English, and consequently word segmentation becomes very important for many Natural Language Processing tasks that deal with the Arabic language. We present in this paper two segmentation schemes that are morphological segmentation and Arabic TreeBank segmentation and we show their impact on an important natural language processing task that is mention detection. Experiments on Arabic TreeBank corpus show 98.1% accuracy on morphological segmentation and 99.4% on morphological segmentation. We also discuss the importance of segmenting the text; experiments show up to 6F points improvement of the mention detection system performance when morphological segmentation is used instead of not segmenting the text. Obtained results also show up to 3F points improvement is achieved when the appropriate segmentation style is used.

pdf
Arabic Mention Detection: Toward Better Unit of Analysis
Yassine Benajiba | Imed Zitouni
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics