Yasaman Boreshban
2023
RobustQA: A Framework for Adversarial Text Generation Analysis on Question Answering Systems
Yasaman Boreshban | Seyed Morteza Mirbostani | Seyedeh Fatemeh Ahmadi | Gita Shojaee | Fatemeh Kamani | Gholamreza Ghassem-Sani | Seyed Abolghasem Mirroshandel
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
Question answering (QA) systems have reached human-level accuracy; however, they are not sufficiently robust and remain vulnerable to adversarial examples. Adversarial attacks have recently been widely investigated in text classification, but there have been few research efforts on this topic in QA. In this article, we adapt attack algorithms widely used in text classification to QA systems and evaluate the impact of various attack methods on QA systems at the character, word, and sentence levels. Furthermore, we have developed a new framework, named RobustQA, as the first open-source toolkit for investigating textual adversarial attacks in QA systems. RobustQA consists of seven modules: Tokenizer, Victim Model, Goals, Metrics, Attacker, Attack Selector, and Evaluator. It currently supports six different attack algorithms and simplifies the development of new attack algorithms for QA. The source code and documentation of RobustQA are available at https://github.com/mirbostani/RobustQA.
Deep Active Learning for Morphophonological Processing
Seyed Morteza Mirbostani | Yasaman Boreshban | Salam Khalifa | SeyedAbolghasem Mirroshandel | Owen Rambow
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Building a system for morphological processing is a challenging task in morphologically complex languages like Arabic. Although some deep-learning-based models achieve successful results, they rely on a large amount of annotated data. Building such datasets, especially for some of the lower-resource Arabic dialects, is very difficult, time-consuming, and expensive. In addition, some parts of the annotated data do not contain useful information for training machine learning models. Active learning strategies allow the learning algorithm to select the most informative samples for annotation. Little research has focused on applying active learning to morphological inflection and morphophonological processing. In this paper, we propose a deep active learning method for this task. Our experiments on Egyptian Arabic show that with only about 30% of the annotated data, we achieve the same results as the state-of-the-art model trained on the whole dataset.
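
As an illustration of how an active learner can pick "the most informative samples", the sketch below ranks unlabeled items by the entropy of a model's predicted distribution and returns the most uncertain ones. This is generic uncertainty sampling, chosen here as an assumption; the abstract does not state which selection strategy the paper uses. The function names and the example probabilities are hypothetical.

```python
import math
from typing import List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_annotation(pool_probs: List[Sequence[float]], budget: int) -> List[int]:
    """Return the indices of the `budget` pool items with the highest entropy.

    `pool_probs[i]` is the model's predicted distribution for unlabeled item i.
    Higher entropy means the model is less certain, so under this (assumed)
    strategy the item is treated as more informative to annotate.
    """
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical model outputs for four unlabeled items (each row sums to 1).
pool = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.40, 0.35, 0.25],  # fairly uncertain
    [0.70, 0.20, 0.10],  # moderately confident
    [0.34, 0.33, 0.33],  # close to uniform, most uncertain
]

chosen = select_for_annotation(pool, budget=2)
print(chosen)
```

In a full active learning loop, the selected items would be sent for annotation, added to the training set, and the model retrained before the next selection round.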