Question Answering Classification for Amharic Social Media Community Based Questions
Tadesse Destaw Belay, Seid Muhie Yimam, Abinew Ayele, Chris Biemann
Abstract
In this work, we build a Question Answering (QA) classification dataset from a social media platform, namely the public Telegram channel @AskAnythingEthiopia. The channel has more than 78k subscribers and has existed since May 31, 2019. It allows users to ask questions from various domains, such as politics, economics, health, and education. Questions are posted in Amharic, in English, or in Amharic written in Latin script. Because the questions are code-mixed, we apply different strategies to pre-process the dataset. As part of the pre-processing tools, we build a Latin-to-Ethiopic script transliteration tool. We collect 8k Amharic and 24k transliterated questions and develop deep learning-based question answering classifiers that attain an F-score as high as 57.29 across 20 question classes (categories). The datasets and pre-processing scripts are open-sourced to facilitate further research on Amharic community-based question answering.
- Anthology ID: 2022.sigul-1.18
- Volume: Proceedings of the 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages
- Month: June
- Year: 2022
- Address: Marseille, France
- Editors: Maite Melero, Sakriani Sakti, Claudia Soria
- Venue: SIGUL
- SIG: SIGUL
- Publisher: European Language Resources Association
- Pages: 137–145
- URL: https://aclanthology.org/2022.sigul-1.18
- Cite (ACL): Tadesse Destaw Belay, Seid Muhie Yimam, Abinew Ayele, and Chris Biemann. 2022. Question Answering Classification for Amharic Social Media Community Based Questions. In Proceedings of the 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages, pages 137–145, Marseille, France. European Language Resources Association.
- Cite (Informal): Question Answering Classification for Amharic Social Media Community Based Questions (Belay et al., SIGUL 2022)
- PDF: https://preview.aclanthology.org/nschneid-patch-5/2022.sigul-1.18.pdf
- Code: uhh-lt/amharicmodels