Atanu Mandal
2023
IACS-LRILT: Machine Translation for Low-Resource Indic Languages
Dhairya Suman
|
Atanu Mandal
|
Santanu Pal
|
Sudip Naskar
Proceedings of the Eighth Conference on Machine Translation
Even though, machine translation has seen huge improvements in the the last decade, translation quality for Indic languages is still underwhelming, which is attributed to the small amount of parallel data available. In this paper, we present our approach to mitigate the issue of the low amount of parallel training data availability for Indic languages, especially for the language pair English-Manipuri and Assamese-English. Our primary submission for the Manipuri-to-English translation task provided the best scoring system for this language direction. We describe about the systems we built in detail and our findings in the process.
Attentive Fusion: A Transformer-based Approach to Multimodal Hate Speech Detection
Atanu Mandal
|
Gargi Roy
|
Amit Barman
|
Indranil Dutta
|
Sudip Kumar Naskar
Proceedings of the 20th International Conference on Natural Language Processing (ICON)
With the recent surge and exponential growth of social media usage, scrutinizing social media content for the presence of any hateful content is of utmost importance. Researchers have been diligently working since the past decade on distinguishing between content that promotes hatred and content that does not. Traditionally, the main focus has been on analyzing textual content. However, recent research attempts have also commenced into the identification of audio-based content. Nevertheless, studies have shown that relying solely on audio or text-based content may be ineffective, as recent upsurge indicates that individuals often employ sarcasm in their speech and writing. To overcome these challenges, we present an approach to identify whether a speech promotes hate or not utilizing both audio and textual representations. Our methodology is based on the Transformer framework that incorporates both audio and text sampling, accompanied by our very own layer called “Attentive Fusion”. The results of our study surpassed previous stateof-the-art techniques, achieving an impressive macro F1 score of 0.927 on the Test Set.
2021
JUNLP@DravidianLangTech-EACL2021: Offensive Language Identification in Dravidian Langauges
Avishek Garain
|
Atanu Mandal
|
Sudip Kumar Naskar
Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages
Offensive language identification has been an active area of research in natural language processing. With the emergence of multiple social media platforms offensive language identification has emerged as a need of the hour. Traditional offensive language identification models fail to deliver acceptable results as social media contents are largely in multilingual and are code-mixed in nature. This paper tries to resolve this problem by using IndicBERT and BERT architectures, to facilitate identification of offensive languages for Kannada-English, Malayalam-English, and Tamil-English code-mixed language pairs extracted from social media. The presented approach when evaluated on the test corpus provided precision, recall, and F1 score for language pair Kannada-English as 0.62, 0.71, and 0.66, respectively, for language pair Malayalam-English as 0.77, 0.43, and 0.53, respectively, and for Tamil-English as 0.71, 0.74, and 0.72, respectively.
Search
Co-authors
- Sudip Kumar Naskar 3
- Dhairya Suman 1
- Santanu Pal 1
- Avishek Garain 1
- Gargi Roy 1
- show all...