2023
Mytho-Annotator: An Annotation tool for Indian Hindu Mythology
Apurba Paul | Anupam Mondal | Sainik Mahata | Srijan Seal | Prasun Sarkar | Dipankar Das
Proceedings of the 20th International Conference on Natural Language Processing (ICON)
Mythology is a collection of myths, especially one belonging to a particular religious or cultural tradition. We observed that an annotation tool is essential for identifying important and complex information in mythological texts or corpora. Additionally, obtaining high-quality annotated corpora for complex information extraction, including labeled text segments, is an expensive and time-consuming process. Hence, in this paper, we have designed and deployed an annotation tool for Hindu mythology, presented as Mytho-Annotator. It is an easy-to-use, web-based text annotation tool powered by Natural Language Processing (NLP). The tool primarily labels three categories: named entities, relationships, and event entities. It offers a comprehensive and adaptable annotation paradigm.
2021
Classification of COVID19 tweets using Machine Learning Approaches
Anupam Mondal | Sainik Mahata | Monalisa Dey | Dipankar Das
Proceedings of the Sixth Social Media Mining for Health (#SMM4H) Workshop and Shared Task
The reported work describes our participation in the “Classification of COVID19 tweets containing symptoms” shared task, organized by the “Social Media Mining for Health Applications (SMM4H)” workshop. The paper describes two machine learning approaches used to build a three-class classification system that categorizes COVID19-related tweets into self-reports, non-personal reports, and literature/news mentions. The steps for pre-processing tweets, extracting features, and developing the machine learning models are described in detail. When evaluated by the organizers, the two models achieved F1 scores of 0.93 and 0.92, respectively.
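As a rough illustration of how an F1 score can be computed for a three-class tweet classifier, the sketch below calculates per-class F1 and their unweighted (macro) average in plain Python. The label names and toy predictions are hypothetical, and the abstract does not say which averaging scheme the organizers used, so macro-averaging here is an assumption rather than the shared task's official metric.

```python
def f1_per_class(gold, pred):
    """Return a dict mapping each label to its F1 score."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    scores = {}
    for label in sorted(set(gold) | set(pred)):
        pairs = list(zip(gold, pred))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the label never occurs.
        scores[label] = (2 * tp / denom) if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 (an assumed averaging scheme)."""
    scores = f1_per_class(gold, pred)
    return sum(scores.values()) / len(scores)


# Hypothetical toy labels for the three classes named in the abstract.
gold = ["self_report", "self_report", "non_personal", "lit_news"]
pred = ["self_report", "non_personal", "non_personal", "lit_news"]
per_class = f1_per_class(gold, pred)
overall = macro_f1(gold, pred)
```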
2018
A Content-based Recommendation System for Medical Concepts: Disease and Symptom
Anupam Mondal | Dipankar Das | Sivaji Bandyopadhyay
Proceedings of the 15th International Conference on Natural Language Processing
WME 3.0: An Enhanced and Validated Lexicon of Medical Concepts
Anupam Mondal | Dipankar Das | Erik Cambria | Sivaji Bandyopadhyay
Proceedings of the 9th Global Wordnet Conference
Information extraction in the medical domain is laborious and time-consuming due to the insufficient number of domain-specific lexicons and the lack of involvement of domain experts such as doctors and medical practitioners. Thus, in the present work, we are motivated to design a new lexicon, WME 3.0 (WordNet of Medical Events), which contains over 10,000 medical concepts along with their part of speech, gloss (descriptive explanation), polarity score, sentiment, similar sentiment words, category, affinity score, and gravity score features. In addition, manual annotators validated the medical concepts of WME 3.0, both overall and at the individual category level, using Cohen’s Kappa agreement metric. The agreement score indicates almost correct identification of medical concepts and their assigned features in WME 3.0.
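For readers unfamiliar with the agreement metric mentioned above, the following is a minimal sketch of Cohen's Kappa for two annotators, computed as (observed agreement - chance agreement) / (1 - chance agreement). The category names and annotations are hypothetical and are not taken from WME 3.0; the function only illustrates the standard formula, not the authors' validation pipeline.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's Kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(1 for a, b in zip(labels_a, labels_b) if a == b) / n
    # Chance agreement: sum over labels of the product of marginal proportions.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical category annotations from two annotators.
annotator_1 = ["disease", "symptom", "drug", "disease"]
annotator_2 = ["disease", "symptom", "disease", "disease"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

A value of 1 means perfect agreement, while a value near 0 means agreement no better than chance.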
2017
Relationship Extraction based on Category of Medical Concepts from Lexical Contexts
Anupam Mondal | Dipankar Das | Sivaji Bandyopadhyay
Proceedings of the 14th International Conference on Natural Language Processing (ICON-2017)
JUNLP at IJCNLP-2017 Task 3: A Rank Prediction Model for Review Opinion Diversification
Monalisa Dey | Anupam Mondal | Dipankar Das
Proceedings of the IJCNLP 2017, Shared Tasks
The IJCNLP-17 Review Opinion Diversification (RevOpiD-2017) task was designed for ranking the top-k reviews of a product from a set of reviews, which assists in identifying a summarized output that expresses the opinion of the entire review set. The task is divided into three independent subtasks: subtask-A, subtask-B, and subtask-C. These subtasks select the top-k reviews based on the helpfulness, representativeness, and exhaustiveness, respectively, of the opinions expressed in the review set. To develop the modules and predict the rank of reviews for all three subtasks, we employed two well-known supervised classifiers, namely Naïve Bayes and Logistic Regression, on top of several features extracted from the provided datasets, such as the number of nouns, the number of verbs, and the number of sentiment words. Finally, the organizers validated the predicted outputs for all three subtasks using their evaluation metrics. For list size 5, the scores are 0.80 (mth) for subtask-A; 0.86 (cos), 0.87 (cos d), 0.71 (cpr), 4.98 (a-dcg), and 556.94 (wt) for subtask-B; and 10.94 (unwt) and 0.67 (recall) for subtask-C.
2016
WME: Sense, Polarity and Affinity based Concept Resource for Medical Events
Anupam Mondal | Dipankar Das | Erik Cambria | Sivaji Bandyopadhyay
Proceedings of the 8th Global WordNet Conference (GWC)
To overcome the lack of medical corpora, we have developed a WordNet for Medical Events (WME) for identifying medical terms and their sense-related information using a seed list. The initial WME resource contains 1654 medical terms or concepts. In the present research, we report the enhancement of WME to 6415 medical concepts along with their conceptual features, viz. Parts-of-Speech (POS), gloss, semantics, polarity, sense, and affinity. Several polarity lexicons, viz. SentiWordNet, SenticNet, Bing Liu’s subjectivity list, and Taboada’s adjective list, were used together with WordNet synonyms and hyponyms for the expansion. The semantics feature guided us in building a network of semantic co-reference relations between related medical concepts. These features help to prepare a medical concept network for better sense-relation-based visualization. Finally, we evaluated the resource with respect to the Adaptive Lesk Algorithm and conducted an agreement analysis to validate the expanded WME resource.