Loitongbam Gyanendro Singh


2020

Sentiment Analysis of Tweets using Heterogeneous Multi-layer Network Representation and Embedding
Loitongbam Gyanendro Singh | Anasua Mitra | Sanasam Ranbir Singh
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Sentiment classification on tweets often needs to deal with the problems of under-specificity, noise, and multilingual content. This study proposes a heterogeneous multi-layer network-based representation of tweets to generate multiple representations of a tweet and address the above issues. The generated representations are further ensembled and classified using a neural early fusion approach. Further, we propose a centrality-aware random walk for node embedding and tweet representations suitable for the multi-layer network. Experimental analyses show that the proposed method can address the problems of under-specificity, noisy text, and multilingual content present in a tweet and provides better classification performance than its text-based counterparts. Further, the proposed centrality-aware random walk provides better representations than its unbiased and other biased counterparts.
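
The abstract does not say which centrality measure or transition rule the centrality-aware random walk uses. As a minimal sketch only, the Python below assumes degree centrality and picks each next node with probability proportional to its centrality. The toy graph, the node names, and the functions `degree_centrality` and `centrality_biased_walk` are hypothetical illustrations, not the authors' method.

```python
import random


def degree_centrality(graph):
    """Return each node's degree divided by (n - 1)."""
    n = len(graph)
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def centrality_biased_walk(graph, start, length, centrality, rng):
    """Generate a walk of up to `length` nodes starting at `start`.

    Assumption: the next node is drawn from the current node's neighbours
    with probability proportional to the neighbour's centrality score.
    """
    walk = [start]
    while len(walk) < length:
        neighbors = graph[walk[-1]]
        if not neighbors:
            break  # dead end: stop the walk early
        weights = [centrality[nbr] for nbr in neighbors]
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


# Hypothetical undirected toy graph mixing tweet, hashtag, and keyword nodes.
graph = {
    "tweet1": ["#happy", "good"],
    "tweet2": ["#happy", "good", "bad"],
    "#happy": ["tweet1", "tweet2"],
    "good": ["tweet1", "tweet2"],
    "bad": ["tweet2"],
}

centrality = degree_centrality(graph)
walk = centrality_biased_walk(graph, "tweet1", 6, centrality, random.Random(0))
print(walk)
```

Walks generated this way could be fed to a skip-gram style model to learn node embeddings; an unbiased walk corresponds to giving every neighbour the same weight.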

2016

Automatic Syllabification for Manipuri language
Loitongbam Gyanendro Singh | Lenin Laitonjam | Sanasam Ranbir Singh
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Developing hand-crafted rules for syllabifying the words of a language is an expensive task. This paper proposes several data-driven methods for automatic syllabification of words written in the Manipuri language. Manipuri is one of the scheduled Indian languages. First, we propose a language-independent rule-based approach formulated using entropy-based phonotactic segmentation. Second, we cast the syllabification problem as a sequence labeling problem and investigate its effect using various sequence labeling approaches. Third, we combine the sequence labeling and rule-based methods and investigate the performance of the hybrid approach. Experimental observations show that the proposed methods outperform the baseline rule-based method. The entropy-based phonotactic segmentation provides a word accuracy of 96%, CRF (the sequence labeling approach) provides 97%, and the hybrid approach provides 98% word accuracy.
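
To illustrate what casting syllabification as sequence labeling can look like, the sketch below tags each character as `B` (begins a syllable) or `I` (inside a syllable) and converts between tags and syllables. The B/I tag scheme, the romanized example split, the function names, and the definition of word accuracy as the fraction of words whose whole syllabification matches are all assumptions made for illustration; the abstract does not specify them.

```python
def syllables_to_labels(syllables):
    """Flatten a syllabified word into characters and per-character B/I tags."""
    chars, labels = [], []
    for syllable in syllables:
        for i, ch in enumerate(syllable):
            chars.append(ch)
            labels.append("B" if i == 0 else "I")
    return chars, labels


def labels_to_syllables(chars, labels):
    """Rebuild syllables from characters and their predicted B/I tags."""
    syllables = []
    for ch, label in zip(chars, labels):
        if label == "B" or not syllables:
            syllables.append(ch)  # start a new syllable
        else:
            syllables[-1] += ch   # extend the current syllable
    return syllables


def word_accuracy(gold, predicted):
    """Fraction of words whose predicted syllabification matches gold exactly."""
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Hypothetical romanized example, not a verified linguistic analysis.
chars, labels = syllables_to_labels(["ma", "ni", "pu", "ri"])
print(list(zip(chars, labels)))
print(labels_to_syllables(chars, labels))
```

With this encoding, any sequence labeler (a CRF, for example) can be trained to predict the tag sequence from the character sequence, and `labels_to_syllables` turns its output back into syllables for scoring.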