Loitongbam Gyanendro Singh

2020

Sentiment Analysis of Tweets using Heterogeneous Multi-layer Network Representation and Embedding
Loitongbam Gyanendro Singh | Anasua Mitra | Sanasam Ranbir Singh
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Sentiment classification of tweets often has to deal with under-specificity, noise, and multilingual content. This study proposes a heterogeneous multi-layer network-based representation of tweets that generates multiple representations of a tweet to address these issues. The generated representations are then ensembled and classified using a neural early-fusion approach. Further, we propose a centrality-aware random walk for node embedding and tweet representation suited to the multi-layer network. Experimental analyses show that the proposed method can address under-specificity, noisy text, and multilingual content in tweets and provides better classification performance than its text-based counterparts. The proposed centrality-aware random walk also provides better representations than unbiased and other biased counterparts.

2016

Automatic Syllabification for Manipuri language
Loitongbam Gyanendro Singh | Lenin Laitonjam | Sanasam Ranbir Singh
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Developing hand-crafted rules for syllabifying the words of a language is an expensive task. This paper proposes several data-driven methods for automatic syllabification of words written in the Manipuri language, one of the scheduled Indian languages. First, we propose a language-independent rule-based approach formulated using entropy-based phonotactic segmentation. Second, we cast syllabification as a sequence labeling problem and investigate various sequence labeling approaches. Third, we combine sequence labeling with the rule-based method and investigate the performance of the hybrid approach. Experimental observations show that the proposed methods outperform the baseline rule-based method: entropy-based phonotactic segmentation achieves a word accuracy of 96%, CRF (a sequence labeling approach) achieves 97%, and the hybrid approach achieves 98%.

Venues

COLING (1)
EMNLP (1)