Nikhil Kumar


2019

India is one of the unique countries in the world with a legacy of linguistic diversity. Most of these languages are influenced by English, which leads to a large volume of code-mixed text on social media. The abundance of this code-mixed text makes it an important research area for Natural Language Processing (NLP). This paper proposes a novel attention-based deep learning technique for sentiment classification on code-mixed text (ACCMT) in Hindi-English. The proposed architecture fuses character-level and word-level features. The lack of suitable word embeddings to represent code-mixed text is another important hurdle for this class of NLP tasks. This paper therefore also proposes a novel technique for preparing word embeddings for code-mixed text, built from two separately trained word embeddings, one on Romanized Hindi and one on English. The resulting embedding is then used in the proposed deep learning architecture for robust classification. The proposed technique achieves 71.97% accuracy, which exceeds the baseline accuracy.
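As a rough illustration of how two separately trained embeddings might be combined into a single lookup for code-mixed text, consider the minimal sketch below. It is not the method of this paper, whose combination rule is not stated in the abstract. The function names `lookup_code_mixed` and `embed_sentence`, the averaging rule for tokens found in both vocabularies, and the zero vector for unknown tokens are all assumptions made for illustration only.

```python
from typing import Dict, List

Vector = List[float]


def lookup_code_mixed(
    token: str,
    hindi_vecs: Dict[str, Vector],
    english_vecs: Dict[str, Vector],
    dim: int,
) -> Vector:
    """Return one vector for a token from two separately trained tables.

    Assumption (not from the paper): tokens present in both tables are
    averaged, tokens in one table use that table's vector, and unknown
    tokens map to a zero vector of length `dim`.
    """
    key = token.lower()
    hi = hindi_vecs.get(key)
    en = english_vecs.get(key)
    if hi is not None and en is not None:
        return [(a + b) / 2.0 for a, b in zip(hi, en)]
    if hi is not None:
        return list(hi)
    if en is not None:
        return list(en)
    return [0.0] * dim


def embed_sentence(
    sentence: str,
    hindi_vecs: Dict[str, Vector],
    english_vecs: Dict[str, Vector],
    dim: int,
) -> List[Vector]:
    """Embed a whitespace-tokenized code-mixed sentence token by token."""
    return [
        lookup_code_mixed(tok, hindi_vecs, english_vecs, dim)
        for tok in sentence.split()
    ]


if __name__ == "__main__":
    # Toy 2-dimensional tables; real embeddings would be trained on corpora.
    hindi = {"accha": [1.0, 0.0], "me": [0.0, 2.0]}
    english = {"movie": [0.0, 1.0], "me": [2.0, 0.0]}
    print(embed_sentence("Accha movie me xyz", hindi, english, dim=2))
```

Both tables must share the same dimensionality for the averaging step to make sense. A real system would also need a policy for ambiguous Romanized tokens (such as "me" in the toy data), which this sketch resolves only by simple averaging.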