(1) Feature representation
For the embedding layer, the system applies several preprocessing steps: lowercasing, punctuation segmentation, emoticon segmentation, and abbreviation expansion.
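The preprocessing steps above can be sketched with simple regular expressions; the exact rules, emoticon inventory, and function names here are illustrative assumptions, not the system's actual implementation.

```python
import re

# Hypothetical preprocessing sketch: lowercase, then split emoticons and
# punctuation off as separate tokens before whitespace tokenization.
EMOTICON = re.compile(r"(:-?\)|:-?\(|:-?D|;-?\)|:-?P)", re.IGNORECASE)
PUNCT = re.compile(r"([!?.,\"'])")

def preprocess(tweet):
    text = tweet.lower()                # lowercase
    text = EMOTICON.sub(r" \1 ", text)  # segment emoticons as tokens
    text = PUNCT.sub(r" \1 ", text)     # segment punctuation from words
    return text.split()                 # whitespace tokenization

print(preprocess("Great news!!:)"))     # ['great', 'news', '!', '!', ':)']
```

Segmenting emoticons before punctuation matters here, since a punctuation pass that splits `:` and `)` would destroy `:)` as a token.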
GloVe is used to pre-train 200-dimensional word vectors on a large Twitter corpus.
300-dimensional emoji vectors are used; PCA reduces them to the word-vector dimension so the two fit together.
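A minimal sketch of the PCA step, reducing 300-dimensional emoji vectors to the 200-dimensional word-vector space via a plain NumPy SVD; the emoji inventory size (500) and the random data are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
emoji_vecs = rng.normal(size=(500, 300))   # 500 emoji vectors, 300-d (assumed)

def pca_reduce(X, k):
    Xc = X - X.mean(axis=0)                # center each dimension
    # principal axes come from the SVD of the centered data matrix
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                   # project onto the top-k components

emoji_200 = pca_reduce(emoji_vecs, 200)
print(emoji_200.shape)                     # (500, 200)
```

After this step both emoji and word embeddings live in the same 200-dimensional space, so a sentence can be embedded as one sequence of vectors.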

(2) Classification model
The classifier is a two-layer bidirectional long short-term memory (BiLSTM) network, with a highway connection cascading the first layer's output into the second layer's output.
An attention mechanism is applied to the BiLSTM's output.
A double-attention model handles emoji that appear in a sentence.
Finally, a fully connected layer performs the classification.
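The highway combination and the attention pooling above can be sketched in NumPy; the timestep count, hidden size, and all weights here are random placeholder assumptions, not trained parameters.

```python
import numpy as np

rng = np.random.default_rng(1)
T, d = 10, 400                     # 10 timesteps, 400-d BiLSTM outputs (assumed)
h1 = rng.normal(size=(T, d))       # first BiLSTM layer's output
h2 = rng.normal(size=(T, d))       # second BiLSTM layer's output

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Highway connection: a learned sigmoid gate mixes the two layers' outputs,
# letting the first layer's features pass straight through when useful.
Wg = rng.normal(size=(d, d)) * 0.01
gate = 1 / (1 + np.exp(-(h1 @ Wg)))           # transform gate in (0, 1)
h = gate * h2 + (1 - gate) * h1               # gated combination, shape (T, d)

# Attention pooling: score each timestep, then take the weighted sum
# to get a fixed-size sentence representation for the classifier.
w = rng.normal(size=d) * 0.01                 # attention query vector (assumed)
alpha = softmax(np.tanh(h) @ w)               # attention weights over timesteps
sentence = alpha @ h                          # sentence vector, shape (400,)
print(sentence.shape)
```

The "double attention" for emoji would run a second such scoring pass with an emoji-conditioned query; only the single-attention case is sketched here.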

(3) Learning strategy
Cross entropy over the softmax output is used as the loss function.
The learning rate decays exponentially during training.
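Both pieces of the learning strategy are standard and can be written in a few lines; the base rate, decay rate, and decay interval below are assumed values for illustration.

```python
import numpy as np

def softmax_cross_entropy(logits, label):
    # numerically stable log-softmax followed by negative log-likelihood
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

def exp_decayed_lr(base_lr, step, decay_steps, decay_rate):
    # the learning rate shrinks by `decay_rate` every `decay_steps` updates
    return base_lr * decay_rate ** (step / decay_steps)

loss = softmax_cross_entropy(np.array([2.0, 1.0, 0.1]), label=0)
lr = exp_decayed_lr(0.001, step=1000, decay_steps=1000, decay_rate=0.9)
print(loss, lr)
```

Subtracting the max logit before exponentiating keeps the softmax stable for large logits without changing the result.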

(4) Ensemble
5-fold cross-validation is used to train five models, and their predictions are combined by soft voting to produce the final result.
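Soft voting simply averages the five models' predicted class probabilities and takes the argmax; the example count and class count below are assumptions, with random probabilities standing in for real model outputs.

```python
import numpy as np

rng = np.random.default_rng(2)
# shape (5 models, 3 examples, 20 classes): each row is a probability vector
probs = rng.dirichlet(np.ones(20), size=(5, 3))

avg = probs.mean(axis=0)      # average the five models' probabilities
pred = avg.argmax(axis=1)     # final label for each example
print(pred)
```

Averaging probabilities (soft voting) uses each model's confidence, unlike hard voting, which counts only the predicted labels.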

