training word 0.0042045
context word 0.004197274
english word 0.004090615
word information 0.004046979
word sequence 0.004019367
word candidate 0.003915252
dictionary word 0.003872987
word forms 0.003821152
corresponding word 0.003763572
modeling word 0.003759129
word frequency 0.003750102
word pairs 0.003737472
original word 0.003726162
word abbreviations 0.003663965
probable word 0.003655742
glish word 0.003641039
model training 0.00259689
language model 0.002374793
context words 0.002364274
english words 0.002257615
standard words 0.002191853
transformation model 0.00214883
model confidence 0.002117335
channel model 0.00210581
candidate words 0.002082252
markov model 0.002076511
crf model 0.002070736
dictionary words 0.002039987
mation model 0.002037881
model con 0.002033236
training data 0.00193852
nonstandard words 0.001938176
original words 0.001893162
model 0.00178292
data set 0.001749763
letter sequence 0.001632557
letter alignment 0.001561905
words 0.00155753
first letter 0.00153886
sms data 0.001533505
letter level 0.001475857
data pairs 0.001471492
data sets 0.001456456
annotated data 0.001455774
training set 0.001439183
twitter test 0.001425896
features character 0.001421112
text normalization 0.001420891
character sequence 0.001416775
single letter 0.001409356
twitter corpus 0.001403193
different feature 0.001391133
data collection 0.001381832
labeled data 0.001380345
general letter 0.001373763
letter transformation 0.00136963
previous letter 0.001363618
test set 0.00134782
letter combination 0.001336767
standard english 0.001334408
unified letter 0.001325338
common character 0.001308409
different systems 0.001306469
noisy text 0.001304349
generic letter 0.00129627
letter repetition 0.001291427
accuracy twitter 0.001272824
letter combinations 0.001272203
text message 0.001268565
letter switching 0.001264842
letter trans 0.001261476
frequent letter 0.001257714
letter transfor 0.001251796
letter chunks 0.001250277
letter repetitions 0.001250277
noisy training 0.001250195
phone text 0.001234229
system accuracy 0.001212932
text messages 0.001186523
training pair 0.001171169
large set 0.001167329
system performance 0.001165383
training pairs 0.001160912
common sequence 0.001149308
most text 0.001136249
sms test 0.001131562
english forms 0.001130707
message test 0.001123048
test tokens 0.001112667
different types 0.001106459
level features 0.001105311
twitter message 0.00110373
consecutive context 0.001094051
above training 0.001088895
translation approach 0.001079766
alignment algorithm 0.001074217
labeled training 0.001069765
following features 0.001060208
next character 0.001057098
test sets 0.001054513
