word normalization 0.003647265
standard word 0.003577304
word alignment 0.003495365
oov word 0.003448357
dictionary word 0.003441246
word level 0.003430134
word pairs 0.003427684
word candidates 0.003404034
word vector 0.003312418
unsupervised word 0.003301779
dard word 0.003259508
automatic word 0.003257181
continuous word 0.003239425
proper word 0.0032275
glish word 0.003203523
word embedding 0.003202209
unlikely word 0.003202209
rect word 0.003202209
word entries 0.003202209
isolated word 0.003202209
word clusters 0.003202209
similarity model 0.002320527
context words 0.002227762
language model 0.002210827
standard words 0.002172864
translation model 0.002146621
model probability 0.002103652
oov words 0.002043917
dictionary words 0.002036806
model score 0.002000565
error model 0.001932836
model scores 0.001899625
different feature 0.001861537
statistical model 0.00185979
dard words 0.001855068
individual words 0.001853376
markov model 0.001832512
proper words 0.00182306
above model 0.001814179
tual words 0.001813998
known words 0.001803282
crf model 0.001789415
reranking model 0.00178017
labeling model 0.001777255
channel model 0.001745945
text normalization 0.001733245
guage model 0.001718668
checking model 0.001717139
text information 0.001658347
text corpus 0.001647247
lexical features 0.001578905
words 0.0015777
probability feature 0.001515804
model 0.00149222
score features 0.00145135
language models 0.001412768
corpus similarity 0.001407434
noisy text 0.001398659
social text 0.001398349
semantic similarity 0.001397974
media text 0.001386813
training data 0.001370757
normalization performance 0.001365707
different methods 0.00136379
feature set 0.001360131
text domain 0.001343795
informal text 0.001339988
similarity score 0.001336652
feature weight 0.001317012
same letter 0.001314698
unlabeled text 0.001313861
system performance 0.001312417
lexical normalization 0.001301025
level similarity 0.001276301
global features 0.001274142
test data 0.001252999
data set 0.001240853
context information 0.001240289
string similarity 0.001227448
other information 0.001220943
different ways 0.00121593
contextual similarity 0.00120842
twitter data 0.001196213
features com 0.001179373
boolean features 0.001177094
feature weights 0.001162414
normalization results 0.001160881
standard sentence 0.001156434
previous normalization 0.001130568
normalization task 0.001126312
level normalization 0.001113119
level translation 0.001102395
normalization systems 0.001086046
english letter 0.001078149
normalization methods 0.00107175
various models 0.001065267
data sets 0.001060608
ing method 0.001059718
surface similarity 0.001059253
similarity mea 0.00104898
