normalization word 0.00352152
other word 0.002730032
text normalization 0.00268801
word nodes 0.002655774
noisy word 0.002543971
oov word 0.00250369
word node 0.002419809
word cor 0.002357991
word mapping 0.002355544
word mappings 0.002346065
center word 0.002343277
text data 0.002149988
normalization system 0.002072459
normalization approach 0.002015473
normalization sequence 0.002007002
model data 0.001997878
language model 0.001976071
language words 0.001954081
normalization lexicon 0.00190757
normalization candidates 0.001881131
text corpus 0.001838256
possible normalization 0.001821487
normalization candidate 0.001807108
normalization task 0.001800811
normalization dictionary 0.001794258
baseline normalization 0.001769696
normalization evaluation 0.00176918
normalization results 0.001764164
small normalization 0.001763714
probability normalization 0.001746244
main normalization 0.001734242
unsupervised normalization 0.001734026
other words 0.001722422
normalization problem 0.001717527
noisy text 0.001710461
correct normalization 0.001701327
entity normalization 0.00167481
normalization pairs 0.001669923
normalization equivalences 0.00166499
wrong normalization 0.001640704
normalization examples 0.001639691
equivalent normalization 0.00163904
media text 0.001629052
probable normalization 0.001626384
normalization prob 0.001624394
bad normalization 0.00162167
normalization meth 0.001620864
normalization lexicons 0.001619524
normalization sys 0.001619431
normalization candi 0.001616727
normalization lexi 0.001616357
appropriate normalization 0.001614967
inappropriate normalization 0.001611258
confident normalization 0.001610103
conservative normalization 0.001609554
fident normalization 0.001609108
normalization dur 0.001607822
normalization equiva 0.001607822
unlabeled text 0.001606604
many words 0.001589924
precision text 0.001569892
different data 0.001560638
noisy words 0.001536361
words candidate 0.001523298
text normaliza 0.001514867
dia text 0.001513896
similar words 0.001507319
glish text 0.001505335
beled text 0.001499352
spanish text 0.001498354
oov words 0.00149608
channel model 0.001437636
lexical similarity 0.001422987
domain data 0.001399043
normalization 0.00139886
clean words 0.001379579
lexicon data 0.001369548
similarity graph 0.001359417
translation system 0.001352533
training data 0.001352145
similarity approach 0.001346144
ized words 0.001328363
translation approach 0.001295547
noisy data 0.001282149
evaluation data 0.001231158
parallel data 0.001221043
string similarity 0.001204272
media data 0.00120074
data size 0.00119645
unlabeled data 0.001178292
phonetic similarity 0.001169713
system candidates 0.00115587
different systems 0.001153259
model 0.00113704
such systems 0.00113503
ing context 0.001129862
natural language 0.001129765
language processing 0.00112755
clean data 0.001125367
lexicon approach 0.001125323
