standard word 0.003872149
english word 0.003644068
dictionary word 0.003637091
word frequency 0.003605031
word table 0.003588305
word boundary 0.00352766
word sense 0.003519166
word formation 0.003516484
specific word 0.003479559
correct word 0.003471798
word stem 0.00346852
glish word 0.003451639
word construction 0.003448045
word share 0.003435296
context words 0.002316025
text normalization 0.00225057
standard words 0.002056699
normalization system 0.001969199
candidate words 0.001890022
common words 0.00188645
english words 0.001828618
dictionary words 0.001821641
nonstandard words 0.001787443
text corpus 0.001756928
standard text 0.001734759
study words 0.001694006
novel words 0.00167923
correct words 0.001656348
annotated words 0.001649782
twitter data 0.001633259
tionary words 0.001623406
intended words 0.001620394
jumbled words 0.00161575
sms normalization 0.001609353
normalization task 0.00157936
training data 0.001529551
normalization problem 0.001508944
context information 0.001506548
social text 0.001481375
phonetic similarity 0.001434949
normalization sys 0.001400264
words 0.00139991
normalization prob 0.001398188
twitter corpus 0.00139563
driven normalization 0.001388422
data set 0.001382044
contextual similarity 0.001371374
transformation model 0.001369183
informal text 0.001361635
ing data 0.001361232
news text 0.00133265
similarity threshold 0.001331664
model confidence 0.001329263
visual similarity 0.001325092
cial text 0.001313202
robust text 0.001312322
world text 0.001306167
channel model 0.001302842
local context 0.001299819
markov model 0.001259324
global context 0.001248311
different characters 0.00124365
different perspective 0.001242726
context vector 0.001230335
crf model 0.001228192
data sets 0.001214672
original data 0.00121274
phonemic similarity 0.001198531
different domains 0.001195226
top context 0.001177214
normalization 0.0011726
different perspec 0.001166695
different populations 0.001166695
prior context 0.001157816
similarity algo 0.001152396
cosine similarity 0.001149314
context vectors 0.001145056
context infor 0.001141035
system combination 0.001134261
system output 0.001130145
synthesis system 0.001129374
standard nlp 0.001118548
other candidate 0.001106541
standard dictionary 0.00107852
other nlp 0.001078188
labeling system 0.001075421
training pair 0.001066811
standard tokens 0.001066052
tion system 0.001064478
first letter 0.001059085
boundary features 0.001048623
token frequency 0.001041899
malization system 0.001033601
ization system 0.001025909
asr system 0.001021141
system cov 0.001020384
system effectiveness 0.001016995
training pairs 0.00101366
sulting system 0.001013357
global corpus 0.001011154
