lexical pairs 0.002370827
oov word 0.002137332
word error 0.002073619
word type 0.002046072
lexical normalisation 0.002037364
word frequency 0.002036848
context words 0.001926491
word match 0.001918154
word fre 0.001909927
lexical variants 0.001823833
similarity information 0.00176689
lexical variant 0.001757461
text similarity 0.001747729
oov words 0.001743732
lexical choice 0.001729272
string similarity 0.001700549
lexical vari 0.001683272
similarity measures 0.001661104
lexical creativity 0.001656841
lexical normalisa 0.001656841
contextual similarity 0.001621431
similarity methods 0.001621371
distributional similarity 0.001565953
unknown words 0.001561706
normalisation dictionary 0.00155387
known words 0.001546655
normalisation pairs 0.001534591
similarity scores 0.001534414
surrounding words 0.001513155
paired words 0.001513155
normalisation approach 0.001472719
lexical 0.0014368
similarity parameters 0.001436771
dictionary oov 0.001404798
tributional similarity 0.001398996
tweets similarity 0.001394482
butional similarity 0.001382306
other normalisation 0.001374534
ing pairs 0.001364961
form pairs 0.001339837
standard edit 0.001300261
dictionary lookup 0.001300083
words 0.00129224
different string 0.001289752
correct pairs 0.001289036
good approach 0.001285072
normalisation method 0.001284584
tion dictionary 0.001272391
test data 0.001269344
twitter data 0.001252382
context information 0.001243531
other methods 0.001237731
other approaches 0.001230217
noisy pairs 0.001226616
ing data 0.001225512
aspell dictionary 0.001218531
fixed dictionary 0.001210941
slang dictionary 0.001210195
standard form 0.001197917
malisation dictionary 0.001191932
isation dictionary 0.00117705
incorrect pairs 0.001175346
feasibility dictionary 0.00117462
malisation pairs 0.001172653
standard evaluation 0.001169014
similarity 0.00115761
natural language 0.001153856
language identification 0.001146792
edit distance 0.001116231
other types 0.001115101
microblog data 0.001113842
standard forms 0.001113374
such methods 0.001112096
malisation approach 0.001110781
different dictionaries 0.001102214
ternative approach 0.001094733
fast approach 0.001094733
development data 0.001094098
error model 0.001086568
language processing 0.001085349
normalisation task 0.001076839
character level 0.001071729
distance measure 0.001068152
normalisation methods 0.001064325
different rank 0.001063258
normalisation approaches 0.001056811
other parameters 0.001053131
truth data 0.001051441
raw data 0.001050809
small normalisation 0.001037143
other hand 0.001030031
such tokens 0.001028263
other meth 0.001018768
twitter corpus 0.00101801
croblog data 0.001015702
context tokens 0.001014179
gold standard 0.001013874
standard orthography 0.001012591
translation task 0.001011135
same way 0.001004349
