hindi word 0.003103674
language words 0.0030849700000000002
word level 0.0030044679999999997
embedded word 0.002926142
word origin 0.002888371
pos tag 0.0024886300000000004
other language 0.00241389
pos tagging 0.002387862
pos tags 0.002344582
pos tagger 0.002335442
hindi pos 0.002256484
experiments pos 0.002191478
pos tagset 0.0021775780000000003
same language 0.002152281
twitter pos 0.002097771
language identification 0.00207278
pos taggers 0.0020680390000000002
pos labels 0.002051546
stanford pos 0.002045705
universal pos 0.002044911
english words 0.00199056
hindi words 0.001964544
matrix language 0.001949269
source language 0.001915228
embedded language 0.001894322
such data 0.0018762789999999998
language labels 0.0018669160000000001
language identities 0.0018568740000000001
language identifica 0.00185062
language identity 0.001850365
language identifier 0.001849904
language iden 0.00184928
adjacent words 0.001746381
training data 0.001738199
language 0.00159614
words 0.00148883
hindi data 0.0014744749999999998
other languages 0.001385713
different models 0.001364388
media data 0.001357084
annotated data 0.0013544779999999999
linguistic features 0.001342101
tagging task 0.001331238
annotated corpus 0.001258487
context information 0.001250465
sms corpus 0.001211639
identification task 0.001200786
linguistic analysis 0.001190793
notated corpus 0.0011783380000000001
other script 0.001170879
corpus creation 0.001157277
such content 0.001154469
different experiments 0.0011382620000000001
english text 0.001114696
other hand 0.001097919
matrix information 0.001081215
other bilin 0.001072603
model 0.00106726
baseline results 0.001056582
training examples 0.001052935
test set 0.001043869
separate tag 0.001036941
shared task 0.001028658
same speech 0.001020908
hindi news 9.99215E-4
text normalization 9.87792E-4
hindi sentence 9.87572E-4
different kinds 9.83967E-4
different tagsets 9.83179E-4
media text 9.712890000000001E-4
text processing 9.508330000000001E-4
use models 9.395359999999999E-4
interesting linguistic 9.289560000000001E-4
linguistic point 9.24071E-4
tagging accuracies 9.19921E-4
indian languages 9.16629E-4
large number 9.13424E-4
lexical category 9.1212E-4
tional models 9.07589E-4
corpus 9.0277E-4
pipelined approach 8.977E-4
mixed text 8.91107E-4
linguistic complexity 8.84015E-4
tional text 8.837210000000001E-4
rate context 8.76577E-4
unnormalized text 8.739090000000001E-4
glish text 8.70663E-4
tagging accu 8.60062E-4
asian languages 8.54623E-4
identification module 8.53854E-4
multilingual con 8.48611E-4
output tags 8.39959E-4
monolingual tagger 8.36193E-4
dravidian languages 8.2446E-4
first experiment 8.242130000000001E-4
baseline system 8.21621E-4
other 8.1775E-4
hindi speakers 8.159770000000001E-4
guage identification 8.08645E-4
spelling normalization 7.93549E-4
