language model 0.00351164
bigram model 0.002605843
model this 0.002498144
clean model 0.002486488
model 0.00226151
text corpus 0.002204314
other word 0.002038175
training corpus 0.0019109300000000001
language models 0.001807379
transcriptions corpus 0.0017644980000000002
small corpus 0.001749525
whole corpus 0.001719581
clean corpus 0.0016879080000000001
training corpus 0.001685824
speech data 0.001659088
word accuracy 0.001609968
trigram language 0.001606319
language modeling 0.0015870559999999999
transcriber language 0.0015437399999999998
corpus 0.00146293
same text 0.001407539
speech recognition 0.001376348
short words 0.00136475
preceding words 0.001342406
transcription data 0.0012763730000000001
testing data 0.001270813
data testing 0.001270813
different trigram 0.001266561
language 0.00125013
different ways 0.001205123
different sets 0.001197109
lexical information 0.001188537
recognition accuracy 0.001134188
accuracy recognition 0.001134188
first method 0.001130116
other types 0.001128129
words 0.00111626
other parts 0.0011057250000000001
perplexity test 0.0010230600000000001
perplexity perplexity 0.001022034
test results 0.001016047
training corpora 9.79263E-4
successful recognition 9.58261E-4
chen recognition 9.58261E-4
speech family 9.42118E-4
good results 9.31853E-4
spontaneous speech 9.27697E-4
second method 9.23401E-4
such boundaries 9.2176E-4
perplexity measures 9.2001E-4
relative frequency 9.177090000000001E-4
trigram models 9.134379999999999E-4
accuracy results 9.025419999999999E-4
large training 8.77564E-4
disfluent speech 8.77184E-4
corpora base 8.67465E-4
base corpora 8.67465E-4
high frequency 8.56777E-4
lexical material 8.35761E-4
lexical items 8.33909E-4
adaptation test 8.240459999999999E-4
such division 8.17068E-4
random function 8.08153E-4
first stage 8.048300000000001E-4
statistical test 8.042419999999999E-4
several conclusions 7.800299999999999E-4
perplexity tests 7.62967E-4
relative weights 7.58497E-4
derived corpora 7.556850000000001E-4
newspaper articles 7.533450000000001E-4
perplexity measurements 7.47097E-4
high density 7.431790000000001E-4
many doctors 7.39615E-4
recognition 7.3565E-4
discourse boundary 7.355339999999999E-4
systematic distribution 7.33366E-4
average frequency 7.29506E-4
following utterance 7.26587E-4
bigram probabilities 7.198409999999999E-4
further evidence 7.18229E-4
random events 7.11153E-4
medical dictations 7.06928E-4
medical dictation 6.88366E-4
theoretical measure 6.87601E-4
medical doctors 6.87543E-4
discourse boundaries 6.85891E-4
random mix 6.78813E-4
random population 6.73225E-4
following formulas 6.67513E-4
accuracy rates 6.602380000000001E-4
adaptation tool 6.591710000000001E-4
accuracy tests 6.504880000000001E-4
use continuum 6.45125E-4
information 6.42513E-4
modeling process 6.4207E-4
speech 6.40698E-4
initial position 6.34667E-4
modeling tools 6.31965E-4
interesting result 6.30826E-4
soap format 6.293049999999999E-4
