language corpus 0.00359296
speech corpus 0.002512069
kazakh corpus 0.002320022
annotated corpus 0.002312656
language model 0.002298382
language kazakh 0.002264602
kazakh language 0.002264602
national corpus 0.002210959
language models 0.002199997
primary corpus 0.002125241
language section 0.002112926
scale corpus 0.002111373
russian language 0.002098917
corpus development 0.002096184
available corpus 0.002080179
guage corpus 0.002077988
corpus management 0.002074592
state language 0.00205978
balanced corpus 0.002054916
markup language 0.002054452
informal language 0.002042412
corpus klc 0.002038972
purpose corpus 0.002036332
corpus most 0.002031884
poetry corpus 0.002031884
dialect corpus 0.002031884
pioneering corpus 0.002031884
turkic language 0.001997973
predominant language 0.001977801
flected language 0.001976559
finding language 0.001976559
corpus 0.00182419
text data 0.001804759
speech data 0.001793499
language 0.00176877
annotated data 0.001594086
audio data 0.001488038
other languages 0.001392799
sentence sentence 0.00138088
foreign words 0.001380821
other corpora 0.001379979
russian words 0.001362827
automatic word 0.001330272
words literary 0.001316283
meta data 0.00131602
other part 0.001313069
foreign word 0.001301106
unique words 0.00128265
different sentences 0.001277385
frequent words 0.001264712
manual annotation 0.001264199
word forms 0.001251933
word form 0.001219219
annotation tool 0.001219094
such parts 0.001196404
such categories 0.001189649
annotated text 0.001187605
kazakh sentence 0.001186272
kazakh speech 0.001183711
other materials 0.001176619
annotated speech 0.001176345
such subcorpora 0.001162442
lion word 0.001162436
such exceptions 0.001150274
other countries 0.001146686
such functionalities 0.001145937
different affix 0.001141068
annotation process 0.001137345
speech corpora 0.001135916
annotation files 0.001133015
structural annotation 0.001127902
different combinations 0.001105676
annotation scheme 0.001104224
text format 0.001102231
annotation speed 0.001086925
annotation figure 0.001083955
corresponding text 0.001076373
different con 0.001068801
annotation experience 0.001067408
different age 0.001059994
many research 0.001056184
direct speech 0.001052227
speech processing 0.001050388
kazakh texts 0.001042904
different organi 0.001041109
words 0.00103268
speech tagging 0.001031685
linguistic property 0.001027617
speech recognition 0.001026238
original text 0.001024551
document sentence 0.001010124
raw text 0.001008999
sentence list 0.000996203
text transcript 0.000993846
linguistic properties 0.000985892
many corpora 0.000985687
linguistic markups 0.000976355
various speech 0.000969363
sentence boundaries 0.00096358
parallel text 0.000949078
