test data 0.001842584
data set 0.001788383
training corpus 0.001652351
level language 0.001535329
same corpus 0.001467014
training set 0.001460984
wsj data 0.00145453
relevant data 0.001441668
test corpus 0.001431434
words tnt 0.001417981
test time 0.001415763
ent model 0.001404238
unseen words 0.001401107
morphological knowledge 0.0014009
individual words 0.001382165
lowercase words 0.001368725
unseen word 0.001358197
such information 0.001344529
morphological lexicon 0.00134102
other methods 0.001298485
morphological analyzer 0.001288446
training lexicon 0.001279983
test set 0.001240067
different models 0.001224262
training sets 0.00120671
morphological dictio 0.001202034
morphological ana 0.00117503
other systems 0.001159869
ing system 0.001135864
limited training 0.001131252
corpus tnt 0.001118891
whole corpus 0.001088729
words 0.00108339
language 0.00108251
model 0.0010614
pos tag 0.001054827
other items 0.001040274
szeged corpus 0.001030894
information extraction 0.001029086
trigram system 0.001022537
lexical probabilities 0.001018212
other architectures 0.001014267
other trie 0.001014267
surprising results 0.001013638
evaluation english 9.99174E-4
english evaluation 9.99174E-4
possible tags 9.83779E-4
previous tag 9.79113E-4
own system 9.7565E-4
large corpora 9.52137E-4
tag distribution 9.50991E-4
set tnt 9.27524E-4
hmm performance 9.09402E-4
good performance 8.97994E-4
lexical prob 8.87191E-4
hmm models 8.7873E-4
hmm work 8.73903E-4
maxent tag 8.73168E-4
training 8.68051E-4
pos tagging 8.65998E-4
current tag 8.65811E-4
second order 8.52021E-4
stem list 8.42246E-4
such tweaking 8.27386E-4
full dictionary 8.21346E-4
possible labels 8.16525E-4
complex models 8.00219E-4
pos tagger 7.98526E-4
hmm search 7.96586E-4
vocabulary problem 7.95267E-4
error rate 7.91578E-4
tional tags 7.91539E-4
tokens unseen 7.89264E-4
corpus 7.843E-4
research purposes 7.80745E-4
lex order 7.76249E-4
hmm tagging 7.70376E-4
results 7.65274E-4
information 7.64708E-4
search space 7.58463E-4
emission order 7.56801E-4
tagging process 7.37134E-4
open source 7.36796E-4
standard way 7.36372E-4
overall tnt 7.30326E-4
system 7.26933E-4
research labs 7.24421E-4
justed corpora 7.20864E-4
biguous tokens 7.18922E-4
suffix tries 7.138E-4
unseen overall 7.13452E-4
good tnt 7.06732E-4
approach 6.97783E-4
handy way 6.94558E-4
rich morphology 6.94489E-4
tnt output 6.93991E-4
features 6.82022E-4
proper names 6.81355E-4
executable form 6.78996E-4
penn treebank 6.78902E-4
