chinese word 0.00477144
chinese words 0.00400411
chinese pos 0.00339771
other word 0.003307917
chinese sentence 0.00329266
chinese language 0.003177941
word segmentation 0.003130807
chinese text 0.003120971
chinese novelty 0.003086486
chinese character 0.002914805
new word 0.002898221
chinese document 0.002857823
word recognition 0.002828539
word count 0.00277893
word list 0.002771027
stop word 0.002756899
glish word 0.002741178
word seg 0.002735181
unknown word 0.002728926
word stem 0.002728063
chinese datasets 0.002726846
word combinations 0.002724234
chinese docu 0.002657899
chinese preprocessing 0.00264767
apwsj chinese 0.002634502
chinese sen 0.002625426
chinese nov 0.002618897
other words 0.002540587
words novelty 0.002425896
chinese 0.00233235
candidate words 0.00207907
words recognition 0.002061209
english sentence 0.00204581
meaningful words 0.002031429
punctuation words 0.002024919
stop words 0.001989569
known words 0.001972098
unknown words 0.001961596
words stem 0.001960733
stem words 0.001960733
english language 0.001931091
english text 0.001874121
english novelty 0.001839636
segmentation pos 0.001757077
novelty information 0.00172898
words 0.00167176
pos tagging 0.001658652
topic model 0.001652594
language model 0.001647881
machine translation 0.001622199
pos sequence 0.001567663
language segmentation 0.001537308
sentence level 0.001535442
english datasets 0.001479996
recognition pos 0.001454809
nese pos 0.001429904
novel information 0.001415704
main language 0.00141387
english docu 0.001411049
english preprocessing 0.00140082
result pos 0.001372199
text mining 0.001366476
pos filtering 0.001362201
novelty score 0.001356998
corrected translation 0.001353762
sentence string 0.001351871
dissimilar pos 0.001348424
tagging novelty 0.001347428
above sentence 0.001332439
novelty mining 0.001331991
redundant information 0.001329396
level novelty 0.001329268
useful information 0.001322103
information retrieval 0.001304207
mining algorithm 0.00128702
document novelty 0.001279609
information stream 0.001260498
sequence novelty 0.001256439
other chi 0.001230317
novel text 0.001229481
relevant topic 0.00122309
nese language 0.001210135
first step 0.001209763
certain topic 0.001205556
segmentation process 0.001196521
language techniques 0.001190187
topic detection 0.001175841
mining system 0.001172996
other works 0.001171299
natural language 0.001161382
nese text 0.001153165
topic tracking 0.001148852
topic bin 0.001138475
space vector 0.001118002
vector space 0.001118002
text detection 0.001114158
correct segmentation 0.001109864
document level 0.001100605
novelty track 0.001090705
evance model 0.001090449
