training data 0.00392524
word word 0.0036674
data record 0.003373441
data records 0.003267504
ing data 0.0032321859999999997
level data 0.0032256
model training 0.00321979
data extraction 0.003208443
single data 0.003203517
new data 0.0031567929999999998
web data 0.003147767
data resources 0.003130144
language model 0.00312496
individual data 0.0031181539999999997
generalized data 0.003087801
meaningful data 0.003086669
help data 0.003086669
multiple data 0.003068445
data mining 0.003050367
coherent data 0.0030491069999999997
vidual data 0.0030485679999999998
tabular data 0.0030479879999999997
data balance 0.0030479879999999997
embedded data 0.0030479879999999997
word language 0.0028540199999999996
crf model 0.002647301
ing model 0.002526736
model fea 0.002426076
fields model 0.002403496
tree model 0.0023815530000000002
model trigrams 0.002381533
model variants 0.002345757
model comparisons 0.00234287
text language 0.00232434
word context 0.002312196
word similarity 0.002281657
word level 0.0022492099999999998
form word 0.002234396
word heuristic 0.0022212959999999998
previous word 0.002220911
current word 0.0021974259999999997
word current 0.0021974259999999997
word analysis 0.002195568
window word 0.002169146
word window 0.002169146
word alignment 0.00216407
word distribution 0.002162268
next word 0.002143914
surface word 0.002138803
word statistics 0.002123242
word heuristics 0.002114432
model 0.00210464
word topog 0.002072587
word similarities 0.002072587
text segmentation 0.00198697
language models 0.0018140349999999999
training set 0.001749002
full text 0.0016863719999999998
unsupervised text 0.0016828899999999998
text analysis 0.001665888
text corpora 0.001628666
text resources 0.0016240739999999999
free text 0.0016167159999999998
surface text 0.0016091229999999998
biomedical text 0.001607621
text seg 0.0015783919999999999
unstructured text 0.001577013
other words 0.001560639
crf features 0.0015530309999999999
traditional text 0.00155288
most text 0.00154922
plain text 0.00154573
text segmen 0.001545363
text mining 0.001544297
text narratives 0.00154335
semantic language 0.001490937
semantic features 0.001480987
heuristic features 0.001397966
format features 0.001384242
dependency features 0.0013840950000000001
language analysis 0.001382188
good features 0.001379965
feature function 0.00137838
many words 0.001353318
crf models 0.0013363759999999998
feature selection 0.001316384
linguistic features 0.001314931
language processing 0.001314619
heuristic feature 0.001311926
natural language 0.001311072
system performance 0.001308141
labeling words 0.0013001339999999999
record information 0.001297259
feature functions 0.0012835049999999999
learning system 0.00127985
sentence level 0.0012795390000000001
various feature 0.0012777420000000001
feature description 0.0012765110000000001
gram features 0.001269454
regular language 0.0012636
