language model 0.002160365
sentence words 0.002061669
other features 0.001818298
word features 0.001809218
previous sentence 0.001763393
sentence number 0.001749307
sentence probability 0.001719496
sentence length 0.001681456
level model 0.001656953
current sentence 0.001655797
several features 0.001648372
sentence boundaries 0.001646883
syntactic features 0.001605273
complete sentence 0.001587964
age sentence 0.001550089
finding sentence 0.00154368
available sentence 0.001531783
segmentation model 0.001530434
preceding sentence 0.001522126
sentence splitting 0.001517778
sentence splitter 0.001517778
sentence contin 0.001517778
rent sentence 0.001517778
news corpus 0.001517639
language models 0.001517239
following features 0.001488175
feature fiction 0.001481628
greek feature 0.001476237
features distance 0.001453821
guage model 0.001443802
useful feature 0.001441528
modelling features 0.001404735
such models 0.001397655
text segmentation 0.001397474
adjacent text 0.001376383
text position 0.001369748
paragraph boundary 0.001357441
average paragraph 0.001355972
paragraph structure 0.001345465
automatic text 0.00133993
raw text 0.001333544
text blocks 0.001327034
original text 0.001324706
news texts 0.001323406
paragraph prediction 0.001315505
sentence 0.00131535
paragraph insertion 0.00131033
text sim 0.001299412
paragraph length 0.001299212
text posi 0.001297714
text positions 0.001297245
former text 0.001295254
text categorisation 0.001295254
paragraph boundaries 0.001264639
previous sentences 0.001231399
model 0.00122444
target language 0.001212489
feature 0.00121185
paragraph position 0.001211374
source language 0.00120757
paragraph detection 0.001199522
training data 0.001185442
paragraph identification 0.001185089
features 0.00118471
automatic paragraph 0.001181556
paragraph break 0.001173059
language modelling 0.00115595
new texts 0.001151329
paragraph breaks 0.001150975
cmu language 0.001150461
language mod 0.001147722
paragraph formation 0.00114629
paragraph markings 0.001144582
paragraph starting 0.001135588
content words 0.001132379
initial sentences 0.001118847
english data 0.001088963
fiction corpus 0.001068299
such lists 0.001050297
speech recognition 0.00104815
first word 0.001046019
eci corpus 0.001035279
same task 0.001027701
speech tagging 0.001014667
information extraction 0.001010283
corpus sizes 0.001006708
other way 0.001005824
europarl corpus 0.001005233
test set 0.000997636
fiction news 0.000988896
human performance 0.000988847
news domain 0.000984052
ing models 0.000981775
word order 0.000981321
cue words 0.000978043
matic speech 0.000972159
speech recogniser 0.000972159
related information 0.000967392
simply words 0.000950898
data sets 0.000949401
