source text 0.002576543
news text 0.002411643
text similarity 0.002373005
text analysis 0.002355758
new text 0.002315872
ing text 0.002266386
newspaper text 0.002179855
text pairs 0.002161324
computational text 0.002157982
text reuse 0.00212094
agency text 0.002117954
original text 0.002111681
text sources 0.002107491
arbitrary text 0.002088089
electronic text 0.002087405
journalistic text 0.00207116
free text 0.002070261
tional text 0.002061504
measuring text 0.002060819
tify text 0.002055558
verbatim text 0.002055558
text para 0.002055558
unacceptable text 0.002055558
text format 0.002055558
sentence alignment 0.001925396
source sentence 0.001619239
alignment algorithm 0.001509393
source texts 0.001387601
other words 0.001386636
statistical alignment 0.00135779
various alignment 0.001337962
alignment program 0.001307598
source sentences 0.00128725
true alignment 0.001275171
single sentence 0.001265809
alignment gst 0.001252955
alignment algorithms 0.001240813
news texts 0.001222701
information news 0.001189701
short sentence 0.001166535
complex sentence 0.001143147
shared word 0.001132271
sentence align 0.001114435
word sequence 0.00111113
word sequences 0.001104219
same information 0.001085562
word ngrams 0.001061052
ual word 0.001055451
lexical similarity 0.001035125
ing method 0.001020137
alignment 0.0010109
similarity score 0.001008367
candidate source 9.92948E-4
newspaper texts 9.90913E-4
several source 9.85287E-4
agency source 9.50897E-4
ual words 9.29223E-4
automatic method 9.29203E-4
words sharing 9.2698E-4
matched words 9.25736E-4
sentence 9.14496E-4
other hand 9.01702E-4
newswire source 8.94544E-4
language processing 8.94236E-4
possible features 8.91573E-4
other sources 8.82715E-4
such candidate 8.78286E-4
separate texts 8.74885E-4
information need 8.68957E-4
rived texts 8.68759E-4
multiple sentences 8.6648E-4
natural language 8.62928E-4
training data 8.46335E-4
user information 8.39761E-4
measure score 8.27203E-4
ison method 8.11739E-4
similarity measures 8.09654E-4
learning algorithm 8.03297E-4
matching algorithm 7.94887E-4
news agency 7.85997E-4
containment score 7.84943E-4
tion score 7.84137E-4
similarity scores 7.82145E-4
string similarity 7.77235E-4
news story 7.69354E-4
cation value 7.69306E-4
ural language 7.52168E-4
weighted score 7.5182E-4
translation equivalents 7.41824E-4
gst algorithm 7.40548E-4
approach category 7.4012E-4
words 7.39612E-4
high amount 7.30354E-4
annotated corpus 7.29873E-4
news sto 7.26088E-4
high likelihood 7.24809E-4
results table 7.24524E-4
news agen 7.24083E-4
lexical variation 7.18075E-4
dice score 7.1692E-4
