model representation 0.001872323
tensor model 0.001690311
bos model 0.001605596
bow model 0.001563812
model yields 0.001518122
anchor language 0.001489499
text data 0.001471682
models document 0.001470695
representation models 0.001383325
same language 0.001370999
language evaluation 0.001355487
model 0.00132126
english language 0.001243457
language size 0.001243242
language analysis 0.001214171
different methods 0.001211166
document representation 0.001189496
specific language 0.001184901
available language 0.001181373
second language 0.001169273
such matrix 0.001149757
language support 0.00114488
data structure 0.0011286
language links 0.001106873
ian language 0.001105357
analyzed language 0.001105357
language spe 0.001105357
data mining 0.001091528
different topics 0.001089774
different clusters 0.00107549
input word 0.001069551
sparse data 0.001067271
data preparation 0.00106531
tation models 0.001064011
time performance 0.00105567
different labels 0.001050753
different dataset 0.001048714
feature matrix 0.001045623
document clustering 0.001045601
different languages 0.001042345
corpus evaluation 0.0010421
representation space 0.001036857
document vector 0.001035953
standard matrix 0.001031442
standard document 0.00102153
multilingual document 0.001018359
word senses 0.001012529
different lan 0.00100235
other methods 9.93501E-4
different scenarios 9.85888E-4
multilingual corpus 9.83383E-4
different classes 9.83223E-4
different trend 9.78213E-4
words bag 9.75029E-4
corpus tensor 9.72508E-4
text documents 9.57178E-4
factor matrix 9.48416E-4
same space 9.39949E-4
final document 9.29258E-4
document clusters 9.25832E-4
bos document 9.22769E-4
original document 9.22533E-4
language 9.16844E-4
document segments 9.04146E-4
clustering algorithm 8.94865E-4
same set 8.94305E-4
related corpus 8.89726E-4
collection matrix 8.89083E-4
document segment 8.87718E-4
feature space 8.83072E-4
document collection 8.79171E-4
matrix product 8.75585E-4
new approach 8.73972E-4
methods evaluation 8.61718E-4
same clustering 8.61323E-4
comparable document 8.58928E-4
document representations 8.49028E-4
corpus characteristics 8.48659E-4
parallel corpus 8.46497E-4
clustering evaluation 8.45811E-4
clustering performance 8.45521E-4
input text 8.40096E-4
balanced corpus 8.3961E-4
matrix factorizations 8.37631E-4
orthogonal matrix 8.37631E-4
document categorization 8.3745E-4
lingual document 8.37203E-4
tilingual document 8.35582E-4
bos representation 8.35399E-4
models 8.32262E-4
document segmentation 8.31974E-4
corpus unbalanced 8.3084E-4
unbalanced corpus 8.3084E-4
clustering methods 8.30243E-4
document semantics 8.29983E-4
document seg 8.29609E-4
document collections 8.28713E-4
document correlations 8.26412E-4
document similarities 8.26412E-4
other languages 8.2468E-4
