Ping-pong Document Clustering using NMF and Linkage-Based Refinement

Hiroyuki Shinnou, Minoru Sasaki


Abstract
This paper proposes a ping-pong document clustering method that alternates between non-negative matrix factorization (NMF) and linkage-based refinement in order to improve the clustering result of NMF. Using NMF in a ping-pong strategy can be expected to be effective for document clustering. In practice, however, it often worsens performance, because NMF frequently fails to improve the clustering result it is given as initial values. Our method handles this problem through the stop condition of the ping-pong process. In the experiment, we compared our method with k-means and NMF on 16 document data sets. Our method improved the clustering result of NMF significantly.
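The alternation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the refinement step or the stop condition, so both are assumptions here. `linkage_refine` is a hypothetical stand-in that reassigns each document to the cluster with the highest average cosine similarity, and the loop is assumed to stop as soon as NMF fails to lower the reconstruction error. All function names and parameters are illustrative.

```python
import numpy as np

def nmf(X, U, V, iters=50, eps=1e-9):
    """Multiplicative updates for X ~ U @ V.T under Frobenius loss."""
    for _ in range(iters):
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U) / (V @ (U.T @ U) + eps)
    return U, V

def linkage_refine(X, labels, k):
    """Hypothetical refinement (assumption, not the paper's definition):
    move each document to the cluster with the highest average cosine
    similarity to that cluster's current members."""
    D = X / (np.linalg.norm(X, axis=0, keepdims=True) + 1e-12)
    S = D.T @ D                      # document-by-document cosine similarity
    onehot = np.eye(k)[labels]       # n_docs x k membership matrix
    sizes = np.maximum(onehot.sum(axis=0), 1)
    return ((S @ onehot) / sizes).argmax(axis=1)

def ping_pong(X, k, rounds=10, seed=0):
    """Alternate NMF and refinement on a terms x documents matrix X."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(k, size=X.shape[1])
    best_labels, best_err = labels, np.inf
    for _ in range(rounds):
        # Seed NMF with the current clustering (soft one-hot memberships).
        V = np.eye(k)[labels] + 0.1
        U = X @ V / V.sum(axis=0) + 1e-9
        U, V = nmf(X, U, V)
        err = np.linalg.norm(X - U @ V.T)
        # Assumed stop condition: NMF did not improve on the previous round.
        if err >= best_err:
            break
        best_err, best_labels = err, V.argmax(axis=1)
        labels = linkage_refine(X, best_labels, k)
    return best_labels
```

Calling `ping_pong(X, k)` on a non-negative term-document matrix returns one cluster index per document. Keeping the labels from the last round in which NMF improved is one simple way to prevent a failed NMF step from degrading the result.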
Anthology ID:
L08-1287
Volume:
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
Month:
May
Year:
2008
Address:
Marrakech, Morocco
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2008/pdf/38_paper.pdf
Cite (ACL):
Hiroyuki Shinnou and Minoru Sasaki. 2008. Ping-pong Document Clustering using NMF and Linkage-Based Refinement. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08), Marrakech, Morocco. European Language Resources Association (ELRA).
Cite (Informal):
Ping-pong Document Clustering using NMF and Linkage-Based Refinement (Shinnou & Sasaki, LREC 2008)
PDF:
http://www.lrec-conf.org/proceedings/lrec2008/pdf/38_paper.pdf