An Automatic Clustering Method Using SubSampling for the KDTree
Received date: 2010-02-26
Revised date: 2010-05-30
Online published: 2011-01-25
Automatic clustering methods based on evolutionary theory are good at finding the global optimum and the number of clusters, but they are computationally expensive. This paper proposes an automatic clustering method that uses KD-tree subsampling. The sample space is first partitioned into subspaces with a KD-tree. Within each subspace, points are drawn at random, and together they form the KD-tree subsample on which automatic clustering is run. The K-means method is then used to refine the clustering result obtained on the subsample. The method avoids the biased distribution that plain random subsampling can produce and gives good clustering results even for small subsamples. Simulation results show that it substantially reduces the running time of automatic clustering without degrading clustering quality.
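The subsampling step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the abstract does not state the splitting rule, the leaf size, or the sampling rate, so the widest-dimension median split and the names `kdtree_leaves`, `kdtree_subsample`, `leaf_size`, and `fraction` are all assumptions made for this example.

```python
import random


def kdtree_leaves(points, leaf_size):
    """Partition points into KD-tree leaves of at most leaf_size points.

    Assumed rule: split at the median of the dimension with the widest spread.
    """
    if leaf_size < 1:
        raise ValueError("leaf_size must be at least 1")
    if len(points) <= leaf_size:
        return [points]
    dims = range(len(points[0]))
    # Pick the axis along which the points are most spread out.
    axis = max(dims, key=lambda d: max(p[d] for p in points) - min(p[d] for p in points))
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return kdtree_leaves(ordered[:mid], leaf_size) + kdtree_leaves(ordered[mid:], leaf_size)


def kdtree_subsample(points, leaf_size, fraction, rng):
    """Draw a random sample from every leaf so each region is represented."""
    sample = []
    for leaf in kdtree_leaves(points, leaf_size):
        # Take at least one point per non-empty leaf (an assumed policy).
        k = min(len(leaf), max(1, round(fraction * len(leaf))))
        sample.extend(rng.sample(leaf, k))
    return sample


if __name__ == "__main__":
    rng = random.Random(0)
    # Two synthetic clusters of unequal size (illustrative data only).
    big = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(900)]
    small = [(rng.gauss(8, 1), rng.gauss(8, 1)) for _ in range(100)]
    subsample = kdtree_subsample(big + small, leaf_size=50, fraction=0.1, rng=rng)
    print(len(subsample), "points kept out of", len(big) + len(small))
```

Because every leaf contributes points, a small or sparse region of the sample space cannot be missed entirely, which is the bias that plain random subsampling risks. The subsample would then be passed to the automatic clustering step, and the resulting cluster centers refined with K-means.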
PAN Zhangming. An Automatic Clustering Method Using SubSampling for the KDTree[J]. Computer Engineering & Science, 2011, 33(1): 166-170. DOI: 10.3969/j.issn.1007130X.2011.