• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science

Optimization of grid based clustering by
fast search and find of density peaks

SUN Hao¹, ZHANG Ming-xin², DAI Jiao², SHANG Zhao-wei³

  (1. School of Computer Science and Technology, Soochow University, Suzhou 215006;
   2. School of Computer Science and Engineering, Changshu Institute of Technology, Changshu 215500;
   3. College of Computer Science, University of Chongqing, Chongqing 400030, China)
  • Received: 2016-01-15 Revised: 2016-03-04 Online: 2017-05-25 Published: 2017-05-25

Abstract:

CFSFDP (clustering by fast search and find of density peaks) is a density-peak-based clustering algorithm that can cluster data sets of arbitrary shape and is fast and simple to implement. However, its global density threshold dc is specified without considering the spatial distribution of the data, which can degrade clustering quality. Moreover, data sets with multiple density peaks cannot be clustered accurately. To address these shortcomings, we propose GbCFSFDP, an optimized grid-based CFSFDP algorithm. To avoid using a global dc, the algorithm divides the data set into smaller partitions by grid partitioning and performs local clustering on each partition. GbCFSFDP then merges the resulting sub-clusters, so that data sets that are unevenly distributed and have multiple density peaks are clustered correctly. Simulation experiments on two typical data sets show that GbCFSFDP is more accurate than CFSFDP.
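
The two ingredients named in the abstract, grid partitioning and density-peak scoring with a per-partition dc, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the equal-width grid, and the percentile rule for picking a local dc are all assumptions made for illustration, and the sub-cluster merging step is not shown because the abstract does not describe it. The quantities rho (local density) and delta (distance to the nearest point of higher density) follow the usual cutoff-kernel definitions of the original density-peaks method.

```python
import numpy as np

def grid_partition(points, cells_per_axis):
    """Assign each point to an equal-width grid cell (hypothetical helper).

    Returns a dict mapping a cell index tuple to an array of point indices.
    """
    points = np.asarray(points, dtype=float)
    lo = points.min(axis=0)
    span = points.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero on a flat axis
    idx = np.floor((points - lo) / span * cells_per_axis).astype(int)
    idx = np.minimum(idx, cells_per_axis - 1)  # boundary points go to the last cell
    cells = {}
    for i, row in enumerate(idx):
        cells.setdefault(tuple(int(c) for c in row), []).append(i)
    return {key: np.array(members) for key, members in cells.items()}

def pairwise_distances(points):
    """Dense Euclidean distance matrix."""
    points = np.asarray(points, dtype=float)
    return np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

def local_dc(points, percentile=2.0):
    """Pick dc for one cell as a percentile of its pairwise distances.

    Assumption: this percentile rule is illustrative, not taken from the paper.
    Requires at least two points.
    """
    dist = pairwise_distances(points)
    upper = dist[np.triu_indices(len(dist), k=1)]
    return float(np.percentile(upper, percentile))

def density_peak_scores(points, dc):
    """Compute rho and delta for every point using a cutoff kernel."""
    dist = pairwise_distances(points)
    rho = (dist < dc).sum(axis=1) - 1  # neighbours within dc, excluding the point itself
    delta = np.empty(len(dist))
    for i in range(len(dist)):
        higher = rho > rho[i]
        # Points with no denser neighbour get the largest distance by convention.
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()
    return rho, delta
```

In a grid-based variant, `density_peak_scores` would be called once per cell returned by `grid_partition`, each time with that cell's own dc, so that sparse and dense regions are not forced to share one threshold. Points with both large rho and large delta within a cell are the candidate local cluster centres.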

Key words: clustering, density threshold, grid partition, merging clusters