• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2025, Vol. 47 ›› Issue (2): 327-335.

• Artificial Intelligence and Data Mining •

A clustering algorithm based on the multi-level density center graph

LU Jianyun1,2, SHAO Junming1

  1. School of Computer Science and Engineering (School of Cybersecurity), University of Electronic Science and Technology of China, Chengdu 611731, China
  2. Artificial Intelligence and Big Data College, Chongqing Polytechnic University of Electronic Technology, Chongqing 401331, China
  • Received: 2024-07-02 Revised: 2024-08-23 Online: 2025-02-25 Published: 2025-02-24

Abstract: Density-based clustering partitions a dataset according to the density relationships among its data objects. By determining how low-density objects are assigned to density-center objects, it can handle clusters of varying sizes, shapes, and densities. However, because datasets often exhibit variable densities, noise, and complex distributions, accurately estimating the local density of data objects and determining the number of clusters from density centers remain open challenges. To address these issues, a clustering algorithm based on the multi-level density center graph (CMDCG) is proposed. First, the local density of each data object is calculated from its neighborhood using information entropy. Second, the membership relationships of each data object are tallied according to its local density and neighborhood space, and density centers are determined. Finally, multi-level density centers are obtained by varying the neighborhood space, and a graph is constructed from the membership relationships among these multi-level density centers. The connected components of the graph are taken as initial clusters, and the remaining data objects are assigned to these initial clusters according to their membership relationships. Experimental results on both synthetic and real datasets demonstrate that CMDCG accurately identifies the number of clusters and forms correct initial clusters, and that its clustering results are robust to varying densities and noise.
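
The pipeline outlined in the abstract can be illustrated with a loose Python sketch. It is not the CMDCG algorithm itself: the abstract gives no formulas, so the entropy-based density, the rule that attaches each object to the densest object in its k-nearest-neighbor neighborhood, and the names `knn`, `entropy_density`, `attach`, `find`, `cluster_sketch`, and the parameter `ks` are all assumptions made for illustration only.

```python
import numpy as np

def knn(X, k):
    """Brute-force k nearest neighbours: returns (indices, distances)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)              # a point is not its own neighbour
    idx = np.argsort(d, axis=1)[:, :k]
    return idx, np.take_along_axis(d, idx, axis=1)

def entropy_density(dist):
    """ASSUMED density: entropy of the normalised neighbour distances,
    divided by the mean neighbour distance (not the paper's formula)."""
    p = dist / dist.sum(axis=1, keepdims=True)
    h = -(p * np.log(p + 1e-12)).sum(axis=1)
    return h / (dist.mean(axis=1) + 1e-12)

def attach(idx, rho):
    """ASSUMED membership rule: each point attaches to the densest point
    among itself and its neighbours; self-attached points are centers."""
    n = len(rho)
    cand = np.hstack([np.arange(n)[:, None], idx])
    return cand[np.arange(n), np.argmax(rho[cand], axis=1)]

def find(uf, i):
    """Union-find root lookup with path halving."""
    while uf[i] != i:
        uf[i] = uf[uf[i]]
        i = uf[i]
    return i

def cluster_sketch(X, ks=(5, 15)):
    """Link centers across growing neighbourhood sizes, then label the
    connected components of the resulting graph."""
    n = len(X)
    uf = np.arange(n)                        # union-find over all points
    centers = np.arange(n)                   # level 0: every point
    for k in ks:
        idx, dist = knn(X, k)
        par = attach(idx, entropy_density(dist))
        for c in centers:                    # edge: center -> its parent
            a, b = find(uf, c), find(uf, par[c])
            uf[a] = b
        centers = centers[par[centers] == centers]   # surviving centers
    roots = np.array([find(uf, i) for i in range(n)])
    return np.unique(roots, return_inverse=True)[1]
```

In this sketch the first entry of `ks` attaches every object to a density center, and each later entry adds edges between the centers that survived the previous level, so the connected components play the role of the initial clusters. The choice of `ks` controls how aggressively centers merge and would need tuning on real data.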

Key words: density clustering, multi-level density center, connected graph, information entropy, neighborhood space