• 中国计算机学会会刊
  • 中国科技核心期刊
  • 中文核心期刊

计算机工程与科学 ›› 2021, Vol. 43 ›› Issue (01): 170-179.

• 人工智能与数据挖掘 • 上一篇    下一篇

基于精英集的多目标差分进化聚类算法

张明珠,曹杰,王斌   

  1. (南京财经大学信息工程学院,江苏 南京 210023)
  • 收稿日期:2019-12-31 修回日期:2020-05-08 接受日期:2021-01-25 出版日期:2021-01-25 发布日期:2021-01-22
  • 基金资助:
    江苏省自然科学基金(BK20181414);江苏省高校优秀科技创新团队(2017-15);江苏省研究生科研与实践创新计划(KYCX18_1390)


An elitist-archive-based differential evolutionary algorithm for multi-objective clustering

ZHANG Ming-zhu,CAO Jie,WANG Bin   

  1. (School of Information Engineering,Nanjing University of Finance and Economics,Nanjing 210023,China)



  • Received:2019-12-31 Revised:2020-05-08 Accepted:2021-01-25 Online:2021-01-25 Published:2021-01-22

摘要: 聚类数的确定在聚类分析中是一个基本却具有挑战性的问题。一方面,最佳聚类数根据不同的评价标准、用户偏好或需求可能不一致,因此将不同聚类数的聚类结果呈现给用户作参考是有意义的。另一方面,增加聚类数虽会使聚类结果更加紧致,却会削弱不同类之间的分离性,所以选择合适的聚类数是一个在最小化聚类数与最大化类内紧致性或类间分离性之间取得平衡的多目标优化问题。因此,在聚类数不确定的聚类问题中直接将聚类数作为一个优化目标与另一个反映类内紧致性的目标函数同时进行优化,利用新的基于精英集的多目标差分进化算法得到一个Pareto解集,集合中含有多个不同聚类数的近似最优聚类结果。实验结果验证了所提算法的可行性和有效性。


关键词: 多目标聚类, 聚类数, 进化算法, 精英集, 多目标优化, 差分进化

Abstract: Determining the number of clusters is a basic yet challenging problem in clustering analysis. On one hand, the optimal number of clusters varies according to different evaluation criteria, user preferences or demands, hence it makes sense to provide the user with multiple clustering results for different number of clusters. On the other hand, increasing the number of clusters without any penalty usually optimizes the within-cluster compactness while deteriorating the between-cluster separation. Therefore, selecting an appropriate number of clusters is, in fact, a multi-objective optimization problem, which needs to choose a balanced solution among a set of tradeoffs between the minimum number of clusters and the maximum compactness or separation of clusters. As a result, in order to deal with the clustering problem with unknown number of clusters, we directly take the number of clusters as one optimization objective, and simultaneously optimize it with another objective function reflecting the within-cluster compactness by a newly designed multi-objective differential evolutionary algorithm with an elitist archive. The proposed algorithm obtains a nearly Pareto-optimal set, which contains multiple clustering results for distinct number of clusters, in a single run. Experiments on several datasets and comparative experiments demonstrate the practicability and effectiveness of our proposed algorithm.




Key words: multi-objective clustering;the number of clusters;evolutionary algorithm, elitist archive;multi-objective optimization;differential evolution