• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2021, Vol. 43 ›› Issue (11): 2077-2083.

• Artificial Intelligence and Data Mining •

A continuous K-means equivalent clustering model and its optimization algorithm

XIE Ting1, LIU Rui-hua2, WEI Zheng-yuan1

  1. (1. School of Science, Chongqing University of Technology, Chongqing 400054; 2. School of Artificial Intelligence, Chongqing University of Technology, Chongqing 400054, China)

  • Received: 2020-07-10 Revised: 2020-09-29 Accepted: 2021-11-25 Online: 2021-11-25 Published: 2021-11-23
  • Funding:
    Natural Science Foundation of Chongqing (cstc2019jcyj-msxmX0491); Youth Project of the Science and Technology Research Program of Chongqing Municipal Education Commission (KJQN201901145); Scientific Research Project of Chongqing University of Technology (2009ZD55)

Abstract: Clustering, an unsupervised learning method, is an important research topic in data science. K-means is a partition-based clustering algorithm that typically relies on a heuristic to solve a discrete NP-hard problem. To make K-means more applicable to big data problems, a class of non-convex, continuous clustering optimization models equivalent to K-means is designed from the properties of the clustering matrix, and a fast optimization algorithm for the equivalent model is derived within the ADMM framework. Numerical experiments show that the model and its optimization algorithm are accurate and efficient for big data clustering. The properties of the model and its equivalence to K-means are also discussed.

Key words: K-means, clustering, sparse, alternating direction method of multipliers
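
For orientation, the heuristic that the abstract contrasts against can be sketched as follows. This is a minimal illustration of the classical K-means heuristic (Lloyd's algorithm), not the continuous model or the ADMM solver proposed in the paper, whose details are not given on this page. The function names and the choice of the first `k` points as initial centroids are assumptions made only for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lloyd_kmeans(
    points: Sequence[Sequence[float]], k: int, max_iter: int = 100
) -> Tuple[List[int], List[Point]]:
    """Classical K-means heuristic (Lloyd's algorithm), illustrative only.

    Assumption for this sketch: the first k points serve as the initial
    centroids, which keeps the example deterministic.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    centroids: List[Point] = [tuple(float(c) for c in p) for p in points[:k]]
    labels: List[int] = [-1] * len(points)

    for _ in range(max_iter):
        # Discrete step: assign every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the heuristic has converged
        labels = new_labels

        # Continuous step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))

    return labels, centroids


def kmeans_objective(
    points: Sequence[Sequence[float]], labels: Sequence[int], centroids: Sequence[Point]
) -> float:
    """Within-cluster sum of squared distances, the quantity K-means minimizes."""
    return sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 10), (10, 11)]
    labels, centroids = lloyd_kmeans(data, k=2)
    print(labels, centroids, kmeans_objective(data, labels, centroids))
```

The alternation between a discrete assignment step and a continuous centroid update is what makes the classical formulation combinatorial; the result depends on the initialization and is only guaranteed to be a local optimum.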