J4 ›› 2011, Vol. 33 ›› Issue (6): 138-143.
YUAN Liyong (袁利永), WANG Jiyi (王基一)
Supported by: 2010 Zhejiang Provincial Department of Education Project (Y201016493)
Abstract:
Semi-supervised clustering uses a small amount of labeled data to guide the learning of unlabeled data and thereby improve clustering performance. Clustering algorithms based on K-means are poor at discovering non-spherical clusters. To address this problem, this paper proposes incorporating the gravitational influence of labeled data points on unlabeled data points into the cluster assignment decision, defines the degree of gravitational influence between a class and a point, and designs a semi-supervised K-means clustering algorithm with a gravitational parameter. Experiments show that the algorithm performs better than existing semi-supervised K-means methods on data with non-spherical cluster distributions.
Key words: semi-supervised clustering; constrained K-means; labeled data; gravitational influence; non-spherical cluster
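The abstract does not give the definition of the gravitational influence degree, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a seeded, constrained K-means in which labeled points keep their labels, and a hypothetical influence term: each labeled point pulls on an unlabeled point with strength 1 / (squared distance), the pulls are summed per class and normalised, and the squared centroid distance is discounted by `g` times that share. The function name `gravity_kmeans`, the parameter `g`, and the influence formula are all illustrative assumptions.

```python
import numpy as np

def gravity_kmeans(X, seed_idx, seed_labels, k, g=0.5, n_iter=50, eps=1e-9):
    """Seeded K-means with a hypothetical gravitational term (illustration only).

    Assumes every class 0..k-1 has at least one labeled (seed) point.
    """
    X = np.asarray(X, dtype=float)
    seed_idx = np.asarray(seed_idx)
    seed_labels = np.asarray(seed_labels)

    # Initialise each centroid as the mean of the labeled points of its class.
    centers = np.stack([X[seed_idx[seed_labels == c]].mean(axis=0)
                        for c in range(k)])

    # Assumed influence of class c on point x: sum over labeled points s of
    # class c of 1 / (||x - s||^2 + eps), normalised so each row sums to 1.
    d2_seed = ((X[:, None, :] - X[seed_idx][None, :, :]) ** 2).sum(axis=2)
    pull = 1.0 / (d2_seed + eps)              # shape (n_points, n_seeds)
    onehot = np.eye(k)[seed_labels]           # shape (n_seeds, k)
    influence = pull @ onehot                 # shape (n_points, k)
    influence /= influence.sum(axis=1, keepdims=True)

    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        # Squared distance from every point to every centroid.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Discount the distance to classes whose labeled points pull strongly.
        cost = d2 * (1.0 - g * influence)
        new_labels = cost.argmin(axis=1)
        new_labels[seed_idx] = seed_labels    # labeled points never change class
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Seeds guarantee that no cluster is empty here.
        centers = np.stack([X[labels == c].mean(axis=0) for c in range(k)])
    return labels, centers
```

With `g=0` the sketch reduces to ordinary seeded, constrained K-means; larger values of `g` let nearby labeled points outweigh centroid distance, which is the kind of effect the abstract credits for better handling of non-spherical clusters.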
YUAN Liyong, WANG Jiyi. An Improved Semi-Supervised K-Means Clustering Algorithm[J]. J4, 2011, 33(6): 138-143.
Link to this article: http://joces.nudt.edu.cn/CN/
http://joces.nudt.edu.cn/CN/Y2011/V33/I6/138