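• Journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science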


A predictive subspace clustering method of high-dimensional data based on RPCA

LÜ Hong-wei, WANG Shi-tong

  1. (School of Digital Media, Jiangnan University, Wuxi 214122, China)
  • Received: 2015-09-22 Revised: 2015-12-07 Online: 2017-03-25 Published: 2017-03-25
  • Supported by:

    National Natural Science Foundation of China (61272210)

Abstract:

The predictive subspace clustering (PSC) algorithm is built on the PCA model and therefore cannot perform principal component analysis robustly, so its clustering performance degrades severely on data with strong noise. To improve the robustness of PSC to noise, we use robust principal component analysis (RPCA) decomposition, which has received wide attention in recent years, to obtain the low-rank structure of the data and extract subspaces robustly. Specifically, we integrate the RPCA model into the PSC algorithm and propose an RPCA-based predictive subspace clustering algorithm. The proposed algorithm detects influential observations under the RPCA model; it not only carries out variable selection and model selection effectively but, more importantly, improves the clustering performance of PSC in noisy environments. Experimental results on real gene expression data sets show that, compared with the classical PSC algorithm, the improved algorithm offers a clustering advantage and good robustness both with and without added noise.

Key words: RPCA, subspace clustering, variable selection, model selection, robustness
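
The abstract relies on RPCA to split the data into a low-rank part and a noise part, but it does not say which solver is used. As an illustration only, the sketch below implements the generic principal component pursuit formulation (minimize the nuclear norm of the low-rank part plus lam times the l1 norm of the sparse part, subject to their sum equalling the data) with an inexact augmented Lagrangian iteration. It is not the authors' implementation and does not include the PSC clustering step; the function names, the default lam = 1/sqrt(max(m, n)), and the growth factor rho are assumptions taken from common practice, not from this paper.

```python
import numpy as np


def soft_threshold(x, tau):
    """Elementwise shrinkage: proximal operator of tau * l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def singular_value_threshold(x, tau):
    """Shrink singular values by tau: proximal operator of tau * nuclear norm."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt


def rpca(x, lam=None, rho=1.5, tol=1e-7, max_iter=200):
    """Split x into (low_rank, sparse) with x ~= low_rank + sparse.

    Hypothetical helper for illustration; lam and rho defaults are
    assumptions (common choices), not values reported in the paper.
    """
    x = np.asarray(x, dtype=float)
    m, n = x.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))
    spectral = np.linalg.norm(x, 2)          # largest singular value
    mu = 1.25 / spectral                     # initial penalty weight
    dual = x / max(spectral, np.abs(x).max() / lam)
    low_rank = np.zeros_like(x)
    sparse = np.zeros_like(x)
    norm_x = np.linalg.norm(x)               # Frobenius norm
    for _ in range(max_iter):
        # Update the low-rank part, then the sparse part, then the multiplier.
        low_rank = singular_value_threshold(x - sparse + dual / mu, 1.0 / mu)
        sparse = soft_threshold(x - low_rank + dual / mu, lam / mu)
        residual = x - low_rank - sparse
        dual = dual + mu * residual
        mu *= rho                            # tighten the constraint each pass
        if np.linalg.norm(residual) <= tol * norm_x:
            break
    return low_rank, sparse


if __name__ == "__main__":
    # Synthetic check: rank-2 data plus a few large, sparse corruptions.
    rng = np.random.default_rng(0)
    true_low = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 50))
    mask = rng.random((50, 50)) < 0.05
    x = true_low + mask * rng.uniform(-10.0, 10.0, size=(50, 50))
    low_rank, sparse = rpca(x)
    err = np.linalg.norm(low_rank - true_low) / np.linalg.norm(true_low)
    print("relative error of recovered low-rank part:", err)
```

In a pipeline like the one the abstract describes, the recovered low-rank part would stand in for the raw data when subspaces are fitted, while large entries of the sparse part flag heavily corrupted measurements. How the paper actually couples this decomposition with PSC is not stated in the abstract.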