• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science


A predictive subspace clustering method of high-dimensional data based on RPCA

L Hong-wei, WANG Shi-tong

  1. (School of Digital Media, Jiangnan University, Wuxi 214122, China)
  • Received: 2015-09-22 Revised: 2015-12-07 Online: 2017-03-25 Published: 2017-03-25

Abstract:

The predictive subspace clustering (PSC) algorithm relies on a principal component analysis (PCA) model that is not robust, so its clustering performance degrades severely on data with strong noise. To improve the robustness of PSC to noise, we use robust principal component analysis (RPCA), a decomposition technique that has received extensive attention in recent years, to recover the low-rank structure of the data and extract subspaces robustly. We integrate the RPCA model into the PSC algorithm and propose an RPCA-based predictive subspace clustering algorithm. The proposed algorithm can detect influential observations in the RPCA model and effectively carry out variable selection and model selection; more importantly, it improves the clustering performance of PSC in noisy environments. Experimental results on real gene expression data sets show that, compared with the classical PSC algorithm, the improved algorithm achieves better clustering performance and robustness both with and without noise.

Key words: RPCA, subspace clustering, variable selection, model selection, robustness
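
For readers unfamiliar with the RPCA decomposition mentioned in the abstract, the following is a minimal sketch of one common formulation, principal component pursuit, which splits a data matrix M into a low-rank part L and a sparse part S by minimizing ||L||_* + lam * ||S||_1 subject to L + S = M. It is not the authors' algorithm and does not include the PSC step. The function names (`shrink`, `svd_threshold`, `rpca_pcp`) and the default values of `lam` and `mu` are assumptions chosen for illustration, following defaults commonly used in the principal component pursuit literature.

```python
import numpy as np


def shrink(X, tau):
    """Elementwise soft-thresholding: pull each entry toward zero by tau."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)


def svd_threshold(X, tau):
    """Singular value thresholding: soft-threshold the singular values of X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt


def rpca_pcp(M, lam=None, mu=None, tol=1e-7, max_iter=1000):
    """Illustrative principal component pursuit via an augmented Lagrangian.

    Returns (L, S) with M approximately equal to L + S, where L is low rank
    and S is sparse. Default lam and mu are assumed, commonly used values.
    """
    M = np.asarray(M, dtype=float)
    n1, n2 = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(n1, n2))
    if mu is None:
        mu = n1 * n2 / (4.0 * np.abs(M).sum())

    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)  # Lagrange multiplier for the constraint L + S = M
    norm_M = np.linalg.norm(M)

    for _ in range(max_iter):
        # Update the low-rank part with the sparse part held fixed.
        L = svd_threshold(M - S + Y / mu, 1.0 / mu)
        # Update the sparse part with the low-rank part held fixed.
        S = shrink(M - L + Y / mu, lam / mu)
        # Dual ascent on the constraint residual.
        residual = M - L - S
        Y = Y + mu * residual
        if np.linalg.norm(residual) <= tol * norm_M:
            break
    return L, S
```

In a pipeline like the one the abstract describes, the recovered low-rank matrix `L` would stand in for the noisy data when fitting subspaces, while large entries of `S` would flag corrupted or influential observations; how the paper couples this with PSC is not specified here.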