• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2021, Vol. 43 ›› Issue (10): 1880-1890.

• Artificial Intelligence and Data Mining •

An underdetermined blind source separation algorithm based on K-means with affinity propagation

HE Xuan-sen (何选森)1,2, XU Li (徐丽)1, XIA Juan (夏娟)1

  1. (1. School of Information Technology and Engineering, Guangzhou College of Commerce, Guangzhou 511363;

    2. College of Information Science and Engineering, Hunan University, Changsha 410082, China)

  • Received: 2020-04-21 Revised: 2021-04-20 Accepted: 2021-10-25 Online: 2021-10-25 Published: 2021-10-22
  • About the author: XU Li (许力, born 1997), male, from Hefei, Anhui; master's student; research interest: natural language processing.
  • Funding:
     National Natural Science Foundation of China (60572183)

Abstract: In underdetermined blind source separation of sparse sources, estimating the mixing matrix is the crucial step. To improve the estimation performance, a combined cluster analysis algorithm is proposed. First, the short-time Fourier transform is used to convert the observed time-domain signals into sparse signals in the frequency domain, and normalization of the data converts the linear clustering of the sparse signals in the frequency domain into compact clustering. Second, affinity propagation (AP) clustering searches the neighborhood of each observed data point and automatically determines the number of data clusters and the corresponding key data points (exemplars). Finally, the results of AP clustering are used as the initial values of the K-means algorithm, which further refines the cluster center of each class of data. Simulation results show that the combined clustering algorithm effectively improves the estimation accuracy of the underdetermined mixing matrix. Another advantage of combining AP clustering with K-means is that it overcomes two drawbacks of the classic K-means algorithm: it needs to know the number of sources in advance, and it is very sensitive to the initial partition of the data.

Key words: underdetermined blind source separation, sparse representation, mixing matrix estimation, affinity propagation, K-means
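
The clustering stages described in the abstract (normalization, AP clustering, K-means refinement) can be sketched as follows. This is a minimal illustration, not the authors' implementation. It makes several assumptions that do not come from the paper: the short-time Fourier transform step is skipped and replaced by synthetic, real-valued, strictly sparse data with two sensors and three sources; the AP similarity is the negative squared Euclidean distance; the AP preference is the median similarity; and all function names (`normalize_columns`, `affinity_propagation`, `kmeans_refine`, `estimate_mixing_matrix`, `synthetic_mixture`) are hypothetical.

```python
import numpy as np


def normalize_columns(X):
    """Map each observation (column of X) onto the unit half-sphere."""
    X = X / np.linalg.norm(X, axis=0, keepdims=True)
    # x and -x lie on the same line, so flip signs until the last coordinate is >= 0.
    return X * np.where(X[-1] < 0, -1.0, 1.0)


def affinity_propagation(P, damping=0.7, n_iter=300):
    """Return the indices of the exemplars found among the rows of P."""
    n = len(P)
    rows = np.arange(n)
    S = -((P[:, None, :] - P[None, :, :]) ** 2).sum(axis=2)  # similarities
    S[rows, rows] = np.median(S)  # shared preference (an assumed choice)
    R = np.zeros((n, n))  # responsibilities
    A = np.zeros((n, n))  # availabilities
    for _ in range(n_iter):
        # Responsibility update: r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        AS = A + S
        best = AS.argmax(axis=1)
        first = AS[rows, best]
        AS[rows, best] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, best] = S[rows, best] - second
        R = damping * R + (1 - damping) * R_new
        # Availability update: a(i,k) = min(0, r(k,k) + sum of positive r(i',k))
        Rp = np.maximum(R, 0)
        Rp[rows, rows] = R[rows, rows]
        A_new = Rp.sum(axis=0)[None, :] - Rp
        self_avail = A_new[rows, rows].copy()  # a(k,k) is not clipped at zero
        A_new = np.minimum(A_new, 0)
        A_new[rows, rows] = self_avail
        A = damping * A + (1 - damping) * A_new
    return np.flatnonzero((A + R)[rows, rows] > 0)


def kmeans_refine(P, centers, n_iter=50):
    """Lloyd iterations started from the given centers (assumes no cluster empties)."""
    for _ in range(n_iter):
        d = ((P[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new = np.array([P[labels == k].mean(axis=0) for k in range(len(centers))])
        if np.allclose(new, centers):
            break
        centers = new
    return centers


def estimate_mixing_matrix(X):
    """Estimate unit-norm mixing-matrix columns from observations X (sensors x samples)."""
    P = normalize_columns(X).T            # one normalized point per row
    exemplars = affinity_propagation(P)   # number of clusters comes from AP
    centers = kmeans_refine(P, P[exemplars])
    A_hat = centers.T
    return A_hat / np.linalg.norm(A_hat, axis=0, keepdims=True)


def synthetic_mixture(T=150, seed=0):
    """Toy data: 2 sensors, 3 sources, exactly one source active per sample."""
    rng = np.random.default_rng(seed)
    angles = np.deg2rad([20.0, 75.0, 130.0])
    A = np.vstack([np.cos(angles), np.sin(angles)])  # true 2 x 3 mixing matrix
    active = rng.integers(0, 3, size=T)
    amp = rng.uniform(0.5, 2.0, size=T) * rng.choice([-1.0, 1.0], size=T)
    X = A[:, active] * amp + 0.02 * rng.standard_normal((2, T))
    return A, X


if __name__ == "__main__":
    A_true, X = synthetic_mixture()
    A_hat = estimate_mixing_matrix(X)
    print("estimated column angles (deg):",
          np.sort(np.degrees(np.arctan2(A_hat[1], A_hat[0]))))
```

In this sketch the number of clusters is never passed to K-means by hand; it is taken from the number of AP exemplars, which mirrors the advantage stated in the abstract. The estimated columns are recovered only up to ordering and sign, as is usual for blind separation.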