• Official journal of the China Computer Federation (CCF)
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science

• Graphics and Images

一种面向识别的无监督特征学习算法

夏海蛟1,2,谭毅华1,2   

  1. (1.华中科技大学自动化学院,湖北 武汉 430074;2.多谱信息处理技术国家级重点实验室,湖北 武汉 430074)
  • 收稿日期:2016-09-19 修回日期:2016-12-20 出版日期:2018-06-25 发布日期:2018-06-25
  • Funding:

    National Natural Science Foundation of China (41371339)

A recognition-oriented unsupervised feature learning algorithm
 

XIA Haijiao 1,2,TAN Yihua 1,2   

  1. (1. School of Automation, Huazhong University of Science and Technology, Wuhan 430074;
    2. National Key Laboratory of Science and Technology on Multispectral Information Processing, Wuhan 430074, China)
  • Received: 2016-09-19 Revised: 2016-12-20 Online: 2018-06-25 Published: 2018-06-25

摘要:

特征抽取是图像识别的关键环节,准确的特征表达能够产生更准确的分类效果。采用软阈值编码器和正交匹配追踪(OMP)算法正交化视觉词典的方法,以提高单级计算结构的识别率,并进一步构造两级计算结构,获取图像更准确的特征,以提高图像的识别率。实验表明,采用软阈值编码器和OMP算法能提高单级计算结构提取特征的能力,提高大样本数据集中图像的识别率。两级计算结构能够提高自选数据集中图像的识别率。采用OMP算法能提高VOC2012数据中图像的识别率。在自选数据集上,两级计算结构优于单级计算结构,与NIN结构相比表现出优势,与卷积神经网络CNN相当,说明两级计算结构在自选数据集上有很好的适应性。
 


 

关键词: 无监督学习, Kmeans, OMP, 编码器, 平均值池化, 空间金字塔池化

Abstract:

Feature extraction is a key step in image recognition, and a more accurate feature representation yields more accurate classification. We adopt a soft-threshold encoder and use the orthogonal matching pursuit (OMP) algorithm to orthogonalize the visual dictionary in order to improve the recognition rate of a single-stage computational structure. We further build a two-stage computational structure that extracts more accurate image features and thereby raises the recognition rate. Experiments show that the soft-threshold encoder and the OMP algorithm improve the feature extraction ability of the single-stage structure and raise the recognition rate on large-sample datasets. The two-stage structure improves the recognition rate on self-selected datasets, and the OMP algorithm improves the recognition rate on the VOC2012 dataset. On self-selected datasets, the two-stage structure outperforms both the single-stage structure and network in network (NIN) and performs comparably to a convolutional neural network (CNN), indicating that it adapts well to such datasets.
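
The abstract does not spell out the encoder or the pooling step, so the following is only a minimal sketch of how a soft-threshold encoder followed by average pooling is commonly written. The form max(0, D^T x - alpha), the threshold `alpha=0.25`, the 2 x 2 pooling grid, and the random dictionary (standing in for a K-means dictionary) are all assumptions for illustration, not the authors' settings.

```python
import numpy as np

def soft_threshold_encode(patches, dictionary, alpha=0.25):
    """Encode each patch x as max(0, D x - alpha).

    patches:    (n, d) array, one flattened patch per row.
    dictionary: (k, d) array, one visual word per row (assumed unit norm).
    alpha:      hypothetical threshold; responses below it are zeroed.
    """
    return np.maximum(0.0, patches @ dictionary.T - alpha)

def average_pool(codes, grid=2):
    """Average codes of shape (h, w, k) over a grid x grid spatial layout."""
    pooled = [
        cell.mean(axis=(0, 1))
        for rows in np.array_split(codes, grid, axis=0)
        for cell in np.array_split(rows, grid, axis=1)
    ]
    return np.concatenate(pooled)  # length grid * grid * k

# Toy data: a random unit-norm dictionary stands in for one learned by K-means.
rng = np.random.default_rng(0)
dictionary = rng.standard_normal((16, 36))
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)

patches = rng.standard_normal((8 * 8, 36))  # an 8 x 8 grid of 6 x 6 patches
codes = soft_threshold_encode(patches, dictionary).reshape(8, 8, 16)
feature = average_pool(codes, grid=2)       # one vector per image
```

Here `feature` is the vector a classifier would consume in a single-stage structure; a two-stage structure would, under the same assumptions, repeat encoding and pooling on the first-stage output.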
 

Key words: unsupervised learning, K-means, OMP, encoder, average pooling, spatial pyramid pooling
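
For readers unfamiliar with the OMP keyword, the sketch below shows the textbook orthogonal matching pursuit loop: greedily pick the dictionary atom most correlated with the residual, then refit all selected atoms by least squares. It is a generic illustration; how the paper applies OMP to the visual dictionary is not described in the abstract, and the function name `omp` and the parameter `n_nonzero` are hypothetical.

```python
import numpy as np

def omp(dictionary, x, n_nonzero):
    """Greedy sparse coding of x over the columns of dictionary.

    dictionary: (d, k) array whose columns are atoms (assumed unit norm).
    x:          (d,) signal to approximate.
    n_nonzero:  maximum number of atoms to select.
    """
    coef = np.zeros(dictionary.shape[1])
    residual = x.astype(float)
    support = []
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(dictionary.T @ residual)))
        if j in support:
            break  # no new atom helps; the residual is already explained
        support.append(j)
        # Refit all selected atoms jointly so the residual stays
        # orthogonal to every atom chosen so far.
        sol, *_ = np.linalg.lstsq(dictionary[:, support], x, rcond=None)
        residual = x - dictionary[:, support] @ sol
        coef[:] = 0.0
        coef[support] = sol
    return coef

# Usage: with an orthonormal dictionary, OMP recovers the two nonzero entries.
code = omp(np.eye(4), np.array([3.0, 0.0, -2.0, 0.0]), n_nonzero=2)
```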