• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2013, Vol. 35 ›› Issue (9): 162-166.

• Article •

Semi-supervised training method based on the RSC model and noise removal

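YUAN Xingmei, XIE Xuelian

  1. (Information Construction and Management Office, Nanjing Institute of Technology, Nanjing 211167, Jiangsu, China)
  • Received: 2012-05-25 Revised: 2012-08-27 Published: 2013-09-25 Online: 2013-09-25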
  • Funding:

    Nanjing Institute of Technology Youth Fund Project (QKJB2011028)

Semi-supervised training approach based on RSC model and noise removing

YUAN Xingmei,XIE Xuelian   

  1. (Information Construction and Management Office,Nanjing Institute of Technology,Nanjing 211167,China)
  • Received:2012-05-25 Revised:2012-08-27 Online:2013-09-25 Published:2013-09-25

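Abstract (translated from Chinese):

Semi-supervised learning trains a classifier using labeled training samples together with unlabeled ones. The traditional semi-supervised training process does not identify noise, so noisy samples disrupt classifier training and in turn degrade classification performance. To address this problem, a semi-supervised training method based on the RSC model and noise removal is proposed: during sample training it uses RSC label extension and adds a noise-removal step. Experiments show that the algorithm effectively reduces the impact of noise on the classifier in semi-supervised learning, yields more accurate classification boundaries, and ultimately improves the performance and stability of the algorithm.

Keywords: semi-supervised learning, noise removal, classifier training, RSC model, label extension, training set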

Abstract:

Semi-supervised learning trains a classifier on both labeled and unlabeled data. Traditional semi-supervised training methods do not distinguish noisy samples, which can disrupt the training process and in turn degrade the classifier's results. To address this problem, a semi-supervised training approach based on the RSC model and noise removal is proposed. A noise-removal step is added to the traditional approach when training on the unlabeled samples. Experiments show that the algorithm both improves classification accuracy and makes the algorithm more stable.

Key words: semi-supervised learning; noise removal; classifier training; RSC model; label extension; training set
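
The training loop outlined in the abstract (extend labels to unlabeled samples, remove noisy ones, retrain) can be illustrated with a small sketch. This page does not define the RSC model, so the sketch is not the authors' method: it assumes a nearest-centroid classifier as a stand-in for label extension and a k-nearest-neighbor agreement check as a stand-in for noise removal. All function names (`centroid_fit`, `centroid_predict`, `remove_noise`, `self_train`) and parameters (`rounds`, `k`) are hypothetical.

```python
import math


def centroid_fit(samples):
    """Compute one centroid per class from (point, label) pairs."""
    sums, counts = {}, {}
    for point, label in samples:
        acc = sums.setdefault(label, [0.0] * len(point))
        for i, value in enumerate(point):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def centroid_predict(centroids, point):
    """Return the label whose centroid is closest to the point."""
    return min(centroids, key=lambda lab: math.dist(centroids[lab], point))


def remove_noise(candidates, trusted, k=3):
    """Assumed noise filter: keep a pseudo-labeled candidate only if a strict
    majority of its k nearest trusted neighbors carry the same label."""
    kept = []
    for point, label in candidates:
        neighbors = sorted(trusted, key=lambda s: math.dist(s[0], point))[:k]
        votes = sum(1 for _, lab in neighbors if lab == label)
        if votes * 2 > len(neighbors):
            kept.append((point, label))
    return kept


def self_train(labeled, unlabeled, rounds=3, k=3):
    """Self-training loop: pseudo-label the pool, filter noise, retrain."""
    train = list(labeled)
    pool = list(unlabeled)  # points are tuples so they can be put in a set
    for _ in range(rounds):
        if not pool:
            break
        centroids = centroid_fit(train)
        # Stand-in for label extension: pseudo-label every pooled point.
        candidates = [(p, centroid_predict(centroids, p)) for p in pool]
        # Stand-in for noise removal: drop candidates their neighbors reject.
        accepted = remove_noise(candidates, train, k)
        if not accepted:
            break
        train.extend(accepted)
        accepted_points = {p for p, _ in accepted}
        pool = [p for p in pool if p not in accepted_points]
    return centroid_fit(train), train
```

Calling `self_train(labeled, unlabeled)` with `labeled` as a list of `(point, label)` pairs and `unlabeled` as a list of coordinate tuples returns the final centroids and the extended training set. Rejected candidates stay in the pool and are reconsidered in later rounds, which is one possible design choice rather than something the abstract specifies.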