• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2011, Vol. 33 ›› Issue (12): 148-152.

• Paper •

CHEN Yiming1,2,LI Zhoujun1,LIU Junwan1   

  (1. School of Computer Science, National University of Defense and Technology, Changsha 410073; 2. School of Information Science and Technology, Hunan Agricultural University, Changsha 410128, China)
  • Received: 2009-09-07 Revised: 2009-12-15 Online: 2011-12-24 Published: 2011-11-25

Abstract:

This paper formulates protein function prediction as a typical LPU (learning from positive and unlabeled examples) problem. To address the class imbalance and overfitting that arise in LPU when few positive examples are available, it proposes a method that enlarges the set of positive examples with synthetic examples created from nearest neighbors and convex combinations. It also modifies the procedure by which the classic LPU algorithm learns the optimal classifier: a one-class SVM (support vector machine) identifies the most probable negative examples, an SVM is run iteratively to move the classification hyperplane to a suitable position, and representative negative examples are obtained through cross-validation. Experiments on yeast genomic data show that the algorithm outperforms several classic prediction methods, particularly for function classes with few positive examples.

Key words: protein function prediction; SVM; LPU
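
The abstract describes enlarging the positive set with synthetic examples built from nearest neighbors and convex combinations, but gives no procedure. The following is a minimal sketch of one way such a step could look, not the authors' implementation. The function name `synthesize_positives`, the neighbor count `k`, the Euclidean distance, and the uniformly drawn mixing weight are all assumptions made for illustration.

```python
import math
import random


def synthesize_positives(positives, n_new, k=3, seed=0):
    """Create n_new synthetic positives as convex combinations of near neighbors.

    Hypothetical helper: the name, k, the distance metric, and the uniform
    mixing weight are illustrative assumptions, not details from the paper.
    """
    if len(positives) < 2:
        raise ValueError("need at least two positive examples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(positives)
        # Rank the other positives by Euclidean distance to the base example.
        others = [p for p in positives if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        # Pick one of the k nearest neighbors at random.
        neighbor = rng.choice(others[:k])
        # Convex combination: lam lies in [0, 1), so the new point falls on
        # the segment between the base example and its neighbor.
        lam = rng.random()
        synthetic.append(
            tuple(lam * b + (1 - lam) * n for b, n in zip(base, neighbor))
        )
    return synthetic


# Toy feature vectors standing in for positive (annotated) proteins.
positives = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = synthesize_positives(positives, n_new=5)
```

Because every synthetic point is a convex combination of two existing positives, it stays inside the convex hull of the positive set. The enlarged set adds density without extrapolating beyond the observed feature ranges.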