• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal
Article

A New Bayesian Classification Algorithm for Non-Balance Datasets

  • 1. No.2 Hospital Affiliated to Suzhou University, Suzhou 215004;
  • 2. School of Computer Science and Technology, Suzhou University, Suzhou 215006, China

Received date: 2009-03-13

  Revised date: 2009-08-26

  Online published: 2010-06-25

Abstract

Based on the idea of semi-supervised learning, a new Bayesian classifier model that uses an improved EM (Expectation-Maximization) algorithm is proposed to classify and predict non-balance data gathered from mobile communication networks. Firstly, a statistical analysis of the actual data is performed to calculate the prior probabilities. Using these prior probabilities as the initial values of the Bayesian model speeds up the convergence of the EM algorithm. Secondly, a classifier based on the Bayesian network is constructed, and improved EM steps are used to learn the category characteristics of the historical communication data. Thirdly, this classifier is used to predict the label of the current data sample. The experimental results show that the proposed method substantially increases the prediction accuracy for the negative label and performs better than traditional statistical methods.
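The three steps in the abstract can be illustrated with a minimal sketch. This is not the authors' improved EM algorithm or their Bayesian network: it assumes a plain naive Bayes model over binary features, Laplace smoothing, and a fixed number of iterations, and all function names and the toy data are hypothetical. It only shows the general pattern of initializing the parameters from labeled-data statistics and then refining them with EM over unlabeled samples.

```python
import math

def m_step(X, resp, n_classes):
    """Re-estimate class priors and Bernoulli feature probabilities
    from (possibly soft) class responsibilities, with Laplace smoothing."""
    n_features = len(X[0])
    mass = [sum(r[k] for r in resp) for k in range(n_classes)]
    priors = [(1.0 + mass[k]) / (n_classes + len(X)) for k in range(n_classes)]
    theta = [[(1.0 + sum(r[k] * x[j] for x, r in zip(X, resp))) / (2.0 + mass[k])
              for j in range(n_features)]
             for k in range(n_classes)]
    return priors, theta

def posterior(x, priors, theta):
    """Class posterior P(k | x) under the naive Bayes assumption (E-step for one sample)."""
    logp = []
    for pk, tk in zip(priors, theta):
        s = math.log(pk)
        for v, t in zip(x, tk):
            s += math.log(t if v else 1.0 - t)
        logp.append(s)
    top = max(logp)                      # subtract the max for numerical stability
    w = [math.exp(lp - top) for lp in logp]
    z = sum(w)
    return [wi / z for wi in w]

def fit_em(X_lab, y_lab, X_unl, n_classes=2, n_iter=20):
    """Semi-supervised fit: initialize from labeled statistics, then iterate EM."""
    hard = [[1.0 if k == y else 0.0 for k in range(n_classes)] for y in y_lab]
    # Step 1: initial priors and feature probabilities from the labeled data.
    priors, theta = m_step(X_lab, hard, n_classes)
    for _ in range(n_iter):
        # E-step: soft labels for the unlabeled samples.
        soft = [posterior(x, priors, theta) for x in X_unl]
        # M-step: re-estimate from labeled (hard) and unlabeled (soft) samples.
        priors, theta = m_step(X_lab + X_unl, hard + soft, n_classes)
    return priors, theta

def predict(x, priors, theta):
    """Step 3: return the most probable class index for a new sample."""
    p = posterior(x, priors, theta)
    return max(range(len(p)), key=p.__getitem__)

# Hypothetical toy data: class 0 is the majority, class 1 the minority.
X_lab = [[1, 0], [1, 0], [1, 0], [1, 1], [0, 1]]
y_lab = [0, 0, 0, 0, 1]
X_unl = [[1, 0], [1, 0], [0, 1], [1, 0], [0, 1]]
priors, theta = fit_em(X_lab, y_lab, X_unl)
```

Starting EM from the empirical estimates rather than from random values is the part of this sketch that corresponds to the abstract's first step; how the paper modifies the EM steps themselves is not stated in the abstract and is not reproduced here.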

Cite this article

WANG Chunliang1,2,FU Yuchen2 . A New Bayesian Classification Algorithmfor NonBalance Datasets[J]. Computer Engineering & Science, 2010 , 32(7) : 95 -98 . DOI: 10.3969/j.issn.1007130X.2010.
