Computer Engineering & Science
A New Bayesian Classification Algorithm for Non-Balance Datasets
Received date: 2009-03-13
Revised date: 2009-08-26
Online published: 2010-06-25
Based on the idea of semi-supervised learning, a new Bayesian classifier model that uses an improved EM (Expectation-Maximization) algorithm is proposed to classify and predict imbalanced data gathered from mobile communication networks. First, a statistical analysis of the actual data is performed to calculate the prior probabilities. Using these prior probabilities as the initial values of the Bayesian model speeds up the convergence of the EM algorithm. Second, a classifier based on the Bayesian network is constructed, which learns the category characteristics of the historical communication data through the improved EM steps. Third, this classifier is used to predict the label of the current data sample. The experimental results demonstrate that the proposed method markedly improves the prediction accuracy for the negative label and outperforms traditional statistical methods.
Key words: semi-supervised learning; Bayes
WANG Chunliang1,2, FU Yuchen2. A New Bayesian Classification Algorithm for Non-Balance Datasets[J]. Computer Engineering & Science, 2010, 32(7): 95-98. DOI: 10.3969/j.issn.1007130X.2010.
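
The three steps above can be illustrated with a minimal sketch. It is not the authors' improved EM algorithm, whose details the abstract does not give. It assumes a plain naive Bayes model over binary features and the standard semi-supervised EM loop, with parameters initialised from the class frequencies of the labelled data. All function names, the Laplace smoothing constant `alpha`, and the iteration count are hypothetical choices made for illustration.

```python
import math


def m_step(X, resp, n_classes, alpha=1.0):
    """M-step: re-estimate class priors and Bernoulli feature probabilities
    from (possibly soft) class responsibilities, with Laplace smoothing."""
    n_features = len(X[0])
    weight = [sum(r[k] for r in resp) for k in range(n_classes)]
    total = sum(weight)
    priors = [(w + alpha) / (total + n_classes * alpha) for w in weight]
    theta = []
    for k in range(n_classes):
        row = []
        for j in range(n_features):
            on = sum(r[k] * x[j] for r, x in zip(resp, X))
            row.append((on + alpha) / (weight[k] + 2 * alpha))
        theta.append(row)
    return priors, theta


def posterior(x, priors, theta):
    """E-step for one sample: P(class | x) under the naive Bayes model."""
    logs = []
    for p, row in zip(priors, theta):
        s = math.log(p)
        for xj, t in zip(x, row):
            s += math.log(t if xj else 1.0 - t)
        logs.append(s)
    m = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logs]
    z = sum(exps)
    return [e / z for e in exps]


def fit_semi_supervised(X_lab, y_lab, X_unlab, n_classes=2, n_iter=20):
    """Semi-supervised EM: labelled samples keep hard labels, unlabelled
    samples receive soft labels that are refreshed in every iteration."""
    hard = [[1.0 if y == k else 0.0 for k in range(n_classes)] for y in y_lab]
    # Step 1: initial values come from statistics of the labelled data only.
    priors, theta = m_step(X_lab, hard, n_classes)
    # Step 2: alternate E-step and M-step over labelled plus unlabelled data.
    for _ in range(n_iter):
        soft = [posterior(x, priors, theta) for x in X_unlab]
        priors, theta = m_step(X_lab + X_unlab, hard + soft, n_classes)
    return priors, theta


def predict(x, priors, theta):
    """Step 3: return the most probable class index for a new sample."""
    post = posterior(x, priors, theta)
    return max(range(len(post)), key=post.__getitem__)
```

Starting from the labelled class frequencies rather than from uniform or random values gives EM a starting point that already reflects the class imbalance, which is the intuition behind the faster convergence claimed in the abstract.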