• Official journal of the China Computer Federation
  • Core Journal of Chinese Science and Technology
  • Chinese Core Journal

J4 ›› 2011, Vol. 33 ›› Issue (12): 99-105.

• Papers •

An Ensemble Classifier for Mining Imbalanced Data Streams with Noise

OUYANG Zhenzheng1,TAO Zijin1,CAI Jianyu2,WU Quanyuan1   

  (1. School of Computer Science, National University of Defense Technology, Changsha 410073; 2. National Defense Information Academy, Wuhan 430010, China)
  • Received: 2009-06-02 Revised: 2009-09-27 Online: 2011-12-24 Published: 2011-12-25

Abstract:

Many real-world data stream mining applications involve learning from imbalanced data streams, and such applications expect higher predictive accuracy on the minority class. However, most classification models assume relatively balanced data streams and cannot handle an imbalanced distribution. In this paper, we propose a novel ensemble classifier framework (IMDAP) for mining concept-drifting and noisy data streams with imbalanced distribution, using an averaged probability ensemble framework and a sampling technique. Our empirical study shows that IMDAP is superior, improving both the capability of the classifier and its classification accuracy on the minority class.

Key words: imbalanced data streams; concept drift; noise; ensemble classifier
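
The abstract names two ingredients, an averaged probability ensemble and a sampling technique, without giving their details. The following is only a minimal sketch of that general idea, not the IMDAP algorithm itself. Everything in it is an assumption made for illustration: the names `CentroidModel`, `rebalance`, `AveragedProbabilityEnsemble` and `make_chunk`, the toy centroid-based base learner, the choice of random undersampling of the majority class, and the fixed-size window of ensemble members. It does not attempt to model the paper's treatment of noise.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (features, label); label 1 = minority class


class CentroidModel:
    """Hypothetical toy base learner: compares distances to the two class centroids."""

    def __init__(self, data: List[Example]) -> None:
        self.centroids: Dict[int, List[float]] = {}
        for label in (0, 1):
            rows = [x for x, y in data if y == label]
            if not rows:
                raise ValueError(f"training data has no examples of class {label}")
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def prob_minority(self, x: Sequence[float]) -> float:
        # Two-class softmax over negative squared distances, clamped to avoid overflow.
        d0 = sum((a - b) ** 2 for a, b in zip(x, self.centroids[0]))
        d1 = sum((a - b) ** 2 for a, b in zip(x, self.centroids[1]))
        z = max(-50.0, min(50.0, d1 - d0))
        return 1.0 / (1.0 + math.exp(z))


def rebalance(chunk: List[Example], rng: random.Random, ratio: float = 1.0) -> List[Example]:
    """Assumed sampling step: keep every minority example, undersample the majority."""
    minority = [e for e in chunk if e[1] == 1]
    majority = [e for e in chunk if e[1] == 0]
    keep = min(len(majority), max(1, int(ratio * len(minority))))
    return minority + rng.sample(majority, keep)


class AveragedProbabilityEnsemble:
    """Averages the minority-class probability over the most recent chunk models."""

    def __init__(self, max_members: int = 5, seed: int = 0) -> None:
        self.members: List[CentroidModel] = []
        self.max_members = max_members
        self.rng = random.Random(seed)

    def update(self, chunk: List[Example]) -> None:
        if {y for _, y in chunk} != {0, 1}:
            return  # a one-class chunk cannot train a two-class member
        self.members.append(CentroidModel(rebalance(chunk, self.rng)))
        # Keep only the newest members so the ensemble can follow concept drift.
        self.members = self.members[-self.max_members:]

    def prob_minority(self, x: Sequence[float]) -> float:
        if not self.members:
            raise RuntimeError("ensemble has no trained members yet")
        return sum(m.prob_minority(x) for m in self.members) / len(self.members)

    def predict(self, x: Sequence[float], threshold: float = 0.5) -> int:
        return int(self.prob_minority(x) >= threshold)


def make_chunk(rng: random.Random, n_majority: int = 95, n_minority: int = 5) -> List[Example]:
    """Synthetic imbalanced chunk: majority near (0, 0), minority near (3, 3)."""
    chunk: List[Example] = [
        ([rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)], 0) for _ in range(n_majority)
    ]
    chunk += [([rng.gauss(3.0, 0.5), rng.gauss(3.0, 0.5)], 1) for _ in range(n_minority)]
    rng.shuffle(chunk)
    return chunk
```

A stream would be processed by calling `update` once per arriving chunk and `predict` on new points. Training each member on a rebalanced chunk keeps the rare class from being swamped, and averaging probabilities rather than counting votes lets a confident member outweigh several uncertain ones; whether IMDAP makes the same choices is not stated in the abstract.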