Computer Engineering & Science ›› 2022, Vol. 44 ›› Issue (05): 788-799.
• High Performance Computing •
ZHANG Xi-long,HAN Meng,CHEN Zhi-qiang,WU Hong-xin,LI Mu-hang
Abstract: Imbalanced data streams seriously degrade the classification performance of learning algorithms, and the emergence of concept drift is a difficult problem in stream data mining. To improve classification performance when both problems occur together, a new Boosting Classification Algorithm for imbalanced drifting data streams based on Hellinger Distance (BCA-HD) is proposed. The algorithm uses a weighted combination at the instance level and the classifier level to dynamically update the classifiers and adapt to concept drift. The ensemble algorithm SMOTEBoost serves as the base classifier at the bottom layer and uses resampling to handle the imbalanced data. The proposed algorithm is compared with 9 other algorithms on 16 datasets with abrupt and gradual drift. BCA-HD ranks first in both the average value and the average ranking of G-mean and AUC. The experiments show that the algorithm adapts better to the simultaneous occurrence of concept drift and class imbalance, which helps improve classification performance.
Key words: data stream, imbalanced data, concept drift, Boosting, Hellinger distance
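For readers unfamiliar with the two quantities named in the abstract, the following is a minimal sketch of the standard Hellinger distance between two discrete distributions and of the G-mean metric for binary classification. The function names and the example counts are illustrative assumptions; the abstract does not specify how BCA-HD computes or applies the distance, so this should not be read as the authors' implementation.

```python
import math


def hellinger_distance(p, q):
    """Standard Hellinger distance between two discrete distributions.

    p and q are sequences of probabilities over the same bins.
    The result lies in [0, 1]: 0 for identical distributions,
    1 for distributions with disjoint support.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared) / math.sqrt(2)


def g_mean(tp, fn, tn, fp):
    """Geometric mean of the true positive rate and true negative rate.

    Hypothetical helper for illustration; takes confusion-matrix counts.
    A classifier that ignores the minority class scores 0.
    """
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(tpr * tnr)


if __name__ == "__main__":
    # Illustrative class distributions in two consecutive data chunks.
    old_chunk = [0.9, 0.1]
    new_chunk = [0.6, 0.4]
    print(hellinger_distance(old_chunk, new_chunk))
    # Illustrative confusion-matrix counts for an imbalanced test set.
    print(g_mean(tp=8, fn=2, tn=90, fp=10))
```

A larger distance between consecutive chunks suggests a larger change in the underlying distribution, which is one common way such a measure is used as a drift signal. G-mean rewards balanced accuracy on both classes, which is why it is preferred over plain accuracy on imbalanced data.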
ZHANG Xi-long, HAN Meng, CHEN Zhi-qiang, WU Hong-xin, LI Mu-hang. A Boosting classification algorithm for imbalanced drift data stream based on Hellinger distance[J]. Computer Engineering & Science, 2022, 44(05): 788-799.
URL: http://joces.nudt.edu.cn/EN/Y2022/V44/I05/788