Computer Engineering & Science
ZHANG Zhonglin, WU Dangping
Abstract:
Class imbalance is widespread in real-world data. Most traditional classifiers assume a balanced class distribution or equal misclassification costs, so their classification performance degrades seriously on imbalanced data. To address the classification of imbalanced data sets, we propose a probability threshold Bagging classification algorithm, called PT-Bagging. The algorithm combines the threshold-moving technique with the Bagging ensemble algorithm: in the training phase it trains on the training set with its original class distribution, and in the prediction phase it introduces a decision threshold-moving method that uses calibrated posterior probability estimates to maximize the average performance measure for imbalanced data classification. Experimental results show that the PT-Bagging algorithm can better classify imbalanced data.
Key words: imbalanced data, threshold-moving, Bagging ensemble learning, posterior probability
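The general idea of combining Bagging with threshold-moving can be illustrated with a minimal sketch. This is not the authors' PT-Bagging implementation: the base learner (a one-dimensional decision stump), the synthetic data, the F1 score as the performance measure, the threshold grid, and all function names are assumptions made for illustration, and the posterior calibration step mentioned in the abstract is omitted. The ensemble is trained on bootstrap samples of the original, unrebalanced data, and only the decision threshold applied to the averaged probability is moved.

```python
import random

def fit_stump(xs, ys):
    """Fit a 1-D stump; return (cut, p_left, p_right) with smoothed positive rates."""
    best = None
    for cut in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= cut]
        right = [y for x, y in zip(xs, ys) if x > cut]
        # Laplace-smoothed estimate of P(y = 1) on each side of the cut
        p_left = (sum(left) + 1) / (len(left) + 2)
        p_right = (sum(right) + 1) / (len(right) + 2)
        # Squared error of the probabilistic prediction (Brier-style criterion)
        err = (sum((p_left - y) ** 2 for y in left)
               + sum((p_right - y) ** 2 for y in right))
        if best is None or err < best[0]:
            best = (err, cut, p_left, p_right)
    return best[1:]

def stump_proba(stump, x):
    cut, p_left, p_right = stump
    return p_left if x <= cut else p_right

def fit_bagging(xs, ys, n_estimators=25, seed=0):
    """Train stumps on bootstrap samples drawn from the original distribution."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def bagging_proba(models, x):
    """Average the base learners' positive-class probabilities."""
    return sum(stump_proba(m, x) for m in models) / len(models)

def f1_score(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def best_threshold(probs, ys):
    """Threshold-moving: pick the cut-off on a 0.01 grid that maximizes F1."""
    candidates = [i / 100 for i in range(1, 100)]
    return max(candidates,
               key=lambda t: f1_score(ys, [int(p >= t) for p in probs]))

# Synthetic imbalanced data (assumption): 190 negatives, 10 positives.
rng = random.Random(42)
xs = ([rng.gauss(0.0, 1.0) for _ in range(190)]
      + [rng.gauss(2.0, 1.0) for _ in range(10)])
ys = [0] * 190 + [1] * 10

models = fit_bagging(xs, ys)
probs = [bagging_proba(models, x) for x in xs]
threshold = best_threshold(probs, ys)

f1_default = f1_score(ys, [int(p >= 0.5) for p in probs])
f1_moved = f1_score(ys, [int(p >= threshold) for p in probs])
print(threshold, f1_default, f1_moved)
```

For brevity the sketch selects the threshold on the training data itself; in practice a held-out validation split would give a less optimistic estimate. Because the default cut-off of 0.5 is one of the grid candidates, the moved threshold can never score worse than the default on the data used for selection.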
URL: http://joces.nudt.edu.cn/EN/
http://joces.nudt.edu.cn/EN/Y2019/V41/I06/1086