Computer Engineering & Science ›› 2021, Vol. 43 ›› Issue (03): 525-533.
CAO Ting-ting,ZHANG Zhong-lin
Abstract: Cost-sensitive learning is an important strategy for classifying imbalanced data, and nonlinear data characteristics make classification harder still. To address both problems, this paper combines cost-sensitive learning with kernel principal component analysis (KPCA) and proposes a cost-sensitive Stacking ensemble algorithm called KPCA-Stacking. First, the original dataset is oversampled with the adaptive synthetic sampling method (ADASYN), and its dimensionality is then reduced with KPCA. Second, KNN, LDA, SVM, and RF are converted into cost-sensitive algorithms according to the Bayesian risk minimization principle and serve as the primary learners in the Stacking ensemble learning framework, with logistic regression as the meta-learner. Comparative experiments with 10 algorithms, such as the J48 decision tree, on 5 public datasets show that the cost-sensitive KPCA-Stacking algorithm improves the recognition rate of the minority class to some extent and achieves better overall classification performance than a single model.
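The Bayesian risk minimization step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `min_risk_class` and the cost values in `COST` are hypothetical, chosen only to show how a classifier's posterior estimates can be turned into a cost-sensitive decision by picking the class with the lowest expected cost.

```python
from typing import Sequence


def min_risk_class(posteriors: Sequence[float],
                   cost: Sequence[Sequence[float]]) -> int:
    """Return the class index with the smallest expected cost.

    posteriors[i] is the estimated probability that the true class is i.
    cost[i][j] is the cost of predicting class j when the true class is i.
    """
    n_classes = len(posteriors)
    # Expected cost (conditional risk) of predicting each class j.
    risks = [
        sum(posteriors[i] * cost[i][j] for i in range(n_classes))
        for j in range(n_classes)
    ]
    return min(range(n_classes), key=risks.__getitem__)


# Hypothetical cost matrix (an assumption, not taken from the paper):
# misclassifying a minority sample (class 1) costs five times as much
# as misclassifying a majority sample (class 0).
COST = [[0.0, 1.0],
        [5.0, 0.0]]

# The posterior favours class 0, but the expected cost of predicting
# class 0 (0.3 * 5 = 1.5) exceeds that of predicting class 1 (0.7 * 1 = 0.7).
print(min_risk_class([0.7, 0.3], COST))
```

With a uniform cost matrix the rule reduces to choosing the most probable class, so the shift toward the minority class comes entirely from the asymmetric costs.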
Key words: imbalanced data, cost-sensitive, KPCA, Stacking, ADASYN oversampling, classification
CAO Ting-ting, ZHANG Zhong-lin. A cost-sensitive imbalanced data classification algorithm based on KPCA-Stacking[J]. Computer Engineering & Science, 2021, 43(03): 525-533.
URL: http://joces.nudt.edu.cn/EN/
http://joces.nudt.edu.cn/EN/Y2021/V43/I03/525