A cost-sensitive imbalanced data classification algorithm based on KPCA-Stacking

Computer Engineering & Science, 2021, Vol. 43, Issue (03): 525-533.

CAO Ting-ting, ZHANG Zhong-lin

(College of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China)

Received: 2019-12-31 Revised: 2020-04-27 Accepted: 2021-03-25 Online: 2021-03-25 Published: 2021-03-29

Abstract

Cost-sensitive learning is an important strategy for classifying imbalanced data, and nonlinearity in the data's features makes classification harder still. To address both problems, this paper combines cost-sensitive learning with kernel principal component analysis (KPCA) and proposes a cost-sensitive Stacking ensemble algorithm called KPCA-Stacking. First, the original dataset is oversampled with the adaptive synthetic sampling method (ADASYN) and its dimensionality is reduced with KPCA. Second, KNN, LDA, SVM, and RF are converted into cost-sensitive algorithms according to the Bayesian risk minimization principle and used as the primary learners in the Stacking ensemble learning framework, with logistic regression as the meta-learner. Comparative experiments with 10 algorithms, including the J48 decision tree, on 5 public datasets show that the cost-sensitive KPCA-Stacking algorithm improves the recognition rate of the minority class to a certain extent and achieves better overall classification performance than a single model.
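The Bayesian risk minimization principle mentioned in the abstract can be illustrated with a minimal sketch. Instead of predicting the class with the highest posterior probability, a cost-sensitive learner predicts the class with the lowest expected misclassification cost. The abstract does not give the paper's cost matrix or how each base learner is wrapped, so the function name `min_risk_class` and the `COST` values below are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def min_risk_class(posteriors: Sequence[float],
                   cost: Sequence[Sequence[float]]) -> int:
    """Return the class index with the lowest expected misclassification cost.

    posteriors[j] is an estimate of P(class j | x).
    cost[i][j] is the cost of predicting class i when the true class is j.
    """
    risks = [
        sum(cost[i][j] * posteriors[j] for j in range(len(posteriors)))
        for i in range(len(cost))
    ]
    return min(range(len(risks)), key=risks.__getitem__)


# Hypothetical cost matrix (an assumption, not taken from the paper):
# missing a minority sample (class 1) costs five times a false alarm.
COST = [
    [0.0, 5.0],  # predict 0: no cost if true 0, cost 5 if true 1
    [1.0, 0.0],  # predict 1: cost 1 if true 0, no cost if true 1
]

posteriors = [0.7, 0.3]  # the classifier leans toward the majority class 0

# Plain decision rule: pick the most probable class.
plain = max(range(len(posteriors)), key=posteriors.__getitem__)

# Cost-sensitive decision rule: pick the class with minimum expected cost.
cost_sensitive = min_risk_class(posteriors, COST)
```

With these illustrative costs, the expected cost of predicting class 0 is 5 × 0.3 = 1.5 and of predicting class 1 is 1 × 0.7 = 0.7, so the cost-sensitive rule selects the minority class even though the plain rule selects the majority class. Any base learner that outputs class probabilities could, in principle, be made cost-sensitive by applying such a rule to its outputs.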

Key words: imbalanced data, cost-sensitive, KPCA, Stacking, ADASYN oversampling, classification

CAO Ting-ting, ZHANG Zhong-lin. A cost-sensitive imbalanced data classification algorithm based on KPCA-Stacking[J]. Computer Engineering & Science, 2021, 43(03): 525-533.

