• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2022, Vol. 44 ›› Issue (06): 1133-1140.

• Artificial Intelligence and Data Mining •

IGWOSMOTE: A method for improving SVM classification of imbalanced data

马汉达,朱敏   

  1. (School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013)
  • Received: 2020-10-15 Revised: 2021-02-25 Accepted: 2022-06-25 Online: 2022-06-25 Published: 2022-06-17

IGWOSMOTE: An oversampling method based on an improved gray wolf algorithm for SVM imbalanced data classification

MA Han-da, ZHU Min

  1. (School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013, China)
  • Received: 2020-10-15 Revised: 2021-02-25 Accepted: 2022-06-25 Online: 2022-06-25 Published: 2022-06-17

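Abstract: To improve how well the traditional support vector machine (SVM) classifies the minority class of imbalanced datasets, an oversampling method based on an improved gray wolf optimization algorithm (IGWO), named IGWOSMOTE, is proposed. First, the way the initial gray wolf population is generated is improved: each gray wolf individual is composed of the SVM penalty factor, the kernel parameter, the feature vector, and the sampling rate of the minority class. Then, the gray wolf optimization process searches for the optimal combination of these parameters and the sampling rate, and the data are resampled for the classifier to learn from and predict on. Classification experiments on 6 UCI datasets show that, compared with the traditional SMOTE+SVM method, IGWOSMOTE+SVM improves classification accuracy on the minority class by 6.3 percentage points and on the whole dataset by 2.1 percentage points. IGWOSMOTE can serve as a new oversampling classification method.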

Keywords: support vector machine, imbalanced data, oversampling, gray wolf optimization algorithm

Abstract: To improve the classification performance of the traditional Support Vector Machine (SVM) on minority classes in unbalanced datasets, an oversampling method based on an Improved Gray Wolf Optimization algorithm (IGWO), called IGWOSMOTE, is proposed. First, the generation of the initial gray wolf population is improved: each gray wolf individual is composed of the SVM penalty factor, the kernel parameter, the feature vector, and the sampling rate of the minority class. Then, the optimal combination of these parameters and the sampling rate is obtained by the gray wolf optimization search, and the data are resampled for the classifier to learn from and predict on. Experiments on six UCI datasets show that, compared with the traditional SMOTE+SVM model, IGWOSMOTE+SVM improves the classification accuracy on the minority class by 6.3 percentage points and the overall classification accuracy by 2.1 percentage points. IGWOSMOTE can be used as a new oversampling classification method.

Key words: support vector machine; unbalanced data; oversampling; gray wolf optimization algorithm
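
The abstract describes each gray wolf as a vector holding the SVM penalty factor, the kernel parameter, a feature vector, and the minority-class sampling rate. The following is a minimal Python sketch of that idea, not the authors' implementation. Everything specific in it is an assumption made for illustration: the vector layout, the parameter ranges, the cap of 5 on the sampling rate, the neighbour count `k`, and the function names `decode_wolf`, `smote`, and `gwo_step`. The position update is the standard gray wolf optimizer rule rather than the improved variant in the paper, and the fitness evaluation (training an SVM on the resampled data) is left out.

```python
import math
import random


def decode_wolf(position, n_features):
    """Map a position in [0, 1]^(n_features + 3) to candidate settings.

    Assumed layout: [C, gamma, sampling rate, one entry per feature].
    """
    if len(position) != n_features + 3:
        raise ValueError("position must have n_features + 3 entries")
    clipped = [min(1.0, max(0.0, p)) for p in position]
    # Hypothetical log-scale ranges: C in [2^-5, 2^15], gamma in [2^-15, 2^3].
    c = 2.0 ** (-5 + 20 * clipped[0])
    gamma = 2.0 ** (-15 + 18 * clipped[1])
    # Sampling rate: synthetic samples per minority sample, assumed in [0, 5].
    rate = 5.0 * clipped[2]
    # A feature is selected when its entry is at least 0.5.
    mask = [p >= 0.5 for p in clipped[3:]]
    if n_features > 0 and not any(mask):
        # Keep at least one feature: the one with the largest entry.
        best = max(range(n_features), key=lambda i: clipped[3 + i])
        mask[best] = True
    return {"C": c, "gamma": gamma, "rate": rate, "mask": mask}


def smote(minority, rate, k=3, rng=None):
    """SMOTE-style interpolation between minority samples and their neighbours."""
    rng = rng or random.Random(0)
    if len(minority) < 2:
        return []
    n_new = int(round(rate * len(minority)))
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        # Indices of the k nearest minority neighbours of x, excluding x itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(x, minority[j]),
        )[:k]
        nb = minority[rng.choice(neighbours)]
        t = rng.random()
        # New point lies on the segment between x and the chosen neighbour.
        synthetic.append([a + t * (b - a) for a, b in zip(x, nb)])
    return synthetic


def gwo_step(wolf, alpha, beta, delta, a, rng):
    """Standard GWO update: average of the moves toward the three best wolves."""
    new_position = []
    for d, x in enumerate(wolf):
        total = 0.0
        for leader in (alpha, beta, delta):
            coef_a = 2 * a * rng.random() - a
            coef_c = 2 * rng.random()
            total += leader[d] - coef_a * abs(coef_c * leader[d] - x)
        # Clipping to [0, 1] is an assumption that matches decode_wolf.
        new_position.append(min(1.0, max(0.0, total / 3)))
    return new_position
```

In a full search loop, each wolf would be decoded with `decode_wolf`, the minority class oversampled with `smote` at the decoded rate, and an SVM trained on the selected features to score the wolf. `gwo_step` would then move every wolf toward the three best-scoring ones while `a` decreases from 2 to 0 over the iterations.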