• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science

• Artificial Intelligence and Data Mining

A fast classification algorithm for reduced twin support vector machines based on AP clustering

韦修喜1,黄华娟1,周永权1,2   

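  (1. College of Information Science and Engineering, Guangxi University for Nationalities, Nanning 530006, Guangxi;
    2. Guangxi Higher School Key Laboratory of Complex Systems and Intelligent Computing (Guangxi University for Nationalities), Nanning 530006, Guangxi)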
  • Received: 2019-04-13 Revised: 2019-07-11 Online: 2019-10-25 Published: 2019-10-25
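  • Funding:

    National Natural Science Foundation of China (61662005); Guangxi Natural Science Foundation (2018JJA170121); Basic Ability Improvement Project for Young and Middle-aged Teachers in Guangxi Universities (2019KY0195)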

A fast classification algorithm of reduced twin
support vector machines based on AP clustering

WEI Xiu-xi1,HUANG Hua-juan1,ZHOU Yong-quan1,2   

  (1. College of Information Science and Engineering, Guangxi University for Nationalities, Nanning 530006;
    2. Guangxi Higher School Key Laboratory of Complex Systems and Intelligent Computing
    (Guangxi University for Nationalities), Nanning 530006, China)
  • Received: 2019-04-13 Revised: 2019-07-11 Online: 2019-10-25 Published: 2019-10-25

Abstract:

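The computational cost of classification with twin support vector machines (TWSVMs) is proportional to the number of samples, so classification becomes time-consuming when the sample set is large. To increase the sparsity of the sample set and thereby speed up TWSVM classification, a fast classification algorithm for reduced twin support vector machines based on AP clustering (FCTSVMs-AP) is proposed. AP clustering is first applied to the original data set, and the cluster centers form the new, reduced sample set. An optimization model is then built on the principle of minimum classification error, and the coefficients of the new decision function are obtained by quadratic programming. It is also proved that, when the sample set is compressed, tightening the error between the new fast decision function and the original decision function is equivalent to performing AP clustering on the original data set in the sample space. Experiments on artificial and UCI data sets show that FCTSVMs-AP increases classification speed by effectively compressing the number of samples, while the loss in classification accuracy is not statistically significant.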
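The sample-reduction step described in the abstract can be sketched as follows. This is a minimal pure-Python version of plain affinity propagation (negative squared Euclidean similarity, damped message passing), not the authors' implementation: the function names `affinity_propagation` and `reduce_samples`, the damping value, the iteration count and the preference of -20.0 are all assumptions chosen for illustration, and the paper's adaptive variant and the quadratic-programming refit of the decision function are not reproduced.

```python
from statistics import median


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def affinity_propagation(points, preference=None, damping=0.5, iters=200):
    """Return the indices of the exemplars chosen by plain AP (hypothetical helper)."""
    n = len(points)
    if n < 2:
        return list(range(n))
    # Similarity s(i, k) = -||x_i - x_k||^2; the diagonal holds the preference.
    S = [[-sq_dist(points[i], points[k]) for k in range(n)] for i in range(n)]
    if preference is None:
        # Common default (an assumption here): median off-diagonal similarity.
        preference = median(S[i][k] for i in range(n) for k in range(n) if i != k)
    for k in range(n):
        S[k][k] = preference
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iters):
        # r(i, k) <- s(i, k) - max_{k' != k} [a(i, k') + s(i, k')]
        for i in range(n):
            vals = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=vals.__getitem__)
            second = max(v for k, v in enumerate(vals) if k != best)
            for k in range(n):
                new = S[i][k] - (second if k == best else vals[best])
                R[i][k] = damping * R[i][k] + (1 - damping) * new
        # a(i, k) <- min(0, r(k, k) + sum_{i' not in {i, k}} max(0, r(i', k)))
        # a(k, k) <- sum_{i' != k} max(0, r(i', k))
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            pos[k] = R[k][k]  # the self-responsibility enters unclipped
            total = sum(pos)
            for i in range(n):
                new = total - pos[i]
                if i != k:
                    new = min(0.0, new)
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    # A point is an exemplar when r(k, k) + a(k, k) > 0.
    return [k for k in range(n) if R[k][k] + A[k][k] > 0]


def reduce_samples(points, **kwargs):
    """Keep only the exemplars as the reduced sample set (hypothetical helper)."""
    return [points[k] for k in affinity_propagation(points, **kwargs)]


# Two well-separated groups on a line; preference=-20.0 is an illustrative choice.
data = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (10.0, 0.0), (12.0, 0.0), (13.0, 0.0)]
reduced = reduce_samples(data, preference=-20.0)
print(len(data), "samples reduced to", len(reduced))
```

A lower preference yields fewer exemplars and therefore a smaller reduced set, which is the knob that trades classification speed against fidelity to the original sample set in a scheme of this kind.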
 

Key words: twin support vector machine, adaptive, AP clustering, sparsity, quadratic programming

Abstract:

The computational cost of classification with Twin Support Vector Machines (TSVMs) is proportional to the number of samples, so classification becomes time-consuming when the number of samples is large. To improve the sparsity of the sample set and thus speed up classification, a Fast Classification algorithm of reduced Twin Support Vector Machines based on Affinity Propagation clustering (FCTSVMs-AP) is proposed. FCTSVMs-AP first performs adaptive AP clustering on the original data set, and the cluster centers are used as the new, reduced sample set. An optimization model is then constructed according to the principle of minimum classification error, and the coefficients of the new decision function are solved by quadratic programming. Furthermore, we prove that, when the sample set is compressed, tightening the error between the new fast decision function and the original decision function is equivalent to performing adaptive AP clustering on the original data set in the sample space. Experiments on artificial datasets and UCI datasets show that FCTSVMs-AP can improve the classification speed by effectively compressing the number of samples, while the loss of classification accuracy is not statistically significant.
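To make the cost argument concrete, the sketch below writes the standard nonlinear twin SVM decision rule (up to normalisation) next to a reduced counterpart. The symbols $C$, $u_k$, $b_k$, $\tilde{C}$, $\tilde{u}_k$, $\tilde{b}_k$, $m$ and $\tilde{m}$ are notation assumed here for illustration. The abstract does not give the paper's own formulas, so the reduced form and the least-error objective are an assumption about their general shape, not the authors' exact model.

```latex
% Standard nonlinear twin SVM: two kernel-generated surfaces, one per class.
% C \in \mathbb{R}^{m \times d} stacks all m training samples.
f(x) = \arg\min_{k \in \{1,2\}} \bigl| K(x^{\top}, C^{\top})\, u_k + b_k \bigr|,
\qquad K(x^{\top}, C^{\top}) \in \mathbb{R}^{1 \times m}.

% Each prediction needs m kernel evaluations, so its cost grows linearly in m.
% Reduced form (assumed): \tilde{C} \in \mathbb{R}^{\tilde{m} \times d} holds the
% AP cluster centers, with \tilde{m} \ll m.
\tilde{f}(x) = \arg\min_{k \in \{1,2\}}
\bigl| K(x^{\top}, \tilde{C}^{\top})\, \tilde{u}_k + \tilde{b}_k \bigr|.

% Assumed objective: keep each reduced surface close to the original one
% on the training samples x_1, ..., x_m.
\min_{\tilde{u}_k,\, \tilde{b}_k} \sum_{i=1}^{m}
\Bigl( K(x_i^{\top}, C^{\top})\, u_k + b_k
     - K(x_i^{\top}, \tilde{C}^{\top})\, \tilde{u}_k - \tilde{b}_k \Bigr)^{2}.
```

Under these assumptions the per-sample prediction cost drops from $m$ to $\tilde{m}$ kernel evaluations, which is the speed-up the abstract attributes to compressing the sample set.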

Key words: twin support vector machine (TSVM), adaptive, AP clustering, sparsity, quadratic programming