• Official journal of the China Computer Federation (CCF)
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2021, Vol. 43 ›› Issue (04): 689-696.

• Software Engineering •

Improvement of code package level refactoring based on DBSCAN algorithm

LI Wen-hao (李文昊), LI Ying-mei (李英梅), BIAN Yi-xin (边奕心)

  1. (School of Computer Science and Information Engineering, Harbin Normal University, Harbin 150025, China)
  • Received: 2019-10-10 Revised: 2020-06-08 Accepted: 2021-04-25 Online: 2021-04-25 Published: 2021-04-21
  • Funding:
    Natural Science Foundation of Heilongjiang Province (F2017021); Harbin Special Research Fund for Scientific and Technological Innovation Talents (2016RAQXJ036, RC2017QN010002); Harbin Normal University Master's Student Innovative Research Project (HSDSSCX2019-10); Research Project of the School of Computer Science, Harbin Normal University (JKYKYY202003); Harbin Normal University Doctoral Start-up Fund Project (XKB201801)

Abstract: In research on package-level code refactoring, hierarchical clustering is regarded as a good software clustering method for obtaining a software structure with "high cohesion and low coupling", because it is simple, effective, and highly accurate. However, its high time complexity makes it poorly suited to large-scale software. The density-based DBSCAN algorithm has the opposite profile: it clusters quickly but with lower accuracy. This paper therefore proposes a DBSCAN-based software hierarchical clustering algorithm that uses the clusters produced by DBSCAN to constrain the clustering space of the hierarchical clustering algorithm. The proposed algorithm preserves the accuracy of hierarchical clustering, and its time complexity lies between that of DBSCAN and that of hierarchical clustering. Experimental results show that the algorithm partitions software reasonably and effectively; expert evaluation, module partitioning metrics, and running-time comparisons indicate that it outperforms other commonly used clustering algorithms.


Key words: DBSCAN algorithm, hierarchical clustering, software clustering, code refactoring
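
The core idea in the abstract, using DBSCAN clusters to limit which merges hierarchical clustering may consider, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the features, distance measure, linkage, or parameters. The sketch assumes each class is described by the set of types it depends on, uses Jaccard distance and average linkage, and treats `eps`, `min_pts`, `cut`, and the sample class names as hypothetical choices.

```python
from itertools import combinations

def jaccard_distance(a, b):
    """Assumed distance between two dependency sets: 1 - |a & b| / |a | b|."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def dbscan(items, eps, min_pts):
    """Plain DBSCAN; returns one label per item (-1 marks noise)."""
    n = len(items)
    neighbors = [[j for j in range(n)
                  if jaccard_distance(items[i], items[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        frontier = list(neighbors[i])
        while frontier:
            j = frontier.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                frontier.extend(neighbors[j])  # only core points expand
    return labels

def agglomerate(items, members, cut):
    """Average-linkage agglomerative clustering restricted to `members`."""
    groups = [[m] for m in members]

    def linkage(g, h):
        total = sum(jaccard_distance(items[a], items[b]) for a in g for b in h)
        return total / (len(g) * len(h))

    while len(groups) > 1:
        x, y = min(combinations(range(len(groups)), 2),
                   key=lambda p: linkage(groups[p[0]], groups[p[1]]))
        if linkage(groups[x], groups[y]) > cut:
            break  # closest pair is too far apart; stop merging
        groups[x] = groups[x] + groups[y]
        del groups[y]  # safe because x < y
    return groups

def constrained_clustering(items, eps, min_pts, cut):
    """Run hierarchical clustering separately inside each DBSCAN cluster."""
    labels = dbscan(items, eps, min_pts)
    packages = []
    for c in sorted(set(labels)):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if c == -1:
            packages.extend([m] for m in members)  # noise stays as singletons
        else:
            packages.extend(agglomerate(items, members, cut))
    return packages

# Hypothetical classes, each mapped to the set of types it depends on.
classes = {
    "OrderService": {"Order", "OrderRepo", "Logger"},
    "OrderController": {"Order", "OrderRepo", "Http"},
    "OrderValidator": {"Order", "OrderRepo", "Rules"},
    "UserService": {"User", "UserRepo", "Logger"},
    "UserController": {"User", "UserRepo", "Http"},
    "MathUtil": {"BigDecimal"},
}
names = list(classes)
items = [classes[name] for name in names]
packages = [sorted(names[i] for i in group)
            for group in constrained_clustering(items, eps=0.85, min_pts=2, cut=0.6)]
print(packages)
```

In this sketch the hierarchical step only evaluates pairs that fall inside the same DBSCAN cluster, which is one plausible reading of "constraining the clustering space": the expensive pairwise linkage search runs over several small groups instead of the whole system. How much this saves in practice depends on how DBSCAN partitions the software.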