• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2007, Vol. 29 ›› Issue (10): 50-53.

• Articles

Decision Tree Algorithms Based on One-Way Analysis of Variance

丁顺利 洪允德 袁静波   

  • Publication date: 2007-10-01 Release date: 2010-06-02

Abstract:

Selecting the test attribute is the key step in building a decision tree. Based on the principle of one-way analysis of variance, this paper presents two new decision tree algorithms, ANOVA1.0 and ANOVA2.0. ANOVA1.0 selects the test attribute with the largest sum of squares between groups. ANOVA2.0 selects the test attribute with the largest intra-group gain ratio of the sum of squares. Both algorithms are implemented on the Weka-3-5 platform. Experiments on larger datasets compare the two algorithms with ID3 and C4.5 in efficiency, precision, and other respects, and the results indicate that ANOVA1.0 and ANOVA2.0 are comparatively good classification algorithms.

Key words: decision tree, intergroup sum of squares, intra-group gain ratio of sum of squares
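
The abstract names the selection criteria but does not give their formulas. As a rough illustration only, the sketch below assumes the textbook one-way ANOVA decomposition, with a nominal attribute defining the groups and a numerically coded class label (0/1) as the response. The function names, the toy data, and the 0/1 coding are assumptions made for this example and are not taken from the paper. The gain ratio used by ANOVA2.0 is not defined in the abstract, so it is not implemented here.

```python
from collections import defaultdict


def sum_of_squares(rows, attr, target):
    """Return (between-group SS, within-group SS) of `target` grouped by `attr`.

    Textbook one-way ANOVA decomposition; assumed, not taken from the paper.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[attr]].append(row[target])

    values = [row[target] for row in rows]
    grand_mean = sum(values) / len(values)

    ssb = 0.0  # between-group sum of squares
    ssw = 0.0  # within-group sum of squares
    for ys in groups.values():
        group_mean = sum(ys) / len(ys)
        ssb += len(ys) * (group_mean - grand_mean) ** 2
        ssw += sum((y - group_mean) ** 2 for y in ys)
    return ssb, ssw


def best_attribute(rows, attrs, target):
    """Pick the attribute with the largest between-group sum of squares."""
    return max(attrs, key=lambda a: sum_of_squares(rows, a, target)[0])


# Hypothetical toy data: class label "play" coded as 0/1 (an assumption).
rows = [
    {"outlook": "sunny", "windy": "no", "play": 0},
    {"outlook": "sunny", "windy": "yes", "play": 0},
    {"outlook": "overcast", "windy": "no", "play": 1},
    {"outlook": "overcast", "windy": "yes", "play": 1},
    {"outlook": "rain", "windy": "no", "play": 1},
    {"outlook": "rain", "windy": "yes", "play": 0},
]

if __name__ == "__main__":
    for attr in ("outlook", "windy"):
        print(attr, sum_of_squares(rows, attr, "play"))
    print("chosen:", best_attribute(rows, ["outlook", "windy"], "play"))
```

On this toy data, "outlook" separates the class values far better than "windy", so a criterion of this kind would place it at the root. A full tree builder would apply the same choice recursively to each resulting subset.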