• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2008, Vol. 30 ›› Issue (11): 46-47.

• Article •

An ID3 Algorithm Based on Revised Information Gain

张春丽 张磊   

  • Online:2008-11-01 Published:2010-05-19

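Abstract (translated from the Chinese original):

ID3 is one of the most influential decision tree algorithms. It uses information gain as the criterion for selecting the test attribute at each node of the tree. The algorithm has a shortcoming: when choosing a test attribute, it tends to favor attributes with many distinct values, yet in practice such attributes are not necessarily important. To address this shortcoming, this paper proposes an ID3 algorithm that revises the gain, offering an effective way to mitigate the multi-value bias of ID3. Theoretical analysis and experiments show that the algorithm handles the multi-value bias well.

Keywords: ID3, decision tree, information gain, multi-value bias, revised gain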

Abstract:

The ID3 algorithm is a decision tree algorithm that is important in the field of machine learning. Quinlan introduced the concept of information gain in the ID3 algorithm, where it serves as the criterion for selecting the best splitting attribute when inducing decision trees. The algorithm has some drawbacks, one of which is that it tends to choose multi-valued attributes as the best splitting attribute. However, a multi-valued attribute is not necessarily important for classification in the real world. This paper presents a revised information gain for the ID3 algorithm in an attempt to solve this problem. Theoretical analysis and experimental results show that the new method is effective against the multi-value bias of the ID3 algorithm.

Key words: ID3, decision tree, information gain, multi-value orientation, revised information gain
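
The multi-value bias described in the abstract can be illustrated with a small sketch. It computes the standard ID3 information gain only; it does not implement the revised gain proposed in the paper, whose formula is not given on this page. The toy data set, the attribute names `windy` and `record_id`, and the helper functions are hypothetical and chosen purely for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction obtained by partitioning `labels` on an attribute."""
    total = len(labels)
    partitions = {}
    for value, label in zip(values, labels):
        partitions.setdefault(value, []).append(label)
    # Weighted average entropy of the partitions after the split.
    remainder = sum(len(part) / total * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder


# Hypothetical toy data: eight samples, two classes.
labels = ["yes", "yes", "yes", "no", "no", "no", "yes", "no"]
windy = ["f", "f", "f", "t", "t", "t", "t", "f"]  # two values, somewhat informative
record_id = list(range(len(labels)))              # eight values, no predictive meaning

gain_windy = information_gain(windy, labels)
gain_id = information_gain(record_id, labels)
print(f"gain(windy)     = {gain_windy:.3f}")
print(f"gain(record_id) = {gain_id:.3f}")
```

Because every `record_id` value isolates a single sample, each partition is pure and the attribute attains the maximum possible gain, so plain ID3 would select it over `windy` even though it cannot generalize to new records. This is the behavior a correction to the gain is meant to counteract.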