• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2015, Vol. 37 ›› Issue (9): 1783-1793.

• Articles •

An optimized algorithm of decision tree based on correlation coefficients

DONG Yuehua,LIU Li   

  1. (School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China)
  • Received: 2014-08-25 Revised: 2014-12-29 Online: 2015-09-25 Published: 2015-09-25

Abstract:

Aiming at the problem of multivalue bias in ID3 algorithm, we propose an optimized algorithm of decision tree based on correlation coefficients. Firstly, the correlation coefficients between the attributes are introduced to improve the ID3 algorithm, and in turn the multivalue bias problem is overcome. Then the properties of Taylor formula and Maclaurin formula are adopted to simplify the information gain formula. The concrete data of examples prove that the optimized ID3 algorithm can overcome multivalue bias problem. Experiments on the standard UCI data sets show that the optimized algorithm of decision tree not only improves the accuracy of average classification, but also reduces the complexity in building decision trees and thus reduces the generation time of decision trees. Besides, the efficiency of the optimized ID3 algorithm increases significantly for large scale samples.

Key words: ID3 algorithm; correlation coefficient; decision tree; Taylor formula; information gain
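
The multivalue bias named in the abstract can be illustrated with a minimal sketch of the standard ID3 information gain. This is the textbook, unoptimized criterion, not the correlation-coefficient method proposed in the paper, whose formula is not given in this record. The function names (`entropy`, `information_gain`) and the toy data are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Standard ID3 information gain of splitting `labels` by attribute `values`."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy remaining after the split, weighted by subset size.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: eight samples, two classes.
labels = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
windy = ["t", "t", "f", "f", "t", "f", "f", "t"]  # two-valued attribute
record_id = list(range(8))                        # unique value per sample

gain_windy = information_gain(windy, labels)
gain_id = information_gain(record_id, labels)

print(f"gain(windy)     = {gain_windy:.4f}")
print(f"gain(record_id) = {gain_id:.4f}")
```

In this sketch the `record_id` attribute has no predictive value, yet it splits the data into single-sample subsets with zero entropy and so receives the maximum possible gain. Plain ID3 would therefore select it over `windy`, which is the bias the optimized algorithm is intended to correct.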