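• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science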

• Artificial Intelligence and Data Mining

Elderly living preference prediction based on random forests

WU Shuai, ZHAO Fang

  1. (School of Information, Beijing Forestry University, Beijing 100083, China)
  • Received: 2016-08-16 Revised: 2016-12-20 Online: 2018-05-25 Published: 2018-05-25
  • Supported by:

    National Natural Science Foundation of China (11272061); Beijing Municipal Science and Technology Program (Z151100002115002)

Abstract:

China's population is aging rapidly, with a growing share of the oldest old, while family-based elder care is weakening and the social elder-care service system remains underdeveloped, so elder care faces many challenges. To offer older adults better living-arrangement suggestions and to give the agencies that administer elder care precise decision support, we analyze data on nearly 20,000 older adults from the CHARLS questionnaire to identify the main factors that affect their living preferences. We also use big data and data mining methods to predict living preferences at the individual level, and we improve the random forest feature selection algorithm for class-imbalanced data. The results indicate that older adults' living preferences can be predicted well from their characteristic data, which offers a basis for more precise decision-making in elder care.

Key words: data mining, living preference, random forests, imbalanced dataset, feature selection