• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science (计算机工程与科学)

• Artificial Intelligence and Data Mining

A user churn prediction method based on multi-model fusion

YE Cheng (叶成), ZHENG Hong (郑红), CHENG Yun-hui (程云辉)

  1. (School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China)
  • Received: 2019-06-15 Revised: 2019-08-11 Online: 2019-11-25 Published: 2019-11-25
  • Supported by:

    National Natural Science Foundation of China (61103115, 61103172); Shanghai Science and Technology Commission Science and Technology Innovation Action Plan, High-Tech Field Project (16511101000)

Abstract:

Accurate user churn prediction helps companies improve user retention, grow their user base, and increase profits. Most existing churn prediction models are either single models or simple combinations of several models, so they do not fully exploit the advantages of multi-model ensembles. Drawing on the Bootstrap Sampling idea used in random forests, this paper proposes an improved Stacking ensemble method and applies it to a real data set to predict user churn. Experimental comparisons on the validation set show that the proposed method outperforms every classical Stacking ensemble with the same structure on three metrics: F1-score for churned users, recall, and prediction accuracy. With an appropriate ensemble structure, it can also surpass the best performance achieved by the base classifiers.

Key words: Stacking ensemble learning, user churn prediction, Bootstrap Sampling, machine learning
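
The abstract does not spell out how Bootstrap Sampling is combined with Stacking, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that every base learner in every cross-validation fold is trained on a bootstrap resample of that fold's training rows, and that the out-of-fold predictions then train a meta-learner. The one-feature threshold rule, the SGD-trained logistic regression, the helper names, and the toy data are all hypothetical stand-ins chosen to keep the example dependency-free.

```python
import math
import random


def bootstrap(X, y, rng):
    """Draw len(X) rows with replacement (the Bootstrap Sampling step)."""
    idx = [rng.randrange(len(X)) for _ in X]
    return [X[i] for i in idx], [y[i] for i in idx]


def fit_stump(X, y, feature):
    """Hypothetical base learner: a threshold rule on a single feature."""
    values = sorted({row[feature] for row in X})
    # Candidate thresholds are midpoints between consecutive observed values.
    thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])] or values[:1]
    best = None
    for t in thresholds:
        for sign in (1, -1):
            hits = sum((1 if sign * (row[feature] - t) >= 0 else 0) == label
                       for row, label in zip(X, y))
            if best is None or hits > best[0]:
                best = (hits, t, sign)
    _, t, sign = best
    return lambda row: 1 if sign * (row[feature] - t) >= 0 else 0


def fit_logistic(Z, y, epochs=200, lr=0.5):
    """Hypothetical meta-learner: logistic regression trained by plain SGD."""
    w, b = [0.0] * len(Z[0]), 0.0
    for _ in range(epochs):
        for z, label in zip(Z, y):
            logit = sum(wi * zi for wi, zi in zip(w, z)) + b
            grad = 1.0 / (1.0 + math.exp(-logit)) - label
            w = [wi - lr * grad * zi for wi, zi in zip(w, z)]
            b -= lr * grad
    return lambda z: 1 if sum(wi * zi for wi, zi in zip(w, z)) + b >= 0 else 0


def fit_bootstrap_stacking(X, y, k=3, seed=0):
    """Stacking in which each base learner sees a bootstrap resample (assumed design)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    order = list(range(n))
    rng.shuffle(order)
    folds = [order[i::k] for i in range(k)]
    Z = [[0] * d for _ in range(n)]  # out-of-fold base predictions
    for fold in folds:
        held = set(fold)
        X_tr = [X[i] for i in range(n) if i not in held]
        y_tr = [y[i] for i in range(n) if i not in held]
        for f in range(d):  # one base learner per feature
            base = fit_stump(*bootstrap(X_tr, y_tr, rng), f)
            for i in fold:
                Z[i][f] = base(X[i])
    meta = fit_logistic(Z, y)
    # Refit the base learners on bootstrap resamples of all training rows.
    bases = [fit_stump(*bootstrap(X, y, rng), f) for f in range(d)]
    return lambda row: meta([base(row) for base in bases])


def churn_metrics(y_true, y_pred):
    """Accuracy, plus recall and F1 for the churn class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return {"accuracy": accuracy, "recall": recall, "f1": f1}


# Synthetic toy rows (not from the paper): [usage_hours, unrelated_counter].
X = [[(2 + i % 3) if i % 2 else (20 + i % 3), i % 5] for i in range(40)]
y = [i % 2 for i in range(40)]  # 1 = churned
model = fit_bootstrap_stacking(X[:30], y[:30])
predictions = [model(row) for row in X[30:]]  # held-out validation rows
scores = churn_metrics(y[30:], predictions)
print(scores)
```

The three reported numbers mirror the metrics named in the abstract. Because the data here is synthetic and easily separable, the scores say nothing about the results reported in the paper.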