• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2021, Vol. 43 ›› Issue (11): 2084-2090.

• Artificial Intelligence and Data Mining •

  • Funding:
    13th Five-Year Plan Project of the China Information Association (ZXXJ2019030); Hebei Provincial Social Science Foundation (HB17GL071)

Time-based fuzzy cluster collaborative filtering recommendation algorithm

YAN Hong-can1,2, WANG Zi-ru1, LI Wei-fang1, GU Jian-tao1

  (1. College of Science, North China University of Science and Technology, Tangshan 063210; 2. Hebei Key Laboratory of Data Science and Applications, Tangshan 063000, China)
  • Received: 2020-07-04; Revised: 2020-09-17; Accepted: 2021-11-25; Online: 2021-11-25; Published: 2021-11-23

Abstract: As users demand ever more accurate and timely recommendations, mining the information that users actually need from massive historical user data is a valuable research direction. A collaborative filtering algorithm based on fuzzy clustering must first address data sparsity: the original user rating data are preprocessed, and missing entries are filled using the SMOTE oversampling technique. Fuzzy clustering is then applied to group the rating data. Drawing on the Ebbinghaus forgetting curve, the timestamp of each user rating is used as a factor when predicting ratings on the clustered data, which reduces the effect of changing user preferences on recommendation quality and addresses the timeliness problem. Experiments on the MovieLens-100k dataset show that time-aware fuzzy collaborative filtering significantly improves recommendation accuracy.

Key words: synthetic minority oversampling technique, fuzzy clustering, collaborative filtering, scoring matrix, time factor
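
The abstract does not give the paper's formulas, so the following is only a minimal sketch of two ingredients it names: fuzzy cluster memberships and a forgetting-curve time factor applied to rating prediction. Everything here is an assumption made for illustration: the exponential retention form exp(-Δt / strength) is one common way to write the Ebbinghaus curve, the membership formula is the standard fuzzy c-means update for fixed centres, and the function names, the `strength` parameter, and the weighted-average predictor are hypothetical rather than the authors' method. The SMOTE filling step is not shown.

```python
import numpy as np


def forgetting_weight(t_rated, t_now, strength):
    """Retention weight exp(-dt / strength): recent ratings count more.

    Assumed form of the forgetting curve; `strength` is a hypothetical
    decay constant in the same time unit as the timestamps.
    """
    dt = np.asarray(t_now, dtype=float) - np.asarray(t_rated, dtype=float)
    return np.exp(-np.clip(dt, 0.0, None) / strength)


def fcm_memberships(points, centers, m=2.0):
    """Standard fuzzy c-means memberships for fixed cluster centres.

    Returns an array of shape (n_points, n_clusters) whose rows sum to 1.
    """
    dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    dist = np.maximum(dist, 1e-12)  # avoid division by zero at a centre
    inv = dist ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


def predict_rating(neighbor_ratings, neighbor_sims, neighbor_times,
                   t_now, strength):
    """Weighted average of neighbours' ratings for one item.

    Each neighbour's weight is its similarity multiplied by the
    forgetting weight of its rating timestamp (illustrative only).
    """
    weights = np.asarray(neighbor_sims, dtype=float) * forgetting_weight(
        neighbor_times, t_now, strength)
    return float(np.dot(weights, neighbor_ratings) / weights.sum())


if __name__ == "__main__":
    users = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
    centers = np.array([[0.0, 0.0], [2.0, 0.0]])
    print(fcm_memberships(users, centers))
    # Two equally similar neighbours; the second rated long ago.
    print(predict_rating([5.0, 1.0], [1.0, 1.0], [100, 0], 100, 10.0))
```

In this sketch an old rating is never discarded, only down-weighted, so the prediction drifts toward what similar users rated most recently. MovieLens timestamps are in seconds, so `strength` would need to be chosen on that scale.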