• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science


An improved density peak algorithm for
micro-learning unit text clustering based on LSA model

WU Guo-sheng, ZHANG Yue-qin

  1. (College of Information and Computer Science, Taiyuan University of Technology, Jinzhong 030600, China)

  • Received: 2019-09-05 Revised: 2019-10-22 Online: 2020-04-25 Published: 2020-04-25

Abstract:

With the explosive growth of micro-learning resources, the large volume of unprocessed, fragmented text resources greatly inconveniences learners. To help learners find suitable content among fragmented resources for personalized learning, micro-learning resources in text form need to be clustered. This paper therefore applies an improved density peak algorithm to the text clustering of micro-learning units. When the density peak algorithm is used for clustering in this domain, it suffers from a high-dimensional sparse vector space, insufficient global consistency, sensitivity to the cutoff distance, and supervised selection of density peak centers. To address these problems, this paper proposes two improvements based on the Latent Semantic Analysis (LSA) model. First, a new definition of local density is proposed according to the clustering requirements, and density-sensitive distance is used as the clustering criterion; this removes the sensitivity to the cutoff distance and, through it, resolves the global consistency problem. Second, outliers are detected by linear fitting so that the density peak centers are found automatically, which makes the selection of peak centers unsupervised. Experimental results on real data sets of micro-learning units show that the proposed algorithm is better suited to text clustering of micro-learning units than the original algorithm and other classical clustering algorithms.
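
For orientation, the baseline that the abstract sets out to improve can be sketched roughly as follows. This is a minimal illustration of the original density peak procedure applied to LSA vectors, not the improved algorithm of the paper: the new local density, the density-sensitive distance, and the linear-fitting step are not specified in the abstract and are not reproduced here. The toy documents, the raw word-count matrix, the Gaussian-kernel density, the cutoff `d_c = 1.0`, and the helper names `lsa_embed`, `density_peaks`, and `cluster` are all assumptions made for the example. The number of centers is passed in by hand, which is exactly the supervised step the paper aims to remove.

```python
import numpy as np

def lsa_embed(docs, k=2):
    """Map documents to a k-dimensional latent space via truncated SVD (LSA)."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(docs), len(vocab)))  # document-term count matrix
    for row, d in enumerate(docs):
        for w in d.lower().split():
            counts[row, index[w]] += 1.0
    U, s, _ = np.linalg.svd(counts, full_matrices=False)
    return U[:, :k] * s[:k]  # document coordinates in the latent space

def density_peaks(points, d_c):
    """Compute local density rho and distance delta of the original algorithm."""
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    # Gaussian-kernel density (assumed variant); subtract the self term exp(0).
    rho = np.exp(-(dist / d_c) ** 2).sum(axis=1) - 1.0
    order = np.argsort(-rho, kind="stable")  # indices by decreasing density
    delta = np.zeros(n)
    nearest_higher = np.full(n, -1)
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = dist[i].max()  # densest point: use its largest distance
            continue
        higher = order[:rank]  # points ranked denser than i
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        nearest_higher[i] = j
    return rho, delta, nearest_higher, order

def cluster(points, d_c, n_centers):
    """Pick the n_centers points with largest rho * delta, then propagate labels."""
    rho, delta, nearest_higher, order = density_peaks(points, d_c)
    gamma = rho * delta
    centers = np.argsort(-gamma, kind="stable")[:n_centers]
    labels = np.full(len(points), -1)
    labels[centers] = np.arange(n_centers)
    for i in order:  # each point inherits the label of its nearest denser point
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return labels

# Toy corpus (hypothetical): four programming snippets, three linear-algebra snippets.
docs = [
    "python loop list",
    "python list function",
    "python loop function",
    "python function list loop",
    "matrix vector algebra",
    "matrix algebra eigenvalue",
    "vector algebra eigenvalue",
]
labels = cluster(lsa_embed(docs, k=2), d_c=1.0, n_centers=2)
print(labels.tolist())
```

In this baseline both `d_c` and `n_centers` must be chosen by the user, and the Euclidean distance in the latent space ignores the shape of the data manifold; these are the weaknesses that the two improvements described in the abstract target.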
 

Key words: micro-learning, text clustering, density-based clustering, LSA, density-sensitive distance, linear fitting