An improved density peak algorithm for micro-learning unit text clustering based on LSA model

Computer Engineering & Science

WU Guo-sheng, ZHANG Yue-qin

(College of Information and Computer Science, Taiyuan University of Technology, Jinzhong 030600, China)

Received: 2019-09-05 Revised: 2019-10-22 Online: 2020-04-25 Published: 2020-04-25

Abstract

With the explosive growth of micro-learning resources, a large volume of unprocessed, fragmented text resources causes great inconvenience to learners. To help learners find suitable content among these fragmented resources for personalized learning, micro-learning resources in text form need to be clustered. This paper therefore applies an improved density peak algorithm to micro-learning unit text clustering. When the density peak algorithm is used for clustering in this domain, it suffers from four problems: a high-dimensional sparse vector space, insufficient global consistency, sensitivity to the cutoff distance, and supervised selection of density peak centers. To address them, this paper proposes two approaches based on the Latent Semantic Analysis (LSA) model. Firstly, a new definition of local density is proposed according to the clustering requirements, and density-sensitive distance is used as the clustering criterion, so that resolving the cutoff-distance sensitivity also resolves the global consistency problem. Secondly, outliers are identified by linear fitting in order to find the density peak centers automatically, which makes the selection of peak centers unsupervised. Experimental results on real data sets of micro-learning units show that the proposed method is better suited to text clustering of micro-learning units than the original algorithm and other classical clustering algorithms.

Key words: micro-learning, text clustering, density-based clustering, LSA, density-sensitive distance, linear fitting
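For orientation, the original density peak algorithm that the abstract takes as its starting point can be sketched as follows. This is a minimal baseline only, not the improved method proposed in the paper. It assumes plain Euclidean distance on toy 2-D points instead of density-sensitive distance on LSA-reduced text vectors, a Gaussian-kernel local density (a common variant), and a manually supplied number of centers in place of the linear-fitting selection. The names `density_peaks`, `d_c`, and `n_centers` are illustrative and do not come from the paper.

```python
import math


def density_peaks(points, d_c, n_centers):
    """Baseline density peak clustering (illustrative sketch, not the paper's variant).

    points    : list of coordinate tuples
    d_c       : cutoff distance used as the Gaussian kernel width (assumed parameter)
    n_centers : number of centers, chosen by hand here (assumed parameter)
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density rho_i: Gaussian-kernel sum over all other points.
    rho = [
        sum(math.exp(-((dist[i][j] / d_c) ** 2)) for j in range(n) if j != i)
        for i in range(n)
    ]

    # delta_i: distance to the nearest point of higher density.
    order = sorted(range(n), key=lambda i: -rho[i])
    delta = [0.0] * n
    nearest_higher = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no denser neighbour; use its largest distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        nearest_higher[i] = j

    # Centers are the points with the largest gamma_i = rho_i * delta_i.
    gamma = [rho[i] * delta[i] for i in range(n)]
    centers = sorted(range(n), key=lambda i: -gamma[i])[:n_centers]

    # Remaining points inherit the label of their nearest denser neighbour,
    # processed in order of decreasing density.
    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return labels, centers


# Toy data: two well-separated groups of five points each.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
    (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5),
]
labels, centers = density_peaks(points, d_c=1.0, n_centers=2)
print(labels, centers)
```

Both `d_c` and `n_centers` must be set by hand in this baseline, which corresponds to the cutoff-distance sensitivity and the supervised center selection that the abstract sets out to remove.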

WU Guo-sheng, ZHANG Yue-qin. An improved density peak algorithm for micro-learning unit text clustering based on LSA model [J]. Computer Engineering & Science.