J4 ›› 2013, Vol. 35 ›› Issue (9): 127-131.
• Articles •
YANG Wenchuan,LIU Jian,YU Miao
Abstract:
A Chinese word segmentation dictionary based on the double-array Trie-tree offers high search efficiency, but dynamic insertion is time-consuming. Therefore, an improved algorithm (iDAT) based on the double-array Trie-tree is proposed for Chinese word segmentation dictionaries. Nodes with more branches are handled while the original dictionary is being initialized. After initialization, a hash is computed over the index values of the empty sequences in the base array. The final hash table stores the sum of the empty sequences preceding the current empty sequence. iDAT is then used to carry out dynamic insertion. The algorithm adopts the jump strategy of the Sunday single-pattern matching algorithm. At the cost of a reasonable increase in space, it reduces the average time complexity of dynamic insertion into the Trie-tree. Results in practice show that it performs well.
Key words: double-array; Trie-tree; time complexity; word segmentation dictionary
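For readers unfamiliar with the data structure named in the abstract, the following is a minimal sketch of a plain double-array Trie-tree, not the iDAT algorithm itself. It assumes the common convention that a transition from state s on character code c goes to t = base[s] + c and is valid only when check[t] == s. The names build_double_array, contains and END, the character-code mapping, and the breadth-first static build are illustrative assumptions and do not come from the paper. The inner loop that scans for a base value whose target slots are all empty is the kind of free-slot search that makes insertion expensive and that the abstract says iDAT speeds up.

```python
from collections import deque

END = 1  # hypothetical code reserved for the end-of-word marker


def build_double_array(words):
    """Build (base, check, code) arrays for a fixed word list (static build)."""
    # Give every character a small positive code; code 1 is reserved for END.
    alphabet = sorted({ch for w in words for ch in w})
    code = {ch: i + 2 for i, ch in enumerate(alphabet)}

    # Start from an ordinary nested-dict trie keyed by character code.
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(code[ch], {})
        node[END] = {}

    # Index 0 is unused, state 1 is the root; check == -1 means "empty slot".
    base, check = [0, 0], [-1, 0]

    def ensure(n):
        # Grow both arrays so that index n is addressable.
        while len(base) <= n:
            base.append(0)
            check.append(-1)

    queue = deque([(1, root)])
    while queue:
        state, node = queue.popleft()
        if not node:
            continue  # END leaf: nothing to place
        codes = sorted(node)
        b = 1
        # Linear scan for the smallest base whose child slots are all empty.
        while True:
            ensure(b + codes[-1])
            if all(check[b + c] == -1 for c in codes):
                break
            b += 1
        base[state] = b
        for c in codes:
            check[b + c] = state  # slot b + c now belongs to `state`
            queue.append((b + c, node[c]))
    return base, check, code


def contains(base, check, code, word):
    """Return True if `word` was in the list passed to build_double_array."""
    state = 1
    for ch in word:
        c = code.get(ch)
        if c is None:
            return False
        nxt = base[state] + c
        if nxt >= len(check) or check[nxt] != state:
            return False
        state = nxt
    end = base[state] + END
    return end < len(check) and check[end] == state
```

A lookup costs one array access per character, which is the search efficiency the abstract refers to. The sketch rebuilds from scratch rather than inserting dynamically; a dynamic insert would additionally have to relocate a node's children whenever a required slot is already occupied.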
YANG Wenchuan,LIU Jian,YU Miao. Research of an improved algorithm for Chinese word segmentation dictionary based on double-array Trie-tree [J]. J4, 2013, 35(9): 127-131.
URL: http://joces.nudt.edu.cn/EN/
http://joces.nudt.edu.cn/EN/Y2013/V35/I9/127