J4 ›› 2013, Vol. 35 ›› Issue (5): 166-172.
• Articles •
WU Jieming, HAN Yunhui, JI Dandan
Abstract:
Building on the Lucene full-text retrieval toolkit, this paper analyzes the main Chinese word segmentation algorithms in current use and Lucene's relevance sorting algorithm, and proposes an improved segmentation algorithm and an improved relevance sorting algorithm. It also applies inverted indexing, search technologies, distributed storage, and parallel computing to the analysis and design of a search engine for massive collections of digital works, giving users a fast and accurate search service. The experiments compare segmentation speed and segmentation results, as well as the response time, hit count, accuracy, and recall rate of keyword searches. The experimental results show that the system improves search speed while maintaining the accuracy of search results.
Key words: Lucene; segmentation algorithm; index; relevance sorting algorithm; distributed
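The abstract refers to an inverted index and to accuracy and recall as evaluation measures. The following is a minimal sketch of those two ideas only, not the paper's improved algorithms and not Lucene's API. It assumes whitespace tokenization (real Chinese text would need a word segmenter) and boolean AND matching; the function names and the sample documents are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumption: whitespace tokenization stands in for a real segmenter.
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every query term (boolean AND)."""
    terms = query.lower().split()
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical sample collection, for illustration only.
docs = {
    1: "lucene full-text search",
    2: "distributed search engine",
    3: "chinese word segmentation",
}
index = build_inverted_index(docs)
```

With this sample data, `search(index, "search engine")` matches only document 2, and `precision_recall` can then score a result set against a hand-labeled set of relevant ids.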
WU Jieming, HAN Yunhui, JI Dandan. Research and design of search engine for digital works based on Lucene [J]. J4, 2013, 35(5): 166-172.
URL: http://joces.nudt.edu.cn/EN/
http://joces.nudt.edu.cn/EN/Y2013/V35/I5/166