• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2015, Vol. 37 ›› Issue (02): 252-256.

• Article

Design and implementation of Lucene-based full-text retrieval system

ZHOU Jingcai1,HU Huaping1,2,YUE Hong1   

  1. Troop 61070, Fuzhou 350003, China
  2. College of Computer, National University of Defense Technology, Changsha 410073, China
  • Received: 2013-06-24  Revised: 2013-09-29  Online: 2015-02-25  Published: 2015-02-25

Abstract:

With the continuing growth of informatization, high-performance, full-featured text search systems that can quickly locate matching records in massive data sets have become a research hotspot. Based on an analysis of the fundamentals of full-text retrieval and of the structure of Lucene, we present an MVC-pattern full-text retrieval model and develop a retrieval system built on the SSH framework and the Lucene search engine. The system makes three contributions. First, it extends the supported file formats, adding PDF, HTML, and RTF to the TXT and MS Office documents in the search library. Second, it improves the efficiency and accuracy of the Chinese word segmenter. Third, it enhances human-machine interaction with a result display similar to that of Baidu and Google, in which the search keywords are highlighted. Practical use of the system shows that it builds indexes efficiently and speeds up searches while returning more relevant results.

Key words: Lucene; document parsing; full-text retrieval; search engine
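
The pipeline summarized in the abstract (segment the text, build an index, search it, highlight the keywords) can be illustrated with a small, self-contained sketch. It is not the authors' code and does not use the Lucene API. The names `tokenize`, `TinyIndex`, and `highlight` are hypothetical. The abstract does not describe the improved Chinese word segmenter, so overlapping character bigrams are used here purely as a stand-in.

```python
import re
from collections import defaultdict

# Assumption: CJK text is limited to the basic unified ideograph range.
CJK = re.compile(r"[\u4e00-\u9fff]+")
WORD = re.compile(r"[A-Za-z0-9]+")


def tokenize(text):
    """Lowercased Latin words plus overlapping bigrams of each CJK run."""
    tokens = [w.lower() for w in WORD.findall(text)]
    for run in CJK.findall(text):
        if len(run) == 1:
            tokens.append(run)
        tokens.extend(run[i:i + 2] for i in range(len(run) - 1))
    return tokens


class TinyIndex:
    """Hypothetical in-memory inverted index (illustration only)."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of documents containing it
        self.docs = {}                    # doc id -> original text

    def add(self, doc_id, text):
        self.docs[doc_id] = text
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return sorted ids of documents that contain every query term."""
        terms = tokenize(query)
        if not terms:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return sorted(hits)


def highlight(text, query, pre="<em>", post="</em>"):
    """Wrap each literal, case-insensitive occurrence of a query keyword."""
    keywords = WORD.findall(query) + CJK.findall(query)
    if not keywords:
        return text
    # Try longer keywords first so a short one cannot split a longer match.
    keywords.sort(key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    return pattern.sub(lambda m: pre + m.group(0) + post, text)
```

A typical use would be to call `add` once per parsed document, run `search` on the user's query, and pass each hit's text through `highlight` before display. A real deployment would rely on Lucene's own analyzers, index writer, and highlighter rather than on this toy index.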