• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science

Optimization of the log pattern extraction
algorithm for large-scale syslog files

ZHAO Yi-ning, XIAO Hai-li

  1. (Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China)
  • Received: 2017-01-05 Revised: 2017-03-17 Online: 2017-05-25 Published: 2017-05-25

Abstract:

The LARGE system is a log analysis framework deployed in the supercomputing environment of the Chinese Academy of Sciences. It monitors and analyzes the various log files in the environment through log collection, centralized analysis, and result feedback. When monitoring system logs, maintenance personnel need to reduce the large number of original log entries to a small set of log patterns using a log pattern extraction algorithm. However, because of the rapid growth in log size and the peculiarities of messages log files, the traditional log pattern extraction algorithm can no longer process logs quickly enough. We propose an optimization of the log pattern extraction algorithm that borrows the idea of the MapReduce mechanism to accelerate pattern extraction when there are multiple input log files. Evaluation results show that, when there are many input files, the optimization substantially reduces the running time of the vocabulary consistency algorithm. We also evaluate the time cost and the extraction effect of the optimized algorithm when the vocabulary conversion function is used.

Key words: log processing, MapReduce, big data analysis, grid environment