• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2015, Vol. 37 ›› Issue (11): 245-2054.

• Articles •

Design and implementation of a real-time parsing and storage system for massive meteorological data

王若曈1,黄向东2,张博2,王建民2,罗兵1   

  1. (1. National Meteorological Center, Beijing 100081; 2. School of Software, Tsinghua University, Beijing 100084)
  • Received: 2015-08-13 Revised: 2015-10-17 Online: 2015-11-25 Published: 2015-11-25

Design and implementation of a real-time parsing and storage system for massive meteorological data

 WANG Ruotong1,HUANG Xiangdong2,ZHANG Bo2,WANG Jianmin2,LUO Bing1   

  1. (1.National Meteorological Center,Beijing 100081;2.School of Software,Tsinghua University,Beijing 100084,China)
  • Received:2015-08-13 Revised:2015-10-17 Online:2015-11-25 Published:2015-11-25

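Abstract:

Meteorological data is a typical kind of unstructured data whose daily increment in practice reaches tens of terabytes. Parsing, storage and access patterns built on relational databases and traditional file systems have become one of the bottlenecks constraining the informatization of weather forecasting systems. To give users of MICAPS, the national weather forecasting platform, timely and fast queries over real-time data, this paper presents a real-time data parsing system that works stably 7×24 hours and sustains tens of terabytes per day. Following the multidimensional model of meteorological data and observed user behavior, a high-performance massive data storage system is designed and implemented on a non-relational distributed key-value database. Practice shows that the real-time parsing system and the storage system based on the distributed non-relational key-value database effectively meet the storage, query and application requirements of massive real-time meteorological data. The system has become a core system in China's operational weather forecasting workflow and demonstrates excellent functionality and performance.

Key words: multidimensional data, meteorological data, distributed, parsing, storage, MICAPS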

Abstract:

Meteorological data is typical unstructured data, and in practice it grows by dozens of terabytes per day. Parsing, storage and access modes based on relational databases and traditional file systems have become a bottleneck for weather forecast data processing systems. To provide users of the national weather forecast platform MICAPS with fast and timely queries over real-time data, we describe a distributed data parsing system that runs stably around the clock (7×24) and parses dozens of terabytes per day in real time. Based on the multidimensional model of meteorological data and on user behavior, we design and implement a high-performance massive data storage system on a non-relational key-value distributed database. Operational use shows that the proposed real-time data parsing system and the massive data storage system based on the non-relational key-value distributed database meet the storage, query and application requirements of massive meteorological data. The system is now the core system of China's weather forecast data flow and delivers excellent functionality and performance.

Key words: multidimensional data; meteorological data; distributed; parsing; storage; MICAPS
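
The abstract states that the storage design follows the multidimensional model of meteorological data and uses a distributed key-value database, but it does not give the key layout. The sketch below is only an illustration of one common way such a mapping can work: several dimensions are packed into a single ordered key so that a prefix scan retrieves related fields. The dimension names (`model`, `element`, `level`, `init_time`, `lead_hours`), the key format, and the in-memory `SortedKVStore` are all assumptions made for this example and are not taken from the paper or from MICAPS.

```python
from bisect import bisect_left, insort
from typing import NamedTuple


class FieldKey(NamedTuple):
    """Hypothetical dimensions of one gridded field (illustrative only)."""
    model: str       # data source or model name (made-up example: "MODEL_A")
    element: str     # physical element, e.g. "TMP" for temperature
    level: int       # vertical level, e.g. pressure in hPa
    init_time: str   # initialisation time as YYYYMMDDHH
    lead_hours: int  # forecast lead time in hours


def encode(key: FieldKey) -> str:
    # Zero-padded numeric parts make lexicographic order match numeric order.
    return (f"{key.model}/{key.element}/{key.level:04d}/"
            f"{key.init_time}/{key.lead_hours:03d}")


class SortedKVStore:
    """In-memory stand-in for a key-value store that keeps keys ordered."""

    def __init__(self):
        self._keys = []    # sorted list of keys
        self._values = {}  # key -> value

    def put(self, key, value):
        if key not in self._values:
            insort(self._keys, key)
        self._values[key] = value

    def get(self, key):
        return self._values.get(key)

    def scan_prefix(self, prefix):
        # All keys sharing a prefix are contiguous in sorted order.
        i = bisect_left(self._keys, prefix)
        result = []
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            result.append((self._keys[i], self._values[self._keys[i]]))
            i += 1
        return result
```

With this layout, a query such as "all lead times of one element at one level for one model run" becomes a single call, for example `store.scan_prefix("MODEL_A/TMP/0500/2015081300/")`. Which dimensions lead the key decides which queries are cheap, so a real design would order them according to the observed user behavior that the abstract mentions.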