• Journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2015, Vol. 37 ›› Issue (12): 2216-2221.

• Articles •

Tide-bound water level computing and visualization platform based on the Spark framework

QIN Bo1, ZHU Yong1, QIN Xue2

  (1. College of Information Science and Engineering, Ocean University of China, Qingdao, Shandong 266100, China; 2. National Marine Data & Information Service, Tianjin 300171, China)
  • Received: 2015-08-20 Revised: 2015-10-26 Online: 2015-12-25 Published: 2015-12-25
  • Supported by:

    Special scientific research fund for the marine public welfare industry (201105033)

Abstract:

Tide-bound water level computing is an important part of ocean environmental information processing. It is computationally intensive, highly complex, and time-consuming. Implementing it on a traditional cluster (HPC) computing model leads to high computation cost and poor scalability and interactivity. To address these problems, we propose a tide-bound water level computing and visualization platform based on the Spark framework. Building on a study of Spark's task scheduling algorithms, we design and implement a task scheduling algorithm based on node computing capacity, and we parallelize the retrieval, extraction, numerical calculation, and feature-based visualization of long-time-series, multi-task tide-bound water level data, thereby achieving parallel computation and visualization of massive ocean environmental data on Spark. Experimental results show that the platform effectively improves the efficiency of distributed parallel processing of massive tide-bound water level data, lessens the dependence of tidal level calculation on high-performance clusters, and reduces computation cost. In addition, the newly developed task scheduling algorithm allocates tasks more rationally, which further improves efficiency. The proposed platform thus provides a new method for faster and more efficient tide-bound water level computing.

Key words: Spark; tide-bound water level; task scheduling algorithm; parallel processing; ocean environmental information