
Computer Engineering & Science, 2021, Vol. 43, Issue (10): 1736-1743.

• High Performance Computing •

A parallel processing approach for video big data based on Spark Streaming framework

ZHANG Yuan-ming, YU Jia-rui, LU Jia-wei, GAO Fei, XIAO Gang

  1. (College of Computer Science & Technology, Zhejiang University of Technology, Hangzhou 310023, China)

  • Received: 2020-05-18 Revised: 2020-09-02 Accepted: 2021-10-25 Online: 2021-10-25 Published: 2021-10-22
  • Supported by:
    Natural Science Foundation of Zhejiang Province (LY19F020034); Open Project of the State Key Laboratory of Computer Architecture (CARCH201804)

Abstract: Video devices are widely used in public areas, intelligent transportation, industrial production, and many other fields. The video data they produce shows the typical characteristics of big data: huge volume, high velocity, sparse value, and a completely unstructured form. To further improve the processing performance of video big data, a parallel processing approach based on Spark Streaming is proposed, together with a parallel processing framework built on Spark Streaming. Parallelization strategies are given for inter-frame independent analysis algorithms and for inter-frame correlated analysis algorithms. The former uses data parallelism to map the video frames that remain after redundancy removal to different nodes for parallel processing. The latter uses pipeline parallelism to map the operators of the analysis algorithm to different nodes according to their dependencies. The framework and the strategies are evaluated in practical applications with real video big data: a parallel algorithm for detecting the number of elevator passengers and a parallel algorithm for detecting elevator door anomalies are designed. When the number of nodes increases to 16, the speedup of the elevator passenger number detection algorithm is 615%, and the speedup of the elevator door anomaly detection algorithm is 253%.

Key words: video big data, parallel processing strategy, inter-frame correlation analysis, inter-frame independence analysis
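
The two parallelization strategies named in the abstract can be illustrated with a minimal sketch. It is plain Python that uses threads and queues as a stand-in for Spark Streaming nodes, and it is not the authors' implementation. The operator names (`bright_pixels`, `mean_brightness`, `ChangeFromPrevious`, `is_anomaly`), the toy frames, and the thresholds are all hypothetical and chosen only to show the difference between the two mappings.

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Queue
from threading import Thread

SENTINEL = object()  # marks the end of the frame stream


def data_parallel(frames, analyze, workers=4):
    """Data parallelism: each frame is analyzed independently by a pool worker."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(analyze, frames))  # map() preserves input order


def pipeline_parallel(frames, operators):
    """Pipeline parallelism: one thread per operator, linked by FIFO queues."""
    queues = [Queue() for _ in range(len(operators) + 1)]

    def run_stage(op, inbox, outbox):
        while (item := inbox.get()) is not SENTINEL:
            outbox.put(op(item))
        outbox.put(SENTINEL)  # pass end-of-stream to the next stage

    threads = [
        Thread(target=run_stage, args=(op, queues[i], queues[i + 1]))
        for i, op in enumerate(operators)
    ]
    for t in threads:
        t.start()
    for frame in frames:
        queues[0].put(frame)
    queues[0].put(SENTINEL)

    results = []
    while (item := queues[-1].get()) is not SENTINEL:
        results.append(item)
    for t in threads:
        t.join()
    return results


def bright_pixels(frame):
    """Hypothetical frame-independent operator: count pixels brighter than 128."""
    return sum(1 for p in frame if p > 128)


def mean_brightness(frame):
    """Hypothetical first pipeline operator: average pixel value of a frame."""
    return sum(frame) / len(frame)


class ChangeFromPrevious:
    """Hypothetical frame-correlated operator: remembers the previous value."""

    def __init__(self):
        self.prev = None

    def __call__(self, value):
        change = 0.0 if self.prev is None else abs(value - self.prev)
        self.prev = value
        return change


def is_anomaly(change, threshold=50.0):
    """Hypothetical last pipeline operator: flag a large change between frames."""
    return change > threshold


# Toy "frames": short lists of pixel intensities (assumed data, not from the paper).
frames = [
    [10, 200, 30, 250],
    [12, 198, 29, 251],
    [240, 240, 240, 240],
    [11, 199, 30, 250],
]

counts = data_parallel(frames, bright_pixels)
flags = pipeline_parallel(frames, [mean_brightness, ChangeFromPrevious(), is_anomaly])
print(counts)  # [2, 2, 4, 2]
print(flags)   # [False, False, True, True]
```

In this sketch the frame-independent operator can be handed to any worker in any order, while the frame-correlated operator keeps state between frames, so each pipeline stage runs in a single thread and the queues keep the frames in arrival order.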