• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2011, Vol. 33 ›› Issue (3): 129-135. doi: 10.3969/j.issn.1007130X.2011.

• Papers •

MapReduce: A New Programming Model for Distributed Parallel Computing

LI Chenghua,ZHANG Xinfang,JIN Hai,XIANG Wen   

  1. (School of Computer Science and Technology,
    Huazhong University of Science and Technology, Wuhan 430074, China)
  • Received: 2009-12-29 Revised: 2010-05-04 Online: 2011-03-25 Published: 2011-03-25

Abstract:

MapReduce is a programming model introduced by Google for writing applications that rapidly process vast amounts of data in parallel on large clusters of computing nodes. The model is inspired by the map and reduce functions commonly used in functional programming. A MapReduce job usually splits the input dataset into independent chunks, which the map tasks process in a completely parallel manner. The reduce tasks then merge all intermediate values generated by the map tasks. Users need only specify the map and reduce functions. Partitioning the input data, scheduling the program’s execution across a set of machines, handling machine failures, and managing the required inter-machine communication are all taken care of by the MapReduce runtime system. MapReduce is expected to be widely adopted on cloud computing platforms. Several aspects of Hadoop MapReduce, the implementation contributed by Apache, still need improvement.
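
The division of labor described in the abstract can be illustrated with a minimal, single-process sketch in Python. It is not the paper's code and not the Hadoop API; the names `map_fn`, `shuffle`, `reduce_fn`, and `run_job` are hypothetical, and the word-count task is chosen here only as a familiar example. The user-supplied parts are `map_fn` and `reduce_fn`; `shuffle` and `run_job` stand in for work that a real runtime would do across machines.

```python
from collections import defaultdict
from itertools import chain


def map_fn(chunk):
    """Map task (user-supplied): emit a (word, 1) pair for each word in one chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Stand-in for the runtime: group intermediate values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_fn(key, values):
    """Reduce task (user-supplied): merge all intermediate values for one key."""
    return key, sum(values)


def run_job(chunks):
    """Hypothetical driver: run the map phase, group by key, then reduce."""
    # Chunks are independent, so a real runtime could map them in parallel;
    # here they are processed one after another for simplicity.
    intermediate = chain.from_iterable(map_fn(c) for c in chunks)
    return dict(reduce_fn(k, v) for k, v in shuffle(intermediate).items())


# Three input chunks, as if the input dataset had already been split.
counts = run_job(["the quick brown fox", "the lazy dog", "The fox"])
```

In this sketch, partitioning, scheduling, and failure handling are deliberately absent, since those are the responsibilities the abstract assigns to the runtime system instead of to the user.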

Key words: MapReduce; distributed parallel computing; cloud computing