A Method for Solving the Congestion Issue During the Single Node Recovering Based on the MapReduce Model

ZHANG Zhaoning,PENG Yuxing

doi:10.3969/j.issn.1007130X.2011.

Computer Engineering & Science, 2011, Vol. 33, Issue 3: 146-151

Article

(National Laboratory for Parallel and Distributed Processing, Changsha 710073, China)

Received date: 2009-10-21

Revised date: 2010-01-09

Online published: 2011-03-25

Abstract

The MapReduce model provides strong support for data-intensive supercomputing as a fundamental application platform. It uses a single-node task scheduler, which keeps the architecture simple and makes the worker nodes easy to control, but which also introduces a single-node failure problem. Released versions of Hadoop (an open-source MapReduce implementation) offer three different recovery mechanisms: synchronization on demand, recovery from history logs, and dropping. This paper analyzes the data congestion, result errors, and efficiency decline that arise under these three mechanisms, and then proposes a method that delivers task-dependency information to solve these problems.

Key words: MapReduce; Hadoop; task scheduling; single node error recovery; task dependency

Cite this article

ZHANG Zhaoning,PENG Yuxing . A Method for Solving the Congestion Issue During the Single Node Recovering Based on the MapReduce Model[J]. Computer Engineering & Science, 2011 , 33(3) : 146 -151 . DOI: 10.3969/j.issn.1007130X.2011.
