Fault-Tolerance Techniques for On-Board Parallel Computer System Based on Distributed Architecture
Received date: 2009-07-03
Revised date: 2009-11-05
Online published: 2011-03-25
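On-board computers need fault-tolerance techniques to meet the reliability requirements of operating in
outer space. Current multi-node on-board computer systems are usually designed as a master-slave structure
in which fault-tolerance policy control is concentrated on a single master node; this structure carries the
risk that a single point of failure paralyzes the whole system. To address this, this paper proposes an
on-board parallel fault-tolerant computer system with a distributed architecture, which distributes the
centrally controlled fault-tolerance components across the individual nodes and thereby improves the
fault-tolerance reliability of the system. On this architecture, fault-tolerance strategies for the
computing nodes, the fault-tolerance components, and I/O are proposed, and the corresponding models and
simulation test results are given, providing useful guidance and reference for the development of similar
projects.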
Keywords: fault tolerance; on-board computer; distributed system
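WANG Weicheng, LUO Yu. Fault-Tolerance Techniques for On-Board Parallel Computer System Based on Distributed Architecture[J]. Computer Engineering & Science, 2011, 33(3): 51-56. DOI: 10.3969/j.issn.1007130X.2011.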
Fault-tolerant techniques can provide high reliability for on-board computers
running in outer space. Current multi-node on-board systems are designed as a master-slave
structure, which concentrates the fault-tolerance strategy in the master node and
thereby contains a hidden danger: a single point of failure can paralyze the whole system.
A parallel fault-tolerant computer system with a distributed framework is proposed in this
paper. Based on the framework, the computing nodes and fault-tolerant units are designed and
some novel fault-tolerant strategies are introduced. Our work can serve as a useful
guideline for the development of related projects.