• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2011, Vol. 33 ›› Issue (3): 51-56.doi: 10.3969/j.issn.1007130X.2011.


基于分布式架构的星载并行计算机容错技术

王伟成,罗宇   

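  1. (School of Computer Science, National University of Defense Technology, Changsha, Hunan 410073)
  • Received: 2009-07-03 Revised: 2009-11-05 Online: 2011-03-25 Published: 2011-03-25
  • About the authors: WANG Weicheng (b. 1981), male, from Xi'an, Shaanxi, master's student; research interests: operating systems, fault-tolerance techniques, and computer networks. LUO Yu (b. 1963), male, from Hengyang, Hunan, Ph.D., professor; research interests: operating systems and computer networks.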

FaultTolerance Techniques for OnBoard Parallel  Computer System Based on Distributed Architecture

WANG Weicheng,LUO Yu   

  1. (School of Computer Science,National University of Defense Technology,Changsha 410073,China)
  • Received:2009-07-03 Revised:2009-11-05 Online:2011-03-25 Published:2011-03-25

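Abstract (translated from the Chinese):

On-board computers need fault-tolerance techniques to meet the reliability requirements of operating in outer space. Current multi-computer on-board systems are usually designed with a master-slave structure in which fault-tolerance policy control is concentrated on a single master node; this structure carries the hidden risk that one point of failure paralyzes the whole system. To address this, the paper proposes an on-board parallel fault-tolerant computer system with a distributed architecture, which distributes the centrally controlled fault-tolerance components across the individual nodes and thereby improves the fault-tolerance reliability of the system. On this architecture, fault-tolerance strategies for the computing nodes, the fault-tolerance components, and I/O are proposed, and the corresponding models and simulation test results are given, providing useful guidance and reference for the development of similar projects.

Keywords: fault tolerance; on-board computer; distributed system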

Abstract:

Faulttolerant techniques can provide high reliability for onboard computers

running in the outer space, the current multinode onboard systems are designed as a master

slave structure, which focuses on the strategy of faulttolerance  in the master node and

hereby contains a hidden danger. A parallel faulttolerant computer system with a distributed

framework is proposed in this paper. Based on the framework, the computing nodes and fault

tolerant units are designed and some novel faulttolerant strategies are introduced. Our work

can serve as an important guideline for the development of the related projects.

Key words: fault tolerance; on-board computer; distributed system
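
The contrast the abstract draws between master-slave and distributed fault-tolerance control can be illustrated with a small sketch. The abstract does not state which detection mechanism the paper uses, so everything below is an assumption made for illustration: the `Node` class, the heartbeat timeout, and the majority vote are hypothetical and are not taken from the paper. The point shown is only that every node carries its own copy of the fault-tolerance logic, so no single node's failure removes the ability to detect faults.

```python
from collections import Counter


class Node:
    """A compute node that carries its own fault-detection logic (hypothetical)."""

    def __init__(self, node_id, all_ids):
        self.node_id = node_id
        self.peers = set(all_ids) - {node_id}
        self.last_heartbeat = {}  # peer id -> time the last heartbeat was seen

    def heartbeat_from(self, peer_id, now):
        self.last_heartbeat[peer_id] = now

    def suspects(self, now, timeout):
        """Peers not heard from within `timeout` (never heard counts as silent)."""
        return {p for p in self.peers
                if now - self.last_heartbeat.get(p, float("-inf")) > timeout}

    def decide(self, reports):
        """Majority vote over suspicion reports (assumed policy, not the paper's).

        `reports` maps voter id -> set of ids that voter suspects. A node is
        declared failed when more than half of the other voters suspect it.
        """
        votes = Counter()
        for voter, suspected in reports.items():
            votes.update(suspected - {voter})
        failed = set()
        for target, count in votes.items():
            eligible = sum(1 for v in reports if v != target)
            if 2 * count > eligible:
                failed.add(target)
        return failed


def simulate():
    """Four nodes; node 2 has crashed and sends neither heartbeats nor reports."""
    ids = [0, 1, 2, 3]
    nodes = {i: Node(i, ids) for i in ids}
    crashed = {2}
    # At t = 1.0 every healthy node broadcasts a heartbeat to the others.
    for sender in ids:
        if sender in crashed:
            continue
        for receiver in ids:
            if receiver != sender and receiver not in crashed:
                nodes[receiver].heartbeat_from(sender, now=1.0)
    # At t = 3.0 each healthy node publishes the peers it suspects.
    reports = {i: nodes[i].suspects(now=3.0, timeout=3.0)
               for i in ids if i not in crashed}
    # Every healthy node evaluates the same reports locally; there is no master.
    return {i: nodes[i].decide(reports) for i in reports}


if __name__ == "__main__":
    print(simulate())
```

In this sketch each surviving node reaches the same verdict from the same reports, which is the property a master-slave design lacks when the master itself is the node that fails.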