• Journal of the China Computer Federation
• Chinese Science and Technology Core Journal
• Chinese Core Journal

Computer Engineering & Science (计算机工程与科学)

• High Performance Computing

Software fault-tolerance based on monitoring CPU utilization ratio in real-time operating systems

WANG Yuwei1, CAO Dong2, SHI Shucheng1

  1. College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, Jiangsu, China
  2. Institute of Flight Control Research, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, Jiangsu, China
  • Received: 2017-06-07 Revised: 2017-08-15 Online: 2018-08-25 Published: 2018-08-25

Abstract:

In hardware real-time operating systems, the CPU utilization ratio is an important indicator of system performance. If one task occupies the entire CPU, the other tasks cannot continue to run, with potentially disastrous consequences for the system. Based on an analysis of how software runs in real-time operating systems, the system design needs a fault-tolerance strategy to improve reliability and fault tolerance. Tasks in the flight control software are monitored in real time under the μC/OS-II real-time operating system. First, a method for calculating the CPU utilization ratio under μC/OS-II is given and a reasonable CPU monitoring period is proposed. Second, a fault detection algorithm for abnormal CPU utilization is presented, and detected faults are handled to improve the system's fault tolerance. Finally, embedded flight control software is written for the MPC5674 flight control computer to validate four methods of handling abnormal CPU utilization. Simulation results show that software fault tolerance based on monitoring the CPU utilization ratio can effectively improve system reliability and fault tolerance.

Key words: real-time system, software fault-tolerance, CPU utilization ratio, exception handling
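
The abstract names two steps, calculating the CPU utilization ratio and detecting abnormal utilization, without giving formulas. The C sketch below illustrates one plausible shape for both and is not the authors' algorithm. It assumes the idle-counter approach used by the μC/OS-II statistics task, where the idle count observed in one period is compared against a calibrated maximum. It also assumes a fault is declared only after utilization stays above a threshold for several consecutive monitoring periods. All names (cpu_usage_percent, cpu_monitor_t, cpu_monitor_update) and the threshold and trip-count parameters are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPU usage in percent for one monitoring period.
 * idle_run: idle-loop count observed during the period.
 * idle_max: idle-loop count of a fully idle period (calibration value).
 * Assumed formula: usage = 100 - 100 * idle_run / idle_max. */
static uint8_t cpu_usage_percent(uint32_t idle_run, uint32_t idle_max)
{
    if (idle_max == 0u) {
        return 0u;              /* not calibrated yet: report no load */
    }
    if (idle_run >= idle_max) {
        return 0u;              /* clamp: CPU was fully idle */
    }
    /* 64-bit intermediate avoids overflow in idle_run * 100 */
    uint64_t idle_pct = ((uint64_t)idle_run * 100u) / idle_max;
    return (uint8_t)(100u - idle_pct);
}

/* Hypothetical detector state: a fault is flagged once usage has exceeded
 * threshold_pct for trip_count consecutive monitoring periods. */
typedef struct {
    uint8_t threshold_pct;      /* assumed usage limit, in percent */
    uint8_t trip_count;         /* assumed number of consecutive periods */
    uint8_t over_count;         /* consecutive over-threshold periods so far */
} cpu_monitor_t;

/* Feed one usage sample; returns true while the fault condition holds. */
static bool cpu_monitor_update(cpu_monitor_t *m, uint8_t usage_pct)
{
    if (usage_pct > m->threshold_pct) {
        if (m->over_count < m->trip_count) {
            m->over_count++;    /* saturate at trip_count */
        }
    } else {
        m->over_count = 0u;     /* any normal sample resets the streak */
    }
    return m->over_count >= m->trip_count;
}
```

Requiring several consecutive over-threshold periods is one way to avoid reacting to a short burst of load. How the four handling methods mentioned in the abstract respond to a detected fault is not described there, so it is left out of this sketch.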