RMIT University

A fault tolerant parallel computing architecture for remote sensing satellites

thesis
posted on 2024-11-23, 20:15 authored by Sharon Siok Lin Lim
This thesis is concerned with the design concepts of a fault tolerant, high performance parallel computing payload for remote-sensing missions. Current small satellite missions generally lack high onboard computational power because of limits on power, space, volume or budget. This thesis investigates a cost-effective way of designing space computing architectures that remain reliable despite the use of commercial-off-the-shelf (COTS) components.

The COTS-enabling technology from this work has achieved a high reliability figure for the Parallel Processing Unit (PPU) computing payload, which is designed using commercial grade processors, Field Programmable Gate Arrays (FPGAs), memory chips and serial flash chips. Its optimal use of resources makes the PPU a valuable high performance computing resource for small satellite missions, and its computational power will enable a new class of space applications for such missions.

The computing payload proposed in this thesis is a parallel cluster of COTS processing nodes, interconnected using network elements that are based on COTS FPGAs. Part of this research work has been adopted for use in the PPU, a secondary payload onboard the XSat micro-satellite. XSat is built by the Centre of Research for Satellite Technologies (CREST) and is scheduled for launch in 2011. The centre is located at Nanyang Technological University, Singapore. The author is a full-time project member at the centre, in charge of PPU payload development.

The computing payload uses parallelism of COTS processor nodes to achieve high computing performance, and fault tolerant schemes to maintain reliability. This thesis focuses on the provision of highly fault tolerant and reconfigurable networks that enable reliable communication not only among parallel processors, but also with memory chips and external interfaces.

The provision of multiple communication schemes, consisting of an inter-cluster ring network and a mesh processor array that are both fault tolerant and reconfigurable, has given the payload a high probability of survival in the harsh space radiation environment. This is coupled with autonomous processor fault detection and recovery schemes.

The PPU computing payload is also highly adaptive to changing reliability and computation needs, allowing a trade-off between the two at mission runtime. The PPU adopts industry standards for part reliability computation and system reliability modelling. The resulting system reliability figure is a valuable check that the extent of fault tolerance is sufficient without being over-provisioned. Over-provisioning fault tolerant paths wastes valuable and expensive resources onboard the satellite.
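
The trade-off between reliability and computation described in the abstract can be illustrated with a generic k-out-of-n reliability model. The sketch below is not the thesis's own model: it assumes identical, independent processing nodes with a constant failure rate, and the node count, failure rate and mission duration are hypothetical values chosen only for illustration.

```python
from math import comb, exp


def node_reliability(failure_rate_per_hour: float, hours: float) -> float:
    """Constant-failure-rate (exponential) model: R(t) = exp(-lambda * t)."""
    return exp(-failure_rate_per_hour * hours)


def k_of_n_reliability(k: int, n: int, r: float) -> float:
    """Probability that at least k of n independent, identical nodes survive.

    r is the survival probability of a single node over the mission.
    """
    return sum(comb(n, i) * r**i * (1 - r) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Hypothetical figures, NOT taken from the thesis.
    nodes = 8
    r = node_reliability(failure_rate_per_hour=1e-5, hours=3 * 8760)
    for needed in range(1, nodes + 1):
        # More nodes used for computation means fewer spares and lower reliability.
        print(f"{needed} of {nodes} nodes required: "
              f"R = {k_of_n_reliability(needed, nodes, r):.4f}")
```

Requiring more nodes for computation leaves fewer spares, so the system reliability falls as the required node count rises. A figure computed this way is the kind of check that indicates whether the amount of redundancy is sufficient or excessive.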

History

Degree Type

Doctorate by Research

Imprint Date

2009-01-01

School name

School of Science, RMIT University

Former Identifier

9921859068601341

Open access

  • Yes
