RMIT University

Architecture-Based Fault Tolerance Support for Grid Applications

conference contribution
posted on 2024-10-31, 10:33 authored by Iman Ibrahim Yusuf, Heinrich Schmidt, Ian Peake
Failure in long running grid applications is arguably inevitable and costly. Therefore, fault tolerance (FT) support for grid applications is needed. This paper evaluates an extension of our prior work on Recovery Aware Components (RAC), a component-based FT approach. Our extension utilizes the grid application architecture according to a small number of architectural classes. In this paper, we evaluate the MapReduce architecture only and analyze the reliability improvement MapReduce applications would gain by adopting the RAC approach. Our analysis shows that significant increases in reliability are possible at moderate extra cost. Obviously the cost of FT depends on the failure rate of the managed system, i.e., the system to be protected from faults, and the FT strategy chosen. Our work aims to give High Performance Computing (HPC) software architects the tools to control these factors for different grid application architectures.

Related Materials

  1. ISBN - Is published in 9781450307246 (urn:isbn:9781450307246)

Start page

177

End page

181

Total pages

5

Outlet

Proceedings of the Joint ACM SIGSOFT Conference

Editors

Jens Happe and Dorina Petriu

Name of conference

Quality of software architectures 2011

Publisher

ACM

Place published

New York, New York, United States

Start date

2011-06-21

End date

2011-06-23

Language

English

Copyright

© Copyright 2011 ACM

Former Identifier

2006026161

Esploro creation date

2020-06-22

Fedora creation date

2011-07-14