RMIT University

Benchmarking the Performance of Hadoop Triple Replication and Erasure Coding on a Nation-Wide Distributed Cloud

conference contribution
posted on 2024-11-03, 12:54 authored by Lakshmi J Mohan, Renji Harold, Pablo Caneleo, Udaya Parampalli, Aaron Harwood
Large-scale distributed storage systems play a vital role in maintaining data across storage locations globally. These systems use replication as the default mechanism for providing fault tolerance. Recently, erasure codes have come into use as a viable alternative to replication, since they provide the same fault tolerance with reduced storage overhead. However, their performance in a geographically diverse distributed storage system is unclear. This paper compares the performance of triple replication with that of the erasure coding (Reed-Solomon codes) used in Apache Hadoop's implementation of a distributed file system, on a cluster distributed across Australia that runs on the NeCTAR research cloud. Our results show that using erasure coding does not degrade read performance in such a setting. We also compare Hadoop's code with a local reconstruction code implemented in the XORBAS version of Hadoop. These codes perform well in our clusters, but the performance gain observed in our results does not match the results reported. Hence, we need new codes that perform better by addressing the issue of geographical diversity. We believe that our framework is readily usable to test a range of novel erasure codes that are being introduced in the literature.

Related Materials

  1. DOI - Is published in 10.1109/NETCOD.2015.7176790
  2. ISBN - Is published in 9781479919123 (urn:isbn:9781479919123)

Start page

61

End page

65

Total pages

5

Outlet

Proceedings of the 2015 International Symposium on Network Coding, NetCod 2015

Name of conference

NetCod 2015

Publisher

IEEE

Place published

United States

Start date

2015-06-22

End date

2015-06-24

Language

English

Copyright

© 2015 IEEE

Former Identifier

2006102448

Esploro creation date

2020-11-06