Network transfer and disk read are the two most time-consuming operations in the repair process for node failures in erasure-code-based distributed storage systems. Recent work on Reed-Solomon codes has demonstrated repair schemes that achieve optimal network bandwidth in the recovery of single failures, although in certain cases at the expense of a trivially high I/O cost, that is, the number of disk reads performed by a repair scheme. We are interested in the lowest I/O cost that a repair scheme can achieve for Reed-Solomon codes. We establish two repair schemes for a family of Reed-Solomon codes with two parities that achieve the optimal I/O cost.
Funding
Advanced coding techniques for fast failure recovery in storage systems