« décembre 2006 | Main | avril 2007 »

février 16, 2007

The rsync algorithm applied to S3

Many people would like to see rsync work with S3, unfortunately S3 is a "dumb" file storage that can't run rsync. It's possible to run rsync on a EC2 that accesses your files on S3, but it's complex, (more) expensive and not elegant.

I spent some time reading the rsync algorithm and seeing how it could be adapted to S3 (or any other dumb file storage like a FTP server).

Suppose we have one general purpose computer α and a dumb file storage service S3. We want to minimize the amount of data transfered between α and S3.

Upload

Computer α wants to place file A on S3.

The modified rsync algorithm consists of the following steps:


  1. α splits the file A into a series of non-overlapping fixed-sized blocks of size S bytes. The last block may be shorter than S bytes.

  2. For each of these blocks α calculates two checksums: a weak "rolling" 32-bit checksum (described in the rsync algorithm) and a strong 128-bit MD4 checksum.

  3. α creates a XML file A' on S3, that contains the checksums calculated in step 2, a whole-file MD5 checksum and the metadata for file A

  4. α compresses and encrypt each block of file A, then stores each of them in a different object on S3. Each of these block-object has its MD4 checksum as its name and is stored in a bucket P. If the block-object already exists in P, there's no need to resend the file.

Download

Computer α wants to get file B from S3 and already has file A where file A and B are "similar."

The modified rsync algorithm consists of the following steps:


  1. α splits the file A into a series of non-overlapping fixed-sized blocks of size S bytes. The last block may be shorter than S bytes.

  2. For each of these blocks α calculates two checksums: a weak "rolling" 32-bit checksum (described in the rsync algorithm) and a strong 128-bit MD4 checksum.

  3. α gets XML file B' from S3.

  4. α searches through A to find all blocks of length S bytes (at any offset, not just multiples of S) that have the same weak and strong checksum as one of the blocks of A. This can be done in a single pass very quickly using a special property of the rolling checksum described in the rsync algorithm.
  5. α gets each block-object listed in B' from P that is does not already has. It then decrypts and decompresses them.
  6. α reconstitutes file B using blocks from file A and blocks downloaded from P.

  7. α validates the integrity of the file by verifying it's checksum listed in B'.

  8. Α restores the file metadata as listed in B'.

Similarity with existing products

With hindsight, this looks a lot like what S3InfiDisk is doing, with the exception that S3InfiDisk uses 1 MB blocks while rsync uses 1 KB blocks.

Posted by gfk at 8:31 AM | Comments (1)