S3 Sync Update
A while back I wrote about my strategy for synchronizing between two machines using Amazon’s S3 and the JungleDisk tool. I just wanted to post a quick update that refines that strategy a bit.
First, let me describe what needed improvement. I sync the ~/Documents directory between my home and work MacBooks. However, on my home machine I have some extra files that really don’t belong on my work machine (like Quicken files), so I have a small text file (called sync_files) that enumerates which sub-directories and files in ~/Documents are to be synchronized between the two machines.
This all worked pretty well until I noticed duplicates of files appearing in different places. I realized that I had moved the files on one of the disks and then sync’d with S3. With my scripts at the time, this resulted in copying the file to the new location, but not removing the old one.
So with a quick glance at the rsync man-page, I found the --delete option. I refined my scripts and ran them. It all looked good–until I got home. Oops, I just lost a whole bunch of files. Uh-oh. It turns out I forgot to use the sync_files file for both directions. This was an easy tweak but reminded me of the Golden Rule of Rsync:
Always run rsync with --verbose and --dry-run to make sure it’s doing what you think it’s doing.
So I decided it was time to re-write the script to support this. While you can do command-line options with bash, it quickly gets kinda oogy, so I fell back on Ruby instead. I’ve also collapsed the synchronizing down into a single script–one that goes both ways. So without further ado, you can download the script here. This should work with a stock Ruby install, no special gems required.
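For readers who just want the shape of the idea, here is a minimal sketch (not the downloadable script itself) of a wrapper that defaults to a dry run and uses the same list for both directions. The --live and --from-remote flag names, the /Volumes/JungleDisk/Documents mount point, and the convention that directory entries in sync_files end with a slash are all assumptions made for illustration. It only prints the rsync commands it would run.

```ruby
require 'optparse'

# Hypothetical locations; adjust to your own setup.
LOCAL_ROOT  = File.expand_path('~/Documents')
REMOTE_ROOT = '/Volumes/JungleDisk/Documents' # assumed JungleDisk mount point

# Parse the contents of sync_files, skipping blank lines and # comments.
def read_sync_list(text)
  text.lines.map(&:strip).reject { |line| line.empty? || line.start_with?('#') }
end

# Build the rsync argument list for one entry.
# Dry run is the default; --delete is only added for directory entries
# (assumed here to be written with a trailing slash).
def rsync_command(src_root, dest_root, entry, live: false)
  args = ['rsync', '--archive', '--verbose']
  args << '--dry-run' unless live
  args << '--delete' if entry.end_with?('/')
  args << File.join(src_root, entry) << File.join(dest_root, entry)
  args
end

# Turn command-line flags plus the sync list into a list of rsync commands.
# The same list is used whichever direction we are going.
def plan(argv, sync_list_text)
  options = { live: false, from_remote: false }
  OptionParser.new do |parser|
    parser.on('--live', 'Really transfer files (default is a dry run)') { options[:live] = true }
    parser.on('--from-remote', 'Pull from the remote copy instead of pushing') { options[:from_remote] = true }
  end.parse!(argv.dup)

  src, dest = options[:from_remote] ? [REMOTE_ROOT, LOCAL_ROOT] : [LOCAL_ROOT, REMOTE_ROOT]
  read_sync_list(sync_list_text).map do |entry|
    rsync_command(src, dest, entry, live: options[:live])
  end
end

# Print the commands rather than running them; a real script would
# hand each one to system(*cmd) once the dry-run output looks right.
plan(['--from-remote'], "Work/\nnotes.txt\n").each { |cmd| puts cmd.join(' ') }
```

Making the dry run the default means the dangerous behavior has to be asked for explicitly, which is the Golden Rule baked into the tool.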
Update 4/8/2008: Okay, I still don’t know what the hell I’m doing. There is a bug with this script in that if you create a new file locally then try to sync from S3, your new file will get obliterated. Well guess what kids? Synchronization is hard. I’ve been noodling on a variety of hacks to get around this but none are terribly satisfying. Anyway, my Golden Rule (see above) still stands: make sure you test the thing out before you run it “live”.
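To see why this happens, here is a toy model of a one-way mirror with deletion, which is presumably what pulling from S3 with --delete amounts to. The file names are made up and the model ignores changed files entirely; it only compares which paths exist on each side.

```ruby
require 'set'

# Model a one-way mirror with deletion: the destination is made to match
# the source, so anything that exists only on the destination is removed.
def mirror_plan(source_files, dest_files)
  source = source_files.to_set
  dest   = dest_files.to_set
  { copy: (source - dest).sort, delete: (dest - source).sort }
end

# Hypothetical state: a new local file has not been pushed to S3 yet.
remote = ['notes.txt', 'Work/report.doc']
local  = ['notes.txt', 'Work/report.doc', 'drafts/new-idea.txt']

pull = mirror_plan(remote, local) # sync from S3 down to the local disk
puts "would delete locally: #{pull[:delete].join(', ')}"
```

A plain mirror has no way to tell "new here" from "deleted over there" without remembering what the previous sync looked like, which is why none of the quick hacks feel satisfying.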
This entry was posted on Saturday, April 5th, 2008 at 8:49 am and is filed under Technology.


2 Responses to “S3 Sync Update”
April 6th, 2008 at 9:39 am
Not tried this yet, but a JungleDisk rsync script is just what I need. Have you thought about packaging this as a gem on RubyForge?
Ashley
April 6th, 2008 at 10:59 am
Hi Ashley,
I hadn’t thought of packaging this up as a gem, but perhaps that would be useful to folks. That script is pretty much all there is to it. One thing I’d like to do is handle the case where JungleDisk hasn’t been started or has disconnected (often as a result of sleeping my MacBook). It would be cool to not have to remember to get JD up and running prior to running this script.
Cheers,
Alex