Wednesday, April 09, 2014

Think Like a Unix Geek: Using rsync on Windows to Avoid Restoring From The Cloud

Shira picked up a new laptop a week or so ago. Transferring the files from her old computer to the new one was going to be trivial. She uses CrashPlan as a backup solution, so all she needed to do was install the free version of the app on her new laptop, click the restore tab, and Bam! she was off and running.

Alas, it wasn't so simple. First, the restore dragged and dragged. It took 7 days to download about 400 gigs of data. When she had a mere 10 gigs of data left to restore, the process crashed. When she went to re-execute the restore, it started from scratch. Ugh.

Surely there had to be a better way to get files from Laptop A to Laptop B. The download from cloud method is technically effective, but as I learn every time I restore using Carbonite, it's both fragile and painful.

I noodled over the problem and ended up thinking: if this were a Linux box I'd just kick off rsync and the problem would practically solve itself. Wait a second, I thought, why don't I do just that?

I was inspired by the instructions here. First off, I downloaded rsync through Cygwin on both machines.

On the source (old) laptop, I set up the following /etc/rsyncd.secrets file:

agent:somepass

Also on the source (old) laptop, I set up the following /etc/rsyncd.conf file:

hosts allow = 192.168.1.12 192.168.1.11 192.168.1.14
auth users = agent
secrets file = /etc/rsyncd.secrets
read only = true
use chroot = no
transfer logging = true
log file = /var/log/rsyncd.log

[agent]
path = /cygdrive/c/Users/ShirasUserDirectory

Note: you'll want to tweak the IPs above so they correspond to your network, and update the path under the [agent] block. The username 'agent' is fine to use; it doesn't need to exist as a local account or anything.
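One thing worth checking on the secrets file is its permissions. The rsync daemon's "strict modes" setting is on by default, and with it the daemon refuses a secrets file that other users can read. Here's a minimal sketch of creating the file with owner-only access. The conf_dir variable is my own placeholder (it defaults to a scratch directory so the snippet is safe to try); on the real source laptop it would be /etc. Cygwin's mapping of Unix permissions onto Windows ACLs can be quirky, so treat this as a starting point rather than a guarantee.

```shell
# conf_dir is a hypothetical placeholder; set CONF_DIR=/etc on the real machine.
conf_dir="${CONF_DIR:-$(mktemp -d)}"

# Same user:password pair as in the post.
printf 'agent:somepass\n' > "$conf_dir/rsyncd.secrets"

# Owner-only read/write, so the daemon's strict modes check is satisfied.
chmod 600 "$conf_dir/rsyncd.secrets"

# Show the resulting permissions.
ls -l "$conf_dir/rsyncd.secrets"
```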

With those files in place, I was able to run:

 rsync --daemon

 tail -f /var/log/rsyncd.log

and to my shock and amazement, it worked!

From the destination (new) laptop I executed these commands:

  $ cd foo
  $ /usr/local/bin/rsync.exe -av --ignore-existing agent@192.168.1.17::'agent/foo/' .

This assumes, of course, that the source laptop is hanging out at 192.168.1.17. The above command asked me for a password; I entered somepass and the contents of foo were successfully copied over to my new laptop. The --ignore-existing ensures that files that are already on the new laptop don't get re-transferred.

I thought I was home free at this point. I tapped out the following command, hit enter and waited:

  $ /usr/local/bin/rsync.exe -av --ignore-existing agent@192.168.1.17::'agent/' .

This, however, transferred a few files and then appeared to hang. I came back a few hours later and it had made some progress. But still, it appeared to be stuck.

My first thought was that I had run into the rsync hangs on cygwin problem. There are a number of solutions to this, which I tried.

The thing is, I don't actually think rsync was hanging. I think instead it was just slowly moving data around. When I finally added enough 'v's to the command (-avvvv is more verbose than -avv) I confirmed that data was being transferred.

The next step was to try to narrow down what was being copied. I had some initial success with this command (inspired by this post):

/usr/local/bin/rsync.exe -av --timeout=10 \
  --ignore-existing --exclude='AppData/*' --include='*/'  \
  --include='*.jpg' --exclude='*' agent@192.168.1.12::'agent/' .

This says to exclude the contents of the AppData directory (which is filled with temp stuff), include all other folders, include JPEGs and, most importantly, exclude everything else. This caused rsync to start transferring actually useful JPEGs over to the new computer (versus temp stuff). Finally, some progress. Unfortunately, it was still ridiculously slow.

I looked at what I could optimize next: how about getting rid of the wireless connection? I dragged both computers downstairs to the physical router and plugged them in. I kicked off the above rsync command. Whooooo! The files whizzed by. I checked the Windows Network Performance: I had gone from 2mb/s to 50mb/s! Now we're talking.

After further experimentation, I kicked off my original command to copy all the files. Finally, the system had enough bandwidth, and it zipped along. At one point, there was a throughput of 300mb/s. Behold, the power of wires!
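To put those numbers in perspective, here's a back-of-the-envelope calculation of how long 400 gigs would take at each speed. It's a rough sketch under two assumptions of mine: that "mb/s" above means megabits per second, and that a gig is 1000 megabytes. The hours_for helper is just something made up for this example.

```shell
# hours_for GIGS MBITS_PER_SEC: hours needed to move GIGS gigabytes.
# Assumes 1 gigabyte = 8000 megabits (hypothetical helper for illustration).
hours_for() {
  awk -v gb="$1" -v mbps="$2" 'BEGIN { printf "%.1f\n", gb * 8000 / mbps / 3600 }'
}

hours_for 400 2     # prints 444.4 (wireless)
hours_for 400 50    # prints 17.8  (wired)
hours_for 400 300   # prints 3.0   (wired, at the peak rate)
```

Under the same assumptions, the 7 day cloud restore works out to an average of roughly 5.3 megabits per second.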

In the end, I learned some valuable lessons from this experience: (1) never underestimate the power of physical cables, (2) it pays to think like a Unix geek and (3) rsync rocks.

Next restore I'm skipping the download from the cloud and going right to this rsync solution. However clumsy it is, it's way better than crossing my fingers for a week straight and hoping the massive restore works.
