I decided to set up a home backup system. The system should:
- be accessible across the Internet, not just from home;
- be suitable for multiple users (so that, for example, my housemate can use it as well);
- be suitable for syncing between multiple computers, flagging up conflicts;
- be cross-platform, especially on Linux machines but also Windows and OS X;
- be ‘self-sufficient’, relying as little as possible on third parties, and being as open and standards-compliant as possible.
There are a lot of hard drives marketed as ‘home cloud’ solutions, which offer the first three points but not the last two. All the ones that I could find make you use their proprietary software, and given my tribulations with Time Machine, my distrust of black-box systems is once again where it should be. This software usually does not support Linux, and sometimes doesn’t even support OS X.
At the other extreme, plenty of people on forums such as Stack Overflow suggested simply using rsync, which is a bit too bare: in particular, it doesn’t detect conflicts, and simply overwrites files.
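To make concrete what conflict detection means here, this is a minimal sketch of the decision a two-way synchroniser has to make for a single file. It assumes that a fingerprint of each file (a hash, say) was recorded at the last successful sync. The function name and the returned labels are my own illustration, not Unison's actual algorithm.

```python
def classify(last_synced, local, remote):
    """Decide what a two-way sync should do with one file.

    Each argument is a content fingerprint (for example a hash), or None
    if the file does not exist in that place.
    """
    if local == remote:
        # Both sides agree, whether or not either of them changed.
        return "in sync"
    local_changed = local != last_synced
    remote_changed = remote != last_synced
    if local_changed and remote_changed:
        # Both sides diverged from the recorded state: ask the user.
        return "conflict"
    if local_changed:
        return "copy local to remote"
    return "copy remote to local"
```

A one-way copy has no recorded state to compare against, so it cannot tell the "conflict" case apart from an ordinary update.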
I eventually decided to construct my own system, using a 4 TB USB hard drive and a Raspberry Pi. The total cost, including peripherals for the Pi, was around £160, although I plan to install further drives, including an SSD. The power consumption is under 10 W, which works out as about £1/month, depending on electricity prices.
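The running-cost estimate can be checked with a few lines of arithmetic. The price of £0.14 per kWh below is an assumed figure for illustration, not one taken from a bill, and a month is approximated as 30 days.

```python
HOURS_PER_MONTH = 24 * 30  # approximating a month as 30 days


def monthly_cost_gbp(power_watts, price_gbp_per_kwh):
    """Cost of running a device continuously for one month."""
    energy_kwh = power_watts / 1000 * HOURS_PER_MONTH
    return energy_kwh * price_gbp_per_kwh


# 10 W is the upper bound quoted above; 0.14 is an assumed price per kWh.
print(round(monthly_cost_gbp(10, 0.14), 2))  # prints 1.01
```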
Here’s how I set it up.
Preparing the system
Once the Raspberry Pi was turned on, it mostly worked out of the box (although I had some difficulty following the instructions, and at first couldn’t work out how to turn it on!). I attached the peripherals and plugged it into a TV using an HDMI cable. I changed the default account’s password, changed the hostname (calling it resilience), set up an account for myself and one for my housemate, and turned on SSH. The Wi-Fi adaptor that came with it doesn’t seem to work (and I can’t figure out why), so I connected the Pi to the router via Ethernet. This required moving the Pi away from the TV, which is why I set up SSH first.
I prepared (a partition on) the drive by formatting it as ext4, and connected it to the Pi. I edited /etc/fstab so that the drive would be automatically mounted on startup. This particular drive is called asclepius and is mounted on /media/asclepius. I created a subdirectory for each user, making sure to set ownership and group-ownership as necessary.
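For reference, an /etc/fstab entry has six whitespace-separated fields: device, mount point, filesystem type, mount options, dump flag and fsck pass number. The sketch below assembles such a line. The UUID is a placeholder (the real one is reported by blkid), and the options shown are an assumption rather than the exact ones I used; nofail lets the Pi finish booting even if the drive is unplugged.

```python
def fstab_line(device, mount_point, fs_type="ext4",
               options="defaults,nofail", dump=0, fsck_pass=2):
    """Build one tab-separated /etc/fstab entry."""
    fields = [device, mount_point, fs_type, options, str(dump), str(fsck_pass)]
    return "\t".join(fields)


# "UUID=<uuid-of-partition>" is a placeholder, not a real identifier.
print(fstab_line("UUID=<uuid-of-partition>", "/media/asclepius"))
```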
The drive needs to have its own power supply, either built-in or by using a USB hub. The Pi by itself is unable to supply the required power through a USB connection.
Unison seems to provide many of the features that I need, with a reasonably friendly interface (although I haven’t tried the GUI, or the Windows version) as well as good documentation.
When installing Unison, one has to note that different versions are not cross-compatible. The Pi’s Raspbian repositories, as well as the machines at DAMTP, currently offer version 2.40.102. For my laptop, the Ubuntu repositories offer a later one, so I had to build version 2.40.102 myself. This wasn’t too difficult.
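Before the first sync it is worth confirming that both ends really do run the same version. A minimal sketch, assuming that unison -version prints a line of the form "unison version 2.40.102"; the helper names are mine.

```python
import re
import subprocess


def parse_unison_version(text):
    """Extract the dotted version number from `unison -version` output."""
    match = re.search(r"version\s+(\d+(?:\.\d+)*)", text)
    return match.group(1) if match else None


def local_unison_version():
    """Ask the locally installed unison binary for its version."""
    result = subprocess.run(["unison", "-version"],
                            capture_output=True, text=True, check=True)
    return parse_unison_version(result.stdout)


def same_version(a, b):
    """Strict check: both versions are known and identical."""
    return a is not None and a == b
```

The remote version could be obtained in the same way by running the command over SSH (for example ssh resilience unison -version) and passing its output to parse_unison_version.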
Addendum (7 November 2017): It turns out that two builds of the same version of Unison may be incompatible with each other if they were compiled using different versions of OCaml, due to a change in OCaml’s serialisation format between versions 4.01 and 4.02 (and there’s no guarantee that further changes won’t happen). I therefore decided to also build ocamlopt from source instead of relying on the repositories, going with version 4.02.
I’m not sure what got updated in the Ubuntu repositories to break my setup, but Homebrew users seem to have had similar problems in the past. Annoyingly, because the serialisation formats differ, my reinstall of Unison has wiped the synchroniser state, so that the lengthy process of state detection must be repeated.
Therefore, let this be a lesson: When creating a new protocol, define your own file format properly, rather than relying on third-party algorithms.
Connecting to the outside world
In order to make resilience accessible from the outside world, I had to configure our router to forward incoming SSH connections on port 22 to resilience.
Like most residential connections, ours has a dynamically assigned (DHCP) public IP address, which changes from time to time. To get around this, I am temporarily using the free No-IP dynamic DNS service, although we should contact our ISP to request a static IP address.
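A quick way to check that the forwarding works is to try opening a TCP connection to port 22 from a machine outside the home network (many routers do not route such connections back in from the inside). This is a minimal sketch; the hostname in the usage comment is a placeholder.

```python
import socket


def port_is_open(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Example (placeholder hostname):
#   port_is_open("my-home.example.org")
```

This only shows that something is listening on the port; it does not confirm that it is the Pi's SSH server.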
The connection to the outside world is quite slow, since it is going through a residential connection. The connection is sometimes unreliable and might break once every few hours, but Unison apparently handles interruptions well.
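For unattended runs over such a connection, one option is a small wrapper that simply reruns the synchroniser until it exits cleanly. The sketch below treats any non-zero exit status as a failure; the profile name in the usage comment is hypothetical, and the retry count and delay are arbitrary choices.

```python
import subprocess
import time


def run_until_success(command, max_attempts=5, delay_seconds=60,
                      sleep=time.sleep):
    """Run command repeatedly until it exits with status 0.

    Returns the number of the attempt that succeeded, or None if every
    attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(command)
        if result.returncode == 0:
            return attempt
        if attempt < max_attempts:
            sleep(delay_seconds)  # wait before retrying
    return None


# Example ("backup" is a hypothetical Unison profile name):
#   run_until_success(["unison", "-batch", "backup"])
```

The -batch flag makes Unison run without asking questions, so conflicts are skipped rather than resolved and still need a manual run afterwards.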
Unison has a number of features for customisation, which I haven’t fully explored yet.
It might be useful, especially if I have more than two users on the system, to set up quotas on the disc. Alternatively, each person could supply their own disc.
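Short of real quotas, a usage report per user directory would at least show who is using how much space. This is a minimal sketch, assuming one subdirectory per user directly under the mount point, as set up above. It only measures usage and does not enforce anything.

```python
import os


def directory_size_bytes(path):
    """Total size of the regular files below path, ignoring symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def usage_by_user(base):
    """Map each immediate subdirectory of base to its size in bytes."""
    return {
        entry.name: directory_size_bytes(entry.path)
        for entry in os.scandir(base)
        if entry.is_dir(follow_symlinks=False)
    }


# Example: usage_by_user("/media/asclepius")
```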