I did not have any kind of backup strategy for a long time. Last April I decided to tackle this topic and implement a solution. This post presents the result of these efforts.
My backup in the past
A while ago I did not have any kind of regular backup strategy. Whenever I felt the necessity to back up files, I just copied whatever I had to wherever I had space left.
The data I needed to back up was spread across several physical computers, cloud drives and mobile devices, like my smartphone. These files were copied to the second hard disk in my computer, to any available external hard disk or, more recently, to my NAS.
After a couple of years my backups were quite messy. Some files were duplicated up to ten times. There were snapshots of directories from different points in time, with some files added, removed or modified. For example, there was a copy of my university folder from every year. I was unable to determine the latest snapshot of my files, what to delete and, more important, what to keep. This approach caused my backups to take up more than a terabyte.
Preparations
Before I was able to implement a regular backup, I had to clean up my data. Essentially, this meant defining a few places where I store my files. Everyday data, like project files, email, notes and so on, is now stored only on my notebook. Static files, such as photos, music or ebooks, are stored on an external drive[1]. Last but not least, I have my smartphone, where I use Syncthing to sync all important files to my notebook.
The next step was to crawl through all my files and backups to identify my data and move it to these places. This is a hard, manual task, which I still have not finished. It is worth mentioning that I used Czkawka, a tool to efficiently find and delete duplicated files.
I still accept file duplication up to a certain degree:
Whenever I need a static file for a project, I just copy it to my notebook.
This might be a couple of photos I need for a collage or a
In retrospect I wish I had kept my data and backups organized from the beginning.
My requirements for a backup
There is a plethora of buzzwords regarding requirements for backups. Those worth mentioning are grandfather-father-son backups, the 3-2-1 and 3-2-1-1-0 backup strategies, and the humorous Tao of Backup.
I will not explain these principles in detail. There are plenty of better sources to be found online. Instead I only want to mention the most interesting conclusions.
My requirements for my backup are:
Simple and fast execution of the backup. The best case is a fire-and-forget solution, where I do an initial backup and the rest works automatically.
The strategy must keep multiple snapshots of my files. If a file gets corrupted, changed or deleted without me noticing it immediately, I need to be able to restore it even after some time has passed.
The backup should make efficient use of storage. My previous backup strategy was the perfect counterexample: whenever I needed a backup, I copied all my files again. Some files were stored several times and therefore took up a lot of space. Established approaches make use of differential or incremental backups. More sophisticated solutions use data deduplication, sometimes combined with compression.
My backup must be redundant. Any backup medium is prone to physical damage. If I drop the external drive I use for saving my files, it may be destroyed[2]. Backups should also be spatially separated: a fire at my home could destroy all copies at the same time.
I insist on data sovereignty. A third party should never be able to easily read and use my data if it gets hold of it. This can be achieved by encryption. Also, when entrusting my files to a third party, I do not want to have to rely fully on this service. If the cloud provider that stores my backup decides to shut down one day, I still need to be able to get my data.
It is critical to be able to check the consistency of my backup at any time. Whenever this check fails, I need to be able to correct the issue.
Restoring data should be as fast as possible. This might not be feasible for terabytes of data on a cloud service.
Last but not least, price and maintenance should be reasonable. I neither want to pay a monthly fee that adds up over time, nor am I willing to set up a complex system that requires regular updates and maintenance.
Note that my backup will only include personal files. This means I want to save my photos, music, documents and selected software and configuration. Even though it contradicts the Tao of Backup, I do not want to back up my operating system, all my software or full configurations.
The novice said: "I will save my working files, but not my system and application files, as they can always be reinstalled from their distribution disks."
The master made no reply.
The next day, the novice's disk crashed. Three days later, the novice was still reinstalling software.
In my experience, I am able to work again within a couple of hours of reinstalling my operating system.
My current backup solution
This section describes my current backup solution. Deep, technical details are omitted on purpose.
Storage media
I use three external hard drives as backup media. These are used independently from each other and rotated roughly once a month.
One drive always lies in my locked drawer at work. This offers some degree of spatial separation: if all of my drives at home are destroyed for any reason, there is still one copy left, at least some kilometers away.
Alternatively, I could have chosen to store my backup in cloud storage. But this requires a monthly fee and is, compared to USB 3.0 hard drives, very slow when it comes to restoring my files.
Restic backup software
After some research I decided to use restic.
Restic is an open source tool that offers all its functionality through a
Restic stores the current state of all my files as a snapshot. This makes use of deduplication: data stored once will not be stored (or even transferred) a second time. This saves time when doing the backup and space on the backup drive. To save even more space, restic also makes use of compression.
Restic uses encryption by default. If someone takes the drive from my drawer at work, they will not be able to make use of my data without considerable effort.
Thanks to deduplication, restic is able to efficiently retain almost any number of snapshots. Still, I want to clean up snapshots regularly. For this purpose restic offers policies that define how many daily, weekly, monthly or yearly snapshots should be kept. This enables me to implement a simple grandfather-father-son backup. At the same time, restic is able to distinguish between the sources of my backups: snapshots of my notebook are cleaned up independently from the snapshots of my static files.
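To illustrate the idea of deduplication, here is a minimal Python sketch of a content-addressed store. It is not restic's implementation: the class `ChunkStore` and the tiny fixed `CHUNK_SIZE` are assumptions made for this example, whereas restic splits files with content-defined chunking and additionally encrypts what it stores.

```python
import hashlib

CHUNK_SIZE = 4  # tiny on purpose so the example is easy to follow


class ChunkStore:
    """Content-addressed store: every chunk is saved under its SHA-256 hash."""

    def __init__(self):
        self.chunks = {}  # hex digest -> chunk bytes

    def snapshot(self, data):
        """Store data and return the list of chunk hashes describing it."""
        refs = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # A chunk that is already known takes up no additional space.
            self.chunks.setdefault(digest, chunk)
            refs.append(digest)
        return refs

    def restore(self, refs):
        """Rebuild the original data from a snapshot's chunk list."""
        return b"".join(self.chunks[ref] for ref in refs)


store = ChunkStore()
first = store.snapshot(b"AAAABBBBCCCC")
second = store.snapshot(b"AAAABBBBDDDD")  # only the last chunk changed
print(len(store.chunks))  # 4 unique chunks are stored instead of 6
```

Both snapshots can be restored completely, although the second one only added a single new chunk to the store.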
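The following Python sketch shows how such a retention policy can select snapshots. It is a simplified illustration and not restic's code: the function `select_keep` and its parameters are made up for this example, and it only covers the rule "keep the most recent snapshot of each of the last n days, weeks and months that have a snapshot".

```python
from datetime import datetime, timedelta


def select_keep(snapshots, keep_daily=0, keep_weekly=0, keep_monthly=0):
    """Return the set of snapshot times that the policy keeps."""
    newest_first = sorted(snapshots, reverse=True)
    rules = [
        (keep_daily, lambda t: t.date()),
        (keep_weekly, lambda t: tuple(t.isocalendar())[:2]),  # (year, week)
        (keep_monthly, lambda t: (t.year, t.month)),
    ]
    keep = set()
    for limit, bucket_of in rules:
        seen = []
        for t in newest_first:
            bucket = bucket_of(t)
            if bucket in seen:
                continue  # a newer snapshot already represents this period
            if len(seen) >= limit:
                break  # enough periods covered for this rule
            seen.append(bucket)
            keep.add(t)
    return keep


# One snapshot per day at noon for January 2024.
start = datetime(2024, 1, 1, 12, 0)
snapshots = [start + timedelta(days=d) for d in range(31)]
kept = select_keep(snapshots, keep_daily=3, keep_weekly=2, keep_monthly=1)
for t in sorted(kept):
    print(t.date())  # prints 2024-01-28 through 2024-01-31
```

With restic itself, a comparable policy is expressed through options of the `forget` command, such as `--keep-daily`, `--keep-weekly`, `--keep-monthly` and `--keep-yearly`. The numbers used above are arbitrary.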
Restic calculates and stores checksums of all files and enables me to use them to check backup consistency.
This takes some time; a couple of hours even for my local repository.
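The reason a full check takes so long is that every stored byte has to be read and hashed again. The Python sketch below shows the principle under a simplifying assumption: every file in a directory is named after the SHA-256 digest of its content. The function `find_corrupt_files` is hypothetical and only mimics the idea, it does not read a real restic repository.

```python
import hashlib
from pathlib import Path


def find_corrupt_files(store_dir):
    """Return the names of files whose content no longer matches their name.

    Assumes every file in store_dir is named after the SHA-256 hex digest
    of its content (a simplification made for this sketch).
    """
    corrupt = []
    for path in sorted(Path(store_dir).iterdir()):
        if not path.is_file():
            continue
        # The whole file must be read to recompute its checksum.
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest != path.name:
            corrupt.append(path.name)
    return corrupt
```

An empty result means that all files still match their checksums. A non-empty result only tells me which files are damaged, not how to repair them.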
Unfortunately, restic is not able to repair errors without the original files.
Due to the fact that I use multiple independent backups, I can go without methods of
Using resticprofile to ease the use of restic
As I said previously, restic is a pure command line application. For daily usage a multitude of options and parameters need to be passed to the software, for example the path of the repository, the password file, and the command (backup, clean up, check). Depending on the command, even more parameters are added. This makes it quite cumbersome to use.
resticprofile resolves this issue. This software by third-party developers makes it possible to define all these parameters in a config file, then generates the corresponding command and runs it. Additionally, it can automate several tasks, for example backups or checks, using cron or Windows scheduled tasks.
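The basic idea of turning a stored profile into a command line can be sketched in a few lines of Python. This is only an illustration of the concept, not resticprofile's config format or code: the function `build_command`, the dictionary keys and all paths are hypothetical. The only parts taken from restic itself are the global options `--repo` and `--password-file` and the `backup` command.

```python
def build_command(profile, action, extra_args=()):
    """Turn a profile dictionary into a restic argument list."""
    command = [
        "restic",
        "--repo", profile["repository"],
        "--password-file", profile["password_file"],
        action,
    ]
    command.extend(extra_args)
    if action == "backup":
        # Only the backup command needs the list of directories to save.
        command.extend(profile["sources"])
    return command


notebook = {
    "repository": "/media/backup/restic-repo",      # hypothetical path
    "password_file": "/home/me/.restic-password",   # hypothetical path
    "sources": ["/home/me/projects", "/home/me/notes"],
}
print(" ".join(build_command(notebook, "backup")))
```

The resulting list could be handed to `subprocess.run` to actually start the backup. Keeping the parameters in one place means I only have to remember the profile name and the action.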
My daily backup
Currently I perform all tasks manually. The main reason is that the external hard drive must be plugged into my notebook in order to create a backup. A scheduled task would have to detect the drive before it runs, otherwise most backups would abort unsuccessfully.
I perform my backups at least once a week. My to-do app (Tasks.org) reminds me of this regularly, and also reminds me to swap the backup medium every month.
Maintenance tasks are performed once a month. Because the three drives rotate monthly, every hard drive is checked quarterly. Maintenance consists of cleaning up older snapshots and performing a full check of the backup disk. While this always takes some time, it can easily be done during a home office day.
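Such a detection step could look like the following Python sketch. It assumes that a plugged-in drive can be recognised by the presence of a restic repository at a known path, which in turn is recognised by its `config` file and `data` directory. The function names and the callback `run_backup` are hypothetical.

```python
from pathlib import Path


def repository_available(repo_path):
    """Return True if repo_path looks like an initialised restic repository."""
    repo = Path(repo_path)
    return (repo / "config").is_file() and (repo / "data").is_dir()


def backup_if_available(repo_path, run_backup):
    """Run the backup only when the drive is present; report what happened."""
    if not repository_available(repo_path):
        return False  # drive not plugged in: skip quietly instead of failing
    run_backup(repo_path)
    return True
```

A scheduled job could call `backup_if_available` and simply do nothing on days when the drive is not connected.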
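The quarterly interval follows directly from rotating three drives once a month. A small Python sketch of this arithmetic, with the drive labels `A`, `B` and `C` made up for the example:

```python
DRIVES = ["A", "B", "C"]  # hypothetical labels for the three drives


def drive_for_month(month_index):
    """Return the drive that is in use (and checked) in a given month."""
    return DRIVES[month_index % len(DRIVES)]


for month in range(6):
    print(month, drive_for_month(month))
```

Each drive comes up again after three months, so one monthly maintenance session per drive in use amounts to one full check per drive per quarter.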
Outlook
Currently I am very satisfied with my backup solution. I have great trust in restic and in my established process for saving and restoring my files. This is tremendous progress compared to my previous, chaotic backup solution.
I would like to improve the automation of my backup.
While resticprofile is basically able to accomplish this, the backup medium would have to be available all the time.
This might be feasible using either my
Additionally, the backup itself should be protected from unintended changes.
If malware corrupts my system, nothing prevents it from manipulating my backup while the hard disk is plugged in.
An idea for a solution could be restic Rest Server running on my
Last but not least, some error correction would be appreciated. While this feature is favoured by some users of restic, the developers currently do not prioritize it. Maybe I can build a do-it-yourself solution one day.
[1] I later decided to move my static files to my NAS for two reasons: First, I needed the external drive for other purposes. Second, this enables me to prevent accidental modification of files using access permissions in the operating system of my NAS.
[2] This is not a theoretical problem: I already dropped one of my backup disks and it got damaged critically.