Marc Meier

How I do my backup

For a long time I did not have any kind of backup strategy. Last April I decided to tackle the topic and implement a solution. Here I present the result of these efforts.

My backup in the past

A while ago I did not have any kind of regular backup strategy. Whenever I felt the necessity to back up files, I just copied whatever I had to wherever I had space left.

The data I needed to back up was spread across several physical computers, cloud drives and mobile devices such as my smartphone. These files were copied to the second hard disk built into my computer, to any available external hard disk or, more recently, to my NAS.

After a couple of years my backups were quite messy. Some files were duplicated up to ten times. There were snapshots of directories from different points in time, with some files added, removed or modified. For example, there was a copy of my university folder from every year. I was unable to determine which snapshot of my files was the latest, what to delete and, more important, what to keep. This approach caused my backups to take up more than a terabyte.

Preparations

Before I was able to implement a regular backup, I had to clean up my data. This essentially meant defining a few places where I store my files. Everyday data, such as project files, email and notes, is now stored on my notebook only. Static files, such as photos, music or ebooks, are stored on an external drive [1]. Last but not least there is my smartphone, where I use Syncthing to sync all important files to my notebook.

The next step was to crawl through all my files and backups in order to identify my data and move it to the places mentioned above. This is a hard, manual task that I still have not finished. It is worth mentioning that I used Czkawka, a tool for efficiently finding and deleting duplicate files.

I still accept file duplication to a certain degree: whenever I need a static file for a project, I just copy it to my notebook. This might be a couple of photos I need for a collage or a PDF file for preparing a pen and paper session. As we will see, this duplication makes no difference in the backup.

In retrospect I wish I had kept my data and backups organized from the beginning.

My requirements for a backup

There is a plethora of buzzwords regarding requirements for backups. Those worth mentioning are the grandfather-father-son backup, the 3-2-1 and 3-2-1-1-0 backup strategies and the humorous Tao of Backup.

I won't explain these principles in detail. There are plenty of better sources to be found online. Instead I only want to mention the most interesting conclusions.

My requirements for my backup are:

Note that my backup will only include personal files. This means I want to save my photos, music, documents and selected software and configuration. Even though it contradicts the Tao of Backup, I do not want to back up my operating system, all my software or full configurations.

The novice said: "I will save my working files, but not my system and application files, as they can always be reinstalled from their distribution disks."

The master made no reply.

The next day, the novice's disk crashed. Three days later, the novice was still reinstalling software.

In my experience, I am able to work again within a couple of hours after reinstalling my operating system.

My current backup solution

This section describes my current backup solution. Deep, technical details are omitted on purpose.

Storage media

I use three external hard drives as backup media. These are used independently from each other and rotated roughly once a month.

One drive always lies in my locked drawer at work. This offers some degree of spatial separation: if all of my drives at home are destroyed for any reason, there is still one copy left, at least some kilometers away.

Alternatively I could have chosen to store my backup in cloud storage. But this requires a monthly fee and, compared to USB 3.0 hard drives, is very slow when it comes to restoring my files.

Restic backup software

After some research I decided to use restic. Restic is an open source tool that offers all its functionality through a CLI. While the lack of a GUI might be a hindrance for a user without any technical background, I do not consider this a problem for me.

Restic stores the current state of all my files as a snapshot. This makes use of deduplication: data stored once will not be stored (or even transferred) a second time. This saves time when doing the backup and space on the backup drive. To save even more space, restic also makes use of compression.
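To illustrate the idea of deduplication, here is a minimal Python sketch of a content-addressed chunk store. It is not restic's implementation: the names ChunkStore and snapshot are made up for this example, and the fixed chunk size is a simplification (restic splits files at content-defined boundaries instead).

```python
import hashlib


class ChunkStore:
    """Toy content-addressed store: every unique chunk is kept exactly once."""

    def __init__(self):
        self.chunks = {}  # SHA-256 hex digest -> chunk bytes

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # A chunk that is already known is neither stored nor copied again.
        self.chunks.setdefault(digest, data)
        return digest


def snapshot(store, files, chunk_size=4):
    """Record each file as a list of chunk IDs (hypothetical helper).

    `files` maps a path to its content as bytes. The tiny fixed chunk size
    only keeps the example readable.
    """
    snap = {}
    for path, content in files.items():
        snap[path] = [
            store.put(content[i:i + chunk_size])
            for i in range(0, len(content), chunk_size)
        ]
    return snap


store = ChunkStore()
snap = snapshot(store, {
    "photos/cat.jpg": b"AAAABBBB",
    "project/cat.jpg": b"AAAABBBB",  # the same photo copied into a project
})
print(len(store.chunks))  # 2 unique chunks, although 4 chunks are referenced
```

Both paths end up pointing to the same chunk IDs, which is why a copied photo in a project folder costs practically no additional space in the backup.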

Restic uses encryption by default. If someone takes the drive from my drawer at work, they will not be able to make use of my data without considerable effort.

Thanks to deduplication, restic is able to efficiently retain almost any number of snapshots. Still, I want to clean up snapshots regularly. For this purpose restic offers policies that define how many daily, weekly, monthly or yearly snapshots are kept. This enables me to implement a simple grandfather-father-son backup. At the same time restic is able to distinguish between the sources of my backups: snapshots of my notebook are cleaned up independently from the snapshots of my static files.
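The following Python sketch shows how such a retention policy can pick the snapshots to keep. It is a simplified model under my own assumptions, not restic's code: the function name select_snapshots and its parameters are hypothetical, and each rule simply keeps the newest snapshot of the most recent days, ISO weeks, months or years.

```python
from datetime import datetime


def select_snapshots(snapshots, keep_daily=0, keep_weekly=0,
                     keep_monthly=0, keep_yearly=0):
    """Return the set of snapshot timestamps a policy would keep.

    For every rule, the newest snapshot of each of the last N periods
    (that contain a snapshot at all) is kept. Everything not returned
    would be a candidate for removal.
    """
    rules = [
        (keep_daily, lambda t: (t.year, t.month, t.day)),
        (keep_weekly, lambda t: tuple(t.isocalendar()[:2])),  # ISO year, week
        (keep_monthly, lambda t: (t.year, t.month)),
        (keep_yearly, lambda t: (t.year,)),
    ]
    ordered = sorted(snapshots, reverse=True)  # newest first
    keep = set()
    for limit, period_of in rules:
        seen = []
        for t in ordered:
            period = period_of(t)
            if period in seen:
                continue  # a newer snapshot already represents this period
            if len(seen) >= limit:
                break  # enough periods covered for this rule
            seen.append(period)
            keep.add(t)
    return keep


# One snapshot every evening in January 2024.
snaps = [datetime(2024, 1, day, 20, 0) for day in range(1, 32)]
kept = select_snapshots(snaps, keep_daily=7, keep_weekly=4)
print(sorted(t.day for t in kept))  # [14, 21, 25, 26, 27, 28, 29, 30, 31]
```

To clean up different sources independently, the same function would be applied separately to the snapshots of each source.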

Restic calculates and stores checksums of all files and enables me to use them to check the consistency of my backup. This takes some time: a couple of hours even for my local repository. Unfortunately, restic is not able to repair damaged data without the original files. Since I use multiple independent backups, I can do without error-correcting codes (ECC).
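The principle behind such a consistency check can be sketched in a few lines of Python. This is only an illustration with made-up helper names (sha256_of, build_manifest, verify); it works on plain files, whereas restic verifies the data inside its own repository format.

```python
import hashlib
from pathlib import Path


def sha256_of(path, block_size=1 << 20):
    """Hash a file block by block so large files do not fill the memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def build_manifest(root):
    """Map every file below `root` (as a relative path) to its checksum."""
    root = Path(root)
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify(root, manifest):
    """Return the relative paths that are missing or no longer match."""
    root = Path(root)
    damaged = []
    for relative, expected in manifest.items():
        path = root / relative
        if not path.is_file() or sha256_of(path) != expected:
            damaged.append(relative)
    return sorted(damaged)
```

Note that verify can only report which files are damaged. Repairing them requires a second, intact copy, which is exactly the role of the other backup drives.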

Using resticprofile to ease the use of restic

As I said previously, restic is a pure command line application. For daily usage a multitude of options and parameters need to be passed to the software, for example the path of the repository, the password file and the command (backup, clean up, check). Depending on the command, even more parameters are added. This makes restic quite cumbersome to use.

resticprofile resolves this issue. This software by third-party developers makes it possible to define all these parameters in a config file; it then generates the corresponding command and runs it. Additionally it offers automation of several tasks, for example backups or checks, using cron or Windows scheduled tasks.
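The idea behind such a wrapper can be sketched in Python as follows. This is neither resticprofile's config format nor its implementation: the PROFILES dictionary, all paths in it and the function build_command are assumptions made up for this example. Only the restic options used (--repo, --password-file, --exclude-caches, --keep-daily, --keep-weekly, --keep-monthly, --prune, --read-data) are real.

```python
import shlex

# Hypothetical profile; the paths and retention values are placeholders.
PROFILES = {
    "notebook": {
        "repository": "/media/backup/restic-repo",
        "password-file": "/home/user/.config/restic/password.txt",
        "backup": {"flags": ["--exclude-caches"], "source": ["/home/user"]},
        "forget": {"flags": ["--keep-daily", "7", "--keep-weekly", "4",
                             "--keep-monthly", "12", "--prune"]},
        "check": {"flags": ["--read-data"]},
    },
}


def build_command(profile_name, command, profiles=PROFILES):
    """Assemble the restic argument list for one command of one profile."""
    profile = profiles[profile_name]
    section = profile.get(command, {})
    argv = [
        "restic",
        "--repo", profile["repository"],
        "--password-file", profile["password-file"],
        command,
    ]
    argv += section.get("flags", [])
    argv += section.get("source", [])
    return argv


print(shlex.join(build_command("notebook", "backup")))
```

The resulting list could be handed to subprocess.run(argv, check=True) to actually start restic. Keeping the parameters in one place means the long command line never has to be typed by hand.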

My daily backup

Currently I perform all tasks manually. The main reason is that, in order to create a backup, the external hard drive must be plugged into my notebook. A scheduled task would have to detect the drive before it acts, otherwise most backups would abort unsuccessfully.
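A scheduled task could perform such a check roughly like this. The sketch assumes that an initialised restic repository can be recognised by the file named config in its top-level directory; the function names and the example path are hypothetical.

```python
import os


def repository_available(repo_path):
    """Return True if `repo_path` looks like an initialised restic repository.

    Assumption: the repository directory contains a file called "config".
    If the external drive is not plugged in, the path does not exist at all.
    """
    return os.path.isfile(os.path.join(repo_path, "config"))


def run_if_available(repo_path, action):
    """Run `action` only when the backup drive is present.

    Returns "skipped" instead of failing, so a scheduled run without the
    drive does not count as an aborted backup.
    """
    if not repository_available(repo_path):
        return "skipped"
    action()
    return "done"


# Hypothetical mount point of the external drive.
print(run_if_available("/media/backup/restic-repo", lambda: print("backing up")))
```

Here action stands for whatever starts the real backup, for example a call to restic or resticprofile.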

I perform my backups at least once a week. My to-do app (Tasks.org) reminds me of this regularly, as well as of swapping the backup medium every month.

Maintenance tasks are performed once a month. Because the three hard drives are rotated monthly, this ensures that every drive is checked quarterly. Maintenance consists of cleaning up older snapshots and performing a full check of the backup disk. While this always takes some time, it can easily be done during a home office day.
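The arithmetic behind the quarterly check is simple: with three drives and one rotation per month, each drive is in use every third month. A small Python sketch, with made-up drive labels and a strict monthly rotation as assumptions:

```python
from collections import Counter

DRIVES = ["drive A", "drive B", "drive C"]  # hypothetical labels


def drive_for(year, month):
    """Return the drive in use in a given month under strict rotation."""
    return DRIVES[(year * 12 + month - 1) % len(DRIVES)]


# Monthly maintenance therefore reaches every drive four times a year.
print(Counter(drive_for(2024, month) for month in range(1, 13)))
```

In practice the rotation happens only roughly once a month, so the real schedule will drift a little.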

Outlook

Currently I am very satisfied with my backup solution. I have great trust in restic and in my established process for saving and restoring my files. This is tremendous progress compared to my previous, chaotic backup solution.

I would like to improve the automation of my backup. While resticprofile is basically able to accomplish this, the backup medium would have to be available all the time. This might be feasible using either my NAS or cloud storage, although the latter comes with fees.

Additionally, the backup itself should be secured against unintended changes. If my system is compromised by malware, nothing prevents the malware from manipulating my backup while the hard disk is plugged in. One idea for a solution is restic's Rest Server running on my NAS or a Raspberry Pi. This method shields the backup from direct file system modifications. Additionally, Rest Server offers an "append-only" mode in which existing files cannot be modified or deleted by any client.

Last but not least, some error correction would be appreciated. While this feature is requested by some users of restic, the developers currently do not prioritize it. Maybe I can build a do-it-yourself solution one day.


  1. I later decided to move my static files to my NAS for two reasons: First, I needed the external drive for other purposes. Second, this enables me to prevent accidental modification of files using access permissions in the operating system of my NAS.

  2. This is not a theoretical problem: I already dropped one of my backup disks and it was critically damaged.