Vous savez que vous devriez

Ce chapitre est philosophique et théorique à propos des backups. Il traite des raisons pour lesquelles vous devriez faire des sauvegardes, les concepts fondamentaux, ce à quoi vous devez penser lorsque vous mettez en place vos sauvegardes et que faire au long terme (vérification, etc). Il mentionne également quelques hypothèses que fait Obnam et les contraintes que cela vous impose.

Pourquoi sauvegarder ?

FIXME : Ajouter quelques histoires horrible ici qui prouvent que les sauvegardes sont importantes. Avec rérefences/liens.

Les concepts des sauvegardes

Cette section couvre les concepts principaux des sauvegardes, et introduit la terminologie utilisée dans ce manuel.

Les données actives sont les données sur lesquelles vous travaillez ou que vous conservez. Ce sont les fichiers sur votre disque dur : les documents que vous écrivez, les photos que vous enregistrez, les romans non terminés que vous espérer finir.

La plupart des données actives sont précieuses, c'est à dire que vous seriez contrarié si elles disparaissaient. Certains fichiers actifs ne sont pas précieux : le cache de votre navigateur web, par exemple. Cette distinction peut vous permettre de limiter la quantité de données que vous devez sauvegarder, ce qui diminue de manière significative le coût de vos sauvegardes.

Une sauvegarde est une copie supplémentaire de vos données actives. Si vous perdez une partie ou la totalité de vos fichiers actifs, vous pouvez les récupérer (restaurer) depuis vos sauvegardes. Cette copie de sauvegarde est, en pratique, plus vieille que vos données actives, mais si vous sauvegardez régulièrement, vous ne devriez jamais perdre grand chose.

Des fois il est utile d'avoir plus qu'une seule copie de sauvegarde de vos données actives. Vous pouvez avoir une série de sauvegardes, faites à des moments différents, vous offrant un historique de sauvegardes. Chaque copie de vos données dans votre historique est une génération. Ceci vous permet de récupérer un fichier que vous avez supprimé il y a un certain temps, mais sans que vous le remarquiez jusqu'à présent. Si vous n'avez qu'une copie de sauvegarde, vous ne pouvez plus le récupérer, mais si vous avez par exemple une sauvegarde par jour sur un mois passé, vous avez un mois pour réaliser qu'il vous manque un fichier avec qu'il soit perdu pour toujours.

L'emplacement où vos sauvegardes sont stockées est le dépôt de sauvegarde (ndt : "backup repository"). Vous pouvez utiliser de nombreux types de media de sauvegarde pour le stocker : disque dur, disque option (DVD-R, CD-R, etc), clé USB, espace en ligne, etc. Chaque type de medium a différentes caractéristiques : taille, vitesse, commodité, fiabilité, prix, que vous devez prendre en compte pour choisir la solution la plus raisonnable pour vous.

Il est possible que vous ayez besoin de plusieurs dépôts ou media, l'un d'entre eux situé hors-site, loin de votre machines habituelles. Sinon, si votre maison brûle, vous perdez vos sauvegardes avec.

Vous devez vérifier que vos sauvegardes marchent. Ce serait dommage de faire l'effort de sauvegarder régulièrement et ensuite de ne pas pouvoir les restaurer lorsque vous en avez besoin. Vous devriez sans doute tester votre plan après sinistre en imaginant que vos fichiers ont disparu, à l'exception de vos media de sauvegarde. Parvenez-vous à récupérer ? Vous devriez sans doute faire ceci régulièrement, pour être certain que votre solution de sauvegarde marche toujours.

Il y a un grande variété d'outils de sauvegarde. Ils peuvent être simples et manuels : copiez vos fichier sur une clé USB en utilisant votre explorateur de fichiers, tous les trente-six du mois. Ils peuvent aussi être très complexes : les produits de sauvegarde conçus pour les entreprises qui coûte énormément d'argent et incluent plusieurs jours de de formation pour votre IT, et qui nécessite une attention constante de la part de cette équipe.

Vous devez définir une stratégie de sauvegarde pour tout relier ensemble : quelles données active sauvegarder, sur quel medium, en utilisant quel outils, quel historique conserver et comment vérifier que tout marche.

Backup strategies

You've set up a backup repository, and you have been backing up to it every day for a month now: your backup history is getting long enough to be useful. Can you be happy now?

Welcome to the world of threat modelling. Backups are about insurance, of mitigating small and large disasters, but disasters can strike backups as well. When are you so safe that no disaster will harm you?

There is always a bigger disaster waiting to happen. If you backup to a USB drive on your work desk, and someone breaks in and steals both your computer and the USB drive, the backups did you no good.

You fix that by having two USB drives, and you keep one with your computer and the other in a bank vault. That's pretty safe, unless there's an earth quake that destroys both your home and the bank.

You fix that by renting online storage space from another country. That's quite good, except there's a bug in the operating system that you use, which happens to be the same operating system the storage provider uses, and hackers happen to break into both your and their systems, wiping all files.

You fix that by hiring a 3D printer that prints slabs of concrete on which your data is encoded using QR codes. You're safe until there's a meteorite hits Earth and destroys the entire civilisation.

You fix that by sending out satellites with copies of your data, into stable orbits around all nine planets (Pluto is too a planet!) in the solar system. Your data is safe, even though you yourself are dead from the meteorite, until the Sun goes supernova and destroys everything in the system.

There is always a bigger disaster. You have to decide which ones are likely enough that you want to consider them, and also decide what the acceptable costs are for protecting against them.

A short list of scenarios for thinking about threats:

These questions do not cover everything, but they're a start. For each one, think about:

The threat modelling here is about safety against accidents and natural disasters. Threat modelling against attacks and enemies is similar, but also different, and will be the topic of the next episode in the adventures of Bac-Kup.

Backups and security

You're not the only one who cares about your data. A variety of governments, corporations, criminals, and overly curious snoopers are probably also interested. (It's sometimes hard to tell them apart.) They might be interested to find evidence against you, blackmail you, or just curious about what you're talking about with your other friends.

They might be interested in your data from a statistical point of view, and don't particularly care about you specifically. Or they might be interested only in you.

Instead of reading your files and e-mail, or looking at your photos and videos, they might be interested in preventing your access to them, or to destroy your data. They might even want to corrupt your data, perhaps by planting child porn in your photo archive.

You protect your computer as well as you can to prevent these and other bad things from happening. You need to protect your backups with equal care.

If you back up to a USB drive, you should probably make the drive be encrypted. Likewise, if you back up to online storage. There are many forms of encryption, and I'm unqualified to give advice on this, but any of the common, modern ones should suffice except for quite determined attackers.

Instead of, or in addition to, encryption, you could ensure the physical security of your backup storage. Keep the USB drive in a safe, perhaps, or a safe deposit box.

The multiple backups you need to protect yourself against earthquakes, floods, and roving gangs of tricycle-riding clowns, are also useful against attackers. They might corrupt your live data, and the backups at your home, but probably won't be able to touch the USB drive encased in concrete and buried in the ground at a secret place only you know about.

The other side of the coin is that you might want to, or need to, ensure others do have access to your backed up data. For example, if the clown gang kidnaps you, your spouse might need access to your backups to be able to contact your MI6 handler to ask them to rescue you. Arranging safe access to (some) backups is an interesting problem to which there are various solutions. You could give your spouse the encryption passphrase, or give the passphrase to a trusted friend or your lawyer. You could also use something like libgfshare to escrow encryption keys more safely.

Backup storage media considerations

This section discusses possibilities for backup storage media, and their various characteristics, and how to choose the suitable one for oneself.

There are a lot of different possible storage media. Perhaps the most important ones are:

We'll skip more exotic or unusual forms, such as microfilm.

Magnetic tapes are traditionally probably the most common form of backup storage. They can be cheap per gigabyte, but tend to require a fairly hefty initial investment in the tape drive. Much backup terminology comes from tape drives: full backup vs incremental backup, especially. Obnam doesn't support tape drives at all.

Hard drives are a common modern alternative to tapes, especially for those who do not wish pay for a tape drive. Hard drives have the benefit of every bit of backup being accessible at the same speed as any other bit, making finding a particular old file easier and faster. This also enables snapshot backups, which is the model Obnam uses.

Different types of hard drives have different characteristics for reliability, speed, and price, and they may fluctuate fairly quickly from week to week and year to year. We won't go into detailed comparisons of all the options. From Obnam's point of view, anything that can look like a hard drive (spinning rust, SSD, USB flash memory stick, or online storage) is usable for storing backups, as long as it is re-writable.

Optical disks, particularly the kind that are write-once and can't be updated, can be used for backup storage, but they tend to be best for full backups that are stored for long periods of time, perhaps archived permanently, rather than for a actively used backup repository. Alternatively, they can be used as a kind of tape backup, where each tape is only ever used once. Obnam does not support optical drives as backup storage.

Paper likewise works better for archival purposes, and only for fairly small amounts of data. However, a backup printed on good paper with archival ink can last decades, even centuries, and is a good option for small, but very precious data. As an example, personal financial records, secret encryption keys, and love letters from your spouse. These can be printed either normally (preferably in a font that is easy to OCR), or using two-dimensional barcode (e.g, QR). Obnam doesn't support these, either.

Obnam only works with hard drives, and anything that can simulate a read/writable hard drive, such as online storage. By amazing co-incidence, this seems to be sufficient for most people.

Glossary