Tuesday, January 09, 2024

Incremental Backups: Evergreen and Other Use Cases

As of this writing, I know of three ways to make use of the incremental backup feature that I committed near the end of last month. I'll be interested to see how people deploy it in practice. The first idea is to replace some of the full backups you're currently doing with incremental backups, saving backup time and network transfer. The second idea is to do just as many full backups as you do now, but add incremental backups between them, so that if you need to do PITR, you can use pg_combinebackup to reach the latest incremental backup before the point to which you want to recover, reducing the amount of WAL that you need to replay, and probably speeding up the process quite a bit. The third idea is to give up on taking full backups altogether and only ever take incremental backups.

You can't quite do that, because you will have to take at least one full backup to get the ball rolling. But, after that, you could just take an incremental backup every day -- or whatever time period -- based on the backup from the previous day. Over time, the chain of backups will get longer and longer, which is a problem, because you don't want to keep all of your backups forever.

But the architecture of the incremental backup feature allows for a solution to this problem. When you restore a backup, you're normally going to execute pg_combinebackup with the full backup and all of the incrementals as arguments:

pg_combinebackup sunday monday tuesday wednesday thursday friday -o fridayfull

But the result of pg_combinebackup can also be used as the input to a future invocation of pg_combinebackup. So you could instead do this:

pg_combinebackup sunday monday -o mondayfull
rm -rf sunday monday
mv mondayfull monday

Now you've shortened the chain of backups by one. If you later want to restore to the state as it existed on Wednesday, you can do:

pg_combinebackup monday tuesday wednesday -o wednesdayfull

With this kind of approach, there's no hard requirement to ever take another full backup. You can just keep taking incremental backups forever, and periodically consolidate them into the one "evergreen" full backup that you always maintain, as shown above.
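One thing the three-line recipe glosses over is that the rm should only run if pg_combinebackup actually succeeded. Here's a minimal sketch of a guarded version. The function name consolidate and the ".combined" suffix for the temporary output directory are my own inventions for illustration, not anything the tool provides, and the sketch assumes you run it from the directory that holds the backups.

```shell
#!/bin/sh
# consolidate FULL INCR...
# Fold one or more incremental backups into FULL (listed oldest first) and leave
# the result under the name of the newest backup. Hypothetical helper.
consolidate() {
    if [ "$#" -lt 2 ]; then
        echo "usage: consolidate FULL INCR..." >&2
        return 1
    fi

    # The newest backup is the last argument; the result takes over its name.
    for newest in "$@"; do :; done
    tmp="${newest}.combined"

    # Refuse to reuse an output directory left over from an earlier attempt.
    if [ -e "$tmp" ]; then
        echo "consolidate: $tmp already exists" >&2
        return 1
    fi

    # Build the new synthetic full backup. On failure, stop here and leave
    # every input backup untouched.
    pg_combinebackup -o "$tmp" "$@" || return 1

    # Only now drop the inputs and move the result into place.
    rm -rf -- "$@"
    mv -- "$tmp" "$newest"
}
```

With that defined, consolidate sunday monday does the same thing as the commands shown earlier, except that a failed combine leaves sunday and monday where they were.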

But if you're anything like me, you'll already see that this arrangement has two serious weaknesses. First, if there are any data-corrupting bugs in pg_combinebackup or any of the server-side code that supports incremental backup, this approach could get you into big trouble. And I'm certainly not here to promise that a freshly-committed feature has no defects, but I hope eventually we'll have enough experience with this - and have fixed enough bugs - to have confidence that it works well enough that you can reasonably rely on it. Second, the consolidation process is going to do a lot of I/O.

pg_combinebackup is designed so that the total I/O is proportional to the size of the output directory, not the sum of the sizes of the input directories. It does not naively copy the first input directory and then overlay the rest one by one; instead, for each file, it builds a map of where the blocks for that file should be found, and then reads each block in turn from the appropriate source. So, files from the input backups are only read to the extent necessary to generate the output, not in their entirety.
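To make the map idea concrete, here's a toy model in shell. It is emphatically not the real on-disk format or the real algorithm: I'm pretending that each backup is a directory containing one small file per block, named by block number, and ignoring everything else the real tool has to handle. The function name combine_blocks is hypothetical.

```shell
#!/bin/sh
# Toy model only, NOT the real backup format: each backup is a directory whose
# files are named by block number. Directory names must not contain whitespace.
# combine_blocks OUT FULL INCR...   (backups listed oldest first)
combine_blocks() {
    out=$1
    shift
    mkdir -p "$out" || return 1

    # Step 1 (the loops plus awk): build the map. Later backups overwrite
    # earlier entries, so each block number ends up mapped to the newest
    # backup that contains it.
    # Step 2 (the while loop): read each block exactly once, from the source
    # that the map names.
    for b in "$@"; do
        for path in "$b"/*; do
            if [ -e "$path" ]; then
                printf '%s %s\n' "${path##*/}" "$b"
            fi
        done
    done |
    awk '{ src[$1] = $2 } END { for (blk in src) print blk, src[blk] }' |
    while read -r blk src; do
        cp "$src/$blk" "$out/$blk"
    done
}
```

The point of the sketch is the shape of the work: listing the inputs is cheap, and the expensive part, copying block contents, happens once per output block no matter how many input backups there are.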

But even so, a command like pg_combinebackup sunday monday -o mondayfull is potentially very expensive. You probably wouldn't be worrying about this feature unless you had a big database. Maybe a full backup is 3TB; if so, this command is going to read ~3TB from sunday and write about the same amount into mondayfull. That's not great. The obvious alternative would be to overlay the monday backup onto the sunday backup in place, overwriting the relevant parts with new data. The problem with that is that if you fail partway through the operation, or the system crashes partway through the operation, then you've got big problems. So, for right now, that mode of operation is not supported.

I think it would be possible to teach pg_combinebackup to do this in a manner that is idempotent; that is, set things up so that if you do fail partway through, you can retry and, if the retry succeeds, you end up in a good state at the end. That's still pretty scary, in my view, but some people might feel that the efficiency makes it worth it. In the meantime, a perhaps-workable alternative is to consolidate several days at a time, instead of every day. Even if you do consolidate every day, it's only local I/O that need not affect the servers in production at all, as long as the hardware is provisioned separately. Tell me in the comments below (or by some other means) whether you think that's good enough or how you would propose to improve things.
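To put rough numbers on daily versus less frequent consolidation, here's a back-of-the-envelope sketch. The figures are illustrative assumptions, not measurements: it takes the 3TB example from above, assumes the combined backup stays about that size all week, and assumes each pg_combinebackup run reads and writes roughly the size of its output.

```shell
#!/bin/sh
# Rough weekly consolidation I/O in TB, under the assumptions stated above.
full_tb=3          # approximate size of the evergreen full backup
days_per_week=7

# Consolidating every day means one ~3TB rewrite per day.
daily_tb=$((full_tb * days_per_week))

# Consolidating once a week means one ~3TB rewrite per week.
weekly_tb=$full_tb

echo "daily consolidation:  ~${daily_tb}TB read and ~${daily_tb}TB written per week"
echo "weekly consolidation: ~${weekly_tb}TB read and ~${weekly_tb}TB written per week"
```

The price of consolidating less often is that, in between, a restore has to combine a longer chain of incrementals.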


  1. > The problem with that is that if you fail partway through the operation, or the system crashes partway through the operation, then you've got big problems.

    Every procedure I write starts with "Step 1: Take a snapshot". It's either a ZFS snapshot, or a cloud provider block device snapshot. I've even done LVM snapshots in the distant past. It gives me a lot more confidence to do things like `pg_upgrade --link`.