Friday, November 01, 2024

Why pg_dump Is Amazing

I wrote a blog post a couple of weeks ago entitled Is pg_dump a Backup Tool?. In that post, I argued in the affirmative, but also said that it probably shouldn't be your primary backup mechanism. For that, you probably shouldn't directly use anything that is included in PostgreSQL itself, but rather a well-maintained third-party backup tool such as barman or pgbackrest. But today, I want to talk a little more about why I believe that pg_dump is both amazingly useful for solving all kinds of PostgreSQL-related problems and also just a great piece of technology.

The core value proposition of pg_dump is that the output is human-readable text. You'll get DDL commands that you can use to recreate your database objects, and you'll get COPY commands (or INSERTs, if you so request) that you can use to reload your table data. That is not really an advantage if you're just trying to back up and restore an entire database cluster, because converting all of your data from PostgreSQL's internal formats into text and back again is going to use a bunch of CPU resources. If you instead take and restore a physical backup, you can avoid all of that overhead.
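To make this concrete, here is a heavily abbreviated sketch of what the plain-format output might look like, assuming a hypothetical table called orders in a hypothetical database called mydb:

    $ pg_dump --table=orders mydb
    ...
    CREATE TABLE public.orders (
        id integer NOT NULL,
        placed_at timestamp with time zone DEFAULT now()
    );
    ...
    COPY public.orders (id, placed_at) FROM stdin;
    1	2024-10-31 09:15:00+00
    \.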

But what if you're trying to do something else? There are lots of situations where a physical backup doesn't help you at all. Because all of your data is stored in PostgreSQL's internal formats, you can only use that data with a compatible version of PostgreSQL. So, if you're hoping to get your data into a system other than PostgreSQL, or if you're hoping to get your data into a different major version of PostgreSQL, that physical backup is not helping. In fact, even if you're just switching to a different CPU architecture, that physical backup is probably not helping, either: values such as integers and floating point numbers are represented using the CPU's native format, and if that is different, then the on-disk format is incompatible.

Now, pg_dump does not automatically solve all of these problems. For instance, it's unlikely that a non-PostgreSQL system will be able to process unmodified pg_dump output without errors, but the great thing about SQL is that you can modify it yourself, with a text editor or a script. Similarly, major version upgrades using pg_dump may be slow, and they can even fail if, for example, you've created a view that relies on a system catalog column that has been removed in the new release. But this is the kind of problem you're likely to be able to troubleshoot and fix yourself, in contrast to trouble with a physical backup, which you probably can't and shouldn't try to repair on your own. (NB: Reloading a dump can also fail if you use the older version of pg_dump when creating the dump, so don't do that; use the newer version. Some of you might also be thinking that pg_upgrade rather than pg_dump should be used for major version upgrades, but remember that it uses pg_dump under the hood.)
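As a sketch of the dump-and-restore style of upgrade, assuming the old cluster listens on port 5432, the new one on port 5433, and Debian-style binary paths (all of these details will vary on your system), the key point is that the newer version's tools do the dumping:

    # Run the *new* version's pg_dumpall against the *old* server.
    $ /usr/lib/postgresql/17/bin/pg_dumpall --port=5432 > upgrade.sql
    $ /usr/lib/postgresql/17/bin/psql --port=5433 -f upgrade.sql postgres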

There are also situations where a physical backup might help you, but a logical dump is potentially a lot better. For example, you might want to back up and restore a single table, or a single object of some other type, like a function definition. If you use physical backups for this, your only real option is to restore the entire database cluster and then extract the objects you need from it ... likely using pg_dump. But if you use pg_dump, you can be selective. pg_dump has a lengthy list of options that can be used to control what is included in the dump, which is nice, but the real magic happens on the restore side. If the dump is taken using tar format (pg_dump -Ft) or custom format (pg_dump -Fc), then pg_restore can pick out exactly the objects you want and restore only those. In addition to the filtering that you can do directly on the pg_restore command line, don't miss the extremely powerful and flexible -L, --use-list option to pg_restore. (To be fair, some out-of-core backup solutions also offer selective dump and restore, but to my knowledge none of them are nearly as powerful as what you can do with pg_dump and pg_restore; further, they work by doing things which PostgreSQL itself would consider to be unsupported scenarios, so you really need to hope that the tool author is aware of all the risks.)
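Here is a minimal sketch of that restore-side workflow, with hypothetical database, table, and file names throughout:

    $ pg_dump -Fc mydb > mydb.dump
    # Restore just one table directly ...
    $ pg_restore --table=orders --dbname=mydb_copy mydb.dump
    # ... or dump the table of contents, edit it (lines starting with ';'
    # are skipped), and restore exactly the objects that remain.
    $ pg_restore --list mydb.dump > objects.list
    $ pg_restore --use-list=objects.list --dbname=mydb_copy mydb.dump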

The reason why pg_dump can offer these options is precisely that the output is in text format. In PostgreSQL's binary format, the data is all sort of ... entangled. For instance, in the native binary format, each table row is stored together with the visibility information used to support MVCC, and that visibility information is essentially a bunch of numbers that have to be looked up in files stored elsewhere in the data directory. So you can't just pick up a table's data file and drop it into some other database cluster, where the contents of those other files will be different. Also, because PostgreSQL does not lock out writes to data files while a physical backup is being taken, the data directory will be in an inconsistent state until recovery is run. This means there's no good way to identify a specific table or extract its contents from a physical backup, other than to start up the server with the appropriate configuration and let recovery sort things out. Because dumps are logical, they don't have any of these problems. The data is stored in a way that is intended to make it accessible, rather than in a manner that is intended to make database operations efficient.
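You can peek at some of that entangled visibility information from SQL, since each row carries hidden system columns; the table and database names here are, again, hypothetical:

    # xmin and xmax are transaction IDs that only mean something when
    # interpreted against this particular cluster's transaction status data.
    $ psql mydb -c 'SELECT xmin, xmax, ctid, id FROM orders LIMIT 3'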

Converting to and from text format can also be used as a way of eliminating corruption in the internal binary format. For example, imagine that you have a database table containing rows where the xmin of a tuple is less than the relation's relfrozenxid but, sadly, the tuple is not frozen. In other words, you've got a corrupted database. Hopefully this won't happen to you, and if it does, hopefully you will be able to fail over to an unaffected replica or restore from an uncorrupted backup. However, it's easy for this kind of corruption to go undetected for a relatively long period of time, because the database is still readable despite the corruption, so it's possible that by the time the problem is noticed, all of your replicas and backups will be affected. One day, though, the system will remove old clog files that it thinks are no longer needed, and suddenly that data will become unreadable. Obviously, I've chosen this example to prove a point, because even if you're not familiar with the details of how PostgreSQL represents data on disk, you can perhaps see that the solution here is to dump and restore the affected table at a time when the data is still readable. In fact, this shows how database dumps and physical backups can work hand in glove to solve real-world problems. When one day the table abruptly becomes unreadable, you can restore from a physical backup taken while the data was still readable, use pg_dump to get the data out, and then reload it wherever you like. (I hasten to point out that in these kinds of complex scenarios, you probably also want commercial support for your PostgreSQL instance, unless you yourself are an expert in such matters. You can get such support, among other places, from my employer.)
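As a hedged sketch of how you might detect and rescue such a table while it is still readable (pg_amcheck ships with PostgreSQL 14 and later; the object names are hypothetical):

    # Check the table for tuples that fail basic heap consistency checks,
    # including xmin-before-relfrozenxid problems.
    $ pg_amcheck --table=orders mydb
    # If corruption is confirmed, extract the data while it's still readable
    # and load it somewhere clean.
    $ pg_dump --table=orders mydb > orders_rescue.sql
    $ psql --dbname=mydb_clean -f orders_rescue.sql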

But now let's talk about why pg_dump is a small engineering marvel: dependencies. Have you ever tried to manually recreate a database schema from one server on another server? If you have, you've probably discovered that it's pretty much no fun at all. For instance, you might try to create a table and find that the creation fails because that table has a foreign key to some other table that doesn't exist yet. You need to recreate things in the right order for it to work, but you don't know what the right order is. Sometimes you might even have circular dependencies, where no matter which table you create first, the creation fails. In that case, you need to solve the dependency problem by first creating the tables without subsidiary objects like foreign keys or constraints or indexes, and adding those later, after you've set up some other database objects. And the great thing is that instead of doing all this yourself, you can just run pg_dump and it will figure it out for you. If you're dumping the data and not just the schema, pg_dump will even arrange the dump so that the data restoration is as efficient as possible, for example by postponing the creation of indexes until after the data load. All of this can save enormous amounts of DBA time and energy.
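A tiny illustration of the circular case, with hypothetical tables: neither table can declare its foreign key at creation time, so the constraints must be added afterwards, which is exactly the ordering trick pg_dump works out automatically:

    $ psql mydb <<'SQL'
    CREATE TABLE a (id integer PRIMARY KEY, b_id integer);
    CREATE TABLE b (id integer PRIMARY KEY, a_id integer);
    -- Only now can the mutually referencing constraints be created.
    ALTER TABLE a ADD FOREIGN KEY (b_id) REFERENCES b (id);
    ALTER TABLE b ADD FOREIGN KEY (a_id) REFERENCES a (id);
    SQL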

Now, I expect that somebody is going to point out that pg_dump does not solve the dependency problem 100% reliably in all cases. And that's correct. However, I think that complaint just shows how spoiled PostgreSQL users are. Recreating a complex database schema without a dependency-aware tool like pg_dump is an absolute nightmare. With pg_dump, it's usually trivial. On those occasions where it isn't and user intervention is required, well, that sucks, but I feel pretty confident that you won't have more fun if you abandon pg_dump and do it all with your own home-grown code. A huge amount of time and energy from very smart people, principally Tom Lane, has gone into making this work as well as it does, and at least some of the remaining problems cannot be solved without first solving the halting problem. SQL is very, very complicated, and solving the dependency problem as well as pg_dump does is quite difficult.

To sum up, I find pg_dump to be an excellent tool for dealing with almost any sort of complicated backup scenario. For routine backup and restore, other options are generally better, but as soon as things are non-routine, pg_dump becomes, at least in my experience, an absolutely indispensable tool. What would have been annoying becomes simple; what would have been infeasible becomes at least possible. It is on my short list of most reliable, most generally useful PostgreSQL tools. If it isn't on yours, perhaps it should be!

1 comment:

  1. Very good article!
    pg_dump is a Swiss Army knife tool and it should be in every DBA's toolbox, in my opinion.
    And don't forget the -Fd flag with the -j option ... to parallelize backup jobs!
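    For instance (the database name is hypothetical):

        $ pg_dump -Fd -j 4 -f mydb.dump.d mydb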

    Again, a really good article in defense of a multi purpose tool!

    My best regards,
    Agharta
