Wednesday, March 16, 2011

Query Planner Enhancements in PostgreSQL 9.1

PostgreSQL has an awesome query planner.  And it keeps getting better. has a survey asking people Which PostgreSQL 9.1 Feature Are You Most Excited About? (to vote on it, if it's still open when you read this, go here).  For some reason, query planner features tend not to be listed when people talk about what they're most excited about in a new release, but there's ample room for excitement.  If a particular query planner feature happens to apply to your use case, it's possible to see multiple-order-of-magnitude speedups, avoiding the need for a major application redesign or even an expensive hardware purchase.  This is a big deal.

PostgreSQL 9.1 (now in feature freeze) will include four significant enhancements to the query planner, three of which I consider to be major new features:

1. Merge Append.  If you use table partitioning in PostgreSQL, you may have noticed that queries like "SELECT * FROM parent_table ORDER BY indexed_column" are pretty inefficient.  In PostgreSQL 9.0 and prior releases, the planner handled these queries by throwing all of the rows from the parent table and all its children into a big bucket, sorting it, and then reading the results off in order.   PostgreSQL 9.1 will be able to use an index for this, reading rows from the parent table and each child table in alternation and merging the results together into a single sorted output stream. Variants of this optimization also apply to queries like "SELECT * FROM parent_table ORDER BY indexed_column LIMIT 20" and "SELECT min(id) FROM parent_table".  Such queries should run far more quickly in PostgreSQL 9.1 than in any previous version.

2. KNN-GIST.  PostgreSQL has a reputation for excellent GIS support, thanks in large part to the PostGIS project. One of the things you sometimes want to do when dealing with spatial data is ask questions like "what are the twenty pizza parlors nearest my current location?".  Supposing you have a database of points of interest, and each one has a type ("pizza parlor" or "car wash"), and there's an index on the type column, you could execute this query by using the index to find all the pizza parlors in the system, compute the distance between each one and the target location, and then find and return the 20 rows with lowest distance.  This plan isn't necessarily bad, but it could be bad if, for example, your database includes every retail establishment in the United States (or some other large country, or the whole world!).  When your database is really big, you might prefer to somehow scan outward from your current location, locating the nearest points of interest, and then check whether each one is a pizza place.  Depending on the data distribution, this might be much faster, and this is exactly what PostgreSQL 9.1 will do if an appropriate GIST index is available.

3. Right and full hash joins.  My jaw just about hit the floor when I saw this commit go in.  It may seem a bit obscure, but, trust me, it's really good.  Hash joins are incredibly fast, and being able to apply them in more situations is therefore a very good thing.  The specific type of query where this is going to produce a big benefit is something like "SELECT * FROM small_table s LEFT JOIN big_table b ON =".  All modern versions of PostgreSQL can implement this as a hash join, but currently released versions have to hash the large table and store the result in memory; they then scan the small table and make probes into the in-memory hash table as they go.  PostgreSQL 9.1 will be able to hash the small table and then scan the big table, making probes.  Hashing the small relation rather than the large one is very important to hash join performance, so this will make a big difference.  The difference will be even larger for full joins (e.g. SELECT * FROM first_table f FULL JOIN second_table s ON =, which PostgreSQL 9.0 can only be implemented via a merge join.

4. Hashing support for arrays.  I'm not sure how common it is to do joins based on array-type columns, but if you do, you're going to like this feature, because it will allow such joins to be implemented as hash joins, which has not been possible in the past.  And that should make many such queries faster.

There are a few query planner features that I was hoping to see in PostgreSQL 9.1 that didn't make the cut.  These include generalized inner index scans (which help queries like SELECT * FROM small_table s LEFT JOIN (large_table1 l1 JOIN large_table l2 ON = ON =, which currently can only be executed by joining l1 to l2, and then joining the result to s; another join order would be much faster); LATERAL (which would make it possible to write certain kinds of queries with set-returning functions that are difficult to express using SQL today); and inner join removal (which simplifies queries like SELECT FROM foo f, bar b ON = when there is a non-deferrable unique index on bar (id) and a foreign key on bar (id) referencing foo (id), making the given query equivalent to the much simpler SELECT FROM foo f).  Time and round tuits permitting, we may get some of these - and likely some others - in PostgreSQL 9.2.

If you're interested in learning more about the PostgreSQL query planner, I'll be speaking about it next week at PostgreSQL East.  The talk will be pretty much the same one I've given in the past, so you probably want to skip it if you've already seen it, but I keep filling the room, so I keep giving the talk.  A little further down the road, Tom Lane will be giving a talk on Hacking the Query Planner at PGCon 2011; I expect that one to fill the room, too.

In the meantime, if you'd like to play around with the new features, you can download PostgreSQL 9.1alpha4.  To the best of my knowledge, this is currently only available in source code form, and due to some Makefile glitches, this tarball doesn't contain the flex and bison output, meaning that you must have appropriate versions of flex and bison on your system to build it (this is also why we don't have binary installers built for this release).  If you're not feeling quite that brave, we should have another alpha out in another week or two, and hopefully a beta sometime in April, and hopefully we'll have binary installers for those releases.


  1. If you want to sell people on "Merge Append", you should tell them the database does an internal map/reduce over the partition indexes. That will get you some #webscale buzz!

    On a more serious note, if people are interested in trying 9.1, several packagers are bundling them so check your package manager (probably have to look in the experimental section for those who designate such things). I know I was able to get it with no problem from macports.

  2. Very good improvements. I really hoped that index-only scans could make in 9.1. Currently Postgres has to peek into table to validate the row version.

  3. Hi Robert!

    Can you publish your presentation or video recording of the event?

  4. My presentations are on-line at

    I'm not sure if there are any video recordings available at present.