Thursday, May 26, 2011

Performance Optimization

There have been several lively discussions this week on possible performance optimizations for PostgreSQL 9.2. PostgreSQL 9.1 is still in beta, and there's still plenty of bug-fixing going on there, but the number of people who need to and can be involved in that process is not as large as it was a month or two ago. So it's a good time to start thinking about development for PostgreSQL 9.2. Not a lot of code is being written yet, which is probably just right for where we are in the development cycle, but we're kicking the tires of various ideas and trying to figure out what projects are worth spending time on.  Here are a few that I'm excited about right now.

First, there is index-only scans. I've written about this before, so I'll be brief: if all the data you need from an index scan is available from the index tuple, then it's a shame to have to visit the heap (i.e. table) as well. Visiting the heap means essentially reading the same data twice. That means more I/O, and I/O is bad.  Exactly how much benefit will get out of this remains unclear (since it doesn't work yet), but I think it's an important optimization, so I'm working on it.

Second, VACUUM.  Currently, VACUUM uses a two-pass algorithm.  It scans the heap once and removes all the dead tuples, leaving behind something called a "dead line pointer" for each tuple it removes.  The dead line pointer can't be removed until there are no remaining index pointers that reference it.  So, VACUUM next scans the indexes and removes all of the index pointers that reference dead line pointers.  Finally, it makes a second scan of the heap, and removes the dead line pointers.

Pavan Deolasee, a colleague of mine at EnterpriseDB and the man behind the very important heap-only tuple (HOT) optimization in PostgreSQL 8.3, is working on some improvements to this algorithm.  He believes that we may be able to fold the second pass of each VACUUM into the first pass of the next VACUUM.   The worst case for the current algorithm is when VACUUM removes at least one tuple from every page in the relation.  In that case, the current implementation will essentially rewrite the entire relation twice: once to remove the tuples, and a second time to remove the dead line pointers.  If that scenario is common, Pavan's idea should make a big impact: we'll only rewrite each heap block once per VACUUM, rather than twice.

Third, lock contention.  Every transaction that touches a given relation in any way - even just to read data from it - acquires a heavyweight lock on that relation.  This is a necessary evil, because such transactions must be certain that the relation isn't being concurrently dropped, reindexed, clustered, or otherwise changed in some way that would make queries against that relation crash in medias res.  Unfortunately, if many CPUs are all trying to access the same relation, and especially if there are many transactions that consist of just one statement, the time it takes to acquire and release those locks can become a bottleneck.  Noah Misch and I have been discussing some possible ways to speed up the very common case of acquiring and releasing relation locks, at the possible expense of making lock acquisition for DDL operations a bit less efficient.

All of this, in my opinion, should be viewed as basic research.  We can't really how much performance benefit we'll get out of any of these ideas, or how hard they'll be to implement, until we try it.   But it's good that we're thinking about these problems, because performance is a feature that never goes out of style.

1 comment:

  1. I think that index-only scans are much more important than just "avoids some extra I/O." It's *random* I/O that's being avoided (in most cases). The win is huge.