Although we've made great progress in speeding up PostgreSQL over the last few years, there's always more to be done. Performance, with PostgreSQL as with any other database, is largely determined by the availability of three resources: CPU, memory, and disk. What could we do to use each of these resources most efficiently?
PostgreSQL is already pretty efficient at using the CPU. For high-concurrency databases, I don't anticipate that things will get much better than they already are. For low-concurrency databases, we need parallel query - that is, the ability to use more than one CPU to process the same query.
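To make the gap concrete: a single query today runs on a single CPU, so the only way to bring several cores to bear on one piece of work is to split it up by hand across connections and recombine the results in the application. A rough sketch (the orders table here is made up):

    -- Hypothetical example: each statement runs on its own connection,
    -- so each gets its own CPU; the application adds up the partial sums.
    -- connection 1:
    SELECT sum(total) FROM orders WHERE id % 2 = 0;
    -- connection 2:
    SELECT sum(total) FROM orders WHERE id % 2 = 1;

Parallel query would mean the executor doing this kind of splitting itself, transparently.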
Memory is a little bit more of a problem. We do a good job keeping our memory footprint small, but we don't manage it terribly well. work_mem caps the memory each individual sort or hash operation may use, but takes no account of current conditions: if the system is swapping under memory pressure, you get the same plan as if it had 40GB of free memory. And all the memory allocated to shared_buffers remains allocated even when it isn't truly needed.
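To see the work_mem half of this in action, EXPLAIN ANALYZE reports whether a sort fit in memory or spilled to disk (the table is hypothetical and the output lines abbreviated):

    -- work_mem caps what a single sort or hash operation may use,
    -- no matter how much memory the machine actually has free.
    SET work_mem = '1MB';
    EXPLAIN ANALYZE SELECT * FROM orders ORDER BY total;
    --   Sort Method:  external merge  Disk: ...kB
    SET work_mem = '256MB';
    EXPLAIN ANALYZE SELECT * FROM orders ORDER BY total;
    --   Sort Method:  quicksort  Memory: ...kB

Either way, the planner makes the same decision whether the box is idle or thrashing: work_mem is a static limit, not a response to actual memory pressure.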
I/O is perhaps the biggest problem, and not one unique to PostgreSQL - every database shares this pain point to some degree. Disks are slow. With respect to PostgreSQL specifically, there are a number of things we need to do to reduce the amount of I/O we perform, including index-only scans and further improvements to VACUUM. Partial vacuum (implemented in 8.4) is a pretty big deal, but there's more that needs to be done.
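To illustrate why index-only scans matter, consider a query whose answer lives entirely in an index (the schema is again made up):

    CREATE INDEX orders_customer_idx ON orders (customer_id);
    -- Today, every matching index entry still forces a visit to the
    -- underlying heap page just to check tuple visibility; an
    -- index-only scan could answer this from the index alone,
    -- skipping all of that heap I/O.
    SELECT count(*) FROM orders WHERE customer_id = 42;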
We also need to put more effort into shrinking our on-disk footprint and WAL volume. Disk space itself is cheap, but the time needed to read and write a larger volume of data hurts performance.
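At least the footprint is easy to measure. These introspection functions are standard (the orders table is still hypothetical), and sampling the WAL position before and after a workload shows how much WAL it generated:

    -- On-disk size of the table proper, and of the table plus its
    -- indexes and TOAST data; a smaller format means fewer bytes
    -- to push through the disk.
    SELECT pg_size_pretty(pg_relation_size('orders'));
    SELECT pg_size_pretty(pg_total_relation_size('orders'));
    -- Current WAL insert position; take it twice and diff.
    SELECT pg_current_xlog_location();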