In a recent blog post on Performance Optimization, I mentioned that Noah Misch and I had discussed some methods of reducing the overhead of frequent relation locks. Every transaction that touches a given table or index locks it and, at commit time, unlocks it. This adds up to a lot of locking and unlocking, which ends up being very significant on machines with many CPU cores. I ended up spending a good chunk of last week hacking on this problem, with very promising results: I have a prototype patch that improves throughput on a SELECT-only pgbench test by about 3.5x on a system with 24 cores. Not bad for a couple days work.
And it looks like we haven't exhausted the possibilities of this line of attack. Some further investigation (inspired by a note from Stefan Kaltenbrunner) revealed that even with the patch applied, locking contention is still a severe problem on this workload. Each transaction takes and releases a lock on it's virtual transaction ID, or VXID. This scheme allows other processes to wait for the completion of a particular transaction. However, we don't actually use that mechanism very much: it appears that it is only needed for CREATE INDEX CONCURRENTLY, and for Hot Standby. I think it may be possible to arrange things so that the VXID locks are only taken when someone wants to wait for them, which would probably lead to a significant further boost in throughput.
These optimizations will primarily benefit read-only workloads. On at least some read-write workloads, throughput is limited by an LWLock known as WALInsertLock, which serializes insertions into the write-ahead log stream. I don't have a clear idea how to solve that problem at the moment, but intend to look into it further.