Currently, if you'd like to test out parallel sequential scan, you need to look at four different patches. Most of these patches are being updated at a rather brisk pace, so by the time you read this the versions mentioned here may no longer be the latest ones. But right now, I think the latest versions are:
- parallel-mode-v8.1.patch. This patch introduces the basic infrastructure we need to do work in parallel in PostgreSQL. It allows a process to request that the postmaster launch a given number of background workers, and it automatically arranges to share important pieces of backend-private state such as GUC settings and snapshots with those workers. This infrastructure is intended to be used not only for parallel sequential scan, or even for parallel query more generally, but also for parallel utility commands (e.g. parallel CLUSTER or parallel VACUUM). It could even be used by user-defined functions that spin up parallel workers for particularly compute-intensive tasks. In general, I think this patch is in pretty good shape. Andres Freund has concerns that the approach I have taken to handling heavyweight locking may not be robust, and this was certainly a valid criticism of earlier versions, but I have tightened it up quite a bit since then. Hopefully it will pass muster.
- assess-parallel-safety-v4.patch. This patch introduces a framework for deciding whether a particular query is safe to run in parallel mode. It works by classifying functions as parallel-safe (meaning that the use of that function imposes no restrictions on the use of parallelism), parallel-restricted (meaning that a parallel query can use that function, but the function itself must be executed in the parallel group leader, not one of the workers), or parallel-unsafe (meaning that a query that includes this function cannot use parallelism at all). An example of the last category is setval(). Right now, no process involved in a parallel operation can perform any data writes, so if we choose a parallel plan for a query containing setval(), it will fail at execution time.
There are a couple of thorny problems around this patch. One is that the patch currently enables parallelism only when the simple query protocol is used. This is because parallelism can only be used when the query will be run to completion. If the client sends a Query (Q) message, that always runs the query all the way through. If the client sends Parse (P), Bind (B), and Execute (E) messages, the Execute message could specify a row count, meaning that we should not run the query to completion, but only until that many rows have been generated. libpq doesn't actually support this, but the underlying wire protocol does, and some drivers may be relying on it. The other problem is whether this patch will add too much overhead in cases where parallelism is not used.
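To make the three categories and the run-to-completion restriction concrete, here is a rough sketch in Python. It is not code from the patch: the names `Safety`, `FUNCTION_SAFETY`, `query_safety`, and `may_consider_parallel_plan` are all made up for illustration, the two `my_*` functions are hypothetical, and treating an unlabeled function as parallel-unsafe is my own conservative assumption. Only the setval() label comes from the discussion above.

```python
from enum import IntEnum


class Safety(IntEnum):
    """Ordered so that a larger value means a stronger restriction."""
    SAFE = 0        # no restriction on the use of parallelism
    RESTRICTED = 1  # allowed, but must run in the parallel group leader
    UNSAFE = 2      # the query cannot use parallelism at all


# Hypothetical labels. Only setval() is taken from the text above.
FUNCTION_SAFETY = {
    "my_pure_fn": Safety.SAFE,
    "my_leader_only_fn": Safety.RESTRICTED,
    "setval": Safety.UNSAFE,
}


def query_safety(function_names):
    """A query is only as safe as the most restrictive function it calls."""
    # Assumption: anything without a label is treated as unsafe.
    labels = (FUNCTION_SAFETY.get(name, Safety.UNSAFE) for name in function_names)
    return max(labels, default=Safety.SAFE)


def may_consider_parallel_plan(function_names, runs_to_completion):
    """Parallelism is considered only if the query runs to completion
    (for example, it arrived as a simple Query message) and nothing in
    it is parallel-unsafe."""
    return runs_to_completion and query_safety(function_names) != Safety.UNSAFE
```

With these labels, a query calling only `my_pure_fn` may be considered for a parallel plan when it runs to completion, while any query calling setval(), or any query that might stop after a limited number of rows, is not.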
- parallel-heap-scan.patch. This patch provides a way for several cooperating backends to perform a coordinated sequential scan of some relation. Your first reaction might be to think this is the payoff patch, but it's not, because it exposes a C API for this functionality, not an SQL one. It turns out that, with the infrastructure provided by the parallel-mode patch, coordinating the scan is actually quite simple. The hard part is making the functionality visible at the SQL level.
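For readers who want a mental model of a coordinated scan, here is one way it could work, sketched in Python with threads standing in for backends. This is purely an illustration under my own assumptions: I am assuming the workers share a "next block" counter and each claims one block at a time, which is not necessarily how the patch divides the work, and `SharedScanState`, `claim_block`, and `run_cooperative_scan` are invented names.

```python
import threading


class SharedScanState:
    """Stand-in for scan state kept in shared memory (hypothetical)."""

    def __init__(self, nblocks):
        self.nblocks = nblocks
        self._next = 0
        self._lock = threading.Lock()

    def claim_block(self):
        """Hand out the next unscanned block number, or None when done."""
        with self._lock:
            if self._next >= self.nblocks:
                return None
            block = self._next
            self._next += 1
            return block


def run_cooperative_scan(nblocks, nworkers):
    """Let nworkers threads scan nblocks blocks between them.

    Returns one list of claimed block numbers per worker.
    """
    state = SharedScanState(nblocks)
    claimed = [[] for _ in range(nworkers)]

    def worker(worker_id):
        # Each worker keeps claiming blocks until none are left.
        while (block := state.claim_block()) is not None:
            claimed[worker_id].append(block)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(nworkers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return claimed
```

How the blocks end up split between workers varies from run to run, but every block is claimed by exactly one worker, which is the property a coordinated scan needs.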
- parallel_seqscan_v10.patch. This one is the payoff patch. It introduces two new executor nodes, one called Funnel and another called Partial Seq Scan. Funnel is actually a generic kind of node that will be useful for other forms of parallelism we may want to introduce in the future. A Funnel node has a single child, which represents the operation to be run in parallel. It launches a designated number of background workers and consolidates the output of all of those background workers into a single stream of tuples.
The other new type of node is a Partial Seq Scan, which will appear under a Funnel. The idea is that each worker does a Partial Seq Scan, and together those partial scans add up to a full scan. The Funnel consolidates those partial results into a single stream of output tuples.
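Here is a toy model of that arrangement in Python, with threads standing in for background workers and a queue standing in for whatever carries tuples back to the leader. None of this is the patch's actual design: the function names `partial_seq_scan` and `funnel`, the `_DONE` sentinel, and the choice to give worker i every nworkers-th row are all assumptions made to keep the sketch short.

```python
import queue
import threading

_DONE = object()  # sentinel a worker sends when its partial scan is finished


def partial_seq_scan(rows, worker_id, nworkers, out):
    """Scan this worker's share of the rows and send each one to the leader."""
    # Assumption: worker i takes rows i, i + nworkers, i + 2 * nworkers, ...
    for row in rows[worker_id::nworkers]:
        out.put(row)
    out.put(_DONE)


def funnel(rows, nworkers):
    """Launch the workers and merge their output into a single stream."""
    out = queue.Queue()
    workers = [
        threading.Thread(target=partial_seq_scan, args=(rows, i, nworkers, out))
        for i in range(nworkers)
    ]
    for w in workers:
        w.start()

    finished = 0
    while finished < nworkers:
        item = out.get()
        if item is _DONE:
            finished += 1
        else:
            yield item  # tuples come out in whatever order workers produce them

    for w in workers:
        w.join()
```

Iterating over `funnel(rows, 3)` yields every row exactly once, though not in any particular order, since the partial scans run concurrently.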
This patch also introduces a good deal of infrastructure which will be needed for any kind of parallel query, though not necessarily for other kinds of parallel operations such as parallel utility statements. For example, it makes parameters passed to the query visible inside all of the child backends, and it adds the machinery for transporting the portion of the query to be executed by the parallel worker from the parallel leader to each worker. Currently, that portion will always be a Partial Seq Scan node, but the machinery is generic enough to transport any other plan node we might want to push down to a worker in the future.
The current version of this patch, v10, is not yet integrated with the assess-parallel-safety patch. Amit is working on correcting that, and is also working on fixing a number of bugs that have been reported. Expect a new version soon. This patch still needs some more refactoring, and there are more mundane decisions that need to be made as well, like how the costing should work. Still, Amit has made great progress in making this infrastructure much more general and improving the structure of it over the last few weeks, and I am excited about it.