Wednesday, March 04, 2026

pg_plan_advice: Plan Stability and User Planner Control for PostgreSQL?

I'm proposing a very ambitious patch set for PostgreSQL 19. Only time will tell whether it ends up in the release, but I can't resist using this space to give you a short demonstration of what it can do. The patch set introduces three new contrib modules, currently called pg_plan_advice, pg_collect_advice, and pg_stash_advice.

pg_plan_advice allows you to generate a "plan advice" string that describes the overall shape of the plan. You can then use that plan advice string to ensure that the same plan is recreated, or you can vary it to cause a different plan to be generated. Here's an example excerpted from the regression tests:

LOAD 'pg_plan_advice';

// ...test setup omitted for brevity...

EXPLAIN (COSTS OFF, PLAN_ADVICE)

        SELECT * FROM join_fact f JOIN join_dim d ON f.dim_id = d.id;

             QUERY PLAN             

------------------------------------

 Hash Join

   Hash Cond: (f.dim_id = d.id)

   ->  Seq Scan on join_fact f

   ->  Hash

         ->  Seq Scan on join_dim d

 Generated Plan Advice:

   JOIN_ORDER(f d)

   HASH_JOIN(d)

   SEQ_SCAN(f d)

   NO_GATHER(f d)

Now that we have the plan advice string, we can use it to make sure the same plan is chosen, or we can be more targeted, and just include the relevant elements of the plan advice string, to make sure that those particular choices are preserved:

BEGIN;

SET LOCAL pg_plan_advice.advice = 'HASH_JOIN(d)';

EXPLAIN (COSTS OFF, PLAN_ADVICE)

        SELECT * FROM join_fact f JOIN join_dim d ON f.dim_id = d.id;

             QUERY PLAN             

------------------------------------

 Hash Join

   Hash Cond: (f.dim_id = d.id)

   ->  Seq Scan on join_fact f

   ->  Hash

         ->  Seq Scan on join_dim d

 Supplied Plan Advice:

   HASH_JOIN(d) /* matched */

 Generated Plan Advice:

   JOIN_ORDER(f d)

   HASH_JOIN(d)

   SEQ_SCAN(f d)

   NO_GATHER(f d)


Let's try varying the plan advice string to cause a different plan to be selected:


SET LOCAL pg_plan_advice.advice = 'MERGE_JOIN_PLAIN(d)';

EXPLAIN (COSTS OFF, PLAN_ADVICE)

        SELECT * FROM join_fact f JOIN join_dim d ON f.dim_id = d.id;

                           QUERY PLAN                           

----------------------------------------------------------------

 Merge Join

   Merge Cond: (f.dim_id = d.id)

   ->  Index Scan using join_fact_dim_id on join_fact f

   ->  Index Scan using join_dim_pkey on join_dim d

 Supplied Plan Advice:

   MERGE_JOIN_PLAIN(d) /* matched */

 Generated Plan Advice:

   JOIN_ORDER(f d)

   MERGE_JOIN_PLAIN(d)

   INDEX_SCAN(f public.join_fact_dim_id d public.join_dim_pkey)

   NO_GATHER(f d)


So far, you might be thinking that this sounds like a supercharged version of the various planner-related settings (e.g. enable_hashjoin, enable_mergejoin). What if you need to arrange to automatically manipulate the plan for a certain query every time it occurs? For that, you need pg_stash_advice:

CREATE EXTENSION pg_stash_advice;

EXPLAIN (COSTS OFF, PLAN_ADVICE, VERBOSE)

        SELECT * FROM join_fact f JOIN join_dim d ON f.dim_id = d.id;

// by using VERBOSE we'll see the query identifier

\set qid <whatever-the-query-identifier was>

SELECT pg_create_advice_stash('my_stash');

SELECT pg_set_stashed_advice('my_stash', :'qid', 'MERGE_JOIN_PLAIN(d)');

SET pg_stash_advice.stash_name = 'my_stash';


Now, every time the system plans a query with that query identifier in a session where pg_stash_advice.stash_name is set to my_stash, the query advice MERGE_JOIN_PLAIN(d) will automatically be applied (unless pg_plan_advice.advice is set, in which case that setting will take precedence). For production use, we'd probably want to do something like:


ALTER SYSTEM SET shared_preload_libraries = 'pg_stash_advice';

ALTER SYSTEM SET pg_stash_advice.stash_name = 'my_stash';


In other words, you can use these modules to control planner behavior on a system-wide basis without requiring application changes. I think that's extremely powerful, and could be used to address a wide variety of operational challenges that many PostgreSQL users have.


I also want to mention that I have intentionally attempted to separate mechanism from policy: pg_plan_advice is the core technology that provides only the simplest possible ways for you to generate advice (namely, EXPLAIN) and provide advice (namely, the pg_plan_advice.advice setting), but it's pluggable. pg_collect_advice is an example of how that pluggability can be used to make more powerful advice collection possible, and pg_stash_advice is an example of how that pluggability can be used to make more powerful advice application possible. While I believe both are somewhat useful as they are, it's also completely possible to create better extensions that do even cooler things, or just different things. For example, pg_stash_advice matches advice string to queries based on the query identifier, but you could write your own extension that does it some other way. pg_stash_advice stores advice strings in dynamic shared memory, but you could do something else. In that sense, pg_plan_advice is not intended as a complete solution, but as a toolkit around which we can further innovate together.


While I'm very excited about the possibilities of these modules, I want to emphasize that this is very much version 1.0 technology. It has plenty of limitations, and there are a number of things that are still fairly clunky. Plus, it's not committed yet, and when or whether that will happen, and whether it will stay committed, is currently unknown. But time is rapidly running out to get things into v19, and more reviewers and testers are needed, so if you're interested, please come help!


Links to v18 patch documentation: pg_plan_advice, pg_collect_advice, pg_stash_advice.

Discussion on pgsql-hackers: original post, v18 patch set, continuing discussion under modified subject line

9 comments:

  1. How does this align with the long standing policy of "the planner should do the right thing without hints"?

    How do I see if any advice has been stashed? Say I'm coming to an instance with performance problems cold. I see from the EXPLAIN, but more generally, is there a information_schema.pg_plan_advice or something?

    ReplyDelete
    Replies
    1. If you check the pg_stash_advice documentation, you'll see that there's a set-returning function for that.

      Delete
  2. Man, that would be a bloody amazing feature. I do like PostgreSQL's stance on the query planner doing the right thing by default, but that does not mean we shouldn't also have the option to override that behaviour at our own risk.

    ReplyDelete
  3. I thought we decided to keep hints out of the core many years ago. And pg_hint_plan served this task very well.


    That's worse, if I understand correctly: the planner code has been complicated in favour of this feature. It would be better to provide more hooks/callbacks and let this quite sophisticated system be as simple as possible, so that it's easier to adopt 'true' optimisation techniques.
    Just feedback.

    ReplyDelete
    Replies
    1. Note that pg_hint_plan works by copying and pasting a huge amount of planner code. Lukas Fittl has a nice post where he explains how this infrastructure will allow much of that duplicated code to be deleted. Maintaining duplicated code is generally no fun.

      Delete
    2. When we discussed how to add new planner hooks to let extension pass planning-time data to later stages, you said that it is ok for extension to generate lots of overhead and maintain lots of code (at least the idea was like that) - that's why, instead of a clear single ‘path -> plan’ hook, we're doing something very strange. So, I don't buy your current reason in light of the opposite decision made before.

      Delete
  4. It looks like a great feature. I am a DBA fortunate enough to witness thousands of postgres clusters in production and I don't agree with long prevalent dogma against hints. Performance edge cases become routine at a scale regardless of database protocol. This is a very positive direction from DBA perspective.

    ReplyDelete
    Replies
    1. Yeah, exactly. This time DBA's point of view dominates over developer's one. Let's watch where we end up. Hope, not Oracle 2.0 ;)

      Delete