I'm proposing a very ambitious patch set for PostgreSQL 19. Only time will tell whether it ends up in the release, but I can't resist using this space to give you a short demonstration of what it can do. The patch set introduces three new contrib modules, currently called pg_plan_advice, pg_collect_advice, and pg_stash_advice.
pg_plan_advice allows you to generate a "plan advice" string that describes the overall shape of the plan. You can then use that plan advice string to ensure that the same plan is recreated, or you can vary it to cause a different plan to be generated. Here's an example excerpted from the regression tests:
LOAD 'pg_plan_advice';
// ...test setup omitted for brevity...
EXPLAIN (COSTS OFF, PLAN_ADVICE)
SELECT * FROM join_fact f JOIN join_dim d ON f.dim_id = d.id;
QUERY PLAN
------------------------------------
Hash Join
Hash Cond: (f.dim_id = d.id)
-> Seq Scan on join_fact f
-> Hash
-> Seq Scan on join_dim d
Generated Plan Advice:
JOIN_ORDER(f d)
HASH_JOIN(d)
SEQ_SCAN(f d)
NO_GATHER(f d)
Now that we have the plan advice string, we can use it to make sure the same plan is chosen, or we can be more targeted, and just include the relevant elements of the plan advice string, to make sure that those particular choices are preserved:
BEGIN;
SET LOCAL pg_plan_advice.advice = 'HASH_JOIN(d)';
EXPLAIN (COSTS OFF, PLAN_ADVICE)
SELECT * FROM join_fact f JOIN join_dim d ON f.dim_id = d.id;
QUERY PLAN
------------------------------------
Hash Join
Hash Cond: (f.dim_id = d.id)
-> Seq Scan on join_fact f
-> Hash
-> Seq Scan on join_dim d
Supplied Plan Advice:
HASH_JOIN(d) /* matched */
Generated Plan Advice:
JOIN_ORDER(f d)
HASH_JOIN(d)
SEQ_SCAN(f d)
NO_GATHER(f d)
Let's try varying the plan advice string to cause a different plan to be selected:
SET LOCAL pg_plan_advice.advice = 'MERGE_JOIN_PLAIN(d)';
EXPLAIN (COSTS OFF, PLAN_ADVICE)
SELECT * FROM join_fact f JOIN join_dim d ON f.dim_id = d.id;
QUERY PLAN
----------------------------------------------------------------
Merge Join
Merge Cond: (f.dim_id = d.id)
-> Index Scan using join_fact_dim_id on join_fact f
-> Index Scan using join_dim_pkey on join_dim d
Supplied Plan Advice:
MERGE_JOIN_PLAIN(d) /* matched */
Generated Plan Advice:
JOIN_ORDER(f d)
MERGE_JOIN_PLAIN(d)
INDEX_SCAN(f public.join_fact_dim_id d public.join_dim_pkey)
NO_GATHER(f d)
EXPLAIN (COSTS OFF, PLAN_ADVICE, VERBOSE)
SELECT * FROM join_fact f JOIN join_dim d ON f.dim_id = d.id;
// by using VERBOSE we'll see the query identifier
\set qid <whatever-the-query-identifier was>
SELECT pg_create_advice_stash('my_stash');
SELECT pg_set_stashed_advice('my_stash', :'qid', 'MERGE_JOIN_PLAIN(d)');
SET pg_stash_advice.stash_name = 'my_stash';
Now, every time the system plans a query with that query identifier in a session where pg_stash_advice.stash_name is set to my_stash, the query advice MERGE_JOIN_PLAIN(d) will automatically be applied (unless pg_plan_advice.advice is set, in which case that setting will take precedence). For production use, we'd probably want to do something like:
ALTER SYSTEM SET shared_preload_libraries = 'pg_stash_advice';
ALTER SYSTEM SET pg_stash_advice.stash_name = 'my_stash';
In other words, you can use these modules to control planner behavior on a system-wide basis without requiring application changes. I think that's extremely powerful, and could be used to address a wide variety of operational challenges that many PostgreSQL users have.
I also want to mention that I have intentionally attempted to separate mechanism from policy: pg_plan_advice is the core technology that provides only the simplest possible ways for you to generate advice (namely, EXPLAIN) and provide advice (namely, the pg_plan_advice.advice setting), but it's pluggable. pg_collect_advice is an example of how that pluggability can be used to make more powerful advice collection possible, and pg_stash_advice is an example of how that pluggability can be used to make more powerful advice application possible. While I believe both are somewhat useful as they are, it's also completely possible to create better extensions that do even cooler things, or just different things. For example, pg_stash_advice matches advice string to queries based on the query identifier, but you could write your own extension that does it some other way. pg_stash_advice stores advice strings in dynamic shared memory, but you could do something else. In that sense, pg_plan_advice is not intended as a complete solution, but as a toolkit around which we can further innovate together.
While I'm very excited about the possibilities of these modules, I want to emphasize that this is very much version 1.0 technology. It has plenty of limitations, and there are a number of things that are still fairly clunky. Plus, it's not committed yet, and when or whether that will happen, and whether it will stay committed, is currently unknown. But time is rapidly running out to get things into v19, and more reviewers and testers are needed, so if you're interested, please come help!
Links to v18 patch documentation: pg_plan_advice, pg_collect_advice, pg_stash_advice.
Discussion on pgsql-hackers: original post, v18 patch set, continuing discussion under modified subject line
No comments:
Post a Comment