Robert Haas: PostgreSQL Past, Present, and Future: Moving The Goalposts

Tuesday, January 12, 2016

PostgreSQL Past, Present, and Future: Moving The Goalposts

It's nice to see that PostgreSQL 9.5 is finally released! There are a number of blog posts out about that already, not to mention stories in InfoWorld, V3, and a host of other publications. Of all the publicity, though, I think my favorite piece is a retrospective post by Shaun Thomas reviewing how far PostgreSQL has come over the last five years. As Shaun notes, both the scalability and the feature set of PostgreSQL have increased enormously over the last five years. It's easy to miss when you look one release, even (as I do) one commit at a time, but the place where we are now is a whole new world compared to where we were back then.

Back in 2010, I wrote a blog post entitled Big Ideas and a follow-up post called Lots and Lots of PostgreSQL Feature Requests summarizing first the ideas that I had and then the responses that I heard from other people about what was missing from PostgreSQL. Five years later, many of the items on those lists appear somewhat dated. PostgreSQL 9.5 adds INSERT .. ON CONFLICT, which answers many of the use cases that led people to ask for merge; and it adds row-level security. Index-only scans, granular collation support, SE-Linux integration, LATERAL, updateable views, and materialized views are all features we've had for so long now that most PostgreSQL users probably don't even think of them as new features any more. Previous releases have also added new data types, particularly jsonb, which allows easy manipulation of JSON data and which has been further enhanced in PostgreSQL 9.5; as well as new index types like SP-GIST. PostgreSQL 9.5 adds BRIN indexes, part of a growing body of work to adapt PostgreSQL to progressively larger workloads.

Many of the really important features from those lists which haven't yet been released are well underway. For example, we don't have multi-master replication in core just yet, but enormous progress has been made with the inclusion of logical decoding in PostgreSQL 9.4, and there will in all likelihood be more progress on logical replication in PostgreSQL 9.6. A very simple version of parallel query has already been committed to PostgreSQL 9.6 and further enhancements are on the way. Work is also under way on partitioning syntax.

It's time, then, to look to the future. What are the major gaps that exist in PostgreSQL today, as opposed to five years ago? Specifically, once we get through the twin knotholes of parallel query and logical replication, long-overdue projects where the slow progress we've made is a direct result of just how very difficult it is to get them off the ground, what comes next? Leave a comment below with your thoughts.

I outlined some of my own ideas about this in a presentation called The Elephants in the Room, which I gave at both pgconf.us 2015 and pgconf.eu 2015. Both video and slides are online. That presentation mentions both parallel query and logical replication, of course, plus a few other things:

1. Horizontal Scalability. While PostgreSQL 9.5 scales to large boxes better than any previous release (and probably worse than any future release, since we keep making improvements!), what happens when you need more than one server for your workload? PostgreSQL 9.5 has a little-noted feature to allow foreign tables to participate in table inheritance, which is a long way of spelling "sharding" if you tilt your head just right, but the query planner and executor capabilities to make it a really killer feature are not there yet. More generally, regardless of how we get there, leveraging one box effectively is good, but leveraging multiple boxes is better.

2. On-Disk Format. At each of the last two instances of PGCon (protip: you should go), there has been some discussion about making PostgreSQL's table storage pluggable, just as our indexing system has been for many years. This would open the door either to replacing PostgreSQL's storage format entirely with something better, without thereby breaking backward compatibility; or perhaps more likely, to introducing specialized storage formats which are better for certain applications. Storage formats which are more compact and therefore allow reading the same amount of useful data from the disk with less physical I/O seem particularly important.

3. Built-In Connection Pooling. I'm not sure how many users are out there using an external connection pool and wishing they could be rid of it, but I'll bet there are some.

The presentation also talks about direct I/O, but in some sense that's not really a feature from a user perspective. If somebody implements it and that turns out to be beneficial, the feature will be improved performance. Of course, one can never have enough performance, whatever the source.

Again, thoughts on other things PostgreSQL needs are very welcome. Please comment below on what else you think should be added. Thanks.

79 comments:

AnonymousJanuary 12, 2016 3:01 PM
1. Procedural control of the transaction, i.e. PL/[whatever] code should be able to issue COMMIT, SAVEPOINT, ROLLBACK. This is vital for writing ETL in stored code.

2. Autonomous transactions. There's a hack that does this via loopback FDW, but... ugh.

I bugged you about these things briefly at PGConfUS 2015. I get that most PG users don't realize they need these things or are content to work around them, but for Oracle users looking at moving to PG they are likely showstoppers.

3. Wait state monitoring, also ala Oracle

4. True clustered (a.k.a. index-organized) tables. Presumably this is tied to flexibility in the on-disk data format.

5. On-disk bitmap indexes (someone might be working on this?)

Compared to these things though, INSERT ON CONFLICT is huuuuuge. Declarative partitioning would also make me ultra happy.

PG is in a great place in general. Looking back on the Oracle shop I worked in for years, the only things preventing a switch to PG -- which would save millions in licensing -- are #1 and #2 above.
ReplyDelete
Replies
Ron DunnJanuary 12, 2016 7:12 PM
I'd like to see two priorities:
1. In-place upgrades. I've been spoiled by SQL Server in the past, and find it really painful to upgrade servers to new versions.
2. Cross-platform completeness. No more features that are platform dependent, such as replication that does not support Windows.
Neither is sexy, but both represent big steps in operational maturity.
ReplyDelete
Replies
UnknownJanuary 12, 2016 8:12 PM
Maybe not in the exact order, mixed bag of small and big features:
1. Declarative partitioning with proper FK propagation and global indicies
2. Optimization in pushing down joins on partitions' keys/FKs (cascading them down), so related to p. 1!
3. IOT - Index Organized Tables
4. On-disk pluggable formats
5. Pluggable tablespaces - much easier to manage when dealing with huge tables, also easier to move between different IO paths
6. Long term dream: Merge of Postgres-XL into the core. Sharding will never be enough unless there's a mesh/actor/load-balancing logic in there. I wish XL/X2 would get GTM integrated into the core or got rid of and then multiple coordinators/datanodes could be configured per node using standard PG.
7. In-place upgrades
8. CPU/IO vertical parallelization, this could be also higher in the list and very happy to see it's getting there steadily.
ReplyDelete
Replies
UnknownJanuary 12, 2016 9:55 PM
I am so happy about INSERT ON CONFLICT, that is the biggest thing I am excited about in 9.5

Going forward I would agree with others that in place upgrades would be very helpful.

The other one that I don't know that there is anything that can be done about is the amount of locking that is done during update table. If there is anything that can be done to improve this I think that would help a lot because that would help with being able to run schema changes while the db has users using the db.
ReplyDelete
Replies
TostinoJanuary 12, 2016 10:59 PM
One thing that I would love to see a little work done on is enum types. We use them extensively for our application, and they are amazing for not having to have a ton of static lookup tables, yet still being type safe.

I would love if there was some love given to being able to rename / delete an enum value without having to re-create the enum entirely (with renamed / deleted values) and migrate all table columns to the new enum, then drop the old one.

It's a usability request, but man i've spent too much time on this because other developers were too eager in creating types, and the enum value names didn't match the standards we had laid out. Fixing them is a real pain in the butt now.

Other than that, I want to echo #1, #2, and #4 from Noah.
ReplyDelete
Replies
AnonymousJanuary 13, 2016 1:02 AM
Make that "Peered Horizontal Scalability" - ie. no SPOF and no masters/workers disjoint - and I believe your list is spot on Robert.

ReplyDelete
Replies
UnknownJanuary 13, 2016 1:17 AM
I'd really like to have point in time queries (oracle flashback) when the archive logs are online.
ReplyDelete
Replies
UnknownJanuary 13, 2016 2:23 AM
Multi-core parallelization is very exciting. Intel is releasing a 72 core chip soon and I expect a very large number of people are CPU bound on their queries these days.

Horizontal scalability is very important in our case. We're evaluating CitusDB now and it's promising, but it would be great to have more of these features in core.
ReplyDelete
Replies
JustinJanuary 13, 2016 3:23 AM
3. Built-In Connection Pooling.

If Postgresql would get the pgbouncer transaction pooling built-in without the caveats, than that would really be a performance enhancing killer feature. Cheap database connections with dynamically managed forked workers.

"Just turn on database pooling in your application" (most of those suck, they keep way too many connections open which is a problem for Postgresql)

And people that already have that setting turned on will almost certainly see a big reduction in worker processes and probably an increase in available cache memory.
ReplyDelete
Replies
AnonymousJanuary 13, 2016 7:08 AM
First a fair disclaimer: as a longtime (22+) Oracle DBA my opinion is certainly biased, but since we're actively looking for a "cheaper" alternative (for at least some workloads) I see only PostgreSQL as a possible alternative. I love everything about PostgreSQL. Almost. The lack of two crucial features are preventing us to deploy PG at the moment. First is the "neglected" state of backup and recovery in PG. The "feature" we evaluate first with every DB engine is how strong it's backup & recovery tools are. Sorry to say, but we're spoiled by Oracle RMAN, hence it feels like we're in a stone-age while dealing with a backup and recovery procedures in PG.
Ok, we could somehow survive with the current state of b/r. But the complete lack of builtin auditing capabilities is a show stopper for us (and no, log_statement='all' doesn't count as a proper audit, pgaudit extensions looks promising).
Of course, we're looking forward for other already mentioned enhancements as well (declarative partitioning being at the top of our list) but our vote goes to: #1 true incremental binary backup and recovery, #2 add full support for DML/DDL auditing.
ReplyDelete
Replies
UnknownJanuary 13, 2016 7:31 AM
Our top wish list

- cross-column/multidimensional statistics
- option to always force a column to go to 'external' storage, even with under 2K row size
- modify an enum type within a transaction!
- almost-online upgrade between major versions (with pg_basebackup + pg_upgrade?)
- multicolumn MAX()/MIN() ... GROUP BY using loose index scan
- incrementally refreshed materialized views
- pg_hba.conf / auth settings should be modifiable by db superuser via SQL

ReplyDelete
Replies
UnknownJanuary 13, 2016 7:31 AM
... and proper partitioning of course...
ReplyDelete
Replies
volkJanuary 13, 2016 8:23 AM
1. Declarative partitioning.
2. Quorum commit (aka semi-synchronous replication).
3. Failover on libpq level.
4. Page-level incremental backups with compression and parallelism.
5. Other compression methods (i.e. lz4 or snappy) for data.
6. Built-in connection pooling.
7. Online implementations for most operations (turning on checksums, REINDEX, most of ALTER TABLE commands).
8. Vertical scalability (the way to use 80% of RAM for shared_buffers when data doesn't fit in RAM).

#1 should include (may be later) partitioning by hash and normal unique indexes/foreign keys on whole table.
ReplyDelete
Replies
vdpJanuary 13, 2016 9:11 AM
Doesn't FDW pretty much give us pluggable disk format already ? For example with https://github.com/citusdata/cstore_fdw ?
ReplyDelete
Replies
UnknownJanuary 13, 2016 9:21 AM
Senpai noticed me!

But seriously, this is all thanks to you and the other devs really kicking ass on the engine these last few years.

For me, there are only a few big "missing" things I've always wanted in Postgres:

1. Sharding. I work with a few of huge databases that need horizontal scalability, and this has always been a manual process. Postgres-XL gets us some of the way, but depending on a coordinator precludes independent node loading and drastically reduces import speed. CitusDB still has major issues with transaction limitations. We're getting there, but it's still very much a roll-your-own situation right now.
2. Correlated column statistics. I know this is an extremely difficult problem to solve, but it's seriously hurting the planner in some very specific ways. I always cry a little inside when I have to move WHERE conditions outside of an optimization fence to speed up queries because the multiplied probabilities seriously screwed up the row estimates.
3. Not query hints, but... something. What we have right now is arguably worse, since we're hard-coding optimization fences like CTEs, OFFSET 0, and other such quirks directly into our queries. Those are a lot harder to ignore, disable, or deprecate than some SQL decorators. I've seen this discussion come up several times in the mailing lists, but it always vanishes into The Ether within a few days. :(

Parallel queries *used* to be on my list too, but that's clearly being addressed in 9.6 and subsequent versions. Putting the background workers in was really a genius move to get this moving, considering the implications for unrelated extensions leveraging it as well.

In any case, keep up the good work; I'm not sure you'll ever really comprehend just how grateful many of us really are.
ReplyDelete
Replies
Daniel LyonsJanuary 13, 2016 12:50 PM
I may be in the minority here, but having seen the lies and havoc that MySQL's "storage engines" cause, I am against having different on-disk formats. I suppose my worries could be mitigated if it were guaranteed that there would not be substantial features limited to certain formats but this feels like the kind of thing where once you open the door, you can't be sure what will walk in.
ReplyDelete
Replies
Gary ClarkeJanuary 13, 2016 4:07 PM
Incremental and auto refresh of materialized views would be my top choice for much needed missing capability in Postgres.
ReplyDelete
Replies
UnknownJanuary 13, 2016 4:18 PM
+1 for "logical replication". Right now there's no online way to split a cluster or migrate a single database to another cluster. If you use pg_dump, you have to stop all writes until the database has dumped & restored, or risk losing transactions. The only other way is to use replication to duplicate the whole cluster, then remove the databases you don't want to migrate. If you're splitting because of space, having to duplicate databases that you'll immediately throw away is a waste of time and storage. Being able to scale horizontally means having convenient tools to manage scaling out. Right now the inability to split a cluster online is a big impediment.

ReplyDelete
Replies
UnknownJanuary 13, 2016 9:18 PM
Not mentioned here yet is a real need for better tools for parsing and understanding query plans. [explain] output is too hard to read and too cryptic for the casual database user. Query tuning is a black art and I don't think it should be. I was thinking of putting together some better data visualizations of explain outputs for pgconf us this year, but unfortunately I'm not sure now that I'll have the time to do it.
ReplyDelete
Replies
AnonymousJanuary 14, 2016 3:17 AM
PL/pgSQL RETURN NEXT should not collect all data in memory/on disk

see: http://www.postgresql.org/docs/current/static/plpgsql-control-structures.html#PLPGSQL-STATEMENTS-RETURNING :
The current implementation of RETURN NEXT and RETURN QUERY stores the entire result set before returning from the function, as discussed above. That means that if a PL/pgSQL function produces a very large result set, performance might be poor: data will be written to disk to avoid memory exhaustion, but the function itself will not return until the entire result set has been generated. A future version of PL/pgSQL might allow users to define set-returning functions that do not have this limitation.

ReplyDelete
Replies
AnonymousJanuary 14, 2016 3:44 AM
-Event Trigger variables (tg_objectis,tg_schema,..)
-parallel query
ReplyDelete
Replies
AnonymousJanuary 14, 2016 8:22 AM
For me:
1) please fix the CTE problem - saying "oh but people are relying on our broken behavior" is not acceptable. So fix it in 10.0 or something, but don't leave it broken.

2) parallel queries

3) merge
ReplyDelete
Replies
AnonymousJanuary 14, 2016 9:06 AM
incremental backup done properly
ReplyDelete
Replies
Richard BJanuary 14, 2016 1:09 PM
A clear longer-term strategy for the not-unrelated ..
- Storage Class Memory (bye bye disks)
- In-memory OLTP (see e.g. M$ Hekaton)
ReplyDelete
Replies
umlcatJanuary 14, 2016 6:50 PM
What about something already existing ?

An alternative syntax for "INSERT" command similar to "UPDATE"

INSERT INTO ThisTable
SET
SomeKey = 77,
SomeField = 'Hello World'

Cheers.
ReplyDelete
Replies
Alfonso VicenteJanuary 15, 2016 12:17 AM
My whish list contains many of the improvements mentioned above, plus flashback queries (e.g. select ... as of 5 minutes ago). Since PostgreSQL has MVCC, I think this feature can be implemented and should work althought an involved dead row has been deleted by a vacuum (error can be something like: flashback query has no visibility for the required timestamp).

This kind of features turn transactions "oops-safe" :-)

Another big dream in my whish list, not related to the PostgreSQL core, is an Apache mod_plpgsql module (capable of execute a function from http), in base of which can be implemented a database-centric development framework. With that, any SQL/plpgsql developer in the world becomes a super-productive Web developer

ReplyDelete
Replies
Jeremy PalmerJanuary 15, 2016 1:06 AM
Incremental refresh of materialised views which does not need a full re-computation of the view. It would be nice if it could deal with nested materialised views, joins (LEFT, INNER) and some windowing and aggregate functions (e.g row_number, count, min/max, sum, string_agg). This would really reduce the time to produce analytics from large warehouse datasets that only change a small amount on each update.
ReplyDelete
Replies
AnonymousJanuary 15, 2016 7:37 AM
Reduce per row overhead from 24b to probably less than 4b
ReplyDelete
Replies
AnonymousJanuary 15, 2016 5:09 PM
Configure the server to retain a snapshot of the database that is at least x minutes old (like a rolling replication slot) and offer the ability to revert a table (or set of tables) back to a specific trans ID.
ReplyDelete
Replies
RobertJanuary 15, 2016 6:19 PM
Loosening some of the conditions in https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions
ReplyDelete
Replies
AnonymousJanuary 15, 2016 7:42 PM
#3: Connection pooling on the server side is a hack. Clients should always do their own pooling because even if the server has a pool, the TCP handshake is an unavoidable delay. Client based pools skip the TCP handshake when a connection is reused. Therefore, server based pools are always inferior to client pools. No hacks please.
ReplyDelete
Replies
AnonymousJanuary 17, 2016 8:23 AM
I woluld like to see sql hints. I konow that PG should choose best query plan based on available stats. Sadly, we do not want (or could) make such stats because of high data volume we manage. There should be a way to manually improve queries when needed.
ReplyDelete
Replies
AnonymousJanuary 18, 2016 1:00 PM
Full support for RANGE within WINDOW FUNCTIONS would be really helpful. There is no elegant and efficient way in PostgreSQL to write a query that references a window spaced by defined periods of time.

PostgreSQL documentation states that the "frame_clause" in "window_definition" has the following limitation: "The value PRECEDING and value FOLLOWING cases are currently only allowed in ROWS mode".

That is supported by many other DBMS e.g. in Oracle the following works:

SELECT amount * 100 / AVG(amount) OVER (PARTITION BY client_id ORDER BY purchase_date RANGE BETWEEN INTERVAL '1' YEAR PRECEDING AND INTERVAL '1' YEAR PRECEDING)
FROM purchases
ReplyDelete
Replies
Nathan ThomJanuary 21, 2016 7:15 PM
Create index with parallel processes! We want to use Postgres for some very large BI warehouses, but the index creation is seriously hurting. 9.5 was a huge win for performance, but parallelism is what we really need.
ReplyDelete
Replies
AnonymousFebruary 09, 2016 1:13 PM
I would realy like to have a option to hint postgres that a table should be hold in memory. I mean not temporary tables. If a update occur the data should be written to disc. Would be realy nice to have such a feature.
ReplyDelete
Replies
Isn't ManMay 14, 2017 7:31 AM
PosgreSQL is proving i's growing , maturing and its strong is enoug against existing giant, Oralce or any. I have a chance bidding a project proposing 9.5 against Oracle where users 's very impressive the ability our cloud app with PosgreSQL can offer instead of Oracle. This is a huge challenge in Thailand.
Keep the good work PosgreSQL.
ReplyDelete
Replies

Add comment