About 9 months ago, Peter Eisentraut reported on the results of a UserVoice survey he had set up. The purpose of the survey was to gather information from PostgreSQL users on which features they were most interested in seeing added to PostgreSQL. Of course, to what extent the features that got high marks in that survey represent the real interests of the community is questionable: certainly, not every PostgreSQL user (or potential user) reads Peter's blog. Still, the results seem to indicate a lot of interest in materialized views.
Recently, Pavel Baros made a proposed a possible implementation of materialized views on the pgsql-hackers mailing list as a Google Summer of Code project. What he's proposing to do is just about the simplest possible implementation of materialized views I can imagine, comprising just two commands:
CREATE MATERIALIZED VIEW name AS SELECT ...
ALTER MATERIALIZED VIEW name REFRESH;
Of course, right away this opens a can of worms. Basically, there are two things that you might mean when you talk about a "materialized view":
1. A materialized view that is always up to date. In other words, whenever the underlying tables are updated, there is some incremental-update process that concurrently modifies the materialized view data so that it remains up to date also. The tricky part is that it's not easy to update a materialized view of an arbitrary query for a reasonable cost. If the query is complex, the only feasible way to keep the materialized view up to date may be to re-execute the entire query whenever any of the underlying tables are modified (or when any of the functions or operators are redefined), which is probably not feasible for most people. But the plus side is that, when a cheap incremental update IS possible, you don't really need to know that you're working with a materialized view at all. It's indistinguishable from a regular view, up to performance.
2. A materialized view that isn't always up to date. This would include the proposed implementation, which is manual-refresh-only, as well as, for example, views that are refreshed on a regular schedule. (It's pretty easy to convert manual-refresh-only to refreshed-on-a-regular-schedule if you have access to cron!) Materialized views of this type are possibly useful in cases where incremental refresh isn't easily done, such as aggregates over a huge data set that produce a much smaller output table. Of course, you can do this kind of thing today using triggers, but maybe it would be nice to have something in the database that would do it for you automatically.
In the end, I suspect we'll want to support both of these features in PostgreSQL, but I'm curious which one people think is more useful. Our initial implementation of either feature is likely to be a bit rough around the edges, so don't assume we're going to provide all the optimizations we'd ever like to have on day one!