Friday, October 07, 2011

Index-Only Scans: We've Got 'Em

Tom Lane committed a patch for index-only scans by myself and Ibrar Ahmed, which also incorporated some previous work by Heikki Linnakangas, after hacking on it some more himself. Woohoo!

There is, of course, more work to be done here - performance fine-tuning, cost estimation, extensions to the core functionality - but the core of the feature is now in.  If you get a chance, please test it out and let us know how it works for you.

For those that may not have been following along at home, what we're essentially doing here is allowing any index to act as a "covering index".  If all of the columns the query needs are available from the index tuple, we'll skip fetching the corresponding heap (table) page if every tuple on that page is visible to all running transactions.

Although I know we're not even really done with this feature yet, I can't help wondering what's next.  Index-only scans have so often be cited as "the big performance feature that PostgreSQL is missing" that it's become something of a cliché.  Now that we have them, what will take their place as the next big thing?


  1. awesome. thank you! any chance applying the patch to 9.1?

  2. @ Slava sorry, there's no chance of that. It's a new feature, and 9.1 has already been released.

  3. Absolutely. Fantastic work Robert, Ibrar, Heikki, and of course, Tom.

  4. This fantastic news! Great work. The Postgres hits keep on coming. Thank you.

  5. Next missing feature, that should be processed, is more DBA friendly partitioning.

    very nice


  6. I agree with Pavel. More friendly partitioning would be very welcome.

    Thank you for this feature.

  7. Great, well done!

    +1 for partitioning

  8. Nice Work, Also agree with partitioning.

  9. Woo! Does this mean COUNT(*) is cheap? Or is this necessary, but not sufficient for that?

  10. Including non-indexed columns in an index (a la SQL Server's INCLUDE semantics) would make index-only scans very powerful.

  11. Other missing features, that should be processed, Parallel Query and In-Database Compression.


  12. @hugo: What is in-database compression? PostgreSQL already has has one type of compression. Automatic compression of large TOAST.

  13. @Jay Levitt: This patch only makes index use more efficient; it doesn't extend the use of indexes to new situations. So COUNT(*) won't use an index unless it has an indexable WHERE clause.

  14. Robert great work. I got used to having index only scans in Oracle so having this in my toolbox on PG is a welcome addition.

    Another poster does bring up a good point though, what is the incremental work to stop having to do FTS' just to get a count. I can't tell you how many (times a day) that I stub my toe on that.

    I'm no expert on PG's internals but it seems logical that if you can scan the index to retrieve consistent data that, unless the index is partial, that you could use it to get a faster count(*) than scaning the table.

    Thanks again for all your great work!

  15. For the next pig thing, I have heard noise about not spreading a query across all available processor? (no first had knowledge, just mentioning a complaint I heard)

  16. Random Friday wish list: Inlining read-only PL/pgSQL functions, LATERAL support, pushing predicates down into CTEs, something like NATURAL joins but supporting Rails-style foreign key naming (users JOIN posts ON posts.user_id =

  17. So pleased to see such a valuable feature in my favourite DB.

    My wish-list to Santa (c/o Robert Haas et al.) would ask for proper transaction handling (begin/commit/rollback/set isolation level) in pgplsql functions, and integrated indexes.

    No hurry... ;-)

  18. Any plans for enforcing / allowing only index-only scans (i.e. if not all columns in the query have index then just return error)?

    This is *THE* feature that I absolutely love about Google App Engine's datastore and it would be great to see it in PostgreSQL as well!

    Thanks for all the hard work :)

  19. I agree with real partitioning as being very useful.

    Although for many cases if clustering were kept updated (not necessarily in adjacent blocks) it would be as good or better for some workloads. Keeping record clustered in chunks could reduce IO by a lot, although it might not give the clean sequential scans full clustering does.

    I'd be interested in contributing 1k to any kind of bounty for this feature. Email me at dave at oneit dot com dot au if anyone tries to organise this.

  20. Gunther SchadowJune 10, 2012 10:13 AM

    Nice, we've made the "think indexes" work, but I'm sure what Tom Lane put in is much better.

    I think, while we are at indexes, the next thing would be index compression -- not in the sense of block compression, but simply in not repeating to store the data for the front part of the composite index key.

    a b row
    1 1 1
    1 2 2
    1 3 3
    1 3 a
    1 3 b
    1 3 c
    1 3 d
    1 4 4
    2 1 5
    2 2 6
    2 3 7
    2 4 8

    would become

    a b row
    1 1 1
    - 2 2
    - 3 3
    - - a
    - - b
    - - c
    - - d
    - 4 4
    2 1 5
    - 2 6
    - 3 7
    - 4 8

    where "-" would require no space. If you have this, there would be no need for the non-1NF arrays any more to implement things like text indexing.

    While we are at it, another nice feature would be constant columns that are stored only in metadata. This works well together with partitioned table keys. Example could be the "discriminator" column in Hibernate. If you partition table by Hibernate discriminator, why store that same value over and over again?

    Less redundantly stored data -> less IO. And block compression does not really address that and has too many difficulties still.

  21. another "next big performance big thing"?
    How about an auto compressing column oriented data store backend :)

  22. A "next big thing" for performance?

    How about a column oriented data store backend for data analysis applications :) ?

    many thx! for the index only scans !!!