Comments on Robert Haas: PostgreSQL's Hash Indexes Are Now Cool

Thanks so much for this guys. Just wanted to add ...

2018-01-14T00:48:35.132-05:00

Thanks so much for this guys.

Just wanted to add some context about unique hash indexes. For a variety of reasons, mostly around federation and distributed computing, our product is moving toward using random UUIDs as the primary keys for our domain objects -- so I'd love to see unique hash indexes supported by pg.

In my very simple and unscientific microbenchmarking using pgcrypto, inserts of random UUIDs are about 25% faster with hash indexes than with btrees - on pg9.6! I expected this, as I understand that btree performance is a lot poorer than hash performance on random data, simply because the cache footprint is so much higher as the tree is traversed. In addition, the hash index was actually a bit smaller than the btree. Again, 9.6. (Yeah, I'll upgrade to 10 soon enough :)

We used hash indexes extensively when working with a previous database server in the 90s for just this reason; and it's extremely rare for us to use range queries on primary keys. If I could, I would make all primary keys in our database use hash indexes.

Sadly I don't have the PG chops to add this functionality myself. :-(

I don't know of a plan to implement that featu...

2018-01-05T14:35:04.500-05:00

I don't know of a plan to implement that feature. I think it would be cool if someone did.

Mailing list discussion here: http://postgr.es/m/6318fb86-0a64-61e7-e4e2-714db2b3407a@anastigmatix.net

This is great news. I went to try it out to make u...

2017-11-30T04:01:45.490-05:00

This is great news. I went to try it out to make unique field indexes smaller, for things like UUID and integer primary keys that don't need range support. Sadly, it returned an error saying that unique is not supported. Is there any plan to implement that? It would save lots of space for larger unique key indexes and also make lookups fast in the case of foreign keys.

Please go ahead!

2017-10-07T15:01:29.583-04:00

Please go ahead!

Hello,I want to translate this article into Chines...

2017-10-03T21:17:49.582-04:00

Hello, I want to translate this article into Chinese and share it on the web. Of course, I will clearly credit the author and include the necessary information. Would you authorize this?

Thanks. As a default behavior, I think it's b...

2017-09-29T14:18:29.114-04:00

Thanks. As a default behavior, I think it's better to always hash. The performance cost probably isn't much, and it avoids problems if your UUIDs are less than random (which is quite possible if they are anything other than v4 UUIDs, the only kind that are randomly generated).

However, if for a particular application you really want some other behavior, you could create a custom hash operator class that defines the hash function in any way you like - for example, you could make it a function which just extracts the first 32 bits. This would take a handful of lines of SQL and a small C extension module, but no core changes.

If anyone decides to try it out, I'd be interested in hearing how the performance compares with the standard uuid_ops.
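
To make the trade-off above concrete, here is a rough Python sketch of why taking the first 32 bits is only safe when the UUIDs really are random. It is a toy model, not PostgreSQL code: the bucket count, the modulo bucket mapping, the fixed-prefix ID scheme, and all function names are assumptions made up for illustration, and BLAKE2b merely stands in for "some real hash over all 16 bytes" (it is not the function uuid_ops uses).

```python
import hashlib
import uuid
from collections import Counter

N_BUCKETS = 256  # assumed bucket count for this toy model


def prefix_hash(u: uuid.UUID) -> int:
    """Use the first 32 bits of the UUID directly as the hash value."""
    return int.from_bytes(u.bytes[:4], "big")


def full_hash(u: uuid.UUID) -> int:
    """Hash all 16 bytes down to 32 bits (BLAKE2b is only a stand-in)."""
    digest = hashlib.blake2b(u.bytes, digest_size=4).digest()
    return int.from_bytes(digest, "big")


def bucket_counts(uuids, hash_fn) -> Counter:
    """Count how many keys land in each bucket under a simple modulo mapping."""
    return Counter(hash_fn(u) % N_BUCKETS for u in uuids)


def make_prefixed_uuid(i: int) -> uuid.UUID:
    """Hypothetical non-random scheme: fixed 32-bit prefix, counter in the rest."""
    prefix = (0xCAFEBABE).to_bytes(4, "big")
    return uuid.UUID(bytes=prefix + i.to_bytes(12, "big"))


random_ids = [uuid.uuid4() for _ in range(10_000)]
prefixed_ids = [make_prefixed_uuid(i) for i in range(10_000)]

for label, ids in (("random v4", random_ids), ("fixed prefix", prefixed_ids)):
    for fn in (prefix_hash, full_hash):
        counts = bucket_counts(ids, fn)
        print(
            f"{label:12s} {fn.__name__:11s} "
            f"buckets used: {len(counts):3d}, largest bucket: {max(counts.values())}"
        )
```

With random v4 UUIDs both functions spread the keys over the buckets, but with the fixed-prefix IDs the prefix function sends every key to the same bucket, while hashing all 16 bytes still spreads them out. That is the failure mode always hashing protects against.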

most excellent post. for uuid/digest any thoughts...

2017-09-29T07:23:17.970-04:00

Most excellent post.

For uuid/digest, any thoughts on using the first 32 bits as the hash value, instead of hashing the uuid?

Excellent post, Robert!

2017-09-28T09:06:48.395-04:00

Excellent post, Robert!