Monday, November 29, 2021

Collation Stability

When PostgreSQL needs to sort strings, it relies on either the operating system (by default) or the ICU collation library (if your PostgreSQL has been built with support for ICU and you have chosen to use an ICU-based collation) to tell it in what order the strings ought to be sorted. Unfortunately, operating system behaviors are confusing and inconsistent, and they change relatively frequently for reasons that most people can't understand. That's a problem for PostgreSQL users, especially PostgreSQL users who create indexes on text columns. The first step in building a btree index to sort the data, and if this sort order differs from the one used for later index lookups, data that is actually present in the index may not be found, and your queries may return wrong answers.