Wednesday, December 08, 2021

Surviving Without A Superuser - Part One

PostgreSQL users and developers are generally aware that it is best to minimize the number of tasks performed as superuser, just as at the operating system level most Linux and UNIX users are aware that it's best not to do too many things as root. For that reason, PostgreSQL has over the last few years introduced a number of predefined roles that have special privileges and which in some case can be used in place of the superuser role. For instance, the pg_read_all_data role, new in version 14, has the ability to read all data in every table in the database - not only the tables that currently exist, but any that are created in the future. In earlier versions, you could achieve this effect only by handing out superuser permissions, which is not great, because the superuser role can do much more than just read all the data in the database. The new predefined role allows for a very desirable application of the principle of least privilege.

Unfortunately, the predefined roles which exist in current releases of PostgreSQL do not, in my view, really come close to solving the problem. It's good that we have them, but there are still a large number of things which can't be done without superuser privileges, and even if we make as much progress in the next 3 years as we have in the past 10, we still won't be all that close to a full solution. We need to do better. Consider, for example, the case of a service provider who would like to support a database with multiple customers as tenants. The customers will naturally want to feel as if they have the powers of a true superuser, with the ability to do things like create new roles, drop old ones, change permissions on objects that they don't own, and generally enjoy the freedom to bypass permission checks at the SQL level which superusers enjoy. The service provider, who is the true superuser, also wants this, but does not want the customers to be able to do the really scary things that a superuser can do, like changing archive_command to rm -rf / or deleting the entire contents of pg_proc so that the system crashes and the database in which the operation was performed is permanently ruined.

Furthermore, while the service provider does want to give privileged users from each customer the ability to manage all database objects owned by any user associated with that customer, such users shouldn't have any special ability to mess with objects owned by other customers. Thus, the privileged account for each customer is like a superuser account, but with two restrictions: (1) truly scary operations that would break the whole cluster remain prohibited and (2) the account can bypass SQL-level permissions for some objects but not others, and specifically those where the user who owns the object is from the same customer as the privileged account. A system of predefined roles might eventually be able to accomplish (1), but it can't reasonably achieve (2). While (1) would have value for some users of PostgreSQL even without (2), it would be better to have both.

To achieve (2), it's necessary to provide some way of grouping users together. My colleague Mark Dilger is trying to work out some way of solving that problem that will prove acceptable to the PostgreSQL community as a whole. One idea suggested by Tom all the way back in 2005 and pursued by Mark more recently was to introduce a notion of role ownership. The privileged account for each customer could own all the other roles for that customer. This proposal attracted some objections, I think partly because the idea of role ownership is easily confused with role membership. While there are some differences between the existing concept of role membership and Mark's proposal to introduce role ownership, both members and the owner of a role would enjoy all the privileges of that role, so they would be closely related concepts.

In response to some of the concerns that were raised, I offered a competing proposal: suppose that we introduce a new CREATE TENANT command, and every role is a member of exactly one tenant. To make this work, there would be one predefined tenant to which all roles would initially belong; later, superusers could create other tenants and move roles between them. Then, some roles could be marked as tenant administrators, making them, in effect, superuser with respect to all roles that are members of that tenant and all objects which those roles own. This serves exactly the same purpose as the role ownership scheme proposed by Mark, but perhaps provides a cleaner conceptual separation, because the thing that is used to group roles is not itself a role but some other kind of object. Unfortunately this proposal hasn't attracted much comment so far.

I hope we can converge on some proposal which is broadly acceptable, or which at least nobody hates too terribly much. As glad as I am that we have predefined roles and that they have become more capable over the years, I believe the growing popularity of service provider environments demands something more. Whether that has to include true multi-tenancy or just the ability to distinguish between service-provider-only operations and things that the customer is allowed to do is perhaps arguable, but I think it's possible to solve both problems at once with a relatively simple design, and I think that's better than solving only one of them.

For purposes of this blog post, I have oversimplified just a bit by binning everything together into just two categories: the truly scary stuff that can break everything, and the ability to override SQL-level permissions. The reality is a bit more complex, so I'll talk more about those issues when I get around to writing part two.

5 comments:

  1. Hi Robert,
    > service provider

    This has triggered me. Aren't you afraid that starting building multi-tenancy into PostgreSQL is just taking course onto big iceberg? (rather than keeping current course of being "simple" /KISS in positive UNIX understanding). What are you going to do if one of the tenants requires PITR? (ok, you can assume each tenat is on dedicated tablespace and leave the dirty tablespace PITR to external tools).

    But still PostgreSQL lacks
    - insight into what's happening (are you planning to expose "tenant" into a lot of pg_stat_* views?)
    - quotas (other than per-role connection_limitation) / Resource management/QoS(both CPU and IO) not to mention any more advanced throttling to protect single-threaded checkpointer, bgwriter, walwriter or stuff like "tenant work_mem"?
    - way to transfer the "noisy neighbor" to another silo

    So, isn't the stance of pushing multi-tenancy complexity outside (the OS, VMs, k8) a better alternative than reimplementing the same into the core?

    ReplyDelete
    Replies
    1. That's probably true in many cases, but the cost for having some facilities in PostgreSQL itself is low. Even if you only want to support a service provider with a single tenant, you need some way to distinguish between which users the tenant can control and which can be controlled only by the service provider.

      Mark and I initially thought of having a grantable permission of some sort which would give a particular PostgreSQL user power over all non-superusers on the system, but not over superusers. A kind of sub-superuser, if you will. But it was pointed out in the mailing list discussion that this is not really adequate, because the service provider might want to have less-privileged users for some purposes (e.g. a non-superuser with replication privileges can take a backup) and yet not want the tenant to have power over those. So it seems that we need a way of associating the tenant users back to the tenant administrator, even if there is only one tenant.

      And once you have that, I don't really see why it hurts to also allow multiple tenants. True, it need not be done, and probably most service providers wouldn't use it, but it doesn't really buy anything to nail down the grouping mechanism in such a way that it can only track one group.

      I also agree with your point that this isn't a complete solution to anything. But PostgreSQL rarely adopts a patch that is a complete solution to every problem in a particular area. Incremental progress is the name of the game.

      The merits of a quota system, for example, can be debated separately. My personal opinion on that particular idea is that it would have some use even in a single-tenant environment and would probably be a net positive for PostgreSQL, but also that it's not a particularly high priority since there are other things that would deliver more benefit for the same work. As in all things, I acknowledge that other people have every right to take a different view.

      Delete
    2. Thank you for responding. It makes more sense now. My initial understanding was mainly about chain reaction of even having "multitenancy" minimal feature released (CREATE TENANT as you described). A lot of people would say: "Look! PostgreSQL is multitenant, we can now throw 1k apps onto single DB", DBAs life would be miserable at that point. Maybe it is because "multitenancy" has a little negative connotation in my dictionary and was overhyped(?) - so it's mostly about naming things.

      Delete
  2. Agree with Jakub. The era of app/service level tenancy, with a huge contact (attack!) surface is over. In 2021 we have so many good tools for container/OS/VM level segregation, it's just not worth it.

    ReplyDelete
  3. I some ways, I think what you are proposing is akin to what happened to Linux when containers were introduced. As with many who followed the saga over at LWN, it has been some 20-odd years of lots of interesting changes, and it may have taken a while to generate something useful as a "product" but the outcome was far more useful than anyone could have predicted IMHO. I think something similar would happen in Postgres: it would be extremely difficult, it would touch every aspect of the database, but ultimately it would bring benefits that we can't even think of now. Great article.

    ReplyDelete