Task-Based Access Control in the Cloud
The computer industry has been managing access at a role level since as early as the 1970s. The upside to role-based access control (RBAC) is the low operational cost to implement and maintain. The downside is that it can result in permission leakage, as the least privileges needed to perform a role may be more than the privileges required to perform a given task. For most of the last couple of decades, this trade-off of permission leakage for low management cost has been acceptable. But with the shift to cloud computing, I’m beginning to reconsider RBAC.
RBAC Example
To explain my concern, let’s start with an example of Tim, a server engineer on the production operations team who is responsible for supporting several applications. On a regular basis, Tim needs to access and make changes to the production server infrastructure. Using a typical implementation of RBAC, Tim has been granted full rights to the production server infrastructure - but no rights to any of the non-server infrastructure (e.g. physical infrastructure, networks, databases, applications).
Using RBAC, we have granted Tim the least privilege for his job function as a server engineer - but the most privileges required within his job function. For example, over the course of a given work week, Tim may access the server infrastructure 100 times, make server modifications twice, and deploy or remove infrastructure once. But due to Tim’s role, we have granted him the maximum set of privileges required to perform his job function throughout the week - not the least privileges required to perform a given task.
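To make the gap concrete, here is a minimal sketch that compares what Tim's role grants against what each of his tasks needs. The permission names, the task names, and the mapping of tasks to permissions are all hypothetical and chosen only for illustration; the weekly counts come from the example above.

```python
# Hypothetical permission names, for illustration only.
ROLE_PERMISSIONS = {
    "server_engineer": {
        "server:read",
        "server:modify",
        "server:deploy",
        "server:remove",
    },
}

# Assumed mapping of each task to the permissions it actually needs.
TASK_PERMISSIONS = {
    "access": {"server:read"},
    "modify": {"server:read", "server:modify"},
    "deploy_or_remove": {"server:read", "server:deploy", "server:remove"},
}

# Tim's week from the example: task -> number of times performed.
WEEKLY_TASKS = {"access": 100, "modify": 2, "deploy_or_remove": 1}


def excess_permissions(role, task):
    """Permissions the role holds that the task does not need."""
    return ROLE_PERMISSIONS[role] - TASK_PERMISSIONS[task]


def leakage_report(role, weekly_tasks):
    """Map each task to (times performed, sorted list of unneeded permissions)."""
    return {
        task: (count, sorted(excess_permissions(role, task)))
        for task, count in weekly_tasks.items()
    }


if __name__ == "__main__":
    for task, (count, excess) in leakage_report("server_engineer", WEEKLY_TASKS).items():
        print(f"{task}: performed {count}x, unneeded permissions: {excess}")
```

Under these assumptions, every one of the 100 routine accesses is performed while holding modify, deploy, and remove rights that the task never uses.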
What Changed?
Managing access control at a more granular level than a role is hard, so there has been little incentive for us to consider alternatives. But several things have changed in the cloud:
- All infrastructure can be provisioned and unprovisioned remotely - Compute, storage and application services are virtualized, and can be changed remotely using API calls. Even characteristics that were formerly physical, such as memory, storage, and compute, can be changed with a single API call.
- Infrastructure is transient - The lifetime of a node in a cloud architecture may be days or weeks, as techniques such as blue-green deployments favor deploying new nodes over upgrading existing ones.
- Infrastructure is code - The investment in automation required to manage at scale in the cloud has resulted in the ability to rapidly make complex changes across widely distributed infrastructure. The same code that makes it easy to provision and deploy a complex application cluster in minutes can just as quickly tear down that cluster.
The result is that least privileges to perform a role in the cloud can result in very powerful and dangerous permissions being held by a few individuals. But if not RBAC, then what?
Solution: Task-Based Access Control (TBAC)
I’ve been giving serious consideration to task-based access control. With TBAC, the least privileges granted to an engineer supporting production infrastructure would be none, and either a rule-based system (e.g. in response to an alarm) or a human being (e.g. manual approval of a ticket) would grant additional privileges. The challenges are that: 1) few turnkey systems are available to manage TBAC, and 2) it requires that cloud vendors provide the necessary infrastructure for temporary authentication (i.e. token-based authentication).
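As a rough sketch of the idea, the broker below starts every engineer with no privileges and issues time-limited grants, either from an alarm rule or from an approved ticket. Everything here is hypothetical: the class and method names, the alarm-to-permission table, and the permission strings are assumptions for illustration, not any vendor's API. In a real deployment the grant would be backed by the cloud vendor's temporary credential mechanism rather than an in-memory list.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    user: str
    permissions: frozenset
    reason: str
    expires_at: float


# Hypothetical rule table: which alarm justifies which permissions.
ALARM_RULES = {
    "disk_full": {"server:read", "server:modify"},
}


class TaskGrantBroker:
    """Users hold no privileges until a short-lived grant is issued."""

    def __init__(self, clock=time.time):
        self._clock = clock  # injectable so expiry can be tested
        self._grants = []

    def _grant(self, user, permissions, reason, ttl_seconds):
        grant = Grant(user, frozenset(permissions), reason,
                      self._clock() + ttl_seconds)
        self._grants.append(grant)
        return grant

    def grant_for_alarm(self, user, alarm, ttl_seconds):
        """Rule-based path: an alarm grants a predefined permission set."""
        return self._grant(user, ALARM_RULES[alarm], f"alarm:{alarm}", ttl_seconds)

    def grant_for_ticket(self, user, permissions, ticket_id, approver, ttl_seconds):
        """Manual path: a human approver signs off on a ticket."""
        if approver == user:
            raise ValueError("users may not approve their own tickets")
        return self._grant(user, permissions, f"ticket:{ticket_id}", ttl_seconds)

    def is_allowed(self, user, permission):
        """True only while an unexpired grant covers the permission."""
        now = self._clock()
        self._grants = [g for g in self._grants if g.expires_at > now]
        return any(g.user == user and permission in g.permissions
                   for g in self._grants)
```

A caller would check `is_allowed` before each privileged operation; once the grant's lifetime passes, the engineer is back to holding nothing.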
In the next few years, I expect a gradual shift away from RBAC, driven by a need to provide better security for our dynamic cloud-based infrastructure.