• Feb 22, 2022
This post originally appeared on https://marcusnoble.co.uk/.
If you've managed multi-user/multi-tenant Kubernetes clusters, then there's a good chance you've come across RBAC (Role-Based Access Control). RBAC provides a strong method of providing permissions to users, groups, or service accounts within a cluster. These permissions can either be cluster-wide, with ClusterRole
, or namespace scoped, with Role
. Roles can be combined to build up all the rules stating what the associated entity is allowed to perform. These rules are additive, starting from a base of no permissions to do anything, building up what is allowed to be performed and there's no syntax to take away a permission that is granted by another rule.
Generally, and by default, operators of the cluster are assigned to the cluster-admin
ClusterRole. This gives the user access and permission to do all operations on all resources in the cluster.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: cluster-admin
labels:
kubernetes.io/bootstrapping: rbac-defaults
rules:
- apiGroups:
- '*'
resources:
- '*'
verbs:
- '*'
- nonResourceURLs:
- '*'
verbs:
- '*'
There's very good reason for this, an admin generally needs to be able to setup and manage the cluster, including the ability to define and assign roles. But what if we need to block an action performed by cluster admins? We can't do it with RBAC, it only allows for adding of permissions, not taking them away.
Well, recently at Giant Swarm, we had an issue in one of our CLIs used by cluster admins that incorrectly deleted more resources than intended. This was causing issues authenticating with the cluster, and we needed to get a fix in place fast. The problem was we had no control over when people would update their CLI even once we'd released the fix.
We couldn't hot-fix this by updating RBAC rules as we couldn't subtract the specific permission, but what we could do was leverage an admission controller to block the request to the API.
At Giant Swarm, we're pretty big users of Kyverno (which deploys an admission controller) and use it for a lot of validation and defaulting of resources in all our clusters. Not only can Kyverno validate and block based on a resource's values but it can also validate the operation being performed, and by who.
With this functionality available, we were able to deploy a cluster policy that would block cluster-admin (and anyone else) from performing the specific delete all action that was causing us issues.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: block-bulk-certconfigs-delete
annotations:
policies.kyverno.io/title: Block bulk certconfig deletes
policies.kyverno.io/subject: CertConfigs
policies.kyverno.io/description: >-
A bug in an old kubectl-gs version causes all certconfigs to
be deleted on performing a login, this policy blocks that
action while still allowing the single delete.
spec:
validationFailureAction: enforce
background: false
rules:
- name: block-bulk-certconfigs-delete
match:
any:
- resources:
kinds:
- CertConfig
preconditions:
any:
- key: ""
operator: Equals
value: ""
validate:
message: "Your current kubectl-gs version contains a critical bug, please update to the latest version using `kubectl gs selfupdate`"
deny:
conditions:
- key: ""
operator: In
value:
- DELETE
There are a few things going on here so I'm going to explain each bit individually.
First up, we specify the Kind of resource we want this policy to apply to, in our case, that's CertConfig
but it can be anything within the cluster. It's also possible to target specific groups and API versions if needed.
match:
any:
- resources:
kinds:
- CertConfig
Next up, we're setting a precondition that allows us to do some filtering based on the details of the request. In this instance, we're only interested in requests that don't have a name specified (this is the case when operating on a list of resources rather than a single resource) which allows us to target only those 'delete all' requests.
preconditions:
any:
- key: ""
operator: Equals
value: ""
Finally, we have the actual validation rule. We're specifying a deny
rule that'll block requests (matching the previous match
and preconditions
) with the DELETE
operation. We're also able to define a message that'll be returned to the client as the reason why the admission controller rejected the request. We're using this to inform users about the bug and encouraging them to upgrade.
validate:
message: "Your current kubectl-gs version contains a critical bug, please update to the latest version using `kubectl gs selfupdate`"
deny:
conditions:
- key: ""
operator: In
value:
- DELETE
With this single policy deployed to our clusters we've been able to block the bug in our CLI, even when the user performing the action has cluster-admin level permissions.
While we needed this to work around a bug in our client, there are other situations where this approach could be useful.
do-not-delete: "true"
annotation on it to prevent accidental deletion of critical resources (such as persistent volumes or secrets).So, if you haven't already, I recommend taking a look at Kyverno and especially taking a look at the example Policies that they have for an idea of what can be done.
These Stories on Tutorial
Explore K8s on VMware Cloud Director and uncover how it bridges on-premises infrastructure with cloud native innovation with Giant Swarm.
Unleash Arm's potential with Giant Swarm's guide — Empowering Kubernetes on ARM.
External Secrets Operator (ESO) integrates external secrets services with Kubernetes, providing a convenient way to retrieve and inject secret data as Kubernetes Secret objects.
We empower platform teams to provide internal developer platforms that fuel innovation and fast-paced growth.
GET IN TOUCH
General: hello@giantswarm.io
CERTIFIED SERVICE PROVIDER
No Comments Yet
Let us know what you think