How to Audit and Secure AKS Access
How to audit actions in your AKS Cluster
The What
In this post I want to improve your AKS access control implementation by demonstrating the following:
- What the Local Account is in AKS and why we want it disabled
- What integrating with Azure Active Directory means and how that enables us to apply RBAC within the cluster
- How we can use the AKS Diagnostic Settings to log all the events generated in the cluster and who generated them (meaning we can identify who is creating, updating, or deleting resources in the cluster).
The Why
One of the guiding principles of applying a Zero Trust security strategy is to implement least privilege access - meaning users and applications should only be granted access to data and operations that are required for them to perform their jobs. Additionally, it is valuable to audit what users and applications are doing to determine whether there is an identity that is overprivileged.
The How
Local Account
Do you know what the Local Account is in Azure Kubernetes Service? It is a shared account, and by default it has cluster admin privileges.
Let's start from a default cluster that you would deploy from the portal with no additional configurations, just something quick where we can focus on the identity portion. This is going to set us up to talk about what exactly the Local Account is in AKS:
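If you'd rather script that starting point instead of clicking through the portal, a minimal sketch with the Azure CLI (the resource group and cluster names are placeholders) might look like:

# create a small default cluster to experiment with
az aks create \
  --resource-group <RESOURCE_GROUP> \
  --name <AKS_CLUSTER_NAME> \
  --node-count 1 \
  --generate-ssh-keys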
When you run az aks get-credentials --resource-group <RESOURCE_GROUP> --name <AKS_CLUSTER_NAME>, it downloads a kubeconfig file that gives you access to the cluster as a cluster admin, using a certificate signed by the internal Cluster CA. That means you can do anything within the cluster and call every API. Furthermore, every user that runs this command operates as the same identity, so there is no way of knowing who is actually doing what in the cluster.
We can see this by using a kubectl plugin called rbac-tool. Once we get the kubeconfig file and connect to the cluster, we can see who we are authenticated as and the associated Kubernetes role we've been granted. Notice how we have the cluster-admin role granted for this identity:
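If you want to reproduce that check yourself, a quick sketch (assuming you already have the krew plugin manager installed) is:

# install the plugin and ask who the API server thinks we are
kubectl krew install rbac-tool
kubectl rbac-tool whoami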
Take note that at this stage, if you wanted to give users access to AKS, the two Azure RBAC roles "Azure Kubernetes Service Cluster Admin Role" and "Azure Kubernetes Service Cluster User Role" are effectively the same - they both grant permission to call az aks get-credentials and merge the access credential into your local kubeconfig file. Here's a note from the doc reference regarding this:
Clearly this isn't ideal and doesn't align with the Zero Trust principle of least privilege access. We need a way to scope access to specific resources and namespaces in AKS based on who is accessing the cluster - that is, by user identity or AAD group membership. Integrating AKS with AAD lets us go down that path.
Integrating with Azure Active Directory
Let's continue to use our default cluster and upgrade it to leverage AAD for authentication. To prove out our work, I have one admin user (hshahin) and another user (test-user). hshahin will be part of the admin group we provide when upgrading AKS to use AAD. test-user is not in that group and starts with no role assignments; the most it will be granted later is the "Azure Kubernetes Service Cluster User Role".
I'll run the following to upgrade the cluster to use AAD (this will continue using Kubernetes RBAC):
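The exact command will vary with your environment, but a sketch of that upgrade (the admin group object ID is a placeholder) looks like:

# enable AKS-managed AAD integration and designate the admin group
az aks update \
  --resource-group <RESOURCE_GROUP> \
  --name <AKS_CLUSTER_NAME> \
  --enable-aad \
  --aad-admin-group-object-ids <ADMIN_GROUP_OBJECT_ID>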
Once that's complete, you should see in the portal that the configuration now shows we're integrated with AAD and leveraging the default Kubernetes RBAC. Take note that "Kubernetes local accounts" is still enabled; we'll explore that in a bit.
Let's try to understand what this means for how we access AKS. On the first terminal shown, I'm logged in as hshahin, who is part of the admin group. When I run the usual az aks get-credentials command, nothing looks different, but when I then run kubectl get pods, I get prompted to authenticate with AAD as if I were signing into the Azure CLI:
Once I log in, everything works as expected. Let's use the rbac-tool plugin to learn more about my identity. You can see that I'm logged in to AKS as the following user with the following group permissions. Notice how all of the AAD group IDs I'm a member of appear as well:
Let's focus on the 66992c59-... group, since this is the object ID of the group I told AKS to make the admin group. If I run a kubectl rbac-tool lookup command with that group ID, I get the following, showing that it is assigned the cluster-admin role:
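As a sketch, that lookup is just the plugin invoked with the group's object ID (placeholder shown here):

# show the roles bound to this AAD group inside the cluster
kubectl rbac-tool lookup <ADMIN_GROUP_OBJECT_ID>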
Now, let's try with the test user. Right now, this user has no permissions at all on AKS, whereas the hshahin user was part of the admin group:
I can't run the az aks get-credentials command, because I still need to be assigned at least the Azure Kubernetes Service Cluster User Role to pull down the kubeconfig file:
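For reference, a sketch of that role assignment (the object ID, subscription, and scope values are placeholders) would be along these lines:

# grant test-user the ability to pull the kubeconfig for this cluster
az role assignment create \
  --assignee <TEST_USER_OBJECT_ID> \
  --role "Azure Kubernetes Service Cluster User Role" \
  --scope /subscriptions/<SUBSCRIPTION_ID>/resourceGroups/<RESOURCE_GROUP>/providers/Microsoft.ContainerService/managedClusters/<AKS_CLUSTER_NAME>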
With that role in place, I can pull the kubeconfig. However, kubectl get pods is denied, which makes sense because I haven't been assigned any roles within the cluster itself through Kubernetes RBAC:
And from here, it's a matter of working with Kubernetes RBAC or Azure RBAC to grant this user the roles needed to execute operations within the cluster. I won't dive into the "how" of RBAC in this post, but you can see we're now set up to apply least privilege based on the users or groups accessing AKS, whereas before we had no concept of identity or group because everyone shared the same credential.
Here are good references on implementing Kubernetes RBAC with AAD and Azure RBAC for AKS.
Disabling the Local Account
Now let's return to the local account still being available. Notice how when I switch back to the hshahin user, I can run the az aks get-credentials command with the --admin flag:
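That full command is simply the same get-credentials call from earlier with the extra flag appended:

# pull the shared local clusterAdmin credential, bypassing AAD
az aks get-credentials \
  --resource-group <RESOURCE_GROUP> \
  --name <AKS_CLUSTER_NAME> \
  --admin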
When I run that, I get back the local clusterAdmin account. Effectively, the --admin flag is a backdoor to becoming a cluster admin, bypassing the AAD authentication process and the roles implemented through RBAC.
Once you disable that local account (either through the portal or the CLI), you will find that you can no longer use the --admin flag:
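If you go the CLI route, a sketch of disabling local accounts (same placeholder names as before) is:

# turn off the shared local clusterAdmin credential for this cluster
az aks update \
  --resource-group <RESOURCE_GROUP> \
  --name <AKS_CLUSTER_NAME> \
  --disable-local-accounts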
Audit Log for AKS
So far we've seen why the local account is bad and how integrating with AAD unlocks our ability to get permissions applied to specific users and groups. Finally, we want a way to audit the operations different users and groups are performing so we can periodically review that we are in line with our intended RBAC implementation.
Let's navigate to the Diagnostic Settings for the cluster, where we can configure a rule to store platform logs from the AKS control plane. The full reference of what each table provides is here. We really care about seeing which identities are creating, updating, deleting, and so on in the cluster. While get/list operations may be necessary, take note that they significantly increase the number of logs captured. AKS conveniently provides the kube-audit-admin log category, which excludes those read operations so we can focus on the operations we care about:
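If you prefer to script the diagnostic setting rather than use the portal, something roughly like this should work (the setting name, Log Analytics workspace ID, and cluster resource ID are placeholders):

# send kube-audit-admin logs from the AKS control plane to Log Analytics
az monitor diagnostic-settings create \
  --name <DIAGNOSTIC_SETTING_NAME> \
  --resource /subscriptions/<SUBSCRIPTION_ID>/resourceGroups/<RESOURCE_GROUP>/providers/Microsoft.ContainerService/managedClusters/<AKS_CLUSTER_NAME> \
  --workspace <LOG_ANALYTICS_WORKSPACE_RESOURCE_ID> \
  --logs '[{"category": "kube-audit-admin", "enabled": true}]'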
Once this is turned on, we can review the logs and develop a Kusto query to review overall operations on the cluster and the identity who performed that operation.
Here is a sample query that provides a good starting point for understanding the operations being executed against the cluster. We will use this as part of our test to see what the hshahin user has been doing in the cluster:
AzureDiagnostics
| where Category == "kube-audit-admin"
| extend event = parse_json(log_s)
| extend HttpMethod = tostring(event.verb)
| extend ResponseCode = tostring(event.responseStatus.code)
| extend Authorized = tostring(event.annotations["authorization.k8s.io/decision"])
| extend User = tostring(event.user.username)
| extend Groups = tostring(event.user.groups)
| extend Apiserver = pod_s
| extend SourceIP = tostring(event.sourceIPs[0])
| project TimeGenerated, Category, HttpMethod, ResponseCode, Authorized, User, Groups, SourceIP, Apiserver, event
To test this, let's create a pod using a kubectl run command:
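Any pod will do; for example (the pod name and image here are arbitrary placeholders):

# create a test pod so the operation shows up in the audit logs
kubectl run nginx-test --image=nginx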
From there, let's head over to the logs to run the query and see what we find:
We can see the activity generated by hshahin. The query projects helpful columns and extracts additional information from the event JSON object.
Now let's see how this looks from an unauthorized user - we will use the test-user account. Right now the test-user has no permissions to create a pod:
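You can confirm that from the test-user's session before even checking the logs; for example, the following should report "no" for this user:

# ask the API server whether the current identity may create pods
kubectl auth can-i create pods --namespace default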
If we view what this looks like in the logs, we should expect the "Authorized" column to show forbidden. That column comes from a Kubernetes annotation (authorization.k8s.io/decision) applied to the event, and we project it as a column to make the query easier to read:
Summary
I'm hoping this helps improve your security posture with AKS. These are a few simple configurations that get you much closer to applying least privilege and ensuring your cluster is accessed in a controlled manner. There are still other concepts to explore around this topic, including the creation of RBAC roles, as well as thinking through how to secure Service Accounts and traditional Service Principals that we might use for CI/CD and automation workflows that interact with the cluster.