Understanding Azure AD with Azure RBAC on AKS

A full walkthrough of Azure AD Integration with AKS.


In this post we'll go through what it means to integrate AKS with AAD (Entra) and also how RBAC works when we enable Azure RBAC for AKS.

Step 1 - Understanding the Default AKS Authentication Mechanism

As demonstrated in my prior post on Auditing Access to AKS, the default configuration of a Local Account means that as long as I can pull the AKS credential (i.e. by running az aks get-credentials), I am a cluster admin from AKS's point of view. Behind the scenes, this populates the kube-config file with a certificate that authenticates you to the cluster. It should look something like the following:

Running az aks get-credentials populates the kube-config file with a certificate when local accounts are enabled

This means that the only "RBAC" you have at your disposal is whether or not an identity can pull the credential - once they have the credential though, they are admins within the cluster itself.

The following two roles allow the list credential action, and when you are using the Local Account configuration, there is no difference between these two roles despite the name of Cluster Admin vs Cluster User:

AKS Cluster Roles for Pulling the Kube-Config

We have only discussed human identities at this point, but also recognize that in this configuration you may also be running az aks get-credentials from a CI/CD pipeline, which means the pipeline's identity would also be cluster admin like a human account.

Step 2 - Integrate Azure AD Authentication with AKS

Let's go ahead and integrate Azure AD with Azure RBAC. We'll work through what this means as it gets set up. As shown below, make sure you don't select the Kubernetes local account, since that leaves a backdoor for logging in as the prior cluster admin user:

Configure Azure AD Authentication with Azure RBAC. Be sure to not include the local account to avoid a backdoor
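The same configuration can also be applied from the command line. The following is a minimal sketch, assuming the --enable-aad, --enable-azure-rbac, and --disable-local-accounts flags of az aks update; the function name is hypothetical and only there to keep the arguments together:

```shell
# Hypothetical helper: turn on Azure AD authentication with Azure RBAC
# and switch off the local (certificate-based) accounts in one update.
# $1 = resource group, $2 = cluster name
enable_aad_azure_rbac() {
  az aks update \
    --resource-group "$1" \
    --name "$2" \
    --enable-aad \
    --enable-azure-rbac \
    --disable-local-accounts
}
```

Calling it as enable_aad_azure_rbac rg-aks-otlp aks-east would target the cluster used later in this post. Disabling local accounts is what closes the backdoor described above.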

Once this is configured, you will see that when the credential is pulled, the kube-config file looks different from the local account version. It no longer contains the private key used for authentication; instead, we are now using a credential plugin CLI tool called kubelogin to obtain a token from Azure AD before a request gets sent to AKS:

kube-config file after setting up Azure AD Authentication
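To tell the two kinds of kube-config apart without reading the whole file, a small check like the one below can help. It is a sketch only: the function name is hypothetical, and it assumes the local-account file carries a client-certificate-data entry while the Azure AD file carries an exec entry whose command is kubelogin:

```shell
# Hypothetical helper: report how a kube-config file authenticates.
# $1 = path to the kube-config file
kubeconfig_auth_mode() {
  if grep -q 'client-certificate-data:' "$1"; then
    # Local account: certificate and private key are embedded in the file
    echo "client-certificate"
  elif grep -Eq 'command:[[:space:]]*kubelogin' "$1"; then
    # Azure AD: an exec plugin fetches a token before each request
    echo "kubelogin"
  else
    echo "unknown"
  fi
}
```

For example, kubeconfig_auth_mode ~/.kube/config. If the file holds several clusters, the first matching pattern wins, so treat the answer as a quick hint rather than a per-context report.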

What happens from here is that when you run any kubectl command such as kubectl get pods you will see that by default the kubelogin plugin will first have you authenticate with AAD:

Initial kubectl command after shifting to AAD Authentication

Kubelogin has different login modes, which you will need to review so that you can use the proper one for the environment you're working in. As you can see in the kube-config file above, it defaults to a devicecode login. However, since I'm already logged in with the AZ CLI and therefore already have an access token in my shell, I can run kubelogin convert-kubeconfig to switch to the azurecli login mode, which uses the already logged-in context when obtaining a token to pass to AKS:

Converting to the Azure CLI Login Mode allows for kubelogin to use the logged-in context when making kubectl commands. Notice that no prompt to authenticate appears after running kubectl get pods
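In command form, the conversion looks roughly like this. The sketch assumes you are already signed in with az login; the function name is hypothetical:

```shell
# Hypothetical helper: pull the kube-config, then point kubelogin at the
# token the Azure CLI already holds instead of the device-code flow.
# $1 = resource group, $2 = cluster name
use_azure_cli_login() {
  az aks get-credentials --resource-group "$1" --name "$2" --overwrite-existing &&
    kubelogin convert-kubeconfig -l azurecli
}
```

After use_azure_cli_login rg-aks-otlp aks-east, a kubectl get pods should no longer prompt for a device code.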

Step 3 - Understanding Azure RBAC

Now that we've set up authentication with AAD, let's revisit how to apply RBAC within AKS. As mentioned before, prior to this point there was no default RBAC occurring within AKS (for example, you had no way of specifying that Dev Team A can only deploy into the Dev-A namespace).

In the prior step, you'll notice that we updated our cluster to use Azure AD Authentication with Azure RBAC. What this means is that we can actually use our traditional Azure RBAC system to grant roles and permissions within AKS itself.

To demonstrate how this works, I have a new user called the test-aad-user that currently has no permissions to the cluster. To prove that point, if I even try to run az aks get-credentials as the test-user, you will find that they cannot pull down the kube-config file:

test-aad-user is not authorized to pull the kube-config since they have not been granted to list the cluster credential

So first step, before even getting to RBAC within AKS, is that I need to allow the test user to list the cluster credential. Keep in mind that this simply sets up the kube-config file, but doesn't "pull" any real credential:

Allow Test-User to List Cluster Credential
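The same grant can be made from the CLI. This is a sketch under the assumption that the built-in "Azure Kubernetes Service Cluster User Role" is the role being assigned at the cluster scope; the function name is hypothetical:

```shell
# Hypothetical helper: let an identity download the kube-config by granting
# the built-in "Azure Kubernetes Service Cluster User Role" on the cluster.
# $1 = object ID of the identity, $2 = resource group, $3 = cluster name
allow_list_credential() {
  # Look up the cluster's resource ID to use as the assignment scope
  aks_id="$(az aks show --resource-group "$2" --name "$3" --query id --output tsv)" || return 1
  az role assignment create \
    --role "Azure Kubernetes Service Cluster User Role" \
    --assignee "$1" \
    --scope "$aks_id"
}
```

This only lets the identity set up its kube-config; it grants nothing inside the cluster.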

Once granted that role, the test-user can then run the az aks get-credentials command. We will also run the kubelogin convert-kubeconfig command so that kubelogin uses the AZ CLI access token already local to the shell.

When I try to run any kubectl commands, however, you'll notice that the test-user is forbidden. This is what we expect at this stage. The test-user has no permissions within AKS itself; all the test-user can do is authenticate to the cluster:

The Test-User can pull the credential but not run any commands within the cluster

This is where the innovative part of integrating AKS with AAD comes in - instead of only using the traditional Kubernetes RBAC approach (which I can still leverage alongside Azure RBAC), I can now offload in-cluster authorization to Azure AD. As described below, you can still use Kubernetes RBAC, but many struggle with it given its complexity, so Azure RBAC often gets you a long way toward your desired RBAC goals.

The conceptual flow is that when a user sends an API Request (i.e. a kubectl command) they will first authenticate with AAD if they have not done so - in our case we have since we're using the AZ CLI token login mode - and then we are authorized by AAD for the API request we are trying to run based on the roles we've been granted on AAD for what we can do within AKS. Notice that Kubernetes RBAC has not gone away, so you can actually combine both systems for finer-grained roles. For example, you may use Azure RBAC to designate different AAD Groups to different namespaces, and then within the cluster you can apply traditional Roles and RoleBindings to further define what each team member can execute:

Azure RBAC for Kubernetes authorization flow
AKS Identity and Access

In terms of the roles AAD provides for in-cluster actions, you will find that four roles already exist that you can grant users at either a cluster-wide scope or a namespace scope. We often want to use these at the namespace scope, since that gives you more governance when using your cluster in a shared manner. Notice that each of the roles has RBAC in the name:

AKS Built-In RBAC Roles
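Since the full role names are long and must be passed exactly to az role assignment create, a tiny lookup can save some typing. The short names and the function name below are hypothetical; the four full names are the built-in Azure roles:

```shell
# Hypothetical helper: map a short name to the full built-in role name
# expected by `az role assignment create --role`.
aks_rbac_role() {
  case "$1" in
    reader)        echo "Azure Kubernetes Service RBAC Reader" ;;
    writer)        echo "Azure Kubernetes Service RBAC Writer" ;;
    admin)         echo "Azure Kubernetes Service RBAC Admin" ;;
    cluster-admin) echo "Azure Kubernetes Service RBAC Cluster Admin" ;;
    *)             echo "unknown role: $1" >&2; return 1 ;;
  esac
}
```

For example, aks_rbac_role writer prints the name used for the namespace-scoped assignment later in this post.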

Step 4 - Granting Roles to Users for In-Cluster Authorization

Let's grant the test-user Reader first at the cluster-wide scope, and then at a namespace scope.

When defining roles at the cluster-wide scope, I can use the Access-Control blade on the portal as usual:

Access Control Blade on AKS
Assign Reader to test-user. This is applied cluster-wide since in the portal there is no way to scope it to a namespace

Once granted, I can now run kubectl get commands; however, notice from above that the Reader role does not allow the user to read secrets. The example below demonstrates how the test-user can view all namespaces, see pods running in a particular namespace (I have Istio deployed in this cluster), but cannot read the secrets in the istio-system namespace:

test-user can read in the cluster, but not secrets. They will also not be able to deploy in the cluster

Let's take it one step further and say that among the list of namespaces shown from above, I want the test-user to be able to deploy/delete resources within the httpbin namespace. As a reader, the test-user cannot for example deploy, update, or delete pods that currently exist, so we will need to change the role they are assigned:

As a Reader, the test-user cannot modify resources in any namespace, including deletions, updates, and creations

The built-in Writer role is a good role to add for the test-user, so let's see how we can grant that role specifically for the httpbin namespace. For this piece, we cannot use the portal since that would be scoped at the cluster, meaning that the test-user would be Writer across every namespace. We will need to use the AZ CLI if we want to scope the test-user only to the httpbin namespace.

The generic form of granting a role to an identity in AAD at a namespace scope looks like the following. Keep in mind that the --assignee is not limited to human identities; you may have a service principal or managed identity in Azure that you want to be able to interact with AKS:

# generic command
# the AAD-ENTITY-ID is the object ID of the identity
az role assignment create --role "RBAC ROLE NAME" --assignee <AAD-ENTITY-ID> --scope $AKS_ID/namespaces/<namespace-name>

For the test-user, the command looks as follows:

# replace OBJECT-ID with object ID of user
# replace SUBID with subscription id where cluster is deployed

az role assignment create --role "Azure Kubernetes Service RBAC Writer" --assignee OBJECT-ID --scope /subscriptions/SUBID/resourceGroups/rg-aks-otlp/providers/Microsoft.ContainerService/managedClusters/aks-east/namespaces/httpbin
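The only difference from a cluster-wide assignment is the /namespaces/<namespace-name> suffix on the scope. A minimal sketch of that pattern, with hypothetical function names, could look like this:

```shell
# Hypothetical helpers for namespace-scoped Azure RBAC assignments.

# Build the scope string for one namespace.
# $1 = cluster resource ID, $2 = namespace
namespace_scope() {
  printf '%s/namespaces/%s\n' "$1" "$2"
}

# Grant a role to an identity on a single namespace.
# $1 = role name, $2 = object ID, $3 = cluster resource ID, $4 = namespace
grant_namespace_role() {
  az role assignment create \
    --role "$1" \
    --assignee "$2" \
    --scope "$(namespace_scope "$3" "$4")"
}
```

Keeping the scope construction in one place makes it harder to accidentally drop the suffix and grant the role across every namespace.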

After this is applied, let's re-run the commands from above to see what I can do. Keep in mind that I have not yet removed the cluster-wide Reader role, so at this point the test-user has two permissions, Reader at cluster-scope and Writer only on the httpbin namespace:

The test-user can Write on httpbin namespace

As a Writer, I can also create/delete secrets but only in the httpbin namespace. I still can't read/write secrets in other namespaces:

The Writer can create/delete Secrets in the specified namespace only
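A quick way to confirm this kind of boundary without actually creating anything is kubectl auth can-i. The sketch below loops over a few namespaces; the function name is hypothetical, and it assumes the answers reflect the Azure RBAC assignments for whoever is currently logged in:

```shell
# Hypothetical helper: print whether the current identity may perform
# a verb on a resource in each of the given namespaces.
# $1 = verb, $2 = resource, remaining arguments = namespaces
check_access() {
  verb="$1"; resource="$2"; shift 2
  for ns in "$@"; do
    # can-i exits non-zero when the answer is "no", so ignore the status
    answer="$(kubectl auth can-i "$verb" "$resource" --namespace "$ns")" || true
    printf '%s\t%s %s: %s\n' "$ns" "$verb" "$resource" "$answer"
  done
}
```

For example, check_access create secrets httpbin istio-system prints one line per namespace.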

So far you've seen me only grant roles to users, but everything you've seen here applies the same way to AAD Groups. For example, if I have a set of cluster-admins in a group, I can give them the Cluster Admin RBAC role:

Assigning RBAC Role to AAD Group

Likewise, I can scope a group to a namespace as done above. Just make sure you specify the object ID of the AAD Group as the --assignee.

Step 5 - (Advanced) Combining with Kubernetes RBAC

To start this section, I'm going to remove the test-user's cluster-wide Reader role. From there, we'll better understand the interaction with Kubernetes RBAC and Azure RBAC:

Remove Cluster-Wide Reader role for test-user

Once removed, the test-user can no longer see anything outside of httpbin; the only access it has left is Writer on the httpbin namespace:

test-user cannot see anything outside of httpbin

Let's now create a Kubernetes RBAC ClusterRole that allows the test-user to read pods across the cluster:

Create ClusterRole and ClusterRoleBinding for test-user
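A manifest along these lines would define such a role and binding. It is a sketch, not the exact manifest used here: the resource names are hypothetical, and whether the subject name should be the user's UPN or object ID is an assumption to verify for your tenant:

```shell
# Write a ClusterRole that can only read pods, plus a binding for one user.
# AAD_USER is a placeholder (assumption): the user's UPN or object ID.
AAD_USER="${AAD_USER:-OBJECT-ID-OR-UPN}"

cat > pod-reader.yaml <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: pod-reader-test-user
subjects:
- kind: User
  name: "${AAD_USER}"
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
EOF
```

An identity that is allowed to manage RBAC in the cluster can then apply it with kubectl apply -f pod-reader.yaml.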

Now, as the test-user, if I try to see pods in a different namespace, I can do it because of the Kubernetes role assigned above. However, I cannot run kubectl get ns, because listing namespaces was not part of the Pod-Reader ClusterRole defined above:

test-user can get pods, but not read anything else across the cluster

This is an important point to note: Kubernetes RBAC still applies even when we enable Azure RBAC. Kubernetes RBAC can get you finer-grained roles and permissions as shown above - we were able to define a Pod-Reader role which applied at the cluster-scope. This is more detailed than the built-in Azure Reader RBAC role.

You likely can get in a very good state just sticking to Azure RBAC, but I wanted to show this point should you have finer-grained requirements.

Step 6 - Diagnostic Logs for Auditing Access

Beyond the natural security benefit of integrating with AAD for authentication, we also now have the ability to audit and see who is doing what within the cluster through the AKS Diagnostic Logs, specifically the kube-audit-admin table (which is a subset of kube-audit that will ignore gets and lists). In the examples above, you'll notice for example that the test-user at one point made a secret within the httpbin namespace called db-user-pass. Let's see if we can see that in the logs.

First, make sure that you are capturing the kube-audit-admin logs in your Diagnostic Settings:

Enable the Kubernetes Audit Admin Logs

Next, let's run a query in Log Analytics with a few filters to see if we can get to that point in time where we created the secret. You have to work with the Kusto query a bit, but eventually I put enough filters there to narrow down to when I created the secret in the namespace:

Audit Log for test-user creating a secret in httpbin namespace
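As a starting point for such a query, the sketch below searches the audit entries for a resource name from the CLI. It assumes the diagnostic setting writes to the AzureDiagnostics table (with the audit payload in the log_s column) and that az monitor log-analytics query is available; with resource-specific tables the table and column names would differ. The function name is hypothetical:

```shell
# Hypothetical helper: search the kube-audit-admin logs for a resource name.
# Assumes logs land in the AzureDiagnostics table (an assumption to verify).
# $1 = Log Analytics workspace ID (GUID), $2 = text to look for
find_in_audit_log() {
  query="AzureDiagnostics
| where Category == 'kube-audit-admin'
| where log_s contains '$2'
| project TimeGenerated, log_s
| order by TimeGenerated desc"
  az monitor log-analytics query --workspace "$1" --analytics-query "$query"
}
```

For example, find_in_audit_log WORKSPACE-ID db-user-pass would list the entries that mention the secret created earlier; further filters on the verb or user can be added to the query to narrow it down.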

You can do a lot more with this audit log; this is just one example of how you can use it to understand what is being modified in AKS and which identity is doing so.

Step 7 - Non-User Identities

Up to this point we've mainly focused on human identities. What happens now with AAD Integration with let's say a GitHub Actions Pipeline? First, the identity of the pipeline needs to authenticate to AAD to call the AKS API Server for deployments. This is done using kubelogin just like we did in our terminal. The following example shows how this can be done securely using AAD Federated Identity in your GitHub Actions Workflow.
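Inside a pipeline job, the cluster-facing steps can be sketched in shell as follows. This assumes an earlier step has already signed the pipeline's identity in to Azure (for example through a federated credential), so the Azure CLI holds a token that kubelogin can reuse; the function name and arguments are hypothetical:

```shell
# Hypothetical deploy step for a pipeline that has already signed in to Azure,
# so the Azure CLI holds a token for the pipeline's identity.
# $1 = resource group, $2 = cluster name, $3 = namespace, $4 = manifest path
deploy_to_namespace() {
  az aks get-credentials --resource-group "$1" --name "$2" --overwrite-existing &&
    kubelogin convert-kubeconfig -l azurecli &&
    kubectl apply --namespace "$3" -f "$4"
}
```

No secret is stored in the job for the cluster itself; the kubectl call succeeds only if the pipeline's identity has been assigned a suitable role on that namespace.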

Once logged in, Azure RBAC governs what happens in-cluster. There's no difference from AKS's perspective as to whether we are a human or an automation account. We simply need to make sure that whatever identity the pipeline uses is assigned the necessary roles for the actions it plans to take. For example, if it plans to do deployments in the httpbin namespace, we could provide it the Writer role.

One other type of account to take note of is Kubernetes Service Accounts. These are accounts given to pods running within AKS, and they grant those pods access to the API Server. They are not integrated with AAD; they are in-cluster only. Thus, Azure RBAC will not apply to these accounts - the only way to limit the scope of these service accounts is to use native Kubernetes RBAC. The docs on Azure RBAC also call this out:

Azure RBAC for Kubernetes Authorization

Here is a good article on the Kubernetes docs on how to configure service accounts assigned to pods.

Summary

In short, this guide should get you started on integrating with AAD (Entra) and getting more familiar with how to apply RBAC to your AKS Cluster.