
Securing Kubernetes with Microsoft Defender for Containers

This post dives into Microsoft Defender for Containers and how to secure Kubernetes

Haitham Shahin

05 Aug 2024 • 11 min read

Understanding how to secure Kubernetes and containers is a challenging topic for any group of platform engineers responsible for delivering Kubernetes as a service. The threat matrix is confusing, there are many layers that need to be observed and assessed, and worst of all there is significant tool sprawl in this area - it can be unclear exactly which capabilities are necessary and which of them each tool provides.

To me, that's where a tool like Microsoft Defender for Containers really fits well: it's a few clicks to enable in your environment, it reports back against the threat matrix in a way where you aren't required to understand everything ahead of time, it covers all layers from vulnerability management to runtime/node level protections, and its capabilities can be brought back on-prem and across other cloud environments.

If time is of the essence and you need a way to get started on securing your Kubernetes environments, Defender for Containers is a great fit. This is the first of a handful of posts covering all the details. I'll start here by walking through the key points and capabilities, specifically focusing on hardening Azure Kubernetes Service (AKS). In future posts, we'll cover how to bring this to other clouds and Kubernetes clusters.

Executive Summary on Defender for Containers

The key points on Defender for Containers are shown below.

Pricing

Defender for Containers is priced based on the number of cores running across your clusters that you are hardening. It is $7/VM core/Month.

That covers all capabilities and includes image scanning for 20 unique images for every core covered. Unique here means image digest, so if I have an app:v1 image and then push app:v2 those are two unique images that count against my limit.

To put this into context, in my subscription I have 2 AKS clusters that together have 24 cores running, so I'll be paying $7/month*24 = $168/month. That covers me for 480 unique images that I can push and have continuously scanned.
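If you want to plug in your own numbers, here is a minimal sketch of that arithmetic. The function and constant names are mine, and the figures are simply the list price and image allowance quoted in this post, so treat it as a back-of-the-envelope estimate rather than a billing tool.

```python
PRICE_PER_CORE_USD = 7      # monthly price per core quoted in this post
FREE_IMAGES_PER_CORE = 20   # unique image digests included per core

def monthly_estimate(total_cores: int) -> tuple[int, int]:
    """Return (monthly cost in USD, unique images included) for a core count."""
    if total_cores < 0:
        raise ValueError("total_cores must be non-negative")
    return total_cores * PRICE_PER_CORE_USD, total_cores * FREE_IMAGES_PER_CORE

# The example from this post: two AKS clusters with 24 cores in total.
cost, images = monthly_estimate(24)
print(f"${cost}/month, {images} unique images included")
```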

1) Agentless Vulnerability Management of Container Images and Running Containers

Any image pushed to ACR will be scanned for CVEs. Any image running in AKS that is pulled from ACR will be scanned and trigger an alert if a new CVE is found. "Agentless" here means nothing is deployed to your environment to get this scanning.

2) Agentless Discovery and Control Plane Hardening

Defender will read your Kubernetes configurations, inventory, and audit log to detect any threats or risks that make you vulnerable and allow you to run security posture management queries. "Agentless" here means nothing is deployed to your environment.

3) Defender Agent for Runtime and Node Protection (also called Defender Sensor)

This is a set of pods deployed to each node to detect and log runtime and node level threats. No manual deployment is required: Defender deploys the pods to the cluster once the capability is turned on.

4) Azure Policy for Compliance and Governance

This is a set of pods deployed to the cluster to audit and enforce compliance with Kubernetes deployment and configuration best practices, for example ensuring your containers are not running as root or that deployments have health probes configured. Azure deploys this on your behalf once enabled.

Enabling

You can enable Defender for Containers per subscription through the portal as shown below.

First navigate to Environment Settings under the Defender for Cloud page and select the subscription you want to enable.

Next, simply turn on the Defender for Containers line item to get all these capabilities.

For the non-agentless capabilities, ensure the necessary outbound FQDNs are allowed if your clusters are running in internal VNETs. Details are shown below.

Details on Capabilities

There are four key capabilities Defender for Containers brings. We will walk through each of them in depth below.

Agentless Container Vulnerability Assessment

This capability provides vulnerability management for images stored in ACR and running images in your AKS clusters.

When you turn it on, you will be able to interact with the output of the scans through the UI and through Azure Resource Graph queries. You can select particular images, view each discovered CVE by severity, and then view known remediations and package fixes:

Remediated Version of Package for High CVE

How Exactly Does it Scan?

Defender for Containers pulls the image from the registry and runs it in an isolated sandbox with Microsoft Defender Vulnerability Management for multicloud environments to extract both OS and language-specific vulnerabilities. This is why the scanning is considered "Agentless": nothing is provisioned within your own environment to run the scan. You can even see these pulls in your logs by searching for pull events with the UserAgent of AzureContainerImageScanner.
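As a rough illustration of that log search, the sketch below filters a list of exported log rows for scanner pulls. The row shape (the `OperationName`, `UserAgent`, and `Repository` keys) is an assumption on my part about how you export your registry logs, so adjust the keys to match your own data. Only the AzureContainerImageScanner value comes from this post.

```python
SCANNER_USER_AGENT = "AzureContainerImageScanner"  # value named in this post

def scanner_pulls(events):
    """Return the pull events whose user agent mentions the Defender scanner."""
    matches = []
    for event in events:
        operation = str(event.get("OperationName", "")).lower()
        user_agent = str(event.get("UserAgent", "")).lower()
        if operation == "pull" and SCANNER_USER_AGENT.lower() in user_agent:
            matches.append(event)
    return matches

# Hypothetical rows; the key names are assumptions, not a documented schema.
sample_events = [
    {"OperationName": "Pull", "UserAgent": "AzureContainerImageScanner", "Repository": "app"},
    {"OperationName": "Pull", "UserAgent": "containerd/1.7", "Repository": "app"},
    {"OperationName": "Push", "UserAgent": "docker/24.0", "Repository": "app"},
]

for event in scanner_pulls(sample_events):
    print(event["Repository"], event["UserAgent"])
```

The match is a case-insensitive substring check so that a user agent carrying extra version text would still be caught.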

As mentioned, both OS Packages and Language Packages and Dependencies are scanned and returned in the output.

Image scans are supported even if your ACR has Private Link enabled. Ensure that you allow access to trusted services so that Defender is able to pull and scan your images.

When/How Are Scans Triggered?

The following are the triggers for Defender to scan / rescan an image:

Upon a push/import of an image, a scan is triggered. Results should appear within a few minutes.
Every day, a rescan occurs for any image that meets any of the following:
- Pushed within last 90 days
- Pulled within last 30 days
- Currently running on any Kubernetes cluster monitored by Defender for Cloud (which is known either by the Agentless Discovery or the Defender Sensor)
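To make the daily rescan rule concrete, here is a small sketch of the three conditions as a single predicate. It is only my reading of the list above, not Defender's actual logic. The function name, the parameters, and the choice to treat the 90- and 30-day boundaries as inclusive are all assumptions.

```python
from datetime import datetime, timedelta, timezone

PUSH_WINDOW = timedelta(days=90)  # "pushed within last 90 days"
PULL_WINDOW = timedelta(days=30)  # "pulled within last 30 days"

def is_rescanned_daily(pushed_at, last_pulled_at, running_in_monitored_cluster, now=None):
    """Return True if any one of the three rescan conditions holds."""
    now = now or datetime.now(timezone.utc)
    pushed_recently = pushed_at is not None and now - pushed_at <= PUSH_WINDOW
    pulled_recently = last_pulled_at is not None and now - last_pulled_at <= PULL_WINDOW
    return bool(pushed_recently or pulled_recently or running_in_monitored_cluster)

now = datetime(2024, 8, 5, tzinfo=timezone.utc)
old_push = now - timedelta(days=200)

print(is_rescanned_daily(old_push, None, False, now=now))                     # stale image
print(is_rescanned_daily(old_push, now - timedelta(days=3), False, now=now))  # pulled recently
print(is_rescanned_daily(old_push, None, True, now=now))                      # still running
```

The point to take away is that the conditions are joined with "or": an image pushed long ago keeps getting rescanned as long as it is still being pulled or is still running.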

There is no way to trigger an on-demand scan. However, in a CI/CD flow you can have an action get the results of the scan through the REST API before progressing to, say, pushing that same image to a higher-environment registry. Here is a nice blog post on how to do that.
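The gate itself can be very small once you have the results in hand. The sketch below covers only the decision step and leaves fetching the results out of scope. The finding shape (a list of dicts with `cve` and `severity` keys), the blocking severities, and the function names are all hypothetical, so map them onto whatever your pipeline actually receives.

```python
BLOCKING_SEVERITIES = {"critical", "high"}  # hypothetical gate policy

def blocking_findings(findings):
    """Return the findings severe enough to stop a promotion."""
    return [
        finding for finding in findings
        if str(finding.get("severity", "")).lower() in BLOCKING_SEVERITIES
    ]

def can_promote(findings):
    """Allow promotion only when results exist and none of them is blocking.

    None stands for "no scan result yet" and is treated as a failure,
    so an unscanned image is never promoted by accident.
    """
    if findings is None:
        return False
    return not blocking_findings(findings)

# Placeholder IDs for illustration, not real CVEs.
results = [
    {"cve": "CVE-0000-0001", "severity": "High"},
    {"cve": "CVE-0000-0002", "severity": "Low"},
]
print(can_promote(results))
```

Since there is no on-demand scan, a real pipeline step would likely need to wait and retry until results appear after the push. Failing closed on a missing result is the design choice I would start with.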

How Does it Scan Running Images?

This capability is more of a link than a distinct scan. The way it works is your images must be pushed to one of the monitored registries, which is where it gets scanned. Then, as long as the "Agentless Discovery" capability is enabled or the "Defender Sensor" is enabled, Defender will be able to link that image to any image running in a monitored Kubernetes environment.

From there, Defender can show you which running images have vulnerabilities and will continue to rescan those images every day since it sees they are in use.
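Conceptually, the link is a join between what the registry scan found and what the cluster is running. The sketch below assumes the join key is the image digest. This post does not confirm how Defender matches images internally, so take the digest matching, the data shapes, and the names as my own illustration.

```python
def vulnerable_running_images(running_digests, scan_results):
    """Map each running digest to its known CVEs, skipping clean or unscanned images."""
    linked = {}
    for digest in sorted(set(running_digests)):
        cves = scan_results.get(digest)
        if cves:
            linked[digest] = list(cves)
    return linked

# Placeholder digests and CVE IDs for illustration only.
registry_scans = {
    "sha256:aaa": ["CVE-0000-0001"],  # scanned, vulnerable
    "sha256:bbb": [],                 # scanned, clean
}
# "sha256:ccc" is running but was never pushed to a monitored registry.
running = ["sha256:aaa", "sha256:bbb", "sha256:ccc", "sha256:aaa"]

print(vulnerable_running_images(running, registry_scans))
```

Notice that the unscanned image silently drops out of the result, which is exactly why images need to live in a monitored registry for this capability to see them.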

Running Images Detected with Vulnerabilities Recommendation in Defender

Agentless Discovery for Kubernetes

What this means is that Defender for Containers can query your Kubernetes API server to inventory the resources deployed to your Kubernetes environment (pods, services, namespaces, etc.). It also allows Defender to do Control Plane Hardening, meaning Defender will detect suspicious activity for Kubernetes based on your audit trail or configuration applied to the control plane.

The "agentless" piece here means nothing is deployed to your cluster to get this functionality. As long as you allow trusted access to your cluster (even for a private/internal cluster), Defender will create a managed identity in Azure and assign itself permission to call your Kubernetes API Server - read permissions only. This all happens behind the scenes, but if you are interested in seeing the roles it provisions, you can find that information here.

What Type of Alerts are Detected?

As an example, maybe a pod in the cluster was given a service account to be able to spawn other pods to run within the environment. If this container gets compromised and that service account is used for abnormal operations against the API Server, Defender will capture that and create the 'Abnormal Kubernetes service account operation detected' alert. The full list of Control Plane alerts detected can be found here and are prefixed with "K8S_".

What Can I Do with the Collected Inventory?

A really cool benefit of having this capability enabled is that you can inventory what is deployed in your cluster and ask questions that help remediate risk. For example, let's say you want to identify any pod/deployment in your cluster that is running with privileges.

You can build a query using fluent language as shown and get a list of results back with the name of the workload and cluster/namespace in which it is deployed:

Query to Return All Privileged Containers
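To show what such a query is checking for under the hood, here is a local sketch that looks for privileged containers in pod manifests loaded as plain dictionaries. This is not how Defender runs the query. It only illustrates the Kubernetes field involved (`securityContext.privileged` on a container), and the function name and sample pods are mine.

```python
def privileged_containers(pods):
    """Return (namespace, pod, container) for every container marked privileged."""
    hits = []
    for pod in pods:
        metadata = pod.get("metadata") or {}
        spec = pod.get("spec") or {}
        containers = (spec.get("containers") or []) + (spec.get("initContainers") or [])
        for container in containers:
            security_context = container.get("securityContext") or {}
            if security_context.get("privileged") is True:
                hits.append((
                    metadata.get("namespace", "default"),
                    metadata.get("name"),
                    container.get("name"),
                ))
    return hits

# Hypothetical pod manifests for illustration.
sample_pods = [
    {
        "metadata": {"name": "node-agent", "namespace": "kube-system"},
        "spec": {"containers": [{"name": "agent", "securityContext": {"privileged": True}}]},
    },
    {
        "metadata": {"name": "web", "namespace": "shop"},
        "spec": {"containers": [{"name": "nginx"}]},
    },
]

for namespace, pod_name, container_name in privileged_containers(sample_pods):
    print(f"{namespace}/{pod_name}: {container_name}")
```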

Defender Sensors (also called Defender Agents)

The Defender sensor capability is deployed into your cluster as a DaemonSet (which means there is a pod running on each node within the cluster). These sensors collect signals from the host node running your containers to provide runtime-level protection and alerts.

What Type of Threats Are Detected?

This will detect and alert on suspicious activity at the node and process level. For example, there is an alert that can detect if a process running within a container or on a node is traditionally associated with digital currency mining: 'Digital currency mining related behavior detected'. This type of alert is only possible if you have an agent on the node monitoring running processes, which is distinct from the agentless capability since that only has access to your Kubernetes API Server audit log.

All the alerts that can be detected by this capability are listed here with a "K8S.NODE_" prefix.
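The two prefixes make it easy to tell which capability raised an alert when you process alerts downstream. Here is a small sketch of that routing. The prefixes come from this post, while the function name, the returned labels, and the sample alert names are placeholders of mine.

```python
def alert_source(alert_type: str) -> str:
    """Classify an alert type by the capability that produced it."""
    if alert_type.startswith("K8S.NODE_"):
        return "sensor"          # runtime/node level, needs the Defender Sensor
    if alert_type.startswith("K8S_"):
        return "control-plane"   # agentless, based on the audit log
    return "other"

# Placeholder alert names; only the prefixes are taken from this post.
for name in ["K8S_ExampleControlPlaneAlert", "K8S.NODE_ExampleRuntimeAlert", "VM_ExampleAlert"]:
    print(name, "->", alert_source(name))
```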

The alerts will come up as recommendations anytime they are triggered. The key benefit of enabling the Defender Sensors is that you are getting much deeper insights into any threats in your environment since you are now getting runtime level data.

What Are the "Sensors"?

The Sensors really mean that a DaemonSet is deployed into the cluster. All the details and resource requirements of the deployed components are shown below.

As expected, the DaemonSet collecting data from the hosts does have privileged capabilities; otherwise it would not be able to collect and detect malicious processes running on the host. This is a common pattern for any privileged capability in Kubernetes. For example, CSI drivers also run with higher privileges in order to mount storage to host nodes.

The following FQDNs must be allowed outbound for the sensors to send data back to Defender in Azure:

Azure Policy on Kubernetes

Azure Policy is actually an AKS add-on (and also an Azure Arc extension) that can be deployed even outside of the Defender for Containers suite. Azure Policy provides a way to:

Audit configurations applied to your cluster against a set of baseline best practices
Deny deployments that do not meet the policies applied to your environment

How is this Different from Azure Policy?

Conceptually it is the same as traditional Azure Policy, but the governance is applied at the Kubernetes level and not at the Azure level, meaning you can enforce that deployments within your cluster meet your compliance standards.

For example, you most likely want to enforce that your clusters are only running images from registries you explicitly trust. In audit mode, Azure Policy on Kubernetes will report anything "unhealthy", meaning you are running a container whose image is not from the list of trusted registries. If you shift the policy to deny mode, Kubernetes will block any deployment whose image does not come from a trusted registry.

Convert the Policy to Deny Mode to block Non-Compliant Deployments
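The difference between the two modes is easiest to see in code. The sketch below is a simplified stand-in for the trusted-registry rule, not the built-in policy itself. The registry parsing, the exact-host comparison, the function names, and the sample registry are all assumptions for illustration.

```python
def image_registry(image: str) -> str:
    """Return the registry host of an image reference (Docker Hub when none is given)."""
    first, separator, _ = image.partition("/")
    if separator and ("." in first or ":" in first or first == "localhost"):
        return first
    return "docker.io"

def evaluate(image: str, trusted_registries, mode: str = "audit"):
    """Return (admitted, compliant) for an image under audit or deny mode."""
    if mode not in ("audit", "deny"):
        raise ValueError("mode must be 'audit' or 'deny'")
    compliant = image_registry(image) in trusted_registries
    admitted = compliant or mode == "audit"  # audit only reports, deny blocks
    return admitted, compliant

trusted = {"contoso.azurecr.io"}  # hypothetical trusted registry

print(evaluate("contoso.azurecr.io/app:v1", trusted, mode="deny"))
print(evaluate("nginx:latest", trusted, mode="audit"))
print(evaluate("nginx:latest", trusted, mode="deny"))
```

In audit mode the untrusted image is still admitted and simply shows up as non-compliant. Only deny mode turns the same finding into a blocked deployment, which is why starting in audit mode is a low-risk way to see what would break.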

What Policies Should We Apply?

There is a full set of built-in Kubernetes Policies and Initiatives already defined here. You can apply these initiatives to distinct clusters or to your subscription and all clusters under that subscription will be covered.

You can also customize or even build your own policies - let's say you want all deployments to include a tag named 'env' with values ranging from ['dev','test','prod']. A custom policy would allow you to enforce that.
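To sketch the logic such a custom policy would carry, here is the 'env' rule written as a plain check against a manifest dictionary. I am assuming the "tag" is a Kubernetes label under `metadata.labels`. A real custom policy is authored in the policy engine's own format rather than Python, so this only shows the rule, and the function name and sample manifests are mine.

```python
ALLOWED_ENVS = {"dev", "test", "prod"}  # values from the example in this post

def env_label_violation(manifest):
    """Return a violation message, or None when the 'env' label is valid."""
    labels = (manifest.get("metadata") or {}).get("labels") or {}
    value = labels.get("env")
    if value is None:
        return "missing 'env' label"
    if value not in ALLOWED_ENVS:
        return f"'env' label value {value!r} is not one of {sorted(ALLOWED_ENVS)}"
    return None

# Hypothetical deployment manifests for illustration.
good = {"kind": "Deployment", "metadata": {"name": "api", "labels": {"env": "prod"}}}
bad = {"kind": "Deployment", "metadata": {"name": "api", "labels": {"env": "staging"}}}
unlabeled = {"kind": "Deployment", "metadata": {"name": "api"}}

for manifest in (good, bad, unlabeled):
    print(env_label_violation(manifest))
```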

I recommend starting with the built-in policies and initiatives as those will get you aligned to standard best practices. From there, you can customize and further refine your organization's policies.

What Gets Deployed to the Cluster?

These Azure Policy pods run in a distinct gatekeeper-system namespace and require outbound access to the following FQDNs to report back compliance:

Azure Policy Pods running in gatekeeper-system

Required Outbound FQDNs for Azure Policy Pods

Details on Pricing

Pricing is based on the number of VM cores running in your cluster. When you go to enable Defender for Containers, you should see an estimated cost based on the number of reported cores running:

$7/ Kubernetes vCore/month
20 free images scanned per core covered. Any additional scan beyond that limit is $0.29 per image digest. The 20 free images per core is designed to meet 90%+ of customers' needs. The limit is based on the prior month's core usage (meaning if last month I had 10 cores, then I have 200 free images included this month).
You do not need Defender for Servers enabled on the Kubernetes cluster nodes
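Here is a minimal sketch of how the free allowance and the overage fit together. It assumes each unique image digest beyond the allowance is charged once at the quoted $0.29, which is my reading of the numbers above rather than a statement of how billing is actually metered. The names are mine.

```python
FREE_IMAGES_PER_CORE = 20     # free image digests per core, from this post
OVERAGE_CENTS_PER_IMAGE = 29  # $0.29 per additional image digest

def scan_overage_usd(prior_month_cores: int, unique_images_scanned: int) -> float:
    """Return the extra scanning charge in USD beyond the free allowance."""
    free_images = prior_month_cores * FREE_IMAGES_PER_CORE
    extra_images = max(0, unique_images_scanned - free_images)
    # Work in whole cents to avoid floating-point drift on the unit price.
    return extra_images * OVERAGE_CENTS_PER_IMAGE / 100

print(scan_overage_usd(10, 200))  # exactly at the 200-image allowance
print(scan_overage_usd(10, 250))  # 50 images over the allowance
```

Note that the allowance is driven by last month's cores, so a cluster that scaled down recently still carries its larger allowance for one more month.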

How Do the Image Scans Work?

The 20 free images per core means that I can push 20 unique images to my registry for a given core and anytime a scan is triggered for those images, I am covered. If my cluster has 24 cores, then I can push up to 480 unique images into the registry per month and every time those same images get scanned (based on the triggers defined earlier) I am covered.

If you're concerned about the scans or want to see a deeper estimation, you can deploy the Microsoft Defender for Containers - Cost Estimation Dashboard as a workbook to get a breakdown based on prior usage. It will show the number of scans against the free scans included so you can see how close you are to the free limit.

In Terms of Data, Am I Charged in Any Way?

No - everything is included in the $7/core/month.

Data is not stored/retained/managed in any customer-owned workspace. It is all managed and operated by Defender for Containers and you interact with the data through the recommendations, alerts, and Azure Resource Graph.

We will dive deeper in a future post on how to interact, export, and take action against that data.

Details on Enabling

Enabling Defender for Containers is similar to enabling other Defender plans. You can do so through the portal on a per-subscription basis.

First, navigate to Defender for Cloud and click on Environment Settings. Select the subscription where you want to enable Defender for Containers:

Navigate to Subscription in Defender Page

Next, on the Defender plans page, select On to enable all capabilities for every cluster within the subscription.

If you want to enable only certain capabilities, you can click on Settings under coverage and select the ones you specifically want turned on:

Turn On/Off Settings for Defender for Containers

Summary

Enabling Defender for Containers is easy and quickly provides a lot of value to any team struggling to identify a good path for securing their clusters. As a next step, we will take this and show how it can be expanded to any Kubernetes environment, not just those running on AKS. Additionally, we will take a deeper dive into how to query the information to produce actionable next steps: alerts, triggers for downstream actions, etc.