Kubernetes resource utilisation in the AI era

Cast AI’s 2026 State of Kubernetes Optimisation Report examines Kubernetes resource utilisation and identifies areas of inefficiency in GPU, CPU, and memory usage across clusters.

Posted Monday, 27th April 2026 in Cloud Infrastructure + Hardware by Sophie Milburn

Cast AI has published its 2026 State of Kubernetes Optimisation Report, which analyses GPU, CPU, and memory utilisation across non-optimised Kubernetes clusters. The report draws on data from tens of thousands of clusters.

One of the key findings is the underutilisation of GPUs relative to their cost, with average usage recorded at 5%. The report notes that expected efficiency gains associated with Kubernetes at scale are not consistently being realised, and that there is a widening gap between organisational spend and actual resource utilisation.

While Kubernetes adoption continues to grow across organisations, the data shows relatively low resource utilisation. In 2025, average CPU utilisation across clusters was 8%, and memory utilisation was 20%.

The report also highlights the increasing use of GPU-equipped nodes as organisations run more AI and machine learning workloads on Kubernetes. Yet with average GPU utilisation at around 5%, most of the capacity bought for AI infrastructure goes unused.
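To put the quoted averages in perspective, the short Python sketch below converts each utilisation figure into the share of provisioned capacity left idle. The utilisation values are those quoted from the report; the function names and the monthly spend figure are hypothetical, and the cost estimate assumes spend scales linearly with provisioned capacity, which is a simplification rather than anything the report states.

```python
# Average utilisation figures quoted from the report, as fractions of
# provisioned capacity.
REPORTED_UTILISATION = {"cpu": 0.08, "memory": 0.20, "gpu": 0.05}


def idle_share(utilisation: float) -> float:
    """Return the fraction of provisioned capacity that sits unused."""
    if not 0.0 <= utilisation <= 1.0:
        raise ValueError("utilisation must be between 0 and 1")
    return 1.0 - utilisation


def idle_spend(monthly_spend: float, utilisation: float) -> float:
    """Estimate spend attributable to idle capacity.

    Assumption (not from the report): cost is proportional to
    provisioned capacity, so idle spend = spend * idle share.
    """
    return monthly_spend * idle_share(utilisation)


if __name__ == "__main__":
    for resource, used in REPORTED_UTILISATION.items():
        print(f"{resource}: {idle_share(used):.0%} of provisioned capacity idle")
    # Hypothetical GPU bill, purely for illustration.
    print(f"idle GPU spend on a 10,000 bill: {idle_spend(10_000, 0.05):,.0f}")
```

Real bills also include storage, networking, and committed-use discounts, so the linear estimate is best read as a rough upper-level indicator rather than a precise figure.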

As organisations expand their use of AI, the report points to the need for ongoing optimisation to manage GPU usage more effectively and reduce waste within cloud environments.

Cast AI also identifies a common limitation in Kubernetes environments: configuration is often treated as a one-time activity at deployment. However, workloads and traffic patterns change over time, meaning initial configurations may become less effective. The same applies to areas such as Spot Instance selection, autoscaler configuration, commitment usage, and node lifecycle management, all of which require ongoing adjustment that can be difficult to maintain manually at scale.
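One way to picture configuration as a continuous rather than one-time activity is a periodic rightsizing check. The Python sketch below is illustrative only and is not Cast AI's method: it assumes usage samples (for example CPU millicores) have already been collected for a workload, and the function names, the 95th-percentile target, the 15% headroom, and the 10% drift tolerance are all hypothetical choices.

```python
import math
from typing import Sequence


def recommend_request(samples: Sequence[float],
                      percentile: float = 0.95,
                      headroom: float = 0.15) -> float:
    """Suggest a resource request from observed usage samples.

    Uses the nearest-rank percentile of the samples, then adds a
    safety margin. All defaults are illustrative assumptions.
    """
    if not samples:
        raise ValueError("need at least one usage sample")
    ordered = sorted(samples)
    # Nearest-rank percentile: the value at position ceil(p * n), 1-indexed.
    rank = max(1, math.ceil(percentile * len(ordered)))
    return ordered[rank - 1] * (1.0 + headroom)


def needs_update(current: float, recommended: float,
                 tolerance: float = 0.10) -> bool:
    """Flag a workload whose configured request has drifted from the
    recommendation by more than `tolerance` (relative to current)."""
    return abs(current - recommended) > tolerance * current


if __name__ == "__main__":
    usage = [120, 135, 110, 150, 140, 125, 130, 145]  # hypothetical millicores
    suggestion = recommend_request(usage)
    print(f"suggested request: {suggestion:.0f}m")
    print("update needed:", needs_update(current=1000, recommended=suggestion))
```

Run on a schedule, a check like this would re-evaluate each workload as traffic patterns shift, which is the kind of repeated adjustment the report describes as hard to sustain manually at scale.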
