vCluster Labs, the company pioneering Kubernetes virtualization, announced its Infrastructure Tenancy Platform for AI to help organizations build and operate high-performance AI infrastructure on GPU-focused compute clusters, including support for NVIDIA DGX systems.
The company’s new Reference Architecture for NVIDIA DGX systems is now available, offering architectural guidance for building secure, scalable Kubernetes environments optimized for NVIDIA AI infrastructure. Alongside it, vCluster introduced several new technologies, including vCluster Private Nodes, vCluster VPN, the Karpenter-based vCluster Auto Nodes feature, and direct integrations with NVIDIA Base Command Manager, KubeVirt, and the network isolation controller Netris. Together, these form the foundation of the vCluster Infrastructure Tenancy Platform for AI, a unified framework for deploying and managing AI workloads on AI supercomputers in the private cloud as well as on hyperscalers and emerging neoclouds.
“Our mission is to make AI infrastructure as dynamic and efficient as the workloads it supports,” said Lukas Gentele, CEO of vCluster. “With our Infrastructure Tenancy Platform for AI, organizations running NVIDIA AI infrastructure can operate secure, elastic Kubernetes environments anywhere, with the performance, control, and efficiency that AI-scale workloads demand. It feels like getting the most cutting edge public cloud managed Kubernetes but on your bare metal AI supercomputer.”
Building Blocks for the AI Infrastructure Era
As enterprises race to operationalize AI at scale, platform teams need a Kubernetes foundation that can manage GPU resources efficiently while ensuring workload isolation, mobility, and security. The Infrastructure Tenancy Platform for AI addresses these challenges through the following key innovations:
- vCluster Private Nodes & Auto Nodes – Enable virtual clusters to dynamically autoscale GPU and CPU capacity across clouds, data centers, and bare metal environments using Karpenter-based automation. These features help maximize GPU utilization while maintaining full isolation and flexibility.
- vCluster VPN – A Tailscale-powered overlay network that establishes secure communication between control planes and worker nodes across hybrid infrastructure. vCluster VPN simplifies burst-to-cloud scenarios, where GPU clusters seamlessly extend from on-premises NVIDIA DGX systems to public cloud environments.
- NVIDIA Base Command Manager Integration – Integrates vCluster with NVIDIA Base Command Manager to bring Auto Nodes to NVIDIA DGX clusters, enabling elasticity, GPU lifecycle management, and efficient scaling across on-prem NVIDIA infrastructure.
- KubeVirt Integration – Enables the creation of virtual machines on demand as nodes within a virtual cluster using KubeVirt, allowing large bare-metal servers to be partitioned into smaller, isolated compute units. This extends Auto Nodes to on-prem and bare-metal environments, giving platform teams elastic, tenant-aware GPU infrastructure under Kubernetes.
- Netris Integration – Provides automated network isolation and lifecycle management for virtual clusters, giving each tenant its own dedicated network path and enabling multi-tenant GPU environments to run securely on shared infrastructure.
- vNode Runtime – A secure, Kubernetes-native container sandbox that helps prevent container break-outs, enabling multi-tenant GPU workloads without reverting to VMs.
Together, these technologies create the foundation of the vCluster Infrastructure Tenancy Platform for AI – a composable, Kubernetes-native framework purpose-built for running AI, ML, and GPU-intensive workloads anywhere.
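To make the tenant experience concrete, the minimal sketch below submits a GPU smoke-test pod to a virtual cluster using the standard Kubernetes Python client. It uses no vCluster-specific API and rests on a few assumptions: a kubeconfig context for the virtual cluster already exists (for example, one written by `vcluster connect`), the NVIDIA device plugin or GPU Operator exposes GPUs as `nvidia.com/gpu` on the backing nodes, and the context, namespace, and image names are illustrative only.

```python
# Minimal sketch: a tenant submits a GPU smoke test to its own virtual cluster.
# Assumes a kubeconfig context for the virtual cluster already exists (e.g.,
# written by `vcluster connect`) and that the NVIDIA device plugin / GPU
# Operator exposes GPUs as the `nvidia.com/gpu` resource. The context,
# namespace, and image names below are illustrative only.
from kubernetes import client, config


def run_gpu_smoke_test(context: str = "my-vcluster", namespace: str = "default") -> None:
    # Load the virtual cluster's context from the local kubeconfig.
    config.load_kube_config(context=context)

    pod = client.V1Pod(
        metadata=client.V1ObjectMeta(name="gpu-smoke-test"),
        spec=client.V1PodSpec(
            restart_policy="Never",
            containers=[
                client.V1Container(
                    name="cuda",
                    image="nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04",
                    command=["nvidia-smi"],
                    # Request one GPU; the device plugin schedules it onto a
                    # backing node with free GPU capacity.
                    resources=client.V1ResourceRequirements(
                        limits={"nvidia.com/gpu": "1"}
                    ),
                )
            ],
        ),
    )
    client.CoreV1Api().create_namespaced_pod(namespace=namespace, body=pod)
    print(f"Submitted gpu-smoke-test to context {context!r}, namespace {namespace!r}")


if __name__ == "__main__":
    run_gpu_smoke_test()
```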
Industry analysts are increasingly highlighting the urgency of optimizing GPU utilization and simplifying AI infrastructure management.
“As AI infrastructure becomes the new competitive frontier, organizations are under immense pressure to operationalize GPUs efficiently while maintaining security and governance across hybrid environments,” stated Paul Nashawaty, Practice Lead and Principal Analyst at theCUBE Research. “We find that 71% of enterprises cite GPU utilization inefficiency as a major barrier to scaling AI workloads, and nearly two-thirds are exploring Kubernetes-native approaches to unify AI operations across cloud and on-prem. vCluster Labs’ Infrastructure Tenancy Platform for AI directly addresses this gap by enabling dynamic, multi-tenant GPU orchestration with the same elasticity and control enterprises expect from the public cloud, now extended to private NVIDIA-powered AI systems.”
vCluster Reference Architecture for NVIDIA DGX Systems
The new vCluster Reference Architecture for NVIDIA DGX systems outlines best practices for deploying virtual clusters on GPU-centric systems, enabling enterprises to deliver a cloud-like Kubernetes experience on-premises. With vCluster, teams can create lightweight virtual clusters that autoscale GPU resources, integrate securely with both on-prem and cloud networks, and maintain consistent performance across environments.
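As a rough illustration of that provisioning flow rather than the reference architecture itself, the sketch below shells out to the vcluster CLI to create a tenant cluster on whichever host cluster the current kubeconfig points to; the cluster name and namespace are hypothetical, and flag availability can vary between CLI versions.

```python
# Rough illustration only: provision a tenant virtual cluster on a GPU host
# cluster by shelling out to the vcluster CLI. The name and namespace are
# hypothetical, and flag availability can vary between CLI versions.
import subprocess


def create_virtual_cluster(name: str = "dgx-team-a", namespace: str = "team-a") -> None:
    # Creates the virtual cluster in its own namespace on the host cluster
    # selected by the current kubeconfig context; the CLI typically also
    # points the local kubeconfig at the new virtual cluster afterwards.
    subprocess.run(
        ["vcluster", "create", name, "--namespace", namespace],
        check=True,
    )


if __name__ == "__main__":
    create_virtual_cluster()
```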
“We’ve been using vCluster for a while and we love the technology,” said Nick Jones, VP of Engineering at Nscale. “We’re using vCluster to optimise GPU utilisation and accelerate Kubernetes cluster provisioning — delivering higher performance and efficiency that directly benefit our customers.”
Enabling Cloud Agility for NVIDIA GPU Infrastructure
From AI factories to private GPU clouds, vCluster brings the scalability and efficiency of public cloud Kubernetes to NVIDIA environments.
Organizations using vCluster report:
- Faster cluster provisioning – virtual clusters spin up in seconds with fully declarative provisioning via Terraform and GitOps
- Higher GPU utilization – fewer idle GPUs across teams and tenants while ensuring fair use across the organization (a simple way to measure this is sketched after this list)
- Simplified day-2 operations – automated control plane and node upgrades, automatic backups with vCluster Snapshots, and standardized guidance for integrating with common cloud-native observability stacks
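The utilization point is easy to quantify with nothing more than the standard Kubernetes Python client. The sketch below is not vCluster tooling; it simply compares the GPUs that nodes advertise as allocatable with the GPUs requested by running pods, assuming GPUs are exposed as the `nvidia.com/gpu` resource.

```python
# Simple GPU-allocation audit for the cluster in the current kubeconfig
# context. Not vCluster tooling; it only uses the standard Kubernetes API
# and assumes GPUs are exposed as the `nvidia.com/gpu` extended resource.
from kubernetes import client, config


def gpu_allocation_summary() -> None:
    config.load_kube_config()
    core = client.CoreV1Api()

    # Total GPUs the nodes advertise as allocatable.
    allocatable = sum(
        int((node.status.allocatable or {}).get("nvidia.com/gpu", "0"))
        for node in core.list_node().items
    )

    # GPUs requested by pods that are currently running.
    requested = 0
    for pod in core.list_pod_for_all_namespaces().items:
        if pod.status.phase != "Running":
            continue
        for container in pod.spec.containers:
            requests = (container.resources and container.resources.requests) or {}
            requested += int(requests.get("nvidia.com/gpu", "0"))

    pct = (requested / allocatable * 100) if allocatable else 0.0
    print(f"GPUs requested: {requested}/{allocatable} ({pct:.0f}% allocated)")


if __name__ == "__main__":
    gpu_allocation_summary()
```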
Source – businesswire



