Executive Summary: VMware Private AI Foundation with NVIDIA

VMware Private AI Foundation with NVIDIA occupies a structurally unique position in this assessment series: it is not an infrastructure OEM (Dell, HPE), a hyperscaler (AWS, Google Cloud), or a data platform vendor (VAST). It is a virtualization and private cloud platform, the abstraction layer that sits between physical infrastructure and workloads. Broadcom’s strategic thesis is that VCF is ‘the permanent abstraction layer between AI software and physical chips,’ and the Private AI Foundation extends that thesis into AI workloads specifically.

The 4+1 model reveals both the power and the limits of this position. VCF’s strength is Layer 2A — infrastructure orchestration is VMware’s heritage and its deepest IP. VCF Automation, vSphere Supervisor, VKS, vSAN, NSX/vDefend, and VCF Operations collectively provide the most mature unified orchestration surface for mixed workloads (VMs, containers, AI) of any on-prem vendor assessed. No other vendor in this series manages GPU-accelerated AI workloads, Kubernetes clusters, and traditional VMs from a single control plane with equivalent operational maturity.

At Layer 0, VMware’s multi-accelerator management (AMD, NVIDIA, Intel) requires careful contextualization. GPU vendor choice is not unique to VMware — HPE’s GX5000 supports NVIDIA and AMD blades in the same rack, and hyperscalers fully abstract accelerators at the service layer (a developer calling Vertex AI or Bedrock never sees which silicon powers the response). VMware’s actual differentiator is the level of architectural control: operators manage GPU placement, isolation, and scheduling through familiar vSphere primitives (vGPU profiles, vmclasses, DRS, resource pools). The control plane is borrowed judgment — it is VMware’s opinionated virtualization model applied to acceleration — but that opinion provides stronger knobs that appeal to operators already comfortable with virtualization management. Where hyperscalers abstract the accelerator away from the architect, VMware puts the architect in the driver’s seat through a familiar console.

But the closer the stack gets to AI-specific functions — model serving, retrieval, agent execution, governance — the more authority shifts to NVIDIA (Layer 2B runtime via NVIDIA AI Enterprise), to open-source components (pgvector, Elasticsearch), or to capabilities that are emerging but not yet at the depth of purpose-built alternatives. Private AI Services (Model Runtime, Agent Builder, Data Indexing/Retrieval, Vector Database, Model Store) are genuine platform capabilities delivered as part of the VCF subscription, but they are foundational AI services, not the deep data lifecycle or agent orchestration that Dell (Dataloop), HPE (Ezmeral/Kamiwaza), or VAST (DataEngine/AgentEngine) provide. Layer 1C (data pipelines) is absent. Layer 2C (reasoning plane) has building blocks — MCP Server Governance, GPU/Model Metrics, Intelligent Assist — but none passes the ‘Routing Is Not Reasoning’ test.

The installed base is the strategic moat: nine of the top ten Fortune 500 companies have committed to VCF, with 100M+ cores licensed worldwide. For the enormous VMware installed base, Private AI Foundation is the lowest-friction path to on-prem AI — no new infrastructure vendor, no new management plane, no new operational model, and no incremental cost beyond GPU hardware. The 4+1 question is whether lowest-friction adoption translates to sufficient architectural depth when agentic AI workloads demand governance, policy-driven placement, and cross-agent orchestration that VCF does not yet provide.

VMware Private AI Foundation is the enterprise’s most natural on-ramp to private AI. Hock Tan’s ‘permanent abstraction layer’ framing is a Layer 2A statement, not a Layer 2C statement — the abstraction layer manages resources; the reasoning plane governs them. VMware has the former; it does not have the latter. Whether it becomes the enterprise’s durable AI platform depends on whether Broadcom invests in the Layer 1A governance depth, Layer 1C pipeline capability, and Layer 2C reasoning plane that the 4+1 model identifies as structurally necessary — or whether the Broadcom acquisition thesis (cash generation from the installed base) constrains that investment. VMware’s unique structural advantage is that VCF sees everything from the hypervisor up across all OEM hardware — the data to build a multi-vendor reasoning plane exists. The engineering commitment does not yet.

Layer-by-layer status: Layer 0 (Hardware-Agnostic Abstraction), Layer 1A (Platform Storage, Not AI-Native), Layer 1B (Foundational RAG Services), Layer 1C (Gap), Layer 2A (VMware Heritage Strength), Layer 2B (Platform-Native + NVIDIA-Dependent), Layer 2C (Emerging Signals Only), Layer 3 (+1) (Platform-Enabled, Not Platform-Provided).

Assessment framework: 4+1 Layer AI Infrastructure Model. Scoring model: Decision Authority Placement Model (DAPM) — Retained, Delegated, or Ceded. Published by The CTO Advisor LLC. Author: Keith Townsend. Date assessed: June 20, 2026. Version: v1.4 - Interface-Portability Reconciliation.

VMware Private AI Foundation with NVIDIA

Mapped to the 4+1 Layer AI Infrastructure Model

v1.4 - Interface-Portability Reconciliation · Assessed June 20, 2026 · Source: VMware Explore 2025, VCF 9.0/9.1 announcements, Broadcom press releases, VCF Private AI blog series, published 4+1 model. v1.4 (instrument reconciliation): 1B Vector Database (pgvector) Ceded→Delegated (OSS, opinions lift via pg_dump); 2A vSphere Supervisor + VKS Ceded→Delegated (conformant K8s, consistent with cloud managed-K8s treatment).

ACTIVE ASSESSMENT

Legend: Strength · Moderate · Gap · Partner

Layer 0 · Compute · Compute & Network Fabric · Hardware-Agnostic Abstraction

Raw compute, networking, and acceleration fabric

Vendor-Provided

VMware vSphere 8/9 (Hypervisor) · Ceded

Industry-standard virtualization layer. GPU passthrough and vGPU support via NVIDIA AI Enterprise integration. vSphere Supervisor manages both VMs and Kubernetes workloads from a single control plane. vMotion for live migration of AI workloads. DRS for automated load balancing. The hypervisor is Broadcom’s foundational IP — the abstraction layer between physical infrastructure and all workloads above.

Multi-Vendor Hardware Support · Retained

VCF runs on Dell PowerEdge, HPE ProLiant, Lenovo ThinkSystem, Cisco UCS, Supermicro, NEC, Fujitsu, and others. Hardware-agnostic by design — VCF does not manufacture or specify compute. This is the fundamental architectural difference from Dell AI Factory or HPE Private Cloud AI: VMware abstracts hardware, OEMs provide it. The enterprise retains hardware vendor choice.

NVIDIA GPU Integration (vGPU + Passthrough) · Delegated

VCF 9.1 supports NVIDIA Blackwell architecture, NVSwitch on HGX platform, GPUDirect RDMA over InfiniBand for distributed LLM inference across multiple HGX servers. Enhanced DirectPath I/O for ConnectX-7 NICs and BlueField-3 DPUs. vGPU profiles mapped to vSphere Namespaces as vmclasses for multi-tenant GPU isolation — memory strictly partitioned per tenant with zero side-channel risk across GPU framebuffer.

NSX / vDefend Networking · Ceded

Software-defined networking with micro-segmentation, Zero Trust enforcement via Distributed Firewall (Antrea CNI for Kubernetes), and in-memory malware defense. Avi Load Balancer provides virtualized load balancing for AI inference endpoints and agentic applications — eliminates hardware appliance requirements. Post-quantum cryptography support.

Multi-Accelerator Management (AMD + NVIDIA + Intel) · Ceded

VCF 9.1 manages AMD, NVIDIA, and Intel accelerators through the same virtualization control plane — vGPU profiles, vmclasses, DRS policies, resource pools. The architect retains granular control over GPU placement, isolation, and scheduling using familiar vSphere primitives. This is not unique as a multi-vendor GPU capability (HPE GX5000 supports NVIDIA Rubin + AMD MI430X in the same rack; hyperscalers fully abstract accelerators at the service layer). VMware’s differentiator is the level of architectural control: operators already comfortable with virtualization management get GPU scheduling knobs they know how to turn. The control plane is opinionated — it applies VMware’s virtualization model to acceleration — but that opinion is the value for VMware-native shops.

NVIDIA-Provided

NVIDIA AI Enterprise (NVAIE)

Enterprise AI software suite providing vGPU drivers, GPU Operator for Kubernetes, and validated AI frameworks. NVAIE licenses purchased separately from VCF. Deeply integrated but independently licensed — the same NVIDIA dependency Dell and HPE share.

NVIDIA GPU Silicon + Networking

Blackwell, H100/H200, ConnectX-7/8, BlueField-3 DPUs, NVSwitch, InfiniBand. VMware validates and integrates but does not manufacture or specify GPU silicon.

NVIDIA NIM Microservices

Pre-built inference microservices deployable on Private AI Foundation. Nemotron models and community models available through Model Store.
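The DAPM placements attached to the Layer 0 vendor-provided components above can be tallied mechanically. The following is a minimal Python sketch of that bookkeeping, not part of the published model: the `Placement` enum, the `LAYER_0` mapping, and the `tally` helper are illustrative names, and a raw count says nothing about how the assessment weights individual components.

```python
from collections import Counter
from enum import Enum


class Placement(Enum):
    """The three DAPM decision-authority placements named in this assessment."""
    RETAINED = "Retained"
    DELEGATED = "Delegated"
    CEDED = "Ceded"


# Layer 0 vendor-provided components and their placements, copied from the
# component list above. The dictionary itself is an illustrative structure.
LAYER_0 = {
    "VMware vSphere 8/9 (Hypervisor)": Placement.CEDED,
    "Multi-Vendor Hardware Support": Placement.RETAINED,
    "NVIDIA GPU Integration (vGPU + Passthrough)": Placement.DELEGATED,
    "NSX / vDefend Networking": Placement.CEDED,
    "Multi-Accelerator Management (AMD + NVIDIA + Intel)": Placement.CEDED,
}


def tally(components):
    """Count how many components fall under each placement."""
    counts = Counter(components.values())
    return {p.value: counts.get(p, 0) for p in Placement}


if __name__ == "__main__":
    print(tally(LAYER_0))
```

For the five Layer 0 components listed, the tally is one Retained, one Delegated, and three Ceded.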

◆ Gap Analysis

VMware’s Layer 0 position is fundamentally different from every other vendor in this assessment: VMware provides the abstraction layer, not the physical infrastructure. Dell, HPE, and VAST own or specify hardware. Google and AWS own data centers. VMware sits above all of them. This creates a unique DAPM profile: the enterprise Retains hardware vendor choice (can switch from Dell to HPE to Lenovo without changing the management plane) but Delegates GPU runtime to NVIDIA and inherits whatever GPU integration VMware has validated. The abstraction is VMware’s value proposition and its architectural constraint: VMware can only support GPU features that the hypervisor can virtualize or pass through.

Multi-accelerator support requires nuanced comparison across the assessed vendors. GPU vendor choice is NOT unique to VMware:
• Hyperscalers (Google, AWS) fully abstract accelerators at the service layer: developers call Vertex AI or Bedrock and never see whether TPUs, Trainium, or NVIDIA GPUs power the response. Accelerator choice is Ceded to the cloud provider. Simplest developer experience, least architectural control.
• HPE GX5000 supports NVIDIA Rubin and AMD MI430X GPU blades in the same rack architecture. Multi-vendor at the hardware level.
• Dell AI Factory is NVIDIA-only. ‘Dell AI Platform with AMD’ is a separate branding, separate software stack, not a unified runtime.
• VAST CNode-X is NVIDIA-only.

VMware’s actual differentiator is the level of architectural control over acceleration: VCF manages AMD, NVIDIA, and Intel accelerators through familiar virtualization primitives (vGPU profiles mapped to vmclasses, DRS for GPU workload balancing, vMotion for live migration, resource pools for multi-tenant isolation). The enterprise architect retains granular control over GPU placement, scheduling, and isolation using tools they already operate. The control plane is borrowed judgment (it is VMware’s opinionated virtualization model applied to GPU resources), but that opinion provides stronger knobs that appeal specifically to operators already comfortable with vSphere management. Where hyperscalers abstract the accelerator away from the architect, VMware puts the architect in the driver’s seat through a familiar console.

The vDefend security story is architecturally significant for AI: micro-segmentation at the packet level between every AI component (model server, vector database, embedding service, API gateway) with Terraform-codified firewall rules. This is infrastructure-layer Zero Trust for AI workloads, a capability that Dell and HPE don’t provide at equivalent depth from the platform layer. VAST’s CrowdStrike integration operates at a different level (application/data-layer security vs. network-layer micro-segmentation).
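The micro-segmentation idea described above (every flow between AI components denied unless explicitly permitted) can be illustrated with a small default-deny check. This is a conceptual Python sketch only: the component names echo the ones listed in the text, but the specific flows and port numbers are assumptions, and real enforcement would happen in the platform's distributed firewall policy rather than in application code.

```python
# Hypothetical allow-list of (source, destination, port) flows between the
# AI components named in the text. Flows and ports are illustrative.
ALLOWED_FLOWS = {
    ("api-gateway", "model-server", 8000),
    ("model-server", "embedding-service", 8001),
    ("model-server", "vector-database", 5432),
    ("embedding-service", "vector-database", 5432),
}


def is_allowed(src, dst, port):
    """Default deny: a flow passes only if it is explicitly listed."""
    return (src, dst, port) in ALLOWED_FLOWS


if __name__ == "__main__":
    # The gateway may reach the model server, but not the database directly.
    print(is_allowed("api-gateway", "model-server", 8000))
    print(is_allowed("api-gateway", "vector-database", 5432))
```

The point of the model is that lateral movement is blocked by omission: a compromised API gateway has no listed path to the vector database.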

◆ Borrowed Judgment

Multi-directional: VMware borrows GPU silicon judgment from NVIDIA (same as everyone) and hardware engineering judgment from OEM partners (Dell, HPE, Lenovo build the servers). But VMware retains the abstraction layer — the hypervisor, the networking, the security model, the orchestration. This is the inverse of Dell’s position: Dell retains hardware judgment and borrows software judgment from NVIDIA. VMware retains software judgment and borrows hardware judgment from OEMs. Critically, the VMware control plane is itself borrowed judgment for the enterprise: the architect gains granular GPU management through vSphere primitives, but those primitives encode VMware’s opinions about how acceleration should be virtualized, scheduled, and isolated. The enterprise borrows VMware’s virtualization worldview in exchange for operational familiarity. This is a different trade-off than the hyperscalers (where the enterprise cedes acceleration decisions entirely) or bare-metal (where the enterprise retains full control but builds everything). VMware occupies the middle: more control than cloud, less effort than bare-metal, but through an opinionated lens. The NVIDIA AI Enterprise dependency is real but no deeper than Dell’s or HPE’s: all three require NVAIE for GPU virtualization and AI framework support. VMware’s co-engineering relationship with NVIDIA on VCF integration is comparable to HPE’s Private Cloud AI co-engineering.

◆ Working Notes

The Broadcom acquisition context is impossible to ignore at Layer 0: Gartner projects VMware’s virtualization market share will fall from 70% (2024) to 40% (2029) due to pricing changes. Nutanix CEO has publicly targeted 165,000 of VMware’s approximately 300,000 customers. Broadcom has converted 90%+ of the top 10,000 VMware customers to VCF subscriptions with 200-500% price increases reported. This creates a unique installed-base dynamic: Private AI Foundation’s market opportunity is less about winning new customers than about retaining existing ones by making VCF indispensable for AI workloads. If the enterprise is already paying for VCF, Private AI Services come at no additional cost — a fundamentally different go-to-market than Dell (buy new PowerEdge + NVIDIA), HPE (buy new Private Cloud AI), or VAST (deploy new AI OS). The air-gapped deployment support is significant for regulated industries and government — same capability Dell and HPE emphasize, delivered through the existing VCF automation framework rather than a purpose-built AI appliance.

Layer 1A · Storage · Data Storage & Governance · Platform Storage, Not AI-Native

Durable, governed data foundation — the Governance Catalog that Layer 2C queries

Vendor-Provided

vSAN (HCI Storage) · Ceded

Hyper-converged storage integrated into VCF. Block and file storage natively. Native Object Storage (S3-compatible) in tech preview with VCF 9.1.x — brings S3 interface natively into the platform without third-party licensing. vSAN deduplication and compression for cost reduction. Unified storage policies and multi-tenant self-service access.

vSAN for Recovery + Ransomware Recovery · Ceded

Sovereign, in-place ransomware recovery using native snapshot capabilities. Deep snapshot chains and integrated replication workflows. On-prem recovery without external dependencies.

Model Store (Private AI Services) · Ceded

Curated LLM repository with integrated RBAC access control. MLOps teams and data scientists can securely manage and provide LLMs with governance and security for enterprise data and IP. NVIDIA models, Nemotron, and community models available. Proprietary VMware/Broadcom platform — opinions captive, no open exit.

External Storage Integration · Delegated

VCF supports Dell PowerScale, Dell ObjectScale, NetApp ONTAP, Pure Storage, HPE Alletra, and other enterprise storage via vSphere APIs. The storage layer is not limited to vSAN — enterprises can bring existing storage investments. This is a heterogeneous storage approach vs. Dell’s vertically integrated storage (PowerScale/ObjectScale/Exascale) or VAST’s collapsed storage (Element Store).

NVIDIA-Provided

No Direct NVIDIA Layer 1A Dependency

NVIDIA does not provide storage or governance components in the VMware Private AI stack. Storage is VMware-owned (vSAN) or enterprise-chosen (external arrays).

◆ Gap Analysis

VMware’s Layer 1A is fundamentally different from every other assessed vendor because VMware is not a storage company. vSAN provides competent hyper-converged storage with the VCF 9.1 addition of native S3 object storage, but this is general-purpose platform storage — not AI-optimized data infrastructure. Compare to Dell: MetadataIQ indexes billions of files with AI-specific metadata enrichment. Exascale provides 10+ PB/rack unified file+object+fast-file. Trust3 AI provides storage-layer governance for sensitive data discovery. Compare to HPE: Data Fabric v8.1 provides policy-based data placement with Apache Polaris catalog for Iceberg tables. Alletra B10000 provides real-time agentic storage support with semantic understanding. Compare to VAST: Element Store collapses file, object, table, and vector into a single governed data structure with inline metadata enrichment. VMware’s storage story is ‘bring your existing storage’ — which is pragmatic for the installed base but means Layer 1A governance (metadata richness, data lineage, policy-based placement) depends entirely on whichever external storage vendor the enterprise has deployed. VMware itself provides no AI-specific governance catalog, no metadata enrichment, no data lineage tracking. The Model Store capability is a Layer 1A function worth noting: RBAC-governed model repository is a governance primitive that Dell’s AI Factory lacks as a platform-native capability. But Model Store governs models, not data — it does not address the broader question of which data feeds which model under what compliance constraints.

◆ Borrowed Judgment

Low for vSAN (VMware-owned). High for AI-specific storage governance — entirely dependent on whichever external storage vendor the enterprise deploys. If the enterprise runs Dell storage, it inherits Dell’s governance capabilities (MetadataIQ). If it runs NetApp, it inherits NetApp’s. VMware provides no abstraction or unification of storage governance across heterogeneous backends. This is the inverse of VMware’s Layer 0 strength: at Layer 0, VMware abstracts heterogeneous hardware into a unified management plane. At Layer 1A, VMware does NOT abstract heterogeneous storage governance into a unified governance plane. The storage abstraction stops at provisioning and capacity management — it does not extend to metadata, lineage, or policy.

◆ Working Notes

The native S3 Object Storage in VCF 9.1.x (tech preview) is a strategic move: S3 compatibility is the lingua franca of AI data pipelines. Every vendor in this assessment provides S3 access (Dell ObjectScale, HPE Alletra X10000, VAST DataStore, AWS S3, Google Cloud Storage). VMware adding native S3 to vSAN reduces the dependency on external object storage for AI workloads. The Tanzu Marketplace integration provides a curated path to certified middleware and data services — this is VMware’s approach to ecosystem curation at the data layer, comparable in intent (not depth) to HPE’s Unleash AI program or VAST’s Cosmos Community. SQL Server DBaaS as a first-class VCF citizen is a pragmatic enterprise play — most enterprises have SQL Server deployments, and making it a platform service reduces the friction of data access for AI workloads.

Layer 1B · Retrieval · Context Management & Retrieval · Foundational RAG Services

Low-latency retrieval for RAG — vector/hybrid search, context windows

Vendor-Provided

Data Indexing & Retrieval (Private AI Services) · Ceded

Index and maintain multiple data sources, making them readily available for consumption by AI applications. Integrated with Model Runtime for RAG workflows. Keeps indexed data current as sources change. Proprietary VMware/Broadcom platform — opinions captive, no open exit.

Vector Database (Private AI Services) · Delegated

pgvector on PostgreSQL delivered via Data Services Manager with VMware enterprise-level support. Enables domain-specific, up-to-date context for AI models. PostgreSQL 16.8 with pgvector 0.8.0 extension in Private AI Services 2.1. pgvector is OSS and the consumed interface is standard PostgreSQL — embeddings, schema, and index definitions lift via pg_dump to any Postgres platform; DSM operates the database, the open substrate keeps the retrieval opinions portable.

RAG Pipeline Integration · Delegated

NVIDIA NIM RAG Blueprint v2.5.0 validated on VCF — production-grade, multi-model RAG pipeline. Pre-built catalog items in VCF Automation for deploying complete RAG workflows. Elasticsearch supported as external vector database for advanced retrieval scenarios.

NVIDIA-Provided

NVIDIA NIM + NeMo Retriever

Inference microservices and retrieval-augmented generation components. RAG Blueprints provide pre-built retrieval patterns. Same capabilities available on Dell and HPE platforms.

NVIDIA AI Enterprise RAG Stack

Validated software stack for RAG workflows on VMware Private AI Foundation. GPU-accelerated embedding generation and retrieval.

◆ Gap Analysis

VMware’s Layer 1B provides functional RAG capabilities through Private AI Services — Data Indexing/Retrieval and Vector Database are genuine platform services, not just partner integrations. But the retrieval stack is foundational, not differentiated. pgvector on PostgreSQL is a competent vector database for moderate-scale use cases but lacks the performance characteristics of purpose-built alternatives. Compare to Dell’s Data Search Engine (Elasticsearch 9.4 with GPU-accelerated hybrid search, MetadataIQ integration). Compare to VAST’s InsightEngine (native to the data platform, no data movement for retrieval). Compare to HPE’s Alletra X10000 with KV cache storage support for inference state persistence. The Data Indexing & Retrieval service addresses the core RAG requirement — keeping context current as data sources change — but without the metadata richness or governance integration that Dell (MetadataIQ + Elastic), HPE (Data Fabric + Kamiwaza), or VAST (Catalog + InsightEngine) provide. No retrieval quality observability (recall@k, latency percentiles) is evident — the same gap identified in the Dell assessment. A Layer 2C placement engine would need retrieval quality metrics to make informed routing decisions.
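The two retrieval quality signals named above, recall@k and latency percentiles, are simple to compute once retrieval results and timings are logged. The following Python sketch shows one common way to define them; the function names are illustrative, the percentile uses the nearest-rank method (other definitions interpolate), and nothing here reflects an API exposed by Private AI Services.

```python
import math


def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant items that appear in the top-k retrieved list."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("relevant set must not be empty")
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


def percentile(samples, pct):
    """Nearest-rank percentile of a list of samples, for 0 < pct <= 100."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


if __name__ == "__main__":
    # Hypothetical document IDs and per-query latencies in milliseconds.
    print(recall_at_k(["a", "b", "c", "d"], {"a", "d", "z"}, k=3))
    latencies_ms = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
    print(percentile(latencies_ms, 50), percentile(latencies_ms, 95))
```

Metrics of this kind are what a placement engine would need as inputs before it could route retrieval traffic on quality rather than availability alone.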

◆ Borrowed Judgment

Moderate. VMware owns the Data Indexing/Retrieval and Vector Database services. RAG pipeline patterns depend on NVIDIA NIM/Blueprints (same dependency as Dell and HPE). Elasticsearch as external vector database option introduces the same Elastic dependency Dell has — search intelligence is Elastic’s, not VMware’s. The pgvector choice is notable: PostgreSQL is the most widely deployed enterprise database. By building on pgvector, VMware reduces adoption friction (most enterprises already have PostgreSQL expertise) at the cost of retrieval performance ceiling. Dell chose Elasticsearch (higher performance, more complex). VAST built its own (highest integration, most proprietary). VMware chose the most pragmatic option.
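Because the vector database is standard PostgreSQL with the pgvector extension, the schema, index, and query syntax are not VMware-specific, which is what makes the retrieval layer portable. The Python sketch below only assembles the relevant SQL and a `pg_dump` command line as strings; it does not connect to a database. Table, column, database, and file names are hypothetical, identifiers are interpolated without sanitization for brevity, and the HNSW index with `vector_cosine_ops` is one of several index choices pgvector offers.

```python
def pgvector_schema(table, dims):
    """Return standard pgvector DDL for a table of text chunks and embeddings."""
    return [
        "CREATE EXTENSION IF NOT EXISTS vector;",
        f"CREATE TABLE {table} ("
        f"id bigserial PRIMARY KEY, content text, embedding vector({dims}));",
        f"CREATE INDEX ON {table} USING hnsw (embedding vector_cosine_ops);",
    ]


def nearest_neighbours_sql(table, k):
    """Top-k query by cosine distance; %s is a driver-side parameter placeholder."""
    return (
        f"SELECT id, content FROM {table} "
        f"ORDER BY embedding <=> %s::vector LIMIT {k};"
    )


def pg_dump_command(dbname, outfile):
    """argv for a custom-format dump that pg_restore can load on another Postgres."""
    return ["pg_dump", "--format=custom", f"--file={outfile}", dbname]


if __name__ == "__main__":
    for statement in pgvector_schema("doc_chunks", 768):
        print(statement)
    print(nearest_neighbours_sql("doc_chunks", 5))
    print(" ".join(pg_dump_command("ragdb", "ragdb.dump")))
```

The target platform needs the pgvector extension installed before the dump is restored; the embeddings, schema, and index definitions then carry over unchanged.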

◆ Working Notes

The OpenWebUI integration with Private AI Services RAG demonstrates VMware’s approach to Layer 1B: provide the retrieval infrastructure, let the enterprise choose the user-facing application layer. This is consistent with VMware’s platform philosophy — VMware provides infrastructure services, not applications. The RAG Blueprint validation on VCF (multi-model, production-grade, 8x NVIDIA H100 80GB GPUs) provides a concrete reference architecture that enterprises can deploy from VCF Automation catalog items. This is operationally simpler than assembling equivalent RAG infrastructure on bare-metal Dell or HPE hardware.

Layer 1C · Pipelines · Data Movement & Pipelines · Gap

Move/transform data — ETL/ELT, lineage, cost-aware movement, KV cache tiering

Vendor-Provided: none listed

NVIDIA-Provided

NVIDIA Blueprints

Pre-built AI application patterns deployable through VCF Automation. Pipeline templates, not pipeline infrastructure — same as Dell and HPE.

NVIDIA CMX (Future)

KV cache management for context memory offload. When integrated with VCF, could provide the same KV cache tiering Dell has validated (19x TTFT improvement).

◆ Gap Analysis

Layer 1C is VMware’s most significant gap relative to other assessed vendors. VMware provides no equivalent to:
• Dell’s Data Orchestration Engine (Dataloop): No-code/low-code AI data lifecycle management, Dell’s most meaningful software acquisition.
• HPE’s Ezmeral Unified Analytics: Enterprise-hardened ML pipeline stack (Airflow, Kubeflow, Ray, Feast, MLflow, Spark).
• HPE’s Data Fabric: Policy-based data placement with compliance tagging and data lineage.
• VAST’s DataEngine: Serverless data transformation with CLI/SDK, built-in observability, triggers, and automated pipelines.

VCF Automation provides deployment pipelines (standing up AI infrastructure) but not data pipelines (moving, transforming, and governing data through ML workflows). The enterprise running VMware Private AI Foundation must bring its own data pipeline orchestration (Airflow, Kubeflow, or a commercial alternative) and deploy it on VKS. This is architecturally consistent with VMware’s platform philosophy: VCF provides infrastructure services, not application-layer data engineering tools. But it leaves a functional gap that competitors have filled. An enterprise choosing VMware for AI inherits a Layer 1C assembly problem that Dell (Dataloop), HPE (Ezmeral), or VAST (DataEngine) partially or fully solve.

◆ Borrowed Judgment

High. The enterprise must borrow data pipeline judgment from whatever tools it deploys on VKS — Apache Airflow community, Kubeflow community, or a commercial vendor (Dataloop, Databricks, etc.). VMware provides no opinion on data pipeline architecture, no integration between pipeline metadata and infrastructure governance, and no data lineage capability. Compare to Dell: Dell acquired Dataloop specifically to address Layer 1C. Compare to HPE: HPE assembled Ezmeral through four acquisitions (BlueData, MapR, Ampool, Arrikto). Compare to VAST: VAST built DataEngine as a native platform capability. VMware has made no equivalent investment in data pipeline IP.

◆ Working Notes

The gap is real but may be strategic: VMware has historically succeeded by providing infrastructure primitives that partner ecosystems build on, rather than by building application-layer tooling. The question is whether AI data pipelines are infrastructure (VMware should own them) or applications (VMware should enable them). The Tanzu Marketplace could address this gap through curated data pipeline services — certified Airflow, MLflow, or Kubeflow deployments validated for VCF. This would be a Delegated approach (partner provides the capability, VMware validates the deployment) rather than a Retained approach (VMware builds the capability). Architecturally similar to HPE’s Unleash AI ecosystem model. The KV cache story is notably absent: Dell has validated NVIDIA CMX with 19x TTFT improvement on PowerScale. HPE has native KV cache storage support in Alletra X10000. VAST collocates cache and compute in CNode-X. VMware has not yet announced equivalent KV cache tiering capabilities. (Enhanced NVMe memory tiering in VCF 9.1 addresses memory-bound performance, not data-pipeline orchestration.)

Layer 2A · Orchestration · Infrastructure Orchestration · VMware Heritage Strength

GPU scheduling, quotas, RBAC, fair-share scheduling, utilization optimization

Vendor-Provided

VCF Automation (formerly vRealize/Aria Automation) · Ceded

Self-service catalog with pre-built AI workload templates. Quickstart deployment for Private AI Foundation. Infrastructure-as-code with Terraform integration. Multi-tenant resource provisioning with RBAC. Live Application Stack Blueprints for versioned, redeployable application topologies. Day 2 operations for AI Blueprints. Proprietary VMware/Broadcom platform — opinions captive, no open exit.

vSphere Supervisor + VKS · Delegated

Unified management of VMs, containers, and AI workloads from a single control plane. VKS (vSphere Kubernetes Service) 3.6 supports up to 500 Kubernetes clusters per Supervisor. Simplified Container-as-a-Service for application teams. VM Fast-Deploy for accelerated provisioning. vSphere Elastic Provisioning for zero-touch fleet expansion. GitOps-based infrastructure management. VKS is conformant Kubernetes — manifests and workloads lift to another conformant cluster without rebuilding, so the K8s opinions are portable (scored as the cloud managed-K8s services are); the proprietary fleet and lifecycle management is captured in the Ceded VCF Automation / Operations / SDDC Manager components.

VCF Operations (formerly vRealize/Aria Operations) · Ceded

Private AI Model and GPU Metrics — utilization, memory pressure, and model-level visibility on the same console as the rest of the estate. Real-Time Operational Observability turns telemetry into action. Customizable dashboards for AI model and agent performance. Capacity management and compliance monitoring for AI workloads. Proprietary VMware/Broadcom platform — opinions captive, no open exit.

SDDC Manager · Ceded

Full-stack lifecycle management for VCF. Automated deployment, patching, and upgrades across vSphere, vSAN, NSX, and VKS. Single-pane fleet management. Expanded fleet size and upgrade scale in 9.1. This is the operational backbone — the equivalent of HPE’s GreenLake or Dell’s APEX management, but with 20+ years of enterprise maturity. Proprietary VMware/Broadcom platform — opinions captive, no open exit.

Advanced Cyber Compliance (ACC) · Ceded

Continuous compliance enforcement with automated drift detection and remediation. Hardened infrastructure images. Integrated with vDefend for security posture management. Disaster recovery via vSAN for Recovery. Proprietary VMware/Broadcom platform — opinions captive, no open exit.

MCP Server Governance (VCF 9.1) · Ceded

IT operations can centrally manage and control access to MCP tools and associated servers across their environment. Ensures user groups can only access approved MCP tools. Security guardrails for MCP servers via vDefend and Avi Load Balancer. This is a Layer 2A governance function with 2C implications — controlling which agents can access which tools. Proprietary VMware/Broadcom platform — opinions captive, no open exit.
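The governance rule described here, that user groups can reach only approved MCP tools, reduces to an allow-list lookup. The Python sketch below illustrates that logic under stated assumptions: the group names, tool names, and the `can_use_tool` helper are hypothetical and do not represent the VCF 9.1 configuration model or any Broadcom API.

```python
# Hypothetical mapping of user groups to the MCP tools approved for them.
APPROVED_TOOLS = {
    "data-science": {"sql-query", "vector-search"},
    "it-operations": {"ticketing", "inventory-lookup"},
}


def can_use_tool(user_groups, tool):
    """Allow a call only if at least one of the user's groups approves the tool."""
    return any(tool in APPROVED_TOOLS.get(group, set()) for group in user_groups)


if __name__ == "__main__":
    print(can_use_tool(["data-science"], "vector-search"))  # approved
    print(can_use_tool(["data-science"], "ticketing"))      # not approved
```

Unknown groups resolve to an empty set, so the check fails closed rather than open.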

NVIDIA-Provided

NVIDIA GPU Operator

Kubernetes operator for GPU lifecycle management. Manages GPU drivers, container runtime, device plugins. Standard across all NVIDIA-integrated platforms.

NVIDIA vGPU Manager

GPU virtualization profiles and multi-tenant GPU allocation. Memory partitioning and time-sliced compute scheduling. Managed through vSphere Supervisor.

◆ Gap Analysis

Layer 2A is VMware’s strongest layer, arguably the strongest Layer 2A of any vendor in this assessment series. The reason is operational maturity: VCF has been managing enterprise infrastructure for two decades. No other vendor assessed has equivalent depth in lifecycle management, multi-tenant orchestration, compliance automation, and unified VM/container/AI workload management. Specific 2A differentiators vs. other assessed vendors:
• Unified workload management: Dell manages AI workloads separately from traditional workloads (OpenManage for servers, Run:ai for GPUs, separate tools for each). HPE manages AI through GreenLake Intelligence + OpsRamp + Private Cloud AI (three systems). VMware manages AI, containers, and VMs from ONE control plane (vSphere Supervisor). VAST manages only VAST workloads.
• Operational maturity: VCF’s Day 2 operations (patching, upgrades, compliance, capacity planning) for AI workloads inherit the same proven processes used for the enterprise’s existing VM fleet. No new operational model is required. Dell and HPE AI stacks require new operational processes. VAST requires an entirely new operational discipline.
• MCP Server Governance is a notable 2A/2C bridge: centrally controlling which user groups can access which MCP tools is an infrastructure-level governance function that no other on-prem vendor provides as a platform-native capability. Google’s Agent Gateway provides equivalent capability in cloud.

The GPU scheduling gap remains: NVIDIA GPU Operator and vGPU Manager handle GPU allocation, but policy-driven GPU scheduling (which workload gets which GPU based on cost, compliance, and performance constraints) is not a VCF-native function. This is the same gap Dell has with Run:ai: the scheduling intelligence is NVIDIA’s, not the platform vendor’s.
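To make the missing capability concrete, the sketch below shows what a policy-driven placement decision could look like in its simplest form: filter GPU pools by compliance and capacity, then choose the cheapest. It is a hypothetical Python illustration, not a description of any VCF, NVIDIA, or Run:ai scheduler; the `GpuPool` and `Workload` types, their fields, and the selection rule are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuPool:
    name: str
    cost_per_hour: float         # hypothetical chargeback rate
    compliant_zones: frozenset   # compliance zones the pool is cleared for
    free_memory_gb: int          # GPU memory currently available


@dataclass(frozen=True)
class Workload:
    name: str
    zone: str        # compliance zone the workload must run in
    memory_gb: int   # GPU memory the workload needs


def place(workload, pools):
    """Pick the cheapest pool that satisfies compliance and memory constraints."""
    eligible = [
        p for p in pools
        if workload.zone in p.compliant_zones
        and p.free_memory_gb >= workload.memory_gb
    ]
    if not eligible:
        return None  # no pool satisfies the policy
    # Tie-break on name so the result is deterministic.
    return min(eligible, key=lambda p: (p.cost_per_hour, p.name))


if __name__ == "__main__":
    pools = [
        GpuPool("pool-a", 4.0, frozenset({"general"}), 80),
        GpuPool("pool-b", 6.0, frozenset({"general", "regulated"}), 80),
        GpuPool("pool-c", 2.0, frozenset({"regulated"}), 24),
    ]
    print(place(Workload("rag-inference", "regulated", 40), pools))
```

In this example the cheapest pool is rejected on memory and the next cheapest on compliance, so the workload lands on the most expensive pool. That interaction between constraints is the kind of judgment the text says sits outside the platform today.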

◆ Borrowed Judgment

Low — the lowest of any layer in the VMware assessment. VCF Automation, vSphere, VKS, vSAN, NSX, VCF Operations, and SDDC Manager are all Broadcom/VMware IP. GPU scheduling is the primary borrowed judgment (NVIDIA GPU Operator + vGPU Manager), but this is the same dependency every on-prem vendor shares. Compare to Dell Layer 2A: Dell splits 2A between OpenManage (Dell-owned) and Run:ai (NVIDIA-owned, acquired). VMware retains more 2A authority than Dell. Compare to HPE Layer 2A: HPE’s GreenLake Intelligence is HPE-owned 2A with MCP-based agent communication. VMware’s VCF Operations is VMware-owned 2A with emerging MCP support. Both retain 2A authority; different architectural approaches (HPE: agentic mesh; VMware: traditional orchestration evolving toward agentic).

◆ Working Notes

The Intelligent Assist for VCF (tech preview) signals VMware’s evolution toward agentic infrastructure management. An AI-driven support assistant that diagnoses and resolves issues by consulting Broadcom’s knowledge base is functionally similar to HPE’s GreenLake Intelligence domain agents or Dell’s CloudIQ, but at an earlier stage of development. The installed base of 100M+ licensed cores gives VMware an operational data advantage no other on-prem vendor can match: patterns learned from managing the world’s largest virtualization fleet can inform AI workload optimization in ways that newer platforms cannot. Whether Broadcom invests in leveraging this data advantage for AI-specific intelligence is an open question.

Layer 2B · Runtime · Application Runtime & Execution · Platform-Native + NVIDIA-Dependent

Model serving, agent execution, inference APIs, distributed inference

Vendor-Provided

Model Runtime (Private AI Services) · Ceded

Run inference and embedding models as a service across the organization. API Gateway allows users and AI applications to interact with models directly via API. Multi-accelerator support — same model deployment on AMD and NVIDIA GPUs without refactoring. Model endpoints configurable through VCF Automation UI. Multi-tenant Models-as-a-Service enables secure model sharing across business units to lower costs and reduce power consumption. Proprietary VMware/Broadcom platform — opinions captive, no open exit.

Agent Builder (Private AI Services) · Ceded

Build AI agents in a user-friendly playground, leveraging models and knowledge bases created using other Private AI services. Integrated with Model Runtime and Data Indexing/Retrieval for end-to-end agent development. This is a platform-native agent construction surface — not as deep as VAST’s AgentEngine or Google’s Agent Studio/ADK, but integrated into the VCF operational model. Proprietary VMware/Broadcom platform — opinions captive, no open exit.

Deep Learning VMs · Delegated

Pre-configured virtual machines with validated AI/ML software stacks: PyTorch, TensorFlow, Miniconda. Software stack validated in advance on NVIDIA GPUs — data scientists start developing immediately without compatibility validation. Provisioned through VCF Automation self-service catalog.

VKS AI Clusters · Ceded

GPU-capable Kubernetes worker nodes for cloud-native AI/ML workloads. Triton Inference Server deployable from catalog. Distributed LLM inference with GPUDirect RDMA over InfiniBand for models that exceed single-server capacity (DeepSeek-R1, Llama 3.1-405B). This is infrastructure-layer runtime support, not an opinionated agent execution framework.

Tanzu Platform (Application Runtime) · Ceded

PaaS-layer application runtime. ‘You provide code, we put it into production.’ Governed agentic coding with Tanzu. Developers can self-publish AI agents and MCP servers, sharing AI applications and tools across the enterprise. MCP server publishing makes Tanzu a distribution surface for enterprise agent tooling.

NVIDIA-Provided

NVIDIA AI Enterprise Runtime

NIM inference microservices, model optimization, GPU-accelerated frameworks. The core AI runtime dependency — VMware’s Model Runtime wraps NVIDIA inference capabilities in a platform-managed service.

NVIDIA NIM Agent Blueprints

Pre-built agentic workflows (RAG, PDF extraction, digital twins). Same blueprints available on Dell, HPE, Cisco, Lenovo. Non-differentiating for VMware at 2B.

NVIDIA Triton Inference Server

Multi-framework model serving. Deployable as VCF Automation catalog item on GPU-capable VKS clusters.

◆ Gap Analysis

VMware’s Layer 2B is the most architecturally interesting in this assessment because it combines platform-native AI services (Model Runtime, Agent Builder) with NVIDIA runtime dependency: a hybrid Retained/Delegated model. The Model Runtime is a genuine platform capability: model serving as a managed VCF service with API Gateway, multi-tenant isolation, and multi-accelerator support. This is structurally different from Dell’s 2B (entirely NVIDIA-dependent: NemoClaw/OpenShell) and closer to HPE’s 2B (HPE provides the deployment platform, NVIDIA provides the execution runtime, with HPE-owned bracketing governance above and below).

The Agent Builder is notable as a platform-native agent construction surface. Compare to alternatives:

• Dell: No platform-native agent builder. Relies on NVIDIA NIM/NemoClaw post-deployment.
• HPE: CrewAI pre-installed (partner framework). Deloitte Zora AI (partner application).
• VAST: AgentEngine, a deeply integrated, proprietary agent runtime.
• Google: Agent Studio (no-code), ADK (code-first), Agent Designer.
• AWS: Bedrock Agents (no-code), Strands SDK (code-first).

VMware’s Agent Builder is simpler than the hyperscaler offerings but it’s integrated into the VCF operational model: agents built here inherit VCF’s security (vDefend microsegmentation), governance (MCP server controls), and observability (GPU/model metrics). That operational integration is VMware’s differentiator.

The Tanzu Platform MCP server publishing capability is a Layer 2B/2C bridge worth tracking: enabling developers to self-publish MCP servers creates an enterprise-internal agent tool marketplace governed by IT. This is a distributed model for agent capability deployment that differs from Google’s centralized Agent Registry or HPE’s curated Unleash AI ecosystem.

◆ Borrowed Judgment

Moderate. VMware owns Model Runtime, Agent Builder, and the Tanzu application runtime. But inference execution depends on NVIDIA AI Enterprise (same structural dependency as Dell and HPE). The multi-accelerator support (AMD + NVIDIA) provides a runtime alternative that Dell AI Factory customers don’t have (Dell’s AMD track is a separate stack), though HPE’s GX5000 also supports multi-vendor GPU blades and hyperscalers abstract accelerators entirely. VMware’s value is that the architect controls which accelerator serves which workload through familiar virtualization primitives — the control plane is opinionated (VMware’s virtualization model applied to GPUs) but provides operational knobs that vSphere-native teams already understand. The NVIDIA dependency at 2B is real but partially mitigated by VMware’s abstraction: Model Runtime provides a VMware-managed API surface. If NVIDIA changes its NIM/NemoClaw architecture, VMware absorbs the integration change — the enterprise’s API doesn’t change. This is the same ‘bracketing’ architecture HPE uses (GreenLake governance above and below the NVIDIA runtime), expressed differently (VMware API abstraction wrapping the NVIDIA runtime).
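The abstraction argument above can be pictured as an adapter pattern. The following is a minimal sketch under stated assumptions: `ModelRuntime`, `InferenceBackend`, `BackendV1`, `BackendV2`, and `BackendV2Adapter` are hypothetical names invented for illustration and do not correspond to any VMware or NVIDIA API. It only shows how a caller-facing surface can stay fixed while the runtime behind it changes shape.

```python
from typing import Protocol


class InferenceBackend(Protocol):
    """Hypothetical backend contract; not a real VMware or NVIDIA interface."""

    def generate(self, prompt: str) -> str: ...


class BackendV1:
    """Stand-in for one generation of a vendor runtime."""

    def generate(self, prompt: str) -> str:
        return f"v1:{prompt}"


class BackendV2:
    """Stand-in for a re-architected runtime with a different native call."""

    def run(self, payload: dict) -> dict:
        return {"text": f"v2:{payload['input']}"}


class BackendV2Adapter:
    """The platform layer absorbs the change by adapting the new call shape."""

    def __init__(self, backend: BackendV2) -> None:
        self._backend = backend

    def generate(self, prompt: str) -> str:
        return self._backend.run({"input": prompt})["text"]


class ModelRuntime:
    """Stable, caller-facing surface (hypothetical)."""

    def __init__(self, backend: InferenceBackend) -> None:
        self._backend = backend

    def complete(self, prompt: str) -> str:
        return self._backend.generate(prompt)


# The caller uses the same method before and after the backend swap.
before = ModelRuntime(BackendV1()).complete("hello")
after = ModelRuntime(BackendV2Adapter(BackendV2())).complete("hello")
```

In this sketch the enterprise code only ever calls `complete`; the cost of the backend change lands in the adapter, which is the role the assessment attributes to the platform vendor.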

◆ Working Notes

The multi-accelerator Model Runtime is significant but requires context: running the same AI model on AMD and NVIDIA GPUs without refactoring is a runtime-level abstraction that VMware provides through familiar virtualization management tools. HPE’s GX5000 supports multi-vendor GPU blades in the same rack, and hyperscalers abstract accelerators entirely at the service layer (Vertex AI, Bedrock). VMware’s differentiator is not multi-accelerator support per se but the level of architectural control — the operator manages GPU placement and scheduling through vSphere primitives they already know, with stronger knobs than cloud providers offer. Dell’s AMD track (Dell AI Platform with AMD) remains a separate branding with a separate software stack, not a unified runtime. The Tanzu-mediated MCP server publishing is an emerging capability that could become significant for agentic AI: if every enterprise developer can publish MCP servers through Tanzu, and IT governs access through VCF 9.1’s MCP server governance, VMware creates a platform for enterprise agent tooling that is neither centralized (Google) nor delegated to partners (HPE Unleash AI) but distributed-and-governed. Whether this pattern scales depends on enterprise developer adoption of Tanzu.

Layer 2C · Reasoning · Agentic Infrastructure: The Reasoning Plane · Emerging Signals Only

Policy-driven placement and resource coordination — the Autonomy Layer

Vendor-Provided

NVIDIA-Provided

No NVIDIA Layer 2C on VMware

NVIDIA provides no agent governance, policy-driven placement, or reasoning plane components in the VMware stack. Same gap as Dell — NVIDIA’s AI-Q is workflow scaffolding, OpenShell is constraint enforcement, Dynamo is performance routing. None is Layer 2C.

◆ Gap Analysis

Applying the ‘Routing Is Not Reasoning’ test: MCP Server Governance = access control. Intelligent Assist = IT operations automation. GPU/Model Metrics = observability telemetry. None provides policy-driven decisions about where compute runs relative to data, which model serves which request, and how cost/compliance/latency are arbitrated in real time.

VMware’s Layer 2C position is comparable to Dell’s: not yet evident as a productized capability. The signals (MCP governance, metrics observability, Intelligent Assist) suggest the building blocks exist but have not been composed into a reasoning plane. Compare to other vendors’ Layer 2C status:

• Dell: Absent. Dell+Intel ‘actively addressing’ but no product announced.
• HPE: Delegated to Kamiwaza (multi-layer orchestration partner via Unleash AI). GreenLake Intelligence provides IT ops 2C.
• VAST: PolicyEngine + Polaris, the most aggressive middle-out Layer 2C build.
• Google: Agent Identity + Gateway + Registry + Orchestration + Observability, the most complete productized 2C.
• AWS: AgentCore with implicit 2C through managed service placement decisions.

VMware’s unique 2C opportunity: VCF manages the entire infrastructure estate. If Broadcom builds a Layer 2C that queries vSAN governance metadata, GPU utilization telemetry, vDefend security posture, and Model Runtime performance metrics to make autonomous placement decisions, it would have the broadest infrastructure visibility of any on-prem 2C, because VCF sees everything from the hypervisor up. The data to build 2C exists in VCF Operations. The governance primitives exist in MCP Server Governance and ACC. The placement engine does not.
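To make the ‘Routing Is Not Reasoning’ distinction concrete, here is a small illustrative sketch. Everything in it is an assumption made for the example: the `Site` and `Request` types, the `place` function, the `egress_penalty` value, and the sample sites are hypothetical and describe no vendor’s product. A static route is a fixed lookup, while a placement policy filters on hard constraints (compliance region, latency ceiling) and then arbitrates cost, including a penalty for running away from the data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Site:
    name: str
    region: str
    cost_per_hour: float
    latency_ms: float
    holds_data: bool  # True if the dataset already lives at this site


@dataclass(frozen=True)
class Request:
    allowed_regions: frozenset  # compliance constraint
    max_latency_ms: float       # performance constraint


# Routing: a fixed lookup that never inspects cost, compliance, or data location.
STATIC_ROUTE = {"chat": "site-a"}


def place(request: Request, sites: list, egress_penalty: float = 5.0) -> Optional[Site]:
    """Filter on hard constraints, then pick the lowest effective cost."""
    eligible = [
        s for s in sites
        if s.region in request.allowed_regions
        and s.latency_ms <= request.max_latency_ms
    ]
    if not eligible:
        return None  # no compliant placement exists
    # Sites without the data pay a (hypothetical) data-movement penalty.
    return min(
        eligible,
        key=lambda s: s.cost_per_hour + (0.0 if s.holds_data else egress_penalty),
    )


sites = [
    Site("site-a", "us", 4.0, 20.0, False),
    Site("site-b", "eu", 6.0, 35.0, True),
    Site("site-c", "eu", 3.0, 40.0, False),
]
request = Request(frozenset({"eu"}), 50.0)
choice = place(request, sites)
```

With these sample values the static route still points at `site-a`, which violates the region constraint, while the policy selects `site-b`: it is more expensive per hour than `site-c` but cheaper once the data-movement penalty is counted. A production reasoning plane would draw these inputs from live telemetry rather than constants.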

◆ Borrowed Judgment

Inverted: there is no judgment to borrow because no Layer 2C exists. Same structural position as Dell. The enterprise has three options: build custom 2C logic, bring in a partner (Kamiwaza, potentially), or operate without a reasoning plane. Most will choose the third. The MCP Server Governance capability is an interesting partial answer: it provides governance over agent-tool interactions without providing placement intelligence. This is access-control-as-governance: necessary but not sufficient for a reasoning plane.
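A rough sketch of what access-control-as-governance amounts to, under stated assumptions: the `TOOL_GRANTS` table, the group names, the tool names, and the `may_call` helper are all hypothetical and do not reflect the actual VCF MCP Server Governance interface. The check answers one question only: whether a caller’s groups are granted a given tool.

```python
# Hypothetical group-to-tool allow-list; every name here is illustrative.
TOOL_GRANTS = {
    "finance-analysts": {"ledger-query", "report-export"},
    "platform-ops": {"cluster-status"},
}


def may_call(groups: set, tool: str) -> bool:
    """Return True if any of the caller's groups is granted the tool."""
    return any(tool in TOOL_GRANTS.get(group, set()) for group in groups)


ops_allowed = may_call({"platform-ops"}, "cluster-status")
ops_denied = may_call({"platform-ops"}, "ledger-query")
```

Nothing in this check considers where the tool call executes, what it costs, or which model serves it, which is why it is necessary for governance but says nothing about placement.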

◆ Working Notes

VMware has a structural advantage in building Layer 2C that no other on-prem vendor possesses: VCF is the control plane for the enterprise’s entire virtualized estate. Dell manages Dell hardware. HPE manages HPE hardware. VAST manages VAST storage. VMware manages EVERYTHING virtualized — across Dell, HPE, Lenovo, Cisco, and any other OEM’s hardware. A VMware Layer 2C would be the first multi-vendor infrastructure reasoning plane — making placement decisions across heterogeneous hardware from a single governance surface. No other vendor can build this because no other vendor has the cross-vendor infrastructure visibility. Whether Broadcom invests in this opportunity is an open question. The Broadcom acquisition thesis prioritizes cash generation from the installed base, not R&D investment in new platform capabilities. Layer 2C is a significant engineering investment. The $30B annual infrastructure software segment gives Broadcom the resources; the question is whether the strategic priority exists. Hock Tan’s framing of VCF as ‘the permanent abstraction layer between AI software and physical chips’ is a Layer 2A statement, not a Layer 2C statement. The abstraction layer manages resources. The reasoning plane governs them. VMware has the former; it does not have the latter.

Layer 3 (+1) · Applications · AI Application Layer: The Value Plane · Platform-Enabled, Not Platform-Provided

AI-powered business capabilities — business logic, workflow automation

Vendor-Provided

Private AI Services (Integrated) · Ceded

Model Runtime + Agent Builder + Data Indexing/Retrieval + Vector Database + Model Store + GPU Monitoring — all included in VCF subscription at no additional cost. This is not a Layer 3 application stack — it is an integrated set of AI platform services that enables Layer 3 development. The distinction matters: VMware provides the tools to BUILD AI applications, not the applications themselves. Proprietary VMware/Broadcom platform — opinions captive, no open exit.

NVIDIA Blueprints + NIM on VCF · Delegated

Pre-built AI application patterns deployable through VCF Automation catalog. Multimodal PDF Extraction, Digital Twins, RAG pipelines. Same blueprints available on Dell, HPE, Cisco, Lenovo — non-differentiating for VMware.

Tanzu Platform (Agent Distribution) · Ceded

Developers self-publish AI agents and MCP servers to the enterprise via Tanzu. IT maintains governance and oversight. Tanzu Marketplace provides curated path to certified middleware, data services, and AI tooling. This is an enterprise app store model for AI capabilities.

ISV + OEM Ecosystem · Delegated

VMware Private AI Foundation validated on Dell, HPE, Lenovo, Cisco, Supermicro, NEC, Fujitsu hardware. ISV ecosystem spans the entire VMware partner network — thousands of validated applications across every industry. AI-specific ISV validation is emerging but not yet at the curation depth of HPE’s Unleash AI (26+ selected ISV partners) or Dell’s AI Ecosystem Program (OpenAI, Palantir, Google, ServiceNow).

OpenWebUI Integration · Delegated

Open-source AI user interface integrated with VCF Private AI Services RAG. Provides a ChatGPT-like interface for enterprise users to interact with privately-hosted models. Demonstrates the ‘platform enables applications’ model.

NVIDIA-Provided

NVIDIA Model Ecosystem

Nemotron models, community models, NVIDIA NIM containers available through Model Store. NVIDIA provides the model layer; VMware provides the serving and governance layer.

◆ Gap Analysis

VMware’s Layer 3 is structurally different from every other assessed vendor because VMware is explicitly a platform, not an application provider. VMware provides the tools to build and deploy AI applications (Private AI Services) but does not build the applications themselves. This is the correct architectural position for an infrastructure platform vendor, and it’s the same position Dell occupies (Dell doesn’t build AI applications; it partners with OpenAI, Palantir, ServiceNow). The difference is ecosystem depth:

• Dell’s AI ecosystem: OpenAI, Palantir, Google, ServiceNow, SpaceXAI, Hugging Face, 5,000+ deployment customers. Explicitly curated for AI.
• HPE’s Unleash AI: 26+ selected ISV partners with validated interoperability. Kamiwaza orchestration. CrewAI pre-installed. Purpose-built for AI.
• VAST’s Cosmos Community: CoreWeave, TwelveLabs, CrowdStrike with distinct partner tracks. Focused and vertical.
• VMware’s AI ecosystem: Inherits the broader VMware partner ecosystem (thousands of ISVs) but without AI-specific curation depth. Private AI Foundation validation is available on major OEM hardware, but AI-specific ISV partnerships are not yet at the maturity of Dell or HPE programs.

The Tanzu-mediated MCP server publishing could evolve into VMware’s distinctive Layer 3 model: instead of curating an external ISV ecosystem (HPE’s approach) or partnering with AI application vendors (Dell’s approach), VMware enables the enterprise’s own developers to build and distribute AI agents internally. This is an internally-generated Layer 3 rather than an externally-sourced one.

The VCF installed base is the Layer 3 enabler: 100M+ cores means Private AI Services reach more enterprise infrastructure than any competitor’s AI platform. The AI applications built on VMware will be built by the enterprise’s own developers, using VMware’s tools, on VMware’s platform. Whether that bottom-up, developer-driven approach generates Layer 3 applications as quickly as Dell’s top-down partnerships (OpenAI, Palantir) or HPE’s curated ecosystem (Unleash AI) is the open question.

◆ Borrowed Judgment

Distributed across the enterprise’s own development teams and chosen partners. VMware provides the platform; the enterprise provides the application logic. This is the most explicit Retained model for Layer 3 in this assessment — the enterprise builds its own AI applications rather than consuming a vendor’s or partner’s. The trade-off: maximum control (Retained), maximum effort (the enterprise must build everything above the platform services layer). Dell and HPE offer Delegated shortcuts (partner applications). VMware offers Retained responsibility.

◆ Working Notes

Including Private AI Foundation at no additional cost for VCF subscribers is a strategic masterstroke for customer retention: every VCF customer already has access to Model Runtime, Agent Builder, Vector Database, Data Indexing/Retrieval, and Model Store. The marginal cost of trying Private AI is zero (beyond GPU hardware). This is the lowest-barrier entry to on-prem AI of any vendor assessed. The 9/10 Fortune 500 commitment to VCF means Private AI Foundation has the largest potential enterprise deployment footprint of any on-prem AI platform. Whether that potential converts to actual AI workload deployment depends on whether enterprises find Private AI Services sufficient for production AI or whether they choose purpose-built alternatives (Dell AI Factory, HPE Private Cloud AI, VAST AI OS) for deeper capabilities.

The competitive dynamic is unusual: VMware doesn’t compete with Dell or HPE at Layer 0 (VMware runs ON their hardware). VMware competes with them at Layers 1-3 (management, orchestration, AI services). An enterprise could run Dell hardware + VMware VCF + VMware Private AI Services, getting Dell’s Layer 0 with VMware’s Layers 2A/2B. Or it could run Dell hardware + Dell AI Factory, getting Dell’s Layer 0 with Dell/NVIDIA’s Layers 2A/2B. The choice is between VMware’s operational maturity and Dell/HPE’s AI-specific depth.

✦ Summary Finding

4+1 Layer AI Infrastructure Model · Vendor Assessment Series · The CTO Advisor LLC · thectoadvisor.com