VMware Private AI Foundation with NVIDIA occupies a structurally unique position in this assessment series: it is neither an infrastructure OEM (Dell, HPE), a hyperscaler (AWS, Google Cloud), nor a data platform vendor (VAST). It is a virtualization and private cloud platform — the abstraction layer that sits between physical infrastructure and workloads. Broadcom’s strategic thesis is that VCF is ‘the permanent abstraction layer between AI software and physical chips,’ and the Private AI Foundation extends that thesis into AI workloads specifically.
The 4+1 model reveals both the power and the limits of this position. VCF’s strength is Layer 2A — infrastructure orchestration is VMware’s heritage and its deepest IP. VCF Automation, vSphere Supervisor, VKS, vSAN, NSX/vDefend, and VCF Operations collectively provide the most mature unified orchestration surface for mixed workloads (VMs, containers, AI) of any on-prem vendor assessed. No other vendor in this series manages GPU-accelerated AI workloads, Kubernetes clusters, and traditional VMs from a single control plane with equivalent operational maturity.
At Layer 0, VMware’s multi-accelerator management (AMD, NVIDIA, Intel) requires careful contextualization. GPU vendor choice is not unique to VMware — HPE’s GX5000 supports NVIDIA and AMD blades in the same rack, and hyperscalers fully abstract accelerators at the service layer (a developer calling Vertex AI or Bedrock never sees which silicon powers the response). VMware’s actual differentiator is the level of architectural control: operators manage GPU placement, isolation, and scheduling through familiar vSphere primitives (vGPU profiles, vmclasses, DRS, resource pools). The control plane is borrowed judgment — it is VMware’s opinionated virtualization model applied to acceleration — but that opinion provides stronger knobs that appeal to operators already comfortable with virtualization management. Where hyperscalers abstract the accelerator away from the architect, VMware puts the architect in the driver’s seat through a familiar console.
But the closer the stack gets to AI-specific functions — model serving, retrieval, agent execution, governance — the more authority shifts to NVIDIA (Layer 2B runtime via NVIDIA AI Enterprise), to open-source components (pgvector, Elasticsearch), or to capabilities that are emerging but not yet at the depth of purpose-built alternatives. Private AI Services (Model Runtime, Agent Builder, Data Indexing/Retrieval, Vector Database, Model Store) are genuine platform capabilities delivered as part of the VCF subscription, but they are foundational AI services, not the deep data lifecycle or agent orchestration that Dell (Dataloop), HPE (Ezmeral/Kamiwaza), or VAST (DataEngine/AgentEngine) provide. Layer 1C (data pipelines) is absent. Layer 2C (reasoning plane) has building blocks — MCP Server Governance, GPU/Model Metrics, Intelligent Assist — but none passes the ‘Routing Is Not Reasoning’ test.
The installed base is the strategic moat: nine of the top ten Fortune 500 companies have committed to VCF, with 100M+ cores licensed worldwide. For the enormous VMware installed base, Private AI Foundation is the lowest-friction path to on-prem AI — no new infrastructure vendor, no new management plane, no new operational model, and no incremental cost beyond GPU hardware. The 4+1 question is whether lowest-friction adoption translates to sufficient architectural depth when agentic AI workloads demand governance, policy-driven placement, and cross-agent orchestration that VCF does not yet provide.
VMware Private AI Foundation is the enterprise’s most natural on-ramp to private AI. Hock Tan’s ‘permanent abstraction layer’ framing is a Layer 2A statement, not a Layer 2C statement — the abstraction layer manages resources; the reasoning plane governs them. VMware has the former; it does not have the latter. Whether it becomes the enterprise’s durable AI platform depends on whether Broadcom invests in the Layer 1A governance depth, Layer 1C pipeline capability, and Layer 2C reasoning plane that the 4+1 model identifies as structurally necessary — or whether the Broadcom acquisition thesis (cash generation from the installed base) constrains that investment. VMware’s unique structural advantage is that VCF sees everything from the hypervisor up across all OEM hardware — the data to build a multi-vendor reasoning plane exists. The engineering commitment does not yet.
Layer-by-layer status: Layer 0 (Hardware-Agnostic Abstraction), Layer 1A (Platform Storage, Not AI-Native), Layer 1B (Foundational RAG Services), Layer 1C (Gap), Layer 2A (VMware Heritage Strength), Layer 2B (Platform-Native + NVIDIA-Dependent), Layer 2C (Emerging Signals Only), Layer 3 (+1) (Platform-Enabled, Not Platform-Provided).
Assessment framework: 4+1 Layer AI Infrastructure Model. Scoring model: Decision Authority Placement Model (DAPM) — Retained, Delegated, Ceded, or Absent. Published by The CTO Advisor LLC. Author: Keith Townsend. Date assessed: May 22, 2026. Version: v1.0 — Initial Assessment.
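As a reading aid, the four DAPM placements can be modeled as a small enumeration. The sketch below is illustrative only: the class, field, and function names are hypothetical, and the three sample entries simply restate placements named in this assessment (hardware vendor choice Retained by the enterprise, GPU runtime Delegated to NVIDIA, Layer 1C data pipelines Absent). It is not a full scorecard.

```python
from dataclasses import dataclass
from enum import Enum


class Placement(Enum):
    """The four DAPM outcomes named in the scoring model."""
    RETAINED = "Retained"
    DELEGATED = "Delegated"
    CEDED = "Ceded"
    ABSENT = "Absent"


@dataclass(frozen=True)
class Decision:
    layer: str            # 4+1 layer label, e.g. "0" or "1C"
    decision: str         # what is being decided
    placement: Placement  # where the authority sits
    holder: str           # who holds it ("" when Absent)


# Sample entries restating findings from this assessment (hypothetical layout).
DECISIONS = [
    Decision("0", "hardware vendor choice", Placement.RETAINED, "enterprise"),
    Decision("0", "GPU runtime", Placement.DELEGATED, "NVIDIA"),
    Decision("1C", "data pipeline orchestration", Placement.ABSENT, ""),
]


def by_placement(decisions, placement):
    """Return the decisions that carry the given placement."""
    return [d for d in decisions if d.placement is placement]


if __name__ == "__main__":
    for d in DECISIONS:
        print(f"Layer {d.layer}: {d.decision} -> {d.placement.value}")
```

Grouping by placement rather than by layer makes it easy to list everything an enterprise has Delegated or Ceded in one pass, which is the question the model is designed to answer.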
Raw compute, networking, and acceleration fabric
Industry-standard virtualization layer. GPU passthrough and vGPU support via NVIDIA AI Enterprise integration. vSphere Supervisor manages both VMs and Kubernetes workloads from a single control plane. vMotion for live migration of AI workloads. DRS for automated load balancing. The hypervisor is Broadcom’s foundational IP — the abstraction layer between physical infrastructure and all workloads above.
VCF runs on Dell PowerEdge, HPE ProLiant, Lenovo ThinkSystem, Cisco UCS, Supermicro, NEC, Fujitsu, and others. Hardware-agnostic by design — VCF does not manufacture or specify compute. This is the fundamental architectural difference from Dell AI Factory or HPE Private Cloud AI: VMware abstracts hardware, OEMs provide it. The enterprise retains hardware vendor choice.
VCF 9.1 supports NVIDIA Blackwell architecture, NVSwitch on HGX platform, GPUDirect RDMA over InfiniBand for distributed LLM inference across multiple HGX servers. Enhanced DirectPath I/O for ConnectX-7 NICs and BlueField-3 DPUs. vGPU profiles mapped to vSphere Namespaces as vmclasses for multi-tenant GPU isolation — memory strictly partitioned per tenant with zero side-channel risk across GPU framebuffer.
Software-defined networking with micro-segmentation, Zero Trust enforcement via Distributed Firewall (Antrea CNI for Kubernetes), and in-memory malware defense. Avi Load Balancer provides virtualized load balancing for AI inference endpoints and agentic applications — eliminates hardware appliance requirements. Post-quantum cryptography support.
VCF 9.1 manages AMD, NVIDIA, and Intel accelerators through the same virtualization control plane — vGPU profiles, vmclasses, DRS policies, resource pools. The architect retains granular control over GPU placement, isolation, and scheduling using familiar vSphere primitives. This is not unique as a multi-vendor GPU capability (HPE GX5000 supports NVIDIA Rubin + AMD MI430X in the same rack; hyperscalers fully abstract accelerators at the service layer). VMware’s differentiator is the level of architectural control: operators already comfortable with virtualization management get GPU scheduling knobs they know how to turn. The control plane is opinionated — it applies VMware’s virtualization model to acceleration — but that opinion is the value for VMware-native shops.
Enterprise AI software suite providing vGPU drivers, GPU Operator for Kubernetes, and validated AI frameworks. NVAIE licenses purchased separately from VCF. Deeply integrated but independently licensed — the same NVIDIA dependency Dell and HPE share.
Blackwell, H100/H200, ConnectX-7/8, BlueField-3 DPUs, NVSwitch, InfiniBand. VMware validates and integrates but does not manufacture or specify GPU silicon.
Pre-built inference microservices deployable on Private AI Foundation. Nemotron models and community models available through Model Store.
VMware’s Layer 0 position is fundamentally different from every other vendor in this assessment: VMware provides the abstraction layer, not the physical infrastructure. Dell, HPE, and VAST own or specify hardware. Google and AWS own data centers. VMware sits above all of them. This creates a unique DAPM profile: the enterprise Retains hardware vendor choice (can switch from Dell to HPE to Lenovo without changing the management plane) but Delegates GPU runtime to NVIDIA and inherits whatever GPU integration VMware has validated. The abstraction is VMware’s value proposition and its architectural constraint: VMware can only support GPU features that the hypervisor can virtualize or pass through.
Multi-accelerator support requires nuanced comparison across the assessed vendors. GPU vendor choice is NOT unique to VMware:
• Hyperscalers (Google, AWS) fully abstract accelerators at the service layer: developers call Vertex AI or Bedrock and never see whether TPUs, Trainium, or NVIDIA GPUs power the response. Accelerator choice is Ceded to the cloud provider. Simplest developer experience, least architectural control.
• HPE GX5000 supports NVIDIA Rubin and AMD MI430X GPU blades in the same rack architecture. Multi-vendor at the hardware level.
• Dell AI Factory is NVIDIA-only. ‘Dell AI Platform with AMD’ is a separate branding, separate software stack, not a unified runtime.
• VAST CNode-X is NVIDIA-only.
VMware’s actual differentiator is the level of architectural control over acceleration: VCF manages AMD, NVIDIA, and Intel accelerators through familiar virtualization primitives (vGPU profiles mapped to vmclasses, DRS for GPU workload balancing, vMotion for live migration, resource pools for multi-tenant isolation). The enterprise architect retains granular control over GPU placement, scheduling, and isolation using tools they already operate.
The control plane is borrowed judgment — it is VMware’s opinionated virtualization model applied to GPU resources — but that opinion provides stronger knobs that appeal specifically to operators already comfortable with vSphere management. Where hyperscalers abstract the accelerator away from the architect, VMware puts the architect in the driver’s seat through a familiar console. The vDefend security story is architecturally significant for AI: micro-segmentation at the packet level between every AI component (model server, vector database, embedding service, API gateway) with Terraform-codified firewall rules. This is infrastructure-layer Zero Trust for AI workloads — a capability that Dell and HPE don’t provide at equivalent depth from the platform layer. VAST’s CrowdStrike integration operates at a different level (application/data-layer security vs. network-layer micro-segmentation).
Multi-directional: VMware borrows GPU silicon judgment from NVIDIA (same as everyone) and hardware engineering judgment from OEM partners (Dell, HPE, Lenovo build the servers). But VMware retains the abstraction layer — the hypervisor, the networking, the security model, the orchestration. This is the inverse of Dell’s position: Dell retains hardware judgment and borrows software judgment from NVIDIA. VMware retains software judgment and borrows hardware judgment from OEMs. Critically, the VMware control plane is itself borrowed judgment for the enterprise: the architect gains granular GPU management through vSphere primitives, but those primitives encode VMware’s opinions about how acceleration should be virtualized, scheduled, and isolated. The enterprise borrows VMware’s virtualization worldview in exchange for operational familiarity. This is a different trade-off than the hyperscalers (where the enterprise cedes acceleration decisions entirely) or bare-metal (where the enterprise retains full control but builds everything). VMware occupies the middle: more control than cloud, less effort than bare-metal, but through an opinionated lens. The NVIDIA AI Enterprise dependency is real but no deeper than Dell’s or HPE’s: all three require NVAIE for GPU virtualization and AI framework support. VMware’s co-engineering relationship with NVIDIA on VCF integration is comparable to HPE’s Private Cloud AI co-engineering.
The Broadcom acquisition context is impossible to ignore at Layer 0: Gartner projects VMware’s virtualization market share will fall from 70% (2024) to 40% (2029) due to pricing changes. Nutanix’s CEO has publicly targeted 165,000 of VMware’s approximately 300,000 customers. Broadcom has converted 90%+ of the top 10,000 VMware customers to VCF subscriptions, with price increases of 200-500% reported. This creates a unique installed-base dynamic: Private AI Foundation’s market opportunity is less about winning new customers than about retaining existing ones by making VCF indispensable for AI workloads. If the enterprise is already paying for VCF, Private AI Services come at no additional cost, a fundamentally different go-to-market than Dell (buy new PowerEdge + NVIDIA), HPE (buy new Private Cloud AI), or VAST (deploy new AI OS). The air-gapped deployment support is significant for regulated industries and government: it is the same capability Dell and HPE emphasize, delivered through the existing VCF automation framework rather than a purpose-built AI appliance.
Durable, governed data foundation — the Governance Catalog that Layer 2C queries
Hyper-converged storage integrated into VCF. Block and file storage natively. Native Object Storage (S3-compatible) in tech preview with VCF 9.1.x — brings S3 interface natively into the platform without third-party licensing. vSAN deduplication and compression for cost reduction. Unified storage policies and multi-tenant self-service access.
Sovereign, in-place ransomware recovery using native snapshot capabilities. Deep snapshot chains and integrated replication workflows. On-prem recovery without external dependencies.
Curated LLM repository with integrated RBAC access control. MLOps teams and data scientists can securely manage and provide LLMs with governance and security for enterprise data and IP. NVIDIA models, Nemotron, and community models available.
VCF supports Dell PowerScale, Dell ObjectScale, NetApp ONTAP, Pure Storage, HPE Alletra, and other enterprise storage via vSphere APIs. The storage layer is not limited to vSAN — enterprises can bring existing storage investments. This is a heterogeneous storage approach vs. Dell’s vertically integrated storage (PowerScale/ObjectScale/Exascale) or VAST’s collapsed storage (Element Store).
NVIDIA does not provide storage or governance components in the VMware Private AI stack. Storage is VMware-owned (vSAN) or enterprise-chosen (external arrays).
VMware’s Layer 1A is fundamentally different from every other assessed vendor because VMware is not a storage company. vSAN provides competent hyper-converged storage with the VCF 9.1 addition of native S3 object storage, but this is general-purpose platform storage — not AI-optimized data infrastructure. Compare to Dell: MetadataIQ indexes billions of files with AI-specific metadata enrichment. Exascale provides 10+ PB/rack unified file+object+fast-file. Trust3 AI provides storage-layer governance for sensitive data discovery. Compare to HPE: Data Fabric v8.1 provides policy-based data placement with Apache Polaris catalog for Iceberg tables. Alletra B10000 provides real-time agentic storage support with semantic understanding. Compare to VAST: Element Store collapses file, object, table, and vector into a single governed data structure with inline metadata enrichment. VMware’s storage story is ‘bring your existing storage’ — which is pragmatic for the installed base but means Layer 1A governance (metadata richness, data lineage, policy-based placement) depends entirely on whichever external storage vendor the enterprise has deployed. VMware itself provides no AI-specific governance catalog, no metadata enrichment, no data lineage tracking. The Model Store capability is a Layer 1A function worth noting: RBAC-governed model repository is a governance primitive that Dell’s AI Factory lacks as a platform-native capability. But Model Store governs models, not data — it does not address the broader question of which data feeds which model under what compliance constraints.
Low for vSAN (VMware-owned). High for AI-specific storage governance — entirely dependent on whichever external storage vendor the enterprise deploys. If the enterprise runs Dell storage, it inherits Dell’s governance capabilities (MetadataIQ). If it runs NetApp, it inherits NetApp’s. VMware provides no abstraction or unification of storage governance across heterogeneous backends. This is the inverse of VMware’s Layer 0 strength: at Layer 0, VMware abstracts heterogeneous hardware into a unified management plane. At Layer 1A, VMware does NOT abstract heterogeneous storage governance into a unified governance plane. The storage abstraction stops at provisioning and capacity management — it does not extend to metadata, lineage, or policy.
The native S3 Object Storage in VCF 9.1.x (tech preview) is a strategic move: S3 compatibility is the lingua franca of AI data pipelines. Every vendor in this assessment provides S3 access (Dell ObjectScale, HPE Alletra X10000, VAST DataStore, AWS S3, Google Cloud Storage). VMware adding native S3 to vSAN reduces the dependency on external object storage for AI workloads. The Tanzu Marketplace integration provides a curated path to certified middleware and data services — this is VMware’s approach to ecosystem curation at the data layer, comparable in intent (not depth) to HPE’s Unleash AI program or VAST’s Cosmos Community. SQL Server DBaaS as a first-class VCF citizen is a pragmatic enterprise play — most enterprises have SQL Server deployments, and making it a platform service reduces the friction of data access for AI workloads.
Low-latency retrieval for RAG — vector/hybrid search, context windows
Index and maintain multiple data sources, making them readily available for consumption by AI applications. Integrated with Model Runtime for RAG workflows. Keeps indexed data current as sources change.
pgvector on PostgreSQL delivered via Data Services Manager with VMware enterprise-level support. Enables domain-specific, up-to-date context for AI models. PostgreSQL 16.8 with pgvector 0.8.0 extension in Private AI Services 2.1.
NVIDIA NIM RAG Blueprint v2.5.0 validated on VCF — production-grade, multi-model RAG pipeline. Pre-built catalog items in VCF Automation for deploying complete RAG workflows. Elasticsearch supported as external vector database for advanced retrieval scenarios.
Inference microservices and retrieval-augmented generation components. RAG Blueprints provide pre-built retrieval patterns. Same capabilities available on Dell and HPE platforms.
Validated software stack for RAG workflows on VMware Private AI Foundation. GPU-accelerated embedding generation and retrieval.
VMware’s Layer 1B provides functional RAG capabilities through Private AI Services — Data Indexing/Retrieval and Vector Database are genuine platform services, not just partner integrations. But the retrieval stack is foundational, not differentiated. pgvector on PostgreSQL is a competent vector database for moderate-scale use cases but lacks the performance characteristics of purpose-built alternatives. Compare to Dell’s Data Search Engine (Elasticsearch 9.4 with GPU-accelerated hybrid search, MetadataIQ integration). Compare to VAST’s InsightEngine (native to the data platform, no data movement for retrieval). Compare to HPE’s Alletra X10000 with KV cache storage support for inference state persistence. The Data Indexing & Retrieval service addresses the core RAG requirement — keeping context current as data sources change — but without the metadata richness or governance integration that Dell (MetadataIQ + Elastic), HPE (Data Fabric + Kamiwaza), or VAST (Catalog + InsightEngine) provide. No retrieval quality observability (recall@k, latency percentiles) is evident — the same gap identified in the Dell assessment. A Layer 2C placement engine would need retrieval quality metrics to make informed routing decisions.
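For reference, the two retrieval quality measures named above have simple definitions. The sketch below computes recall@k and a nearest-rank latency percentile in plain Python. The function names and sample data are illustrative assumptions and do not correspond to any VMware, NVIDIA, or Elastic API.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant items that appear in the top-k retrieved list."""
    relevant_set = set(relevant)
    if not relevant_set:
        raise ValueError("relevant must not be empty")
    return len(set(retrieved[:k]) & relevant_set) / len(relevant_set)


def percentile(values, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not values:
        raise ValueError("values must not be empty")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding surprises.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Made-up sample data for one query and ten latency measurements (ms).
RETRIEVED = ["d3", "d7", "d1", "d9", "d4"]
RELEVANT = {"d1", "d2", "d3"}
LATENCIES_MS = [12, 15, 11, 40, 13, 14, 18, 95, 16, 17]

if __name__ == "__main__":
    print("recall@3:", recall_at_k(RETRIEVED, RELEVANT, 3))  # 2 of 3 relevant docs found
    print("p50 ms:", percentile(LATENCIES_MS, 50))           # 15
    print("p95 ms:", percentile(LATENCIES_MS, 95))           # 95
```

A placement engine of the kind described here would consume numbers like these per retrieval backend. The sample shows why a percentile matters alongside the median: the p50 is 15 ms while the p95 is 95 ms.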
Moderate. VMware owns the Data Indexing/Retrieval and Vector Database services. RAG pipeline patterns depend on NVIDIA NIM/Blueprints (same dependency as Dell and HPE). Elasticsearch as external vector database option introduces the same Elastic dependency Dell has — search intelligence is Elastic’s, not VMware’s. The pgvector choice is notable: PostgreSQL is the most widely deployed enterprise database. By building on pgvector, VMware reduces adoption friction (most enterprises already have PostgreSQL expertise) at the cost of retrieval performance ceiling. Dell chose Elasticsearch (higher performance, more complex). VAST built its own (highest integration, most proprietary). VMware chose the most pragmatic option.
The OpenWebUI integration with Private AI Services RAG demonstrates VMware’s approach to Layer 1B: provide the retrieval infrastructure, let the enterprise choose the user-facing application layer. This is consistent with VMware’s platform philosophy — VMware provides infrastructure services, not applications. The RAG Blueprint validation on VCF (multi-model, production-grade, 8x NVIDIA H100 80GB GPUs) provides a concrete reference architecture that enterprises can deploy from VCF Automation catalog items. This is operationally simpler than assembling equivalent RAG infrastructure on bare-metal Dell or HPE hardware.
Move/transform data — policy-driven placement, lineage, cost-aware movement
Pre-built self-service catalog items for AI workload deployment: Deep Learning VMs (PyTorch, TensorFlow pre-installed), AI Kubernetes clusters with GPU worker nodes, Triton Inference Servers. Quickstart templates eliminate weeks of manual setup. These are deployment pipelines, not data pipelines.
VKS 3.6 with GitOps-based infrastructure and application management. Tanzu Platform provides container runtime and application deployment. Pipeline orchestration tools (Airflow, Kubeflow, MLflow) can run on VKS but are not VMware-provided — the enterprise must bring its own ML pipeline stack.
Intelligently tiers DRAM and NVMe for memory-bound AI and database workloads. Topology Aware Scheduling places workloads with NUMA and accelerator locality. These are infrastructure-level data movement optimizations, not AI data pipeline orchestration.
Pre-built AI application patterns deployable through VCF Automation. Pipeline templates, not pipeline infrastructure — same as Dell and HPE.
KV cache management for context memory offload. When integrated with VCF, this could provide the same KV cache tiering Dell has validated (19x TTFT improvement).
Layer 1C is VMware’s most significant gap relative to other assessed vendors. VMware provides no equivalent to:
• Dell’s Data Orchestration Engine (Dataloop): No-code/low-code AI data lifecycle management, Dell’s most meaningful software acquisition.
• HPE’s Ezmeral Unified Analytics: Enterprise-hardened ML pipeline stack (Airflow, Kubeflow, Ray, Feast, MLflow, Spark).
• HPE’s Data Fabric: Policy-based data placement with compliance tagging and data lineage.
• VAST’s DataEngine: Serverless data transformation with CLI/SDK, built-in observability, triggers, and automated pipelines.
VCF Automation provides deployment pipelines (standing up AI infrastructure) but not data pipelines (moving, transforming, and governing data through ML workflows). The enterprise running VMware Private AI Foundation must bring its own data pipeline orchestration (Airflow, Kubeflow, or a commercial alternative) and deploy it on VKS. This is architecturally consistent with VMware’s platform philosophy: VCF provides infrastructure services, not application-layer data engineering tools. But it leaves a functional gap that competitors have filled. An enterprise choosing VMware for AI inherits a Layer 1C assembly problem that Dell (Dataloop), HPE (Ezmeral), or VAST (DataEngine) partially or fully solve.
High. The enterprise must borrow data pipeline judgment from whatever tools it deploys on VKS — Apache Airflow community, Kubeflow community, or a commercial vendor (Dataloop, Databricks, etc.). VMware provides no opinion on data pipeline architecture, no integration between pipeline metadata and infrastructure governance, and no data lineage capability. Compare to Dell: Dell acquired Dataloop specifically to address Layer 1C. Compare to HPE: HPE assembled Ezmeral through four acquisitions (BlueData, MapR, Ampool, Arrikto). Compare to VAST: VAST built DataEngine as a native platform capability. VMware has made no equivalent investment in data pipeline IP.
The gap is real but may be strategic: VMware has historically succeeded by providing infrastructure primitives that partner ecosystems build on, rather than by building application-layer tooling. The question is whether AI data pipelines are infrastructure (VMware should own them) or applications (VMware should enable them). The Tanzu Marketplace could address this gap through curated data pipeline services — certified Airflow, MLflow, or Kubeflow deployments validated for VCF. This would be a Delegated approach (partner provides the capability, VMware validates the deployment) rather than a Retained approach (VMware builds the capability). Architecturally similar to HPE’s Unleash AI ecosystem model. The KV cache story is notably absent: Dell has validated NVIDIA CMX with 19x TTFT improvement on PowerScale. HPE has native KV cache storage support in Alletra X10000. VAST collocates cache and compute in CNode-X. VMware has not yet announced equivalent KV cache tiering capabilities.
Lifecycle management, resource scheduling, policy enforcement, unified ops
Self-service catalog with pre-built AI workload templates. Quickstart deployment for Private AI Foundation. Infrastructure-as-code with Terraform integration. Multi-tenant resource provisioning with RBAC. Live Application Stack Blueprints for versioned, redeployable application topologies. Day 2 operations for AI Blueprints.
Unified management of VMs, containers, and AI workloads from a single control plane. VKS (vSphere Kubernetes Service) 3.6 supports up to 500 Kubernetes clusters per Supervisor. Simplified Container-as-a-Service for application teams. VM Fast-Deploy for accelerated provisioning. vSphere Elastic Provisioning for zero-touch fleet expansion. GitOps-based infrastructure management.
Private AI Model and GPU Metrics — utilization, memory pressure, and model-level visibility on the same console as the rest of the estate. Real-Time Operational Observability turns telemetry into action. Customizable dashboards for AI model and agent performance. Capacity management and compliance monitoring for AI workloads.
Full-stack lifecycle management for VCF. Automated deployment, patching, and upgrades across vSphere, vSAN, NSX, and VKS. Single-pane fleet management. Expanded fleet size and upgrade scale in 9.1. This is the operational backbone — the equivalent of HPE’s GreenLake or Dell’s APEX management, but with 20+ years of enterprise maturity.
Continuous compliance enforcement with automated drift detection and remediation. Hardened infrastructure images. Integrated with vDefend for security posture management. Disaster recovery via vSAN for Recovery.
IT operations can centrally manage and control access to MCP tools and associated servers across their environment. Ensures user groups can only access approved MCP tools. Security guardrails for MCP servers via vDefend and Avi Load Balancer. This is a Layer 2A governance function with 2C implications — controlling which agents can access which tools.
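The access rule described above (user groups can reach only approved MCP tools) amounts to a default-deny allow-list. The sketch below illustrates that model in isolation. The group names, tool names, and the `can_invoke` function are hypothetical and do not represent the VCF interface.

```python
# Hypothetical policy: user group -> set of approved MCP tool names.
APPROVED_TOOLS = {
    "data-science": {"vector-search", "model-catalog"},
    "platform-ops": {"vector-search", "model-catalog", "cluster-admin"},
}


def can_invoke(groups, tool, policy=APPROVED_TOOLS):
    """Default-deny: allow only if one of the caller's groups approves the tool."""
    return any(tool in policy.get(group, set()) for group in groups)


if __name__ == "__main__":
    print(can_invoke(["data-science"], "vector-search"))  # True
    print(can_invoke(["data-science"], "cluster-admin"))  # False
    print(can_invoke(["contractors"], "vector-search"))   # False, unknown group
```

Unknown groups and unknown tools both fall through to a denial, so a new MCP server is invisible to every user until IT adds it to the policy.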
Kubernetes operator for GPU lifecycle management. Manages GPU drivers, container runtime, device plugins. Standard across all NVIDIA-integrated platforms.
GPU virtualization profiles and multi-tenant GPU allocation. Memory partitioning and time-sliced compute scheduling. Managed through vSphere Supervisor.
Layer 2A is VMware’s strongest layer, arguably the strongest Layer 2A of any vendor in this assessment series. The reason is operational maturity: VCF has been managing enterprise infrastructure for two decades. No other vendor assessed has equivalent depth in lifecycle management, multi-tenant orchestration, compliance automation, and unified VM/container/AI workload management. Specific 2A differentiators vs. other assessed vendors:
• Unified workload management: Dell manages AI workloads separately from traditional workloads (OpenManage for servers, Run:ai for GPUs, separate tools for each). HPE manages AI through GreenLake Intelligence + OpsRamp + Private Cloud AI (three systems). VMware manages AI, containers, and VMs from ONE control plane (vSphere Supervisor). VAST manages only VAST workloads.
• Operational maturity: VCF’s Day 2 operations (patching, upgrades, compliance, capacity planning) for AI workloads inherit the same proven processes used for the enterprise’s existing VM fleet. No new operational model is required. Dell and HPE AI stacks require new operational processes. VAST requires an entirely new operational discipline.
• MCP Server Governance is a notable 2A/2C bridge: centrally controlling which user groups can access which MCP tools is an infrastructure-level governance function that no other on-prem vendor provides as a platform-native capability. Google’s Agent Gateway provides equivalent capability in cloud.
The GPU scheduling gap remains: NVIDIA GPU Operator and vGPU Manager handle GPU allocation, but policy-driven GPU scheduling (which workload gets which GPU based on cost, compliance, and performance constraints) is not a VCF-native function. This is the same gap Dell has with Run:ai: the scheduling intelligence is NVIDIA’s, not the platform vendor’s.
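To make the missing capability concrete, the sketch below shows one way a policy-driven placement decision could be expressed: filter GPU pools on compliance and performance constraints, then rank the survivors by cost. Everything here is an assumption for illustration (the `GpuPool` and `Workload` types, the zone labels, the chargeback rates). It does not describe how VCF, the NVIDIA GPU Operator, or Run:ai behave.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class GpuPool:
    name: str
    memory_gb: int               # GPU memory available per device
    cost_per_hour: float         # hypothetical internal chargeback rate
    compliance_zones: frozenset  # zones this pool is approved for


@dataclass(frozen=True)
class Workload:
    name: str
    min_memory_gb: int           # performance constraint
    required_zone: str           # compliance constraint


def place(workload: Workload, pools: Sequence[GpuPool]) -> Optional[GpuPool]:
    """Filter on compliance and performance, then pick the cheapest pool."""
    eligible = [
        pool for pool in pools
        if workload.required_zone in pool.compliance_zones
        and pool.memory_gb >= workload.min_memory_gb
    ]
    # Returns None when no pool satisfies every constraint.
    return min(eligible, key=lambda pool: pool.cost_per_hour, default=None)


POOLS = [
    GpuPool("pool-a", 80, 4.0, frozenset({"regulated", "general"})),
    GpuPool("pool-b", 48, 1.5, frozenset({"general"})),
    GpuPool("pool-c", 80, 3.0, frozenset({"general"})),
]

if __name__ == "__main__":
    print(place(Workload("rag-inference", 40, "general"), POOLS).name)  # pool-b
    print(place(Workload("claims-llm", 64, "regulated"), POOLS).name)   # pool-a
    print(place(Workload("oversized", 160, "general"), POOLS))          # None
```

Hard constraints (compliance, minimum memory) are applied as filters and cost as the ranking key, so a cheaper pool can never win by violating policy. A production scheduler would also need live utilization and preemption rules, which this sketch omits.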
Low — the lowest of any layer in the VMware assessment. VCF Automation, vSphere, VKS, vSAN, NSX, VCF Operations, and SDDC Manager are all Broadcom/VMware IP. GPU scheduling is the primary borrowed judgment (NVIDIA GPU Operator + vGPU Manager), but this is the same dependency every on-prem vendor shares. Compare to Dell Layer 2A: Dell splits 2A between OpenManage (Dell-owned) and Run:ai (NVIDIA-owned, acquired). VMware retains more 2A authority than Dell. Compare to HPE Layer 2A: HPE’s GreenLake Intelligence is HPE-owned 2A with MCP-based agent communication. VMware’s VCF Operations is VMware-owned 2A with emerging MCP support. Both retain 2A authority; different architectural approaches (HPE: agentic mesh; VMware: traditional orchestration evolving toward agentic).
The Intelligent Assist for VCF (tech preview) signals VMware’s evolution toward agentic infrastructure management. An AI-driven support assistant that diagnoses and resolves issues by consulting Broadcom’s knowledge base is functionally similar to HPE’s GreenLake Intelligence domain agents or Dell’s CloudIQ, but at an earlier stage of development. The installed base of 100M+ licensed cores gives VMware an operational data advantage no other on-prem vendor can match: patterns learned from managing the world’s largest virtualization fleet can inform AI workload optimization in ways that newer platforms cannot. Whether Broadcom invests in leveraging this data advantage for AI-specific intelligence is an open question.
Model serving, agent execution, inference APIs, distributed inference
Run inference and embedding models as a service across the organization. API Gateway allows users and AI applications to interact with models directly via API. Multi-accelerator support — same model deployment on AMD and NVIDIA GPUs without refactoring. Model endpoints configurable through VCF Automation UI. Multi-tenant Models-as-a-Service enables secure model sharing across business units to lower costs and reduce power consumption.
Build AI agents in a user-friendly playground, leveraging models and knowledge bases created using other Private AI services. Integrated with Model Runtime and Data Indexing/Retrieval for end-to-end agent development. This is a platform-native agent construction surface — not as deep as VAST’s AgentEngine or Google’s Agent Studio/ADK, but integrated into the VCF operational model.
Pre-configured virtual machines with validated AI/ML software stacks: PyTorch, TensorFlow, Miniconda. Software stack validated in advance on NVIDIA GPUs — data scientists start developing immediately without compatibility validation. Provisioned through VCF Automation self-service catalog.
GPU-capable Kubernetes worker nodes for cloud-native AI/ML workloads. Triton Inference Server deployable from catalog. Distributed LLM inference with GPUDirect RDMA over InfiniBand for models that exceed single-server capacity (DeepSeek-R1, Llama 3.1-405B). This is infrastructure-layer runtime support, not an opinionated agent execution framework.
PaaS-layer application runtime. ‘You provide code, we put it into production.’ Governed agentic coding with Tanzu. Developers can self-publish AI agents and MCP servers, sharing AI applications and tools across the enterprise. MCP server publishing makes Tanzu a distribution surface for enterprise agent tooling.
NIM inference microservices, model optimization, GPU-accelerated frameworks. The core AI runtime dependency — VMware’s Model Runtime wraps NVIDIA inference capabilities in a platform-managed service.
Pre-built agentic workflows (RAG, PDF extraction, digital twins). Same blueprints available on Dell, HPE, Cisco, Lenovo. Non-differentiating for VMware at 2B.
Multi-framework model serving. Deployable as VCF Automation catalog item on GPU-capable VKS clusters.
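As a rough illustration of why the models named in the distributed-inference entry above exceed single-server capacity, the weight footprint can be estimated from parameter count and numeric precision. The sketch below is back-of-envelope arithmetic, not a sizing tool: the 8 x 80 GB server configuration and the 16-bit precision are assumptions that do not come from this assessment, and KV cache and activation memory are ignored, so real requirements are higher.

```python
import math


def weight_footprint_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    # Billions of parameters times bytes per parameter gives gigabytes directly.
    return params_billions * bytes_per_param


def min_servers(footprint_gb: float, gpus_per_server: int, gpu_mem_gb: float) -> int:
    """Smallest number of servers whose combined GPU memory holds the weights."""
    return math.ceil(footprint_gb / (gpus_per_server * gpu_mem_gb))


# Assumed server shape (hypothetical, for illustration): 8 GPUs with 80 GB each.
GPUS_PER_SERVER = 8
GPU_MEM_GB = 80.0

# Llama 3.1-405B: 405 billion parameters at an assumed 2 bytes per parameter.
llama_405b_gb = weight_footprint_gb(405, 2)
servers = min_servers(llama_405b_gb, GPUS_PER_SERVER, GPU_MEM_GB)

# prints: 810 GB of weights -> at least 2 servers
print(f"{llama_405b_gb:.0f} GB of weights -> at least {servers} servers")
```

Under these assumptions the weights alone (810 GB) exceed the 640 GB of GPU memory in one server, which is why the inter-node fabric (GPUDirect RDMA over InfiniBand) becomes part of the inference path.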
VMware’s Layer 2B is the most architecturally interesting in this assessment because it combines platform-native AI services (Model Runtime, Agent Builder) with NVIDIA runtime dependency: a hybrid Retained/Delegated model. The Model Runtime is a genuine platform capability: model serving as a managed VCF service with API Gateway, multi-tenant isolation, and multi-accelerator support. This is structurally different from Dell’s 2B (entirely NVIDIA-dependent: NemoClaw/OpenShell) and closer to HPE’s 2B (HPE provides the deployment platform, NVIDIA provides the execution runtime, with HPE-owned bracketing governance above and below).
The Agent Builder is notable as a platform-native agent construction surface. Compare to alternatives:
• Dell: No platform-native agent builder. Relies on NVIDIA NIM/NemoClaw post-deployment.
• HPE: CrewAI pre-installed (partner framework). Deloitte Zora AI (partner application).
• VAST: AgentEngine, a deeply integrated, proprietary agent runtime.
• Google: Agent Studio (no-code), ADK (code-first), Agent Designer.
• AWS: Bedrock Agents (no-code), Strands SDK (code-first).
VMware’s Agent Builder is simpler than the hyperscaler offerings but it’s integrated into the VCF operational model: agents built here inherit VCF’s security (vDefend microsegmentation), governance (MCP server controls), and observability (GPU/model metrics). That operational integration is VMware’s differentiator.
The Tanzu Platform MCP server publishing capability is a Layer 2B/2C bridge worth tracking: enabling developers to self-publish MCP servers creates an enterprise-internal agent tool marketplace governed by IT. This is a distributed model for agent capability deployment that differs from Google’s centralized Agent Registry or HPE’s curated Unleash AI ecosystem.
Moderate. VMware owns Model Runtime, Agent Builder, and the Tanzu application runtime. But inference execution depends on NVIDIA AI Enterprise (same structural dependency as Dell and HPE). The multi-accelerator support (AMD + NVIDIA) provides a runtime alternative that Dell AI Factory customers don’t have (Dell’s AMD track is a separate stack), though HPE’s GX5000 also supports multi-vendor GPU blades and hyperscalers abstract accelerators entirely. VMware’s value is that the architect controls which accelerator serves which workload through familiar virtualization primitives — the control plane is opinionated (VMware’s virtualization model applied to GPUs) but provides operational knobs that vSphere-native teams already understand. The NVIDIA dependency at 2B is real but partially mitigated by VMware’s abstraction: Model Runtime provides a VMware-managed API surface. If NVIDIA changes its NIM/NemoClaw architecture, VMware absorbs the integration change — the enterprise’s API doesn’t change. This is the same ‘bracketing’ architecture HPE uses (GreenLake governance above and below the NVIDIA runtime), expressed differently (VMware API abstraction wrapping the NVIDIA runtime).
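The ‘enterprise’s API doesn’t change’ argument is an instance of the adapter pattern, which a short sketch can make concrete. Nothing below reflects VMware’s actual Model Runtime implementation or API: `InferenceBackend`, `ModelEndpoint`, `NvidiaBackend`, and `AmdBackend` are hypothetical names invented for illustration, and the backends return canned strings rather than calling any real runtime.

```python
from typing import Protocol


class InferenceBackend(Protocol):
    """Hypothetical contract that each vendor runtime adapter must satisfy."""

    def generate(self, prompt: str) -> str: ...


class NvidiaBackend:
    """Stand-in for an NVIDIA-based runtime (illustrative, no real calls)."""

    def generate(self, prompt: str) -> str:
        return f"[nvidia] {prompt}"


class AmdBackend:
    """Stand-in for an AMD-based runtime (illustrative, no real calls)."""

    def generate(self, prompt: str) -> str:
        return f"[amd] {prompt}"


class ModelEndpoint:
    """Stable, platform-owned surface that applications depend on."""

    def __init__(self, backend: InferenceBackend) -> None:
        self._backend = backend

    def swap_backend(self, backend: InferenceBackend) -> None:
        # The platform absorbs the integration change; callers are untouched.
        self._backend = backend

    def complete(self, prompt: str) -> str:
        return self._backend.generate(prompt)


endpoint = ModelEndpoint(NvidiaBackend())
before = endpoint.complete("summarize Q3")

endpoint.swap_backend(AmdBackend())
after = endpoint.complete("summarize Q3")  # same call, different silicon
```

The design point is that application code only ever calls `complete`; which runtime or accelerator sits behind it is a platform decision, which is the sense in which the dependency is mitigated rather than removed.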
The multi-accelerator Model Runtime is significant but requires context: running the same AI model on AMD and NVIDIA GPUs without refactoring is a runtime-level abstraction that VMware provides through familiar virtualization management tools. HPE’s GX5000 supports multi-vendor GPU blades in the same rack, and hyperscalers abstract accelerators entirely at the service layer (Vertex AI, Bedrock). VMware’s differentiator is not multi-accelerator support per se but the level of architectural control: the operator manages GPU placement and scheduling through vSphere primitives they already know, with stronger knobs than cloud providers offer. Dell’s AMD track (Dell AI Platform with AMD) remains a separately branded offering with its own software stack, not a unified runtime. The Tanzu-mediated MCP server publishing is an emerging capability that could become significant for agentic AI: if every enterprise developer can publish MCP servers through Tanzu, and IT governs access through VCF 9.1’s MCP server governance, VMware creates a platform for enterprise agent tooling that is neither centralized (Google) nor delegated to partners (HPE Unleash AI) but distributed-and-governed. Whether this pattern scales depends on enterprise developer adoption of Tanzu.
Policy-driven placement and resource coordination — the Autonomy Layer
Central management and access control for MCP tools and servers. User groups restricted to approved tools. Security guardrails via vDefend and Avi. This is an access control function — it determines WHICH agents can use WHICH tools. It is NOT a placement function — it does not determine WHERE agents run, WHICH model serves WHICH request, or HOW cost/compliance/latency are arbitrated.
AI-driven support assistant for VCF operations. Diagnoses and resolves infrastructure issues by consulting Broadcom’s knowledge base. Supports on-premises and cloud-hosted language models. This is an agentic infrastructure management tool, not a general-purpose agent orchestration layer.
Private AI Model and GPU Metrics in VCF Operations: utilization, memory pressure, model-level visibility. Customizable dashboards for AI model and agent performance. These metrics could FEED a Layer 2C placement engine — but no such engine exists to consume them.
NVIDIA provides no agent governance, policy-driven placement, or reasoning plane components in the VMware stack. Same gap as Dell — NVIDIA’s AI-Q is workflow scaffolding, OpenShell is constraint enforcement, Dynamo is performance routing. None is Layer 2C.
Applying the ‘Routing Is Not Reasoning’ test: MCP Server Governance = access control. Intelligent Assist = IT operations automation. GPU/Model Metrics = observability telemetry. None provides policy-driven decisions about where compute runs relative to data, which model serves which request, and how cost/compliance/latency are arbitrated in real time. VMware’s Layer 2C position is comparable to Dell’s: not yet evident as a productized capability. The signals (MCP governance, metrics observability, Intelligent Assist) suggest the building blocks exist but have not been composed into a reasoning plane.
Compare to other vendors’ Layer 2C status:
• Dell: Absent. Dell+Intel ‘actively addressing’ but no product announced.
• HPE: Delegated to Kamiwaza (multi-layer orchestration partner via Unleash AI). GreenLake Intelligence provides IT ops 2C.
• VAST: PolicyEngine + Polaris, the most aggressive middle-out Layer 2C build.
• Google: Agent Identity + Gateway + Registry + Orchestration + Observability, the most complete productized 2C.
• AWS: AgentCore with implicit 2C through managed service placement decisions.
VMware’s unique 2C opportunity: VCF manages the entire infrastructure estate. If Broadcom builds a Layer 2C that queries vSAN governance metadata, GPU utilization telemetry, vDefend security posture, and Model Runtime performance metrics to make autonomous placement decisions, it would have the broadest infrastructure visibility of any on-prem 2C, because VCF sees everything from the hypervisor up. The data to build 2C exists in VCF Operations. The governance primitives exist in MCP Server Governance and ACC. The placement engine does not.
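To show what kind of decision the missing engine would make, here is a deliberately minimal placement sketch. It is not a VMware product or API: `Site`, `Request`, `place`, the field names, and the 0.85 utilization ceiling are all hypothetical, and the telemetry values are made up. It only illustrates the shape of the logic: compliance, latency, and GPU headroom act as hard constraints, and cost arbitrates among the sites that remain.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Site:
    name: str
    region: str              # compliance boundary the site sits in
    gpu_utilization: float   # 0.0 to 1.0, as observability telemetry might report
    latency_ms: float        # expected latency to the requester
    cost_per_hour: float     # relative cost of serving from this site


@dataclass(frozen=True)
class Request:
    allowed_regions: frozenset[str]  # where the data is permitted to be processed
    max_latency_ms: float


def place(request: Request, sites: list[Site],
          max_utilization: float = 0.85) -> Optional[Site]:
    """Pick the cheapest site that satisfies every hard constraint."""
    eligible = [
        s for s in sites
        if s.region in request.allowed_regions       # compliance
        and s.latency_ms <= request.max_latency_ms   # latency
        and s.gpu_utilization < max_utilization      # GPU headroom
    ]
    if not eligible:
        return None  # no compliant placement exists; the caller must escalate
    # Cost is the primary objective; latency breaks ties.
    return min(eligible, key=lambda s: (s.cost_per_hour, s.latency_ms))


sites = [
    Site("dc-frankfurt", "eu", 0.60, 40.0, 3.2),
    Site("dc-virginia", "us", 0.30, 25.0, 2.1),   # cheapest, but wrong region
    Site("dc-paris", "eu", 0.90, 35.0, 2.8),      # right region, no headroom
]
choice = place(Request(frozenset({"eu"}), max_latency_ms=50.0), sites)
print(choice.name if choice else "no placement")  # prints: dc-frankfurt
```

Every input here corresponds to a signal the assessment says VCF already collects; what is absent is the component that combines them into a decision.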
Inverted: there IS no judgment to borrow because no Layer 2C exists. Same structural position as Dell. The enterprise has three options: build custom 2C logic, bring in a partner (Kamiwaza, potentially), or operate without a reasoning plane. Most will choose the third. The MCP Server Governance capability is an interesting partial answer: it provides governance over agent-tool interactions without providing placement intelligence. This is access-control-as-governance: necessary but not sufficient for a reasoning plane.
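For contrast, access-control-as-governance reduces to an allow-list lookup. The sketch below is a hypothetical illustration, not the schema or API of VCF’s MCP Server Governance: `ToolPolicy`, the group names, and the tool names are invented. Note what the check does not consider: where the agent runs, which model serves it, or what the call costs.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToolPolicy:
    """Maps a user group to the MCP tool names it may call (hypothetical schema)."""

    allowed: dict[str, set[str]] = field(default_factory=dict)

    def is_allowed(self, group: str, tool: str) -> bool:
        # Default deny: unknown groups and unlisted tools are both rejected.
        return tool in self.allowed.get(group, set())


policy = ToolPolicy(allowed={
    "finance-analysts": {"ledger-query"},
    "platform-ops": {"ledger-query", "cluster-restart"},
})

print(policy.is_allowed("finance-analysts", "ledger-query"))     # prints: True
print(policy.is_allowed("finance-analysts", "cluster-restart"))  # prints: False
```

The answer is a yes or no about WHICH group may use WHICH tool; nothing in it arbitrates placement, which is why it is necessary for a reasoning plane but not one itself.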
VMware has a structural advantage in building Layer 2C that no other on-prem vendor possesses: VCF is the control plane for the enterprise’s entire virtualized estate. Dell manages Dell hardware. HPE manages HPE hardware. VAST manages VAST storage. VMware manages EVERYTHING virtualized — across Dell, HPE, Lenovo, Cisco, and any other OEM’s hardware. A VMware Layer 2C would be the first multi-vendor infrastructure reasoning plane — making placement decisions across heterogeneous hardware from a single governance surface. No other vendor can build this because no other vendor has the cross-vendor infrastructure visibility. Whether Broadcom invests in this opportunity is an open question. The Broadcom acquisition thesis prioritizes cash generation from the installed base, not R&D investment in new platform capabilities. Layer 2C is a significant engineering investment. The $30B annual infrastructure software segment gives Broadcom the resources; the question is whether the strategic priority exists. Hock Tan’s framing of VCF as ‘the permanent abstraction layer between AI software and physical chips’ is a Layer 2A statement, not a Layer 2C statement. The abstraction layer manages resources. The reasoning plane governs them. VMware has the former; it does not have the latter.
AI-powered business capabilities — business logic, workflow automation
Model Runtime + Agent Builder + Data Indexing/Retrieval + Vector Database + Model Store + GPU Monitoring — all included in VCF subscription at no additional cost. This is not a Layer 3 application stack — it is an integrated set of AI platform services that enables Layer 3 development. The distinction matters: VMware provides the tools to BUILD AI applications, not the applications themselves.
Pre-built AI application patterns deployable through VCF Automation catalog. Multimodal PDF Extraction, Digital Twins, RAG pipelines. Same blueprints available on Dell, HPE, Cisco, Lenovo — non-differentiating for VMware.
Developers self-publish AI agents and MCP servers to the enterprise via Tanzu. IT maintains governance and oversight. Tanzu Marketplace provides curated path to certified middleware, data services, and AI tooling. This is an enterprise app store model for AI capabilities.
VMware Private AI Foundation validated on Dell, HPE, Lenovo, Cisco, Supermicro, NEC, Fujitsu hardware. ISV ecosystem spans the entire VMware partner network — thousands of validated applications across every industry. AI-specific ISV validation is emerging but not yet at the curation depth of HPE’s Unleash AI (26+ selected ISV partners) or Dell’s AI Ecosystem Program (OpenAI, Palantir, Google, ServiceNow).
Open-source AI user interface integrated with VCF Private AI Services RAG. Provides a ChatGPT-like interface for enterprise users to interact with privately-hosted models. Demonstrates the ‘platform enables applications’ model.
Nemotron models, community models, NVIDIA NIM containers available through Model Store. NVIDIA provides the model layer; VMware provides the serving and governance layer.
VMware’s Layer 3 is structurally different from every other assessed vendor because VMware is explicitly a platform, not an application provider. VMware provides the tools to build and deploy AI applications (Private AI Services) but does not build the applications themselves. This is the correct architectural position for an infrastructure platform vendor, and it’s the same position Dell occupies (Dell doesn’t build AI applications; it partners with OpenAI, Palantir, ServiceNow). The difference is ecosystem depth:
• Dell’s AI ecosystem: OpenAI, Palantir, Google, ServiceNow, SpaceXAI, Hugging Face, 5,000+ deployment customers. Explicitly curated for AI.
• HPE’s Unleash AI: 26+ selected ISV partners with validated interoperability. Kamiwaza orchestration. CrewAI pre-installed. Purpose-built for AI.
• VAST’s Cosmos Community: CoreWeave, TwelveLabs, CrowdStrike with distinct partner tracks. Focused and vertical.
• VMware’s AI ecosystem: Inherits the broader VMware partner ecosystem (thousands of ISVs) but without AI-specific curation depth. Private AI Foundation validation is available on major OEM hardware, but AI-specific ISV partnerships are not yet at the maturity of Dell or HPE programs.
The Tanzu-mediated MCP server publishing could evolve into VMware’s distinctive Layer 3 model: instead of curating an external ISV ecosystem (HPE’s approach) or partnering with AI application vendors (Dell’s approach), VMware enables the enterprise’s own developers to build and distribute AI agents internally. This is an internally-generated Layer 3 rather than an externally-sourced one.
The VCF installed base is the Layer 3 enabler: 100M+ cores means Private AI Services reach more enterprise infrastructure than any competitor’s AI platform. The AI applications built on VMware will be built by the enterprise’s own developers, using VMware’s tools, on VMware’s platform.
Whether that bottom-up, developer-driven approach generates Layer 3 applications as quickly as Dell’s top-down partnerships (OpenAI, Palantir) or HPE’s curated ecosystem (Unleash AI) is the open question.
Distributed across the enterprise’s own development teams and chosen partners. VMware provides the platform; the enterprise provides the application logic. This is the most explicit Retained model for Layer 3 in this assessment — the enterprise builds its own AI applications rather than consuming a vendor’s or partner’s. The trade-off: maximum control (Retained), maximum effort (the enterprise must build everything above the platform services layer). Dell and HPE offer Delegated shortcuts (partner applications). VMware offers Retained responsibility.
The Private AI Foundation at no additional cost for VCF subscribers is a strategic masterstroke for customer retention: every VCF customer already has access to Model Runtime, Agent Builder, Vector Database, Data Indexing/Retrieval, and Model Store. The marginal cost of trying Private AI is zero (beyond GPU hardware). This is the lowest-barrier entry to on-prem AI of any vendor assessed. With nine of the top ten Fortune 500 companies committed to VCF, Private AI Foundation has the largest potential enterprise deployment footprint of any on-prem AI platform. Whether that potential converts to actual AI workload deployment depends on whether enterprises find Private AI Services sufficient for production AI or whether they choose purpose-built alternatives (Dell AI Factory, HPE Private Cloud AI, VAST AI OS) for deeper capabilities. The competitive dynamic is unusual: VMware doesn’t compete with Dell or HPE at Layer 0 (VMware runs ON their hardware). VMware competes with them at Layers 1-3 (management, orchestration, AI services). An enterprise could run Dell hardware + VMware VCF + VMware Private AI Services, getting Dell’s Layer 0 with VMware’s Layers 2A/2B, or Dell hardware + Dell AI Factory, getting Dell’s Layer 0 with Dell/NVIDIA’s Layers 2A/2B. The choice is between VMware’s operational maturity and Dell/HPE’s AI-specific depth.
VMware Private AI Foundation with NVIDIA occupies a structurally unique position in this assessment series: it is neither an infrastructure OEM (Dell, HPE), a hyperscaler (AWS, Google Cloud), nor a data platform vendor (VAST). It is a virtualization and private cloud platform — the abstraction layer that sits between physical infrastructure and workloads. Broadcom’s strategic thesis is that VCF is ‘the permanent abstraction layer between AI software and physical chips,’ and the Private AI Foundation extends that thesis into AI workloads specifically.
The 4+1 model reveals both the power and the limits of this position. VCF’s strength is Layer 2A — infrastructure orchestration is VMware’s heritage and its deepest IP. VCF Automation, vSphere Supervisor, VKS, vSAN, NSX/vDefend, and VCF Operations collectively provide the most mature unified orchestration surface for mixed workloads (VMs, containers, AI) of any on-prem vendor assessed. No other vendor in this series manages GPU-accelerated AI workloads, Kubernetes clusters, and traditional VMs from a single control plane with equivalent operational maturity.
At Layer 0, VMware’s multi-accelerator management (AMD, NVIDIA, Intel) requires careful contextualization. GPU vendor choice is not unique to VMware — HPE’s GX5000 supports NVIDIA and AMD blades in the same rack, and hyperscalers fully abstract accelerators at the service layer (a developer calling Vertex AI or Bedrock never sees which silicon powers the response). VMware’s actual differentiator is the level of architectural control: operators manage GPU placement, isolation, and scheduling through familiar vSphere primitives (vGPU profiles, vmclasses, DRS, resource pools). The control plane is borrowed judgment — it is VMware’s opinionated virtualization model applied to acceleration — but that opinion provides stronger knobs that appeal to operators already comfortable with virtualization management. Where hyperscalers abstract the accelerator away from the architect, VMware puts the architect in the driver’s seat through a familiar console.
But the closer the stack gets to AI-specific functions — model serving, retrieval, agent execution, governance — the more authority shifts to NVIDIA (Layer 2B runtime via NVIDIA AI Enterprise), to open-source components (pgvector, Elasticsearch), or to capabilities that are emerging but not yet at the depth of purpose-built alternatives. Private AI Services (Model Runtime, Agent Builder, Data Indexing/Retrieval, Vector Database, Model Store) are genuine platform capabilities delivered as part of the VCF subscription, but they are foundational AI services, not the deep data lifecycle or agent orchestration that Dell (Dataloop), HPE (Ezmeral/Kamiwaza), or VAST (DataEngine/AgentEngine) provide. Layer 1C (data pipelines) is absent. Layer 2C (reasoning plane) has building blocks — MCP Server Governance, GPU/Model Metrics, Intelligent Assist — but none passes the ‘Routing Is Not Reasoning’ test.
The installed base is the strategic moat: nine of the top ten Fortune 500 companies have committed to VCF, with 100M+ cores licensed worldwide. For the enormous VMware installed base, Private AI Foundation is the lowest-friction path to on-prem AI — no new infrastructure vendor, no new management plane, no new operational model, and no incremental cost beyond GPU hardware. The 4+1 question is whether lowest-friction adoption translates to sufficient architectural depth when agentic AI workloads demand governance, policy-driven placement, and cross-agent orchestration that VCF does not yet provide.
VMware Private AI Foundation is the enterprise’s most natural on-ramp to private AI. Hock Tan’s ‘permanent abstraction layer’ framing is a Layer 2A statement, not a Layer 2C statement — the abstraction layer manages resources; the reasoning plane governs them. VMware has the former; it does not have the latter. Whether it becomes the enterprise’s durable AI platform depends on whether Broadcom invests in the Layer 1A governance depth, Layer 1C pipeline capability, and Layer 2C reasoning plane that the 4+1 model identifies as structurally necessary — or whether the Broadcom acquisition thesis (cash generation from the installed base) constrains that investment. VMware’s unique structural advantage is that VCF sees everything from the hypervisor up across all OEM hardware — the data to build a multi-vendor reasoning plane exists. The engineering commitment does not yet.