NVIDIA hand-delivers first Vera CPUs to Anthropic, OpenAI, SpaceX, Oracle — first standalone NVIDIA CPU lands at frontier labs

TL;DR: In late May 2026, NVIDIA’s Ian Buck hand-delivered the first Vera CPU systems to four partners: Anthropic (San Francisco), OpenAI (Mission Bay), SpaceX (Palo Alto), and Oracle Cloud Infrastructure (Santa Clara). Vera is NVIDIA’s first standalone data-center CPU, a direct competitor to Intel Xeon and AMD EPYC that is purpose-built for agentic AI workloads. Production status: in full production since March 2026, with first deliveries in May 2026. Why this matters: NVIDIA is moving from “GPU-only supplier” to “full-stack compute partner,” owning the CPU, the GPU, the networking (Spectrum-X, NVLink), and now the coherent CPU-GPU memory architecture that agentic workloads need. For each customer, Vera adds another layer to a multi-vendor compute supply chain; Anthropic now sits on NVIDIA GPUs + AWS Trainium + Google TPU + Microsoft Maia 200 (early talks) + Vera. The frontier-lab compute supply chain is substantially more complex, and substantially less dependent on NVIDIA GPUs alone, than it was twelve months ago.

What was delivered

The reporting from Bloomberg, NVIDIA’s official blog, wccftech, TweakTown, TipRanks, and TNW confirms:

Product: NVIDIA Vera — first NVIDIA standalone data-center CPU
Architecture: ARM-based, purpose-built for agentic AI workloads
Production status: full production since March 2026 (per Bloomberg / TipRanks)
First customers: Anthropic, OpenAI, SpaceX, Oracle Cloud Infrastructure
Delivery: hand-delivered by Ian Buck (NVIDIA’s GM of Hyperscale and HPC)
Sites: Anthropic San Francisco, OpenAI Mission Bay, SpaceX Palo Alto (delivered on a Friday in late May 2026), Oracle Cloud Infrastructure Santa Clara (delivered the following Monday)
Successor to: NVIDIA Grace (the previous-generation CPU released alongside Hopper GPUs)
Competitive position: directly competes with Intel Xeon and AMD EPYC

Why Vera is structurally important

Three reads matter.

1. NVIDIA is becoming a full-stack compute supplier. For most of the AI compute era, NVIDIA sold GPUs while customers paired them with Intel Xeon or AMD EPYC CPUs. Vera changes that. NVIDIA now sells:

GPUs (H100 → B100 → B200 → Rubin)
CPUs (Vera, the successor to Grace)
Networking (Spectrum-X Ethernet, NVLink, InfiniBand)
Coherent memory architecture (NVLink-C2C links Vera CPUs directly to NVIDIA GPUs with cache-coherent memory access)
Reference platforms (DGX systems, MGX server designs)
Software (CUDA, NIM microservices, AI Enterprise)

This is the same vertical-integration pattern Apple ran with Apple Silicon — NVIDIA designing every layer of the compute stack for AI-specific workloads, leaving Intel and AMD with the workloads where the integration premium doesn’t apply.

2. Agentic AI workloads have distinct compute requirements. Traditional ML training is GPU-dominated, CPU-light. Agentic AI is different — agents spend significant compute on orchestration logic, tool calls, planning, and context management that runs on CPU rather than GPU. As Dynamic Workflows, Project Polaris, and similar parallel-subagent architectures become the dominant Claude Code / Copilot deployment pattern through H2 2026, CPU optimization for agentic workloads matters more. Vera is NVIDIA’s bet that those workloads should run on NVIDIA-designed silicon, not Intel or AMD.

3. The compute supply chain story for frontier labs keeps getting more complex. For each customer:

Lab	Compute supply chain (May-June 2026)
Anthropic	NVIDIA + AWS Trainium + Google TPU + Microsoft Maia 200 (early talks) + Vera
OpenAI	NVIDIA + Microsoft Maia 200 + Vera + (rumored own-silicon roadmap)
SpaceX	NVIDIA + Vera (Colossus 1 + Colossus 2 with sites added through 2026)
Oracle Cloud	NVIDIA + Vera (commercial cloud customers)

Compare that with twelve months ago, when “AI compute supply chain” meant “which NVIDIA GPUs can you get, and when.” The 2026 picture is multi-vendor across CPU, GPU, and custom AI accelerator (TPU / Trainium / Maia), and increasingly multi-cloud (AWS + Google + Microsoft + Oracle). For Claude, ChatGPT, and other API consumers, the practical implication is more reliable serving (through multi-vendor diversification) and continued downward pressure on per-token costs (through cross-vendor pricing competition).
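To make the second read above more concrete, here is a minimal Python sketch of a parallel-subagent loop. It is an illustration only, not how Claude Code, Copilot, or any NVIDIA software is implemented: call_model is a hypothetical stand-in for an accelerator-backed inference call, and the "echo" tool is invented for the example. The point is that everything outside call_model (fan-out, parsing the tool request, running the tool) is ordinary CPU work.

```python
import asyncio
import json


async def call_model(prompt: str) -> str:
    """Hypothetical stand-in for a GPU-backed inference call (no real API)."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    # Pretend the model answered with a structured tool request.
    return json.dumps({"tool": "echo", "args": {"text": prompt.upper()}})


def run_tool(request: str) -> str:
    """CPU-side work: parse the model's tool request and execute it."""
    parsed = json.loads(request)
    if parsed["tool"] != "echo":
        raise ValueError(f"unknown tool: {parsed['tool']}")
    return parsed["args"]["text"]


async def subagent(task: str) -> str:
    request = await call_model(task)  # accelerator-bound step (stubbed here)
    return run_tool(request)          # CPU-bound step


async def orchestrate(tasks: list[str]) -> list[str]:
    # Fan out one subagent per task; gather returns results in input order.
    return await asyncio.gather(*(subagent(t) for t in tasks))
```

Calling asyncio.run(orchestrate(["plan", "search"])) runs both subagents concurrently. In a real deployment the tool step would be far heavier (file I/O, sandboxed code execution, context assembly), which is the CPU load the article is describing.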

What it means for the four customers

For Anthropic: Vera is the fifth distinct compute-source layer in the Anthropic supply chain (alongside NVIDIA GPUs, AWS Trainium, Google TPU via the $40B investment, and the early-stage Microsoft Maia 200 talks). For Anthropic’s drive toward sustainable operating profit and the October 2026 IPO trajectory, having a multi-vendor compute portfolio is procurement leverage, not just resilience.

For OpenAI: Vera is the bridge between NVIDIA’s current GPU stack and OpenAI’s reported in-house silicon ambitions. Running Vera alongside Microsoft Maia 200 (which is also serving OpenAI’s GPT-5.2 in production) creates a CPU-GPU compute portfolio that doesn’t depend on Intel or AMD for any layer.

For SpaceX: Vera adds to the Colossus 1 and forthcoming Colossus 2 data-center buildouts. The Anthropic-SpaceX 300+ MW capacity deal reaches a milestone with Vera deployment — SpaceX’s data centers can now offer NVIDIA-full-stack capacity to Anthropic and other AI tenants.

For Oracle Cloud: Vera delivery positions OCI as a serious AI compute hyperscaler. Combined with Oracle’s Stargate partnership work, this is OCI’s move from “AI compute also-ran” to “credible AI compute supplier.”

What it means for Claude and ChatGPT users

Practically: nothing changes operationally. Vera-powered serving is invisible to end users; your Claude API call lands on whatever hardware the provider routes it to. Anthropic and OpenAI just gained another lever in that routing decision.

Structurally: every additional compute supplier in the stack improves serving reliability (a lower probability of a region-wide outage) and adds to the downward pressure on per-token economics. With Claude Code at $2.5B+ ARR and OpenAI on a confidential S-1 filing, every cent of unit-level compute cost reduction translates into material P&L improvement.
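The reliability claim can be put in rough numbers with a back-of-the-envelope sketch. It rests on two strong assumptions that the reporting does not establish: supplier outages are statistically independent, and any single remaining supplier could carry the full serving load. The function name and the probabilities in the usage note are hypothetical.

```python
import math


def prob_all_suppliers_down(outage_probs):
    """Probability that every supplier is down at once.

    Assumes outages are independent, which real supply chains
    (shared power, shared networks, shared software) often violate.
    """
    if not outage_probs:
        raise ValueError("need at least one supplier")
    for p in outage_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a probability: {p}")
    return math.prod(outage_probs)
```

With an illustrative 1% outage probability per supplier, prob_all_suppliers_down([0.01]) is 0.01, while adding a second independent supplier drops the joint figure to roughly 0.0001. Correlated failures would make the real improvement smaller.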

The honest caveats

Three caveats:

Vera benchmarks vs Xeon / EPYC aren’t yet independently verified at production scale. NVIDIA’s claims about Vera CPU performance for agentic workloads come from internal testing. Independent third-party benchmarking will surface over the next 30-90 days as customers deploy Vera in production.

“First delivery” is symbolic, not volume shipping. Hand-delivered first systems are press-event scale. Full deployment at Anthropic / OpenAI / SpaceX / Oracle scale will take quarters, not weeks. The structural impact on per-token costs is a late-2026-into-2027 story, not a Q3 2026 one.

ARM-based CPUs require software-stack work. Vera being ARM-based (vs Intel/AMD’s x86) means some workloads need recompilation. Most modern AI infrastructure (Linux + Docker + Python + CUDA) supports ARM cleanly, but edge cases (proprietary x86 tools, legacy enterprise software) will require migration effort.
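For teams sizing that migration effort, a first pass is simply finding which Linux binaries were built for x86-64. The sketch below is a generic ARM-readiness check, not anything Vera-specific: it reads the e_machine field from an ELF header (values 62 for x86-64 and 183 for AArch64 come from the ELF specification) and checks the host architecture string. The helper names are made up for this example.

```python
import platform
import struct

# e_machine values defined by the ELF specification.
EM_X86_64 = 62
EM_AARCH64 = 183


def is_arm64(machine=None):
    """True if the given (or current) machine string denotes 64-bit ARM."""
    name = (machine or platform.machine()).lower()
    return name in {"aarch64", "arm64"}


def elf_machine(header: bytes):
    """Return e_machine from the first 20 bytes of an ELF file, or None if not ELF."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None
    # Byte 5 (EI_DATA) gives endianness: 1 = little-endian, 2 = big-endian.
    endian = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18, after e_ident (16) and e_type (2).
    return struct.unpack_from(endian + "H", header, 18)[0]


def needs_rebuild_for_arm(header: bytes) -> bool:
    """True if the binary targets x86-64 and so cannot run natively on ARM."""
    return elf_machine(header) == EM_X86_64
```

In practice you would read the first 20 bytes of each executable or shared library in an image (for example with open(path, "rb").read(20)) and flag the ones where needs_rebuild_for_arm returns True. Pure-Python code is unaffected; compiled extensions and vendored binaries are where the work tends to be.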

What it changes for Pick Right readers tomorrow

If you’re a Claude Pro subscriber, nothing changes immediately. If you’re a Claude Code heavy user, expect continued capacity expansion and stable-to-improving rate limits through H2 2026 as Vera-powered serving scales.

For broader context, see the Claude review, the ChatGPT review, the Anthropic-Microsoft Maia 200 chip talks, the SpaceX Colossus capacity unlock, the Google + Broadcom multi-gigawatt TPU partnership, and the Anthropic S-1 filing for the broader compute-supply-chain and IPO-pipeline picture.
