NVIDIA hand-delivers first Vera CPUs to Anthropic, OpenAI, SpaceX, Oracle — first standalone NVIDIA CPU lands at frontier labs
TL;DR: NVIDIA’s Ian Buck hand-delivered the first Vera CPU systems in late May 2026 to four frontier-lab partners: Anthropic (San Francisco), OpenAI (Mission Bay), SpaceX (Palo Alto), and Oracle Cloud Infrastructure (Santa Clara). Vera is NVIDIA’s first standalone data-center CPU — a direct competitor to Intel Xeon and AMD EPYC, purpose-built for agentic AI workloads. Production status: in full production since March 2026; first delivery in May 2026. Why this matters: NVIDIA is moving from “GPU-only supplier” to “full-stack compute partner” — owning the CPU, GPU, networking (Spectrum-X, NVLink), and now the CPU-GPU coherent memory architecture that agentic workloads need. For each customer, Vera adds another layer to a multi-vendor compute supply chain — Anthropic now sits on NVIDIA + AWS Trainium + Google TPU + Microsoft Maia 200 (early talks) + Vera. The frontier-lab compute supply chain is becoming substantially more complex — and substantially less NVIDIA-monopolistic — than it was twelve months ago.
What was delivered
The reporting from Bloomberg, NVIDIA’s official blog, wccftech, TweakTown, TipRanks, and TNW confirms:
- Product: NVIDIA Vera — first NVIDIA standalone data-center CPU
- Architecture: ARM-based, purpose-built for agentic AI workloads
- Production status: full production since March 2026 (per Bloomberg / TipRanks)
- First customers: Anthropic, OpenAI, SpaceX, Oracle Cloud Infrastructure
- Delivery: hand-delivered by Ian Buck (NVIDIA’s GM of Hyperscale and HPC)
- Sites: Anthropic San Francisco, OpenAI Mission Bay, SpaceX Palo Alto (delivered Friday late May 2026), Oracle Cloud Infrastructure Santa Clara (delivered the following Monday)
- Successor to: NVIDIA Grace (the previous-generation CPU released alongside Hopper GPUs)
- Competitive position: directly competes with Intel Xeon and AMD EPYC
Why Vera is structurally important
Three reads matter.
1. NVIDIA is becoming a full-stack compute supplier. For most of the AI compute era, NVIDIA sold GPUs while customers paired them with Intel Xeon or AMD EPYC CPUs. Vera changes that. NVIDIA now sells:
- GPUs (H100 → B100 → B200 → Rubin)
- CPUs (Vera, the successor to Grace)
- Networking (Spectrum-X Ethernet, NVLink, InfiniBand)
- Coherent memory architecture (NVLink C2C ties Vera CPUs directly to NVIDIA GPUs at memory-coherent latency)
- Reference platforms (DGX systems, MGX server designs)
- Software (CUDA, NIM microservices, AI Enterprise)
This is the same vertical-integration pattern Apple ran with Apple Silicon — NVIDIA designing every layer of the compute stack for AI-specific workloads, leaving Intel and AMD with the workloads where the integration premium doesn’t apply.
2. Agentic AI workloads have distinct compute requirements. Traditional ML training is GPU-dominated, CPU-light. Agentic AI is different — agents spend significant compute on orchestration logic, tool calls, planning, and context management that runs on CPU rather than GPU. As Dynamic Workflows, Project Polaris, and similar parallel-subagent architectures become the dominant Claude Code / Copilot deployment pattern through H2 2026, CPU optimization for agentic workloads matters more. Vera is NVIDIA’s bet that those workloads should run on NVIDIA-designed silicon, not Intel or AMD.
3. The compute supply chain story for frontier labs keeps getting more complex. For each customer:
| Lab | Compute supply chain (May-June 2026) |
|---|---|
| Anthropic | NVIDIA + AWS Trainium + Google TPU + Microsoft Maia 200 (early talks) + Vera |
| OpenAI | NVIDIA + Microsoft Maia 200 + Vera + (rumored own-silicon roadmap) |
| SpaceX | NVIDIA + Vera (Colossus 1 + Colossus 2 with sites added through 2026) |
| Oracle Cloud | NVIDIA + Vera (commercial cloud customers) |
Compare to twelve months ago when “AI compute supply chain” meant “what NVIDIA GPUs can you get and when.” The 2026 picture is multi-vendor across CPU, GPU, custom AI accelerator (TPU / Trainium / Maia), and increasingly multi-cloud (AWS + Google + Microsoft + Oracle). For Claude, ChatGPT, and other API consumers, the practical implication is more reliable serving (multi-vendor diversification) and continued downward pressure on per-token costs (cross-vendor pricing competition).
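To put rough numbers on the second read above (CPU-side orchestration mattering more for agents), here is a minimal sketch using Amdahl's law. It assumes an agent step splits cleanly into a CPU-bound share and everything else, which real workloads only approximate. The function name `overall_speedup` and the example fractions are hypothetical and chosen for illustration; they are not NVIDIA figures or Vera benchmarks.

```python
def overall_speedup(cpu_fraction: float, cpu_speedup: float) -> float:
    """Amdahl's law: end-to-end speedup when only the CPU-bound share gets faster.

    cpu_fraction: share of wall-clock time spent in CPU-side work (0.0 to 1.0).
    cpu_speedup:  how many times faster that CPU-side work becomes (> 0).
    """
    if not 0.0 <= cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be between 0 and 1")
    if cpu_speedup <= 0:
        raise ValueError("cpu_speedup must be positive")
    return 1.0 / ((1.0 - cpu_fraction) + cpu_fraction / cpu_speedup)


if __name__ == "__main__":
    # Hypothetical CPU-bound shares: a GPU-heavy job vs. an orchestration-heavy agent.
    for label, fraction in [("GPU-heavy job", 0.05), ("agent loop", 0.50)]:
        gain = overall_speedup(fraction, cpu_speedup=2.0)
        print(f"{label}: {gain:.2f}x end-to-end from a 2x faster CPU")
```

The takeaway is directional only: the larger the CPU-bound share of a workload, the more a faster CPU shows up in end-to-end latency.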
What it means for the four customers
For Anthropic: Vera is the fifth distinct compute-source layer in the Anthropic supply chain (alongside NVIDIA GPUs, AWS Trainium, Google TPU via the $40B investment, and the early-stage Microsoft Maia 200 talks). For Anthropic’s drive toward sustainable operating profit and the October 2026 IPO trajectory, having a multi-vendor compute portfolio is procurement leverage, not just resilience.
For OpenAI: Vera is the bridge between NVIDIA’s current GPU stack and OpenAI’s reported in-house silicon ambitions. Running Vera alongside Microsoft Maia 200 (which is also serving OpenAI’s GPT-5.2 in production) creates a CPU-GPU compute portfolio that doesn’t depend on Intel or AMD for any layer.
For SpaceX: Vera adds to the Colossus 1 and forthcoming Colossus 2 data-center buildouts. The Anthropic-SpaceX 300+ MW capacity deal reaches a milestone with Vera deployment — SpaceX’s data centers can now offer NVIDIA-full-stack capacity to Anthropic and other AI tenants.
For Oracle Cloud: Vera delivery positions OCI as a serious AI compute hyperscaler. Combined with Oracle’s Stargate partnership work, this is OCI’s move from “AI compute also-ran” to “credible AI compute supplier.”
What it means for Claude and ChatGPT users
Practically: nothing changes operationally. Vera-powered serving is invisible to end users — your Claude API call lands on whatever hardware is closest and cheapest. Anthropic and OpenAI just gained another lever in the routing decision.
Structurally: every additional compute supplier in the stack improves serving reliability (lower probability of region-wide outage) and continues the downward pressure on per-token economics. For Claude Code at $2.5B+ ARR and ChatGPT on confidential S-1 filing, every cent of compute cost reduction at the unit level translates into material P&L improvement.
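The reliability argument can be sketched with simple arithmetic. The snippet below assumes supplier outages are statistically independent, which real deployments often are not (shared regions, power, and networks correlate failures), so treat it as an optimistic bound. The function name `all_down_probability` and the 1% outage figure are hypothetical placeholders, not measured data.

```python
from math import prod


def all_down_probability(outage_probs):
    """Chance that every supplier is unavailable at the same time.

    Assumes independent outages, so the joint probability is the
    product of the individual probabilities.
    """
    probs = list(outage_probs)
    if not probs:
        raise ValueError("need at least one supplier")
    for p in probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return prod(probs)


if __name__ == "__main__":
    # Hypothetical per-supplier outage probability; not a measured figure.
    p = 0.01
    for n in range(1, 6):
        print(f"{n} supplier(s): {all_down_probability([p] * n):.2e}")
```

Under these assumptions each added supplier multiplies the chance of a total outage by its own outage probability. Correlated failures would shrink that benefit.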
The honest caveats
Three caveats:
Vera benchmarks vs Xeon / EPYC aren’t yet independently verified at production scale. NVIDIA’s claims about Vera CPU performance for agentic workloads come from internal testing. Independent third-party benchmarking will surface over the next 30-90 days as customers deploy Vera in production.
“First delivery” is symbolic, not volume-shipping. Hand-delivered first systems are press-event-scale. Full deployment at Anthropic / OpenAI / SpaceX / Oracle scale will take quarters, not weeks. Any structural impact on per-token costs will play out over late 2026 and 2027, not within Q3 2026.
ARM-based CPUs require software-stack work. Vera being ARM-based (vs Intel/AMD’s x86) means some workloads need recompilation. Most modern AI infrastructure (Linux + Docker + Python + CUDA) supports ARM cleanly, but edge cases (proprietary x86 tools, legacy enterprise software) will require migration effort.
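A first step in that migration work is simply knowing which architecture a host reports. The sketch below uses Python's standard-library `platform.machine()` for that check. The helper name `classify_arch` and the grouping of architecture strings are assumptions for illustration, not an exhaustive list and not anything taken from NVIDIA documentation.

```python
import platform

# Machine strings commonly reported by platform.machine().
# This grouping is an illustrative assumption, not an exhaustive list.
ARM64_NAMES = {"aarch64", "arm64"}
X86_64_NAMES = {"x86_64", "amd64"}


def classify_arch(machine: str) -> str:
    """Map a raw machine string to 'arm64', 'x86_64', or 'other'."""
    name = machine.strip().lower()
    if name in ARM64_NAMES:
        return "arm64"
    if name in X86_64_NAMES:
        return "x86_64"
    return "other"


if __name__ == "__main__":
    arch = classify_arch(platform.machine())
    print(f"Detected architecture: {arch}")
    if arch == "arm64":
        # Native extensions and container images need arm64 builds here.
        print("Verify that compiled dependencies ship arm64 builds.")
```

A check like this can gate a deployment script so that x86-only binaries are flagged before they reach an ARM host.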
What it changes for Pick Right readers tomorrow
If you’re a Claude Pro subscriber, nothing changes immediately. If you’re a Claude Code heavy user, expect continued capacity expansion and stable-to-improving rate limits through H2 2026 as Vera-powered serving scales.
For broader context, see the Claude review, the ChatGPT review, the Anthropic-Microsoft Maia 200 chip talks, the SpaceX Colossus capacity unlock, the Google + Broadcom multi-gigawatt TPU partnership, and the Anthropic S-1 filing for the broader compute-supply-chain and IPO-pipeline picture.
Sources
- Nvidia Says Anthropic, OpenAI Among Users of New Vera Chip (Bloomberg)
- Vera Arrives: NVIDIA's First CPU Built for Agents Lands at Top AI Labs (NVIDIA Blog)
- NVIDIA Hand-Delivers First Vera CPUs to Anthropic, OpenAI, SpaceX and Oracle (wccftech)
- NVIDIA names Anthropic and OpenAI among first users of its Vera chip (TNW)