s/CPX/LPX/g

It seems like only a few months ago that @NVIDIA showed off Rubin CPX, a system combining the now-in-production Rubin data-center GPU with an RTX-like GPU that employed GDDR DRAM instead of #HBM. Well, that's not the system you're looking for. [Waves like a Jedi.] Today, Jensen announced LPX. As I had speculated, it's basically CPX but with Groq LPUs (NPUs) in place of the RTX-like chips.

The LPU is the new LP30 design. A midlife kicker, LP35, is coming that supports NVFP4, Nvidia's FP4 data format. The GPU-LPU interconnect is a special low-latency version of Ethernet. I don't know whether that choice is only about faster time to market than adding NVLink (assuredly the case for LP30) or whether it is here to stay.

The target is premium-tier customers requiring fast token rates. Jensen mooted 1,000 tokens per second (TPS) per user. The challenge for GPUs has been the tradeoff between throughput and per-user token rate (essentially, latency). Batching is good for the former but bad for the latter. The LPU keeps throughput from collapsing as token rate increases. The LP30 has 500 MB of SRAM, which is great for latency but too little capacity for modern workloads' KV caches, hence the partitioning of LLM phases between the 288 GB GPU and the 500 MB LPU.

Speaking of DDR, Vera will talk to LPDDR instead of DDR, which should provide greater bandwidth at lower cost but without the DIMM-based expandability many applications require.
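To put the SRAM-capacity point in rough numbers, here is a back-of-the-envelope sketch using the standard KV-cache size formula (one key and one value vector per layer, per KV head, per token). The model shape (80 layers, 8 KV heads, head dimension 128, 16-bit values) is a hypothetical chosen purely for illustration, not a model named in the announcement. Only the 500 MB and 288 GB figures come from the post. The sketch also ignores the space the weights themselves need, so the real headroom would be smaller.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, tokens, bytes_per_value=2):
    """KV-cache size for one sequence: a key and a value vector
    per layer, per KV head, per token."""
    return 2 * layers * kv_heads * head_dim * tokens * bytes_per_value


# Hypothetical model shape (an assumption, not from the announcement).
LAYERS, KV_HEADS, HEAD_DIM = 80, 8, 128

# Capacities quoted in the post (decimal units assumed).
SRAM_BYTES = 500 * 10**6   # LPU on-chip SRAM
GPU_BYTES = 288 * 10**9    # GPU memory

per_token = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, tokens=1)

# Whole tokens of context that fit if the memory held nothing but KV cache.
tokens_in_sram = SRAM_BYTES // per_token
tokens_in_gpu = GPU_BYTES // per_token

print(f"KV cache per token: {per_token / 1024:.0f} KiB")
print(f"Tokens fitting in LPU SRAM: {tokens_in_sram:,}")
print(f"Tokens fitting in GPU memory: {tokens_in_gpu:,}")
```

Under these assumptions, a single LPU's SRAM holds on the order of a couple thousand tokens of context for one user, while the GPU's memory holds several hundred times more. That gap is consistent with keeping the KV-cache-heavy work on the GPU.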

Other contents

Physical Design Trends for AI Accelerators in 2026

BWR 12: The Memory Episode

Google TPUv8: Early Specs and Performance Gains

Hynix Infusion Helps Semidynamics Diversify into AI Chips

Codasip Pivots Away from RISC-V IP

to boldly go where no man has gone before, to seek out new fabs

Amazon Puts a Dollar Figure to Its AI Chip Business

Buttering No Parsnips: Google Says Nice Things About Intel's Chips

Repost: How is the Memory Crisis Reshaping the AI and Server Worlds? 🧠💻

Google TPU Could Sell as Well as Nvidia Rubin