Repost: How is the Memory Crisis Reshaping the AI and Server Worlds? 🧠💻
The latest episode of The Byrne-Wheeler Report is live, and we're digging into the memory pricing and availability crisis (funny how those two go together) currently hitting the tech industry. From $135B in quarterly hyperscaler spending to innovative startups trying to bypass DRAM altogether, this is an episode you can't afford to miss if you're tracking the future of hardware.
Inside the episode:
🎙️ Hosts: Joe Byrne & Bob Wheeler
💡 Special Guests:
Gary Smerdon, CEO of MEXT
Jim Handy, Principal Analyst at Objective Analysis
Key Episode 12 segments:
The main event: A game changer for server costs. Gary Smerdon, CEO of MEXT, joins us to explain how MEXT uses AI-driven software to transparently swap cold memory pages to flash, effectively doubling system memory at a fraction of the cost.
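For the curious, the core idea behind that kind of tiering can be sketched in a few lines. This is our own toy illustration of cold-page demotion, not MEXT's actual implementation (which the episode describes as AI-driven): hot pages live in DRAM, and the least-recently-used pages get demoted to a flash tier when DRAM fills up.

```python
# Toy sketch of transparent memory tiering (our illustration, NOT
# MEXT's implementation): track page recency and demote the coldest
# pages from DRAM to a flash tier when DRAM is over capacity.

class TieredMemory:
    def __init__(self, dram_pages: int):
        self.dram_pages = dram_pages  # DRAM capacity, in pages
        self.dram = {}    # page_id -> (data, last_access_tick)
        self.flash = {}   # page_id -> data (larger, slower tier)
        self._tick = 0    # logical clock for recency tracking

    def access(self, page_id, data=None):
        """Touch a page, faulting it in from flash if needed."""
        self._tick += 1
        if page_id in self.flash:      # "page fault": promote to DRAM
            data = self.flash.pop(page_id)
        elif page_id in self.dram:
            data = self.dram[page_id][0]
        self.dram[page_id] = (data, self._tick)
        # Demote least-recently-used pages while DRAM is over capacity.
        while len(self.dram) > self.dram_pages:
            coldest = min(self.dram, key=lambda p: self.dram[p][1])
            self.flash[coldest] = self.dram.pop(coldest)[0]
        return data
```

The application never sees the tier boundary; it just accesses pages, which is the "transparent" part of the pitch.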
Special guest star: Memory analyst Jim Handy joins the show to share his insights. Why have DRAM prices quadrupled since September? Jim breaks down the trade ratio between HBM and DDR and explains why the shortage might last another two years.
Intro Chatter
The Quantization Revolution: We discuss Prism ML (out of Caltech) and Google’s TurboQuant. Are we moving toward a world of "skinnier weights" where 1-bit precision allows frontier models to run on your MacBook?
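For a sense of scale, here's a quick back-of-envelope sketch (ours, not from the episode) of why "skinnier weights" matter. The 70B parameter count is a hypothetical stand-in for a frontier-class model; the math is just params × bits ÷ 8.

```python
# Back-of-envelope: memory needed to hold a model's weights at
# different precisions. Bytes = params * bits / 8.
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Return weight memory in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

params = 70e9  # a hypothetical 70B-parameter model
for bits in (16, 8, 4, 1):
    print(f"{bits:>2}-bit: {weight_memory_gb(params, bits):,.2f} GB")
# At 16-bit that's 140 GB of weights; at 1-bit it's 8.75 GB --
# which is why 1-bit precision puts laptop-scale inference in play.
```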
The KV Cache Bottleneck: While quantization shrinks model weights, the memory pressure is shifting to the KV cache, especially for Mixture-of-Experts (MoE) architectures. This is what the 2025 TurboQuant paper addresses: the technique trades compute cycles for KV-cache memory. (Not discussed: recent TurboQuant adaptations that are more computationally efficient, thus impacting token-generation rates.)
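To see why the KV cache becomes the bottleneck at long context lengths, here's a standard sizing formula (our illustration; the example model shape is hypothetical, not taken from the episode or the paper): the cache stores a key and a value vector per layer, per KV head, per token.

```python
# KV cache size = 2 (K and V) * layers * kv_heads * head_dim
#                 * seq_len * batch * bytes_per_element
def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                seq_len: int, batch: int, bytes_per_elem: int) -> float:
    """Return KV cache size in gigabytes (1 GB = 1e9 bytes)."""
    return (2 * n_layers * n_kv_heads * head_dim
            * seq_len * batch * bytes_per_elem) / 1e9

# Hypothetical 80-layer model with 8 KV heads of dim 128, one
# 128k-token sequence, fp16 (2 bytes) cache entries:
print(f"{kv_cache_gb(80, 8, 128, 128_000, 1, 2):.1f} GB")  # ~41.9 GB
```

Roughly 42 GB of cache for a single long sequence, independent of the weights: that's the pressure quantizing the KV cache (at some compute cost) is meant to relieve.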
Other content