JetStream Archives

The power of the middle lane: why a hybridized approach to memory gives the best of both worlds

January 27, 2026

Conversations around fast inference typically focus on one approach: blazing-fast token generation with gargantuan models. Both still play an important part in many circles. Models come in huge flavors, including Qwen (235B)…

The power of the middle lane: why a hybridized approach to memory gives the best of both worlds

January 26, 2026

Conversations around fast inference typically focus on one approach: blazing-fast token generation with gargantuan models. Both still play an important part in many circles. Models come in huge flavors, including Qwen (235B)…

Using what’s on hand: spare data center space is an untapped gold mine

December 18, 2025

The first cloud computing revolution triggered a tsunami-grade buildout of data centers, both for emerging hyperscalers like Amazon Web Services and for companies re-orienting their systems around cloud-based operations. At the time, we were…

Why we decoupled execution to accelerate I/O

October 14, 2025

d-Matrix’s advantage over classic GPU architecture is extremely low-latency, ultra-efficient small-batch inference. We approach each of those problems with a wide range of techniques and must excel at…

Scaling AI the Right Way: Introducing Our Rack-Level Inference Solution

October 14, 2025

In today’s rapidly evolving AI landscape, it’s clear that inference, not just training, is becoming the new scaling challenge. As models grow in size and capability, the infrastructure…

Open standards are the path to the next AI breakthrough

October 14, 2025

We are inevitably moving toward a world where AI applications need substantially better performance, and classic GPUs alone can’t keep up. And at the same time, customers need flexibility in…

Why we needed a new Transparent NIC solution

September 18, 2025

Large-scale AI workloads are deployed by stringing together many cards and many nodes. Any increase in model size, context length, or number of users affects the number of cards needed. Gone are…

d-Matrix Blog - Tag: JetStream
