d-Matrix Corsair AI Inference Platform Enters Full Production to Meet Customer Demand

Surge in demand follows published 10x inference speed-up using heterogeneous disaggregated compute that pairs d-Matrix Corsair accelerators with GPUs
With multi-year supply and fabrication services secured, products to begin shipping in volume to priority hyperscalers, neoclouds, and frontier labs

SANTA CLARA, Calif., June 9, 2026 — d-Matrix, the pioneer in low-latency AI inference for data centers, today announced its Corsair™ inference accelerator platform is in full production, with products to begin shipping in volume to priority customers.

Demand for Corsair has surged as agentic AI tools — led by the late-2025 breakouts of Claude Code and OpenClaw — are pushing inference workloads beyond what GPU-only infrastructure was designed to handle. To solve for this, customers have begun using novel disaggregation techniques that can speed up AI model response time by more than 10x when using heterogeneous computing clusters with a mix of GPUs, CPUs, and accelerators.[1]

With its supply and fabrication services secured, d-Matrix is ramping to meet commitments from high-profile hyperscalers, neoclouds, and frontier AI labs — all eager to deliver faster, more interactive AI experiences at scale.

“We built Corsair specifically for this moment, the Age of AI Inference,” said Sid Sheth, founder and CEO of d-Matrix. “The applications that matter most today — agentic AI, interactive coding, real-time voice agents — live or die on latency. Corsair takes off from where the GPU leaves off, and this summer our customers will be able to experience the turbocharge d-Matrix brings at full rack scale.”

The Heterogeneous Inference Era

d-Matrix’s Corsair platform is quickly emerging as the infrastructure of choice for heterogeneous AI compute deployments, where operators pair Corsair accelerators alongside GPUs in the same rack to unlock dramatically faster token generation, supporting a premium-token economy.

Disaggregated computing techniques are increasingly being adopted by hyperscalers and frontier AI labs to slash response latency as well as the compute and energy costs that accompany GPU-only approaches. For disaggregated workloads, GPUs dominate the compute-intensive prefill portion of the workload, while Corsair excels at the decode phase. GPUs and Corsair accelerators operate in concert to deliver premium-level interactive AI experiences that individual users and enterprises are increasingly seeking — and paying for.

Delivering Data Center-Scale Inference

d-Matrix’s SquadRack reference design built with Arista, Broadcom and Supermicro delivers a complete, production-ready rack-scale inference solution. d-Matrix’s acquisition of GigaIO’s data center business this April brought a team of proven rack-scale systems engineers directly into d-Matrix, deepening the company’s expertise in large-scale data center deployment, integration, and field operations and accelerating SquadRack’s path to production.

Designed to run in industry-standard data center environments, SquadRack solutions do not require liquid cooling and can be deployed within days of installation. SquadRack integrates Corsair inference accelerators, d-Matrix JetStream™ high-speed networking, and d-Matrix’s Aviator™ software stack into a cohesive rack-level system — purpose-built for the latency-sensitive, always-on inference demands of frontier AI labs and large-scale cloud providers.

d-Matrix provides customers the ability to customize their own server racks, with Corsair-powered solutions available in rack, server and air-cooled PCIe card formats.

Purpose-Built for Supply Chain Certainty

d-Matrix designed Corsair from the ground up with supply chain predictability as a core requirement, not an afterthought. Manufactured in partnership with TSMC and Alchip Technologies on TSMC’s established N6 process node, Corsair benefits from reliable, high-volume manufacturing capability that enables d-Matrix to meet customer commitments on schedule.

Corsair’s SRAM-based in-memory compute chiplet architecture — built on organic substrates, rather than HBM-based CoWoS packaging — is a deliberate design choice that streamlines the supply chain and sidesteps the memory integration complexities that have slowed other AI accelerator deployments. Combined with LP-DDR5 memory technology, d-Matrix has assembled a manufacturing ecosystem built for ease of supply.

With ecosystem partners in place, production underway, and customers able to receive product in volume beginning this summer, d-Matrix is positioned to meet the inference moment the industry has been building toward.

Partner Quotes

“TSMC is pleased to support d-Matrix’s production ramp of the Corsair inference platform on our N6 process technology,” said Lucas Tsai, Vice President of New Strategic Engagement at TSMC North America. “We look forward to continuing our collaboration with innovators like d-Matrix to develop transformative solutions powered by our leading-edge technology, as the demand for low-latency AI inference continues to grow.”

“Alchip has partnered closely with d-Matrix from the earliest stages of Corsair’s design, so we’re extremely proud that it is in full production,” said Johnny Shen, Chairman, CEO and President, Alchip Technologies. “The d-Matrix team provides exceptional silicon expertise, and the Corsair architecture is a compelling example of purpose-built inference achievement. We look forward to scaling this platform together.”

Availability

d-Matrix’s Corsair inference platform is being made available to select, qualified customers. For inquiries and consideration, please visit: https://www.d-matrix.ai/contact-sales/.

About d-Matrix

d-Matrix is pioneering accelerated computing for AI inference, breaking through the limits of latency, cost and energy. Its Corsair inference accelerators, JetStream networking accelerators, Aviator software, and SquadRack rack-scale solutions deliver fast, sustainable AI inference at data center scale. Learn more at www.d-matrix.ai or follow us on LinkedIn.

Media Contacts:

Kristin Bryson
d-Matrix
VP of Corporate Communications
kbryson@d-matrix.ai
+1 203.241.9190

Aircover Communications
d-matrix@aircoverpr.com

[1]Independent testing by Gimlet Labs demonstrated that a baseline 24-second response time was reduced to less than two seconds when pairing Corsair accelerators with GPUs, as opposed to using GPUs only. See March 11, 2026 blog “Low-Latency Inference with Speculative Decoding on d-Matrix Corsair and GPU” at https://gimletlabs.ai/blog/low-latency-spec-decode-corsair.

d-Matrix Corsair AI Inference Platform Enters Full Production to Meet Customer Demand

Suggested Articles

Where heterogeneous pipelines power an agentic coding future

Why modern AI workloads demand a disaggregated approach

The fight for latency: why agents have changed the game

d-Matrix Corsair AI Inference Platform Enters Full Production to Meet Customer Demand

Suggested Articles

Where heterogeneous pipelines power an agentic coding future

Why modern AI workloads demand a disaggregated approach

The fight for latency: why agents have changed the game

Get the latest d-Matrix updates to your inbox. Sign up below: