
Introducing dmx.compressor
Quantization plays a key role in reducing memory usage, speeding up inference, and lowering energy consumption at inference time. As large language models (LLMs) continue to grow exponentially in size —…
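As a rough illustration of the memory savings the post describes, here is a minimal sketch of symmetric per-tensor int8 post-training quantization. This is a generic example, not the dmx.compressor API; the matrix shape and scheme are assumptions chosen only to show the 4x reduction relative to float32.

```python
# Generic post-training quantization sketch (NOT the dmx.compressor API):
# symmetric per-tensor int8 quantization of a float32 weight matrix.
import numpy as np

def quantize_int8(w: np.ndarray):
    """Map float32 weights onto int8 with a single scale factor."""
    scale = np.abs(w).max() / 127.0          # largest magnitude maps to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(4096, 4096).astype(np.float32)   # assumed example shape
q, scale = quantize_int8(w)

print(f"float32: {w.nbytes / 2**20:.1f} MiB, int8: {q.nbytes / 2**20:.1f} MiB")
print(f"max abs error: {np.abs(w - dequantize(q, scale)).max():.4f}")
```

Storing int8 values plus one scale factor cuts weight memory by 4x versus float32, at the cost of a small, bounded rounding error per element.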
Landscape of AI Computing
Artificial intelligence (AI) has permeated countless fields, powered by advances in the latest generative architectures, to a point where a form of artificial general intelligence (AGI)…
You’ve probably heard the adage, “hardware is hard.” That’s definitely true. When you’re making chips for the rapidly changing world of AI, things really get hard early on. The AI…
TL;DR: Generative AI inference is often bottlenecked by a growing KV cache. Numerous strategies have been proposed to compress the KV cache to allow longer inference-time context lengths. However, most of…
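To see why the KV cache becomes the bottleneck, here is a back-of-the-envelope sizing sketch. The model shape is an assumption (a 7B-class model with 32 layers, 32 heads, head dimension 128, fp16 cache), not taken from the post; the point is only that cache size grows linearly with context length.

```python
# Back-of-the-envelope KV cache sizing for an assumed 7B-class model shape:
# 32 layers, 32 attention heads, head dim 128, 2 bytes per fp16 element.
layers, heads, head_dim, bytes_per_elem = 32, 32, 128, 2

def kv_cache_bytes(context_len: int, batch: int = 1) -> int:
    # factor of 2 for the separate key and value tensors at every layer
    return 2 * layers * batch * context_len * heads * head_dim * bytes_per_elem

for ctx in (2_048, 32_768, 131_072):
    print(f"{ctx:>7} tokens -> {kv_cache_bytes(ctx) / 2**30:.1f} GiB per sequence")
```

Under these assumptions a single sequence at a 131K-token context needs 64 GiB of cache, which dwarfs activation memory and motivates the compression strategies the post surveys.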
Every March 8th, International Women’s Day (IWD) offers us a moment to pause, reflect, and celebrate the vast contributions of women across all areas of life—especially in technology. At d-Matrix,…