The Next 100x

Follow our journey as we aim to disrupt the economics of AI compute.

Get in touch with our media team at media@d-matrix.ai
Overhead view of a large warehouse filled with server racks.
Explore Blog
News

AI Compute Company Banks on Chiplets for Future Processors

By:
Design News

Chiplet packaging is catching on with companies designing high-performance processors for data center and AI applications. While familiar names such as Intel and AMD are in this space, so are some smaller startup companies. One of them is d-Matrix, a young company developing technology for AI-compute and inference processors.

Read Article
News

A Better Approach to Hyperscale Computing

By:
Playground Global

Why We Invested in d-Matrix

Read Article
News

d-Matrix Announces $44 Million in Funding

By:
Business Wire

d-Matrix Announces $44 Million in Funding to Build a One-of-a-kind Compute Platform Targeted for At-Scale Transformer AI Datacenter Inference

Read Article
News

D-Matrix’s new chip will optimize matrix calculations

By:
VentureBeat

Today, D-Matrix, a company focused on building accelerators for complex matrix math supporting machine learning, announced a $44 million Series A round. Playground Global led the round with support from Microsoft’s M12 and SK Hynix. The three join existing investors Nautilus Venture Partners, Marvell Technology and Entrada Ventures.

Read Article
News

d-Matrix Gets Funding to Build SRAM ‘Chiplets’ for AI Inference

By:
Datanami

Hardware startup d-Matrix says the $44 million it raised in a Series A round today will help it continue development of a novel “chiplet” architecture that embeds 6-nanometer chips in SRAM memory modules for accelerating AI workloads.

Read Article
News

D-Matrix Debuts With $44 Million and an AI Solution

By:
Futuriom

A four-year-old startup named d-Matrix has scored $44 million in funding for a silicon solution tailored to massive artificial intelligence (AI) workloads. Investors include M12 (Microsoft’s venture fund) and Marvell Technology (Nasdaq: MRVL).

Read Article
Conference Presentation

A Chiplet Based Generative Inference Architecture With Block Floating Point Datatypes

By:
Sudeep Bhoja

Abstract: The advent of large transformer-based language models (BERT, GPT-3, ChatGPT, LaMDA, Switch) for Natural Language Processing (NLP), and their explosive growth across generative AI business and consumer applications, has made it imperative for AI-accelerated computing solutions to deliver order-of-magnitude improvements in efficiency. We will discuss a modular, chiplet-based, spatial CGRA-like architecture optimized for generative inference, with a generalized framework for the successful implementation of deep-RL-based mappers in compilers for spatial and temporal architectures. We’ll present results for weight and activation quantization in block floating point formats, building on GPTQ and SmoothQuant, and their support in PyTorch. To reduce KV cache size and bandwidth, we’ll present an extension to EL-attention.
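
The abstract refers to block floating point (BFP) quantization, in which a block of values shares a single power-of-two exponent while each value keeps its own short signed mantissa. As a rough illustration only, here is a minimal sketch of a generic BFP round trip in PyTorch; it is not d-Matrix’s actual datatype, and it does not reproduce the GPTQ or SmoothQuant pipelines named above:

```python
# Minimal sketch of generic block floating point (BFP) quantization.
# Assumption: one shared power-of-two exponent per block, plus a signed
# integer mantissa per element. Not d-Matrix's actual format.
import torch

def bfp_quantize(x: torch.Tensor, block_size: int = 16,
                 mantissa_bits: int = 8) -> torch.Tensor:
    """Quantize a 1-D tensor to BFP and dequantize back, to inspect the error."""
    n = x.numel()
    pad = (-n) % block_size
    blocks = torch.nn.functional.pad(x, (0, pad)).view(-1, block_size)

    # One shared exponent per block, chosen from the largest magnitude.
    max_abs = blocks.abs().amax(dim=1, keepdim=True).clamp_min(1e-30)
    shared_exp = torch.ceil(torch.log2(max_abs))

    # Per-element integer mantissas; the clamp catches the one value that
    # can round up to 2^(mantissa_bits - 1).
    scale = torch.exp2(shared_exp - (mantissa_bits - 1))
    q = torch.clamp(torch.round(blocks / scale),
                    -(2 ** (mantissa_bits - 1)),
                    2 ** (mantissa_bits - 1) - 1)

    # Dequantize so the quantization error can be measured directly.
    return (q * scale).view(-1)[:n]

w = torch.randn(1024)
w_bfp = bfp_quantize(w)
print("max abs error:", (w - w_bfp).abs().max().item())
```

The practical trade-off is block size: larger blocks amortize the shared exponent over more values, but lose precision when a block mixes large and small magnitudes.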

Read Article
Press Article

ChatGPT And Generative AI Are Booming, But The Costs Can Be Extraordinary

By:
CNBC

- The cost to develop and maintain the software can be extraordinarily high.
- Nvidia makes most of the GPUs for the AI industry, and its primary data center workhorse chip costs $10,000.
- Analysts and technologists estimate that the critical process of training a large language model such as GPT-3 could cost over $4 million.

Read Article
Press Article

Infrastructure To Support OpenAI's ChatGPT Could Be Costly

By:
TechTarget

The CEO of one chip startup with a product now in development said users need a massive number of GPUs to handle the range of common AI tasks such as training models, inference, and high-performance computing.

Read Article
News

New Microsoft partnership accelerates generative AI development

By:
VentureBeat

One of the hottest trends in artificial intelligence (AI) this year has been the emergence of popular generative AI models. With technologies including DALL-E and Stable Diffusion, there are a growing number of startups and use cases that are emerging.

Read Article
Press Article

Generative AI drives an explosion in compute: The looming need for sustainable AI

By:
Sid Sheth

The memory wall refers to the physical barriers limiting how fast data can be moved in and out of memory. It’s a fundamental limitation of traditional architectures. In-memory computing (IMC) addresses this challenge by running AI matrix calculations directly in the memory module, avoiding the overhead of sending data across the memory bus.
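
To see why the memory wall dominates generative inference, a back-of-envelope sketch helps. The hardware numbers below are hypothetical placeholders, not any specific d-Matrix or GPU part; they are chosen only to show that a token-generation matrix-vector product moves roughly one byte per FLOP and is therefore bandwidth-bound on a conventional architecture:

```python
# Back-of-envelope roofline estimate with assumed (hypothetical) numbers.
peak_flops = 300e12      # 300 TFLOP/s of compute (assumption)
mem_bw     = 2e12        # 2 TB/s of memory bandwidth (assumption)

# One GEMV per token: y = W @ x with an n x n weight matrix in 2-byte weights.
n = 12_288               # hidden size in the GPT-3 ballpark
flops = 2 * n * n        # one multiply-add per weight
bytes_moved = 2 * n * n  # every weight is streamed from memory once per token

t_compute = flops / peak_flops
t_memory  = bytes_moved / mem_bw
print(f"compute-bound time: {t_compute * 1e6:8.2f} us")
print(f"memory-bound time:  {t_memory * 1e6:8.2f} us")
print(f"arithmetic intensity: {flops / bytes_moved:.1f} FLOPs/byte")
# At ~1 FLOP/byte the GEMV is memory-bound by two orders of magnitude:
# this is the data movement that IMC avoids by computing where the
# weights already live.
```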

Read Article
Conference Presentation

Developing Scalable AI Inference Chip with Cadence Flow in Azure Cloud

By:
Farhad Shakeri (Sr. Director IT/Cloud)

How d-Matrix set up a productive Azure cloud infrastructure running the Cadence flow, the lessons learned, and the key success factors that led to delivering its first AI chip within 14 months.

Read Article
Conference Presentation

Accelerating Transformers for Efficient Inference of Giant NLP Models

By:
Sudeep Bhoja (Co-founder / CTO)

Large transformer models are finding uses across speech, text, video, and images. In this presentation, we explain the challenges of accelerating these large models in hardware.

Read Article
White Paper

Designing Next-Gen AI Inferencing Chips Using Azure's Scalable IT Cloud Infrastructure

By:
d-Matrix, Microsoft & Six Nines

A case study describing how d-Matrix built its first proof-of-concept AI chip entirely in Microsoft's Azure cloud.

Read Article