d-Matrix and Gimlet Labs to Deliver Low Latency AI Inference
Combined solution enables 10x performance benefits in latency and throughput per Watt compared to a GPU-only approach.

d-Matrix and Gimlet’s combined solution can deliver order-of-magnitude improvements in both inference latency and throughput per Watt compared to traditional GPU-only deployments. The solution is well suited to latency-sensitive workloads, including speculative decoding, which large-scale AI deployments commonly adopt to reduce latency.


With d-Matrix Corsair accelerators on Gimlet’s Cloud, workloads already well optimized for agentic AI can achieve even greater performance gains, with token delivery speeds that support the industry-leading levels of interactivity required by today’s most critical applications.
