
Why we needed a new Transparent NIC solution
Large-scale AI workloads are deployed by stringing together many cards and many nodes. Any increase in model size, context length or users impacts the number of cards needed. Gone are…
AI adoption is accelerating at an unprecedented pace, but the economics of running AI models are becoming unsustainable. And the cracks are beginning to show. As more people rush to…
By now we’ve become accustomed to a brand-new model coming out every few weeks that one-ups the last one. Developers don’t sit with just a single model—in fact,…
The future of AI workflows is almost certainly going to be multimodal agents. But rather than incredibly complex, compute-hungry multimodal models, there’s already a much easier pathway to get there.…
‘AI Inference’ has been trending everywhere, from keynote speeches to quarterly earnings reports to the news. You have probably heard phrases like “inference-time compute”, “reasoning” and “AI deployment” and…
In this talk, d-Matrix CTO/Cofounder Sudeep Bhoja discusses the impact that the release of the DeepSeek R1 model is having on inference compute. Stepping through the evolution of reasoning…