Corsair. Built for generative AI
Corsair is a first-of-its-kind compute platform, offering unparalleled performance and scalability for your generative AI needs in cloud and enterprise environments.
More. Faster. Cheaper.
The lowest latency and highest throughput deliver the lowest cost per query, blazing-fast token generation, and the best TCO.
- Up to 20x higher throughput for generative inference on LLMs
- Up to 20x lower inference latency for generative LLMs
- Up to 30x better TCO to help your bottom line
- 130 billion transistors
- 84 TB/s NoC bandwidth
- Native FP & block float compute
Corsair is a one-of-a-kind digital in-memory compute (DIMC) engine with a blazing-fast die-to-die interconnect for scale-up and scale-out.
- 2048 DIMC cores
- 8 TB/s die-to-die bandwidth
- 2 GB SRAM with 150 TB/s bandwidth
Corsair is designed to fit your model entirely in memory, giving you the power to deploy the way you intended.
- 2400–9600 TFLOPS
- 1 TB/s chip-to-chip bandwidth
- 256 GB LPDDR5 prompt cache
Corsair comes complete with a frictionless Software Development Kit (SDK) for “push-button, zero-touch” deployment.
20x better power efficiency
Aviator: The only way to pilot your model
Corsair’s architecture was designed with software in mind. Our Aviator software stack delivers on the promise of a “frictionless” user experience.
Aviator uses broadly adopted open-source software to allow end users to easily deploy their model.
The system software integrates easily into inference servers for process spawning, pipeline management, and scale-out communication.
The compiler and runtime produce highly optimized placements and schedules for graph execution without manual intervention.