Corsair. Built for Generative AI
Corsair is a first-of-its-kind compute platform, offering unparalleled performance and the ability to scale with your generative AI needs in cloud and enterprise environments.
More. Faster. Cheaper.
The lowest latency and highest throughput deliver the lowest cost per query, blazing-fast token generation, and the best TCO.
High Throughput
Up to 20x higher throughput for generative inference on LLMs
Low Latency
Up to 20x lower inference latency for generative LLMs
Significant Savings
Up to 30x better TCO to help your bottom line
Results are preliminary and subject to change.
Introducing Corsair
- 130 billion transistors
- 84TB/s NoC bandwidth
- Native FP & block floating-point compute
Corsair is a one-of-a-kind digital in-memory compute engine with a blazing-fast die-to-die interconnect for scale-up and scale-out.
- 2048 DIMC cores
- 8TB/s die-to-die bandwidth
- 2GB SRAM, 150TB/s
Corsair is designed to fit your model entirely in memory, giving you the power to deploy the way you intended.
- 2400–9600 TFLOPS
- 1TB/s chip-to-chip bandwidth
- 256GB LPDDR5 prompt cache
Corsair comes complete with a frictionless Software Development Kit (SDK) for “push-button, zero-touch” deployment.
- 20x better power efficiency
- 20x lower latency
- 10x lower TCO
Aviator: The only way to pilot your model
Corsair’s architecture was designed with software in mind. Our Aviator software stack delivers on the promise of a “frictionless” user experience.
Open source
Aviator builds on broadly adopted open-source software so end users can easily deploy their models.
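The page doesn't name the specific open-source interfaces Aviator supports, but a common pattern in open-source LLM serving is an OpenAI-compatible endpoint. Here is a minimal sketch under that assumption; the base URL, API key, and model name are hypothetical placeholders, not confirmed parts of the product:

```python
# Minimal sketch, assuming an OpenAI-compatible serving endpoint.
# The base_url, api_key, and model name below are hypothetical.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # hypothetical Corsair-backed server
    api_key="not-needed-for-local",       # placeholder for a local deployment
)

response = client.chat.completions.create(
    model="llama-3-8b-instruct",          # hypothetical model name
    messages=[{"role": "user", "content": "Summarize Corsair in one line."}],
    max_tokens=64,
)
print(response.choices[0].message.content)
```

Because the client only targets a standard API surface, swapping the backend for a different serving stack would require no application changes.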
Scalable
System software easily integrates into inference servers for process spawning, pipeline management, and scale-out communication.
Frictionless
The compiler and runtime produce highly optimized placements and schedules for graph execution without manual intervention.