Corsair. Built for
Generative AI

Corsair is a first-of-its-kind compute platform, offering unparalleled performance and the ability to scale with your generative AI needs in cloud and enterprise environments.

More. Faster. Cheaper.

The lowest latency and highest throughput deliver the lowest cost per query, blazing-fast token generation, and the best TCO.

Results are preliminary and subject to change.

High Throughput

Up to 20x higher throughput for generative inference on LLMs

Low Latency

Up to 20x lower inference latency for generative LLMs

Significant Savings

Up to 30x better TCO to help your bottom line

Introducing Corsair

  • 130 billion transistors
  • 84TB/s network-on-chip (NoC) bandwidth
  • Native FP & block float compute
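
The "block float" compute above refers to block floating point, in which a block of values shares a single exponent while each value keeps only a small integer mantissa, trading a little precision for much denser storage and arithmetic. A minimal, illustrative sketch of the idea (generic block floating point, not Corsair's exact format; the block size and mantissa width here are assumptions):

```python
import math

def block_float_encode(values, block_size=16, mantissa_bits=8):
    """Quantize values into block floating point: each block of numbers
    shares one exponent, and each value keeps a small integer mantissa."""
    blocks = []
    for i in range(0, len(values), block_size):
        block = values[i:i + block_size]
        # Shared exponent: large enough for the biggest magnitude in the block.
        max_abs = max(abs(v) for v in block)
        exp = math.frexp(max_abs)[1] if max_abs != 0 else 0
        scale = 2.0 ** (exp - (mantissa_bits - 1))
        # Per-value integer mantissas at the shared scale.
        mantissas = [round(v / scale) for v in block]
        blocks.append((exp, mantissas))
    return blocks

def block_float_decode(blocks, mantissa_bits=8):
    out = []
    for exp, mantissas in blocks:
        scale = 2.0 ** (exp - (mantissa_bits - 1))
        out.extend(m * scale for m in mantissas)
    return out

vals = [0.11, -0.52, 0.93, 0.004]
decoded = block_float_decode(block_float_encode(vals, block_size=4))
# Decoded values approximate the originals to within the block's quantization step.
```

Sharing one exponent per block is what lets the multiply-accumulate hardware operate on narrow integers rather than full floating-point values.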

Corsair is a one-of-a-kind digital in-memory compute engine with a blazing fast die-to-die interconnect for scale-up and scale-out.

  • 2048 DIMC cores
  • 8TB/s die-to-die bandwidth
  • 2GB SRAM at 150TB/s

Corsair is designed to fit your model entirely in memory, giving you the power to deploy the way you intended.

  • 2400 – 9600 TFLOPS
  • 1TB/s chip-to-chip bandwidth
  • 256GB LPDDR5 prompt cache

Corsair comes complete with a frictionless Software Development Kit (SDK) for “push-button, zero-touch” deployment.
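
The "prompt cache" in the specs above refers to a standard inference technique: store the computed state (for example, attention KV tensors) for previously seen prompt prefixes so that a repeated prefix skips recomputation and only the new suffix is processed. A minimal, hypothetical sketch of the idea (not the Aviator API; every name here is illustrative):

```python
class PromptCache:
    """Toy prompt cache: per-prefix state keyed by the token prefix,
    so a repeated prefix skips recomputation."""
    def __init__(self):
        self._store = {}

    def longest_prefix(self, tokens):
        # Find the longest cached prefix of `tokens`, longest first.
        for n in range(len(tokens), 0, -1):
            key = tuple(tokens[:n])
            if key in self._store:
                return n, self._store[key]
        return 0, None

    def put(self, tokens, state):
        self._store[tuple(tokens)] = state

def run_prompt(cache, tokens, compute_state):
    """Reuse cached work for the longest matching prefix; compute the rest."""
    hit_len, state = cache.longest_prefix(tokens)
    # Only the uncached suffix needs fresh computation.
    state = compute_state(tokens[hit_len:], state)
    cache.put(tokens, state)
    return hit_len

cache = PromptCache()
work = [0]
def fake_compute(suffix, prior):
    work[0] += len(suffix)  # pretend each new token costs one unit of work
    return (prior or ()) + tuple(suffix)

run_prompt(cache, [1, 2, 3, 4], fake_compute)            # cold: 4 tokens computed
hits = run_prompt(cache, [1, 2, 3, 4, 5], fake_compute)  # warm: prefix of 4 reused
```

A large dedicated prompt cache matters because long shared prefixes (system prompts, few-shot examples, documents) dominate many production workloads.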

20x better power efficiency

Aviator: The only way to pilot your model

Corsair’s architecture was designed with software in mind. Our Aviator software stack delivers on the promise of a “frictionless” user experience.

Open source

Aviator builds on broadly adopted open-source software so end users can deploy their models easily.


Aviator's system software integrates easily into inference servers for process spawning, pipeline management, and scale-out communication.


The compiler and runtime produce highly optimized placements and schedules for graph execution without manual intervention.
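
The automatic scheduling described above starts from a dependency-ordered walk of the model's dataflow graph. A toy sketch of just that ordering step (Kahn's topological sort; the op names are made up, and a real compiler additionally places ops onto cores and overlaps their execution):

```python
from collections import deque

def schedule(graph):
    """Emit ops of a dataflow graph in topological order, so every op
    runs only after all of its inputs. graph: op -> list of input ops."""
    consumers = {op: [] for op in graph}
    indegree = {}
    for op, deps in graph.items():
        indegree[op] = len(deps)
        for d in deps:
            consumers[d].append(op)
    ready = deque(op for op, n in indegree.items() if n == 0)
    order = []
    while ready:
        op = ready.popleft()
        order.append(op)
        for c in consumers[op]:
            indegree[c] -= 1
            if indegree[c] == 0:
                ready.append(c)
    if len(order) != len(graph):
        raise ValueError("graph has a cycle")
    return order

# A small attention-like fragment: projections feed scores, scores feed softmax.
llm_block = {
    "embed": [],
    "q_proj": ["embed"], "k_proj": ["embed"], "v_proj": ["embed"],
    "scores": ["q_proj", "k_proj"],
    "softmax": ["scores"],
    "attn_out": ["softmax", "v_proj"],
}
order = schedule(llm_block)
```

"Without manual intervention" means the user hands over a graph like `llm_block` and the toolchain derives a legal, efficient execution order and placement on its own.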

Warp speed

Frictionless experience

Lowest TCO

AI Delivered