Together AI releases Mamba-3 with an updated recurrence design

Together AI has introduced Mamba-3, a new state space model architecture with a more expressive recurrence formula and complex-valued state tracking. The design prioritizes inference efficiency over training speed. A multi-input, multi-output (MIMO) variant improves accuracy without adding decoding latency. In benchmarks at the 1.5B parameter scale, the single-input, single-output (SISO) version of Mamba-3 outperforms Mamba-2, Gated DeltaNet, and Llama-3.2-1B on prefill and decode latency. The open-sourced kernels are written with Triton, TileLang, and CuTe DSL to optimize hardware performance.
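For readers unfamiliar with state space models, the following is a minimal sketch of a generic diagonal linear recurrence with a complex-valued state. It is not Mamba-3's actual recurrence formula, which this summary does not give; the function names `ssm_decode_step` and `run` and the parameters `decay`, `b`, and `c` are hypothetical and chosen only for illustration. It shows two general properties of this model family: each decode step updates a fixed-size state, so per-token cost does not grow with sequence length, and a complex decay can rotate the state where a real decay between 0 and 1 can only shrink it.

```python
import cmath


def ssm_decode_step(state, x, decay, b, c):
    """One decode step of a generic diagonal linear recurrence.

    Illustrative only: h_t = decay * h_{t-1} + b * x_t (elementwise),
    y_t = Re(sum_i c_i * h_t[i]). Not the Mamba-3 update rule.
    """
    new_state = [a * h + bi * x for a, h, bi in zip(decay, state, b)]
    y = sum(ci * hi for ci, hi in zip(c, new_state)).real
    return new_state, y


def run(xs, decay, b, c):
    """Feed a sequence one token at a time, carrying only the fixed-size state."""
    state = [0j] * len(decay)
    ys = []
    for x in xs:
        state, y = ssm_decode_step(state, x, decay, b, c)
        ys.append(y)
    return ys


impulse = [1.0, 0.0, 0.0, 0.0, 0.0]

# Real decay in (0, 1): the response to an impulse only shrinks.
real_ys = run(impulse, decay=[0.5], b=[1.0], c=[1.0])

# Complex decay of magnitude 1 and angle pi/2: the state rotates a
# quarter turn per step, so the output oscillates instead of decaying.
quarter_turn = cmath.exp(1j * cmath.pi / 2)
complex_ys = run(impulse, decay=[quarter_turn], b=[1.0], c=[1.0])

print(real_ys)
print([round(y, 6) for y in complex_ys])
```

In this toy setting the state is a single number, so memory use per step is constant regardless of how many tokens have been processed; a Transformer decoder, by contrast, keeps a key-value cache that grows with the sequence.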

by Fla

