
Together AI releases Mamba-3 with an updated recurrence design
Together.ai introduces Mamba-3, a new state space model architecture featuring a more expressive recurrence formula and complex-valued state tracking, prioritizing inference efficiency over training speed. A multi-input, multi-output (MIMO) variant increases accuracy without added decoding latency. Benchmark tests show that Mamba-3 SISO surpasses Mamba-2, Gated DeltaNet, and Llama-3.2-1B in prefill and decode latency at the 1.5B parameter scale. The open-sourced kernels use Triton, TileLang, and CuTe DSL for optimized hardware performance.
No comments so far, maybe you want to be first?
Gu