NVIDIA debuts Nemotron 3 open models for scalable agent systems with up to 500B parameters

NVIDIA has introduced the Nemotron 3 family of open models, along with datasets and libraries aimed at advancing agentic AI across industries. The models use a hybrid latent mixture-of-experts (MoE) architecture designed to reduce communication overhead, limit context drift, and lower inference costs while supporting scalable multi-agent systems.

The lineup includes three variants: Nemotron 3 Nano with 30B parameters and 3B active per token, Super with 100B parameters and 10B active per token, and Ultra with 500B parameters and 50B active per token. This range lets developers align model choice with workload complexity and cost targets. Compared to Nemotron 2 Nano, the Nano model delivers up to 4x higher throughput, cuts reasoning-token generation by up to 60 percent, and supports long-horizon tasks with a one-million-token context window.
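A quick way to see why sparse activation lowers inference cost: each variant activates only a fraction of its weights per token, so per-token compute tracks active parameters rather than total parameters. A minimal sketch using the figures above (the helper function is illustrative, not part of any NVIDIA API):

```python
# Active-parameter ratios for the Nemotron 3 lineup (figures from the article).
# In a mixture-of-experts model, a router selects a subset of experts per
# token, so only the "active" parameters contribute to per-token compute.
LINEUP = {
    "Nano":  {"total_b": 30,  "active_b": 3},
    "Super": {"total_b": 100, "active_b": 10},
    "Ultra": {"total_b": 500, "active_b": 50},
}

def active_fraction(total_b: float, active_b: float) -> float:
    """Fraction of the model's weights that are activated per token."""
    return active_b / total_b

for name, spec in LINEUP.items():
    frac = active_fraction(spec["total_b"], spec["active_b"])
    print(f"{name}: {frac:.0%} of weights active per token")
```

All three variants activate roughly 10 percent of their weights per token, which is how a 500B-parameter model can serve tokens at a compute cost closer to a 50B dense model.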

Independent benchmarks place Nemotron 3 Nano at the top of its size class for openness, efficiency, and accuracy. Nemotron 3 Super and Ultra use NVFP4 4-bit training on the NVIDIA Blackwell architecture to reduce memory usage and speed up training. Nemotron 3 Nano is available through Hugging Face, through inference providers such as Together AI, OpenRouter, and Fireworks AI, and as an NVIDIA NIM microservice, while Super and Ultra are expected in the first half of 2026.

by Mauricio B. Holguin

MORE ABOUT: #NVIDIA NIM

NVIDIA NIM is a collection of accelerated inference microservices designed for deploying AI models on NVIDIA GPUs across various environments. It enables organizations to efficiently utilize NVIDIA's hardware capabilities for AI tasks, offering scalable and flexible solutions for running complex models.
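NIM microservices expose an OpenAI-compatible HTTP API, so an agent can talk to a deployed Nemotron model with a standard chat-completions request. A hedged sketch below builds such a request; the endpoint URL and model identifier are placeholder assumptions, not confirmed names for Nemotron 3:

```python
import json

# Assumed local NIM deployment; NIM services conventionally serve an
# OpenAI-compatible API under /v1. URL and model name are placeholders.
NIM_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-compatible chat-completions payload for a NIM endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("nvidia/nemotron-3-nano", "Summarize this ticket.")
print(json.dumps(payload, indent=2))
# Send with e.g. requests.post(NIM_URL, json=payload) against a running service.
```

Because the payload shape matches the OpenAI API, existing agent frameworks and client libraries can usually be pointed at a NIM endpoint by changing only the base URL.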
