Alibaba releases open-source Qwen3-Next model with 10x power, long context and lower costs

Alibaba has introduced Qwen3-Next, a new generation of open-source large language models claimed to be ten times more powerful and ten times cheaper to build than their predecessor. The release includes two variants, Qwen3-Next-Instruct and Qwen3-Next-Thinking, both under the Apache 2.0 license and available on Hugging Face, ModelScope, Kaggle, Alibaba Cloud, and Qwen Chat.

Qwen3-Next uses a hybrid architecture combining Gated DeltaNet for fast processing and Gated Attention for reasoning accuracy. Only 3 billion of its 80 billion parameters are activated per token, improving efficiency. It also expands its Mixture-of-Experts design to 512 experts, up from 128 in Qwen3, balancing performance and cost. The models support a native 256,000-token context window and can handle up to one million tokens with RoPE scaling, while offering at least 25 percent lower pricing than Qwen3-235B on Alibaba Cloud.
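To make the sparse-activation figures concrete, here is a minimal sketch of the arithmetic and of top-k expert routing in general. The parameter and expert counts come from the article; the number of experts chosen per token (`TOP_K`) and the random router scores are assumptions for illustration only, and this is not Qwen3-Next's actual routing code. With 3 billion of 80 billion parameters active, roughly 3.75 percent of the model is used for each token.

```python
import random

TOTAL_PARAMS = 80e9    # total parameters, from the article
ACTIVE_PARAMS = 3e9    # parameters activated per token, from the article
NUM_EXPERTS = 512      # experts in the Mixture-of-Experts layer, from the article
TOP_K = 10             # ASSUMPTION: experts selected per token (not stated in the article)


def active_fraction(active, total):
    """Share of the model's parameters that take part in processing one token."""
    return active / total


def route_top_k(scores, k):
    """Return the indices of the k highest-scoring experts for one token."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Stand-in router scores; a real router computes these from the token's hidden state.
rng = random.Random(0)
router_scores = [rng.random() for _ in range(NUM_EXPERTS)]
chosen = route_top_k(router_scores, TOP_K)

print(f"Active share of parameters: {active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS):.2%}")
print(f"Experts used for this token: {len(chosen)} of {NUM_EXPERTS}")
```

Only the selected experts run for a given token, which is why adding more experts can raise total capacity without a matching rise in per-token compute.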

Additional features include native multi-token prediction for faster inference, updated normalization for more stable training, and major throughput improvements on long contexts. The models are integrated with Hugging Face Transformers, SGLang, vLLM, and Qwen-Agent, and the full 80-billion-parameter model can run on a single Nvidia H200 GPU.

by Mauricio B. Holguin
