Alibaba unveils Qwen3 open weight models with hybrid reasoning and multilingual support

Chinese tech giant Alibaba has unveiled Qwen3, the latest in its series of large language models designed for processing text, code, math, images, and audio. The flagship model, Qwen3-235B-A22B, posts competitive benchmark results against other leading models such as DeepSeek-R1, o1, o3-mini, Grok-3, and Gemini-2.5-Pro. Notably, the smaller Mixture-of-Experts (MoE) model, Qwen3-30B-A3B, outperforms QwQ-32B with roughly a tenth of the activated parameters, while the compact Qwen3-4B rivals the performance of Qwen2.5-72B-Instruct.

Alibaba is releasing the weights of two MoE models: Qwen3-235B-A22B, with 235 billion total parameters (22 billion activated), and Qwen3-30B-A3B, with 30 billion total parameters (3 billion activated). Additionally, six dense models are open-weighted under the Apache 2.0 license. Qwen3 models use a hybrid problem-solving approach with a Thinking mode for complex, multi-step queries and a Non-Thinking mode for fast answers to simpler ones.

Supporting 119 languages and dialects, Qwen3 models also feature enhanced agentic capabilities. Post-trained models, such as Qwen3-30B-A3B, are available on platforms like Hugging Face and Kaggle, and can be deployed or run locally using tools such as Ollama and LM Studio.
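For local use via Ollama, as the article notes, a pull-and-run workflow would look roughly like the following; the exact model tag is an assumption and should be verified against the Ollama model library.

```shell
# Pull and chat with the small MoE model locally.
# Tag name assumed -- check the Ollama library for the published tags.
ollama run qwen3:30b-a3b

# Or query the locally served model over Ollama's HTTP API,
# using the /no_think soft switch for a quick answer:
curl http://localhost:11434/api/generate -d '{
  "model": "qwen3:30b-a3b",
  "prompt": "Why is the sky blue? /no_think"
}'
```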

by Paul

