Qwen3 is the latest large language model series developed by the Qwen team at Alibaba Cloud, building on QwQ and Qwen2.5. The weights of Qwen3 are publicly available and include both dense and Mixture-of-Experts (MoE) models.
Key highlights:
- Diverse Models: Dense models at 0.6B, 1.7B, 4B, 8B, 14B, and 32B parameters, plus MoE models 30B-A3B and 235B-A22B.
- Thinking Mode: Seamlessly switch between thinking mode (reasoning, math, coding) and non-thinking mode (efficient general chat).
- Reasoning: Enhanced reasoning capabilities surpassing QwQ and Qwen2.5 in math, code, and logical reasoning.
- Human Alignment: Superior alignment for creative writing, role-playing, and instruction following.
- Agent Expertise: Precise integration with external tools in both thinking and non-thinking modes.
- Multilingual: Supports 100+ languages with strong multilingual instruction following and translation.
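In thinking mode, the model emits its reasoning inside `<think>…</think>` tags before the final answer, so client code typically separates the two. A minimal sketch of that post-processing; the `split_thinking` helper is hypothetical, not part of any Qwen library:

```python
import re

def split_thinking(response: str) -> tuple[str, str]:
    """Split a thinking-mode response into (thinking, answer).

    Assumes the reasoning is wrapped in <think>...</think> at the
    start of the output; returns an empty thinking string otherwise.
    """
    m = re.match(r"\s*<think>(.*?)</think>\s*(.*)", response, re.DOTALL)
    if m:
        return m.group(1).strip(), m.group(2).strip()
    return "", response.strip()

# Example: a thinking-mode reply and a non-thinking reply
thinking, answer = split_thinking("<think>Let me check.</think>Yes, it is.")
```

In non-thinking mode (or when the reasoning block is absent) the helper simply returns the response unchanged as the answer.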
Qwen3 supports multiple inference frameworks, including Transformers, SGLang, vLLM, TensorRT-LLM, llama.cpp, Ollama, and more. It can be deployed on various platforms, including cloud, local CPU/GPU, and mobile devices.
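For example, two common ways to run a Qwen3 model locally; a sketch only, assuming the `qwen3` tag in the Ollama library and the `Qwen/Qwen3-8B` Hugging Face repo, and that your vLLM install provides the `vllm serve` entry point:

```shell
# Chat locally via Ollama (pulls the model on first run)
ollama run qwen3

# Or expose an OpenAI-compatible API via vLLM
vllm serve Qwen/Qwen3-8B
```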
For tool use, Qwen-Agent provides a wrapper around these inference APIs to support tool use and function calling, including MCP support. Finetuning is supported by frameworks such as Axolotl, Unsloth, Swift, and Llama-Factory.

