Qwen3: Think Deeper, Act Faster
Alibaba’s Qwen crew has unleashed Qwen 3, an Apache-2.0 family that runs from a svelte 0.6 B dense model up to a 235 B-parameter MoE colossus (only 22 B active per token) that humbles much larger rivals. A new “hybrid-thinking” switch lets it grind through chain-of-thought when problems get knottier, or snap out terse replies when latency budgets are tighter than the stock at a Monty Python cheese shop. Fed a gluttonous 36 T tokens spanning 119 languages, sporting 128 K context, sharper coding and agent skills (MCP plus Qwen-Agent), and plug-and-play support for vLLM, Ollama, SGLang and friends, the dense models outclass Qwen 2.5 at half the size while the MoE variants trim inference bills by roughly 90 %. Translation: deeper reasoning, faster answers, and your finance chief won’t need to resort to the comfy chair.
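
For the curious, here is a minimal sketch of flipping that hybrid-thinking switch through Hugging Face Transformers. The checkpoint name and the `enable_thinking` flag follow Qwen's published usage notes, but treat this as an illustration and check the model card for your exact variant.

```python
# Sketch: toggling Qwen3's hybrid-thinking mode with Hugging Face Transformers.
# Assumes the Qwen/Qwen3-0.6B checkpoint and the enable_thinking chat-template
# flag described in Qwen's docs; verify both against the model card you use.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-0.6B"  # smallest dense model; swap in a bigger sibling as needed
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "How many primes are there below 100?"}]

# Thinking on: the model emits a <think>...</think> trace before the final answer.
# Set enable_thinking=False to skip the trace and get a fast, terse reply instead.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024)

# Print only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

The same toggle is exposed by the serving stacks the release plugs into (vLLM, SGLang, Ollama), so the switch between deep reasoning and quick answers is a per-request decision rather than a separate model.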