Alibaba Says Qwen Reasoning Model on Par with DeepSeek
March 10, 2025
Alibaba is making AI news again, releasing another Qwen reasoning model, QwQ-32B, which was trained and scaled using reinforcement learning (RL). The Qwen team says it “has the potential to enhance model performance beyond conventional pretraining and post-training methods.” QwQ-32B, a 32 billion parameter model, “achieves performance comparable to DeepSeek-R1, which boasts 671 billion parameters (with 37 billion activated),” Alibaba claims. Parameters are the adjustable weights and biases in a model’s neural network; “activated” parameters are the subset actually used for a given inference task, such as generating a response. DeepSeek-R1 is a mixture-of-experts model, which is why only a fraction of its total weights are active for any one token.
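The distinction matters for compute cost. A quick back-of-the-envelope comparison, using only the parameter counts cited above, shows why the claimed parity is notable: the two models bring a similar number of parameters to bear on each token, even though their total sizes differ by more than an order of magnitude.

```python
# Rough scale comparison, using the figures cited in the article.
# QwQ-32B is a dense model: all of its parameters are used on every forward pass.
# DeepSeek-R1 is a mixture-of-experts model: only a subset of experts
# (the "activated" parameters) runs for each token.

QWQ_TOTAL = 32e9   # QwQ-32B: ~32 billion parameters, all active
R1_TOTAL = 671e9   # DeepSeek-R1: ~671 billion total parameters
R1_ACTIVE = 37e9   # ...of which ~37 billion are activated per token

# Fraction of DeepSeek-R1's weights that participate in a single forward pass
active_fraction = R1_ACTIVE / R1_TOTAL
print(f"DeepSeek-R1 activates {active_fraction:.1%} of its parameters per token")
# → DeepSeek-R1 activates 5.5% of its parameters per token

# Active-parameter count per token: the two models are in the same ballpark
ratio = QWQ_TOTAL / R1_ACTIVE
print(f"QwQ-32B uses {ratio:.2f}x the active parameters of DeepSeek-R1")
# → QwQ-32B uses 0.86x the active parameters of DeepSeek-R1
```

In other words, QwQ-32B must store far fewer weights overall, while its per-token compute is comparable to DeepSeek-R1's — which is what makes the "David versus Goliath" framing about memory footprint, not raw speed.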
In a GitHub post, the Qwen team calls QwQ-32B’s parity with Chinese rival DeepSeek-R1 “remarkable,” attributing the result to “the effectiveness of RL when applied to robust foundation models pretrained on extensive world knowledge.”
The team also touts the fact that it has “integrated agent-related capabilities into the reasoning model, enabling it to think critically while utilizing tools and adapting its reasoning based on environmental feedback” — advancements that not only demonstrate “the transformative potential of RL,” but also lay the groundwork for further innovations “in the pursuit of artificial general intelligence.”
QwQ, short for Qwen with Questions, was introduced by Alibaba in late 2024 as an open-source reasoning model that aimed to compete with OpenAI o1, introduced in September 2024 with claims of PhD-level performance.
“Since QwQ’s initial release, the AI landscape has evolved rapidly,” writes VentureBeat, noting that “the limitations of traditional LLMs have become more apparent, with scaling laws yielding diminishing returns in performance improvements.”
That turn of events has triggered interest in large reasoning models (LRMs), described by VentureBeat as “a new category of AI systems that use inference-time reasoning and self-reflection to enhance accuracy.” Included among them: OpenAI’s o3 series and DeepSeek-R1, which VentureBeat calls “massively successful.”
Decrypt frames the compact QwQ-32B as “a David versus Goliath achievement,” a model that “manages to match the performance of much larger competitors despite being a fraction of their size.” In doing so, Alibaba’s new model “has caught the attention of AI researchers and developers globally,” Decrypt adds.
QwQ-32B is available “on Hugging Face and on ModelScope under an Apache 2.0 license,” reports VentureBeat, noting that it is “available for commercial and research uses, so enterprises can employ it immediately to power their products and applications (even ones they charge customers to use).”