Microsoft Says Phi-2 Can Outperform Large Language Models

Microsoft is releasing Phi-2, a text-to-text small language model (SLM) that outperforms some LLMs, yet is light enough to run on a mobile device or laptop, according to Microsoft CEO Satya Nadella. The 2.7 billion-parameter SLM beat Meta Platforms’ Llama 2 and Mistral 7B from France (each with 7 billion parameters), says Microsoft, which emphasizes that the model’s complex reasoning and language comprehension are exceptional for a model with fewer than 13 billion parameters. For now, Microsoft is making it available “for research purposes only” under a custom license.

On complex benchmarks, Phi-2 can match or best models up to 25x larger, according to a Microsoft Research blog post, which positions Phi-2 as “an ideal playground” for those exploring mechanistic interpretability, safety improvements, or fine-tuning.

Microsoft researchers also claim that, despite having half a billion fewer parameters, Phi-2 “matches or outperforms the recently-announced Google Gemini Nano 2,” the smallest of the LLMs based on multimodal Gemini technology developed by Google DeepMind.

VentureBeat caught Microsoft “taking a little dig at Google’s now much-criticized, staged demo video for Gemini,” by getting Phi-2 to correctly solve the same “fairly complex physics problems” using the same prompts as Gemini Ultra, the series’ top of the line.

This latest SLM from Microsoft was preceded by the original Phi-1, a 1.3 billion-parameter model that delivered what Microsoft says was “state-of-the-art” Python coding when compared with existing SLMs (using the HumanEval and MBPP benchmarks). The improved Phi-1.5, also with 1.3 billion parameters, offered superior “common sense reasoning and language understanding,” with “performance comparable to models 5x larger,” according to the company.

Microsoft attributes Phi-2’s performance to “being trained on the carefully curated and textbook-quality data that was geared toward teaching reasoning, knowledge and common sense, meaning it can learn more from less information,” SiliconANGLE reports, adding that “Microsoft researchers also implemented techniques that allow the embedding of knowledge from smaller models.”

Phi-2 is being showcased at Microsoft’s Azure AI Studio, a hub for foundation model discovery that is currently in public preview. That means Phi-2 is “available now for researchers and developers looking to integrate it into third-party applications,” writes SiliconANGLE.
