Inflection Shares Test Results for Its First AI Language Model

AI startup Inflection has unveiled a new foundation LLM (large language model) to power its Pi chatbot. Inflection-1 is roughly comparable to OpenAI’s GPT-3.5 in size and functionality, which puts it on a par with ChatGPT in terms of model training. Inflection claims its LLM outperforms that competing system on some benchmarks, as well as Meta Platforms’ LLaMA, DeepMind’s Chinchilla and Google’s PaLM-540B. Pi is short for Personal Intelligence, and Inflection built its LLM with the goal of creating an emotive AI whose conversation provides a reasonable facsimile of empathy and human-like sensibilities.

According to the company’s published results (detailed here), Inflection-1 “performs well on various measures, like middle- and high school-level exam tasks (think biology 101) and ‘common sense’ benchmarks (things like ‘if Jack throws the ball on the roof, and Jill throws it back down, where is the ball?’),” writes TechCrunch.

On the commonly used Massive Multitask Language Understanding (MMLU) benchmark, which tests academic knowledge through exams in 57 categories ranging from high school and college to professional-level difficulty, Inflection says its new model not only outperformed GPT-3.5, LLaMA and PaLM-540B on five tasks (achieving 90 percent accuracy) but also posted an 85 percent score on 15 of the tasks, beating the “average” human (34.5 percent), though it fell short of a human expert’s 89.8 percent.
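For readers unfamiliar with how such figures are tallied: MMLU questions are four-option multiple choice, so a per-task score is simply the share of questions answered correctly, and claims like "90 percent on five tasks" come from counting the tasks that clear a threshold. The sketch below illustrates that arithmetic only. The function names and the tiny data set are made up for illustration and are not Inflection's evaluation code or results.

```python
from collections import defaultdict


def per_task_accuracy(records):
    """Return {task: fraction correct} from (task, predicted, answer) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for task, predicted, answer in records:
        total[task] += 1
        if predicted == answer:
            correct[task] += 1
    return {task: correct[task] / total[task] for task in total}


def tasks_at_or_above(accuracies, threshold):
    """List the tasks whose accuracy meets or exceeds the threshold."""
    return sorted(task for task, acc in accuracies.items() if acc >= threshold)


# Hypothetical toy answers (NOT real benchmark output): two questions per task.
records = [
    ("high_school_biology", "A", "A"),
    ("high_school_biology", "C", "C"),
    ("college_physics", "B", "D"),
    ("college_physics", "D", "D"),
    ("professional_law", "A", "B"),
    ("professional_law", "C", "B"),
]

accuracies = per_task_accuracy(records)
strong_tasks = tasks_at_or_above(accuracies, 0.85)
# Unweighted mean over tasks, one common way to report a single overall score.
macro_average = sum(accuracies.values()) / len(accuracies)

print(accuracies)
print("Tasks at or above 85%:", strong_tasks)
print("Macro average:", macro_average)
```

Because each question has four options, random guessing would land near 25 percent, which is useful context when comparing a model's score with the human baselines quoted above.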

TechCrunch identifies a soft spot in coding, “where GPT-3.5 beats it handily and, for comparison, GPT-4 smokes the competition,” no surprise given “OpenAI’s biggest model is well known to have been a huge leap in quality.”

On a practical level, Inflection is “working hard to minimize hallucinations,” according to an Inflection-1 backgrounder, explaining Pi “would rather say it doesn’t know than get something wrong.” TechCrunch writes that Inflection plans to publish results “for a larger model comparable to GPT-4 and PaLM-2(L)” once it’s ready for primetime.

Overall, the company says it’s proud of what it’s accomplished since launching a mere year ago. “The startup built all of the AI tools in-house from start to finish, a goal for Inflection’s founders Mustafa Suleyman and Reid Hoffman, who had some experience with the need for integrating technology into other services as co-founders of DeepMind and LinkedIn, respectively,” Voicebot reports.

Pi.ai debuted in May and is available for iOS (and coming to Android) or online. TechCrunch cautions that “until Inflection opens up its model to widespread use and independent evaluation, all its vaunted benchmarks must be taken with a grain of salt.” For now, Inflection-1 is exclusively powering Pi, and will soon be available via Inflection’s waitlisted conversational API.
