OpenAI: sCM Generates Media 50x Faster Than Other Models

By Paula Parisi
October 28, 2024

OpenAI is taking a new approach to generating media that it says is 50 times faster than the models commonly used today. Called sCM, the approach is a “consistency model,” a variation on the diffusion method used by many leading systems. OpenAI claims its new model is well suited to training on large-scale datasets and to generating video, audio and images of “comparable sample quality to leading diffusion models.” Diffusion models often require hundreds of sampling steps, which creates challenges for real-time applications. OpenAI aims to change this with a faster system that requires less computing power.

That means the sCM system is capable of “generating images in nearly a 10th of a second compared to more than 5 seconds for regular diffusion,” accelerating the process “without compromising on quality,” reports VentureBeat.

“Various distillation techniques have been developed to accelerate sampling, but they often come with limitations, such as high computational costs, complex training, and reduced sample quality,” OpenAI explains in an introductory post by researchers Cheng Lu and Yang Song, who also authored the research paper.

Extending the company’s previous research on consistency models, OpenAI “simplified the formulation and further stabilized the training process of continuous-time consistency models.” The result, sCM, “scales the training of continuous-time consistency models to an unprecedented 1.5 billion parameters on ImageNet at 512×512 resolution,” OpenAI writes.

Its largest sCM model, with 1.5 billion parameters, “generates a single sample in just 0.11 seconds on a single A100 GPU without any inference optimization. Additional acceleration is easily achievable through customized system optimization, opening up possibilities for real-time generation in various domains such as image, audio and video.”
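
As a rough check on how the quoted timings relate to the 50x headline, the arithmetic can be sketched in a few lines of Python. The variable names are illustrative, and the 5-second figure is the "more than 5 seconds" quoted above, so the result is best read as a floor rather than an exact measurement.

```python
# Timings as quoted in the article; the diffusion figure is a lower bound.
diffusion_seconds = 5.0   # "more than 5 seconds for regular diffusion"
scm_seconds = 0.11        # single sample on a single A100 GPU

speedup = diffusion_seconds / scm_seconds
print(f"{speedup:.1f}x")  # prints: 45.5x
```

A ratio of at least about 45x is in the same range as the roughly 50x speedup OpenAI cites.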

Tech Xplore says those real-time applications are expected soon.

Diffusion models, popular for generative media uses, typically rely on “three major components: forward and reverse processes and a sampling procedure,” Tech Xplore notes, adding that “most such models execute hundreds of steps to generate an end product, which is why most of them take a few moments to carry out their tasks.”

Because sCM needs only two sampling steps, it is not only much faster but also requires “a lot less computational power than other models, an ongoing issue with AI applications in general as their use skyrockets,” according to Tech Xplore.
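
The gap between hundreds of steps and two can be illustrated with a toy sketch in Python. This is not OpenAI's sCM code. It assumes a one-dimensional Gaussian "dataset" for which both the ideal denoiser and the consistency function have closed forms, so plain functions stand in for trained networks. Every name and constant here (`MU`, `S`, `T_MAX`, `t_mid`, the function names) is hypothetical. The sketch compares only how many model evaluations each sampler makes, not sample quality.

```python
import math
import random

MU, S = 2.0, 0.5   # toy 1-D data distribution N(MU, S^2) (assumed for illustration)
T_MAX = 80.0       # largest noise level (assumed)

def denoiser(x, t):
    """Ideal denoiser E[x0 | x_t] for Gaussian data; stands in for a trained network."""
    return MU + (S**2 / (S**2 + t**2)) * (x - MU)

def consistency_fn(x, t):
    """Maps a noisy point at level t to the t=0 end of its probability-flow ODE path."""
    return MU + (x - MU) * S / math.sqrt(S**2 + t**2)

def diffusion_sample(rng, n_steps):
    """Euler integration of the probability-flow ODE: one denoiser call per step."""
    ts = [T_MAX * (1 - i / n_steps) for i in range(n_steps + 1)]  # T_MAX down to 0
    x = T_MAX * rng.gauss(0, 1)
    calls = 0
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        slope = (x - denoiser(x, t_cur)) / t_cur  # dx/dt evaluated at the current level
        calls += 1
        x += (t_next - t_cur) * slope
    return x, calls

def two_step_sample(rng, t_mid=0.8):
    """Two-step consistency sampling: map noise to data, re-noise, map back again."""
    x = consistency_fn(T_MAX * rng.gauss(0, 1), T_MAX)      # step 1
    x = consistency_fn(x + t_mid * rng.gauss(0, 1), t_mid)  # step 2
    return x, 2

rng = random.Random(0)
_, diffusion_calls = diffusion_sample(rng, n_steps=200)
_, consistency_calls = two_step_sample(rng)
print(diffusion_calls, consistency_calls)  # prints: 200 2
```

In a real system each call is a forward pass through a large neural network, so cutting the number of calls from hundreds to two is where most of the time and compute savings would come from.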
