Meta’s 3D Gen Bridges Gap from AI to Production Workflow

Meta Platforms has introduced an AI model it says can generate 3D assets from text prompts in under one minute. The new model, called 3D Gen, is billed as a “state-of-the-art, fast pipeline” for turning text input into high-resolution 3D content. The system can also add textures to AI output or to existing models through text prompts, and “supports physically-based rendering (PBR), necessary for 3D asset relighting in real-world applications,” Meta explains, adding that in internal tests 3D Gen outperforms industry baselines on “prompt fidelity and visual quality” as well as on speed.

Meta says 3D Gen integrates “key technical components, Meta 3D AssetGen and Meta 3D TextureGen,” developed by the company for text-to-3D and text-to-texture generation, respectively.

“By combining their strengths, 3D Gen represents 3D objects simultaneously in three ways: in view space, in volumetric space, and in UV (or texture) space,” Meta explains in a technical analysis that concludes “the integration of these two techniques achieves a win rate of 68 percent with respect to the single-stage model.”
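
To make that three-representation idea concrete, here is a minimal illustrative sketch of how a single generated object might carry all three at once. The class and field names below are hypothetical, invented for this article; Meta has not published code or an API matching them:

```python
from dataclasses import dataclass
import numpy as np

# Illustrative sketch only: hypothetical names, not Meta's published code.
# It shows one plausible way the three representations described above
# could coexist for a single generated object.

@dataclass
class Generated3DObject:
    # View space: 2D renders of the object from fixed camera angles,
    # as produced by an image-generation stage.
    rendered_views: list[np.ndarray]        # each (H, W, 3) RGB image

    # Volumetric space: a density field sampled on a 3D grid, from
    # which a mesh can be extracted.
    density_grid: np.ndarray                # (D, D, D) occupancy values

    # UV (texture) space: 2D maps addressed by the mesh's UV coordinates.
    albedo_map: np.ndarray                  # (T, T, 3) base color
    material_maps: dict[str, np.ndarray]    # e.g. roughness, metallic
```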

PetaPixel notes that Meta claims its “two-stage approach” results in “higher-quality 3D generation for immersive content creation.”

“Meta 3D Gen is a new combined AI system that can generate high-quality 3D assets, with both high-resolution textures and material maps end-to-end,” the company posted on X.

VentureBeat identifies the PBR materials support as a “significant feature” in bridging the gap between “AI and professional 3D workflows” by applying realistic relighting to generative 3D objects. “This capability is crucial for integrating AI-generated assets into real-world applications and existing 3D pipelines,” VentureBeat writes.
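
Why PBR material maps matter for relighting can be shown with a few lines of standard shading arithmetic. The diffuse-only sketch below is generic physically-based rendering math, not code from 3D Gen: because the asset ships with material maps rather than baked-in lighting, the same surface point can be re-shaded under any new light:

```python
import numpy as np

def relight_texel(albedo, normal, light_dir, light_color):
    """Lambertian diffuse term for one texel. Since the albedo map
    stores base color with no lighting baked in, the renderer can
    recompute appearance for any light direction and color."""
    n = normal / np.linalg.norm(normal)
    l = light_dir / np.linalg.norm(light_dir)
    n_dot_l = max(np.dot(n, l), 0.0)   # clamp light from behind the surface
    return albedo * light_color * n_dot_l

# Example: the same surface point under two different lights.
albedo = np.array([0.8, 0.2, 0.2])     # reddish base color
normal = np.array([0.0, 0.0, 1.0])
print(relight_texel(albedo, normal, np.array([0.0, 0.0, 1.0]), np.ones(3)))
print(relight_texel(albedo, normal, np.array([1.0, 0.0, 1.0]), np.ones(3)))
```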

“This system can generate 3D assets with high-resolution textures & material maps end-to-end with results that are superior in quality to previous state-of-the-art solutions — at 3-10x the speed of previous work,” Meta AI posted on Threads with some examples.

Worth noting, according to PetaPixel, is “that by separating mesh models and texture maps, 3D Gen promises significant control over the final output and allows for the iterative refinement common to text-to-image generators.” Users can adjust the texture-style prompt without having to change the underlying model.

As for speed, “the team estimates an average inference time of just 30 seconds in creating the initial 3D model using Meta’s 3D AssetGen model,” Digital Trends reports, adding that “users can then go back and either refine the existing model texture or replace it with something new, both via text prompts, using Meta 3D TextureGen, a process the company figures should take no more than an additional 20 seconds of inference time.”
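
Taken together, the two stages amount to a generate-then-refine loop. The sketch below is illustrative only: the function names are invented (Meta has not published a public API for 3D Gen), and the timings in the comments are the article’s estimates, not measurements:

```python
# Hypothetical sketch of the two-stage text-to-3D flow described above.

def generate_asset(prompt: str) -> dict:
    """Stage 1 (Meta 3D AssetGen, ~30 s of inference):
    text prompt -> 3D mesh plus initial texture/material maps."""
    return {"mesh": f"<mesh for {prompt!r}>", "texture": "<initial texture>"}

def retexture_asset(asset: dict, texture_prompt: str) -> dict:
    """Stage 2 (Meta 3D TextureGen, ~20 s of inference):
    regenerate only the texture from a new prompt; the mesh is left
    untouched, which is what enables iterative refinement."""
    return {**asset, "texture": f"<texture for {texture_prompt!r}>"}

asset = generate_asset("a weathered bronze dragon statue")
# Change the look without regenerating the underlying model:
asset = retexture_asset(asset, "glossy jade with gold inlay")
# Total: roughly 30 s + 20 s = ~50 s, consistent with "under one minute."
```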
