Meta, Oxford Advance 3D Object Generation with VFusion3D
August 15, 2024
VFusion3D is the latest AI model from Meta Platforms, developed in conjunction with the University of Oxford. The powerful model, which uses single-perspective images or text prompts to generate high-quality 3D objects, is being hailed as a breakthrough in scalable 3D AI that could transform sectors including VR, gaming and digital design. The platform tackles the challenge of scarce 3D training data in a world teeming with 2D images and text descriptions. The VFusion3D approach leverages what the developers call “a novel method for building scalable 3D generative models utilizing pre-trained video diffusion models.”
The researchers use “pre-trained video AI models to generate synthetic 3D data, allowing them to train a more powerful 3D generation system,” VentureBeat explains.
They do this by fine-tuning “an existing video AI model to produce multi-view video sequences, essentially teaching it to imagine objects from multiple angles,” then using the synthetic data to train VFusion3D, reports VentureBeat, which calls the results “really impressive.”
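To make that two-stage recipe concrete, here is a minimal Python sketch of the pipeline as VentureBeat describes it. Every class and method name in it (training_step, generate_views, and so on) is a hypothetical placeholder rather than Meta’s actual API; the released code on GitHub is the authoritative reference.

```python
"""Minimal sketch of the two-stage VFusion3D recipe described above.

All classes and methods here are hypothetical placeholders, not Meta's
actual training code.
"""

from typing import Any, Iterable, List, Tuple

Image = Any  # stand-in type for an RGB image array (H x W x 3)


def finetune_multiview_generator(video_model, renders: Iterable[Tuple[Image, List[Image]]]):
    """Stage 1: fine-tune a pre-trained video diffusion model to 'imagine'
    an object from multiple angles, given a single input view.

    `renders` pairs an input image with ground-truth multi-view frames
    rendered from whatever real 3D data is available (hypothetical format).
    """
    for input_view, target_frames in renders:
        # One conditioned diffusion fine-tuning step (hypothetical method).
        video_model.training_step(condition=input_view, target=target_frames)
    return video_model


def build_synthetic_corpus(video_model, images_2d: Iterable[Image]):
    """Stage 2a: mass-produce synthetic multi-view sequences from ordinary
    2D images, sidestepping the scarcity of real 3D training data."""
    return [(img, video_model.generate_views(img)) for img in images_2d]


def train_vfusion3d(reconstructor, corpus):
    """Stage 2b: train the feed-forward 3D generator on the synthetic views,
    so that at inference it maps one image directly to a 3D asset."""
    for input_view, synthetic_views in corpus:
        reconstructor.training_step(input=input_view, target=synthetic_views)
    return reconstructor
```

The key design point, per the paper’s framing, is that the expensive learning about geometry happens once, inside the video diffusion model; the synthetic multi-view corpus it produces can then be scaled far beyond what real 3D datasets allow.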
The model is capable of generating a 3D asset from a single image frame in mere seconds, according to the technical paper, which notes a test audience preferred VFusion3D’s 3D constructions to those achieved by what are considered current state-of-the-art systems “over 90 percent of the time.”
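On the inference side, a hedged sketch of what single-image generation might look like in practice follows. The reconstruct method and export call are illustrative stand-ins for whatever interface Meta’s released code actually exposes; only the image loading and timing logic use real libraries as-is.

```python
import time

from PIL import Image  # pillow; used only to load the input photo


def generate_asset(model, image_path: str, out_path: str) -> float:
    """Turn one photo into a 3D asset and report wall-clock seconds."""
    photo = Image.open(image_path).convert("RGB")

    start = time.perf_counter()
    mesh = model.reconstruct(photo)  # hypothetical: single image -> 3D mesh
    elapsed = time.perf_counter() - start

    mesh.export(out_path)            # hypothetical: write e.g. an .obj file
    return elapsed


# Hypothetical usage, assuming some loader for the released weights:
#   model = load_vfusion3d("path/to/weights")  # illustrative name
#   secs = generate_asset(model, "chair.jpg", "chair.obj")
#   print(f"3D asset generated in {secs:.1f}s")
```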
VentureBeat assesses the platform’s scalability as “exciting,” with the potential to “eventually accelerate innovation across industries relying on 3D content.”
Potential use cases include rapid prototyping by game developers, architects and product designers, as well as more immersive VR/AR experiences powered by near real-time generation of 3D assets.
“The VFusion3D foundation model has been trained with nearly 3 million ‘synthetic multi-view data,’” writes TechSpot, which says VFusion3D “can improve the quality of generated 3D assets when a larger dataset is used for training,” with “stronger” video diffusion models and a rapidly growing asset library helping the model evolve quickly.
The goal, notes TechSpot, “is to provide companies working in the entertainment business with a much easier way to create 3D graphics,” adding that “we hope there will be no underpaid, uncredited human workers hiding behind the generative AI uncanny curtains this time.”
Hugging Face is hosting a demo, and Meta has made the code publicly available on GitHub under a non-commercial license, meaning it is free to use for academic and research purposes but not in commercial products.