Apple Advances Computer Vision with Its Depth Pro AI Model

By Paula Parisi
October 8, 2024

Apple has released a new AI model called Depth Pro that can create a 3D depth map from a single 2D image in under a second. The system is being hailed as a breakthrough in how machines perceive depth, with potential impact on industries from augmented reality to self-driving vehicles. “The predictions are metric, with absolute scale” without relying on the camera metadata typically required for such mapping, according to Apple. On a consumer-grade GPU, the model can produce a 2.25-megapixel depth map from a single image in only 0.3 seconds.

In fact, “the researchers behind the project claim that the depth maps generated by AI are better than the ones generated with the help of multiple cameras,” writes Gadgets 360.

“This could have far-reaching applications across sectors where real-time spatial awareness is key,” reports VentureBeat, adding that “the model’s creators, led by Aleksei Bochkovskii and Vladlen Koltun, describe Depth Pro as one of the fastest and most accurate systems of its kind.”

In a technical paper titled “Depth Pro: Sharp Monocular Metric Depth in Less Than a Second,” Apple researchers write that zero-shot monocular depth estimation “underpins a growing variety of applications, such as advanced image editing, view synthesis, and conditional image generation,” the kind of capabilities AR applications require.

“The model can provide real-world measurements, which is essential for applications like augmented reality, where virtual objects need to be placed in precise locations within physical spaces,” VentureBeat explains.
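To illustrate why metric depth matters for placing virtual objects, here is a minimal sketch of how a depth value in meters can be turned into a 3D position using the standard pinhole camera model. It does not use Depth Pro's own code. The function names, the image size, the focal length and the depth values are all hypothetical, and the principal point is assumed to sit at the image center.

```python
import math

def backproject(u, v, depth_m, f_px, width, height):
    """Convert pixel (u, v) with metric depth into a 3D point (X, Y, Z) in meters.

    Uses the pinhole camera model. Assumption: the principal point is the
    image center, which is a common simplification, not a Depth Pro detail.
    """
    cx, cy = width / 2.0, height / 2.0
    x = (u - cx) * depth_m / f_px
    y = (v - cy) * depth_m / f_px
    return (x, y, depth_m)

# Hypothetical inputs: a 1000x1000 image, a focal length of 1000 pixels,
# and two pixels on a flat wall that the depth map places 2.0 m away.
WIDTH, HEIGHT, F_PX = 1000, 1000, 1000.0
left_edge = backproject(300, 500, 2.0, F_PX, WIDTH, HEIGHT)
right_edge = backproject(700, 500, 2.0, F_PX, WIDTH, HEIGHT)

# Because the depth is metric, the distance between the points is in meters.
span_m = math.dist(left_edge, right_edge)
print(f"Distance between the two points: {span_m:.2f} m")
```

With a relative (non-metric) depth map, the same calculation would only give distances up to an unknown scale factor, which is not enough to size a virtual object against a real one.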

“Depth Pro produces metric depth maps with absolute scale on arbitrary images ‘in the wild’ without requiring metadata such as camera intrinsics,” according to the research paper. A demo is available on Hugging Face for those who want to try the model themselves, and the code, which Apple says is open source, is available on GitHub.

PetaPixel points out that “a depth map model can also help with AI image generation, as a deep understanding of depth maps can help a synthesis model produce more realistic results.”

Apple has said it will offer AI image generation as part of Apple Intelligence. Due to roll out by the end of the month, iOS 18.1 will bring Apple Intelligence to the iPhone, iPad and Mac, though the initial iteration is expected to offer only basic AI tools like a writing assistant, improvements to Siri and photo editing tools, with generative images and video coming later.
