Apple Unveils New Advances in Artificial Intelligence Research

Apple recently announced advances in artificial intelligence research that could introduce more immersive visual experiences and enable sophisticated AI systems to run on the company’s popular mobile devices. Two new research papers highlight techniques for creating 3D avatars from video content and for efficiently deploying large language models on devices with limited memory. The ability to create avatars and 3D scenes in real time from an iPhone camera could open a range of new possibilities for consumer electronics (CE) devices in areas such as synthetic media, telepresence, social interaction and virtual try-on.

“In the first research paper, Apple scientists propose HUGS (Human Gaussian Splats) to generate animated 3D avatars from short monocular videos (i.e., videos taken from a single camera),” reports VentureBeat.

“Our method takes only a monocular video with a small number of (50-100) frames, and it automatically learns to disentangle the static scene and a fully animatable human avatar within 30 minutes,” explains Muhammed Kocabas, lead author of the paper, a collaboration with the Max Planck Institute for Intelligent Systems. “To capture details that are not modeled by SMPL (e.g., cloth, hairs), we allow the 3D Gaussians to deviate from the human body model.”

SMPL (Skinned Multi-Person Linear model) is a statistical body shape model that the researchers use to initialize the human avatars. According to the researchers, HUGS is up to 100 times faster than previous avatar generation methods in both training and rendering.
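The paper itself is the authoritative source; purely as illustration, here is a minimal Python sketch of the idea described above, assuming one 3D Gaussian anchored per SMPL vertex with a learnable offset that lets it drift off the body surface to capture details like cloth and hair. The class and parameter names are invented for clarity and are not from Apple’s code.

```python
# Hypothetical sketch: 3D Gaussians anchored to an SMPL body mesh, with
# learnable offsets that allow them to deviate from the body surface.
# Illustrative only -- not Apple's implementation.
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for SMPL mesh vertices (the real SMPL model has 6,890 vertices
# driven by pose and shape parameters).
smpl_vertices = rng.standard_normal((6890, 3)).astype(np.float32)

class AvatarGaussians:
    """One 3D Gaussian per SMPL vertex, free to drift off the body."""

    def __init__(self, vertices: np.ndarray):
        self.base = vertices                      # positions tied to SMPL
        self.offset = np.zeros_like(vertices)     # learnable deviation
        self.log_scale = np.zeros_like(vertices)  # per-axis Gaussian extent
        self.color = np.full((len(vertices), 3), 0.5, dtype=np.float32)

    def positions(self) -> np.ndarray:
        # Final splat centers = body surface + learned deviation, which is
        # how details such as clothing can leave the SMPL surface.
        return self.base + self.offset

avatar = AvatarGaussians(smpl_vertices)
print(avatar.positions().shape)  # (6890, 3)
```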

“In the second paper, Apple researchers tackled a key challenge in deploying large language models (LLMs) on devices with limited memory,” VentureBeat adds. Language models such as GPT-4 “contain hundreds of billions of parameters, making inference expensive on consumer hardware. The proposed system minimizes data transfer from flash storage into scarce DRAM during inference.”
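A quick back-of-envelope calculation shows why this matters. The figures below are assumptions chosen for illustration (a 7-billion-parameter model stored in 16-bit precision and 8 GB of device RAM), not numbers from the paper:

```python
# Illustrative arithmetic (not from the paper): the weights of even a
# mid-sized LLM exceed the RAM of a typical phone, so some parameters
# must live in flash and be streamed into DRAM during inference.
params = 7_000_000_000     # assumed 7B-parameter model
bytes_per_param = 2        # fp16 weights
weights_gb = params * bytes_per_param / 1e9
phone_dram_gb = 8          # assumed device RAM

print(f"weights: {weights_gb:.0f} GB vs. DRAM: {phone_dram_gb} GB")
# weights: 14 GB vs. DRAM: 8 GB -> inference must stream from flash
```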

A technique called “windowing” reuses activations from recent inferences to expedite the process. Another, called “row-column bundling,” stores rows and columns together so that larger, contiguous blocks of data can be read from flash at once. The researchers suggest these methods speed up inference by 20 to 25 times on a GPU.
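To make the two techniques concrete, here is a minimal, hypothetical Python sketch, with a NumPy memory map standing in for flash storage. It assumes the feed-forward layer activates only a sparse set of neurons per token: “windowing” keeps in DRAM only the neurons activated within a sliding window of recent tokens, and “row-column bundling” stores each neuron’s up-projection row and down-projection column contiguously so a single sequential read fetches both. All names and sizes are invented for illustration.

```python
# Hedged sketch of the two ideas named above; invented names, numpy memmap
# standing in for flash storage. Not Apple's implementation.
import numpy as np

D_MODEL, D_FF, WINDOW = 64, 256, 4

# "Flash": each bundle holds [up_row (D_MODEL) | down_col (D_MODEL)],
# stored contiguously so one read fetches both (row-column bundling).
flash = np.memmap("bundles.bin", dtype=np.float32, mode="w+",
                  shape=(D_FF, 2 * D_MODEL))
flash[:] = np.random.default_rng(0).standard_normal(flash.shape)

dram_cache: dict[int, np.ndarray] = {}  # neuron id -> bundle held in DRAM
recent_active: list[set[int]] = []      # activation sets per recent token

def load_neurons(active: set[int]) -> None:
    """Windowing: fetch only missing bundles; evict ones out of the window."""
    recent_active.append(active)
    if len(recent_active) > WINDOW:
        recent_active.pop(0)
    keep = set().union(*recent_active)
    for nid in list(dram_cache):
        if nid not in keep:
            del dram_cache[nid]                 # fell out of the window
    for nid in active - dram_cache.keys():
        dram_cache[nid] = np.array(flash[nid])  # one bundled flash read

# Simulate a few tokens, each activating a sparse set of neurons.
for step in range(6):
    active = set(np.random.default_rng(step).choice(D_FF, 16, replace=False))
    load_neurons(active)
    print(f"token {step}: cached {len(dram_cache)} bundles")
```

The design intuition behind bundling is that flash storage delivers far higher throughput on large sequential reads than on many small scattered ones, so combining two related reads into one contiguous chunk roughly halves the number of read operations.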

“The optimizations could soon allow complex AI assistants and chatbots to run smoothly on iPhones, iPads and other mobile devices,” notes VentureBeat.

“Running the kind of large AI model that powers ChatGPT or Google’s Bard on a personal device brings formidable technical challenges, because smartphones lack the huge computing resources and energy available in a data center,” notes Ars Technica. “Solving this problem could mean that AI assistants respond more quickly than they do from the cloud and even work offline.”

Related:
Apple Publishes Research to Bring AI Models to iPhones and Make Videos into 3D Avatars, SiliconANGLE, 12/21/23
Apple Research Reveals Some Dazzling AI Tech Could Be Headed to Your iPhone, ZDNet, 12/21/23
Apple Develops Breakthrough Method for Running LLMs on iPhones, MacRumors, 12/21/23
Apple’s AI Research Signals Ambition to Catch Up with Big Tech Rivals, Financial Times, 12/21/23
Apple Quietly Released an Open Source Multimodal LLM in October, VentureBeat, 12/23/23
