By Paula Parisi, November 8, 2024
Wonder Animation is the latest tool from Wonder Dynamics, the AI startup founded by actor Tye Sheridan and VFX artist Nikola Todorovic in 2017 and acquired by Autodesk in May. Now in beta, Wonder Animation automatically transposes live-action footage into stylized 3D animation. Creators can shoot with any camera system and lenses, on any set or location, edit those shots in Maya, Blender or Unreal, and then use AI to reconstruct the result as 3D animation, with the tool matching camera position and movement to the characters and environment. Continue reading Autodesk’s AI Tool Turns Live-Action Video into 3D Animation
By Paula Parisi, November 6, 2024
New York-based AI firm Runway has added 3D video camera controls to Gen-3 Alpha Turbo, giving users granular control over the scenes they generate, whether those scenes originate from text prompts, uploaded images or self-created video. Users can zoom in and out on a subject or scene, moving around an AI-generated character or form in 3D as if on a real set or actual location. The new feature, available now, lets creators “choose both the direction and intensity of how you move through your scenes for even more intention in every shot,” Runway explains. Continue reading Runway Adds 3D Video Cam Controls to Gen-3 Alpha Turbo
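For developers, Runway also exposes Gen-3 Alpha Turbo through its API. As a rough sketch (not official documentation), assuming the runwayml Python SDK and a RUNWAYML_API_SECRET environment variable, a camera move can at minimum be conveyed through prompt text; the dedicated direction-and-intensity controls described above are part of Runway’s web interface:

    # Hedged sketch: assumes the runwayml Python SDK (pip install runwayml)
    # and a RUNWAYML_API_SECRET environment variable. The image URL below
    # is hypothetical.
    from runwayml import RunwayML

    client = RunwayML()  # reads RUNWAYML_API_SECRET from the environment

    task = client.image_to_video.create(
        model="gen3a_turbo",
        prompt_image="https://example.com/first-frame.jpg",  # hypothetical
        prompt_text="slow dolly-in on the subject, then orbit right",
    )
    print(task.id)  # poll client.tasks.retrieve(task.id) until it succeeds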
By Paula Parisi, October 25, 2024
Runway is launching Act-One, a motion capture system that uses video and voice recordings to map human facial expressions onto characters via the company’s latest model, Gen-3 Alpha. Runway calls it “a significant step forward in using generative models for expressive live action and animated content.” Unlike past facial capture techniques, which typically require complex rigging, Act-One is driven directly by an actor’s performance and requires “no extra equipment,” making it more likely to capture and preserve an authentic, nuanced performance, according to the company. Continue reading Runway’s Act-One Facial Capture Could Be a ‘Game Changer’
By Paula Parisi, October 16, 2024
Adobe has launched a public beta of its Generate Video app, part of the Firefly Video model, which users can try for free on a dedicated website. Login is required, and there is still a waitlist for unfettered access, but the web app supports generating up to five seconds of video from text and image prompts. It can turn 2D pictures into 3D animation and can also produce video with dynamic text. The company has also added an AI feature called “Extend Video” to Premiere Pro that lengthens existing footage by two seconds. The news has the media lauding Adobe for beating OpenAI’s Sora and Google’s Veo to market. Continue reading Adobe Promos AI in Premiere Pro, ‘Generate Video’ and More
By Paula Parisi, October 8, 2024
Apple has released a new AI model called Depth Pro that can create a 3D depth map from a 2D image in under a second. The system is being hailed as a breakthrough that could revolutionize how machines perceive depth, with transformative impact on industries from augmented reality to self-driving vehicles. “The predictions are metric, with absolute scale,” without relying on the camera metadata typically required for such mapping, according to Apple. On a consumer-grade GPU, the model can produce a 2.25-megapixel depth map from a single image in only 0.3 seconds. Continue reading Apple Advances Computer Vision with Its Depth Pro AI Model
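Apple has also open-sourced Depth Pro; below is a minimal inference sketch based on the interface published in the ml-depth-pro GitHub repository (treat the specifics as subject to change):

    # Minimal sketch based on Apple's ml-depth-pro repository.
    import depth_pro

    # Load the pretrained model and its matching preprocessing transform.
    model, transform = depth_pro.create_model_and_transforms()
    model.eval()

    # f_px is a focal length recovered from EXIF metadata when available;
    # the model estimates it otherwise, so camera metadata is not required.
    image, _, f_px = depth_pro.load_rgb("photo.jpg")

    prediction = model.infer(transform(image), f_px=f_px)
    depth_meters = prediction["depth"]            # metric depth map
    focal_pixels = prediction["focallength_px"]   # estimated focal length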
By Paula Parisi, September 20, 2024
A newly redesigned Snapchat experience is built around a three-tab user interface called Simple Snapchat. As part of that effort, the social platform is launching more generative video features, including text-to-video in the app’s Lens Studio AR authoring tool. Easy Lens allows quick generation of Lenses from typed text prompts, making it possible to do things like experiment with Halloween costumes or explore back-to-school looks. The new features, launching in beta for select creators, are designed for all ability levels, Snap says. The company is also updating its GenAI Suite and adding an Animation Library of “hundreds of high-quality movements.” Continue reading Snapchat Is Getting a Redesign and Generative Text-to-Video
By Paula Parisi, September 10, 2024
During the 10th annual Roblox Developers Conference (RDC 2024) in San Jose, the gaming platform announced it is opening to global currencies in addition to its own Robux, which generates billions in virtual transactions each year. Starting later this year, a small test group of developers will be able to charge real money for paid-access games, with the program expected to open “to all eligible creators by mid-2025.” The massively multiplayer online platform, which lets users build online game worlds, also discussed a project to develop its own AI foundation model to power generative 3D creation on the platform. Continue reading Roblox Adds Real Currency, Teases Its Coming Generative AI
By Paula Parisi, August 29, 2024
Canadian generative video startup Viggle AI, which specializes in character motion, has raised $19 million in Series A funding. Viggle was founded in 2022 on the premise of providing a simplified process “to create lifelike animations using simple text-to-video or image-to-video prompts.” The result has been robust adoption among meme creators, with many Viggle-powered viral videos circulating on social media platforms, including one featuring Joaquin Phoenix as the Joker mimicking the movements of rapper Lil Yachty. Viggle’s Discord community has four million members, including “both novice and experienced animators,” according to the company. Continue reading Viggle AI Raises $19 Million on the Power of Memes and More
By Paula Parisi, August 26, 2024
Samsung Electronics, which teased a glasses-free 3D gaming monitor at CES in January, officially announced the scheduled release of two versions at Gamescom last week. Both sizes employ light field display (LFD) technology to create what Samsung calls “lifelike 3D images” from 2D content by using a lenticular lens on the front panel. “Combined with Eye Tracking and View Mapping technology, Odyssey 3D ensures an optimized 3D experience without the need for separate 3D glasses,” according to Samsung. A built-in stereo camera tracks the movement of both eyes while proprietary View Mapping continuously adjusts the image to reinforce depth perception. Continue reading Samsung Set to Release Glasses-Free Odyssey 3D Monitors
By Paula Parisi, August 15, 2024
VFusion3D is the latest AI model unveiled by Meta Platforms, which developed it in conjunction with the University of Oxford. The model, which uses single-perspective images or text prompts to generate high-quality 3D objects, is being hailed as a breakthrough in scalable 3D AI that could transform sectors including VR, gaming and digital design. It tackles the challenge of scarce 3D training data in a world teeming with 2D images and text descriptions. The VFusion3D approach leverages what the developers call “a novel method for building scalable 3D generative models utilizing pre-trained video diffusion models.” Continue reading Meta, Oxford Advance 3D Object Generation with VFusion3D
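To make that approach concrete, here is a purely illustrative Python sketch of the idea; every name in it is hypothetical, and none of it is Meta’s actual code or API:

    # Hypothetical illustration of the VFusion3D idea described above:
    # use a pre-trained video diffusion model to synthesize multi-view
    # "orbit" footage of objects, then train a feed-forward image-to-3D
    # generator on that synthetic data. All names here are invented.
    def build_synthetic_multiview_dataset(video_model, seed_images, n_views=24):
        dataset = []
        for image in seed_images:
            # A camera orbit rendered as video stands in for the scarce
            # ground-truth multi-view 3D data mentioned above.
            views = video_model.generate_orbit(image, n_views=n_views)
            dataset.append((image, views))
        return dataset

    def train_image_to_3d(generator, dataset):
        for image, views in dataset:
            generator.fit_step(input_image=image, target_views=views)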
By Paula Parisi, July 31, 2024
After 50 years of SIGGRAPH, the conference has come full circle, from high-tech for PhDs to AI for everyone. That was Nvidia founder and CEO Jensen Huang’s message in back-to-back keynote sessions, including a Q&A with Meta CEO Mark Zuckerberg. Huang touted Universal Scene Description (OpenUSD), discussing developments aimed at speeding adoption of the universal 3D data interchange framework for use in everything from robotics to the creation of “highly accurate virtual worlds for the next evolution of AI.” As Zuckerberg’s interlocutor, Huang prompted the Facebook founder to share a vision of AI’s personalization of social media. Continue reading Nvidia Debuts New Products to Accelerate Adoption of GenAI
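For readers unfamiliar with OpenUSD, the framework is scriptable from Python; a minimal authoring sketch using the standard pxr bindings (installable as the usd-core package), unrelated to any specific Nvidia product:

    # Minimal OpenUSD authoring example using the pxr Python bindings
    # (pip install usd-core).
    from pxr import Usd, UsdGeom

    stage = Usd.Stage.CreateNew("world.usda")        # new USD layer on disk
    root = UsdGeom.Xform.Define(stage, "/World")     # root transform prim
    ball = UsdGeom.Sphere.Define(stage, "/World/Ball")
    ball.GetRadiusAttr().Set(2.0)                    # author a 2-unit radius
    stage.SetDefaultPrim(root.GetPrim())
    stage.GetRootLayer().Save()                      # writes world.usda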
By Paula Parisi, July 29, 2024
Stability AI has unveiled an experimental new model, Stable Video 4D, which generates photorealistic 3D video. Building on Stable Video Diffusion, released in November, the new model can take moving footage of an object and re-render it from multiple angles, generating up to eight different perspectives. Stable Video 4D can generate five frames across eight views in about 40 seconds using a single inference, according to the company, which says the model has “future applications in game development, video editing, and virtual reality.” Users begin by uploading a single video and specifying desired 3D camera poses. Continue reading Stable Video 4D Adds Time Dimension to Generative Imagery
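The “4D” in the name refers to 3D viewpoints sampled over time; a toy sketch (not Stability’s API, and the resolution is an assumption) of the grid a single inference produces:

    # Toy illustration only -- not Stability's actual API. It mirrors the
    # numbers above: one inference yields 5 frames for each of 8 views.
    import numpy as np

    FRAMES, VIEWS, H, W = 5, 8, 576, 576   # per-view resolution assumed

    def fake_sv4d(input_video, camera_poses):
        # Stand-in for the model call: the result is indexed by
        # (time, view), i.e. 3D viewpoints evolving over time.
        assert len(camera_poses) == VIEWS
        return np.zeros((FRAMES, VIEWS, H, W, 3), dtype=np.uint8)

    grid = fake_sv4d(None, [f"pose_{i}" for i in range(VIEWS)])
    print(grid.shape)  # (5, 8, 576, 576, 3)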
By Paula Parisi, July 11, 2024
Generative video creation and editing platform Captions has raised $60 million in Series C funding. Founded in 2021 by former Microsoft engineer Gaurav Misra and Goldman Sachs alum Dwight Churchill, the company says its technologies — Lipdub, AI Edit and the 3D avatar app AI Creator — have amassed more than 10 million mobile downloads. The C round brings its total raise to $100 million at a stated valuation of $500 million. With the new funding, Captions plans to expand its presence in New York City, which is “emerging as the epicenter for AI research,” according to Misra. Continue reading Captions: Generative Video Startup Raises $60 Million in NYC
By Paula Parisi, July 10, 2024
Meta Platforms has introduced an AI model it says can generate 3D images from text prompts in under one minute. The new model, called 3D Gen, is billed as a “state-of-the-art, fast pipeline” for turning text input into high-resolution 3D images. It can also add textures to AI output or existing images via text prompts, and “supports physically-based rendering (PBR), necessary for 3D asset relighting in real-world applications,” Meta explains, adding that in internal tests 3D Gen outperforms industry baselines on “prompt fidelity and visual quality” as well as speed. Continue reading Meta’s 3D Gen Bridges Gap from AI to Production Workflow
By Paula Parisi, July 3, 2024
Apple has released a public demo of the 4M AI model it developed in collaboration with the Swiss Federal Institute of Technology Lausanne (EPFL). The demo arrives seven months after the model was first open-sourced, giving informed observers the opportunity to interact with it and assess its capabilities. Apple says 4M was built by applying masked modeling to a single unified Transformer encoder-decoder “across a wide range of input/output modalities — including text, images, geometric and semantic modalities, as well as neural network feature maps.” Continue reading Apple Launches Public Demo of Its Multimodal 4M AI Model
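That description maps onto a familiar training recipe; the toy sketch below (PyTorch, not Apple’s code, with a deliberately simplified decoder input) shows masked modeling over tokens from two modalities sharing one encoder-decoder:

    # Toy illustration of masked multimodal modeling -- not Apple's 4M code.
    # Tokens from two modalities share one Transformer encoder-decoder; the
    # objective reconstructs a randomly masked subset of tokens.
    import torch
    import torch.nn as nn

    VOCAB, D, MASK_ID = 1000, 64, 0
    embed = nn.Embedding(VOCAB, D)
    model = nn.Transformer(d_model=D, nhead=4, num_encoder_layers=2,
                           num_decoder_layers=2, batch_first=True)
    head = nn.Linear(D, VOCAB)

    text_tokens = torch.randint(1, VOCAB, (1, 16))    # e.g. caption tokens
    image_tokens = torch.randint(1, VOCAB, (1, 64))   # e.g. VQ image tokens
    tokens = torch.cat([text_tokens, image_tokens], dim=1)

    mask = torch.rand(tokens.shape) < 0.5             # hide half the tokens
    inputs = tokens.masked_fill(mask, MASK_ID)

    out = model(embed(inputs), embed(tokens))         # simplified decoding
    logits = head(out)
    loss = nn.functional.cross_entropy(logits[mask], tokens[mask])
    loss.backward()                                   # loss on masked slots only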