By Paula Parisi, August 6, 2024
A new generative AI startup called Black Forest Labs has hit the scene, debuting with a suite of text-to-image models branded FLUX.1. Based in Germany, Black Forest was founded by some of the researchers involved in developing Stable Diffusion and has raised $31 million in funding from principal investor Andreessen Horowitz and angels including CAA founder and former talent agent Michael Ovitz. The FLUX.1 suite focuses on “image detail, prompt adherence, style diversity and scene complexity,” the company says of its three initial variants: FLUX.1 [pro], FLUX.1 [dev] and FLUX.1 [schnell]. Continue reading Black Forest Labs Announces Suite of Text-to-Image Models
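For readers who want to try the open-weights variant, here is a minimal sketch of what generation looks like via the Hugging Face diffusers integration of FLUX.1 [schnell]; the model ID and step settings follow the published model card, but treat this as illustrative rather than official Black Forest Labs code:

```python
import torch
from diffusers import FluxPipeline

# FLUX.1 [schnell] is the distilled, openly licensed variant, tuned for few steps.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
).to("cuda")

image = pipe(
    "a photorealistic black forest cake on a rustic table",
    num_inference_steps=4,   # schnell is designed for 1-4 steps
    guidance_scale=0.0,      # the schnell distillation does not use CFG
).images[0]
image.save("flux_schnell.png")
```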
By Rob Scott, August 1, 2024
Graphic design company Canva announced it is acquiring fellow Australian startup Leonardo AI with plans to have Leonardo’s 120 employees, including executives, join the Canva AI team. Financial terms of the deal were not disclosed. Sydney-based Leonardo has been gaining attention for its advanced generative AI platform that helps users create images and art based on the open-source Stable Diffusion model developed by Stability AI. The Leonardo team claims its offering is different from other AI art platforms since it provides users with more control. Users can experiment with text prompts and quick sketches as Leonardo.ai creates photorealistic images in real time. Continue reading Canva Aims to Boost Its GenAI Efforts with Leonardo Purchase
By Paula Parisi, July 29, 2024
Stability AI has unveiled an experimental new model, Stable Video 4D, which generates photorealistic 3D video. Building on what it created with Stable Video Diffusion, released in November, this latest model can take moving image data of an object and re-render it from multiple angles, generating up to eight different perspectives. Stable Video 4D can generate five frames across eight views in about 40 seconds using a single inference, according to the company, which says the model has “future applications in game development, video editing, and virtual reality.” Users begin by uploading a single video and specifying desired 3D camera poses. Continue reading Stable Video 4D Adds Time Dimension to Generative Imagery
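Conceptually, the model’s output is a small grid over two axes, time and viewpoint, which is what puts the “4D” in the name. A toy sketch of how that five-frame, eight-view grid might be indexed (array shapes here are invented for illustration and are not Stability’s API):

```python
import numpy as np

VIEWS, FRAMES, H, W = 8, 5, 576, 576  # illustrative sizes

# One generated batch: for each of 8 camera poses, 5 frames of RGB video.
grid = np.zeros((VIEWS, FRAMES, H, W, 3), dtype=np.uint8)

# A "4D" asset supports two kinds of traversal:
front_view_clip = grid[0]         # fix the camera, play through time
frozen_moment_orbit = grid[:, 2]  # fix a moment, orbit the camera around it
```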
By Paula Parisi, July 1, 2024
The world’s first AI-powered movie camera has surfaced. Still in development, it aims to enable filmmakers to turn footage into AI imagery in real time while shooting. Called the CMR-M1, for camera model 1, it is the product of creative tech agency SpecialGuestX and media firm 1stAveMachine, with the goal of providing creatives with a familiar interface for AI image-making. It was inspired by the Cine-Kodak device, the first portable 16mm camera. “We designed a camera that serves as a physical interface to AI models,” said Miguel Espada, co-founder and executive creative technologist at SpecialGuestX, a company that does not believe directors will want to make AI work sitting at a keyboard. Continue reading New Prototype Is the World’s First AI-Powered Movie Camera
By ETCentric Staff, March 28, 2024
Researchers from the Massachusetts Institute of Technology and Adobe have unveiled a new AI acceleration tool that makes generative apps like DALL-E 3 and Stable Diffusion up to 30x faster by reducing the process to a single step. The new approach, called distribution matching distillation, or DMD, maintains or enhances image quality while greatly streamlining the process. Theoretically, the technique “marries the principles of generative adversarial networks (GANs) with those of diffusion models,” consolidating “the hundred steps of iterative refinement required by current diffusion models” into one step, MIT PhD student and project lead Tianwei Yin says. Continue reading New Tech from MIT, Adobe Advances Generative AI Imaging
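A rough sketch of the generator-side objective behind distribution matching distillation, assuming denoisers that predict clean images: the `teacher` approximates the real-data distribution while a separately trained `fake_critic` tracks the one-step student’s output distribution. The paper’s weighting terms and the critic’s own training loop are omitted, so this is a simplified reading of the technique, not the authors’ code:

```python
import torch
import torch.nn.functional as F

def forward_diffuse(x, eps, t, alphas_cumprod):
    """Standard DDPM forward process: x_t = sqrt(a_t)*x + sqrt(1-a_t)*eps."""
    a = alphas_cumprod[t].view(-1, 1, 1, 1)
    return a.sqrt() * x + (1.0 - a).sqrt() * eps

def dmd_generator_loss(generator, teacher, fake_critic, z, alphas_cumprod):
    x = generator(z)  # single-step student sample (vs. ~100 iterative steps)
    t = torch.randint(0, len(alphas_cumprod), (x.shape[0],), device=x.device)
    eps = torch.randn_like(x)
    x_t = forward_diffuse(x, eps, t, alphas_cumprod)
    with torch.no_grad():
        x0_real = teacher(x_t, t)      # teacher's guess at a "real" clean image
        x0_fake = fake_critic(x_t, t)  # critic's guess under the student's distribution
        grad = x0_fake - x0_real       # distribution-matching direction
    # Surrogate loss: stepping x against `grad` nudges the student's output
    # distribution toward the teacher's, GAN-style, without iterative sampling.
    return 0.5 * F.mse_loss(x, (x - grad).detach())
```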
By ETCentric Staff, March 25, 2024
Stability AI has released Stable Video 3D, a generative video model based on the company’s foundation model Stable Video Diffusion. SV3D, as it’s called, comes in two versions. Both can generate and animate multi-view 3D meshes from a single image. The more advanced version also lets users set “specified camera paths” for a “filmed” look to the video generation. “By adapting our Stable Video Diffusion image-to-video diffusion model with the addition of camera path conditioning, Stable Video 3D is able to generate multi-view videos of an object,” the company explains. Continue reading Stable Video 3D Generates Orbital Animation from One Image
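In practice, a “specified camera path” amounts to conditioning each generated frame on a camera pose. A minimal sketch of building an orbital path as azimuth/elevation pairs, which is roughly the shape of the conditioning described; the function and its output format are illustrative, not Stability’s API:

```python
def orbital_camera_path(num_frames=21, elevation_deg=10.0):
    """Evenly spaced azimuths at a fixed elevation: one pose per output frame."""
    return [
        {"azimuth_deg": i * 360.0 / num_frames, "elevation_deg": elevation_deg}
        for i in range(num_frames)
    ]

poses = orbital_camera_path()
# Each pose would be encoded (e.g., as sinusoidal embeddings) and fed to the
# video diffusion model alongside the input image, one pose per frame.
```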
By ETCentric Staff, March 11, 2024
Alibaba is touting a new artificial intelligence system that can animate portraits, making people sing and talk in realistic fashion. Researchers at the Alibaba Group’s Institute for Intelligent Computing developed the generative video framework, calling it EMO, short for Emote Portrait Alive. Input a single reference image along with “vocal audio,” as in talking or singing, and “our method can generate vocal avatar videos with expressive facial expressions and various head poses,” the researchers say, adding that EMO can generate videos of any duration, “depending on the length of input audio.” Continue reading Alibaba’s EMO Can Generate Performance Video from Images
By ETCentric Staff, February 16, 2024
Stability AI, purveyor of the popular Stable Diffusion image generator, has introduced a completely new model called Stable Cascade. Now in preview, Stable Cascade uses a different architecture from Stable Diffusion’s SDXL, one the UK company’s researchers say is more efficient. Cascade builds on a compression architecture called Würstchen (German for “sausage”) that Stability began sharing in research papers early last year. Würstchen is a three-stage process that includes two-step encoding into a highly compressed latent space. It uses fewer parameters, which translates to less training data required, greater speed and reduced costs. Continue reading Stability AI Advances Image Generation with Stable Cascade
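In the Hugging Face diffusers preview, the three stages surface as two pipelines: a prior (Stage C) that generates in the tiny compressed latent space, and a decoder (Stages B and A) that expands those latents to pixels. A sketch, assuming the diffusers integration:

```python
import torch
from diffusers import StableCascadePriorPipeline, StableCascadeDecoderPipeline

prior = StableCascadePriorPipeline.from_pretrained(
    "stabilityai/stable-cascade-prior", torch_dtype=torch.bfloat16
).to("cuda")
decoder = StableCascadeDecoderPipeline.from_pretrained(
    "stabilityai/stable-cascade", torch_dtype=torch.float16
).to("cuda")

prompt = "an astronaut riding a horse, oil painting"
# Stage C: text -> highly compressed image embeddings (the "sausage" latents).
prior_out = prior(prompt=prompt, num_inference_steps=20)
# Stages B and A: embeddings -> full-resolution image.
image = decoder(
    image_embeddings=prior_out.image_embeddings.to(torch.float16),
    prompt=prompt,
    num_inference_steps=10,
).images[0]
image.save("cascade.png")
```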
By Paula Parisi, January 26, 2024
Google has come up with a new approach to high-resolution AI video generation with Lumiere. Most GenAI video models first output individual high-resolution frames at various points in the sequence (called “distant keyframes”), then fill in the missing frames at low resolution to create motion (known as “temporal super-resolution,” or TSR), and finally up-res that connective tissue in non-overlapping windows (“spatial super-resolution,” or SSR). Lumiere instead takes what Google calls a “Space-Time U-Net architecture,” which processes all frames at once, “without a cascade of TSR models, allowing us to learn globally coherent motion.” Continue reading Google Takes New Approach to Create Video with Lumiere AI
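The contrast is easiest to see in code: a cascade handles time in separate passes, while a space-time U-Net downsamples and upsamples the temporal axis inside one network. A toy sketch of the latter idea using 3D convolutions (layer sizes are invented for illustration; this is not Google’s architecture):

```python
import torch
import torch.nn as nn

class TinySpaceTimeUNet(nn.Module):
    """Toy space-time U-Net: strided 3D convs compress time, height and width
    together, so every output frame is computed jointly from the whole clip."""
    def __init__(self, ch=32):
        super().__init__()
        # stride (2, 2, 2) halves the temporal axis along with both spatial axes
        self.down = nn.Conv3d(3, ch, kernel_size=3, stride=(2, 2, 2), padding=1)
        self.mid = nn.Conv3d(ch, ch, kernel_size=3, padding=1)
        self.up = nn.ConvTranspose3d(ch, 3, kernel_size=4, stride=(2, 2, 2), padding=1)

    def forward(self, video):               # video: (batch, 3, frames, H, W)
        h = torch.relu(self.down(video))    # all frames processed at once
        h = torch.relu(self.mid(h))
        return self.up(h)                   # back to full length and resolution

out = TinySpaceTimeUNet()(torch.randn(1, 3, 16, 64, 64))
print(out.shape)  # torch.Size([1, 3, 16, 64, 64])
```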
By Paula Parisi, January 24, 2024
The Asus ROG Phone 8 series — demonstrated at CES 2024 in Las Vegas last week — is generating excellent reviews for its gaming capabilities and additional praise for its functionality as a smartphone. The series starts at $1,100, while the ROG Phone 8 Pro starts at $1,500. Asus calls the ROG Phone 8 series “the biggest redesign in its history,” and says it has evolved from just a gaming phone into a device suitable for streamers and content creators. At the heart of that is Qualcomm’s Snapdragon 8 Gen 3 Mobile Platform, supported by 8,533 Mbps LPDDR5X RAM and UFS 4.0 storage. Continue reading CES: The Asus ROG Phone 8 Series Highlights Mobile Gaming
By Paula Parisi, January 23, 2024
HP has updated its popular flagship laptop, the HP Spectre x360, and the early reviews are quite impressive. HP has added Intel Core Ultra processors with neural processing for AI tasks, as well as a 9MP webcam and Wi-Fi 7 capability. The Spectre x360 14 features a 14-inch screen and Intel Arc integrated graphics, while the Spectre x360 16 screen is two inches larger and includes the option to add an Nvidia GeForce RTX 4050 GPU. Both OLED screens display at 2,880 x 1,800 and 120 Hz, with VESA True Black HDR 400. The 2-in-1 laptops use Intel’s latest H-series Meteor Lake chips, which combine x86 performance and efficiency cores with a dedicated NPU on the same package. Continue reading CES: HP Spectre Laptops Get Intel Core Ultra, 9MP Webcam
By Paula Parisi, December 22, 2023
Google has unveiled a new large language model designed to advance video generation. VideoPoet is capable of text-to-video, image-to-video, video stylization, video inpainting and outpainting, and video-to-audio. “The leading video generation models are almost exclusively diffusion-based,” Google says, citing Imagen Video as an example. Google finds this counterintuitive, since “LLMs are widely recognized as the de facto standard due to their exceptional learning capabilities across various modalities.” VideoPoet eschews the diffusion approach of separately trained components for each task in favor of integrating many video generation capabilities in a single LLM. Continue reading VideoPoet: Google Launches a Multimodal AI Video Generator
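The core idea is that once video, image and audio are quantized into discrete tokens, one autoregressive transformer can serve every task as next-token prediction. A toy sketch of how such a multimodal sequence might be assembled (token names and layout here are invented for illustration, not Google’s scheme):

```python
# Hypothetical token layout for an LLM-based video generator.
BOS, T2V = "<bos>", "<task:text-to-video>"

def build_sequence(text_tokens, video_tokens_so_far):
    """Tasks differ only in their prefix; the model always predicts next token."""
    return [BOS, T2V, *text_tokens, "<video>", *video_tokens_so_far]

seq = build_sequence(["a", "dog", "surfing"], ["v017", "v512"])
# An autoregressive LM trained on such sequences generates the remaining
# video tokens, which a separately trained tokenizer decodes back to pixels.
print(seq)
```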
By Paula Parisi, December 8, 2023
Meta Platforms is moving Imagine with Meta out of its test bed as a generative AI experience in chats and onto the Web as a standalone experience that allows users to create high-resolution images using natural language text prompts. It is one of more than 20 generative AI features Meta is deploying to create new business opportunities globally, leveraging AI across search, ads, business messaging and more. Most will wind up on Facebook, Instagram, Messenger and WhatsApp, though some analysts say Facebook and Instagram have plateaued at 2 to 3 billion users per month, circumscribing ad growth. Continue reading Standalone Image Generator Is Among New AI Tools by Meta
By Paul Bennun, December 4, 2023
Stability AI, developer of Stable Diffusion (one of the leading visual content generators, alongside Midjourney and DALL-E), has introduced SDXL Turbo — a new AI model that demonstrates more of the latent possibilities of the common diffusion generation approach: images that update in real time as the user’s prompt updates. Real-time updating was always a theoretical possibility with diffusion models, but more efficient generation algorithms (SDXL Turbo distills sampling down to a single step) and the steady accretion of GPUs and TPUs in developers’ data centers now make the experience feel genuinely responsive. Continue reading Stability AI Intros Real-Time Text-to-Image Generation Model
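That single-step sampling is what makes per-keystroke regeneration plausible. A sketch using the Hugging Face diffusers integration; the model ID and the one-step, no-guidance settings follow Stability’s published usage notes:

```python
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
).to("cuda")

# One denoising step, no classifier-free guidance: fast enough to rerun
# every time the prompt text changes.
image = pipe(
    "cinematic photo of a lighthouse in a storm",
    num_inference_steps=1,
    guidance_scale=0.0,
).images[0]
image.save("turbo.png")
```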
By Paula Parisi, November 27, 2023
Stability AI has opened a research preview of its first foundation model for generative video, Stable Video Diffusion, offering text-to-video and image-to-video. Based on the company’s Stable Diffusion text-to-image model, the new open-source model generates video by animating existing still frames, including “multi-view synthesis.” While the company plans to enhance and extend the model’s capabilities, it currently comes in two versions: SVD, which transforms stills into 576×1024 videos of 14 frames, and SVD-XT, which generates up to 25 frames — each at between three and 30 frames per second. Continue reading Stability Introduces GenAI Video Model: Stable Video Diffusion
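For those who want to experiment, here is a sketch of the image-to-video path through the Hugging Face diffusers integration of the SVD-XT checkpoint; the settings follow the model card, but treat the details as illustrative:

```python
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16, variant="fp16",
).to("cuda")

# The still frame to animate; SVD expects roughly 1024x576 input.
image = load_image("input_still.png").resize((1024, 576))

frames = pipe(image, decode_chunk_size=8).frames[0]
export_to_video(frames, "generated.mp4", fps=7)
```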