Google DeepMind Releases Imagen 3 for Free to U.S. Users

Google DeepMind has made its latest AI image generator, Imagen 3, free for use in the U.S. via the company’s ImageFX platform. Imagen 3 will be available in multiple versions, “each optimized for different types of tasks, from generating quick sketches to high-resolution images.” Google announced Imagen 3 at Google I/O in March, and in June made it available to enterprise users through Vertex. Using simplified natural language text input rather than “complex prompt engineering,” Google says Imagen 3 generates high-quality images in a range styles, from photorealistic, painterly and textured to whimsically cartoony. Continue reading Google DeepMind Releases Imagen 3 for Free to U.S. Users

ByteDance Intros Jimeng AI Text-to-Video Generator in China

ByteDance has debuted a text-to-video mobile app in its native China that is available on the company’s TikTok equivalent there, Douyin. Called Jimeng AI, there is speculation that it will be coming to North America and Europe soon via TikTok or ByteDance’s CapCut editing tool, possibly beating competing U.S. technologies like OpenAI’s Sora to market. Jimeng (translation: “dream”) uses text prompts to generate short videos. For now, its responsiveness is limited to prompts written in Chinese. In addition to entertainment, the app is described as applicable to education, marketing and other purposes. Continue reading ByteDance Intros Jimeng AI Text-to-Video Generator in China

xAI’s Grok-2 Generates Realistic Images with Few Guardrails

Grok-2 and Grok-2 mini, the latest generative chatbots from Elon Musk’s xAI, create images with seemingly few guardrails. Early pictures of notable personalities such as Bill Gates, Donald Trump and Kamala Harris in questionable or compromising settings may not appear photorealistic to a trained eye, but they are still described in many cases to be quite realistic. Powered by the FLUX.1 AI model from Black Forest Labs, Grok-2 and Grok-2 mini are available in beta on X social for Premium and Premium+ subscribers and will be coming to xAI’s enterprise API later this month, according to the company. Continue reading xAI’s Grok-2 Generates Realistic Images with Few Guardrails

Black Forest Labs Announces Suite of Text-to-Image Models

A new generative AI startup called Black Forest Labs has hit the scene, debuting with a suite of text-to-image models branded FLUX.1. Based in Germany, Black Forest was founded by some of the researchers involved in developing Stable Diffusion and has raised $31 million in funding from principal investor Andreessen Horowitz and angels including CAA founder and former talent agent Michael Ovitz. The FLUX.1 suite focuses on “image detail, prompt adherence, style diversity and scene complexity,” the company says of its three initial variants: FLUX.1 [pro], FLUX.1 [dev] and FLUX.1 [schnell]. Continue reading Black Forest Labs Announces Suite of Text-to-Image Models

Meta’s 3D Gen Bridges Gap from AI to Production Workflow

Meta Platforms has introduced an AI model it says can generate 3D images from text prompts in under one minute. The new model, called 3D Gen, is billed as a “state-of-the-art, fast pipeline” for turning text input into high-resolution 3D images quickly. The app also adds textures to AI output or existing images through text prompts, and “supports physically-based rendering (PBR), necessary for 3D asset relighting in real-world applications,” Meta explains, adding that in internal tests, 3D Gen outperforms industry baselines on “prompt fidelity and visual quality” and for speed. Continue reading Meta’s 3D Gen Bridges Gap from AI to Production Workflow

Veo AI Image Generator and Imagen 3 Unveiled at Google I/O

Google is launching two new AI models: the video generator Veo and Imagen 3, billed as the company’s “highest quality text-to-image model yet.” The products were introduced at Google I/O this week, where new demo recordings created using the Music AI Sandbox were also showcased. The 1080p Veo videos can be generated in “a wide range of cinematic and visual styles” and run “over a minute” in length, Google says. Veo is available in private preview in VideoFX by joining a waitlist. At a future date, the company plans to bring some Veo capabilities to YouTube Shorts and other products. Continue reading Veo AI Image Generator and Imagen 3 Unveiled at Google I/O

Meta Tests Image-Generating Social Chatbot on Its Platforms

Meta is testing a new large language chatbot, Meta AI, on social platforms in parts of India and Africa. The chatbot was introduced in late 2023, and began testing on U.S. WhatApp users in March. The test is expanding to include more territories and the addition of Instagram and Facebook Messenger. India is reported to be Meta’s largest social market, with more than 500 million Facebook and WhatsApp users, and has big implications as the company scales up its AI plans to compete against OpenAI and others. The Meta AI chatbot answers questions and generates photorealistic images. Continue reading Meta Tests Image-Generating Social Chatbot on Its Platforms

Google Imagen 2 Now Generates 4-Second Clips on Vertex AI

During Google Cloud Next 2024 in Las Vegas, Google announced an updated version of its text-to-image generator Imagen 2 on Vertex AI that has the ability to generate video clips of up to four seconds. Google calls this feature “text-to-live images,” and it essentially delivers animated GIFs at 24 fps and 360×640 pixel resolution, though Google says there will be “continuous enhancements.” Imagen 2 can also generate text, emblems and logos in different languages, and has the ability to overlay those elements on existing images like business cards, apparel and products. Continue reading Google Imagen 2 Now Generates 4-Second Clips on Vertex AI

New Tech from MIT, Adobe Advances Generative AI Imaging

Researchers from the Massachusetts Institute of Technology and Adobe have unveiled a new AI acceleration tool that makes generative apps like DALL-E 3 and Stable Diffusion up to 30x faster by reducing the process to a single step. The new approach, called distribution matching distillation, or DMD, maintains or enhances image quality while greatly streamlining the process. Theoretically, the technique “marries the principles of generative adversarial networks (GANs) with those of diffusion models,” consolidating “the hundred steps of iterative refinement required by current diffusion models” into one step, MIT PhD student and project lead Tianwei Yin says. Continue reading New Tech from MIT, Adobe Advances Generative AI Imaging

Midjourney Creates a Feature to Advance Image Consistency

Artificial intelligence imaging service Midjourney has been embraced by storytellers who have also been clamoring for a feature that enables characters to regenerate consistently across new requests. Now Midjourney is delivering that functionality with the addition of the new “–cref” tag (short for Character Reference), available for those who are using Midjourney v6 on the Discord server. Users can achieve the effect by adding the tag to the end of text prompts, followed by a URL that contains the master image subsequent generations should match. Midjourney will then attempt to repeat the particulars of a character’s face, body and clothing characteristics. Continue reading Midjourney Creates a Feature to Advance Image Consistency

Stability AI Advances Image Generation with Stable Cascade

Stability AI, purveyor of the popular Stable Diffusion image generator, has introduced a completely new model called Stable Cascade. Now in preview, Stable Cascade uses a different architecture than Stable Diffusion’s SDXL that the UK company’s researchers say is more efficient. Cascade builds on a compression architecture called Würstchen (German for “sausage”) that Stability began sharing in research papers early last year. Würstchen is a three-stage process that includes two-step encoding. It uses fewer parameters, meaning less data to train on, greater speed and reduced costs. Continue reading Stability AI Advances Image Generation with Stable Cascade

Apple Launches Open-Source Language-Based Image Editor

Apple has released MGIE, an open-source AI model that edits images using natural language instructions. MGIE, short for MLLM-Guided Image Editing, can also modify and optimize images. Developed in conjunction with University of California Santa Barbara, MGIE is Apple’s first AI model. The multimodal MGIE, which understands text and image input, also crops, resizes, flips, and adds filters based on text instructions using what Apple says is an easier instruction set than other AI editing programs, and is simpler and faster than learning a traditional program, like Apple’s own Final Cut Pro. Continue reading Apple Launches Open-Source Language-Based Image Editor

Unpacked: Samsung Intros Galaxy AI with Next Gen S Phones

During this week’s Unpacked event, Samsung introduced Galaxy AI, a suite of artificial intelligence tools designed for the new Galaxy S series smartphones — the Galaxy S24, Galaxy S24+, and Galaxy S24 Ultra. “AI amplifies nearly every experience on the Galaxy S24 series,” including real-time text and call translations, a powerful suite of creative tools in the ProVisual Engine and a new kind of “gestural search that lets users circle, highlight, scribble on or tap anything onscreen” to see related search results. The AI enhancements are largely enabled by a multiyear deal with Google and Qualcomm. Samsung also debuted a wearable accessory, the Galaxy Ring. Continue reading Unpacked: Samsung Intros Galaxy AI with Next Gen S Phones

CES: Getty Rolls Out iStock Generative AI Powered by Nvidia

Getty Images and Nvidia are expanding their AI partnership with the addition of the text-to-image platform Generative AI by iStock, designed to produce stock photos that can be used by individuals or enterprise customers. Built on Nvidia Picasso, a foundry for custom AI models, and trained exclusively on data from Getty Images’ proprietary creative libraries, Generative AI by iStock “has been engineered to guard against generations of known products, people, places or other copyrighted elements,” Getty explains, adding that “any licensed visual that a customer generates comes with iStock’s standard $10,000 USD legal coverage.” Continue reading CES: Getty Rolls Out iStock Generative AI Powered by Nvidia

Stability AI Is Offering Paid Membership for Commercial Users

As the pressure ratchets up for AI companies to go beyond the wow factor and make money, Stability AI has formalized three subscription tiers as it seeks to expand commercial use of its open-source, multimodal core models. The Stability AI Membership offerings include a free tier for personal and research (i.e., non-commercial) use, a professional tier that costs $20 a month, and a custom-priced enterprise tier for large outfits. The company says that with the three tiers it is “striking a balance between fostering competitiveness and maintaining openness in AI technologies.” Continue reading Stability AI Is Offering Paid Membership for Commercial Users