By Paula Parisi, October 4, 2024
Nvidia has unveiled the NVLM 1.0 family of multimodal LLMs, a powerful open-source AI that the company says performs comparably to proprietary systems from OpenAI and Google. Led by NVLM-D-72B, with 72 billion parameters, Nvidia’s new entry in the AI race achieved what the company describes as “state-of-the-art results on vision-language tasks, rivaling the leading proprietary models (e.g., GPT-4o) and open-access models.” Nvidia has made the model weights publicly available and says it will also be releasing the training code, a break from the closed approach of OpenAI, Anthropic and Google. Continue reading Nvidia Releases Open-Source Frontier-Class Multimodal LLMs
By Paula Parisi, October 2, 2024
Snap Inc. is leveraging its relationship with Google Cloud to use Gemini for powering generative AI experiences within Snapchat’s My AI chatbot. The multimodal capabilities of Gemini on Vertex AI will greatly increase the My AI chatbot’s ability to understand and operate across different types of information such as text, audio, image, video and code. Snapchatters can use My AI to take advantage of Google Lens-like features, including asking the chatbot “to translate a photo of a street sign while traveling abroad, or take a video of different snack offerings to ask which one is the healthiest option.” Continue reading Snapchat: My AI Goes Multimodal with Google Cloud, Gemini
By Paula Parisi, September 24, 2024
Amazon has joined the ranks of firms offering generative video tools, although its release is aimed only at advertisers, at least for now. Simply called Video Generator, it can turn a product image into a video that showcases the product and even demonstrates its features, “leveraging Amazon’s unique insights to vividly bring a product story to life.” At the company’s Accelerate 2024 conference, Amazon also debuted Live Image, which lets brands create animated GIFs from stills; a customizable chatbot assistant for third-party sellers; and a new AI-powered recommendation engine based on customer interests. Continue reading Amazon’s Video Generator Turns Stills into Advertising Clips
By Paula Parisi, September 19, 2024
GoPro has announced two new cameras: the $399 Hero13 Black with swappable lenses, and its smallest 4K camera ever, the $199 Hero. The high-end Hero13 Black boasts better battery performance and four interchangeable Hero Black-series lens modules with automatic adjustments for settings. A 13x Burst Slo-Mo feature captures up to 400 frames per second at 720p, with options for 5.3K at 120 fps or 900p at 360 fps. Upgrades also include improved Wi-Fi 6, with transfer speeds up to 40 percent faster, and enhanced audio and voice settings. Continue reading GoPro’s Hero13 Black Adds New Lens Mount and HLG HDR
By Paula Parisi, September 16, 2024
Backed by Alibaba and Tencent, Chinese startup MiniMax has launched a new text-to-video model called Hailuo AI that is quickly gaining traction on social media based on its impressive capabilities, with comments ranging from “fantastical” to “hyper-realistic.” The free, web-based tool has already triggered videos that have gone viral, despite the current limitation of only 6-second clips. However, an image-to-video model is reportedly coming soon, in addition to a version 2 that promises longer video duration and improved motion. Unlike the Jimeng AI text-to-video model that was issued by ByteDance last month, the MiniMax technology is available outside of China. Continue reading Hailuo AI: China’s MiniMax Releases Free Text-to-Video App
By Paula Parisi, September 13, 2024
Adobe is showcasing upcoming generative AI video tools that build on the Firefly video model the software giant announced in April. The offerings include a text-to-video feature and one that generates video from pictures. Each outputs clips of up to five seconds. Adobe has developed Firefly as the generative component of the AI integration it is rolling out across its Creative Cloud applications, which previously focused on editing and now, thanks to gen AI, incorporate creation. Adobe wasn’t a first-mover in the space, but its percolating effort has been received enthusiastically. Continue reading Adobe Publicly Demos Firefly Text- and Image-to-Video Tools
By Paula Parisi, August 30, 2024
Google is giving Gemini Advanced, Enterprise and Business subscribers the ability to create personalized AI assistants, which the company calls “Gems.” “Create your own personal AI experts on any topic you want,” the Alphabet company says. The search giant is also reintroducing Gemini’s image generation capabilities with its latest Imagen 3 model, which will be available to everyone. Gemini, which is Google’s ChatGPT competitor, will again have the ability to generate images of people, something Google disabled in February after controversy over some of the images. The company announced it has implemented new guardrails. Continue reading Gemini Gets Custom Gems AI Assistants and Adds Imagen 3
By Paula Parisi, August 28, 2024
Adobe, OpenAI and Microsoft are among the major firms backing a California bill that would require tech companies to label AI-generated content with watermarks embedded in the metadata. Such data is easily accessible via browser for material circulated on the Internet, and the initiative would likely involve a campaign to educate the general public on how to find it. The proposed law encompasses video and audio as well as images. The three companies currently supporting the bill initially opposed it, using terms like “unworkable” and “overly burdensome.” Continue reading Bill Mandating GenAI Watermarks Gains Support in California
By Paula Parisi, August 22, 2024
Google DeepMind has made its latest AI image generator, Imagen 3, free for use in the U.S. via the company’s ImageFX platform. Imagen 3 will be available in multiple versions, “each optimized for different types of tasks, from generating quick sketches to high-resolution images.” Google announced Imagen 3 at Google I/O in May, and in June made it available to enterprise users through Vertex. Using simplified natural language text input rather than “complex prompt engineering,” Google says Imagen 3 generates high-quality images in a range of styles, from photorealistic, painterly and textured to whimsically cartoony. Continue reading Google DeepMind Releases Imagen 3 for Free to U.S. Users
By Paula Parisi, August 20, 2024
ByteDance has debuted a text-to-video mobile app in its native China that is available on the company’s TikTok equivalent there, Douyin. The app is called Jimeng AI, and there is speculation that it will be coming to North America and Europe soon via TikTok or ByteDance’s CapCut editing tool, possibly beating competing U.S. technologies like OpenAI’s Sora to market. Jimeng (translation: “dream”) uses text prompts to generate short videos. For now, its responsiveness is limited to prompts written in Chinese. In addition to entertainment, the app is described as applicable to education, marketing and other purposes. Continue reading ByteDance Intros Jimeng AI Text-to-Video Generator in China
By Paula Parisi, August 19, 2024
Grok-2 and Grok-2 mini, the latest generative chatbots from Elon Musk’s xAI, create images with seemingly few guardrails. Early pictures of notable personalities such as Bill Gates, Donald Trump and Kamala Harris in questionable or compromising settings may not appear photorealistic to a trained eye, but they are still described in many cases as quite realistic. Powered by the FLUX.1 AI model from Black Forest Labs, Grok-2 and Grok-2 mini are available in beta on the X social platform for Premium and Premium+ subscribers and will be coming to xAI’s enterprise API later this month, according to the company. Continue reading xAI’s Grok-2 Generates Realistic Images with Few Guardrails
By Paula Parisi, August 8, 2024
Amazon has made the Amazon Titan Image Generator v2 model generally available to AWS customers using Amazon Bedrock. The improved v2 model supports creation using reference images (called “image conditioning”) and adds editing capabilities, background removal, iteration and customization, with a focus on maintaining brand style and subject consistency. The new version “can intelligently detect and segment multiple foreground objects,” according to AWS cloud developer Channy Yun. “With the Titan Image Generator v2, you can generate color-conditioned images based on a color palette [and] use the image conditioning feature to shape your creations.” Continue reading Amazon Rolls Out New Upgrades to Its Titan Image Generator
By Paula Parisi, August 6, 2024
A new generative AI startup called Black Forest Labs has hit the scene, debuting with a suite of text-to-image models branded FLUX.1. Based in Germany, Black Forest was founded by some of the researchers involved in developing Stable Diffusion and has raised $31 million in funding from principal investor Andreessen Horowitz and angels including CAA founder and former talent agent Michael Ovitz. The FLUX.1 suite focuses on “image detail, prompt adherence, style diversity and scene complexity,” the company says of its three initial variants: FLUX.1 [pro], FLUX.1 [dev] and FLUX.1 [schnell]. Continue reading Black Forest Labs Announces Suite of Text-to-Image Models
By Rob Scott, August 1, 2024
Graphic design company Canva announced it is acquiring fellow Australian startup Leonardo AI with plans to have Leonardo’s 120 employees, including executives, join the Canva AI team. Financial terms of the deal were not disclosed. Sydney-based Leonardo has been gaining attention for its advanced generative AI platform that helps users create images and art based on the open-source Stable Diffusion model developed by Stability AI. The Leonardo team claims its offering is different from other AI art platforms because it provides users with more control. Users can experiment with text prompts and quick sketches as Leonardo.ai creates photorealistic images in real time. Continue reading Canva Aims to Boost Its GenAI Efforts with Leonardo Purchase
By Paula Parisi, July 26, 2024
Adobe is bringing more Firefly AI features to its popular Photoshop and Illustrator design platforms. The upgrade is a significant step forward for Adobe since the 2023 debut of Firefly, and sees Photoshop finally getting an in-app ability to generate AI images. It also brings a new Generative Shape Fill, still in beta, that lets designers quickly add detailed vectors to shapes by entering text prompts directly in the Contextual Taskbar. Improvements to Illustrator include the Dimension Tool, Retype, Style Reference, its own Contextual Taskbar and two new beta tools, Text to Pattern and Mockup. Continue reading Adobe Adds New Firefly AI Features to Illustrator, Photoshop