Yahoo Using McAfee’s Modified Image Detector to Flag Fakes

Yahoo News has signed up to use San Jose-based cybersecurity company McAfee’s deepfake image detection technology. The scalable McAfee system can “quickly identify images that may have been produced or modified using AI, including deepfake images,” flagging them for the Yahoo News editorial standards team for human review. The standards team then “determines whether the flagged images meet the platform’s editorial guidelines.” The partnership provides news aggregator Yahoo with an extra layer of protection as it deals with a large network of global publishers in addition to policing its original content.

Midjourney Makes Powerful AI Image Editor Available in Alpha

Midjourney is turning heads with its new image editor, which lets users upload images and then make adjustments. The company’s models, most recently Midjourney 6.1, accept uploaded images as a reference to use for generative results. Now the Midjourney image editor allows precise adjustments to aspects of the frame. An “image retexturing mode” is also being introduced, as is v2 of its “AI moderator.” The new features are only available to users with yearly memberships, users who have held monthly memberships for the past 12 months, or those who have generated at least 10,000 Midjourney images.

OpenAI: sCM Generates Media 50x Faster Than Other Models

OpenAI is taking a new approach to generating media that it says is 50 times faster than the models commonly used today. Called sCM, the approach is a “consistency model,” a variation on the diffusion method used by many leading systems. OpenAI claims its new model is ideal for training on large-scale datasets and generating video, audio and images that are of “comparable sample quality to leading diffusion models.” Diffusion models often require hundreds of steps, creating challenges when it comes to real-time applications. OpenAI aims to change this with a faster system that requires less power.

Runway’s Act-One Facial Capture Could Be a ‘Game Changer’

Runway is launching Act-One, a motion capture system that uses video and voice recordings to map human facial expressions onto characters using the company’s latest model, Gen-3 Alpha. Runway calls it “a significant step forward in using generative models for expressive live action and animated content.” Compared to past facial capture techniques, which typically require complex rigging, Act-One is driven directly and only by the performance of an actor, requiring “no extra equipment,” making it more likely to capture and preserve an authentic, nuanced performance, according to the company.

Meta Announces New GenAI Video Tools at Advertising Week

Meta is rolling out new generative AI advertising tools for video creation on Facebook and Instagram. The expansion to the Advantage+ creative ad suite will become widely available to advertisers in early 2025. The announcement, made at Advertising Week in New York last week, was positioned as a way for marketers to improve campaign performance on Meta’s social platforms. The new tools will allow brands to convert static images into video ads. The company also announced a new full-screen video tab for Facebook that blends short-form Reels with long-form and live-stream content.

Pyramid Flow Introduces a New Approach to Generative Video

Generative video models seem to be debuting daily. Pyramid Flow, among the latest, aims for realism, producing dynamic video sequences that have temporal consistency and rich detail while being open source and free. The model can create clips of up to 10 seconds using both text and image prompts. It offers a cinematic look, supporting 1280×768 pixel resolution clips at 24 fps. Developed by a consortium of researchers from Peking University, Beijing University of Posts and Telecommunications and Kuaishou Technology, Pyramid Flow harnesses a new technique that starts with low-resolution video, outputting at full resolution only at the end of the process.

MiniMax’s Hailuo AI Rolls Out New Image-to-Video Capability

Hailuo, the free text-to-video generator released last month by the Alibaba-backed company MiniMax, has delivered its promised image-to-video feature. Founded by AI researcher Yan Junjie, the Shanghai-based MiniMax also has backing from Tencent. The model earned high marks for what has been called “ultra realistic” video, and MiniMax says the new image-to-video feature will improve output across the board as a result of “text-and-image joint instruction following,” which means Hailuo now “seamlessly integrates both text and image command inputs, enhancing your visuals while precisely adhering to your prompts.”

Meta’s Movie Gen Model is a Powerful Content Creation Tool

Meta Platforms has unveiled Movie Gen, a new family of AI models that generates video and audio content. Coming to Instagram next year, Movie Gen also allows a high degree of editing and effects customization using text prompts. Meta CEO Mark Zuckerberg demonstrated its abilities last week in an example shared on his Instagram account, in which a leg press machine at the gym is transformed into a steampunk machine and then into one made of molten gold. The models have been trained on a combination of licensed and publicly available datasets.

Apple Advances Computer Vision with Its Depth Pro AI Model

Apple has released a new AI model called Depth Pro that can create a 3D depth map from a 2D image in under a second. The system is being hailed as a breakthrough that could potentially revolutionize how machines perceive depth, with transformative impact on industries from augmented reality to self-driving vehicles. “The predictions are metric, with absolute scale” without relying on the camera metadata typically required for such mapping, according to Apple. Using a consumer-grade GPU, the model can produce a 2.25-megapixel depth map using a single image in only 0.3 seconds.

Nvidia Releases Open-Source Frontier-Class Multimodal LLMs

Nvidia has unveiled the NVLM 1.0 family of multimodal LLMs, a powerful open-source AI that the company says performs comparably to proprietary systems from OpenAI and Google. Led by NVLM-D-72B, with 72 billion parameters, Nvidia’s new entry in the AI race achieved what the company describes as “state-of-the-art results on vision-language tasks, rivaling the leading proprietary models (e.g., GPT-4o) and open-access models.” Nvidia has made the model weights publicly available and says it will also be releasing the training code, a break from the closed approach of OpenAI, Anthropic and Google.

Snapchat: My AI Goes Multimodal with Google Cloud, Gemini

Snap Inc. is leveraging its relationship with Google Cloud to use Gemini for powering generative AI experiences within Snapchat’s My AI chatbot. The multimodal capabilities of Gemini on Vertex AI will greatly increase the My AI chatbot’s ability to understand and operate across different types of information such as text, audio, image, video and code. Snapchatters can use My AI to take advantage of Google Lens-like features, including asking the chatbot “to translate a photo of a street sign while traveling abroad, or take a video of different snack offerings to ask which one is the healthiest option.”

Amazon’s Video Generator Turns Stills into Advertising Clips

Amazon has joined the ranks of firms offering generative video tools, although its release is aimed only at advertisers, at least for now. Simply called Video Generator, it can turn a product image into a video that showcases the product and even demonstrates its features, “leveraging Amazon’s unique insights to vividly bring a product story to life.” At the company’s Accelerate 2024 conference, Amazon also debuted three other offerings: Live Image, which lets brands create animated GIFs from stills; a customizable chatbot assistant for third-party sellers; and a new AI-powered recommendation engine based on customer interests.

GoPro’s Hero13 Black Adds New Lens Mount and HLG HDR

GoPro has announced two new cameras: the $399 Hero13 Black with swappable lenses, and its smallest 4K camera ever, the $199 Hero. The high-end Hero13 Black boasts better battery performance and four interchangeable Hero Black-series lens modules with automatic adjustments for settings. A 13x Burst Slo-Mo feature captures up to 400 frames per second at 720p, with options for 5.3K at 120 frames per second or 900p at 360 frames per second. Other upgrades include Wi-Fi 6 support, with transfer speeds up to 40 percent faster, and enhanced audio and voice settings.

Hailuo AI: China’s MiniMax Releases Free Text-to-Video App

Backed by Alibaba and Tencent, Chinese startup MiniMax has launched a new text-to-video model called Hailuo AI that is quickly gaining traction on social media based on its impressive capabilities, with comments ranging from “fantastical” to “hyper-realistic.” The free, web-based tool has already triggered videos that have gone viral, despite the current limitation of only 6-second clips. However, an image-to-video model is reportedly coming soon, in addition to a version 2 that promises longer video duration and improved motion. Unlike the Jimeng AI text-to-video model that was issued by ByteDance last month, the MiniMax technology is available outside of China.

Adobe Publicly Demos Firefly Text- and Image-to-Video Tools

Adobe is showcasing upcoming generative AI video tools that build on the Firefly video model the software giant announced in April. The offerings include a text-to-video feature and one that generates video from pictures. Each outputs clips of up to five seconds. Adobe has developed Firefly as the generative component of the AI integration it is rolling out across its Creative Cloud applications, which previously focused on editing and now, thanks to gen AI, incorporate creation. Adobe wasn’t a first mover in the space, but its percolating effort has been received enthusiastically.