Snapchat: My AI Goes Multimodal with Google Cloud, Gemini

Snap Inc. is leveraging its relationship with Google Cloud, using Gemini to power generative AI experiences within Snapchat’s My AI chatbot. The multimodal capabilities of Gemini on Vertex AI will greatly increase My AI’s ability to understand and operate across different types of information such as text, audio, image, video and code. Snapchatters can use My AI to take advantage of Google Lens-like features, including asking the chatbot “to translate a photo of a street sign while traveling abroad, or take a video of different snack offerings to ask which one is the healthiest option.”
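
As a rough illustration of the kind of multimodal request Gemini on Vertex AI supports, here is a minimal sketch using Google’s Vertex AI Python SDK. The project ID, region, image URI and prompt are placeholders, and the snippet is not Snap’s actual integration.

```python
# Minimal sketch of a multimodal Gemini request on Vertex AI (not Snap's actual
# integration). Project ID, region, image URI and prompt are placeholders.
import vertexai
from vertexai.generative_models import GenerativeModel, Part

vertexai.init(project="my-gcp-project", location="us-central1")  # placeholder project/region

model = GenerativeModel("gemini-1.5-pro")  # a Gemini multimodal model on Vertex AI

# Combine an image with a text instruction in a single prompt,
# e.g. asking the model to translate a photographed street sign.
response = model.generate_content([
    Part.from_uri("gs://my-bucket/street-sign.jpg", mime_type="image/jpeg"),  # placeholder image
    "Translate the text on this street sign into English.",
])

print(response.text)
```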

Hailuo AI: China’s MiniMax Releases Free Text-to-Video App

Backed by Alibaba and Tencent, Chinese startup MiniMax has launched a new text-to-video model called Hailuo AI that is quickly gaining traction on social media thanks to its impressive capabilities, with comments ranging from “fantastical” to “hyper-realistic.” The free, web-based tool has already produced videos that have gone viral, despite its current limitation to 6-second clips. An image-to-video model is reportedly coming soon, along with a version 2 that promises longer videos and improved motion. Unlike the Jimeng AI text-to-video model released by ByteDance last month, the MiniMax technology is available outside of China.

Adobe Publicly Demos Firefly Text- and Image-to-Video Tools

Adobe is showcasing upcoming generative AI video tools that build on the Firefly video model the software giant announced in April. The offerings include a text-to-video feature and one that generates video from pictures. Each outputs clips of up to five seconds. Adobe has developed Firefly as the generative component of the AI integration it is rolling out across its Creative Cloud applications, which previously focused on editing and now, thanks to generative AI, incorporate creation. Adobe wasn’t a first mover in the space, but its slower-brewing effort has been received enthusiastically.

ByteDance Intros Jimeng AI Text-to-Video Generator in China

ByteDance has debuted a text-to-video mobile app in its native China that is available on the company’s TikTok equivalent there, Douyin. Called Jimeng AI, the app is rumored to be coming to North America and Europe soon via TikTok or ByteDance’s CapCut editing tool, possibly beating competing U.S. technologies like OpenAI’s Sora to market. Jimeng (translation: “dream”) uses text prompts to generate short videos. For now, it responds only to prompts written in Chinese. In addition to entertainment, the app is described as applicable to education, marketing and other purposes.

WordPress Introduces AI Assistant to Help Users with Writing

WordPress parent Automattic has launched Write Brief with AI to help make documents more concise. Available for free to WordPress.com users, Write Brief with AI measures “readability,” suggests edits and will even make them for you. It identifies complex words, offers alternatives, and focuses on simplifying convoluted sentences — all from within the editor in the WordPress dashboard. Write Brief with AI is now built into Jetpack for those who host through WordPress.com. It is available only in English, though the company says it is working to expand language support.

Adobe Adds New Firefly AI Features to Illustrator, Photoshop

Adobe is bringing more Firefly AI features to its popular Photoshop and Illustrator design platforms. The upgrade is a significant step forward for Adobe since the 2023 debut of Firefly: Photoshop finally gets the in-app ability to generate AI images, and a new Generative Shape Fill, still in beta, lets designers quickly add detailed vectors to shapes by entering text prompts directly in the Contextual Taskbar. Improvements to Illustrator include the Dimension Tool, Retype, Style Reference, its own Contextual Taskbar, and two new beta tools, Text to Pattern and Mockup.

Suno’s AI Music Generator Is Now Available for iOS Devices

Suno, the AI text-to-music startup that, along with AI music generator Udio, is currently being sued by the Recording Industry Association of America, has launched its long-awaited mobile app. Likened to a pocket-sized virtual music studio, it is available for free (with ads) to iOS users in the U.S. Suno says a global rollout is coming soon, as is a mobile app for Android. “Whether you’re a shower singer or a charting artist, we break barriers between you and the song you dream of making. No instrument needed, just imagination,” touts Suno’s landing page on Apple’s App Store.

Meta’s 3D Gen Bridges Gap from AI to Production Workflow

Meta Platforms has introduced an AI model it says can generate 3D images from text prompts in under one minute. The new model, called 3D Gen, is billed as a “state-of-the-art, fast pipeline” for turning text input into high-resolution 3D images. The model can also add textures to AI output or existing images through text prompts, and “supports physically-based rendering (PBR), necessary for 3D asset relighting in real-world applications,” Meta explains, adding that in internal tests 3D Gen outperforms industry baselines on “prompt fidelity and visual quality” as well as speed.

Runway Making Gen-3 Alpha AI Video Model Available to All

New York-based AI startup Runway has made its latest frontier model — which creates realistic AI videos from text, image or video prompts — generally available to users willing to upgrade to a paid plan starting at $12 per month per editor. Introduced several weeks ago, Gen-3 Alpha reportedly offers significant improvements over Gen-1 and Gen-2 in areas such as speed, motion, fidelity and consistency. Runway explains it worked with a “team of research scientists, engineers and artists” to develop the upgrades but did not specify where it collected its training data. As the AI video field ramps up, current rivals include Stability AI, OpenAI, Pika and Luma Labs.

Genspark Joins Collection of GenAI-Powered Search Engines

MainFunc Inc. has raised $60 million on the strength of its principal technology — a free, AI-powered search engine called Genspark. The platform responds to queries by writing custom summaries presented in a “Sparkpage,” a one-page overview featuring content from around the web. Genspark joins a growing field of generative AI search engines, the best known of which is Perplexity, which has raised $250 million since its 2022 launch and is currently valued at about $2.5 billion. Reuters says Genspark’s funding values the company at $260 million. Google also offers “AI Overviews” as part of Google Search.

Anthropic’s Claude 3.5: ‘Frontier Intelligence at 2x the Speed’

Anthropic has launched a powerful new AI model, Claude 3.5 Sonnet, that can analyze text and images and generate text. That its release comes a mere three months after Anthropic debuted Claude 3 indicates just how quickly the field is developing. The Google-backed company says Claude 3.5 Sonnet has set “new industry benchmarks for graduate-level reasoning (GPQA), undergraduate-level knowledge (MMLU), and coding proficiency (HumanEval).” Sonnet is Anthropic’s mid-tier model, sitting between Haiku and the high-end Opus. Anthropic says 3.5 Sonnet is twice as fast as 3 Opus, offering “frontier intelligence at 2x the speed.”
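
For developers, the combined text-and-image analysis described above is exposed through Anthropic’s Messages API. The following is a minimal sketch using the Anthropic Python SDK; the image file and prompt are placeholders, and the dated model identifier follows Anthropic’s public documentation rather than anything stated in this article.

```python
# Minimal sketch of sending an image plus a question to Claude 3.5 Sonnet via
# Anthropic's Messages API. The image path and prompt are placeholders.
import base64
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

with open("chart.png", "rb") as f:  # placeholder image file
    image_data = base64.standard_b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-3-5-sonnet-20240620",  # identifier per Anthropic's docs at launch
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {"type": "image",
             "source": {"type": "base64", "media_type": "image/png", "data": image_data}},
            {"type": "text", "text": "Summarize what this chart shows."},
        ],
    }],
)

print(message.content[0].text)
```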

Meta’s FAIR Team Announces a New Collection of AI Models

Meta Platforms is publicly releasing five new AI models from its Fundamental AI Research (FAIR) team, which has been experimenting with artificial intelligence since 2013. The models include image-to-text, text-to-music generation, and multi-token prediction tools. Meta is also introducing AudioSeal, an audio watermarking technique designed for the localized detection of AI-generated speech. “AudioSeal makes it possible to pinpoint AI-generated segments within a longer audio snippet,” according to Meta. The feature is timely in light of concern about potential misinformation surrounding the fall presidential election.
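
AudioSeal is distributed as an open-source package, and the sketch below shows roughly how its watermark-then-detect workflow enables localized detection. The model card names, tensor shapes and return values are assumptions based on the public facebookresearch/audioseal repository, not details from this article.

```python
# Rough sketch of watermarking and then detecting AI-generated speech with
# Meta's open-source AudioSeal package. Model card names, tensor shapes and
# return values are assumptions based on the public repository.
import torch
from audioseal import AudioSeal

sample_rate = 16000
wav = torch.randn(1, 1, sample_rate * 5)  # placeholder: 5 s of (batch, channels, samples) audio

# Embed an imperceptible watermark into the speech signal.
generator = AudioSeal.load_generator("audioseal_wm_16bits")
watermark = generator.get_watermark(wav, sample_rate)
watermarked = wav + watermark

# Detect the watermark; AudioSeal scores the audio frame by frame, which is
# what lets it pinpoint AI-generated segments inside a longer clip.
detector = AudioSeal.load_detector("audioseal_detector_16bits")
result, message = detector.detect_watermark(watermarked, sample_rate)
print("probability that the clip is watermarked:", result)
```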

Runway’s Gen-3 Alpha Creates AI Videos Up to 10 Seconds Long

Runway ML has introduced a new foundation model, Gen-3 Alpha, which the company says can generate high-quality, realistic scenes up to 10 seconds long from text prompts, still images or a video sample. Offering a variety of camera movements, Gen-3 Alpha will initially roll out to Runway’s paid subscribers, but the company plans to add a free version in the future. Runway says Gen-3 Alpha is the first of a new series of models trained on the company’s new large-scale multimodal infrastructure, which offers improvements “in fidelity, consistency, and motion over Gen-2,” released last year.

Luma AI Dream Machine Video Generator in Free Public Beta

Northern California startup Luma AI has released Dream Machine, a model that generates realistic videos from text prompts and images. Built on a scalable and multimodal transformer architecture and “trained directly on videos,” Dream Machine can create “action-packed scenes” that are physically accurate and consistent, says Luma, which has a free version of the model in public beta. Dream Machine is what Luma calls the first step toward “a universal imagination engine,” while others are calling it “powerful” and “slammed with traffic.” Though Luma has shared scant details, each posted sequence looks to be about 5 seconds long.

ByteDance Rival Kuaishou Creates Kling AI Video Generator

China’s Kuaishou Technology has a video generator called Kling AI in public beta that is getting great word-of-mouth, with comments ranging from “incredibly realistic” to “Sora killer,” a reference to OpenAI’s video generator, which is still in closed beta. Kuaishou claims that using only text prompts, Kling can generate “AI videos that closely mimic the real world’s complex motion patterns and physical characteristics,” in sequences as long as two minutes at 30 fps and 1080p, while supporting various aspect ratios. Kuaishou is China’s second most popular short-form video app, after ByteDance’s Douyin, the Chinese version of TikTok.