By Paula Parisi, July 10, 2024
Meta Platforms has introduced an AI model it says can generate 3D images from text prompts in under one minute. The new model, called 3D Gen, is billed as a “state-of-the-art, fast pipeline” for turning text input into high-resolution 3D images. The pipeline also adds textures to AI output or existing images through text prompts, and “supports physically-based rendering (PBR), necessary for 3D asset relighting in real-world applications,” Meta explains, adding that in internal tests, 3D Gen outperforms industry baselines on “prompt fidelity and visual quality” as well as speed. Continue reading Meta’s 3D Gen Bridges Gap from AI to Production Workflow
By Paula Parisi, July 9, 2024
Meta’s popular instant messaging service WhatsApp is reportedly beta testing a feature that would allow the already integrated Meta AI chatbot to edit and reply to images. The capability was spotted in the WhatsApp beta for Android 2.24.14.20, with AI powered by Llama 3, the company’s newest large language model released in April. The beta version works via a camera button added to the text box for Meta AI chat in WhatsApp. When pressed, the button triggers a pop-up that indicates Meta AI can analyze and edit photos, though it’s currently unclear to what extent. Continue reading Meta AI Image Analysis and Editing Beta Tested for WhatsApp
New York-based AI startup Runway has made its latest frontier model — which creates realistic AI videos from text, image or video prompts — generally available to users willing to upgrade to a paid plan starting at $12 per month for each editor. Introduced several weeks ago, Gen-3 Alpha reportedly offers significant improvements over Gen-1 and Gen-2 in areas such as speed, motion, fidelity and consistency. Runway explains it worked with a “team of research scientists, engineers and artists” to develop the upgrades but did not specify where it collected its training data. As the AI video field ramps up, current rivals include Stability AI, OpenAI, Pika and Luma Labs. Continue reading Runway Making Gen-3 Alpha AI Video Model Available to All
By Paula Parisi, July 2, 2024
Created by Humans, a company that aims to make it easy for creators to be compensated when their work is used for AI model training, has emerged from stealth with $5 million in funding. Positioning itself as “the AI rights licensing platform for creators,” the company was launched by Trip Adler, formerly the CEO of document sharing service and publishing platform Scribd. Noted author Walter Isaacson is an investor and creative advisor. By streamlining the licensing process, Created by Humans hopes to spare individuals and smaller companies the prospect of costly litigation against LLM firms. Continue reading Created by Humans: AI Rights Licensing Platform for Creators
By Paula Parisi, July 2, 2024
Deepfake videos are becoming increasingly problematic, not only in spreading disinformation on social media but also in enterprise attacks. Now researchers at Drexel University College of Engineering say they have developed an advanced algorithm with a 98 percent accuracy rate in detecting deepfake videos. Named MISLnet after the school’s Multimedia and Information Security Lab, where it was developed, the algorithm uses machine learning to recognize and extract the “digital fingerprints” of video generators including Stable Video Diffusion, VideoCrafter and CogVideo. Continue reading Drexel Claims Its AI Has 98 Percent Rate Detecting Deepfakes
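Drexel has not detailed MISLnet’s exact architecture here, but the general approach of learning a generator’s forensic fingerprint can be illustrated with an ordinary frame-level classifier. The sketch below is a minimal, hypothetical stand-in (the layer sizes and label set are invented), not the MISLnet model itself.

```python
# Illustrative sketch only -- NOT the actual MISLnet architecture or weights.
# It shows the general idea: train a small CNN to classify which video
# generator (or a real camera) produced a frame, based on low-level pixel
# artifacts rather than scene content.
import torch
import torch.nn as nn

GENERATORS = ["real", "stable_video_diffusion", "videocrafter", "cogvideo"]  # hypothetical label set

class FingerprintCNN(nn.Module):
    def __init__(self, num_classes: int = len(GENERATORS)):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(8),
        )
        self.classifier = nn.Linear(64 * 8 * 8, num_classes)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, 3, H, W) video frames scaled to [0, 1]
        x = self.features(frames)
        return self.classifier(x.flatten(1))

model = FingerprintCNN()
logits = model(torch.rand(4, 3, 128, 128))   # four dummy frames
print(logits.argmax(dim=1))                  # predicted generator per frame
```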
By Paula Parisi, July 1, 2024
China’s Sneaki Design has a new smartphone camera technology called SwitchLens that makes it possible to use professional-quality interchangeable lenses with existing Android and iOS phones. It does this via a phone-mounted external camera unit that has its own one-inch CMOS sensor and a coupling device for lenses built to the Micro Four Thirds (M43) open standard. The pro-sized sensor captures 21MP still images in either RAW or JPEG format, and MOV video at up to 4K at 60p. Existing M43-compatible lenses from manufacturers including Panasonic and Olympus work with SwitchLens, according to Sneaki Design. Continue reading SwitchLens Adds 1-Inch Sensor, M43 Lenses to Smartphones
By Rob Scott, June 27, 2024
To address Gen Z’s ongoing interest in social video content, Pinterest announced it is updating its app so that users can create video versions of its more than 10 billion curated boards. The videos can then be shared on popular social platforms such as TikTok and Instagram. Pinterest users have been using manual methods such as screenshots and green screen effects to share their boards on other apps. According to the company, which refers to this as the “mecore” trend, searches for boards labeled “mecore” have jumped 255 percent since last year. The updated approach to board sharing is designed to leverage this growing trend. Continue reading Pinterest Introduces the Ability to Convert Boards into Videos
By Paula Parisi, June 26, 2024
Amazon is launching Ad Relevance, a cookieless consumer tracking solution that will be available to those using Amazon DSP, a tool that lets advertisers buy Internet ad placements on and off Amazon’s website. Ad Relevance “uses the latest in AI technology to analyze billions of browsing, buying, and streaming signals in conjunction with real-time information about the content being viewed” to reveal customer shopping patterns and serve relevant ads across devices, channels, and content types without using third-party cookies. The technology accommodates Google’s long-delayed cookie deprecation, currently set for 2025. Continue reading Amazon Debuts Ad Relevance Cookieless Solution in Cannes
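Amazon has not detailed how Ad Relevance scores ads, so the snippet below is only a conceptual sketch of cookieless contextual matching: embed the content being viewed, embed candidate ads, and rank by similarity. The toy embedding function and the example ads are invented for illustration and are not Amazon’s implementation.

```python
# Conceptual sketch of cookieless contextual matching -- not Amazon's actual
# Ad Relevance system. The idea: score ads against the content currently
# being viewed (plus first-party signals) instead of reading third-party cookies.
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy deterministic 'embedding' via hashed bag-of-words (stand-in for a real model)."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

page = embed("review of trail running shoes for rocky terrain")
ads = {
    "running shoes sale": embed("lightweight trail running shoes on sale"),
    "office chairs": embed("ergonomic office chairs for back support"),
}
ranked = sorted(ads, key=lambda name: float(page @ ads[name]), reverse=True)
print(ranked)  # the shoe ad should rank first
```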
By Paula Parisi, June 21, 2024
Anthropic has launched a powerful new AI model, Claude 3.5 Sonnet, that can analyze text and images and generate text. That its release comes a mere three months after Anthropic debuted Claude 3 indicates just how quickly the field is developing. The Google-backed company says Claude 3.5 Sonnet has set “new industry benchmarks for graduate-level reasoning (GPQA), undergraduate-level knowledge (MMLU), and coding proficiency (HumanEval).” Sonnet is Anthropic’s mid-tier model, positioned between Haiku and the high-end Opus. Anthropic says 3.5 Sonnet is twice as fast as 3 Opus, offering “frontier intelligence at 2x the speed.” Continue reading Anthropic’s Claude 3.5: ‘Frontier Intelligence at 2x the Speed’
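For developers, Claude 3.5 Sonnet is reachable through Anthropic’s Messages API. A minimal sketch using the official Python SDK might look like the following, assuming an ANTHROPIC_API_KEY environment variable is set; the model identifier shown is the one publicized at launch and could change.

```python
# Minimal sketch of calling Claude 3.5 Sonnet through Anthropic's Python SDK
# (pip install anthropic). Reads ANTHROPIC_API_KEY from the environment.
import anthropic

client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-3-5-sonnet-20240620",  # launch-era model ID; may change
    max_tokens=512,
    messages=[{"role": "user", "content": "Summarize what GPQA measures in two sentences."}],
)
print(message.content[0].text)
```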
By Paula Parisi, June 20, 2024
Meta Platforms is publicly releasing five new AI models from its Fundamental AI Research (FAIR) team, which has been experimenting with artificial intelligence since 2013. The models include image-to-text and text-to-music generation tools as well as multi-token prediction. Meta is also introducing AudioSeal, an audio watermarking technique designed for the localized detection of AI-generated speech. “AudioSeal makes it possible to pinpoint AI-generated segments within a longer audio snippet,” according to Meta. The feature is timely in light of concern about potential misinformation surrounding the fall presidential election. Continue reading Meta’s FAIR Team Announces a New Collection of AI Models
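Meta distributes the AudioSeal code separately, but as a concept, localized detection simply means scoring short windows of audio rather than issuing one verdict per file. The sketch below illustrates that idea with a stand-in scoring function; it is not the AudioSeal model or its API.

```python
# Conceptual sketch of *localized* watermark detection -- not the actual
# AudioSeal code or API. The point it illustrates: instead of one yes/no
# verdict per file, score short windows so AI-generated segments can be
# pinpointed inside a longer clip.
import numpy as np

def detect_segments(audio: np.ndarray, sr: int, score_fn, win_s: float = 1.0, thresh: float = 0.5):
    """Return (start_sec, end_sec) spans whose watermark score exceeds thresh."""
    win = int(win_s * sr)
    spans = []
    for start in range(0, len(audio) - win + 1, win):
        window = audio[start:start + win]
        if score_fn(window) > thresh:          # score_fn stands in for a trained detector
            spans.append((start / sr, (start + win) / sr))
    return spans

# Toy demo: pretend high average amplitude marks the "watermarked" middle second.
sr = 16000
audio = np.concatenate([np.zeros(sr), np.random.randn(sr), np.zeros(sr)])
print(detect_segments(audio, sr, score_fn=lambda w: float(np.abs(w).mean())))  # ~[(1.0, 2.0)]
```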
By Paula Parisi, June 19, 2024
Runway ML has introduced a new foundation model, Gen-3 Alpha, which the company says can generate high-quality, realistic scenes up to 10 seconds long from text prompts, still images or a video sample. Offering a variety of camera movements, Gen-3 Alpha will initially roll out to Runway’s paid subscribers, but the company plans to add a free version in the future. Runway says Gen-3 Alpha is the first of a new series of models trained on the company’s new large-scale multimodal infrastructure, which offers improvements “in fidelity, consistency, and motion over Gen-2,” released last year. Continue reading Runway’s Gen-3 Alpha Creates AI Videos Up to 10-Seconds
By Paula Parisi, June 14, 2024
Northern California startup Luma AI has released Dream Machine, a model that generates realistic videos from text prompts and images. Built on a scalable and multimodal transformer architecture and “trained directly on videos,” Dream Machine can create “action-packed scenes” that are physically accurate and consistent, says Luma, which has a free version of the model in public beta. Dream Machine is what Luma calls the first step toward “a universal imagination engine,” while others are calling it “powerful” and “slammed with traffic.” Though Luma has shared scant details, each posted sequence looks to be about 5 seconds long. Continue reading Luma AI Dream Machine Video Generator in Free Public Beta
By Paula Parisi, May 29, 2024
Elon Musk’s xAI has secured $6 billion in Series B funding. While the company says the funds will be “used to take xAI’s first products to market, build advanced infrastructure, and accelerate the research and development,” some outlets are reporting a significant portion is earmarked to build an AI supercomputer to power the next generation of its foundation model Grok. The company launched Grok as a chatbot on the X social platform in November, later open-sourcing the Grok-1 model, and recently debuted Grok-1.5 and 1.5V iterations with long-context capability and image understanding. Continue reading Musk Said to Envision Supercomputer as xAI Raises $6 Billion
By Paula Parisi, May 28, 2024
Meta Platforms has unveiled its first natively multimodal model, Chameleon, which observers say can make it competitive with frontier model firms. Although Chameleon is not yet released, Meta says internal research indicates it outperforms the company’s own Llama 2 in text-only tasks and “matches or exceeds the performance of much larger models” including Google’s Gemini Pro and OpenAI’s GPT-4V in a mixed-modal generation evaluation “where either the prompt or outputs contain mixed sequences of both images and text.” In addition, Meta calls Chameleon’s image generation “non-trivial,” noting that’s “all in a single model.” Continue reading Meta Advances Multimodal Model Architecture with Chameleon
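Meta describes Chameleon as an early-fusion, token-based model: images are quantized into discrete tokens and interleaved with text tokens in a single sequence that one transformer processes end to end. The toy sketch below illustrates that interleaving; the tokenizer, vocabulary sizes and codebook IDs are invented for illustration and are not Meta’s implementation.

```python
# Illustrative sketch of the "early fusion" idea behind a natively multimodal
# model: image tokens and text tokens share one flat sequence, so a single
# transformer can take mixed-modal prompts and produce mixed-modal outputs.
# All token IDs and vocabulary sizes here are made up.
TEXT_VOCAB = 32000             # hypothetical text vocabulary size
IMG_TOKEN_OFFSET = TEXT_VOCAB  # image codebook IDs live above the text range

def encode_text(text: str) -> list[int]:
    return [hash(tok) % TEXT_VOCAB for tok in text.split()]   # toy tokenizer

def encode_image(image_codes: list[int]) -> list[int]:
    return [IMG_TOKEN_OFFSET + c for c in image_codes]        # toy image quantizer output

# One flat sequence: text, then an image, then more text -- no separate towers.
sequence = (
    encode_text("caption this photo:")
    + encode_image([17, 902, 455, 23])   # pretend VQ codes for the image
    + encode_text("a dog on a beach")
)
print(sequence)
```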
By Paula Parisi, May 16, 2024
Google has infused search with more Gemini AI, adding expanded AI Overviews and more planning and research capabilities. “Ask whatever’s on your mind or whatever you need to get done — from researching to planning to brainstorming — and Google will take care of the legwork” culling from “a knowledge base of billions of facts about people, places and things,” explained Google and Alphabet CEO Sundar Pichai at the Google I/O developer conference. AI Overviews will roll out to all U.S. users this week. Coming soon are customizable AI Overview options that can simplify language or add more detail. Continue reading Google Ups AI Quotient with Search-Optimized Gemini Model